English Pages 396 [392] Year 2020
Thomas Glotzbach
Navigation of Autonomous Marine Robots Novel Approaches Using Cooperating Teams
Thomas Glotzbach Gießen, Germany Habilitationsschrift Technische Universität Ilmenau, 2018
ISBN 978-3-658-30108-8 ISBN 978-3-658-30109-5 (eBook) https://doi.org/10.1007/978-3-658-30109-5 © Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2020 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer Vieweg imprint is published by the registered company Springer Fachmedien Wiesbaden GmbH part of Springer Nature. The registered company address is: Abraham-Lincoln-Str. 46, 65189 Wiesbaden, Germany
Acknowledgements
A habilitation is a very extensive project. It requires laborious work by the habilitation candidate, but also the support of many other people. I feel it is essential to thank everyone who supported me on my way.
First, I would like to thank Professor Christoph Ament, University of Augsburg, who was my supervisor during his time at the Technische Universität (TU) Ilmenau and thus my main scientific supporter on my way to the habilitation. As the professor responsible for the research project MORPH, he granted me the freedom to represent TU Ilmenau within the project, which allowed me to gain a lot of experience in research and scientific management. At the same time, he helped me in word and deed whenever this was necessary. He also supported me in gaining teaching experience at university level, among other things by allowing me to teach my own lecture, which helped to sharpen my teaching profile.
I would also like to thank Prof. António Pascoal, Instituto Superior Tecnico (IST), Technical University of Lisbon. He was my supervisor during my stay at IST in the framework of my Marie Curie Intra European Fellowship project CONMAR. Before that, we cooperated in the research project GREX, and afterwards we continued our cooperation in the research project MORPH. Prof. Pascoal introduced me to the area of navigation for marine robots and thereby supported my necessary change of research topics between PhD dissertation and habilitation. I also learnt a lot from him about international cooperation in our field of research, and he supported me greatly in building international networks.
I was with the Technische Universität Ilmenau for 14.5 years, and I would like to thank all colleagues with whom I collaborated during this time. In the framework of this habilitation thesis, I especially thank Dipl.-Ing. Sebastian Eckstein, who was a member of my research team within the MORPH project. He supported me in the execution of hardware-in-the-loop simulations to validate my algorithms.
Thanks are also due to the colleagues of the Instituto Superior Tecnico, Dynamical Systems and Ocean Robotics Lab (DSOR), for the wonderful cooperation during my stay as well as during the other research projects mentioned. The close cooperation with this group gave me the opportunity to gain a lot of experience in the actual operation of marine robots in the real world, experience that is often difficult to obtain in a university environment. I would like to mention two people in particular: Mohammadreza Bayat supported the validation of my work in the framework of the CONMAR project by adapting my algorithms to the real marine robots. Naveena Crasta, PhD, who was also a member of my research team at TU Ilmenau for a while during the MORPH project, supported me with difficult mathematical questions.
I would also like to gratefully acknowledge the financial support of the European Commission in the framework of the research projects GREX (FP6-IST-2005-2.5.3, project ID: 035223), funded by the Sixth Framework Programme, and CONMAR (FP7-PEOPLE, project ID: 255216) and MORPH (FP7-ICT-2011-7, project ID: 288704), both funded by the Seventh Framework Programme. The research I performed during the latter two projects forms the basis of this habilitation thesis.
Finally, it is very important to me to express my heartfelt thanks to my family and close friends for their never-ending support. I would like to mention my parents, Erika and Bernhard Glotzbach, who have always been there for my family and me. And finally, a big thank you from the bottom of my heart goes out to my wife Birgit and my two daughters Laura and Vanessa, for all of your moral support, for your support of my career, and for accepting that my work filled many weekends. I would not have been able to complete the project 'Habilitation' without your support.
Pohlheim, 19.02.2020
Prof. Dr.-Ing. habil. Thomas Glotzbach
Danksagung (Acknowledgements)
A project as extensive as a habilitation requires hard work by the candidate, but also the support of many other people. I feel the need to thank here those who supported me on my way.
My first thanks go to Professor Christoph Ament, University of Augsburg, who was my direct superior during his time at the Technische Universität Ilmenau. Accordingly, it was he who accompanied me most closely, from a scientific point of view, on my way to the habilitation. As the professor responsible for the research project MORPH, he gave me great freedom to represent TU Ilmenau in this project and to gain experience in research and scientific management, but he was also always on hand with advice and support when needed. He also supported me in gaining teaching experience, among other things by entrusting me with a lecture so that I could sharpen my teaching profile.
I likewise thank Professor António Pascoal, Instituto Superior Tecnico (IST), Technical University of Lisbon. He was my supervisor during my stay at IST in the framework of my Marie Curie Intra European Fellowship and the CONMAR project carried out within it. Before that, we had already worked closely together in the research project GREX, and later we continued this very good cooperation in the research project MORPH. Professor Pascoal introduced me to the field of navigation of marine robots and thereby supported me in the necessary change of research focus between doctorate and habilitation. I also learned a lot from him about international cooperation in our field of research, and he supported me intensively in building international networks.
My thanks also go to the staff of the Technische Universität Ilmenau with whom I was able to work during my 14.5 years at this institution. In the context of this habilitation, I mention by name Dipl.-Ing. Sebastian Eckstein, who was a project staff member in the MORPH project. I thank him for his support in carrying out the hardware-in-the-loop simulation. I also warmly thank the colleagues of the Instituto Superior Tecnico, Dynamical Systems and Ocean Robotics Lab (DSOR), Lisbon, for the consistently wonderful cooperation during my research stay and during the other projects mentioned. The close cooperation with the members of this group gave me the opportunity to gain a great deal of experience in the actual operation of marine robots under real conditions, experience that is often hard to come by in a university environment. I would like to mention two people in particular: Mohammadreza Bayat supported the validation of my work in the CONMAR project by adapting my algorithms to the real robots. Naveena Crasta, PhD, who was also a member of my research group at TU Ilmenau during the MORPH project, supported me with difficult mathematical questions.
For the financial support of my research activities, I thank the European Commission in the framework of the research projects GREX (FP6-IST-2005-2.5.3, project ID: 035223), supported by the Sixth Framework Programme, and CONMAR (FP7-PEOPLE, project ID: 255216) and MORPH (FP7-ICT-2011-7, project ID: 288704), both supported by the Seventh Framework Programme. The research I carried out in the latter two projects forms the basis of this habilitation thesis.
Finally, my thanks go to my private circle, to the family and friends who supported me. I would especially like to mention my parents, Erika and Bernhard Glotzbach, who were always there for me and my family in every respect. Lastly, I thank with all my heart my wife Birgit and my daughters Laura and Vanessa for their moral support, for supporting my professional career, and for putting up with many hours of overtime and weekends spent working. Without your help I could not have completed the project Habilitation.
Pohlheim, 19.02.2020
Prof. Dr.-Ing. habil. Thomas Glotzbach
Contents
1 Introduction
1.1 Autonomous Systems in Land, Air, and Water
1.2 Scope and Structure of This Thesis
1.3 Single- and Team-Oriented Approaches for Autonomous Systems
1.4 Review of Selected European Research Projects in Cooperative Marine Robotics
1.4.1 GREX
1.4.2 CONMAR
1.4.3 MORPH
1.5 Contribution of This Thesis to the State of the Art
2 Navigation in Marine Robotics: Methods, Classification and State of the Art
2.1 The Term 'Navigation' in Marine Robotics and Other Domains
2.2 Structure of Navigation Data in Marine Robotics
2.2.1 Inertial Reference Frame for Description of Position
2.2.2 Body-Fixed Frame for Description of Velocities and Forces/Moments
2.2.3 Coordinate Transformations
2.2.4 Physical Meaning of the iz-Coordinate
2.2.5 Difference between Heading and Course Angle
2.2.6 Topological Navigation
2.3 Navigation, Guidance and Control in the Autonomous Control for Marine Robots
2.3.1 Model of the Marine Robot
2.3.2 Navigation System
2.3.3 Guidance and Control System
2.3.4 Example and Literature Study on Guidance and Control
2.3.5 Requirements of the Navigation System for Guidance and Control
2.3.6 Summary of the Discussions on Navigation, Guidance, and Control
2.4 Sensors and Methods for Navigation of Marine Robots
2.4.1 Sensors With Direct Access to Navigation Data
2.4.2 Navigation Based On Distance and/or Bearing Measurements to External Objects
2.4.3 Mapping Based Methods
2.4.4 A Review of Filtering Techniques
2.4.5 Cooperative Navigation
2.4.6 Introduction to the Problem of Optimal Sensor Placement (OSP)
2.4.7 Summary of Discussions on Navigation Procedures and Methods
2.5 Navigation Employing Acoustic Measurements
2.5.1 Long Baseline (LBL)
2.5.2 Single-Beacon Navigation
2.5.3 Short Baseline (SBL)
2.5.4 Ultra-Short Baseline (USBL)
2.5.5 GPS Intelligent Buoys (GIB)
3 Problem Formulation and Definitions for the Discussions to Follow
3.1 Two Different Concepts: Internal vs. External Navigation
3.2 Problem Formulation
3.3 Benchmark Scenarios
3.3.1 Benchmark Scenario I: Supervision of a Diving Agent
3.3.2 Benchmark Scenario II: Aided Navigation Within a Small Robot Pack
3.3.3 Benchmark Scenario III: Range-Based Navigation Within a Robot Pack With a Minimal Number of Members
4 Mathematical Tools Used From the Areas of Control and Systems Engineering
4.1 Basic Ideas and Concepts
4.1.1 The Terms 'Signal', 'System', and 'Model' and Their Most Important Features
4.1.2 State Space Representation
4.1.3 Time Discretization
4.2 Evaluation of Observability in State Space
4.2.1 Observability and Controllability of Linear Systems
4.2.2 Design of Linear Observers
4.2.3 Observability of Nonlinear Systems
4.2.4 Observability Gramian Matrix
4.3 Parameter and Variable Estimation
4.3.1 Basics of Stochastic Variables and Signals
4.3.2 Estimation Theory
4.3.3 State Estimation
4.4 Comparison Between Observation and Estimation
5 Methods for Cooperative Navigation
5.1 Static Navigation Problem
5.1.1 Problem Formulation
5.1.2 On Parameter Estimation
5.1.3 Position Estimation Based on Squared Range Measurements
5.1.4 Position Estimation by Minimizing the Maximum Likelihood Function
5.1.5 Comparison and Evaluation
5.2 External Navigation: Supervision of a Diver by Three Surface Robots
5.2.1 General Setup
5.2.2 Solution Copied from The GIB Concept
5.2.3 Necessary Advances Beyond the GIB Concept
5.2.4 Simulative Validation
5.2.5 Validation in Sea Trials
5.2.6 Conclusions and Further Paths
5.3 Internal Navigation: Relative Position Estimation Within a Marine Robot Team
5.3.1 Basic Mission Scenario Under Discussion
5.3.2 Modeling of the Acoustic Communication
5.3.3 Modeling of the USBL Measurements
5.3.4 Description of the Several Navigation Filters
5.3.5 Linear Kalman Filter for Velocity Estimation
5.3.6 Modeling and Estimation for a Linearized Approach
5.3.7 Modeling and Estimation for a Nonlinear Approach Using an Unscented Kalman Filter
5.3.8 Conclusions of Cooperative Navigation
6 Optimal Sensor Placement in Marine Robotics
6.1 The Concept of Optimal Sensor Placement
6.2 Optimal Angular Configuration for Distance Measuring Sensors
6.2.1 Scenario Under Discussion
6.2.2 Computation of the Determinant of the Fisher Information Matrix (FIM)
6.2.3 Optimal Angular Configuration of the ROs
6.3 Finding the Optimal Range for Distance Sensors With the Likelihood-Function
6.3.1 Overall Set-Up
6.3.2 Computation of the Optimal Range
6.3.3 Numerical Validation
6.4 Investigation On Observable States and Optimal Trajectory Based On Gramians
6.4.1 Checking the Observability of Different Systems States in a Setup With Several ROs
6.4.2 Determining of an Optimal Trajectory for a Single Reference Object Based on Gramians
6.5 Conclusion on the Research in Optimal Sensor Placement
7 Combination of Cooperative Navigation and Optimal Sensor Placement
7.1 Basic Idea
7.2 Simple Approach – Optimal Positioning of ROs to Maximize the Fisher Information
7.2.1 Scenario Under Discussion
7.2.2 Guidance Controller
7.2.3 Simulative Validation
7.3 STAP – Simultaneous Trajectory Planning and Position Estimation
7.3.1 Scenario Under Discussion
7.3.2 Estimation Method and Guidance Controller
7.3.3 Simulative Validation
8 Conclusion and Outlook
Bibliography
List of Figures Figure 1‐1: (Total) autonomous and semi‐autonomous systems (Glotzbach, 2004a) ............... 7 Figure 1‐2: Adaptive Autonomy The level of autonomy can be changed (Glotzbach et al., 2007) ................................................................................................................... 8 Figure 1‐3: Interpretation of the suggested concept: a standard control loop with cascade control (Glotzbach et al., 2010) .................................................................. 9 Figure 1‐4: The Rational Behavior Model (RBM), (Glotzbach and Wernstedt, 2006) ................ 9 Figure 1‐5: Perfect Situation: A Team of AUVs in a coordinated formation (Glotzbach et al., 2007) ................................................................................................................. 10 Figure 1‐6: AUVs in a disturbed formation (Glotzbach et al., 2007) ........................................ 10 Figure 1‐7: A team of AUVs is avoiding an obstacle (Glotzbach et al., 2007) ........................... 11 Figure 1‐8: Accident of a single vehicle – New Mission Plans (Glotzbach et al., 2007) ........... 12 Figure 1‐9: Hierarchical realization of the team instance (Glotzbach and Wernstedt, 2006) ...................................................................................................................... 12 Figure 1‐10: Peripheral realization of the team instance (Glotzbach and Wernstedt, 2006) ...................................................................................................................... 13 Figure 1‐11: Different approaches for teams in the area of unmanned vehicles (Glotzbach et al., 2007) .......................................................................................... 14 Figure 1‐12: Relation of number of team members and boundary of autonomy in packs and swarms (according to Glotzbach, 2004b) ....................................................... 
15 Figure 1‐13: Two of the GREX mission scenarios: Fish data download (left), Marine habitat mapping (right) .......................................................................................... 16 Figure 1‐14: Overview of the GREX main components .............................................................. 17 Figure 1‐15: GREX Final Trials in Sesimbra ................................................................................. 17 Figure 1‐16: Cooperative Path Following (CPF) .......................................................................... 18 Figure 1‐17: Cooperative Line of Sight Target Tracking (CLOSTT) ............................................... 18 Figure 1‐18: The basic inspiration of the research projects Co3‐AUVs and CONMAR ............... 20 Figure 1‐19: The general mission scenario of the CONMAR project .......................................... 21 Figure 1‐20: Maximizing the amount of information gained from range measurements by organizing the surface robots in an optimal manner ........................................ 21 Figure 1‐21: Bathymetry map and picture of vertical wall underwater ..................................... 22 Figure 1‐22: The MORPH Supra Vehicle at different cliff walls with the project logo on top .......................................................................................................................... 23 Figure 1‐23: Vehicles used in the sea trials of the MORPH projects .......................................... 24 Figure 1‐24: The MORPH supra vehicle; top view during sea trials ........................................... 25
XIV
List of Figures
Figure 1‐25: Video data of vertical wall, obtained during final trial (top), and derived 3D reconstruction (bottom) ........................................................................................ 25 Figure 2‐1: The Earth‐Centered Earth‐Fixed (ECEF)‐frame XEYEZE and the Geocentric Coordinate system , , r ....................................................................................... 33 Figure 2‐2: Difference between geocentric latitude and geodetic latitude displayed by a cut through the mean zero‐median of a sphere and an ellipsoid .................. 34 Figure 2‐3: The Earth‐Centered Earth‐Fixed (ECEF)‐frame XEYEZE, the Geodetic Coordinate system , , h, and the local North‐East‐Down (NED)‐frame XNYNZN ..................................................................................................................... 35 Figure 2‐4: Display of the defined frames and motion parameters for a marine robot according to the xyz‐convention for the rotation .................................................. 36 Figure 2‐5: The different meaning of the terms 'depth' and 'altitude' in marine robotics ...... 39 Figure 2‐6: Introduction of sideslip and crab angle ................................................................. 40 Figure 2‐7: Desired and real behavior of two marine robots while collecting video data in an area with strong sea currents (Eckstein et al., 2015) .................................... 42 Figure 2‐8: Scheme of the automatic control of an unmanned marine robot ........................ 45 Figure 2‐9: Configurations of the series or cascade compensation (a), feedforward compensation with additional trajectory generator (b), and cascade control (c) ........................................................................................................................... 46 Figure 2‐10: Typical Lawnmower maneuver to cover a defined area, e. g. 
in a mapping mission ................................................................................................................... 47 Figure 2‐11: Possible structure of a controlled vehicle behavior model in a simulation of an AUV (Schneider et al., 2007a) ............................................................................ 50 Figure 2‐12: Global/ Absolute navigation: Estimating a robot's position in a global reference ................................................................................................................ 52 Figure 2‐13: Relative Navigation: Estimating a robot’s position in a body‐fixed reference of another robot ..................................................................................................... 53 Figure 2‐14: Long Baseline (LBL) Navigation .............................................................................. 67 Figure 2‐15: Single Beacon Navigation ....................................................................................... 68 Figure 2‐16: Short Baseline (SBL) Navigation ............................................................................. 69 Figure 2‐17: Ultra‐Short Baseline (USBL) navigation .................................................................. 70 Figure 2‐18: Range r, bearing angle , and altitude angle obtained by an USBL system carried by vehicle i (yellow) ................................................................................... 71 Figure 2‐19: Comparison of the acoustic baseline systems ....................................................... 71 Figure 2‐20: GPS Intelligent Buoys (GIB) Navigation .................................................................. 72 Figure 3‐1: Internal Navigation: The pose/ velocity of the robot is measured/ estimated inside the vehicle ................................................................................................... 
75 Figure 3‐2: External Navigation: The pose/ velocity of the robot is measured/ estimated from outside the vehicle ....................................................................... 76
List of Figures
XV
Figure 3‐3: Problem formulation: Range‐based navigation
Figure 3‐4: Problem formulation: Range‐ and bearing‐based navigation
Figure 3‐5: Benchmark Scenario I: Global and external navigation for target vehicle 0 by three surface reference objects, denoted as vehicles 1 ‐ 3
Figure 3‐6: Benchmark Scenario II: Relative and internal navigation for vehicles 0 and 2 with respect to the surface vehicle 1
Figure 3‐7: Benchmark Scenario III: Global and external navigation for vehicle 0 and simultaneous trajectory planning for the surface vehicle 1
Figure 4‐1: The system with its interfaces to the environment
Figure 4‐2: Continuous time (left) and discrete time (right) output signal
Figure 4‐3: A stochastic signal: White noise with mean of 0 and variance of 1
Figure 4‐4: Input (dark) and output (bright) signal of a causal (left) and of an anticausal system (right)
Figure 4‐5: A ball on a floor as example for stable and unstable systems
Figure 4‐6: Input signal (left) as a combination of two parts and output signals of a linear system
Figure 4‐7: For a time‐invariant system, an identical input signal and identical initial conditions will always result in an identical output signal, independent of the time t
Figure 4‐8: Comparison of problem solution by state space description (solid arrows) and by transfer into the frequency domain (dashed arrows)
Figure 4‐9: Block diagram of the state space representation of an LTI system
Figure 4‐10: Block diagram of a state space representation in the controller canonical form
Figure 4‐11: Block diagram of a state space representation in the observer canonical form
Figure 4‐12: Continuous time and discrete time signals
Figure 4‐13: Definition of the derivative via the difference quotient
Figure 4‐14: Mechanical system
Figure 4‐15: Step responses of the continuous‐time system and the two derived discrete‐time models
Figure 4‐16: System split into subsystems to demonstrate (un)observable/(un)controllable parts
Figure 4‐17: States and output of a selected system for different initial conditions
Figure 4‐18: Block diagram of the linear state observer (Luenberger observer)
Figure 4‐19: Observation error displayed as autonomous system
Figure 4‐20: Observation (bright lines) of state 1 (solid line) and state 2 (dashed line) and real states (dark lines) of the example system for different pole positions of the observer
Figure 4‐21: Assessment of the pole placement of the observer
Figure 4‐22: Indistinguishable States
Figure 4‐23: Overview of the different concepts of nonlinear observability and their implications
Figure 4‐24: Energy transfer from states to outputs
Figure 4‐25: Venn diagrams of intersection, union and complement of events
Figure 4‐26: Thought experiment for the appearance of stochastic variables and their random variates
Figure 4‐27: Probability and distribution functions of discrete and continuous stochastic variables
Figure 4‐28: PDF of two normally distributed stochastic variables with different variance
Figure 4‐29: Thought experiment for the appearance of stochastic signals
Figure 4‐30: Stochastic signals and corresponding autocorrelations
Figure 4‐31: Model of the estimation process
Figure 4‐32: Typical cost functions for the Bayes estimation: mean‐square error (left), absolute error (middle), uniform cost function (right), based on Van Trees, 2001
Figure 4‐33: Results of the three introduced Bayes estimators at a distinct a posteriori PDF
Figure 4‐34: State space representation including process and measurement noise
Figure 4‐35: Different estimates of a discrete time variable
Figure 4‐36: A priori and a posteriori estimates of a Kalman filter with typical course of error variance
Figure 4‐37: Simplified approach for the derivation of the a posteriori estimation
Figure 4‐38: Block diagram of the linear discrete Kalman filter
Figure 4‐39: Algorithm of the linear discrete time Kalman filter
Figure 4‐40: States 1 (left) and 2 (right) of example systems in a numerical simulation with and without process noise
Figure 4‐41: Original States and Kalman filter estimation for optimal filter parameters
Figure 4‐42: Original States and Kalman filter estimation, too large values for matrix Q
Figure 4‐43: Original States and Kalman filter estimation, too large values for matrix R
Figure 4‐44: Original States and Kalman filter estimation, wrong initialization; color scheme is the same as in the previous figures
Figure 4‐45: Block diagram of the linear continuous Kalman filter
Figure 4‐46: Original States and Extended Kalman filter estimation for the nonlinear system; color scheme is the same as in Figure 4‐41 to Figure 4‐43
Figure 4‐47: The unscented transformation as employed for the a priori estimation of the UKF
Figure 4‐48: Original States and Unscented Kalman filter estimation for the nonlinear system; color scheme is the same as in Figure 4‐41 to Figure 4‐43
Figure 4‐49: The principle of observation: Employ knowledge on model, inputs and outputs to obtain information on states, mainly in terms of unknown initial states
Figure 4‐50: The principle of estimation: Employ knowledge on model, inputs and measurements to obtain information on states, mainly in terms of the influences of the process noise
Figure 4‐51: Observation and Estimation and their usage within this thesis
Figure 5‐1: Introduction to chapter 5
Figure 5‐2: The process of parameter estimation
Figure 5‐3: Probability density functions, based on numerical simulations, for real and simplified squared range measurement errors, with different relations between range r and single range measurement error variance r2
Figure 5‐4: Position of RO (red triangles) and target (green dot) for scenario 1, right: zoom into the area around the target with position estimations employing different approaches
Figure 5‐5: Position of RO (red triangles) and target (green dot) for scenario 2, right: zoom into the area around the target with position estimations employing different approaches
Figure 5‐6: Setup of ROs (red triangles) and target (green dot) and display of the cost functions with contour map below of the three different ML cost functions for scenario 3
Figure 5‐7: Setup of ROs (red triangles) and target (green dot) and display of the cost functions with contour map below of the three different ML cost functions for scenario 4
Figure 5‐8: Scenario for a diver assistant system (Glotzbach et al., 2012)
Figure 5‐9: Discrete‐time kinematic target model
Figure 5‐10: Emission and reception times of acoustic ping
Figure 5‐11: Back and forward estimation approach (Glotzbach et al., 2012)
Figure 5‐12: Principle of the simplistic measurement model, employing a Medusa vehicle from Instituto Superior Técnico as reference
Figure 5‐13: Principle of the advanced measurement model (Glotzbach et al., 2012)
Figure 5‐14: User interface of the developed simulation tool (Glotzbach and Pascoal, 2011a)
Figure 5‐15: GUI of the visualization tool (Glotzbach and Pascoal, 2011a)
Figure 5‐16: Visualization tool, specific information about one time step (Glotzbach and Pascoal, 2011a)
Figure 5‐17: Zoom into the simulation results; dashed line: target path; solid line: position estimations (Glotzbach et al., 2012)
Figure 5‐18: Mean value [m] and variance [m²] of the absolute 2D position estimation error of the a priori estimations for the simulation runs according to Table 5‐1 and Table 5‐2, no communication losses (Glotzbach and Pascoal, 2011a)
Figure 5‐19: Mean value [m] and variance [m²] of the absolute 2D position estimation error of the a priori estimations for the simulation runs according to Table 5‐3 and Table 5‐4, 20% communication losses (Glotzbach and Pascoal, 2011a)
Figure 5‐20: Mean number of performed a posteriori estimations for different simulation runs (Glotzbach and Pascoal, 2011a)
Figure 5‐21: The marine robots of the Medusa type
Figure 5‐22: Mechanical specifications of the Diver Assistance System (DAS) (Glotzbach and Pascoal, 2011d)
Figure 5‐23: DAS mounted on the diver vest (Glotzbach and Pascoal, 2011d)
Figure 5‐24: a) Diver before entering the water, b) Inside view of the mask, c) Complete DAS unit (Glotzbach and Pascoal, 2011d)
Figure 5‐25: Results from Sea Trials: MEDUSA paths (dashed lines) and diver path (solid line) (Glotzbach et al., 2012)
Figure 5‐26: Results from Sea Trials (Zoom) (Glotzbach et al., 2012)
Figure 5‐27: Estimation Error (Distance between estimated and measured position) (Glotzbach et al., 2012)
Figure 5‐28: The MORPH supra vehicle is approaching a wall
Figure 5‐29: The upper MORPH segment
Figure 5‐30: Communication cycle (Glotzbach et al., 2015a)
Figure 5‐31: Details of a single communication interval
Figure 5‐32: Modeling of the USBL measurements (Glotzbach et al., 2015a)
Figure 5‐33: Results of the HIL simulation in the Mission Viewer (Glotzbach et al., 2015a)
Figure 5‐34: Position estimation error (Glotzbach et al., 2015a)
Figure 5‐35: Paths of the vehicles for simulative validation (top view); SSV (#1, red), GCV (#0, blue), LSV (#2, green), axes represent x‐ and y‐coordinates in m (Glotzbach et al., 2016)
Figure 5‐36: Performance comparison of Linear Kalman Filter (LKF) and Unscented Kalman Filter (UKF); Filter no. 1, Estimation error of horizontal position estimation for vehicle 1
Figure 5‐37: Performance comparison of Linear Kalman Filter (LKF) and Unscented Kalman Filter (UKF); Filter no. 1, Estimation error of inertial velocity estimation for vehicle 1
Figure 5‐38: Results of filter no. 1: Relative position of vehicle 1 (red) in the body‐fixed frame of vehicle 0 (blue); trace of true values (red) and estimates (black) by employment of linear Kalman filter (left) and Unscented Kalman filter (right), display of x‐y‐plane (top view), units of axes: m (Grebner, 2016)
Figure 5‐39: Results of filter no. 3: Relative position of vehicle 2 (green) in the body‐fixed frame of vehicle 0 (blue); trace of true values (green) and estimates (black) by employment of linear Kalman filter (left) and Unscented Kalman filter (right), display of x‐y‐plane (top view), units of axes: m (Grebner, 2016)
Figure 5‐40: Results of filter no. 4: Relative position of vehicle 1 (red) in the body‐fixed frame of vehicle 2 (green); trace of true values (red) and estimates (black) by employment of linear Kalman filter (left) and Unscented Kalman filter (right), display of x‐y‐plane (top view), units of axes: m (Grebner, 2016)
Figure 6‐1: Introduction to chapter 6
Figure 6‐2: Scenario for the Optimal Sensor Placement problem discussed in Martínez and Bullo, 2006
Figure 6‐3: Set‐up and parameter description for the 2D case (left) and the 3D case (right), only one RO is shown (Glotzbach et al., 2013)
Figure 6‐4: Determinant of the Fisher Information as function of common range d of the ROs (Glotzbach et al., 2013)
Figure 6‐5: Ranges used for the positioning of the ROs, top view
Figure 6‐6: Estimation Error Mean (left) and Variance (right) for the 2D case (Glotzbach et al., 2013)
Figure 6‐7: Estimation Error Mean (left) and Variance (right) for the 3D case (z0=‐10 m) and dashed control line at predicted optimum range dopt (Glotzbach et al., 2013)
Figure 6‐8: Reference trajectory (closed ellipse) and trajectory executed by target (open ellipse) (Kästner, 2013)
Figure 6‐9: Target Trajectory (ellipse) and positions of the Reference Objects (in the corners) (Kästner, 2013)
Figure 6‐10: Variation of the smallest eigenvalue of WO as a function of trajectory radius and number of ROs (Kästner, 2013)
Figure 6‐11 a‐c (top to bottom, left to right): Optimized trajectory for one RO; static target position marked by star (Glotzbach et al., 2014b)
Figure 6‐12 a‐c (left to right, top to bottom): Optimised trajectory for one RO; trace of moving target (in positive x direction) marked by circles (Glotzbach et al., 2014b)
Figure 6‐13: Computation of the course angle
Figure 6‐14: Possible trajectories for RO around the target position (marked by the red star) (Glotzbach et al., 2014b)
Figure 6‐15: Trajectory after the final speed is reached (Glotzbach et al., 2014b)
Figure 7‐1: Introduction to chapter 7 (Glotzbach et al., 2014b)
Figure 7‐2: The two different functions midpoint (f0.5) and midpoint Voronoi (f0.25), according to Martínez and Bullo, 2006
Figure 7‐3: Scenario under discussion, top view (Glotzbach and Pascoal, 2011c)
Figure 7‐4: The initialization process of the scenarios with the controlled ROs (Glotzbach and Pascoal, 2011c)
Figure 7‐5: Mean and Variance of the error for the 2D a priori position estimation for the 2 x 10 simulations (Glotzbach and Pascoal, 2011c)
Figure 7‐6: Mission Scenario under discussion (Glotzbach et al., 2015b)
Figure 7‐7: Discrete control options for RO in each interval (Glotzbach et al., 2015b)
Figure 7‐8: Top view at mission begin (Glotzbach et al., 2015b)
Figure 7‐9: Top view at a medium stage (Glotzbach et al., 2015b)
Figure 7‐10: The influence of estimation errors to RO trajectory (Glotzbach et al., 2015b)
Figure 7‐11: Top view after almost one round (Glotzbach et al., 2015b)
Figure 7‐12: A priori estimation error of simulation number 1 (Glotzbach et al., 2015b)
List of Tables
Table 2‐1: Notation of movement and parameters for marine objects (SNAME, 1950)
Table 2‐2: Typical sensors used for marine robot navigation, based on Kinsey et al., 2006 plus additional data
Table 4‐1: Overview of the introduced Bayes estimators
Table 5‐1: Simulation results (without communication losses) for the new simplistic measurement model
Table 5‐2: Simulation results (without communication losses) for the new advanced measurement model
Table 5‐3: Simulation results (with 20% communication losses) for the new simplistic measurement model
Table 5‐4: Simulation results (with 20% communication losses) for the new advanced measurement model
Table 5‐5: Overview of the equipment of the vehicles, the filters and the necessary communication load
Table 5‐6: Filter no. 1, error means and standard deviation in five independent runs; linear Kalman filter
Table 5‐7: Filter no. 1, error means and standard deviation in five independent runs; UKF
Table 6‐1: Possible ways to arrange three scalars in sequence
Table 6‐2: Influence of system states on the observability properties for four ROs (Kästner, 2013)
Table 6‐3: Static target; Largest Minimum Eigenvalues of Empirical Gramian at selected time instances
Table 6‐4: Moving target; Largest Minimum Eigenvalues of Empirical Gramian at selected time instances
Table 6‐5: Speed investigation; Largest Minimum Eigenvalues of Empirical Gramian at selected time instances
Table 7‐1: Simulation results for the reference scenario with randomly moving ROs
Table 7‐2: Simulation results for the scenario with controlled ROs
Table 7‐3: Simulation results for the STAP‐scenario
Abbreviations
ABS: Absolute (error/cost function)
ADCP: Acoustic Doppler Current Profiler
AHRS: Attitude Heading Reference System
ASV: Autonomous Surface Vehicle
AUV: Autonomous Underwater Vehicle
AWGN: Additive White Gaussian Noise
BIBO: Bounded‐input, Bounded‐output
C1V, C2V: Camera Vehicle no. 1 and 2
CDF: Cumulative Distribution Function
CG: Center of Gravity
CLOSTT: Cooperative Line of Sight Target Tracking
CLT: Central Limit Theorem
CMRE: Centre for Maritime Research and Experimentation
CML: Concurrent Mapping and Localization
CNR: Consiglio Nazionale delle Ricerche
CONMAR: Cognitive Robotics: Cooperative Control and Navigation of Multiple Marine Robots for Assisted Human Diving Operations
COTS: Components Off The Shelf
CPF: Cooperative Path Following
CPU: Central Processing Unit
CRB: Cramér‐Rao Bound
CVL: Correlation Velocity Log
DEKF: Delayed Extended Kalman Filter
DMAHTC: Defense Mapping Agency Hydrographic Topographic Center
DOF: Degree Of Freedom
DR: Dead Reckoning
DTM: Digital Terrain Map
DVL: Doppler Velocity Log
ECEF: Earth‐Centered Earth‐Fixed (reference frame)
ECI: Earth‐Centered Inertial (reference frame)
EKF: Extended Kalman Filter
EOD: Explosive Ordnance Disposal
FIM: Fisher Information Matrix
FOG: Fibre Optic Gyroscope
GCV: Global Navigation & Navigation Vehicle
GIB: GPS Intelligent Buoys
GNSS: Global Navigation Satellite System
GPS: Global Positioning System
ID: Iterative Descent
IMAR: Institute of Marine Research
IMU: Inertial Measurement Unit
INS: Inertial Navigation System
IRF: Impulse Response Function
ISME: Interuniversity Center Integrated Systems for Marine Environment
ISR: Institute for Systems and Robotics
IST: Instituto Superior Técnico
KF: Kalman Filter
LBL: Long Baseline
LED: Light Emitting Diode
LS: Least Squares
LS‐C: Centered Least Squares
LKF: Linear Kalman Filter
LS‐CW: Centered Weighted Least Squares
LS‐U: Unconstrained Least Squares
LS‐UW: Unconstrained Weighted Least Squares
LSV: Leading Sonar Vehicle
LTI: Linear and Time‐Invariant
MAP: Maximum A Posteriori (Estimator)
MIMO: Multiple Input, Multiple Output
MISO: Multiple Input, Single Output
ML: Maximum Likelihood (Estimator)
ML‐R: Maximum Likelihood with Ranges
ML‐SR: Maximum Likelihood with Squared Ranges
ML‐CSR: Maximum Likelihood with Centered Squared Ranges
MEMS: Microelectromechanical Systems
MORPH: Marine robotic system of self‐organizing, logically linked physical nodes
MS: Mean‐Square (error/cost function)
NAVSTAR GPS: Navigational Satellite Timing and Ranging, Global Positioning System
NED: North‐East‐Down (reference frame)
NIMA: National Imagery and Mapping Agency
NLS: Nonlinear Least Squares
ODE: Ordinary Differential Equation
OSP: Optimal Sensor Placement
OWTT: One‐Way Travel Time navigation
PDE: Partial Differential Equation
PDF: Probability Density Function
PF: Particle Filter
PMF: Probability Mass Function
RBM: Rational Behavior Model
RLG: Ring Laser Gyroscope
RMSE: Root Mean Square Error
RO: Reference Object
ROS: Robot Operating System
ROV: Remotely Operated Vehicle
RSS: Received Signal Strength
RWCA: Random Walk with Constant Acceleration
RWCTR: Random Walk with Constant Turning Rate
SBL: Short Baseline
SISO: Single Input, Single Output
SIMO: Single Input, Multiple Output
SLAM: Simultaneous Localization and Mapping
SNAME: Society of Naval Architects and Marine Engineers
SSV: Surface Support Vehicle
STAP: Simultaneous Trajectory Planning and Position Estimation
SVD: Singular Value Decomposition
TDOA: Time Difference Of Arrival
TIV: Time‐Invariant
TOA: Time Of Arrival
T‐SLAM: Topological Simultaneous Localization and Mapping
UKF: Unscented Kalman Filter
UNF: Uniform (cost function)
USBL: Ultra‐Short Baseline
Abstract

We are currently witnessing an increasing interest in the use of marine robots and their scientific and commercial applications. Users from marine biology, geology, underwater archeology, various fields of industry, and the security sector are interested in employing both remotely operated and autonomous systems to fulfill their challenging tasks. In general, at the core of the different applications are the classical, but still relevant, motives for the use of robots: to spare human beings from performing laborious work in potentially life‐threatening areas, and to relieve them from monotonous and dull activities. However, there is still a long way to go in terms of research and development before the use of autonomous marine robotic vehicles becomes standard in the maritime environment.

The specific conditions of the underwater environment cause several severe problems that differ from those in land and air robotics. The major problems are the lack of broadband, reliable underwater communication and the impossibility of using GPS signals. Standard radio communication does not work at all, and the acoustic communication equipment currently available suffers from many problems, such as multipath propagation and acoustic ray bending, which result in significant communication losses and reduced communication throughput. Every control concept must consider this fact explicitly from the beginning, at the design phase. This is especially problematic if the objective is to operate teams of robotic vehicles that must necessarily communicate with each other.

Another important requirement for the control of autonomous vehicles is the existence of a navigation solution. In marine robotics, this term refers to the task of estimating position, orientation, and velocity parameters of the robotic vehicles. This is an important difference with regard to land and air robotics, where the same term may be used with a different meaning. The described problems in the underwater environment have led to this different interpretation of the word navigation. Underwater navigation also suffers from the described communication problems. There is no access to a global positioning system like GPS, which is standard for land and air systems. In addition, classical methods like vision‐based mapping procedures cannot easily be transferred from land to marine robotics, due to problems with visibility under water and the potential absence of significant features for mapping purposes. Marine robots may carry different sensors to measure navigation‐related quantities, such as velocity or acceleration, but the computation of position from these quantities requires mathematical integration. Due to the inevitable measurement errors, the position estimate will therefore exhibit an error that grows with time. External range measurements might be based on acoustics; this means that the described problems with acoustic communication will also influence the overall quality of the positioning process.

Methods from control and systems theory may help to improve the quality of the navigation solution. It is important to use all available information in an optimal manner to provide good and accurate data for the control algorithms, or for other mission‐related tasks like the creation of geo‐referenced maps. With the state space framework, control theory offers a powerful tool that distinguishes between the states and the accessible information (outputs) used to estimate them. A stochastic approach may be adopted to capture the non‐deterministic nature of typical measurement noise. Powerful tools like the Kalman filter can combine these methods and enable the estimation of variables that cannot be measured directly, or can only be measured at low sample rates.

In robotics, the idea of using a team of several autonomous robots already has some tradition in the scientific community. One of the general goals is that the team might develop skills that are far beyond the sum of the abilities of the single robots. This concept is also denoted as emergence of behaviours. It is one of the central ideas of the thesis at hand to employ this concept for the navigation of teams of autonomous marine robots. The realization of robot teams places considerable demands, especially in the areas of control and navigation. One could argue that special navigation solutions at the level of the single vehicles should be developed before the realization of a cooperative team of autonomous robots comes into focus. In contrast, this thesis pursues the idea that access to several robots may enable completely new possibilities for navigation that lie far beyond the abilities of a single team member.

The thesis studies different scenarios in the area of marine robotics, some of which are taken from the literature, while the majority reflect scientific work done by the author or by students under his supervision. After a very general overview of the requirements of navigation and a description of the state of the art, several interesting problems are formulated that follow the idea described above of improving navigation capabilities by the use of several cooperating robots. The necessary methodologies from the area of control and systems theory are presented in a separate chapter. Finally, several methods to realize cooperative navigation of autonomous marine robots are discussed. The theory is validated in sea trials, hardware‐in‐the‐loop simulations, and basic simulations.

The thesis aims to bring into sharp focus the fact that navigation within marine robotics can benefit from the use of cooperative teams in a way that can justify the increased effort of operating several vehicles at once. In this respect, the work presents several scenarios and discusses a reasonable modelling of the involved systems as well as the structure of the estimation algorithm, which may allow the reader to use similar solutions in comparable scenarios. The thesis therefore aims to contribute to the advancement of marine robotic technology for a number of applications with strong commercial and scientific value. Additionally, the separate chapter on the methodologies used may be of interest to a reader with some basic knowledge of control theory who wishes to obtain a deeper insight into advanced concepts such as observability and state estimation, even without any background in marine robotics.
Zusammenfassung (Summary)
Currently we are witnessing a rise in the importance of marine robotics and of its scientific and commercial applications. Users from different disciplines, such as marine biology, marine geology, underwater archaeology, various branches of industry, and the security sector, are interested in employing both tele‐operated and autonomously acting systems to accomplish their demanding tasks. In general, the various applications often bring with them the classical motives for the use of robotic systems: to spare humans from strenuous work in potentially life‐threatening environments, and to relieve them from having to perform monotonous and boring tasks. However, a long road of research and development still lies ahead before the use of autonomous marine robots becomes standard in the maritime environment. The special conditions of the underwater world create several formidable problems that differ significantly from those relevant for robots on land and in the air. The central problem is the lack of a broadband, reliable means of communication underwater, together with the impossibility of using GPS signals. Standard radio links do not work, and the currently available equipment for acoustic underwater communication suffers from a multitude of difficulties, such as multipath reception and the bending of the acoustic ray, which lead to a significantly high loss rate of acoustic messages and a low transmission rate. Every control concept must take this fact into account from the very beginning of the design phase. This is particularly relevant when teams of robotic systems are to be realized, which necessarily have to communicate.
Another important prerequisite for the control of autonomous systems is the availability of a navigation solution. In marine robotics, this term denotes the task of determining the position, orientation, and motion parameters of the robotic vehicles. This is a clear difference to the land and air domains, where the same term is used with a different meaning. The different meanings of the term navigation can be traced back to the described problems in the underwater environment. Underwater navigation likewise suffers from the described communication difficulties. There is no access to a global positioning system such as GPS, which has long since become standard in the land and air domains. Other classical procedures such as visual mapping can also be transferred from the land to the underwater domain only with difficulty, owing to the often poor visibility underwater and the possible lack of significant features for the mapping process. Marine robots usually carry various sensors for measuring navigation‐related quantities such as velocity or acceleration, but converting these data into position data requires a mathematical integration. Owing to the unavoidable measurement errors, the position estimate will therefore exhibit an error that grows with time. External range measurements can, for example, be based on acoustic principles, so that the problems with acoustic communication described above also affect navigation negatively. Methods from control and systems engineering can help to improve the quality of the navigation solution. It is important to use every available piece of information in an optimal way in order to provide good and accurate navigation data, for instance for the control algorithm or for other mission‐related tasks such as the creation of georeferenced maps. With the state space representation, control engineering has a powerful tool at its disposal, in which one can distinguish between the states and the accessible information (outputs) used to estimate the states. This concept can be combined with stochastic approaches to reflect the non‐deterministic behaviour of typical measurement noise. Powerful tools such as the Kalman filter combine these methods and enable the estimation of quantities that cannot be measured, or can be measured only at low sample rates. In robotics, the idea of employing a team of several autonomous systems has a certain tradition in the scientific community. One of the central goals of this endeavour is that the team might develop capabilities that go beyond the sum of the capabilities of the individual systems. This objective is also referred to as emergence of behaviour. One of the central theses of the present habilitation thesis is that this concept can be applied to navigation in teams of autonomous marine robots. As one can imagine, the implementation of robot teams places great demands, especially in the areas of control and navigation. One might expect that special navigation solutions based on the individual robots would first have to be developed before a cooperating team of autonomous robots can be realized. In contrast, this habilitation thesis pursues the idea that the availability of several robots opens up completely new possibilities for navigation, which reach far beyond the capabilities of individual systems.
Within this work, various scenarios from the field of marine robotics are investigated; some of them stem from publications, while most reflect scientific work by the author or by students under his supervision. After a very general overview of the requirements of navigation and a presentation of the state of the art, several interesting problems are formulated that follow the idea just outlined of improving navigation capabilities through the use of several cooperating robots. The methods required for this from the area of control and systems engineering are presented in a separate chapter. Finally, various ways are discussed in which cooperative navigation for autonomous marine systems can be realized. The theory is validated by real sea trials, hardware‐in‐the‐loop simulations, and basic simulations. This thesis strives to convince the reader that navigation within marine robotics can benefit from the use of cooperating teams in a way that justifies the increased effort associated with operating several robots. In this context, various scenarios are considered, and the necessary modelling of the systems involved as well as the structure of the estimation algorithms are discussed, which should enable the reader to employ similar solutions in comparable scenarios. In this way, the work aims to contribute to the advancement of marine robotic technology for applications of strong commercial and scientific interest. In addition, the separate chapter on the methods used may be of interest to a reader with basic knowledge of control engineering who seeks a deeper understanding of advanced concepts such as observability and state estimation, even without a background in marine robotics.
1 Introduction
1.1 Autonomous Systems in Land, Air, and Water
Research on autonomous mobile systems has gained considerable momentum and importance in recent decades. Generally speaking, the goal is to enable an unmanned technical system to reach a predefined destination, perform a given task, and return to a defined position, usually with little or no human interaction. The usual motive for using an unmanned system is to relieve humans of work denoted as ‘dull, dirty, or dangerous’. The first research efforts usually aimed at mobile systems that could be remotely controlled or tele‐operated by humans. Considerable work has been done since then, yet the vision of large groups of autonomous systems working together cooperatively in a highly self‐organizing way, with little or no constant human supervision, is still far from being realizable. This is true for all of the relevant domains, which can mainly be separated into land, air, and water.
It was straightforward to begin with the development of tele‐operated systems. Examples are the mobile robots used by police and military for bomb disposal or Explosive Ordnance Disposal (EOD), or the tele‐operated marine robots denoted as Remotely Operated Vehicles (ROVs). In both cases, the human operator can be kept away from places that are dangerous or hard to reach. In the marine case, the robot can also be built in a much smaller, more compact, and easier‐to‐operate form than if a human needed to be on board. Yet these systems, which are usually connected to a central station by a cable or an umbilical, can be controlled directly by the human, who has immediate access to all sensor information and/or live video footage, can use their superior intellectual abilities to make decisions, and can finally return control commands to the robot.
Different aspects required the development of robots with a higher level of autonomy. A cable connection between robot and central station limits the operational range of the robot and hinders its employment in rough terrain. Tele‐operation without a cable connection requires a reliable, wide‐band wireless connection, which cannot always be guaranteed. It is also desirable that a human operator should not have to supervise the complete robot activity, such as the transit to a mission area, in order to spare the operator ‘dull’ activities. Especially for marine robotics, the development of Autonomous Underwater Vehicles (AUVs) was of great importance, as tele‐operation based on a wireless communication link is not realizable underwater, even with current technology.
The concept of using a whole team of individual robots came into the focus of research some time ago. The general idea is that the team might gain new abilities that are far beyond what the single members would be able to achieve. The first multi‐robot scenarios were related to missions in which several cooperative units have to be controlled in order to achieve a predefined goal in an optimal manner. Additionally, current research also includes non‐cooperative multi‐robot scenarios (as in regular street traffic, in which every vehicle forms its individual ‘team’ and follows individual goals which might differ from those of other units), or even those with hostile individuals (in military scenarios).
Current goals in the research and development of mobile systems are challenging. The realization of a completely autonomous car (or at least of the necessary technologies) is currently being pursued by several companies (Google or rather Waymo, see Waymo, 2017, or
© Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2020 T. Glotzbach, Navigation of Autonomous Marine Robots, https://doi.org/10.1007/978-3-658-30109-5_1
Tesla, see Tesla, 2017, to name but a few). In the air domain, Unmanned Aerial Vehicles (UAVs) are widely used by the military for surveillance, and meanwhile also for armed attacks. These weapon‐carrying craft are denoted as Unmanned Combat Aerial Vehicles (UCAVs) or Unmanned Combat Aerial Systems (UCAS), and they are increasingly employed in military conflicts (Friedman, 2010). On the other hand, small drones based on the quadcopter principle are becoming more and more common, especially for private and leisure use. Companies also plan to use them commercially, e.g. for delivery; see Amazon, 2017 as an example. In the marine domain, the availability of robot systems is still lower than in the other domains, because the demands in terms of robustness are much higher and operating a robot system requires more experience. Nevertheless, various possible mission scenarios, such as the supervision of offshore installations like wind farms, the exploitation of minerals found on the sea floor at great depths, like manganese nodules, or security missions to protect harbors and other maritime structures, can be expected to create a significant demand for adequate technology and will further boost research and development in the coming decade.
While several similar problems have to be solved in all three domains, very individual challenges also exist. In land robotics, in scenarios which are not related to movement on streets, the locomotion of the system is of particular interest. Wheeled systems are only suitable on flat and firm terrain. Tracked systems have better maneuverability in rough terrain, at the cost of higher energy consumption. For movement in highly unstructured environments, no widely employed solution is available yet. Even climbing a staircase is not a trivial problem.
For the mentioned goal of automating road vehicles, the evaluation of measurement data and the decision making have to be performed in a very short time. Failure to correctly recognize the current situation might result in severe accidents. As became clear from the investigations of the first fatal crash of an autonomously driven Tesla road vehicle in May 2016, drivers are still required to pay constant attention while driving and to supervise the work of the autopilot system (Mitchell, 2017). The same can be said for unmanned aircraft flying at high speeds. A big challenge is to quickly evaluate sensor data and make decisions, considering other flying objects in the vicinity. For quadcopter robots, a general problem is the small payload capacity, which limits the sensors and batteries that can be carried and in turn causes relatively short mission times.
For marine robots, several severe problems exist. In terms of hardware, it is necessary to protect the electrical parts from the water, while still respecting limitations in size and weight of the vehicle. One of the most challenging difficulties is the lack of reliable, broadband underwater communication, and all the consequences arising from this stringent constraint. Standard radio does not work underwater. So far, the most commonly used communication method is based on acoustic signals. This results in various problems, such as high dropout rates and low communication bandwidth, as a result of effects like multipath propagation, different salinity layers, and climate conditions, to name but a few. It is essential to consider these limitations right from the start when designing a control concept, be it for a single vehicle or for a team. Moreover, as a direct consequence, the estimation of position, orientation, and movement parameters like speed is nontrivial.
There is no access to a global positioning system, as available for air and outdoor land systems, and image‐based methods like Simultaneous Localization and Mapping (SLAM), which are widely used for land systems, are
much harder to implement, due to the usually limited visibility underwater and the common absence of objects with a significant structure, especially when travelling somewhere in the middle between sea bottom and surface. In fact, these problems have resulted in different definitions of the term ‘navigation’ in marine robotics in comparison to the other mentioned domains. We will deepen this discussion in section 2.1. To date, underwater robot navigation, which mainly means the estimation of position, orientation, and movement parameters, is one of the most challenging problems, and the solutions published in literature are usually tied directly to a certain mission scenario and not applicable to all types of missions.
It is at this point that this thesis puts forward the following question: Can we gain a benefit in terms of possible navigation techniques if we employ a team of autonomous marine robots? At first glance, one might think that the availability of robust and reliable navigation concepts is the precondition for realizing teams of underwater robots. In contrast, this thesis is built upon the point of view that bringing several robots into a mission will enable new opportunities for underwater navigation that lie far beyond the possibilities a single robot can use. It can be stated that navigation is a precondition for autonomous behavior of any mobile system and therefore also for the realization of a team, but at the same time the establishment of a team can make adequate navigation possible in the first place. Both aspects, navigation and team behavior, can rely on each other at the same time.
This thesis will provide a general discussion on the navigation of underwater robots, introduce the necessary mathematical methods for cooperative marine robot navigation from the area of control and systems theory, and finally present possible realizations for cooperative navigation in marine robotics, validated both in simulation and in real sea trials, to prove its concepts and methodologies.
1.2 Scope and Structure of This Thesis
The scope of this thesis is to provide an overview of techniques available for navigation in marine robotics, that is, the measurement and estimation of position, orientation, and other motion parameters of the robots, with a focus on teams of at least two robots. The author's work in this field is reported with respect to the international state of the art. This includes contributions in the areas of Cooperative Navigation, Optimal Sensor Placement, and the combination of both, which leads to a new concept for which a new notation is suggested. A central point of the thesis is the following: While the availability of a navigation solution is a precondition for the practical realization of any autonomous robot scenario, be it in single or in team mode, it can also be argued that the availability of robotic teams affords access to new navigation solutions which rise far above the possibilities of a single robot. This might be an argument for the usage of robot teams, which at first always raises the costs and the effort, but has the potential to provide new ways of navigation. This central belief is expressed in the title: ‘Novel approaches using cooperating teams’. The thesis will study several scenarios, suggest suitable methods, and thereby prove this central statement. The author intends to show that he has a comprehensive overview of his area of expertise and is capable of performing independent research. At the same time, the general description of methodology is collected in a single chapter. This is to show that the author is also able to prepare and present demanding topics from the area of control engineering in an understandable way, and is therefore capable of organizing and delivering university lectures independently.
This thesis is structured as follows: The first chapter gives an introduction to the topic of mobile robotics. In section 1.3, different concepts for the realization of autonomous systems in single and in team mode are discussed and compared. This section summarizes the core contributions of the PhD work (“Dissertationsschrift”) of the author to show that he performed a significant change in his research work, which is considered a precondition for habilitation. Section 1.4 provides an overview of the three international research projects to which the author contributed by being involved in writing the proposals, performing the research work as a member of either the Technische Universität Ilmenau, Germany, or the Instituto Superior Técnico, Lisbon, Portugal, and partially supervising students and other members. The scientific work reported in this thesis was performed in the framework of these projects. Finally, in section 1.5 the author summarizes his contributions to the state of the art, related to the work reported in this thesis.
Chapter 2 has the main goal of introducing the reader to the topic of navigation in marine robotics and of providing an overview of the state of the art. At first, the exact definition of navigation to be followed in this thesis is given, and it is compared with the different meanings that the term ‘navigation’ has in other robotic domains. This is an important issue, as it might lead to misunderstandings in discussions when researchers from different domains cooperate. In section 2.1, the different meanings will be compared, and an explanation will be given as to what led to these different interpretations. Finally, a definition will be made that will be used in the remainder of the thesis, basically that navigation in marine robotics is the process of obtaining relevant data on position, orientation, and motion of marine objects.
With this definition given, the relevant navigation data will be described in more detail in section 2.2, defining relevant coordinate systems and describing how data can be transformed between them. Other relevant issues are discussed, such as the difference between depth and altitude or between heading and course angle. A short discussion of topological navigation is included. Navigation is one of the important problems to be solved in order to realize autonomous marine robots, the others being guidance and control. Moreover, the latter two set the requirements for navigation in terms of necessary availability, frequency, and accuracy. These issues are discussed in detail in section 2.3. Sections 2.4 and 2.5 provide an overview of the state of the art in marine navigation. In section 2.4, relevant sensors and methodologies are discussed and compared. It will be reasoned why the methods based on acoustic measurements will be of main interest for the further course of the thesis, and an adequate discussion of available acoustic navigation concepts is given in section 2.5.
The problem formulation and definitions for the scientific part of this thesis are given in chapter 3. At first, the concepts of Internal and External navigation are defined to classify different procedures, which might be chosen in accordance with the requirements of a certain scenario (e.g. whether navigation is supposed to be performed as the basis for the vehicle‐internal control, or whether it is done to supervise a submerged robot from a central station). In what follows, a notation of the navigation parameters is introduced which will be used for the scientific part. Also, several problems and benchmark scenarios are defined that will be used later.
Chapter 4 is the abovementioned textbook chapter, combining the introduction of most mathematical methods to be used later.
It is intended to be understandable for master's students who have already attended a lecture on basic control engineering, including topics like closed loop control systems or system description using the Laplace transform. In the basic section 4.1, the terms signal, system, and model are introduced, and their most important
features are discussed. In what follows, the state space representation is introduced, as well as the time discretization. The following section 4.2 is dedicated to the concept of observability in state space. The term observability is defined, as well as the closely related controllability, and its evaluation is described in the context of linear systems. Also, the importance and design of linear observers are discussed. The discussion is then expanded to the observability of nonlinear systems, both for autonomous and general ones. As an alternative approach to observability investigations, the observability Gramian matrix is introduced, both for linear systems and (as the empirical Gramian) for nonlinear systems. The Gramian matrices will be used widely later in the scientific part of the thesis. Section 4.3 provides an overview of parameter and variable estimation. While observability is a deterministic concept, it is important to mention that real‐world processes usually require stochastic signals and systems to be modelled in an adequate way. To this end, at first the basics of stochastic variables and systems are introduced, including important concepts like expected value, covariance matrices, and correlation. Afterwards, the basics of estimation theory are described. In this respect, the concepts of the Maximum Likelihood function and the Cramér‐Rao bound are introduced, which will be of importance for Optimal Sensor Placement in the scientific part. Finally, as an important tool for the estimation of robot navigation data, the Kalman filter is introduced, as well as the corresponding versions for nonlinear systems, the Extended Kalman filter and the Unscented Kalman filter. The final section 4.4 provides a discussion of the differences between observation and estimation and their usage in this thesis.
The scientific part of the thesis comprises chapters 5 – 7. In Chapter 5, various concepts for Cooperative Navigation based on acoustic measurements are presented.
In section 5.1, the situation for static objects is discussed, based on approaches taken from literature. Different concepts for static position estimation are presented, compared, and evaluated. One of these methods will later be used in chapter 6 to perform numerical Monte Carlo simulations. Two concepts for the cooperative navigation of moving robots are presented in sections 5.2 and 5.3. They are related to different problems and benchmark scenarios defined in chapter 3. For both cases, the basic mission scenario and conditions are described, and the modelling and estimation algorithms are discussed. Validation is shown using simulations in MATLAB and HIL simulations as well as a report on real sea trials. Section 5.3.8 concludes the chapter.
In both scenarios discussed in chapter 5, the planning of the concrete movement of the vehicles is not part of the estimation process; it is assumed that the robots move according to their own mission plan or even randomly. This gives rise to the question whether the navigation results could be improved if the movement could be actively controlled, or whether active control could reduce the number of vehicles necessary to perform the mission, in order to save resources. To answer this question, research in the field of Optimal Sensor Placement is necessary. This topic is tackled in chapter 6. Section 6.1 presents the general idea behind Optimal Sensor Placement and comments on one of the most important criticisms, namely that the position of the target to be estimated must be known in order to compute optimal positions for the sensors. In section 6.2, an example from literature is discussed, where an optimal angular configuration of distance‐measuring sensors around the target is computed using the Maximum Likelihood function and the Cramér‐Rao bound.
These results are then transferred into a marine robot scenario in section 6.3, and the question is asked where to place a team of surface robots to maximize the amount of information gathered by range measurements to a target, especially which range should be chosen. The computed results are then validated by numerical Monte Carlo simulations, employing a
method which was introduced earlier in section 5.1. It is shown that the optimal simulation result is indeed achieved at the previously computed range. The discussion continues in section 6.4, now allowing the movement of the robots. The empirical observability Gramian is used as the method. As described before, for this chapter we assume that the target position/trajectory is known and can therefore be used for computation. In section 6.4.1, an introduction to the usage of the Gramian is given by comparing the level of observability depending on which navigation parameters of the target must be estimated and which are assumed to be measurable. It is shown that a scenario which is widely used in marine robotics indeed shows the best performance in terms of observability. In section 6.4.2, the Gramian approach is used to compute optimal trajectories for a robot that has the task of estimating a target position based on range‐only measurements. This is done for several subscenarios, and the results of the simulations are compared with similar ones from literature.
The conclusion of chapter 6 will be that the concept of Optimal Sensor Placement seems to be suitable for marine robot navigation. However, the results are not of practical use, as the target position/trajectory was always assumed to be known, which it is not in reality. Nevertheless, the question arises whether it is possible to combine the concepts of Cooperative Navigation according to chapter 5 with the Optimal Sensor Placement according to chapter 6. The idea is to estimate the target position and simultaneously use the current estimate to perform the Optimal Sensor Placement computation. This concept is discussed in chapter 7. After the basic idea is explained in section 7.1, a simple example is given in section 7.2. For this example, the estimation approach of section 5.2 is combined with results from section 6.3 and a method to compute optimal sensor positions from literature.
Finally, in section 7.3 a report is given on an approach to combine the estimation procedure of section 5.2 with the trajectory planning concept of section 6.4.2. With this idea, it is possible to perform position estimation of a submerged target based on range‐only measurements from only one surface craft which moves on an optimized trajectory. The concept is validated in MATLAB simulation. The notation Simultaneous Trajectory Planning and Position Estimation (STAP) is suggested for this method. Finally, chapter 8 concludes the thesis and gives an outlook on possible further research.
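To give a first impression of what range‐only position estimation from a single moving craft involves, the following minimal sketch estimates the horizontal position of a static target from ranges measured at several waypoints. It is an illustration only and not the STAP method of this thesis: it uses a plain Gauss‐Newton least‐squares iteration, assumes noise‐free ranges in two dimensions, and does no trajectory planning. The function name `estimate_position`, the waypoints, and the target position are hypothetical.

```python
import math


def estimate_position(craft_positions, ranges, initial_guess, iterations=20):
    """Gauss-Newton least squares for a static 2D target from range-only data.

    craft_positions: list of (x, y) points where the craft measured a range.
    ranges: measured distances from those points to the target.
    """
    x, y = initial_guess
    for _ in range(iterations):
        # Accumulate the 2x2 normal equations (J^T J) d = J^T r.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for (px, py), r_meas in zip(craft_positions, ranges):
            dx, dy = x - px, y - py
            r_pred = math.hypot(dx, dy)
            if r_pred < 1e-9:
                continue  # Jacobian is undefined exactly at a craft position
            jx, jy = dx / r_pred, dy / r_pred  # partial derivatives of the range
            residual = r_meas - r_pred
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 += jx * residual
            b2 += jy * residual
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-12:
            raise ValueError("degenerate geometry: ranges do not fix the position")
        # Solve the 2x2 system in closed form and apply the update step.
        x += (a22 * b1 - a12 * b2) / det
        y += (a11 * b2 - a12 * b1) / det
    return x, y


# Hypothetical example: four waypoints of one surface craft, noise-free ranges.
waypoints = [(0.0, 0.0), (60.0, 0.0), (60.0, 40.0), (0.0, 40.0)]
true_target = (20.0, 10.0)
ranges = [math.dist(p, true_target) for p in waypoints]
estimate = estimate_position(waypoints, ranges, initial_guess=(30.0, 20.0))
print(estimate)  # close to (20.0, 10.0) for these noise-free ranges
```

The `ValueError` branch hints at why the trajectory matters: if the normal‐equation matrix becomes singular, for example when all line‐of‐sight directions from the waypoints to the current estimate are parallel, the ranges no longer determine the target position.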
1.3 Single‐ and Team‐Oriented Approaches for Autonomous Systems
In this section, some of the most important statements of the PhD dissertation of the author are summarized. The dissertation has been published as Glotzbach, 2009. Parts of the discussions in this section have been published in Glotzbach, 2004a, Glotzbach, 2004b, Glotzbach and Wernstedt, 2006, Glotzbach et al., 2007 and Glotzbach et al., 2010. This section is to show that the author significantly changed his research area after gaining the PhD degree. The author was studying control concepts for autonomous mobile systems, mainly in the land domain. During those studies, several questions arose concerning the exact definition of the term ‘autonomous’, as this term is used with different meanings in the robot community, and there is no unique and generally accepted definition. In general, if autonomy is seen as the ability to operate on one's own, without interactions from the outside, then the following questions could arise:
Can a robot be denoted as autonomous if it receives information from other instances, that is, from outside of its own sensors, e.g. from a central station? If not, any robot using GPS
would not be autonomous. If yes, what if the robot receives direct steering commands from the central computer? Is it then still autonomous, or simply tele‐operated?
Can a human operator intervene in the mission execution without destroying the autonomy? This question expands the aspects of the first point. One could argue that any intervention of the human cancels the autonomy; however, in some situations it might be of significant importance that certain activities are explicitly initiated by a human operator, e.g. if the robot carries weapons or if it is able to cause severe damage due to its size and/or velocity. On the other hand, even modern computers cannot compete with a human in cognition, so it could be desirable that the human capabilities are used.
In terms of robot teams, is it not a contradiction to speak of a ‘cooperative team’ of ‘autonomous systems’? Again it can be discussed whether a single system can be denoted as autonomous in a case where it receives its complete steering commands from a neighboring robot.
The author suggested the concept of adaptive autonomy. It is based on the idea that the notion ‘autonomous’ is not understood in a discrete, but rather in a continuous manner as a description of the current robot state. The state can actually vary between 100% autonomous (total autonomous) and 0% autonomous (remote controlled, tele‐operated), while all states in between are referred to as ‘semi‐autonomous’. A semi‐autonomous robot might still be in contact with a central computer or even a human and receive commands and information from it. An important feature of the overall concept is that the level of autonomy of each robot can change during the mission. This might be a conscious decision of the robot control software (e.g. lowering the autonomy level to ask the operator for help because the situation is unclear), or happen automatically (e.g. raising the level of autonomy in case of a loss of communication with the central station).
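The idea of a continuous autonomy level that changes during the mission can be illustrated with a small sketch. It is not the implementation from the cited publications: the function names, the inputs `situation_clarity` and `link_available`, and the threshold value are hypothetical and chosen only to mirror the two examples given above (lowering the level in an unclear situation, raising it when communication is lost).

```python
def update_autonomy_level(current_level, situation_clarity, link_available,
                          clarity_threshold=0.5):
    """Return an autonomy level in [0.0, 1.0].

    0.0 means remote controlled (tele-operated), 1.0 means total autonomous.
    situation_clarity is a hypothetical score in [0.0, 1.0] describing how well
    the robot understands its current situation.
    """
    if not link_available:
        # No contact with the central station: nobody can help,
        # so the robot has to act on its own.
        return 1.0
    if situation_clarity < clarity_threshold:
        # Unclear situation: hand more authority to the operator.
        return max(0.0, min(current_level, situation_clarity))
    # Otherwise keep the current level, clamped to the valid range.
    return max(0.0, min(1.0, current_level))


def classify(level):
    """Map a numeric level to the terms used in the text."""
    if level >= 1.0:
        return "total autonomous"
    if level <= 0.0:
        return "tele-operated"
    return "semi-autonomous"


# Hypothetical example: the situation becomes unclear while the link is up.
level = update_autonomy_level(0.8, situation_clarity=0.2, link_available=True)
print(level, classify(level))
```

A real system would derive such inputs from sensor data and the mission plan; the sketch only shows that the level is a number that can move in both directions during a mission.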
Figure 1‐1: (Total) autonomous and semi‐autonomous systems (Glotzbach, 2004a)
The suggested concept offers answers to the questions raised above. The interaction of mobile robots, a central station and a human operator can be organized in the way required by the
1. Introduction
current mission. This makes it possible to include human cognitive capabilities and to allow the human to make critical decisions if necessary, but also to relieve him of dull activities (one of the motives for the introduction of autonomous robots according to section 1.1), which the robot can handle on its own. The proposed notation is shown in Figure 1‐1: The single robots are referred to as semi‐autonomous systems. The central computer and the human operator are considered to be a part of the (total) autonomous system, yet the system borders are flexible and can be changed, e.g. to exclude the human in cases in which his interaction is not required. The central computer (or central station) shown in the figure can also be a metaphor for the team control software, which might be realized inside one of the systems, as software running on its computational hardware.
Figure 1‐2: Adaptive Autonomy: the level of autonomy can be changed (Glotzbach et al., 2007)
On this basis, the concept of adaptive autonomy can be defined as the ability of the semi‐autonomous robots to change their individual level of autonomy during the mission. This task is performed by an adapter which is part of the overall control software, as depicted in Figure 1‐2. The adapter has access to information about the current situation via the sensor information of its own vehicle, as well as to the concrete task, which is usually described within a mission plan. Based on this information, it has to compute the current level of autonomy in the range between 0% and 100%. This can be done in a discrete manner (with only a limited number of levels, each with exactly defined activities which the robot may or may not execute) or in a continuous one, where e.g. the exact value limits the permissible speed the robot is allowed to move at. If the control software is realized in the described way, the behavior of the robots can be compared with the cascade control concept of control theory, as shown in Figure 1‐3. In the inner loop, the systems can react very quickly to the current situation. But as they only have access to local information, which means the information recorded by their own sensors, the accuracy of their activities in comparison to their mission goals might be limited. If the adapter lowers the autonomy level and allows an interaction with the central computer, more time is required due to the delay of the radio communication. In exchange, the central computer will usually possess more detailed information and greater computing power, thus enabling the robots to achieve better results. A human operator needs even more time for recognition and
deciding, but he has the best capabilities in cognition. It is therefore true to say that the outer loops are more precise than the inner loops, but the inner loops are faster – exactly like in cascade control.
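The adapter described above can be illustrated with a minimal sketch. It is not the author's implementation: the `Situation` fields, the numeric levels, the three discrete bands, and the linear speed scaling are all assumptions chosen only to show the discrete and the continuous interpretation of the autonomy level side by side.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    """Hypothetical adapter inputs; the field names are illustrative only."""
    link_available: bool   # can the central station currently be reached?
    situation_clear: bool  # does the robot judge its local picture as unambiguous?


def autonomy_level(s: Situation) -> float:
    """Return a level of autonomy in percent (0 = tele-operated, 100 = total autonomy)."""
    if not s.link_available:
        # Communication loss: the level is raised automatically.
        return 100.0
    if not s.situation_clear:
        # Unclear situation: lower the level so that the operator can help.
        return 20.0  # assumed value
    return 70.0  # assumed nominal value


def discrete_level(level: float) -> str:
    """Discrete interpretation: a small number of named levels (assumed bands)."""
    if level >= 90.0:
        return "autonomous"
    if level > 10.0:
        return "semi-autonomous"
    return "tele-operated"


def speed_limit(level: float, v_max: float) -> float:
    """Continuous interpretation: the level scales the permissible speed.

    The linear scaling is an assumption made for this sketch.
    """
    level = min(100.0, max(0.0, level))
    return v_max * level / 100.0
```

A call such as `speed_limit(autonomy_level(Situation(True, False)), 2.0)` would then yield the speed cap for a robot that has asked the operator for help, under the assumed values above.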
Figure 1‐3: Interpretation of the suggested concept: a standard control loop with cascade control (Glotzbach et al., 2010)
Figure 1‐4: The Rational Behavior Model (RBM), (Glotzbach and Wernstedt, 2006)
In general, different approaches for the realization of the control software of mobile robots and systems can be found in the literature. A hierarchical concept is proposed in Albus et al., 1989, as an example. Brooks, 1991, suggested a peripheral concept. For hierarchical realizations, architectures following the so‐called Rational Behavior Model (RBM) are often employed. The RBM was introduced by Kwak et al., 1992, and further developed by Byrnes, 1993. It consists of the three levels strategy, tactics and execution, as displayed in Figure 1‐4, which also shows the tasks typically linked to each level from a marine robotics point of view. The Autopilot is part of the Executive Level. It is directly linked to the actuators of the vehicles and therefore needs to create direct propulsion commands, at a very low level of abstraction, but at high frequencies. The Tactical Level is responsible for the execution and supervision of whole
maneuvers, which means it will act at a lower frequency and at a higher level of abstraction. The monitoring of the overall mission plan, as well as adjusting and replanning if necessary, is performed at the Strategical Level. Marine robots usually possess a pre‐planned mission plan, containing their tasks during the autonomous operation phase. The RBM architecture has successfully been employed in projects dealing with the autonomy of single marine robots, e.g. in Pfützenreuter, 2003. Therefore, it was also used as the basis for team behavior, e.g. in the research project GREX (see section 1.4.1).
How can the idea of a ‘level of autonomy’ and cooperation in a team be combined? To this end, we assume that every vehicle possesses its own level of autonomy as displayed in Figure 1‐2. The current level can be computed by the control software, which has access to the vehicle sensors, the mission plan, and the communication with the other craft. As a representation of the current cooperation, we also introduce a ‘team instance’, which has a level of autonomy attached to it as well. The team instance can be understood as a metaphor for the software responsible for the cooperation between the vehicles. It might run on the computational hardware on board the leading vehicle, or in a peripheral manner on several vehicles (see Figure 1‐9 and Figure 1‐10). With the autonomy level of the team instance being designed in the same way as the ones for the single vehicles, the mentioned antilogy between ‘single autonomy’ and ‘cooperation’ is resolved. This is demonstrated in the following small examples:
Figure 1‐5: Perfect Situation: A Team of AUVs in a coordinated formation (Glotzbach et al., 2007)
Figure 1‐6: AUVs in a disturbed formation (Glotzbach et al., 2007)
Figure 1‐5 displays the movement of a team of AUVs in close formation. As stated before, the limited communication abilities in the underwater environment must be kept in mind when the control approach is realized. For this reason, the vehicles will usually operate at a higher level of autonomy than land or aerial vehicles would in a comparable situation. In the research project GREX (see section 1.4.1), this challenge was solved by letting the vehicles move on preplanned trajectories that had been designed in such a way that the formation would be intact if the vehicles were able to follow the trajectories exactly. During the rare communications, the vehicles compare their positions to find out whether the formation is still intact or within small acceptable deviations. If this is the case, they will just continue to follow their individual trajectories, remaining at a high level of autonomy. This is shown in Figure 1‐5, where consequently the autonomy level of the team instance is very low; it represents the periodic checks whether the formation is still intact. (Note that the elements on the left side of the figure represent the autonomy scale, according to the description given in Figure 1‐2.) If the team instance detects a discrepancy in the formation, it will intervene by raising its level of autonomy, automatically reducing the ones of the vehicles. It will then compute new reference values for the speeds of the vehicles in order to reestablish the formation, as shown in Figure 1‐6. This activity would be performed in the Executive / Autopilot Layer of the RBM as described before.
Figure 1‐7: A team of AUVs is avoiding an obstacle (Glotzbach et al., 2007)
Another intervention of the team instance is shown in Figure 1‐7. After the vehicles have detected an obstacle on their originally planned trajectories, the old formation is cancelled and changed into a line formation to allow for easier passing. This is a far more severe intervention of the team instance; therefore, its autonomy level is raised more than in the last example, leading to lower levels for the single vehicles. The intervention would be executed at the Tactical Layer / Maneuver Management of the team instance according to the RBM. Finally, a situation might emerge that requires a massive intervention of the team instance in the mission plans of the single vehicles. Figure 1‐8 displays a situation after one vehicle has had a serious accident. The team instance therefore cancels the current mission and provides new mission plans, in which two AUVs are commanded to surface to contact the human operator via radio, while one vehicle is commanded to stay with the disabled vessel to ease the later recovery. This intervention is performed at the Strategical Layer / Mission Management.
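The three examples can be summarized in a small sketch that maps each situation to the RBM layer at which the team instance intervenes and to an autonomy level for the team instance. The event names, the percentages, and the rule that each vehicle level is simply 100 minus the team level are assumptions made for illustration; the text only states that raising the team level lowers the vehicle levels.

```python
from enum import Enum


class Layer(Enum):
    EXECUTIVE = "Executive Level / Autopilot"
    TACTICAL = "Tactical Level / Maneuver Management"
    STRATEGICAL = "Strategical Level / Mission Management"


# Hypothetical events and team-instance autonomy levels in percent.
# Only the ordering of the levels reflects the examples in the text.
INTERVENTIONS = {
    "formation_intact": (None, 10.0),                 # periodic checks only
    "formation_disturbed": (Layer.EXECUTIVE, 40.0),   # new speed references
    "obstacle_detected": (Layer.TACTICAL, 70.0),      # switch to line formation
    "vehicle_accident": (Layer.STRATEGICAL, 95.0),    # new mission plans
}


def team_intervention(event: str, n_vehicles: int):
    """Return (RBM layer, team autonomy level, list of vehicle autonomy levels)."""
    layer, team_level = INTERVENTIONS[event]
    # Assumed coupling: what the team instance gains, each vehicle gives up.
    vehicle_levels = [100.0 - team_level] * n_vehicles
    return layer, team_level, vehicle_levels
```

For instance, `team_intervention("obstacle_detected", 3)` returns the tactical layer together with a reduced level for each of the three vehicles.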
Figure 1‐8: Accident of a single vehicle – New Mission Plans (Glotzbach et al., 2007)
Figure 1‐9: Hierarchical realization of the team instance (Glotzbach and Wernstedt, 2006)
In general, the described team instance can be realized in different ways. As discussed for single vehicles before the introduction of the RBM, the general concepts ‘hierarchical’ and ‘peripheral’ come to mind. While we assume that the control system for a single robot is usually realized in a hierarchical manner, especially for marine robots, a first idea for the realization of the team instance in the same manner is shown in Figure 1‐9. The team instance is also realized according to the RBM. The related software runs on the hardware of one particular vehicle that can be tagged as the leader. In scenarios with poor communication possibilities, as in marine robotics, it is natural to distribute the software to several vehicles and to come up with a concept that allows for stable conditions even if long periods of
communication losses happen. Yet, there is still a clear hierarchy, and it is strictly defined which vehicle can send which kinds of commands or information to others. Interactions can occur in the way described in Figure 1‐5 to Figure 1‐8.
A different concept is the usage of a peripheral team instance. Figure 1‐10 shows a suitable structure, in which the software of the team instance is equally spread among all team members. Note that the team instance part in every vehicle is also realized according to the RBM; that is just one possibility and can also be done in a different way. In general, there is no leader in this approach, and every vehicle might decide to deviate from the general mission plan based on its own perception. The activities can be coordinated as long as communication contact exists. It can be assumed that this concept is less sensitive to communication losses, but it is also necessary to formulate strict and simple rules to avoid ambiguities. When comparing both approaches, it can be stated that they follow different concepts and might lead to different behavior. The hierarchical approach results in clear structures for supervision and control, but the requirements on the communication abilities are high. In the peripheral approach, the vehicles can adapt their communication requirements to the current situation, which can be seen as a great advantage of this method. On the other hand, the lack of clear structures can cause a lot of problems.
Figure 1‐10: Peripheral realization of the team instance (Glotzbach and Wernstedt, 2006)
These observations lead to the suggestion of two different terms for the realization of the control structure for robot teams. As shown in Figure 1‐11, the term team or group is used as an umbrella term. Inspired by biological archetypes, the term pack was suggested for teams with strong hierarchical structures, and the term swarm for peripheral ones. Both concepts have been used to successfully realize different robot teams, especially in the land
domain. The author demonstrated the realization of a robot pack of three land robots in simulation in Glotzbach, 2004a, employing mainly rule‐ and graph‐based control strategies. Peripheral organizations are widely used in the simulation of ant swarms, with a focus on optimization tasks, as in Dorigo et al., 1996. In these realizations, evolutionary or stochastic control strategies are often employed. This fact alone usually requires a large number of team members, due to the law of large numbers in probability theory. Also, these strategies usually accept the loss of a small number of team members. This is not a problem in simulation, e.g. to solve optimization tasks, but might be difficult for real teams. This might be especially true for marine robots, where current vehicles are very expensive and shall be retrieved at all costs.
Figure 1‐11: Different approaches for teams in the area of unmanned vehicles (Glotzbach et al., 2007)
Figure 1‐12 gives another overview of the principal differences between swarms and packs, also including land and air robots. The horizontal axis represents the ability of the vehicles to extend the boundaries of the total autonomous system. One could also say that it describes the ability to lower their own level of autonomy for the sake of the team instance, the central computer, or the human operator. A point further to the right means that the level of autonomy can be lowered more. The vertical axis represents the number of vehicles in the team: a lower point stands for a lower number of team members (usually with better equipment), while a higher point represents a higher number of usually less well equipped robots. The dotted line shows the general relation between these two properties, and also demonstrates at which ends the terms swarm and pack are used: Swarms will usually consist of a high number of (preferably) low‐cost systems, which will operate at a higher overall level of autonomy in a team (because there is no strict hierarchy in which a single vehicle is told exactly what to do). Packs, on the other hand, comprise fewer, more specialized members which are able to reduce their level of autonomy more in a team in order to follow commands issued by the leader, the central computer, or even a human operator.
As a conclusion, both discussed methods offer advantages and disadvantages. When employing the concept of adaptive autonomy, it is also possible to find realizations somewhere between the poles denoted as pack and swarm. The author introduced the concept of adaptive autonomy into the research projects he was involved in (see the following section), and as these projects were dealing with marine robots, in general more hierarchical realizations were implemented. This can be assumed to be the typical way in marine robotics. Due to the high cost of the vehicles and the difficulties of the human operator to directly intervene in cases of
problems, stochastic control structures are more difficult to implement. It can be stated that most marine robot teams realized so far in the area of research and development follow the hierarchical concept more closely, although it shall be mentioned that different strategies are also being experimented with, e.g. in the current subCULTron project, see subCULTron, 2017. The future might witness realizations with different combinations of the two concepts shown, always aiming to fit the concrete organization to the current mission scenario.
Figure 1‐12: Relation of number of team members and boundary of autonomy in packs and swarms (according to Glotzbach, 2004b)
1.4 Review of Selected European Research Projects in Cooperative Marine Robotics
The author collected his experience in marine robotics mainly by participating in three research projects, which will be presented in this section. The project GREX ran during the time when he was working towards his PhD degree. The research related to the intended habilitation was done during the projects CONMAR and MORPH.
1.4.1 GREX
‘Grex’ is the Latin word for a herd or flock, which describes the general goal of the project. The idea was the development of a conceptual framework and middleware system in order to coordinate a team of diverse, heterogeneous physical objects (robotic vehicles) working in cooperation to achieve a well‐defined goal in an optimized manner. From a practical point of view, the aim was to enable existing marine robots, which would be provided by some of the project partners and which were originally developed for single‐autonomous behavior, to become part of the GREX team. Several mission scenarios from the area of marine biology were defined as guidelines for the development, in order to align the activities of the partners to a common goal.
The research project GREX was funded by the Sixth Framework Programme of the European Community (FP6‐IST‐2006‐035223) and ran from 2006 to 2010. The following companies and institutions participated in the project: ATLAS ELEKTRONIK GmbH (Germany), Centre of IMAR (Institute of Marine Research) at Department of Oceanography and Fisheries at the University of the Azores (Portugal), Ifremer (France), Innova S.p.A. (Italy), Instituto Superior Tecnico IST ‐ Lab: Institute for Systems and Robotics ISR (Portugal), MC Marketing Consulting (Germany), SeeByte Ltd. (United Kingdom), Technische Universität Ilmenau (Germany).
Figure 1‐13: Two of the GREX mission scenarios: Fish data download (left), Marine habitat mapping (right)
Figure 1‐13 shows two of the intended scenarios. The left picture depicts the fish data download. Marine biologists are interested in collecting data about fish, such as the paths they travel or the depths they stay at. To this end, fish are caught, fitted with a small electronic device, and released. Retrieving the data is a big challenge, though. A possible solution is the employment of marine robots. Two autonomous surface vehicles have to track the fish, following the acoustic pings sent by the device, and then guide an AUV close enough to the fish to download the collected data. The picture on the right side of Figure 1‐13 depicts the Marine Habitat Mapping. In this scenario, an AUV is linked to an autonomous surface craft by an optical cable, while both vehicles execute a preplanned mission plan in the form of a lawnmower pattern. The AUV collects video or sonar data which is immediately sent through the cable to the surface vehicle, and from there via radio to a central station where a human operator checks the data online. As soon as he detects an interesting feature about which he would like to gain more information, he can send another AUV, which has been hovering in the area, to execute a finer lawnmower pattern in order to collect more data.
The following major objectives were addressed during the project: development of a user interface for multiple vehicle mission preparation, programming, and post‐mission analysis; a generic control system for multiple marine vehicle cooperation, taking into account mission alteration and event‐triggered actions on the fly; a cooperative navigation solution for relative positioning; a generic multichannel communication system (LAN, radio, and underwater acoustic); realization and validation by sea trials.
Figure 1‐14: Overview of the GREX main components
Figure 1‐14 depicts the main components. Mission plans for the whole team could be created by a suitable planning software, which would automatically translate them into individual plans for every single vehicle. These plans were then transferred to the original control stations of every vehicle, where they were translated into a mission plan in the individual language of the particular vehicle. Thus, it was possible to include heterogeneous vehicles in the GREX team, which was one of the basic goals. All vehicles were equipped with additional computational hardware, referred to as the ‘GREX Black Box’, which ran all the GREX‐related control software. A Team Handler was responsible for adapting the execution of the individual mission plans in order to guarantee cooperative behavior. The GREX Black Box also ran the software for communication, employing acoustic and radio modems, and for the cooperative navigation, which was mainly based on the usage of the individual navigation information of the vehicles. A GREX interface module was responsible for the data exchange between the software on the GREX Black Box and the proprietary control systems of the single vehicles.
Figure 1‐15: GREX Final Trials in Sesimbra
The research work was validated during the final sea trial in Sesimbra, Portugal, during September 2009. Figure 1‐15 shows the four participating vehicles: SeaCat (Atlas Elektronik), Vortex (Ifremer) and the two autonomous catamarans DelphimX and Delphim (IST) (left column, from top to bottom). The right side shows the central command center.
© The GREX Consortium
Figure 1‐16: Cooperative Path Following (CPF)
Figure 1‐17: Cooperative Line of Sight Target Tracking (CLOSTT)
One of the realized abilities of the team was to move in a close formation at the sea surface. To this end, each vehicle was provided with a single‐vehicle mission plan that contained an aligned path to follow. During mission execution, the vehicles exchanged their position data. The team handlers on board the vehicles computed the percentage of fulfillment of the current path segment for all vehicles and commanded a new speed for their individual vehicle in order to keep or reestablish the formation. An example is shown in Figure 1‐16. The four vehicles had to maintain a formation in which all four of them would move in a row. The colored dots in the figure show the original positions of the vehicles according to the GPS measurement. As can be seen in the figure, the vehicles maintain their formation even during the arc maneuver, which requires the vehicles on outer lanes to move at a higher speed. This adjustment is done automatically by the team handler. The procedure is denoted as Cooperative Path Following (CPF). Figure 1‐17 shows the successful execution of another mission procedure, denoted as Cooperative Line of Sight Target Tracking (CLOSTT). It was inspired by the Fish Data Download scenario, as explained above. In the final validation, the two catamarans, one of which was equipped with an acoustic modem, had to follow a submerged buoy with an acoustic modem, which was towed by a manned vehicle maneuvering in an “unforeseen” manner. The manned craft carried a GPS receiver, and the position measurement was sent to the submerged buoy and from there to the catamarans via acoustic communication. The catamarans had to follow the buoy while at the same time maintaining their line formation. In a real Fish Data Download scenario, this would be of interest, as the position of the tagged fish would then be estimated via range measurements based on the acoustic ping sent by the electronic device. In Figure 1‐17, two successful executions are depicted.
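As an aside on the Cooperative Path Following procedure described above, the speed adjustment of a team handler can be sketched as follows. The proportional correction law, the gain, and the function names are assumptions for illustration and are not taken from the GREX software; the sketch only shows how a percentage of fulfillment per vehicle can be turned into speed commands, and why vehicles on outer lanes of an arc need a higher nominal speed.

```python
def arc_nominal_speeds(radii, v_ref):
    """Nominal speeds on concentric arcs so that all vehicles finish together.

    The arc length is proportional to the radius, so the speed is scaled
    by the radius relative to the mean radius (assumed reference lane).
    """
    r_ref = sum(radii) / len(radii)
    return [v_ref * r / r_ref for r in radii]


def cpf_speed_commands(progress, nominal_speeds, gain=0.5, v_min=0.0):
    """Compute one speed command per vehicle.

    progress: fraction (0..1) of the current path segment each vehicle
    has completed. A vehicle ahead of the team average is slowed down,
    a vehicle behind it is sped up (hypothetical proportional law).
    """
    mean_progress = sum(progress) / len(progress)
    commands = []
    for p, v_nom in zip(progress, nominal_speeds):
        correction = 1.0 + gain * (mean_progress - p)
        commands.append(max(v_min, v_nom * correction))
    return commands
```

With equal progress values the nominal speeds are returned unchanged; a vehicle that lags behind receives a command above its nominal speed.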
The curvilinear line represents the path covered by the manned craft, while the other solid lines show the paths covered by the catamarans. The planned paths for the catamarans are depicted as dashed lines. As a conclusion, in the GREX project it was possible to successfully realize cooperative missions of heterogeneous marine robots, albeit still on a very basic level. A middleware system as well as all the other required components were developed and validated in sea trials.
1.4.2 CONMAR
CONMAR (Cognitive Robotics: Cooperative Control and Navigation of Multiple Marine Robots for Assisted Human Diving Operations) was the name of the project that the author and Prof. António Pascoal proposed and which was funded by the Seventh Framework Programme of the European Community in the framework of a Marie Curie Intra European Fellowship (no. 255216). It enabled the author to perform research at the Instituto Superior Técnico, Lisbon, Portugal, under the supervision of Prof. Pascoal during a period of 18 months from 2010 to 2011. The project CONMAR was related to the research project Co3‐AUVs (see also Birk et al., 2011), which was funded by the European Commission in the framework of the Seventh Framework Programme (no. 231378). It ran from 2009 to 2012 and was performed by the following partners: Jacobs University Bremen (Germany), Interuniversity Center Integrated Systems for Marine Environment ISME (Italy), Instituto Superior Tecnico / Institute for Systems and Robotics, IST/ISR (Portugal), and Graaltech (Italy). The logos and the base scenario are depicted in Figure 1‐18. The underlying mission scenario of CONMAR aimed at cooperative navigation and control of networks of autonomous marine robots, which had to work together with humans in the loop. The basic idea is to address situations in which the employment of human divers is important
for the execution of a mission, like a scientific exploration in the framework of marine biology or underwater archeology. As discussed in section 1.3, it is reasonable to include humans in several missions due to their tremendous cognitive capabilities, which are far beyond the possibilities of robot systems. However, the mission execution might expose the human diver or the team of divers to serious dangers, e.g. if visibility is very poor. The diver is in danger of becoming entangled in the submarine structures he plans to investigate, or of getting lost in an unstructured environment.
Figure 1‐18: The basic inspiration of the research projects Co3‐AUVs and CONMAR
As a safety feature, he could cooperate with a team of marine robots. To this end, he is equipped with a pinger that periodically emits an acoustic signal. Prior to his mission start from a supply ship or the shore, he has launched a team of small marine robots carrying hydrophones into the water. By maintaining a specific formation and by receiving the pings sent by the diver equipment, the marine robots should be able to continuously estimate the diver position in an absolute frame, given that they are also equipped with GPS sensors and remain at the surface, thus being able to communicate with each other via radio. The team can therefore also use GPS to synchronize their clocks; however, it shall be assumed that the clocks do not need to be synchronized with the one carried by the diver. Figure 1‐19 provides an overview of the concrete mission scenario. In the Co3‐AUVs project, the diver could plan a path he intends to follow prior to the mission. The robot team would then use the position estimate to provide heading information to the diver in order to keep him on the planned path. These heading commands were sent to the diver via acoustic communication and visualized by light‐emitting diodes (LEDs) in his goggles. The position estimation has to be done collectively by the robot team, based on acoustic range measurements and trilateration algorithms. Due to the challenges in using acoustic communication underwater, the overall concept must provide some robustness in cases of temporary communication losses. This required the estimation of additional movement parameters, like speed and heading, and gave importance to the usage of advanced filtering structures that allow the concrete unknown movements of the diver to be included as simple stochastic models.
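To make the trilateration step concrete, the following sketch estimates a horizontal position from ranges to surface robots at known GPS positions. It is a strong simplification and not the estimator developed in the project: it assumes that ranges are directly available (the unsynchronized diver clock is ignored), that the diver depth is known so that slant ranges can be projected onto the horizontal plane, and that a linearized least-squares solution is sufficient. All function names are hypothetical.

```python
import math


def horizontal_range(slant_range, depth):
    """Project a slant range onto the horizontal plane for a known depth."""
    return math.sqrt(max(0.0, slant_range ** 2 - depth ** 2))


def trilaterate_2d(anchors, ranges):
    """Least-squares position from at least three anchors and horizontal ranges.

    Subtracting the range equation of the first anchor from the others
    gives linear equations a*x + b*y = c, solved via the normal equations.
    """
    if len(anchors) < 3 or len(anchors) != len(ranges):
        raise ValueError("need at least three anchors with one range each")
    (x0, y0), r0 = anchors[0], ranges[0]
    s_aa = s_ab = s_bb = s_ac = s_bc = 0.0
    for (xi, yi), ri in zip(anchors[1:], ranges[1:]):
        a = 2.0 * (xi - x0)
        b = 2.0 * (yi - y0)
        c = r0 ** 2 - ri ** 2 + xi ** 2 - x0 ** 2 + yi ** 2 - y0 ** 2
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_ac += a * c
        s_bc += b * c
    det = s_aa * s_bb - s_ab ** 2
    if abs(det) < 1e-12:
        raise ValueError("anchors are (nearly) collinear")
    x = (s_ac * s_bb - s_ab * s_bc) / det
    y = (s_aa * s_bc - s_ab * s_ac) / det
    return x, y
```

In practice such a snapshot solution would only serve as a measurement or initial value for the filter mentioned above, which additionally estimates speed and heading.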
In this respect, the CONMAR project went beyond the scope of Co3‐AUVs in its goal to also consider optimal placements of the marine robots at the surface, in order to continuously follow the submerged diver while at the same time choosing a formation that maximizes the amount of information gained by the acoustic range measurements. The principle is depicted
in Figure 1‐20. If the robots (‘Scouts’) have no knowledge of the current position of the diver, they might use the available range measurement information to perform a rough estimate of the diver position. This enables them to compute a vehicle configuration that would be optimal for themselves if the diver were at the estimated position. After moving to the desired configuration, they can perform another range measurement and estimate the diver position with a greater level of accuracy.
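One common way to quantify the 'amount of information' of a sensor configuration is the determinant of the Fisher information matrix (FIM). The sketch below evaluates it for range-only measurements in the horizontal plane. The choice of this criterion, the two-dimensional simplification, and the assumption of independent range errors with equal standard deviation `sigma` are assumptions made here for illustration; this section does not state which criterion the project used.

```python
import math


def fim_determinant(sensors, target, sigma=1.0):
    """Determinant of the 2D FIM for range-only measurements.

    Each sensor contributes u * u^T / sigma^2, where u is the unit vector
    pointing from the assumed target position to the sensor.
    """
    tx, ty = target
    f_xx = f_xy = f_yy = 0.0
    for sx, sy in sensors:
        dx, dy = sx - tx, sy - ty
        d = math.hypot(dx, dy)
        if d == 0.0:
            raise ValueError("sensor coincides with the target position")
        ux, uy = dx / d, dy / d
        f_xx += ux * ux
        f_xy += ux * uy
        f_yy += uy * uy
    return (f_xx * f_yy - f_xy ** 2) / sigma ** 4
```

Under these assumptions, three robots spread evenly around the estimated diver position give a larger determinant than three robots clustered on one side, which matches the iterative procedure of estimating, repositioning, and measuring again.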
Figure 1‐19: The general mission scenario of the CONMAR project
Figure 1‐20: Maximizing the amount of information gained from range measurements by organizing the surface robots in an optimal manner
During the CONMAR project, it was possible to develop the described filter system and to perform the position estimation successfully in real sea trials. The research activities of the project CONMAR will form a major part of the scientific discussions in chapters 5‐7. The sea trial results will also be presented in that part of the thesis.
1.4.3 MORPH
In the research project MORPH (Marine robotic system of self‐organizing, logically linked physical nodes), the core team of the GREX consortium, together with some new partners, extended the previously realized cooperation of marine robots to solve an even more challenging mission scenario. The project, which lasted from 2012 to 2016, was supported by the European Community within the Seventh Framework Programme (FP7‐ICT‐2011‐7, Project: 288704). It was performed by the following partners: ATLAS ELEKTRONIK GmbH (Germany), Consiglio Nazionale delle Ricerche CNR (Italy), Centre of IMAR (Institute of Marine Research) at Department of Oceanography and Fisheries at the University of the Azores (Portugal), Ifremer (France), Jacobs University Bremen (Germany), Instituto Superior Tecnico IST ‐ Lab: Institute for Systems and Robotics ISR (Portugal), Technische Universität Ilmenau (Germany), Centre for Maritime Research and Experimentation CMRE (Italy), Universitat de Girona (Spain).
Figure 1‐21: Bathymetry map and picture of vertical wall underwater
The base scenario aimed at the mapping of vertical walls in underwater environments, such as reefs and cliffs. It can be stated that the autonomous mapping of horizontal sea floor with a single autonomous vehicle using optical and acoustic sensors had been performed before, but the challenges are a lot bigger if a vertical wall is to be mapped. Figure 1‐21 displays on the left side a 3D bathymetry map of an underwater cliff, which was obtained from measurements collected by a single marine vehicle moving over the area. As can be seen, the level of detail at the wall is low, as the top view of the acoustic sensor is not sufficient to record enough data for a more detailed map. On the right side of the figure, a picture of such a wall is shown. The practical interest in the mapping of vertical underwater walls was expressed by project partner IMAR. These marine biologists are interested in observing the spreading of marine
habitats of various species to detect any changes in their sizes. One group of interest is the cold‐water corals, which are among the oldest living species on the planet and are of great importance in the research to reconstruct past changes in deep‐sea conditions and to understand potential impacts of current climate change. In addition, these species are very vulnerable to different fishing techniques. The scientists are very interested in mapping the habitats of these species, which can often be found on vertical walls of coral reefs, which underlines the importance of the suggested scenario.
Figure 1‐22: The MORPH Supra Vehicle at different cliff walls with the project logo on top
Therefore, the desired final products were seabed maps of complex physical habitats, such as Digital Terrain Maps (DTM) derived from optical and/or sonar measurements, as well as registered and geo‐referenced acoustic and optical imagery. It is straightforward to say that these requirements cannot be met by measurements taken from a vehicle moving above the reef. It is necessary that a vehicle moves along the vertical wall to have a direct view of it. However, the autonomous operation of a vehicle in such an unstructured environment is not easy. The vehicle is in danger of colliding with the wall, which might change its shape rapidly. Doppler Velocity Log (DVL) sensors, which play an important role in the position estimation of autonomous underwater vehicles, are best operated over flat terrain and might fail in close vicinity of a vertical wall, given also that the terrain at the bottom of the wall is usually not perfectly flat, but very unstructured, as can be seen in Figure 1‐21. The usage of Remotely Operated Vehicles (ROV), which are tele‐operated by a human operator, might come to mind as a possible solution. However, the cable of these robots might get entangled in the unstructured environment. In addition, their employment does not solve the problem of position estimation, which is important in order to create referenced maps. The base idea of the project was the employment of a team of autonomous marine vehicles with separated tasks. These single vehicles were considered as the nodes of a larger structure, the so‐called ‘MORPH Supra Vehicle’, and were linked by logical links rather than real physical constraints. The single vehicles would not have to move in a close formation, as in GREX before. Instead, the formation would need to be adapted to the concrete terrain constantly. By changing their relative positions towards each other on a regular basis, the shape of the supra
vehicle would also change, which was denoted as ‘morphing’. Figure 1‐22 depicts the envisaged vehicle team in two scenarios, at a sloped and at an overhanging wall. All vehicles in the team have specific tasks. The Surface Support Vehicle (SSV) is the only surface craft and is responsible for providing highly precise navigation due to its access to GPS measurements. At the same time, it serves as a communication relay between the central command station and the submerged vehicles. The Global Communication & Navigation Vehicle (GCV) routes the communication between the surface craft and the remaining team members, especially when these have to operate outside of line of sight to the SSV, e.g. under an overhanging cliff. It also provides navigation data for its teammates based on acoustic range measurements. The Leading Sonar Vehicle (LSV) is equipped with sonar sensors (multibeam and forward‐looking). As it is at the front of the supra vehicle, its task is to scan the environment for obstacles, like overhanging parts of the wall, and to warn the other vehicles, especially the two camera vehicles, denoted as C1V and C2V. These robots carry the cameras to obtain optical measurements and therefore have to operate very close to the wall. The tasks to be solved in the project comprised the implementation of a middleware system, the realization of navigation, guidance, control, and communication, the necessary upgrades of the single vehicles, which were provided by some of the partners, and finally the processing of the recorded data to create the required maps. For the middleware system, it was agreed to use the Robot Operating System (ROS), which is widely used in the robotics community, see ROS, 2017.
Figure 1‐23: Vehicles used in the sea trials of the MORPH project
The project team was able to fulfill its challenging goals. The functionality of the MORPH supra vehicle could be validated in real sea trials in 2014 and 2015, which were held at the island of Faial, Azores, Portugal. Figure 1‐23 gives an overview of the marine robots which were used
Figure 1‐24: The MORPH supra vehicle; top view during sea trials
Figure 1‐25: Video data of vertical wall, obtained during final trial (top), and derived 3D reconstruction (bottom)
during the sea trials. Figure 1‐24 provides a top view of the MORPH supra vehicle over flat terrain during the trials in 2014. The five marine robots can be recognized, of which four were submerged. They were able to build up a formation, moved in a coordinated manner while data was only
exchanged via acoustic communication, and changed their formation to avoid a virtual obstacle. In the final trials, it was possible to approach a vertical cliff wall and to collect the data required for the creation of the demanded map. Figure 1‐25, which was provided by project partner Jacobs University, shows at the top a picture of the wall which was created by mosaicking single video images and, at the bottom, a 3D model of the wall derived from these data (view from bottom to top). The methods used for these activities are discussed in Bülow et al., 2013 and Pfingsthorn et al., 2012. More details of the project can be found in Kalwa et al., 2015 and Kalwa et al., 2016.
1.5 Contribution of This Thesis to the State of the Art
The scientific contributions of the author are described within chapters 5 – 7. In section 5.2, the setup of a navigation system for the supervision of a human diver by three surface robots is discussed. The discussions are related to work which the author performed during his Marie Curie Fellowship at the Instituto Superior Técnico (IST), Lisbon, Portugal, in the framework of the research project CONMAR. The basic concept is an advancement of the GIB principle, which was taken from literature and is explained in detail. The author developed two new measurement models to deal with the specific requirements of the scenario at hand. An extensive simulation environment was created in MATLAB, which includes the simulation of the participating objects as well as of the communication process, a tool for planning the missions, the execution of the estimation process, and the visualization of the results. Both measurement modeling approaches have been tested in simulations, and the better one was selected for further real trials which were executed by the staff of the IST. The results of this work have been published at a peer‐reviewed IFAC conference. Another concept of cooperative navigation is discussed in section 5.3. In the underlying scenario, one surface and two submerged marine robots have to move along a preplanned path in close formation. The work was done in the framework of the MORPH project. The author contributed the modelling of the participating vehicles and of the communication process, according to the planned real communication procedure developed by project partners. The author created a simulation environment in MATLAB for the movement of the vehicles (without detailed control), for the communication, and for the measurement of range and bearing between the vehicles. Several concepts for relative position estimation have been realized and tested.
A first approach to nonlinear estimation with an Extended Kalman filter did not lead to acceptable results, because the estimation did not converge to the true positions, due to the considered communication losses and the low communication frequency. Following a suggestion from Prof. Antonio Pascoal, the author developed and implemented a linear Kalman filter with linearized pseudo measurements which could successfully overcome the convergence problems in simulations. The filter was validated in HIL (Hardware In the Loop) simulation within the MORPH software environment. The results led to a publication in the at journal. Beyond the MORPH project, the author worked on an improvement of the pseudo linear filter for the problem described last. To avoid the necessity for pseudo linear measurements, the approach of an Unscented Kalman filter was tested. The filter was adapted to the mission scenario mentioned before and successfully evaluated in the original MATLAB simulation environment. This work was performed by a student in the course of her bachelor thesis under supervision of the author. The results have been published at a peer‐reviewed IFAC conference.
In chapter 6, the author discusses his work on Optimal Sensor Placement. Inspired by an article from literature, which is also presented in short form, the author studied optimal ranges between an underwater target and surface crafts with range measuring capabilities as a basis for target position estimation. This work was done in the framework of the CONMAR project. As reported in section 6.3, the author could compute an optimal range for a 3D mission scenario, employing a method based on the Maximum Likelihood function and the Cramer‐Rao Bound. The results have been validated in numerical Monte Carlo simulations and published at a peer‐reviewed IFAC conference. Additional work in the area of Optimal Sensor Placement was done, employing the method of empirical observability gramians. Some initial research to evaluate the general usability of this method was done in the framework of a bachelor thesis, supervised by the author, and presented at an international conference, as shown in section 6.4.1. The author continued this work later as theoretical research during the MORPH project, as reported in section 6.4.2. He studied optimal trajectories for single surface crafts in order to estimate the position of an underwater target, whose position or trajectory is assumed to be known. Optimal trajectories have been found for a static and a moving target, and a theoretical optimal speed of the surface craft for a static target has been found. The results have been compared with similar ones in literature, and were published at a peer‐reviewed IFAC conference. In chapter 7, the combination of the concepts of cooperative navigation and Optimal Sensor Placement is presented.
In section 7.2, some previously unpublished work done by the author during the CONMAR project is presented, in which he combined the results from section 5.2 with a control concept adopted from literature, and used simulations to show that this concept leads to an improvement of the performance. In section 7.3, the author describes how the results of sections 5.2 and 6.4.2 have been combined to perform range‐based position estimation of an underwater object moving on an unknown path by employing only one surface craft, which has to move on an optimized trajectory to maximize the amount of information gathered by the range measurements. This work was done in the framework of a research project at the Technische Universität Ilmenau, supervised by the author, and has been published at a peer‐reviewed IFAC conference.
2 Navigation in Marine Robotics: Methods, Classification and State of the Art
In this chapter, we will define the meaning of the term ‘navigation’ in the relevant application area, review existing sensors and methods, and discuss their characteristics as well as advantages and disadvantages. In this context, we will narrow down the area that will be in the focus of this thesis, where the requirements are related to the underwater robotics domain and to the idea of making use of a cooperating team of robots. In section 2.1, we will define the exact meaning of navigation within the area of marine robotics and discuss the differences that exist, e.g., to the definition in land robotics. We will narrow down the term navigation to the estimation of pose and velocity of a marine agent. In order to describe pose and velocity, we need to introduce suitable coordinates, which will be done in section 2.2. As a result of the different interpretations of navigation in several domains, we will seek a deeper understanding of the terms Guidance, Navigation and Control in section 2.3 to understand the principal working scheme of an autonomous marine robot. Section 2.4 contains a review and literature study on available navigation sensors and methods, and at its end we will already stress which topics will be of further interest within this thesis. As navigation employing acoustic measurements will be in the focus, section 2.5 provides an overview of already existing technologies in this area and will at the end stress the interface to the author's own work, which will be reported from chapter 5 on.
2.1 The Term ‘Navigation’ in Marine Robotics and Other Domains
It is very important to explicitly define the meaning of the term ‘navigation’, as it is used differently in several robotic domains. This can lead to significant misunderstanding when experts of different domains wish to cooperate. To begin the discussion, we shall start with some very basic statements about the meaning of navigation in marine robotics, and the differences to similar terminology, especially in land robotics. First of all, where does the term ‘navigation’ originate from? According to the Random House Webster's Unabridged Dictionary (Dictionary.com, 2016), it originates in the 1530s and goes back to the Latin word navigare “to sail, sail over, go by sea,” which itself was composed from the words navis “ship” and the root of agere “to drive”. So it can be stated that the term is closely related to seafaring. One of the most famous references on this subject is “The American Practical Navigator”, which is often simply referred to as the Bowditch, after its principal author, the American mathematician Nathaniel Bowditch, who wrote the first version in 1802. Later, the copyrights were sold to the US government, and to this day, updated versions of this work are published, incorporating newly developed techniques and methods. The current version is edition 53, edited by the Defense Mapping Agency Hydrographic Topographic Center (DMAHTC) in 2017, see NGA, 2017. In this work, the following is stated in the chapter on “The Art And Science Of Navigation”: “Marine navigation blends both science and art. A good navigator constantly thinks strategically, operationally, and tactically. He plans each voyage carefully. As it proceeds, he gathers navigational information from a variety of sources, evaluates this information, and determines his ship’s position. He then compares that position with his voyage plan, his operational commitments, and his predetermined “dead reckoning” position.
A good navigator anticipates dangerous situations well before they arise, and always stays “ahead of the vessel.” He is ready for navigational emergencies at any time.
© Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2020 T. Glotzbach, Navigation of Autonomous Marine Robots, https://doi.org/10.1007/978-3-658-30109-5_2
He is increasingly a manager of a variety of resources‐‐electronic, mechanical, and human. Navigation methods and techniques vary with the type of vessel, the conditions, and the navigator’s experience. The navigator uses the methods and techniques best suited to the vessel, its equipment, and conditions at hand. Some important elements of successful navigation cannot be acquired from any book or instructor. The science of navigation can be taught, but the art of navigation must be developed from experience.” As one can see, the overall process of planning and executing of the trip of a marine vessel is originally summarized as “navigation”. Interestingly enough, within the marine robotic domain, the definition of navigation differs a lot, as it only includes one part of the stated definition from the Bowditch. To start from a very simple point, it can be stated that navigation in marine robotics is related to the problem: “Where am I? In which direction am I going”, from the point of view of a robot. That means, the navigation system must estimate the pose (position and attitude or orientation and respective rate) and velocity (direction of movement and speed) of an agent, which can be an autonomous vehicle, a diver, a towed buoy etc. It must be kept in mind that the ‘orientation’ and ‘direction of movement’ are not necessarily identical, as the existence of a current in the water can have a strong influence on the agent’s motion. We shall discuss this further in section 2.2.5. The inputs employed by the navigation system are usually noisy measurements of selected variables, like acceleration, turning rate, velocity through water or over ground, or distances to other objects. Also, video and/or sonar pictures can be used as inputs for a navigation system. While keeping that in mind, it is important to mention that the problem “How do I get to a desired position?” is not a part of navigation in marine robotics. 
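The difference between ‘orientation’ and ‘direction of movement’ can be illustrated by a small numerical sketch. It is not taken from the thesis; it assumes a planar north/east frame, a vehicle that moves through the water exactly along its heading (no sideslip), and a constant, known current. The function names and the numerical values are hypothetical.

```python
import math


def velocity_over_ground(heading_rad, speed_through_water, current_north, current_east):
    """Add the water current to the velocity through water (planar case).

    Assumption: the vehicle moves through the water exactly along its heading.
    """
    v_north = speed_through_water * math.cos(heading_rad) + current_north
    v_east = speed_through_water * math.sin(heading_rad) + current_east
    return v_north, v_east


def course_and_speed(v_north, v_east):
    """Direction of movement (course, measured from north) and speed over ground."""
    course = math.atan2(v_east, v_north)
    speed = math.hypot(v_north, v_east)
    return course, speed


# Hypothetical example: vehicle points north at 1.0 m/s, current of 0.5 m/s sets east.
heading = math.radians(0.0)
v_n, v_e = velocity_over_ground(heading, 1.0, 0.0, 0.5)
course, speed = course_and_speed(v_n, v_e)
print(f"heading (orientation):  {math.degrees(heading):.1f} deg")
print(f"course (direction of movement): {math.degrees(course):.1f} deg")
print(f"speed over ground: {speed:.2f} m/s")
```

In this example the vehicle is oriented towards north, but moves over ground on a course of roughly 27 degrees, which is why a navigation system may have to estimate both quantities separately.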
As examples from the literature, Kinsey, 2006 summarizes “Navigation of Underwater Vehicles” as “reliable three‐dimensional position sensing of underwater vehicles”. Thor Fossen writes in his book Marine Control Systems ‐ Guidance, Navigation, and Control of Ships, Rigs and Underwater Vehicles (Fossen, 2002): “Navigation is the science of directing a craft by determining its position, course, and distance traveled. In some cases velocity and acceleration are determined as well.” Note that, according to these definitions, navigation for marine vehicles is related to the process of determining relevant motion parameters of the vehicle, but not to the task of actually controlling the vehicle to reach a defined goal. This is completely different for land robots, where the definition is much closer to the one from Bowditch. We shall look at some literature to clearly point this out: In Hertzberg et al., 2012 (own translation), the authors subsume the following tasks under the term ‘navigation’ for land robots: Planning of the spatial path from a start point to a destination point (which might have to be determined first), and finally the activity to move the robot to the destination point, while considering unforeseen obstacles. Stachniss, 2009, states in his habilitation thesis that “Robot navigation is the process of autonomously making a sequence of decisions that allows a mobile robot to travel robustly to selected locations in the environment. This ability involves a large set of problems that need to be solved including sensor data interpretation, state estimation, environment modeling, scene understanding, learning, coordination, and motion planning”. 
In Koditschek, 1987, we find the following definition of the ‘planar analytic navigation problem’: “Given a desired destination point in the configuration space, 𝑥 , construct an analytic vector field on the configuration space for which 𝑥 is an asymptotically stable equilibrium state whose domain of attraction includes the entire component of the space connected to 𝑥 .”
In Martínez‐García et al., 2011, it is stated that “The robot navigation is the ability to determine its own position within a frame of reference, while planning a path towards the next goal‐location. Hereafter, the purpose is to control the global robot’s in terms of controlling linear speed and yaw.” In summary, one can say that ‘navigation’ in land robotics relates to a bigger area of problems than in marine robotics. In land robotics, the main task is the planning of a path (spatial) or trajectory (spatiotemporal) from a starting to a destination point, considering a priori available information on static obstacles in the area. The actual movement of the robot along the planned path, considering unknown or dynamic obstacles, is mentioned as a part of the navigation problem by Hertzberg et al. and Stachniss (environment modelling, scene understanding), denoted as ‘purpose hereafter’ by Martínez‐García et al., or not mentioned at all by Koditschek. Also, the task that we have explicitly defined as ‘navigation’ in marine robotics at the beginning of this section, namely the pose and velocity estimation of an agent, can be found under different notations: In the book of Hertzberg et al., there is a chapter related to ‘localization in maps’ for the problem of estimating the current position in a given reference system, denoted as map, while in the chapter ‘locomotion’ the authors discuss ways to estimate the movement for several locomotion principles (like differential drive or uniaxial steering) using different filter systems. Stachniss denotes this ability as “sensor data interpretation, state estimation”. Martínez‐García et al. discuss the problem “to have an accurate match between multisensory observations and a robot position” in the chapter ‘Robot Positioning’. Koditschek, once again, puts the focus on the exploration of potential functions for the purpose of path planning and does not discuss the process of estimating the robot position in a real world scenario.
At this point it should be mentioned that estimating the position of a submerged marine robot is a challenging task. Air robots and land robots outside of buildings are usually able to use global navigation in terms of a so‐called Global Navigation Satellite System (GNSS), of which the American NAVSTAR GPS (Navigational Satellite Timing and Ranging – Global Positioning System), furthermore to be denoted as GPS, plays the major role in Europe and the United States, see Xu, 2003 for details. As soon as a marine robot submerges, it no longer has access to this technology. Therefore, other methodologies need to be employed. As a conclusion, the content of the term ‘navigation’ in marine robotics can be found as ‘localization’, ‘movement estimation’ or ‘robot positioning’ in the domain of land robotics. On the other hand, the term ‘navigation’ in land robotics comprises several problems, including path or trajectory planning, that are related to different notations in marine robotics. It is interesting that the definitions from land robotics are much closer to the original one from the seafaring domain by Bowditch than the ones used within the marine robotics domain. A possible reason for the different understanding might be related to the fact that at the very beginning, submerged marine robots had little or no possibilities to communicate with a central station. Therefore, as stated in Kinsey et al., 2006, the traditional way within the ocean engineering community was to perform the path planning a priori, that is, before the robot starts, while only the pose/velocity estimation was performed in‐situ, during the mission, to enable the control system to move the robot along the preplanned path. Any in‐situ change of the path would be unknown to the human operators. Therefore, in case of an emergency situation, the operators had no chance to know where to find the robot if it purposely deviated from the pre‐planned path.
In fact it is the personal experience of the author that back in 2006 it was very difficult to convince marine robotic providers to even allow for very small online
changes to an a priori mission plan, which was one of the big challenges in the mentioned GREX project. Today, with underwater acoustic communication equipment available, which still offers far less bandwidth and reliability than communication through air, the situation is more relaxed, and in‐situ path planning and changing is coming more into focus, as discussed within Section 2.3.4. Nevertheless, the classical definition of navigation in marine robotics remains in use. We will look at the concrete definitions and notations in marine robotics by introducing the terms ‘Guidance’ and ‘Control’ in Section 2.3. For the scope of this thesis, we will use the following definition:
Definition: Navigation in maritime robotics
Navigation refers to the procedures necessary to determine position, orientation/attitude and velocities of a marine object, like a robot, a vehicle, a buoy, or a human diver. Some of these data may be obtained by direct measurement, which will always be superimposed by some noise. Other data might not be measurable directly; they have to be derived from different measurements. The term navigation summarizes the overall process, from the selection and employment of certain sensors, including the handling and pre‐processing of the sensor signals, up to the process of merging data from different sources to estimate navigation data that cannot be directly measured. In general, data from different sources, with different measurement frequencies and different accuracies, have to be merged. In a concrete mission scenario, usually not all of the mentioned data have to be determined; therefore, for each scenario there needs to be detailed information on which data are needed for which purposes.
As a precondition for the discussions in this chapter, we shall at first have a look at how the navigation data of a marine robot can be expressed.
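To give a first impression of what ‘merging data from different sources with different accuracies’ can mean, the following minimal sketch fuses two noisy measurements of the same static quantity by inverse‐variance weighting. This is a textbook technique and not the filter system developed in this thesis; it assumes independent, unbiased measurement errors with known variances. The function name `fuse` and all numerical values are hypothetical.

```python
def fuse(measurements):
    """Fuse independent measurements of one quantity by inverse-variance weighting.

    measurements: list of (value, variance) tuples with variance > 0.
    Returns the fused value and its variance.
    """
    weights = [1.0 / variance for _, variance in measurements]
    total_weight = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, measurements)) / total_weight
    fused_variance = 1.0 / total_weight
    return fused_value, fused_variance


# Hypothetical example: depth from a precise pressure sensor and from a
# less precise acoustic measurement (values in m, variances in m^2).
depth_pressure = (10.2, 0.01)
depth_acoustic = (10.8, 0.25)
value, variance = fuse([depth_pressure, depth_acoustic])
print(f"fused depth: {value:.3f} m, variance: {variance:.5f} m^2")
```

The fused estimate stays close to the more accurate sensor, and its variance is smaller than that of either single measurement. Measurements arriving at different frequencies and quantities that change over time require a dynamic estimator, such as the Kalman filter variants mentioned in section 1.5.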
2.2 Structure of Navigation Data in Marine Robotics
To clearly describe the position and orientation/attitude of a rigid body in three‐dimensional space, three position coordinates (𝑥, 𝑦, 𝑧) and three angular coordinates (𝜙, 𝜃, 𝜓) are necessary (six degrees of freedom, 6 DOF). The combination of position and orientation/attitude is referred to as the pose of the body. The angular coordinates are also referred to as Euler angles. However, these coordinates can be related to different frames. We will discuss the most important ones.
2.2.1 Inertial Reference Frame for Description of Position
For the description of position and orientation, a global inertial frame (or coordinate system) is needed. An inertial frame is a frame of reference with a description of time and space that is homogeneous, isotropic and time‐independent (Landau & Lifshitz, 1960). In an inertial frame, Newton’s first law of motion is valid, that is, an object either remains at rest or continues to move at a constant velocity, unless acted upon by a force (Browne, 1999, pp. 58, Holzner, 2005, pp. 64). For a frame to be inertial, it must be nonrotating and nonaccelerating, but may be moving with constant velocity (Stevens & Lewis, 1992, pp. 19). To be able to describe every possible location of a robot on earth (not limited to marine ones), it is straightforward to use an inertial coordinate system with its origin at the center of the earth, which is translating with the earth, but with a fixed orientation relative to the stars. Such a system is referred to as Earth‐Centered Inertial (ECI) reference frame and is widely used for the navigation of airplanes (Stevens & Lewis, 1992, pp. 19). To be absolutely precise, the frame is not inertial, as it moves with the center of the earth on an elliptical path around the
sun. Nevertheless, this effect can be neglected if the movement of a plane is to be described. It is a disadvantage of the ECI‐frame that it is not fixed with respect to earth. This means that the coordinates of a robot which remains at a stationary position on the earth will nevertheless constantly change. In order to use a frame in which the coordinates remain constant as long as a robot is not in motion, the Earth‐Centered Earth‐Fixed (ECEF) frame can be used. It is a right‐handed Cartesian system with its origin at the center of the earth, its $Z_E$‐axis oriented along the mean rotational axis of the earth, and its $X_E$‐axis pointing to the mean Greenwich meridian. In this frame, the $X_E Y_E$‐plane is denoted as mean equatorial plane, and the $X_E Z_E$‐plane is called mean zero‐meridian (Xu, 2003). In distinction to an equivalently oriented ECI‐frame, the ECEF‐frame rotates with an angular rotation rate of $7.29 \cdot 10^{-5}$ rad/s. Therefore, in a mathematically strict sense, it is also not an inertial frame, but for the comparatively slow motion of marine robots, it can be considered to be inertial (Fossen, 2002). Figure 2‐1 displays the position of the ECEF‐frame, where the axes are marked with the subscript 𝐸.
Figure 2‐1: The Earth‐Centered Earth‐Fixed (ECEF)‐frame $X_E Y_E Z_E$ and the Geocentric Coordinate system (longitude, latitude, $r$)
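The statement that a stationary robot has constantly changing ECI coordinates can be made concrete with a short sketch. It rotates a point that is fixed in the ECEF‐frame into an ECI‐frame, under the simplifying assumptions that both frames coincide at time zero and that the earth rotates uniformly about the common $Z$‐axis (precession, nutation and polar motion are ignored). The function name and the example point are hypothetical.

```python
import math

OMEGA_EARTH = 7.2921e-5  # earth rotation rate in rad/s (rounded value)


def ecef_to_eci(x_e, y_e, z_e, t):
    """Express an ECEF position in an ECI frame that coincides with ECEF at t = 0.

    Simplification: uniform rotation about the common Z-axis only.
    """
    angle = OMEGA_EARTH * t
    x_i = math.cos(angle) * x_e - math.sin(angle) * y_e
    y_i = math.sin(angle) * x_e + math.cos(angle) * y_e
    return x_i, y_i, z_e


# Hypothetical example: a point on the equator at the Greenwich meridian.
point_ecef = (6378137.0, 0.0, 0.0)
for t in (0.0, 3600.0):
    x_i, y_i, z_i = ecef_to_eci(*point_ecef, t)
    print(f"t = {t:6.0f} s: ECI = ({x_i:.1f}, {y_i:.1f}, {z_i:.1f}) m")
```

After one hour, the ECI coordinates of the stationary point have turned by about 15 degrees around the rotational axis, while its ECEF coordinates are unchanged.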
However, the usage of the ECEF‐frame for the motion description of mobile systems is not the best option. Especially for air and marine robots, we would prefer a frame in which one coordinate directly describes the height above or the depth below the earth surface (at a standardized level like mean sea level). We can obtain an adequate frame by transferring the ECEF‐frame into a spherical coordinate system with two angular coordinates and the distance 𝑟. The two angular coordinates are referred to as geocentric longitude and latitude, where the longitude is counted eastward from the Greenwich Meridian, and the latitude is counted northward from the equator. Note that the symbol used for the latitude must not be confused with the Euler angle defined before. The distance 𝑟 is measured between the object and the center of the earth, and the difference of 𝑟 minus the earth radius yields the height of the object. The axis 𝑟 heads vertically through a plane that is normal to the earth surface at zero height. The frame is also referred to as Geocentric coordinate system, as the earth center is the base for determining the latitude and 𝑟. Figure 2‐1 displays an object at the coordinates ${}^{E}x, {}^{E}y, {}^{E}z$ (the preceding superscript 𝐸 shows that the coordinates are given in the ECEF‐frame) and additionally the coordinates of the Geocentric coordinate system (Xu, 2003).
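The transfer from ECEF coordinates to the Geocentric coordinate system can be sketched as follows. The sketch assumes a perfectly spherical earth with a mean radius of 6371 km for the height computation, which is an assumption made here for illustration only; the function name and the example values are hypothetical.

```python
import math

MEAN_EARTH_RADIUS = 6371000.0  # m, assumed radius of a spherical earth model


def ecef_to_geocentric(x, y, z):
    """Convert ECEF coordinates into geocentric longitude, latitude and distance r.

    Longitude is counted eastward from the Greenwich meridian, latitude
    northward from the equator; both are returned in radians.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("longitude and latitude are undefined at the earth center")
    longitude = math.atan2(y, x)
    latitude = math.asin(z / r)
    return longitude, latitude, r


# Hypothetical example: an object 100 m above the spherical earth surface,
# on the equator, 90 degrees east of Greenwich.
lon, lat, r = ecef_to_geocentric(0.0, MEAN_EARTH_RADIUS + 100.0, 0.0)
height = r - MEAN_EARTH_RADIUS
print(f"longitude: {math.degrees(lon):.1f} deg, latitude: {math.degrees(lat):.1f} deg")
print(f"height above the sphere: {height:.1f} m")
```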
Figure 2‐2: Difference between geocentric latitude and geodetic latitude displayed by a cut through the mean zero‐meridian of a sphere and an ellipsoid
The downside of this system is the fact that the earth does not have the shape of a sphere, but of an ellipsoid. The accuracy can be improved by the usage of a coordinate system that is based on an ellipsoid, like the Geodetic coordinate system, which is widely used in GPS‐based navigation (Cai et al., 2011). The dimensions of the ellipsoid often used to model the earth have been defined in WGS‐84, 1987. The Geodetic coordinate system describes the position of the object with two angular coordinates and the height ℎ. The two angular coordinates are again referred to as (geodetic) latitude and longitude. It can be stated that the geocentric and the geodetic longitude are identical, while the latitudes differ due to the different forms of the sphere and the ellipsoid. The difference is displayed in Figure 2‐2, which shows a sphere and an ellipsoid cut through the mean zero‐meridian. In the Geodetic system, the height ℎ of the object is computed above a plane normal to the ellipsoid surface, and the latitude is measured between the major axis and the line that leads from the object vertically through the mentioned plane. Due to the ellipsoid shape, this line does not cross through the center point, as the line in the Geocentric system does (Xu, 2003). We will use a preceding superscript 𝐺 to denote that given coordinates are in the Geodetic format. Finally, it is reasonable to define a local coordinate frame that describes the position of the robots with respect to a defined origin, like a command station, and that uses coordinates that can easily be obtained from GPS‐measurements available in the Geodetic coordinate frame. For the local frame, often the n‐frame or NED‐frame is used, where NED means North‐East‐Down. The origin of this right‐handed Cartesian coordinate system can be any arbitrary point on the surface of the earth. The $X_N$‐axis points towards the true north of the ellipsoid, the $Y_N$‐axis to the geodetic east, and the $Z_N$‐axis downwards normal to the earth surface.
As long as the ECEF‐frame can be considered as inertial, as discussed above, this is also true for the NED‐frame. Figure 2‐3, which borrows from Cai et al., 2011, pp. 24, displays the three relevant reference frames: the ECEF‐frame, the Geodetic coordinate system, and the local NED‐frame. So far, we have shown how we can introduce an inertial local frame based on the ECEF‐frame or the Geodetic coordinate system with an arbitrarily chosen origin. For the sake of simplicity, we will use the axis notation XYZ for the local inertial frame, and we will use a preceding superscript 𝑖 for coordinates to denote that they are expressed in the inertial frame. To
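improve legibility, we will skip the index in cases where the relation is clear. Following the definitions of navigation data in Fossen, 2002, we can combine the pose coordinates within the pose vector 𝛈. The pose vector can be decomposed into the position vector 𝐩 and the orientation/attitude vector 𝚯, which yields the following description:

$$
\boldsymbol{\eta} = \begin{bmatrix} \mathbf{p} \\ \boldsymbol{\Theta} \end{bmatrix}; \quad
\mathbf{p} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}; \quad
\boldsymbol{\Theta} = \begin{bmatrix} \phi \\ \theta \\ \psi \end{bmatrix}
\tag{2-1}
$$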
Figure 2‐3: The Earth‐Centered Earth‐Fixed (ECEF)‐frame $X_E Y_E Z_E$, the Geodetic Coordinate system $\varphi$, $\lambda$, $h$, and the local North‐East‐Down (NED)‐frame $X_N Y_N Z_N$
2.2.2 Body‐Fixed Frame for Description of Velocities and Forces/Moments

As discussed before, in marine navigation we are not only interested in estimating the agent's pose, but also its velocities, that is, the first‐order derivatives of the pose coordinates. However, we have to keep in mind that pose and velocity might be expressed in different frames and require a transformation, as we will discuss in section 2.2.3. The behavior of the velocities is determined by the accelerations, which in turn are driven by the forces and moments acting on the vehicle. It is common to express these quantities in a body‐fixed frame with its origin in the center of gravity (CG) of the vehicle and its axes oriented along the principal axes of inertia of the vehicle. We shall use the subscript 0 to denote the axes: $X_0$ is the longitudinal axis from stern to bow, $Y_0$ is oriented along the transverse axis pointing to starboard, and $Z_0$ points along the normal axis downwards. Using the superscript 𝑏 for a body‐fixed frame (also denoted as b‐frame), we can introduce the velocity vector 𝐯 and the vector of forces and moments 𝛕, each of which will again be separated into its translational and rotational components:
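$$
{}^{b}\mathbf{v} = \begin{bmatrix} {}^{b}\mathbf{v}_{0} \\ {}^{b}\boldsymbol{\omega} \end{bmatrix}; \quad
{}^{b}\mathbf{v}_{0} = \begin{bmatrix} u \\ v \\ w \end{bmatrix}; \quad
{}^{b}\boldsymbol{\omega} = \begin{bmatrix} p \\ q \\ r \end{bmatrix}
$$

$$
{}^{b}\boldsymbol{\tau} = \begin{bmatrix} {}^{b}\mathbf{f} \\ {}^{b}\mathbf{m} \end{bmatrix}; \quad
{}^{b}\mathbf{f} = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}; \quad
{}^{b}\mathbf{m} = \begin{bmatrix} K \\ M \\ N \end{bmatrix}
\tag{2-2}
$$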
Note that 𝛚 describes the angular velocity of the b‐frame with respect to the n‐frame, decomposed in the b‐frame. In what follows, we will use the notation 𝐧 for a vector containing the navigation data necessary in a concrete mission scenario. Therefore, 𝐧 might contain 𝛈 and/or 𝐯, or parts of them, and possibly additional information like the sea current velocity.

Table 2‐1. Notation of movement and parameters for marine objects (SNAME, 1950)

| DOF | Direction of movement | Denotation | Position / Euler angles | Velocity components | Forces and moments |
|-----|-----------------------|------------|-------------------------|---------------------|--------------------|
| 1 | Motion in x‐direction | Surge | 𝑥 | 𝑢 | 𝑋 |
| 2 | Motion in y‐direction | Sway | 𝑦 | 𝑣 | 𝑌 |
| 3 | Motion in z‐direction | Heave | 𝑧 | 𝑤 | 𝑍 |
| 4 | Rotation about x‐axis | Roll | 𝜙 | 𝑝 | 𝐾 |
| 5 | Rotation about y‐axis | Pitch | 𝜃 | 𝑞 | 𝑀 |
| 6 | Rotation about z‐axis | Yaw | 𝜓 | 𝑟 | 𝑁 |
Figure 2‐4: Display of the defined frames and motion parameters for a marine robot according to the xyz‐convention for the rotation
With these definitions, we have the necessary tools to describe the pose and motion of a marine agent in 6 DOF. Table 2‐1 summarizes the parameter names and notations that are
widely used in marine robotics according to SNAME, 1950. They also form the basis for modeling vehicle motion in a marine environment, where one can differentiate between kinematic and dynamic modeling. Kinematic modeling handles only the geometrical aspects, that is, the relations between velocity/acceleration and pose, while dynamic modeling analyzes the forces and moments causing the movement. This will be discussed in greater detail in section 2.3.1.

For a better understanding of the definitions made so far, Figure 2‐4, which borrows from Fossen, 2002, provides an overview of the defined parameters for a marine robot. In part a), a marine robot is depicted at position (𝑥, 𝑦, 𝑧) in the inertial frame XYZ. The body‐fixed frame X0Y0Z0 originating in the center of gravity (CG) is displayed, and the directions and orientations of velocities and forces/moments are shown. The remaining parts of the figure display the Euler angles and the rotation from the inertial frame to the body‐fixed frame according to the so‐called xyz‐convention. To that end, a frame X3Y3Z3 is defined that results from a rotation‐free translation of XYZ until its origin coincides with the center of gravity (CG) of the marine robot. In a first step, X3Y3Z3 is rotated around Z3 by the yaw angle 𝜓, yielding frame X2Y2Z2, where Z3 and Z2 are identical (as shown in b)). In a second step, X2Y2Z2 is turned by the pitch angle 𝜃 around Y2, resulting in frame X1Y1Z1 (part c)). Finally, the body‐fixed frame X0Y0Z0 is obtained by rotating X1Y1Z1 by the roll angle 𝜙 around X1 (part d)).

2.2.3 Coordinate Transformations

According to Euler's theorem on rotation, every change in the relative orientation of two rigid bodies or reference frames 𝐴 and 𝐵 can be produced by means of a simple rotation of 𝐵 with respect to 𝐴 (Fossen, 2002).
Following the same source, we summarize the process of transforming a velocity vector between different frames: Given a vector ${}^{b}\mathbf{v}$ in the b‐frame and the information that the b‐frame is rotated by the vector $\boldsymbol{\Theta} = [\phi \;\; \theta \;\; \psi]^{T}$ with respect to the n‐frame, the same vector decomposed in the n‐frame is given by:

$$
{}^{n}\mathbf{v} = \mathbf{R}_{b}^{n}(\boldsymbol{\Theta})\, {}^{b}\mathbf{v},
\tag{2-3}
$$
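where $\mathbf{R}_{b}^{n}(\boldsymbol{\Theta})$, as a function of 𝚯, is the rotation matrix from the b‐frame to the n‐frame. This rotation is the opposite of the one shown in Figure 2‐4 b) to d) and is therefore also denoted as zyx‐convention. Note that, for ${}^{b}\mathbf{v}$ being the translational part of the velocity vector defined in equation (2‐2), ${}^{n}\mathbf{v}$ yields the velocity vector in the n‐frame, which is the first‐order derivative of the position vector in the local inertial frame, 𝐩, according to equation (2‐1). Therefore we can write:

$$
\dot{\mathbf{p}} = \mathbf{R}_{b}^{n}(\boldsymbol{\Theta})\, {}^{b}\mathbf{v}, \quad \text{with} \quad
\mathbf{R}_{b}^{n}(\boldsymbol{\Theta}) =
\begin{bmatrix}
c\psi\, c\theta & -s\psi\, c\phi + c\psi\, s\theta\, s\phi & s\psi\, s\phi + c\psi\, c\phi\, s\theta \\
s\psi\, c\theta & c\psi\, c\phi + s\phi\, s\theta\, s\psi & -c\psi\, s\phi + s\theta\, s\psi\, c\phi \\
-s\theta & c\theta\, s\phi & c\theta\, c\phi
\end{bmatrix},
\tag{2-4}
$$

where $s\,\cdot = \sin(\cdot)$ and $c\,\cdot = \cos(\cdot)$, while the transformation in the other direction can be performed by computing

$$
{}^{b}\mathbf{v} = \mathbf{R}_{b}^{n}(\boldsymbol{\Theta})^{-1}\, \dot{\mathbf{p}}.
\tag{2-5}
$$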
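The transformation between body‐fixed and NED velocities can be sketched in a few lines of Python. The following is a minimal illustration under stated assumptions, not code from the thesis: the function names are hypothetical, all angles are in radians, and the inverse in (2‐5) is computed as the transpose, which is valid because rotation matrices are orthogonal.

```python
import math


def rotation_body_to_ned(phi, theta, psi):
    """Rotation matrix from the b-frame to the n-frame as in equation (2-4)."""
    cphi, sphi = math.cos(phi), math.sin(phi)
    cth, sth = math.cos(theta), math.sin(theta)
    cpsi, spsi = math.cos(psi), math.sin(psi)
    return [
        [cpsi * cth, -spsi * cphi + cpsi * sth * sphi, spsi * sphi + cpsi * cphi * sth],
        [spsi * cth, cpsi * cphi + sphi * sth * spsi, -cpsi * sphi + sth * spsi * cphi],
        [-sth, cth * sphi, cth * cphi],
    ]


def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(m):
    """Transpose of a 3x3 matrix given as a list of rows."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def body_to_ned(v_body, phi, theta, psi):
    """Velocity in the n-frame from body-fixed [u, v, w], equation (2-4)."""
    return mat_vec(rotation_body_to_ned(phi, theta, psi), v_body)


def ned_to_body(p_dot, phi, theta, psi):
    """Body-fixed velocity from the n-frame velocity, equation (2-5).

    The inverse of a rotation matrix equals its transpose.
    """
    return mat_vec(transpose(rotation_body_to_ned(phi, theta, psi)), p_dot)


# Example: a level vehicle heading due east (psi = 90 deg) with 1.5 m/s surge
# speed moves along the east axis of the NED frame.
print(body_to_ned([1.5, 0.0, 0.0], 0.0, 0.0, math.pi / 2))
```

Plain lists are used instead of a numerical library only to keep the sketch free of dependencies.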
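Following again the discussions in Fossen, 2002, the transfer of the angular velocity vector decomposed in the b‐frame, 𝛚, into the local inertial frame, yielding $\dot{\boldsymbol{\Theta}} = [\dot{\phi} \;\; \dot{\theta} \;\; \dot{\psi}]^{T}$, can be performed by employing the transformation matrix $\mathbf{T}_{\Theta}(\boldsymbol{\Theta})$:

$$
\dot{\boldsymbol{\Theta}} = \mathbf{T}_{\Theta}(\boldsymbol{\Theta})\, \boldsymbol{\omega}
\quad \text{or} \quad
\boldsymbol{\omega} = \mathbf{T}_{\Theta}^{-1}(\boldsymbol{\Theta})\, \dot{\boldsymbol{\Theta}},
$$

with

$$
\mathbf{T}_{\Theta}(\boldsymbol{\Theta}) =
\begin{bmatrix}
1 & s\phi\, t\theta & c\phi\, t\theta \\
0 & c\phi & -s\phi \\
0 & s\phi / c\theta & c\phi / c\theta
\end{bmatrix}; \quad
\mathbf{T}_{\Theta}^{-1}(\boldsymbol{\Theta}) =
\begin{bmatrix}
1 & 0 & -s\theta \\
0 & c\phi & c\theta\, s\phi \\
0 & -s\phi & c\theta\, c\phi
\end{bmatrix},
\tag{2-6}
$$

where $s\,\cdot = \sin(\cdot)$, $c\,\cdot = \cos(\cdot)$, and $t\,\cdot = \tan(\cdot)$.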
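The two matrices of equation (2‐6) can be written down directly. The sketch below is illustrative only; the function names are hypothetical and angles are in radians. It also makes explicit a known property of this Euler angle representation: the transformation from body rates to Euler angle rates is undefined for a pitch angle of ±90°, where cos 𝜃 becomes zero.

```python
import math


def euler_rate_matrix(phi, theta):
    """Matrix mapping body rates [p, q, r] to Euler angle rates, equation (2-6)."""
    cth = math.cos(theta)
    if abs(cth) < 1e-9:
        # Division by cos(theta) below would fail: representation singularity.
        raise ValueError("pitch angle of +/-90 deg: Euler angle rates are undefined")
    cphi, sphi, tth = math.cos(phi), math.sin(phi), math.tan(theta)
    return [
        [1.0, sphi * tth, cphi * tth],
        [0.0, cphi, -sphi],
        [0.0, sphi / cth, cphi / cth],
    ]


def euler_rate_matrix_inv(phi, theta):
    """Inverse matrix mapping Euler angle rates to body rates, equation (2-6)."""
    cphi, sphi = math.cos(phi), math.sin(phi)
    cth, sth = math.cos(theta), math.sin(theta)
    return [
        [1.0, 0.0, -sth],
        [0.0, cphi, cth * sphi],
        [0.0, -sphi, cth * cphi],
    ]


def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


# Example: for a level vehicle (phi = theta = 0) a pure yaw rate r maps
# one-to-one to the heading rate.
print(mat_vec(euler_rate_matrix(0.0, 0.0), [0.0, 0.0, 0.1]))
```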
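Another important task is the transfer of data from the Geodetic Coordinate System to a local inertial frame. This is of interest because GPS data, which are very important for the navigation of ASVs and surfaced AUVs, will usually be available in the Geodetic format. To this end, we suppose that an autonomous object is located at ${}^{G}\mathbf{p} = [\varphi \;\; \lambda \;\; h]^{T}$. We want to transfer this data into a local NED frame with its origin located at ${}^{G}\mathbf{p}_{\mathrm{ref}} = [\varphi_{\mathrm{ref}} \;\; \lambda_{\mathrm{ref}} \;\; h_{\mathrm{ref}}]^{T}$. Following the discussions in Cai et al., 2011, we need to transfer the object's position data into the ECEF‐frame (${}^{e}\mathbf{p}$) as an intermediate step. This yields:

$$
{}^{e}\mathbf{p} = \begin{bmatrix} x_e \\ y_e \\ z_e \end{bmatrix} =
\begin{bmatrix}
(N + h) \cdot \cos\varphi \cdot \cos\lambda \\
(N + h) \cdot \cos\varphi \cdot \sin\lambda \\
\left( N (1 - e^{2}) + h \right) \cdot \sin\varphi
\end{bmatrix},
\tag{2-7}
$$

where 𝑁 and 𝑒 are parameters based on the ellipsoid used to model the earth. According to WGS‐84, 1987, the first eccentricity 𝑒 equals 0.08181919, and the prime vertical radius of curvature 𝑁 can be computed with the help of the semi‐major axis $R_{Ea}$ = 6,378,137.0 m to be

$$
N = \frac{R_{Ea}}{\sqrt{1 - e^{2} \sin^{2}\varphi}}.
\tag{2-8}
$$

Employing equations (2‐7) and (2‐8), we can also compute the ECEF‐coordinates of the origin of the local NED‐frame, ${}^{e}\mathbf{p}_{\mathrm{ref}}$. In the last step, we can now transfer the ECEF‐coordinates of the mobile object into the NED‐frame, ${}^{n}\mathbf{p}$, yielding:

$$
{}^{n}\mathbf{p} = \begin{bmatrix} x \\ y \\ z \end{bmatrix} =
\mathbf{R}_{e}^{n} \left( {}^{e}\mathbf{p} - {}^{e}\mathbf{p}_{\mathrm{ref}} \right),
\tag{2-9}
$$

where $\mathbf{R}_{e}^{n}$ is the rotation matrix from the ECEF‐frame to the local NED‐frame, which is evaluated at the latitude and longitude of the NED origin and can be computed to be

$$
\mathbf{R}_{e}^{n} =
\begin{bmatrix}
-\sin\varphi_{\mathrm{ref}} \cdot \cos\lambda_{\mathrm{ref}} & -\sin\varphi_{\mathrm{ref}} \cdot \sin\lambda_{\mathrm{ref}} & \cos\varphi_{\mathrm{ref}} \\
-\sin\lambda_{\mathrm{ref}} & \cos\lambda_{\mathrm{ref}} & 0 \\
-\cos\varphi_{\mathrm{ref}} \cdot \cos\lambda_{\mathrm{ref}} & -\cos\varphi_{\mathrm{ref}} \cdot \sin\lambda_{\mathrm{ref}} & -\sin\varphi_{\mathrm{ref}}
\end{bmatrix}.
\tag{2-10}
$$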
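The chain Geodetic → ECEF → NED described by equations (2‐7) to (2‐10) can be sketched as follows. This is an illustrative sketch rather than the implementation used in the thesis: the function and constant names are assumptions, latitude and longitude are expected in radians, and heights in metres. The ellipsoid constants are the WGS‐84 values quoted in the text.

```python
import math

WGS84_SEMI_MAJOR_AXIS = 6378137.0      # semi-major axis in m
WGS84_FIRST_ECCENTRICITY = 0.08181919  # first eccentricity e


def prime_vertical_radius(lat):
    """Prime vertical radius of curvature N, equation (2-8)."""
    e_sin = WGS84_FIRST_ECCENTRICITY * math.sin(lat)
    return WGS84_SEMI_MAJOR_AXIS / math.sqrt(1.0 - e_sin ** 2)


def geodetic_to_ecef(lat, lon, h):
    """ECEF coordinates of a geodetic position, equation (2-7)."""
    n = prime_vertical_radius(lat)
    e2 = WGS84_FIRST_ECCENTRICITY ** 2
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + h) * math.sin(lat)
    return (x, y, z)


def ecef_to_ned_matrix(lat_ref, lon_ref):
    """Rotation matrix from ECEF to the local NED frame, equation (2-10)."""
    sl, cl = math.sin(lat_ref), math.cos(lat_ref)
    so, co = math.sin(lon_ref), math.cos(lon_ref)
    return [
        [-sl * co, -sl * so, cl],
        [-so, co, 0.0],
        [-cl * co, -cl * so, -sl],
    ]


def geodetic_to_ned(lat, lon, h, lat_ref, lon_ref, h_ref):
    """Local NED position of an object relative to the origin, equation (2-9)."""
    p = geodetic_to_ecef(lat, lon, h)
    p_ref = geodetic_to_ecef(lat_ref, lon_ref, h_ref)
    diff = [p[i] - p_ref[i] for i in range(3)]
    rot = ecef_to_ned_matrix(lat_ref, lon_ref)
    return tuple(sum(rot[i][j] * diff[j] for j in range(3)) for i in range(3))


# Example: an object 10 m above the NED origin has a negative down coordinate.
print(geodetic_to_ned(0.8, 0.2, 10.0, 0.8, 0.2, 0.0))
```

Because the NED z‐axis points downwards, a GPS antenna above the origin yields a negative 𝑧 value, while a submerged vehicle yields a positive one.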
Further frames can be introduced. For instance, Kinsey, 2006 defines the Instrumental frame for sensors that are usually not mounted directly at the CG of the vehicle and might also be rotated against the b‐frame. Summing up the discussions so far, we have introduced the relevant frames that we will use in the further course of the thesis, and the transformations between them. For the local frame, we will also use coordinate systems in which the x‐axis points eastwards and the y‐axis points northwards, especially when we treat the navigation problem in 2D, neglecting the 𝑧‐coordinate, as the depth of an AUV can be measured with good accuracy employing state‐of‐the‐art depth sensors (see also the discussions in section 2.4).

2.2.4 Physical Meaning of the ${}^{i}z$‐Coordinate

In the ECEF‐frame, the distance 𝑟 to the center of the earth was used as one of the three necessary coordinates. This is not of practical use in real applications. In the local frame, the origin can be placed in such a way that the 𝑧‐coordinate represents a meaningful quantity, like the height above sea level, for instance. If we put the origin of the local frame at the sea surface level, 𝑧 equals the distance between the vehicle and the surface, which is also denoted as depth. It is crucial to monitor the depth at all times, as every vehicle is only constructed to operate down to a maximum depth. Exceeding this limit might lead to severe damage, up to a complete loss of the vehicle. The depth can easily be measured with high accuracy by a depth cell; this will be discussed further in section 2.4.
Figure 2‐5: The different meaning of the terms 'depth' and 'altitude' in marine robotics
However, the 𝑧‐coordinate is often expressed in terms of another physical quantity, namely the height above the seafloor, denoted as altitude. Like the depth, the altitude must be monitored continuously, as a collision with the seafloor can also damage or destroy the vehicle. Moreover, in mission scenarios where the vehicle has to record video data of the seafloor, a constant altitude should be maintained in order to collect video data of good quality. The altitude can be measured with an echo sounder; see section 2.4 for details. Figure 2‐5 shows the different meanings of the terms 'depth' and 'altitude'. It becomes clear that, when operating several marine vehicles in close vicinity, it is dangerous to base a vertical separation in 𝑧‐direction on altitude measurements. Two vehicles can be at the same depth level even if their altitudes differ, depending on the structure of the seafloor.

2.2.5 Difference between Heading and Course Angle

Marine robots are heavily affected in the execution of their missions by sea currents. In particular, differences can occur between the orientation of a vehicle and the direction of its movement in the inertial frame. We limit our consideration to the 2D case, that is, we consider a current with only 𝑥‐ and 𝑦‐components and look at a vehicle moving in the horizontal plane.
Figure 2‐6: Introduction of sideslip and crab angle
The yaw angle 𝜓 of a marine vehicle is also denoted as heading, while the surge speed was introduced as 𝑢 before. As the main propulsion of most marine robots acts along the $X_0$‐axis, the surge speed will usually provide the largest contribution to the velocity vector 𝐯. Vehicles are usually able to measure their heading with an Attitude Heading Reference System (AHRS), and they can estimate their surge speed based on the rotational rate of the propellers and a simple vehicle model (more details in section 2.4). However, the estimate of the surge speed does not consider the current; it is the surge speed with respect to the
surrounding water, which is also called surge speed through water and denoted as $u_W$. Using their actuators like propellers, rudders, or fins, vehicles are able to maintain a movement with a certain velocity through the surrounding water. We will use the subscript 𝑊 to indicate that a speed or velocity is with respect to the surrounding water. Depending on the concrete actuator setting, the velocity vector through water, $\mathbf{v}_W$, might not be oriented along the X0‐axis of the body frame. This happens if the sway speed through water, $v_W$, is non‐zero. This is denoted as sideslip effect. The angle $\alpha_S$ between the X0‐axis and $\mathbf{v}_W$ is called 'sideslip angle'. The sum of heading and sideslip angle is denoted as course through water $\chi_W$. Figure 2‐6 depicts the situation. Note that the $\mathbf{v}_W$‐vector is spanned by the surge and sway speeds through water, $u_W$ and $v_W$, which determine the sideslip angle $\alpha_S$; together with the heading 𝜓, this can be written as

$$
\alpha_S = \arcsin\left( \frac{v_W}{\sqrt{u_W^{2} + v_W^{2}}} \right) = \arctan\left( \frac{v_W}{u_W} \right) = \chi_W - \psi,
\tag{2-11}
$$
where $\chi_W$ is denoted as course angle through water. However, in the case of existing sea currents it is necessary to distinguish between two different velocities. The sea current adds an additional velocity vector $\mathbf{v}_C = [v_{C,x} \;\; v_{C,y} \;\; 0]^{T}$ with the current components in the direction of the three axes. In this case, the true velocity of the vehicle, which is also referred to as velocity over ground and denoted as 𝐯, can be computed to be

$$
\mathbf{v} = \mathbf{v}_W + \mathbf{v}_C,
\tag{2-12}
$$
as depicted in Figure 2‐6. As a consequence of the presence of the current, the vehicle will move in the direction of the vector 𝐯, that is, in a direction with the angle $\chi$, also denoted as course angle or simply course. Note that the orientation of the vehicle will remain along the $X_0$‐axis, as depicted in Figure 2‐6. The vehicle is not oriented along the movement direction; there is an angle caused by the current. This angle is denoted as crab angle $\alpha_C$ and can be computed to be

$$
\alpha_C = \chi - \psi = \arcsin\left( \frac{v}{\sqrt{u^{2} + v^{2}}} \right) = \arctan\left( \frac{v}{u} \right),
\tag{2-13}
$$
where 𝑢 and 𝑣 denote the surge and sway speeds of the vehicle over ground. Note that in order to track the vehicle position in the inertial NED‐frame based on velocity and time, the velocity over ground has to be employed. In order to measure the velocity over ground, vehicles need a device like a Doppler Velocity Log (DVL), which is relatively expensive (see section 2.4). In the scientific part of this thesis in chapters 5‐7, we will discuss possibilities to estimate the sea current and the velocity over ground without the usage of a DVL system. In our further discussions, we will usually assume that the sway velocity through water $v_W$ and therefore the sideslip angle $\alpha_S$ are close to zero, and that the vector $\mathbf{v}_W$ is oriented along the vehicle's $X_0$‐axis. However, we need to consider the existence of sea currents. Therefore we need to consider a possible crab angle and distinguish between the heading 𝜓 and the course angle $\chi$ of a robot. Two important consequences can be derived from the discussions so far:
Firstly, at a given location with a certain sea current, an intended mission with marine robots can be infeasible. It might be impossible for all or for certain robots to execute parts of the mission plan, e.g. if a path is planned against the sea current vector and the magnitude of the current is larger than the maximum surge speed which the vehicle can attain with its propulsion system. Current marine robots usually operate at speeds around 1 to 2 m/s; sea currents can easily reach similar values in certain areas, even close to the coast.
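The relation between heading, current, course, and feasibility can be checked numerically. The following sketch is an illustration under stated assumptions, not a method from the thesis: it treats the 2D case only, the function names are hypothetical, angles are in radians measured from north towards east, and the feasibility check assumes zero sideslip, so that the vehicle can only push along its own heading with at most `u_max`.

```python
import math


def wrap_angle(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def course_and_crab_angle(psi, u_w, v_w, current_n, current_e):
    """Speed over ground, course angle, and crab angle in the horizontal plane.

    psi: heading; (u_w, v_w): surge and sway speed through water in the
    body frame; (current_n, current_e): current components in the NED frame.
    """
    # Rotate the velocity through water into the NED frame and add the current.
    vel_n = u_w * math.cos(psi) - v_w * math.sin(psi) + current_n
    vel_e = u_w * math.sin(psi) + v_w * math.cos(psi) + current_e
    speed = math.hypot(vel_n, vel_e)
    course = math.atan2(vel_e, vel_n)
    return speed, course, wrap_angle(course - psi)


def max_speed_along_course(course_d, u_max, current_n, current_e):
    """Largest speed over ground along a desired course, or None if infeasible.

    Assumes zero sideslip: the vehicle must spend part of u_max to cancel the
    cross-track current; the remainder plus the along-track current must be
    positive to make headway.
    """
    along = current_n * math.cos(course_d) + current_e * math.sin(course_d)
    cross = -current_n * math.sin(course_d) + current_e * math.cos(course_d)
    if abs(cross) > u_max:
        return None
    speed = math.sqrt(u_max ** 2 - cross ** 2) + along
    return speed if speed > 0.0 else None


# Heading north at 1.5 m/s through water with a 1 m/s current setting east:
print(course_and_crab_angle(0.0, 1.5, 0.0, 0.0, 1.0))
# A path towards north against a 2 m/s current cannot be followed at 1.5 m/s:
print(max_speed_along_course(0.0, 1.5, -2.0, 0.0))
```

With a current of the same order of magnitude as the vehicle speed, the crab angle quickly reaches tens of degrees, which is the situation discussed in the text.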
Figure 2‐7: Desired and real behavior of two marine robots while collecting video data in an area with strong sea currents (Eckstein et al., 2015)
Secondly, as discussed, the course angle and the yaw angle of a vehicle can clearly differ. This can result in considerable problems, because all sensors are usually rigidly mounted on the vehicle body and will therefore be turned away from the true movement direction by the crab angle $\alpha_C$. This can be a serious problem for certain missions. In cases of cooperative mapping, the recorded overlapping video data of two or more vehicles have to be combined after the mission. If the video cameras are turned, as described above, it can no longer be guaranteed that there will be an overlap. Figure 2‐7 shows an example. It becomes clear that it is advantageous if the vehicles move with or against the current (as long as their propulsion is strong enough) to avoid large differences between heading and course angle. This is discussed in more detail in Eckstein et al., 2015.

2.2.6 Topological Navigation

So far we have discussed the possibility of describing the pose and velocity data of a marine agent by coordinates in several reference frames. Nevertheless, this is not the only possibility to describe relevant navigation data. In fact, as human beings, we use a different method to orient ourselves. For instance, if we have to find an office within a building, we will use information like "room number 342, third floor, right aisle" rather than the coordinates within a local NED‐frame with its origin at the main entrance, or even coordinates in the Geocentric coordinate system. The reason for this is that we have no means of sensing our position in a reference frame, but we have a very good perception of the environment around us. In the stated example we can enter the building, find stairs or an elevator to reach the correct floor, follow an aisle leading towards the right side, and finally find the office by comparing room numbers at the doors. At no time will we be able to express our position in coordinates in any reference frame, but only in relation to distinct locations, like "in the staircase between the second and third floor" or "in the right aisle between room 320 and 322".
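The office example can be written down as a small graph in which nodes are distinct places and edges connect places that can be reached from one another. The sketch below only illustrates this kind of map representation and route finding on it; the place names are invented for the example, and the breadth‐first search is a generic graph algorithm chosen here for illustration, not a localization method from the literature cited in this section.

```python
from collections import deque

# Hypothetical topological map of the building from the office example.
# No coordinates are stored, only which places are directly connected.
TOPOLOGICAL_MAP = {
    "main_entrance": ["staircase"],
    "staircase": ["main_entrance", "third_floor_hall"],
    "third_floor_hall": ["staircase", "left_aisle", "right_aisle"],
    "left_aisle": ["third_floor_hall"],
    "right_aisle": ["third_floor_hall", "room_342"],
    "room_342": ["right_aisle"],
}


def shortest_route(graph, start, goal):
    """Breadth-first search for a route with the fewest place transitions."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        route = queue.popleft()
        place = route[-1]
        if place == goal:
            return route
        for neighbor in graph.get(place, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(route + [neighbor])
    return None  # goal not reachable from start


print(shortest_route(TOPOLOGICAL_MAP, "main_entrance", "room_342"))
```

At every step the current "position" is a node or an edge of the graph, for instance between `right_aisle` and `room_342`, which mirrors statements like "in the right aisle between room 320 and 322".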
Methods like this have also been researched in the field of mobile robotics. The method of Simultaneous Localization and Mapping (SLAM, see section 2.4.3) aims to create a map of the environment and to use it at the same time for self‐localization. According to Angeli et al., 2009, one can differentiate between the metrical and the topological approaches. The first concept is related to the localization of the robot in a metrical map and is therefore similar to the concepts discussed so far. The latter employs an environmental map which is represented by a graph of discrete locations, where the nodes identify distinct places while the edges link them according to their similarity or distance. This method was introduced by Choset & Nagatani, 2001 as T(opological)‐SLAM, aiming at a method that works in situations without global positioning possibilities and in environments without engineered landmarks. The robot does not know its position in a global reference frame, but in a graph‐based map between distinct places. This comes close to the procedure discussed above as employed by humans ("I am between room 320 and 322"). So, the robot performs, as the title of the mentioned document states, "Exact Localization Without Explicit Localization". The authors (and many successors) have demonstrated that it is possible to control a robot based on that information. For this thesis, we will restrict the considerations to navigation data that can be expressed in a metrical map, that is, as coordinates in a reference frame. Marine vehicles which do not carry video or sonar equipment cannot use topological navigation, and even if video equipment is available, its usage is restricted to areas with a certain level of visibility. The method is also limited if the vehicles operate at great distances from the seafloor, and it requires distinct underwater formations like cliffs or artificial objects as a basis for topological maps.
Moreover, many applications in marine robotics explicitly require the usage of metric data. In the MORPH project, for example, it was important to know the global positions of the vehicles collecting video and sonar data in order to generate geo‐referenced maps. The usability of a topological format for navigation data was therefore limited, unless it was transformed to metric data at some point. Such transformations are possible. In fact, some work has been done to merge the advantages of the metric and the topological concepts, like that of Eade and Drummond, 2007 or Konolige and Agrawal, 2008.

Having discussed how the navigation data of marine robots can be expressed, we will now seek a deeper understanding of the meaning and role of navigation in the overall control concept that realizes the autonomous movement of a marine robot. In this context we will encounter the terms 'Guidance' and 'Control', which represent the other important parts of the control concept in the marine domain.
2.3 Navigation, Guidance and Control in the Autonomous Control for Marine Robots

In order to understand the significance of 'Navigation' in the context of the autonomous control of an unmanned marine robot, we shall take the perspective of control theory. In this respect, there is the need to develop a control concept that enables the robot to autonomously execute missions preplanned by a human operator. This is mainly related to motion control, as more complex behavior such as manipulation is not performed autonomously to date. According to Lapierre et al., 2006, the problems related to motion control of marine surface craft can be classified into three basic groups: Point stabilization refers to the task of maintaining a vehicle at a given position, rejecting external disturbances like wind, waves, or currents. Trajectory tracking is the task of tracking a time‐parameterized reference, that is, making the vehicle follow a predefined path with a desired precision, while also defined
points of the path must be reached at given times. This relates to the term 'spatiotemporal' discussed in section 2.1. The third group is related to path following, which is similar to trajectory tracking, but without temporal constraints: The vehicle has to follow a defined path, while it does not matter at which velocity it performs this task or when it reaches the end of the path ('spatial' according to the discussion in section 2.1). This classification can also be made for the motion control of submerged marine robots.

2.3.1 Model of the Marine Robot

In general, for the task of motion control, a scheme can be introduced as shown in Figure 2‐8, which borrows from archetypes in Antonelli et al., 2008, pp. 998, Fossen, 2002, pp. 11, and Fossen, 1994, pp. 2. First of all, we can consider the dynamical system 'marine robot' as a sequential combination of three subsystems: Firstly, there are the actuators, comprising all devices that can actively influence the robot movement, like propellers, thrusters, or rudders. These are driven by control signals summarized in the control vector 𝛅, like set‐point values for turning rates or angles. The output vector 𝛕 contains the resulting control forces and moments applied to the robot, that is, to the second subsystem, the Dynamic behavior. Additionally, this subsystem is influenced by forces and moments created by external disturbances, like waves, sea current, and wind (for surface craft). The output of the second subsystem is the velocity vector 𝐯, which then acts as input for the third subsystem, the Kinematic behavior. Based on the velocity, it outputs the pose vector 𝛈, which contains the position and the orientation and feeds back into the Dynamic behavior.

2.3.2 Navigation System

Below the Marine Robot block in Figure 2‐8, we find the navigation system, which consists of two parts.
The first part comprises a wide range of different sensors that measure pose‐related quantities, like acceleration, velocity through water, or turning rate, to name but a few. It can be assumed that every marine robot has several of these sensors available, which deliver very heterogeneous data, differing in the achievable accuracy and in the frequency at which the data is available. It is at this point where the second part of the navigation system comes into play, the observers. They perform the task of fusing the heterogeneous data and estimating the relevant pose information, like position, inertial velocity over ground, or current orientation of the robot. From a control theory point of view, observers perform state observation. According to Rugh, 1995 (pp. 265), "[…] state observation involves using current and past values of the plant input and output signals to generate an estimate of the (assumed unknown) current state." We shall discuss this in greater depth in sections 4.2 and 4.3, after we have introduced the state space description, and we will also introduce the concept of estimation, which will be of even more importance for the scenarios we are going to discuss. For the moment, we can say that the observers have access both to the outputs of the system 'marine robot', which are determined by sensors, and to input values. The knowledge about the latter might be transferred directly from the control system within the overall control software to the observers (e. g. a computed set value for a propeller rotational rate, as shown by the arrow from the signal 𝛅 to the observers in Figure 2‐8), or be measured by sensors (like the actual propeller rotational rate or the resulting forces/accelerations), which is represented by the arrow between the blocks 'Actuators' and 'Sensors'.
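The idea of state observation can be illustrated with a deliberately small example. The sketch below is an assumption‐laden toy, not one of the observers developed in this thesis: a vehicle moves along one axis, its speed through water is known as input, its position is measured, and a constant current is the unknown state. A discrete‐time Luenberger‐type observer with hand‐picked gains (both error poles placed at 0.5) reconstructs the current from the input and output signals. All names and numbers are hypothetical, and measurement noise is omitted.

```python
def run_current_observer(u_w=1.0, true_current=0.3, dt=1.0, steps=60):
    """Estimate position and an unknown constant current in one dimension.

    Plant:     p[k+1] = p[k] + (u_w + c) * dt,   c constant, y[k] = p[k]
    Observer:  same model, corrected with the measured position error.
    """
    # Gains chosen so that both poles of the estimation error dynamics
    # lie at 0.5: l1 = 1, l2 * dt = 0.25.
    l1 = 1.0
    l2 = 0.25 / dt

    p_true = 0.0
    p_hat, c_hat = 0.0, 0.0  # the observer starts without knowing the current
    for _ in range(steps):
        y = p_true                    # position measurement (noise-free here)
        innovation = y - p_hat        # mismatch between measurement and estimate
        # Predict with the known input u_w and correct with the innovation.
        p_hat, c_hat = (
            p_hat + (u_w + c_hat) * dt + l1 * innovation,
            c_hat + l2 * innovation,
        )
        p_true += (u_w + true_current) * dt  # true vehicle motion
    return p_true, p_hat, c_hat


p_true, p_hat, c_hat = run_current_observer()
print(p_true, p_hat, c_hat)
```

The observer uses exactly the two kinds of information named in the text: the input value (speed through water) and the measured output (position). The estimated current is the kind of quantity that would be passed on as part of the navigation data.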
Figure 2‐8: Scheme of the automatic control of an unmanned marine robot
The estimated navigation data vector 𝐧 functions as output of the navigation system. It contains estimates for all pose quantities relevant for the Guidance and the Control system. To be more precise, possible components of 𝐧 can be estimates of the pose 𝛈, the velocity 𝐯, and possibly of the relevant disturbances. For underwater agents, for example, it is of great benefit for the controller to have a good estimate of the water current. In general, the concrete content of 𝐧 always depends on the current mission scenario, especially on the requirements of the Guidance and the Control system. It becomes clear at this point that especially the 'Observers' block within the navigation system has to be designed with advanced methodologies of control theory; hence, the results discussed in chapters 5‐7 focus on this part of the overall control scheme.

2.3.3 Guidance and Control System
Figure 2‐9: Configurations of the series or cascade compensation (a), feedforward compensation with additional trajectory generator (b), and cascade control (c)
To understand the role of the Guidance and the Control system, we shall compare them to the standard elements of the control loop in control theory. Figure 2‐9 a) shows the scheme of a standard single‐loop feedback system in series or cascade compensation (Golnaraghi and Kuo, 2010, pp. 490). The system to be controlled is denoted as Plant and mathematically described by the transfer function $G_P(s)$, which describes the transfer of an input or control signal $u(t)$ to an output signal $y(t)$ using Laplace transforms. The plant can be subjected to one or several disturbances, denoted as $z(t)$, which are additional input values of the plant that can usually not be directly manipulated. Given a so‐called reference signal $r(t)$, the task of controller design is equivalent to that of finding a dynamic system, denoted as Controller and described
by the transfer function $G_C(s)$, such that the output signal $y(t)$ approaches and remains at the current value of $r(t)$. The controller uses the difference $e(t)$ between $r(t)$ and the measured signal $y(t)$ as input and outputs the control signal $u(t)$. For many practical applications, the reference signal can be considered to be a static value, denoted as set‐point or desired value. In these cases, the main task of the controller is to compensate for the influences of the disturbance(s) in order to keep the output signal at the desired value. An automatic heating system of a regular house can be considered as an example for this method. The desired temperature within the rooms acts as set‐point value, which will usually remain constant. A typical disturbance for this system would be the outside temperature, or to be more precise, the change of the outside temperature. If the outside temperature drops, the heat transfer between house and environment will rise, resulting in a lower room temperature. The controller will need to raise the control signal (e. g. the amount of water flowing through the radiators) in order to reestablish the room temperature at the desired value. The process described here is an example of fixed set point control (SAMSON, 2016, pp. 23), in which the reference signal remains at a fixed value, and the main focus is usually put on disturbance rejection. For marine vehicles, this task is denoted as Set‐Point Regulation (Fossen, 2002), e. g. for constant depth or speed values given by a human operator.
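The heating example can be reproduced with a short simulation. The sketch below is a toy under stated assumptions and is not taken from the cited sources: the room is modeled as a first‐order system, the controller is a PI controller, and all parameter values (time constant, gains, temperatures) are invented for illustration. Halfway through the simulation the outside temperature drops by 10 degrees, which acts as the disturbance.

```python
def simulate_heating(setpoint=21.0, kp=2.0, ki=0.5, tau=10.0, dt=0.1, t_end=200.0):
    """Fixed set point control of a room temperature with a PI controller.

    Room model (assumption): tau * dT/dt = -(T - T_out) + heat
    Returns a list of (time, room temperature, heating signal) samples.
    """
    temp = setpoint
    integral = 0.0
    history = []
    for k in range(int(round(t_end / dt))):
        t = k * dt
        t_out = 10.0 if t < 100.0 else 0.0   # disturbance: outside temperature drops
        error = setpoint - temp              # e(t) = r(t) - y(t)
        integral += error * dt               # integral part removes steady-state error
        heat = kp * error + ki * integral    # control signal u(t)
        temp += dt * (-(temp - t_out) + heat) / tau  # explicit Euler step of the plant
        history.append((t, temp, heat))
    return history


history = simulate_heating()
coldest_after_drop = min(temp for t, temp, _ in history if t >= 100.0)
print("coldest room temperature after the drop:", round(coldest_after_drop, 2))
print("final room temperature:", round(history[-1][1], 3))
```

After the outside temperature drops, the room temperature dips temporarily, the controller raises the heating signal, and the temperature returns to the set point. This is the disturbance rejection behavior described in the text.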
Figure 2‐10: Typical Lawnmower maneuver to cover a defined area, e. g. in a mapping mission
If we look at the control of an autonomous robot, we will usually want to plan a path or trajectory to be followed by the robot with high precision. On the one hand, this might be because the path is planned around a priori known obstacles, like walls, so that a large deviation from the path might result in a collision. On the other hand, if the robot has to perform a mapping mission, it is absolutely necessary that it follows a preplanned path with a desired temporal speed assignment along the path, usually in the shape of a so‐called lawnmower maneuver (see Figure 2‐10), to guarantee coverage of the whole area. Therefore, if we consider the robot position as the output signal to be controlled, it becomes obvious that it is not enough to use a static value as reference signal, e. g. for the 𝑥‐ and 𝑦‐position in a Cartesian coordinate system, but that these values need to follow a specific trajectory related to the concrete mission plan. Those applications, in which a control system needs to enable the output signal to follow a changing reference signal with high precision, are denoted as follow‐up control (SAMSON, 2016, pp. 23) or servocontrol (Lunze, 2010, pp. 700). In contrast to fixed set point control, now the command response of the system is of utmost importance. For marine vehicles, this is denoted as Trajectory Tracking Control (Fossen, 2002).
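A lawnmower reference of this kind can be generated from a rectangular area and a track spacing. The following sketch is only an illustration: the function names, the axis‐aligned rectangle, and the constant speed assignment are assumptions made here, and real mission planners will typically also account for turning radius and sensor footprint.

```python
import math


def lawnmower_waypoints(x_min, x_max, y_min, y_max, spacing):
    """Waypoints of back-and-forth tracks parallel to the x-axis.

    Consecutive tracks are `spacing` apart in y and are traversed in
    alternating directions.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    n_tracks = int(math.floor((y_max - y_min) / spacing + 1e-9)) + 1
    waypoints = []
    for i in range(n_tracks):
        y = y_min + i * spacing
        if i % 2 == 0:
            waypoints += [(x_min, y), (x_max, y)]
        else:
            waypoints += [(x_max, y), (x_min, y)]
    return waypoints


def assign_times(waypoints, speed):
    """Attach arrival times for a constant speed along the path.

    This turns the purely spatial path into a time-parameterized reference.
    """
    timed = [(waypoints[0], 0.0)]
    t = 0.0
    for a, b in zip(waypoints, waypoints[1:]):
        t += math.hypot(b[0] - a[0], b[1] - a[1]) / speed
        timed.append((b, t))
    return timed


track = lawnmower_waypoints(0, 100, 0, 40, 20)
for point, arrival_time in assign_times(track, speed=1.0):
    print(point, arrival_time)
```

Without the time stamps, the waypoint list describes a path following task; with them, it becomes a reference for trajectory tracking.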
To realize such a control, the scheme can be expanded as depicted in Figure 2‐9 b), which borrows from Golnaraghi and Kuo, 2010, pp. 490, as well as Lunze, 2010, pp. 383. In comparison to the simple structure discussed before, two changes can be noticed: Firstly, there are additional feedforward controllers, which are displayed with dashed lines, as this part is optional; nevertheless, it is a widely used means to improve the command response. Using the upper feedforward controller, changes of the reference signal can quickly influence the control signal. It must be kept in mind that the regular controller, described by $G_C(s)$, often features an integral part, which makes it more precise, but also slower. The introduction of the upper feedforward controller especially improves the dynamics of the command response, while the classic controller compensates for the disturbance(s) and the model inaccuracies which always occur in real applications. The solution with the lower feedforward controller will usually be employed if the main controller $G_C(s)$ reacts relatively quickly, e. g. if it does not feature an integral part. The combination of the main controller with one of the dashed‐line feedforward controllers in the depicted way is denoted as Feedforward compensation. The second difference is the trajectory generator, which is responsible for computing the dynamic values of the reference signal. To this end, it has access to the overall task description of the automated system. The trajectory generator might be activated when a specific task has to be accomplished, like the transfer of a subsystem of a process engineering facility from one state to another. It will then compute an optimized course for the output signal, often as a function of time, and forward this course as reference signal to the controller(s).
The trajectory generator might also compute derivatives of the reference signal, like the first or second time derivative of 𝐫(𝑡), which is not shown in Figure 2‐9 b) to improve readability. If we compare this procedure with the scheme of the automatic control of an unmanned marine robot as depicted in Figure 2‐8, we can see that the guidance system can usually be described by a feedforward compensation principle with an added Trajectory Generator as discussed before. The task of the Guidance system is to read the mission data from a human‐created mission plan which describes the general purpose of the autonomous vehicle mission, possibly to consider additional data like weather, and finally to compute reference values for the control system. It must be noted that the concrete reference data transferred between Guidance and Control system can differ considerably between realizations. The consideration of weather data can be of great benefit in certain circumstances. If we look at missions of marine gliders, which usually cover long routes of up to several hundred kilometers within one mission, it can be reasonable to plan the concrete path considering the present and the forecasted sea currents to minimize the mission duration and the energy consumption, as successfully demonstrated by Eichhorn, 2013. However, this strategy is only of limited use for typical AUV and AUV‐team missions, which usually take place within a limited area for which no detailed sea current measurements and forecasts are available. Nevertheless, if adequate current measurements are available, e.g. from previous missions in the same area, they can be employed to find paths that enable the vehicles to move with or against the currents, which is of advantage for sensor measurements. See Eckstein et al., 2015 for details. As shown in Figure 2‐9 b), it is common in control theory that the Trajectory Generator block does not have any information about the current status of the output signal 𝑦(𝑡).
In contrast to this, Figure 2‐8 shows a (dashed) connection between the estimated navigation data and the trajectory generator. This indicates that the Trajectory Generator within the Guidance system will usually have at least some access to the navigation data. For instance, if the mission plan contains a lawnmower maneuver like the one depicted in Figure 2‐10, the
2.3 Navigation, Guidance and Control in the Autonomous Control for Marine Robots
Guidance system must be aware which of the lines or arcs composing the lawnmower is the one the vehicle is currently trying to follow, and it must detect when the end of the current part is reached and the next one is to start. In this case, the frequency at which the Trajectory Generator needs to read the estimated navigation data can be lower than for the other parts of the Guidance and Control system. As a result of these computations, there will be a geometrical object (usually a line or an arc element) that the vehicle shall follow with some defined precision, or other geometrical information, like course angle or distance to a defined destination. In Figure 2‐8, these are subsumed in the vector 𝐠. At some point, there must be a computation of concrete reference values for some navigation data on the basis of the current part of the mission plan and the current position/velocity/orientation. Usually, this computation will also be performed within the Guidance System, by a system referred to as Guidance controllers, which will in this case need to access the estimated navigation data. The line with the estimated navigation data is still dashed to denote that the Guidance controller will usually require this information at a lower frequency than the Autopilot (see below), but usually also more often than the Trajectory Generator. The output of the Guidance system will then be reference values for concrete navigation data, denoted as vector 𝐧 in Figure 2‐8. As discussed for the estimated navigation vector before, 𝐧 does not necessarily contain reference values for all navigation data like position, velocity and orientation, but only for some of them, depending on the actuation of the concrete vehicle. The Control system in Figure 2‐8 consists of the two blocks ‘Autopilot’ and ‘Control Allocation’. In this relation, the autopilot computes the necessary forces and moments in order to follow the references given by the Guidance system and stores them in the vector 𝛕ᵣ.
The subscript 𝑟 already hints at the fact that these values are also reference values, while the Control allocation is responsible for controlling the actual actuators of the robot in order to achieve the required forces and moments. If we look at a classic propulsion system of a ship with a propeller and a side rudder, then the control of the vehicle’s course angle requires both propeller and rudder, as the vehicle cannot change the course without moving forward. A large number of realizations for different navigation data elements to control can be found in the books of Thor Fossen (Fossen, 1994 and Fossen, 2002). The set‐up of the Control system can therefore be compared with the cascade control as depicted in Figure 2‐9 c): Several (at least two) classic cascade compensations are nested inside each other. The output of the first controller (autopilot) equals the reference signal for the inner control loop, where the control allocation acts as controller. In a system like this, usually the inner loop is designed for a quick response time, while the outer loop is designed for high precision. If a good design can be chosen, the actual forces and moments 𝛕 acting on the vehicle will follow the planned ones, 𝛕ᵣ. Of course, the concepts shown in Figure 2‐9 b) and c) can be combined to improve the results, as is the case in the demonstrated overall control scheme according to Figure 2‐8.

2.3.4 Example and Literature Study on Guidance and Control

As the collaboration of the Guidance and the Control system depends a lot on the concrete realization for a certain vehicle, it is difficult to explain in a generalized way. We shall therefore discuss this using a concrete example. In Schneider et al., 2007a, an approach for the simulation of autonomous marine vehicles is described. A corresponding control scheme on the kinematic level is depicted in Figure 2‐11.
The scheme is designed for a vehicle with propeller and yaw and pitch rudders without vertical thrusters.
Figure 2‐11: Possible structure of a controlled vehicle behavior model in a simulation of an AUV (Schneider et al., 2007a)
The upper part is denoted as ‘Task manager’, which can be considered a part of the Trajectory Generator. Together with the Guidance Controllers, it makes up the Guidance part of the control scheme. The Task manager accepts data from the mission plan and navigation data as inputs. By employing the latter, it detects which part of the mission plan is currently valid, and when a defined point is reached. The Task manager in this example can fulfill two of the three movement control tasks that have been defined at the beginning of section 2.3: reaching and holding a given point (Point Stabilization), and following a defined path with a speed defined in the mission plan, without the need to reach certain points on the path at predefined times (Path Following). For the former, it computes the distance to the desired point, 𝑒, and hands it to the Distance Controller. This unit will then compute a reference value for the speed over ground, 𝑤, that will decrease when the vehicle approaches the desired position, and drop to zero when it finally arrives. Also, the task manager computes the desired course angle from the current position to the desired position, 𝑤, and delivers it to the SeaCurrent Compensation‐block, bypassing the TrackKeeping controller which is not needed in this case. For the Path Following task, the Task Manager hands the absolute angle of the path, 𝛹, and the current Cross Track Error (the perpendicular distance of the vehicle to the desired path), 𝑒, to the TrackKeeping Controller, which then computes a suitable reference for the course angle of the vehicle. The speed defined in the
mission plan, 𝑤, is directly given to the SeaCurrent Compensation‐block, bypassing the Distance Controller in this case. The depth/altitude of the vehicle is controlled separately. The Task Manager provides a reference value for the altitude, 𝑤 = Height over Ground, or for the depth, 𝑤, to the Depth Management, which computes the error value of the 𝑧‐coordinate, 𝑒, which is then used by the DepthByPitch Controller to compute the reference value for the (absolute) pitch angle, 𝑤. As the change of depth by pitching requires a certain speed through water, the Task Manager computes an adequate reference value, 𝑤. All reference values described so far as well as the estimates of the sea current are fed into the SeaCurrent Compensation Block, which computes the final reference values for the vehicle, for the pitch and heading angle as well as for the surge speed: 𝑤, 𝑤, and 𝑤. If we compare this to the scheme depicted in Figure 2‐8, these three values are summarized in the vector 𝐩 and form the interface between the guidance and the control part. From that point on, the model employed in Figure 2‐11 simplifies the control system and the dynamic part of the vehicle model by modelling the control loops for the elements in 𝐩 as high‐order delay blocks, that is, the real values will follow the reference ones with some delay. In this relation, the reference value for the roll angle 𝜙, 𝑤, is computed proportional to the angular velocity 𝑟, that is, the vehicle will roll more the faster the yaw angle is about to be changed, as is realistic for typical underwater vehicles. Finally, based on the attitude and the current surge speed, the block "Body‐fixed‐>Earth‐Fixed Transformation" computes the true velocity of the vehicle in the global inertial frame (𝑣, 𝑣, 𝑣), and after the summation of the sea current components, the values are integrated to result in the inertial position.
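The horizontal part of this guidance chain can be sketched in a few lines. The sketch below is not the implementation of Schneider et al., 2007a; the control laws inside the blocks are not given in the text, so a lookahead steering law for the TrackKeeping Controller and a saturated proportional law for the Distance Controller are assumed here, and all function names and parameter values are hypothetical. Positions are (north, east) pairs in metres, angles are in radians clockwise from north, the path consists of straight segments without repeated waypoints, and depth control is left out.

```python
import math


def update_active_segment(waypoints, idx, pos, tol=1.0):
    """Task-manager step: advance to the next straight segment once the
    vehicle's projection onto the active one is within tol of its end."""
    while idx < len(waypoints) - 2:
        (an, ae), (bn, be) = waypoints[idx], waypoints[idx + 1]
        seg_len = math.hypot(bn - an, be - ae)
        along = ((pos[0] - an) * (bn - an) + (pos[1] - ae) * (be - ae)) / seg_len
        if along < seg_len - tol:
            break
        idx += 1
    return idx


def path_angle_and_cross_track(a, b, pos):
    """Absolute path angle and signed cross-track error (positive when the
    vehicle is to starboard of the path) for the segment a -> b."""
    psi_path = math.atan2(b[1] - a[1], b[0] - a[0])
    dn, de = pos[0] - a[0], pos[1] - a[1]
    e_ct = -math.sin(psi_path) * dn + math.cos(psi_path) * de
    return psi_path, e_ct


def track_keeping_course(psi_path, e_ct, lookahead=20.0):
    """Course reference that steers back onto the path (assumed lookahead law)."""
    return psi_path - math.atan2(e_ct, lookahead)


def distance_controller_speed(distance, gain=0.2, v_max=1.5):
    """Speed-over-ground reference that shrinks to zero at the target
    (assumed saturated proportional law)."""
    return min(v_max, gain * max(distance, 0.0))


def sea_current_compensation(course, speed_over_ground, current_n, current_e):
    """Heading and surge-speed references that produce the desired velocity
    over ground in a horizontal current (sway motion neglected)."""
    vn = speed_over_ground * math.cos(course) - current_n
    ve = speed_over_ground * math.sin(course) - current_e
    return math.atan2(ve, vn), math.hypot(vn, ve)


if __name__ == "__main__":
    wps = [(0.0, 0.0), (100.0, 0.0), (100.0, 10.0), (0.0, 10.0)]
    pos = (50.0, 3.0)                      # 3 m to starboard of the first leg
    idx = update_active_segment(wps, 0, pos)
    psi_path, e_ct = path_angle_and_cross_track(wps[idx], wps[idx + 1], pos)
    course = track_keeping_course(psi_path, e_ct)
    heading, surge = sea_current_compensation(course, 1.0, 0.0, 0.5)
    print(idx, e_ct, math.degrees(course), math.degrees(heading), surge)
```

In the usage example the vehicle is east of a northbound leg and an eastward current of 0.5 m/s is assumed, so both the course reference and the heading reference point west of north, the latter by a larger angle.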
Finally, all navigation data according to equations (2‐1) and (2‐2) has been computed. It has become clear so far that there is a close connection between the Guidance and the Control system and that the concrete realization can vary a lot, depending on the concrete motion task. As guidance and control is not within the focus of this thesis, we will only cite some examples from the literature to give the reader a general idea of the possibilities: A classic strategy to follow a path is to create a virtual target that moves exactly along the path, and then control the robot to follow the target, that is, to minimize the distance to the target. This procedure is sometimes called “chase the rabbit”, according to Millington and Funge, 2009, pp. 77. An adequate approach was successfully adapted to underactuated marine vehicles, as described by Aicardi et al., 2001. An overview of other solutions can be found in Bibuli et al., 2009. Another application of the virtual target strategy to AUVs, combined with backstepping control design methodologies, is presented in Lapierre and Soetanto, 2006. A solution based on gain‐scheduling control theory and the linearization of a generalized error vector about trimming paths has been proposed in Pascoal et al., 2006. The approach introduced by Indiveri et al., 2007, which was experimentally validated by Bibuli et al., 2008, addressed the underactuation of a marine robot already when defining the error variable to be globally and robustly stabilized to zero. In the context of cooperative robots, for instance in order to keep a closed formation, a possible strategy is to plan optimized trajectories for all involved robots that would keep the desired formation if all robots were able to exactly follow the trajectories with the desired speed, as suggested in Glotzbach et al., 2014.
In order for this approach to work, it is necessary to use a path following strategy such as the ones cited above, and to adapt the speeds of the vehicles along the paths following some control laws, while at the same time keeping in mind the narrow‐band and error‐prone underwater communication,
like in the paper by Ghabcheloo et al., 2009. On the other hand, it is possible for underwater vehicles to maintain a desired formation without following a predefined path, by determining the reference values for their course and speed based on their relative position with respect to one or two other marine robots, as demonstrated in Rego et al., 2014 or Abreu and Pascoal, 2015. If we compare the different concepts, it already becomes apparent that the requirements for cooperative navigation may differ a lot:
For the first concept, which can be denoted as Cooperative Path Following, the robots need to be able to follow a defined path; therefore they must be able to estimate their positions in an inertial earth‐fixed coordinate system.
For the second concept, they need to know the relative positions of two other vehicles with respect to themselves.
As a conclusion of these observations, we shall introduce a possible classification for navigation in marine robotics within the following sections.

2.3.5 Requirements of the Navigation System for Guidance and Control
Figure 2‐12: Global/ Absolute navigation: Estimating a robot's position in a global reference
It has become clear that not necessarily all navigation data according to Equations (2‐1) and (2‐2) need to be estimated. It always depends on the requirements of the guidance and control system, or possibly on the mission scenario itself. The same holds for the reference frame in which the data need to be available. It is straightforward to say that the position coordinates usually are the most important values to be estimated. In equation (2‐1), they have been defined in the local earth‐fixed frame. The science of
estimating these earth‐fixed coordinates of a vehicle is denoted as "global" or "absolute" navigation. Figure 2‐12 displays a scenario where global navigation is required. We assume that a surface vehicle is intended to follow a predefined path with good accuracy. The path itself is defined within the global frame; the robot uses GPS fixes to estimate its own position in the global frame. On that basis, path following algorithms as described within the last section can be employed to fulfill the task. In the last section, we also discussed the possibility to realize the movement control of a robot within a robot team based on estimating the positions of other robots relative to its own position. The procedure to estimate the position of a marine robot or any marine object in the body‐fixed reference frame of another (mobile) marine robot is referred to as "relative" navigation.
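The relation between the two descriptions can be written down for the horizontal plane. The following sketch assumes (north, east) positions, a heading angle measured clockwise from north, a body frame with the first axis pointing forward and the second to starboard, and negligible roll and pitch; the function names are hypothetical.

```python
import math


def global_to_body(p1, psi1, p2):
    """Position of vehicle 2 expressed in the horizontal body-fixed frame of
    vehicle 1, given both global positions and the heading psi1 of vehicle 1."""
    dn, de = p2[0] - p1[0], p2[1] - p1[1]
    x_rel = math.cos(psi1) * dn + math.sin(psi1) * de    # ahead of vehicle 1
    y_rel = -math.sin(psi1) * dn + math.cos(psi1) * de   # to starboard of vehicle 1
    return x_rel, y_rel


def body_to_global(p1, psi1, rel):
    """Inverse mapping: global position of vehicle 2 from the global position
    and heading of vehicle 1 and the relative position in its body frame."""
    x_rel, y_rel = rel
    north = p1[0] + math.cos(psi1) * x_rel - math.sin(psi1) * y_rel
    east = p1[1] + math.sin(psi1) * x_rel + math.cos(psi1) * y_rel
    return north, east


if __name__ == "__main__":
    # Vehicle 1 at the origin heading east, vehicle 2 located 10 m to the east:
    # vehicle 2 then lies straight ahead of vehicle 1.
    print(global_to_body((0.0, 0.0), math.pi / 2, (0.0, 10.0)))
```

The inverse mapping only yields a global position if the global position and heading of the reference vehicle are known, which is why relative navigation alone does not provide global coordinates.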
Figure 2‐13: Relative Navigation: Estimating a robot’s position in a body‐fixed reference of another robot
Figure 2‐13 displays the set‐up: The position of vehicle 2 is described in coordinates within a reference frame fixed to the body of vehicle 1, denoted as 𝑥, 𝑦. In an adequate scenario, vehicle 2 might be at the surface, following a predefined path with the help of GPS fixes. By estimating the relative position of vehicle 1 with respect to vehicle 2, the latter can remain in formation with vehicle 1, as discussed in the mentioned literature source. In Figure 2‐13 it is assumed that the orientation of reference X1Y1Z1 is aligned with the orientation of vehicle 1. It is also possible to define the system with a fixed alignment, e.g. following the NED‐convention, with the origin fixed to the CG of the vehicle. As discussed, whenever defining a concrete scenario, it is important to clearly state which navigation data are supposed to be estimated, and in which frame they must be available. Two observations can be made concerning relative navigation: Firstly, a vehicle 1 that performs relative navigation with respect to another vehicle 2 is usually not able to estimate its own global position. However, if we assume that vehicle 2 is able to perform global navigation for its own positions, that both vehicles have synchronized clocks, and that both vehicles store all navigation data obtained during the mission, then it is possible to compute the global position data for vehicle 1 after the mission is over and all data can be retrieved. That means vehicle 1 cannot use global navigation data for its motion control; however, global navigation data can be used for activities related to the payload data. In MORPH, for instance,
vehicle 1 would record video data that was intended to be used for the creation of the geo‐referenced map after the mission. Therefore, the global navigation data of vehicle 1 while recording the video data are needed. As discussed, this information is available offline (after the mission) and can be used for post‐mission data processing. Secondly, we might consider a scenario in which vehicle 2 is replaced by a stationary object, like a buoy or a beacon. If we consider that vehicle 1 is able to perform relative navigation with respect to the stationary object, and that the constant global position of the object is known to vehicle 1 from before the mission started, then vehicle 1 is also able to determine its own position in a global reference; hence the principle of relative navigation has led to global navigation data. An adequate concept is employed in the process of single beacon navigation (see section 2.5.2). This does not work for dynamic objects like vehicles, unless vehicle 2 communicates its full navigation data to vehicle 1, which might lead to a high communication load for the acoustic channel.

2.3.6 Summary of the Discussions on Navigation, Guidance, and Control

As a final summary of the discussions at this point, the following definitions for navigation, guidance and control according to Fossen, 2002 are cited (for navigation, it was already cited in section 2.1): “Navigation is the science of directing a craft by determining its position, course, and distance traveled.
In some cases velocity and acceleration are determined as well.” “Guidance is the action or the system that continuously computes the reference (desired) position, velocity and acceleration of a vessel to be used by the control system.” “Control is the action of determining the necessary control forces and moments to be provided by the vessel in order to satisfy a certain control objective.” As we have so far clearly identified the meaning of the term ‘navigation’ in marine robotics and distinguished it from guidance and control, we can now move the focus to navigation as the process of providing all necessary information on position, orientation, and movement for both guidance and control systems. As discussed before, not necessarily all navigation data need to be estimated; this is always related to the requirements of the guidance and control system. Also, for a given scenario it is important to keep in mind which measurement data can be acquired by existing sensors. As a result of the evaluation of the required estimates, one may find that more measurement data are necessary, and therefore more sensors need to be employed. Therefore, when we define the benchmark scenarios for the navigation scenarios to be discussed in the further course of this thesis in section 2.6, we will start by naming the concrete navigation data we need to estimate, and the measurements that are available for this purpose. The tasks summarized as ‘Navigation’ in marine robotics form a crucial subtask in the overall process to realize an autonomous agent. For instance, it is stated by Cox, 1991 in relation to land robotics that “using sensory information to locate the robot in its environment is the most fundamental problem to providing a mobile robot with autonomous capabilities”.
This is especially true for the underwater environment, where navigation is in general even more complicated than for land and air systems, due to the lack of an existing system for position estimation, like the Global Positioning System (GPS), and due to the lack of a reliable, broadband communication link. We shall discuss this in more detail in sections 2.4 and 2.5.
2.4 Sensors and Methods for Navigation of Marine Robots
As we have discussed so far, navigation for marine robots is a challenging but important component of the overall process to realize autonomous behavior. Also, the concrete requirements on which data need to be measured or estimated, at which frequency and with which accuracy, strongly depend on the control task or the overall mission scenario. Furthermore, submerged marine robots have no access to any GNSS. Both facts give rise to the development of various different solutions that usually have both advantages and disadvantages. There is no single ‘perfect’ solution currently available. For that reason, heterogeneous data from several sources are recorded and merged to raise the accuracy of the final estimation. Also, it can be stated that a good estimation of the 𝑥‐ and 𝑦‐coordinate in an inertial frame is the biggest challenge to date. First of all, GNSS can still play an important role in marine robot navigation, as it is a standard device for surface craft. For them, it can be considered a cheap and precise navigation system. Marine surface robots usually have good visibility of a large number of satellites, as the view is not blocked by houses, trees, or mountain walls. It can be stated that the precision of GPS varies between about 0.1 and 10 m (Kinsey et al., 2006), but can further be improved with methods like DGPS (Differential GPS), more generically denoted as DGNSS (Differential GNSS), or RTK (Real Time Kinematic). These methods make use of additional stationary reference stations or interpolated virtual stations. They provide corrective values for the position estimation that need to be transferred to the surface robot online in order for it to improve its position estimation. DGNSS procedures can reduce errors caused by the clocks of the satellites and ionospherically induced run time errors.
The possible precision for the user station is in the range of 1 meter, as long as the distance to the reference station does not exceed a few hundred kilometers. RTK systems employ the phase of the carrier wave of the signal. The precision of RTK systems can reach 1 to 2 centimeters, if the conditions are optimal and the distance to the reference station remains lower than a few kilometers (Bauer, 2011). For any submerged object, GNSS cannot be used. In this subchapter, we review alternative navigation methods and provide literature studies. At first we will discuss sensors which enable a direct access to some navigation data (section 2.4.1), methods based on distance and/or bearing measurements to external objects (section 2.4.2) and mapping based methods (section 2.4.3) as the main methods to gain raw measurement data. In section 2.4.4, we look at filter techniques suitable to merge different raw data. A survey on cooperative navigation in robot teams will be given in section 2.4.5. Section 2.4.6 then provides an introduction and literature study on Optimal Sensor Placement, whose necessity will become clear in the preceding section. Finally, we will sum up our discussions on navigation in general in section 2.4.7. We will motivate our selection of navigation based on inertial sensors and range and bearing measurements for the further course of this thesis. As a result, we need to take a closer look at the acoustic based range and bearing measurement techniques that are currently available within the maritime sector, which we will do in section 2.5. Table 2‐2 provides an overview of commonly used sensors for marine robot navigation. It contains mainly data from Kinsey et al., 2006, while other sources are stated explicitly. For each sensor, it is stated which navigation data can be measured, and update rate, precision, range and drift are given. The final column states in which section of this thesis the sensor is discussed in detail.
It shall be mentioned that the actual performance of maritime navigation systems depends on a large number of variables, so the numbers given in Table 2‐2 should be understood as a rough estimate.
Table 2‐2: Typical sensors used for marine robot navigation, based on Kinsey et al., 2006 plus additional data

| Sensor | Navigation Data | Update Rate | Precision | Range | Drift | Section |
|---|---|---|---|---|---|---|
| Acoustic Altimeter | 𝑧 (Altitude) | 0.1‐10 Hz | 0.01‐1.0 m | Varies with frequency | ‐ | 2.4.1 |
| Pressure Sensor | 𝑧 (Depth) | 1 Hz | 0.1‐0.01% | Full ocean depth | ‐ | 2.4.1 |
| Inclinometer | 𝜙, 𝜃 | 1‐10 Hz | 0.1‐1.0° | +/‐ 45° | ‐ | 2.4.1 |
| Magnetic Compass | 𝜓 | 1‐10 Hz | 1‐10° | 360° | ‐ | 2.4.1 |
| Gyro (mechanical) | 𝜓 | 1‐10 Hz | 0.1° | 360° | 10°/h | 2.4.1 |
| Gyro (Ring‐Laser and Fiberoptic) | 𝜓 | 1‐1600 Hz | 0.1‐0.01° | 360° | 0.1‐10°/h | 2.4.1 |
| Gyro (North Seeking) | 𝜙, 𝜃, 𝜓, 𝑥, 𝛚 | 1‐100 Hz | 0.1‐0.01° | 360° | ‐ | 2.4.1 |
| IMU | 𝑥, 𝛚, 𝛚 | 1‐1000 Hz | 0.01 m | varies | varies | 2.4.1 |
| AHRS¹ | 𝜙, 𝜃, 𝜓, 𝑥, 𝛚 | 1‐200 Hz | 𝜙, 𝜃: 0.1‐0.5°; 𝜓: 0.2‐1.5°; 𝑥: 0.005‐0.02 mg | Full range | 1‐10°/h | 2.4.1 |
| Bottom‐Lock Doppler | 𝑥, 𝑥 | 1‐5 Hz | 0.3% or less | varies, 18‐100 m | ‐ | 2.4.1 |
| Propeller rotational rate + dynamic vehicle model² | 𝑥 | 0.1‐1 Hz | Standard deviation: 0.1 m/s | unlimited | ‐ | 2.4.1 |
| GPS | 𝐩 | 1‐10 Hz | 0.1‐10 m; 0.01‐0.1 m with DGPS/RTK | In air: unlimited if at least 4 satellites are visible; in water: 0 m | ‐ | 2.4 |
| LBL (12 kHz) | 𝐩 | 0.1‐1 Hz | 0.1‐10 m | 5‐10 km | ‐ | 2.5.1 |
| LBL (300 kHz) | 𝐩 | 1‐10 Hz | 0.007 m | 100 m | ‐ | 2.5.1 |
| SBL (150 kHz)³ | 𝐩 | 1 Hz | 𝑟⁴: 0.03 m, \|∆𝐩\|: 0.75 m | 200 m | ‐ | 2.5.3 |
| USBL (20‐28 kHz)⁵ | 𝐩 | 0.1‐2 Hz | 𝑟: 0.2 m, α: 3° | 500 m horizontal, 150 m vertical | ‐ | 2.5.4 |
| USBL (7‐17 kHz)⁶ | 𝐩 | ca. 1 Hz | 𝑟: 0.01 m, α: 0.1° | 8000 m | ‐ | 2.5.4 |
¹ according to InertialLabs, 2016
² according to own experiences in the MORPH‐project
³ according to Bingham et al., 2005
⁴ For SBL and USBL systems, 𝑟 refers to the measured distance, and α to the measured bearing.
⁵ according to data sheet TRITECH, 2016
⁶ according to Product Information EVOLOGICS, 2016

2.4.1 Sensors With Direct Access to Navigation Data

The high quality paper of Kinsey et al., 2006, provides an excellent discussion on different available sensor types and is used as general base for the discussions in this section. It can be stated that the 𝑧‐coordinate is usually the part of the navigation data that can be measured in the easiest way. The depth is directly related to the seawater pressure via standard equations for the properties of seawater; see e.g. Fofonoff and Millard Jr., 1983. Pressure sensors deliver highly precise data and are low‐cost devices. The most commonly employed methods are related to strain gauges and quartz crystals. With the accuracies reached according to Table 2‐2, it is common to use the measured depth directly, therefore reducing other localization problems such as the ones discussed in sections 2.4.2 and 2.5 to two‐dimensional problems. For the measurement of the altitude, acoustic altimeters can be employed that are based on echo sounding. They measure the runtime of an acoustic signal from the sensor to the sea floor and back, from which an estimate for the altitude can be derived.
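Both conversions for the 𝑧‐coordinate can be sketched briefly. The depth function below uses the polynomial pressure‐to‐depth relation as it is commonly reproduced from the report of Fofonoff and Millard Jr., 1983; the coefficients are quoted here without having been checked against the original report and should be verified before any real use. The input is assumed to be sea pressure (absolute pressure minus atmospheric pressure) in decibar. The altitude function assumes a constant sound speed of 1500 m/s, which is only a nominal value.

```python
import math


def depth_from_pressure(p_dbar, lat_deg):
    """Depth in metres from sea pressure in decibar and latitude in degrees,
    using the polynomial commonly attributed to Fofonoff and Millard (1983)."""
    x = math.sin(math.radians(lat_deg)) ** 2
    # Gravity as a function of latitude, with a linear pressure correction.
    gravity = 9.780318 * (1.0 + (5.2788e-3 + 2.36e-5 * x) * x) + 1.092e-6 * p_dbar
    numerator = (((-1.82e-15 * p_dbar + 2.279e-10) * p_dbar - 2.2512e-5) * p_dbar
                 + 9.72659) * p_dbar
    return numerator / gravity


def altitude_from_echo(two_way_time_s, sound_speed=1500.0):
    """Altitude above the sea floor from the two-way travel time of an echo,
    assuming a constant sound speed along the acoustic path."""
    return 0.5 * sound_speed * two_way_time_s


if __name__ == "__main__":
    print(depth_from_pressure(10000.0, 30.0))   # roughly 9712.65 m
    print(altitude_from_echo(0.02))             # 15 m for a 20 ms round trip
```

The printed depth can be compared with the check value of 9712.653 m for 10000 decibar at 30° latitude that is usually quoted together with this polynomial.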
Magnetic sensors are the classical way to determine angular information, either for the heading only (single‐axis) or for all three axes (flux‐gate magnetometers). Their benefits are their relatively good accuracy when properly calibrated, high update rates, low costs, and low power consumption. State of the art systems usually contain an on‐board microprocessor and provide a serial digital data output. However, the use of magnetic based navigation sensors may result in some typical errors within the overall navigation solution, caused by magnetic disturbances of the vehicle itself or by geographic, local magnetic anomalies, to name but a few. See Whitcomb et al., 1999a as well as Kinsey and Whitcomb, 2004, for a more detailed discussion of these issues. The measurement of roll and pitch angle is relatively easy. Standard inclinometers are based on the determination of the direction of the acceleration due to gravity, employing pendulum sensors, fluid‐level sensors, or accelerometers. The accuracy of low cost sensors suffers from time‐varying vehicle acceleration (e.g. heave, surge, sway). Sensors from the medium price range feature additional gyroscopic devices which stabilize the attitude measurement. For the measurement of angular rates, gyroscopes are widely used. The first‐generation sensors were based on rotating mechanical gyroscopes, and their large size, costs, and energy consumption hindered a general use in civil marine robots. Vibrating gyroscopes are widely used to determine angular rates with good accuracy, yet not good enough for determining the angular position. Optical gyroscopes, which can be separated into Fibre Optic Gyroscopes (FOG) and Ring Laser Gyroscopes (RLG), represent the high end of this class of devices, yet again their high costs and power consumption have to be considered. Cheaper devices can be realized as Microelectromechanical Systems (MEMS), yet they suffer from higher drift rates of up to 70°/h (Woodman, 2007).
Very good results can be achieved with north‐seeking gyrocompasses, which employ the earth’s rotation and the earth’s gravitational field. Recent improvements have decreased the cost of these systems, so that they are commonly employed in high‐precision survey operations. They are also an important part of the so‐called Inertial Measurement Unit (IMU), which comprises several sensors like acceleration sensors or angular rate sensors and is suitable to estimate position data by integrating acceleration measurements. Together with the components to merge the data from the single sensors, it is also denoted as Inertial Navigation System (INS). However, the navigation data suffers from a drift, even though it might be very small for high‐end devices. It must be stated that standard IMUs are very expensive (often in the range of more than € 100,000) and their size, weight, and power consumption make it hard or even impossible to employ them for small civilian oceanographic robots. These systems might be employed for high‐precision surveys in challenging areas, like under icecaps. Details have been published e.g. by Larsen, 2002, Stokey et al., 2005, or McEwen et al., 2005. However, MEMS‐technology has enabled the production of cheap and small components, which also raised the availability of the system to be presented next. A sensor suite typically used today to replace the single gyroscope based measurement devices, while still being cheap and of the size of a match box, is the Attitude Heading Reference System (AHRS). These devices feature solid‐state or MEMS gyroscopes, accelerometers and magnetometers on up to all three axes and usually provide measurements for all Euler angles and possibly accelerations. The device usually also contains a microcontroller and employs some filter techniques (see section 2.4.4) to merge measurements of the single component sensors and to provide the data in a digital format, which tremendously improves the accuracy. See e.g.
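The gravity‐based tilt measurement of an accelerometer inclinometer, and its sensitivity to vehicle acceleration, can be sketched as follows. The sketch assumes a body frame with 𝑥 forward, 𝑦 to starboard and 𝑧 down, an ideal sensor that outputs specific force (acceleration minus gravity) in m/s², and the usual roll‐pitch‐yaw Euler angle convention; the function name is hypothetical.

```python
import math

G = 9.81  # nominal gravitational acceleration in m/s^2 (assumed value)


def tilt_from_accel(fx, fy, fz):
    """Roll and pitch in radians from the specific force measured in the body
    frame. Exact only while the vehicle itself is not accelerating."""
    roll = math.atan2(-fy, -fz)
    pitch = math.atan2(fx, math.hypot(fy, fz))
    return roll, pitch


if __name__ == "__main__":
    # Vehicle at rest, pitched up by 10 degrees.
    theta = math.radians(10.0)
    print([math.degrees(a) for a in tilt_from_accel(G * math.sin(theta), 0.0, -G * math.cos(theta))])

    # Level vehicle accelerating forward with 1 m/s^2: the surge acceleration
    # is misread as a pitch angle of several degrees.
    print([math.degrees(a) for a in tilt_from_accel(1.0, 0.0, -G)])
```

The second case illustrates why low cost inclinometers suffer from time‐varying vehicle accelerations and why additional gyroscopic devices are used to stabilize the attitude measurement.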
Martin and Salaün, 2010, or Zhi‐jian et al., 2003, for a more detailed discussion. The velocity of a vehicle can be measured employing the Doppler effect. Doppler Velocity Logs (DVL) can provide measurements for the true velocity over ground, that is, the speed over
ground and the true course angle, as long as the vehicle moves over reasonably flat terrain that is within the sensor range. This mode, denoted as bottom lock, is possible for altitudes of up to about 100-200 m. The system can also be used to measure the velocity through water in cases where the bottom is out of range, which is denoted as water lock (Alcocer, 2009). DVLs play an important role in navigation, especially for single autonomous vehicles, where they provide the backbone for navigation or support IMUs (Brokloff, 1994; McEwen et al., 2005) or other navigation techniques like LBL systems (Spindel et al., 1976; Whitcomb et al., 1998), which are discussed in section 2.5.1. According to Kinsey et al., 2006, DVL systems typically exhibit two error sources, namely heading errors, in terms of the accuracy and precision of the attitude sensor, and sensor calibration alignment errors between DVL and attitude sensor. The procedure of estimating the position based on velocity measurements is denoted as Dead Reckoning (DR). It must be kept in mind that the described errors lead to a drift of the position estimation error over time, as described for the IMU systems before. Similar sensors that operate on the basis of the Doppler effect are the Correlation Velocity Log (CVL), which was designed to provide velocity measurements at altitudes of up to 500 m with good accuracy even at low speeds (Bovio et al., 2006), and the Acoustic Doppler Current Profiler (ADCP), which directly measures the speed of the sea current (Yildiz et al., 2009). It must be kept in mind that the usage over unstructured terrain is limited; therefore, in mission scenarios like the ones executed within the MORPH project, the usage of DVL systems was not considered an option.
An easy, yet promising way to estimate the surge speed through water employs the rotational rate of the propeller(s), which is usually computed by the control system as a reference value, and a dynamic model of the robot.
It is usually possible to derive a steady-state relation between the rotational rate and the speed with good accuracy. A disadvantage is that this method requires a dynamic model of the vehicle, which is laborious to build and needs to be adapted to every new vehicle.
The measurement principles discussed so far are based on sensors carried on board the robots. None of them is capable of delivering a permanent drift-free estimate of the 𝑥- and 𝑦-coordinates, which is usually of the greatest importance. A possible solution for underwater missions executed at low depth is that the robot carries DVL/IMU/AHRS equipment and returns to the surface regularly in order to use a GPS position fix to correct the position estimate. This is a possible approach for survey missions in which the vehicles are intended to move in lawnmower maneuvers (see Figure 2-10) over the area of interest to collect sonar and/or video data. Usually, usable data is only collected on the legs of the lawnmower pattern. The turning maneuvers, whose only aim is to bring the vehicle back into position for the next leg, might influence the roll angle of the vehicle and therefore also shift the sensor equipment, resulting in data that will not be used later. In this case, it is conceivable to surface during the arcs for a GPS fix and to return to the reference depth before the next leg starts. This solution is only feasible for missions at low depth. In general, there is a need for a global navigation solution that also works for submerged robots. In the next section, we discuss possible ways to achieve this by using external devices.
2.4.2 Navigation Based On Distance and/or Bearing Measurements to External Objects
In order to obtain a long-term stable estimate of the global 𝑥- and 𝑦-coordinates, a classical approach in engineering and science applications is to perform range and/or bearing measurements between the object whose coordinates are to be estimated and a set of devices
with known coordinates. As soon as those measurements are available, many different mathematical procedures exist to compute estimates of the unknown position. Nevertheless, the first problem to solve is the actual measurement. The ranges between the object and the devices can be obtained by measuring the Times of Arrival (TOA) of acoustic or electromagnetic signals. Assuming that the propagation speed of the signal in the medium is known and constant, the computation of the ranges is straightforward. Another possibility is the measurement of the signal strength, which may also be directly related to the distance to the signal source. For this procedure, the term Received Signal Strength (RSS) is used (Yan et al., 2010). According to Alcocer, 2009, it is common in the underwater domain to base the TOA methods on acoustic signal propagation between an object and a set of hydrophones/transponders with known coordinates, see e.g. Alcocer et al., 2007, Caiti et al., 2005, or Rendas and Lourtie, 1994. For land applications, like indoor systems, electromagnetic signals and a set of RF receivers/transmitters are employed (Priyantha, 2005, Cheung et al., 2004).
The described TOA approach requires knowledge of the travel time of the signal between devices and object. That means that the receivers must know when the signal was transmitted by the sender. In some applications, this cannot be assumed, e.g. in passive sonar and radar systems and in geophysics. In these cases, only the arrival times of the signal at the receivers are known, which gives rise to the Range Differences (RD) or Time Difference of Arrival (TDOA) problem. This problem is in some sense stricter than the TOA problem; hence, it is possible to use TDOA algorithms on TOA measurements, but not vice versa (Alcocer, 2009). In TDOA algorithms, there is typically a common offset in the pseudoranges, which must be estimated as an additional parameter.
This offset can be eliminated by employing differences among the pseudoranges, which led to the name of the approach (Yan et al., 2010). Examples can be found in Huang et al., 2001, or Smith and Abel, 1987. These approaches are referred to as Range-only localization, or trilateration. The term triangulation is also sometimes used, but in a strict sense it refers to localization based on angular measurements (Alcocer, 2009). It is possible to determine range and direction to a source by combining TOA and TDOA approaches, as is done in Ultra Short Baseline (USBL) systems, see section 2.5.4.
Finding the unknown coordinates based on ranges is a nonlinear problem. As such, no direct solution exists in general. The usual approach is to formulate an optimization problem, namely to find the minimum or maximum of an objective function. Sums of squared residuals, likelihood functions, power density functions, or similar can be employed as objective functions. This gives rise to the two most commonly used criteria, Least Squares (LS) and Maximum Likelihood (ML) (Yan et al., 2010). The former seeks to minimize the sum of squared errors, e.g. between the measured ranges and the ranges that would result for a given estimate of the object’s position. The errors are squared to prevent positive and negative errors from cancelling each other out. If the unknown parameter enters the measurement value linearly, a closed-form solution exists, that is, an optimal estimate can be computed immediately. This leads to a procedure referred to as linear regression, which will be described in detail in section 5.1.2.1. Even though the Range-only localization problem is nonlinear, directly solvable LS approaches have been developed by accepting some assumptions or simplifications. Related results have been reported e.g. by Yan et al., 2009, Guvenc et al., 2008, or Venkatesh and Buehrer, 2006.
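To illustrate how a directly solvable LS formulation can arise from the nonlinear range equations, the following minimal sketch uses one common simplification: the squared range equation of the first reference point is subtracted from all others, which removes the quadratic term in the unknown position. This is an illustrative assumption and not necessarily the formulation used in the works cited above. The function name, the reference positions, and the target position are hypothetical, the problem is restricted to two dimensions, and the ranges are assumed to be noise-free.

```python
import math


def trilaterate_linear_ls(refs, ranges):
    """Closed-form 2D position estimate from ranges to known reference points.

    Subtracting the squared range equation of the first reference point from
    all others removes the quadratic term in the unknown position, leaving a
    linear system A p = b that is solved in the least-squares sense.
    """
    (x0, y0), r0 = refs[0], ranges[0]
    # Accumulate the 2x2 normal equations (A^T A) p = A^T b.
    s_xx = s_xy = s_yy = t_x = t_y = 0.0
    for (xi, yi), ri in zip(refs[1:], ranges[1:]):
        ax, ay = 2.0 * (xi - x0), 2.0 * (yi - y0)
        b = (xi**2 + yi**2) - (x0**2 + y0**2) - ri**2 + r0**2
        s_xx += ax * ax
        s_xy += ax * ay
        s_yy += ay * ay
        t_x += ax * b
        t_y += ay * b
    det = s_xx * s_yy - s_xy**2
    if abs(det) < 1e-9:
        raise ValueError("reference points are (nearly) collinear")
    return ((s_yy * t_x - s_xy * t_y) / det,
            (s_xx * t_y - s_xy * t_x) / det)


# Hypothetical example: four reference points and noise-free ranges.
refs = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
target = (30.0, 40.0)
ranges = [math.hypot(target[0] - x, target[1] - y) for x, y in refs]
estimate = trilaterate_linear_ls(refs, ranges)
print(estimate)
```

With noisy ranges, the squaring step distorts the noise statistics, so the result of such a linearization is in general not the ML estimate; it may, however, serve as a starting point for an iterative method.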
The second criterion, ML, is based on the likelihood function, which describes, as a function of the unknown parameters, the probability of obtaining the measurements at hand. The ML estimate is the set of parameters that maximizes this function, that is, the parameters that cause the observed measurements with the highest probability. The problem eventually requires the minimization of a nonconvex, nonsmooth objective function, for which no closed-form solution exists. According to Yan et al., 2010, this gives rise to the employment of Iterative Descent (ID) techniques, e. g. the steepest-descent method, the Newton method, or the Gauss-Newton method, to name but a few. A detailed analysis was published by Teunissen, 1990. These methods are based on an iteration that requires several steps, starting from a usually arbitrarily chosen initial parameter vector. The computation might produce a heavy load on the robot control hardware. Also, the final parameter vector determined on convergence might represent a local, but not the global minimum.
Especially in this context, one might be interested in the best possible estimation result that can theoretically be achieved. For this purpose, the Cramér-Rao Bound (CRB) is studied, which is the lowest possible error variance that can be achieved by an unbiased estimator (van Trees, 2001). This gives rise to the problem of Optimal Sensor Placement (OSP), which is of special interest in the further course of this thesis and will be introduced in section 2.4.6.
It can be stated that in the maritime sector, the usage of acoustic signals is the most common way to generate range and bearing measurements. Nevertheless, especially for close range, alternatives employing laser techniques are currently being developed. In Bosch et al., 2016, the authors demonstrate a method based on pose recognition of an underwater robot marked with active light markers, employing computer vision.
The author of this thesis is currently involved in the development of an underwater laser tracker capable of measuring bearing and altitude angle between two submerged targets. Intermediate results have been published in Eckstein et al., 2014, and Hamann et al., 2013.
As a result of the discussions within this section, we shall henceforth use the term “Target” for the object(s) whose position(s) is/are to be estimated, while we introduce the term “Reference Objects (RO)” for the devices serving as a base for measurements of range and/or bearing. As one can already imagine, in cooperative navigation these ROs might often be realized as marine robots. After having introduced the basic idea behind range/bearing-based navigation, we will further discuss existing acoustic-based solutions in the maritime sector in section 2.5, which will later set the course for the results presented within this thesis.
2.4.3 Mapping Based Methods
Using a map for orientation is a straightforward approach for humans on land. As already discussed in section 2.2.6, this is due to the perceptual abilities of human beings, while we lack a sense for detecting our executed trajectory based on velocity or acceleration, or for finding our coordinates in a global frame. The usage of maps is an important part of the overall navigation problem in land robotics. The aim is to obtain a map of the environment, which is also denoted as environmental modeling. In marine robotics, the importance is not as great as in the land domain. On the one hand, this is due to the available sensors. In land robotics, it is straightforward to use all kinds of cameras, laser range finders, or similar devices to obtain maps of the environment. The usage of cameras underwater is always a big challenge. It requires advanced data processing. Also, it strongly
depends on the mission area. In harbor basins, visibility is often limited to a few meters. On the other hand, in order to obtain maps that can be used for navigation purposes, the robot has to be in the vicinity of environments with rich spatial diversity. For robots operating at great altitudes or over a flat and monotonous sea floor, the usage of maps is difficult.
Nevertheless, mapping based methods have also been employed in marine robotics to support navigation, together with other methods already discussed. The mentioned difficulties in the employment of cameras gave rise to the usage of other principles for mapping, e. g. bathymetric, geomagnetic, or gravimetric features. If a vehicle is equipped with an echosounder and a depth cell, it can easily measure the local bathymetry. Bathymetric sonar sensors usually feature a range of up to 100 meters (Kinsey et al., 2006). Using an a priori existing bathymetry map of the mission area, knowledge of the starting position, and velocity measurements or estimates thereof, navigation solutions have been demonstrated, e. g. by Teixeira, 2007, or Oliveira, 2007. In general, the aim is to study the geophysical navigation principles of animals in order to copy their abilities. Discussions about this can be found in Bingman and Cheng, 2005, or Walker et al., 2002, for instance.
In many scenarios, no maps are available a priori. This gives rise to methods that enable a robot to create a map of its environment and, at the same time, use it for its own orientation. This method is referred to as Simultaneous Localization And Mapping (SLAM) or Concurrent Mapping and Localization (CML) and is of great importance especially in land robotics.
In the well-known book “Probabilistic Robotics” by Thrun, Burgard and Fox (Thrun et al., 2006), it is stated that in SLAM the robot has to acquire a map of its environment while simultaneously localizing itself relative to this map, and that this procedure is significantly more difficult than the two individual steps involved, namely localization within a known map and mapping with known poses. According to Montemerlo and Thrun, 2007, in a SLAM scenario a robot moves through an unknown environment, observing typical landmarks in its environment. As the pose of the robot becomes more and more uncertain while it moves, due to the existing control error, the estimates of the landmark positions also become more uncertain. When the robot detects a landmark it has already observed before, this leads to a significant decrease of the uncertainty in the robot pose estimate.
As landmarks, or features, significant geographical objects have to be employed that can be recognized from different distances and viewing angles. It is possible to base the algorithm on significant small objects, denoted as point features, on edges, denoted as line features, or on even more complex objects. The concrete choice depends on the structure of the environment in the mission area, the available sensor suites, and general conditions like visibility. As already discussed in section 2.2.6, the desired navigation data can be of a metric or topological kind, and a large number of different approaches from control and systems theory can be employed, among which filters play an important role (see also section 2.4.4). For a more detailed description of SLAM, see for instance Montemerlo and Thrun, 2007, and the references therein.
As described above, the importance of SLAM algorithms in maritime navigation is smaller than in land robotics, due to the worse conditions for optical sensors and the frequent absence of objects with significant shapes that can be used as landmarks, or features. Nevertheless, SLAM has also been investigated in marine robotics. An overview of recent work is given in the thesis of Aulinas, 2011. As stated there, Williams, 2001, reported an approach in which he fused data from the on-board sonar and vision system of a marine robot, relying on point features. Other examples of point feature based underwater SLAM were reported by Leonard and Feder, 2001, or Newman et al., 2003. Barkby
et al., 2009, presented an approach in which no explicit features are necessary, because the algorithms are based on a low-resolution map that has to be generated by a surface vessel beforehand. Another SLAM approach, for the inspection of ship hulls, was reported by Walter et al., 2008. In Ribas et al., 2008, the authors used a mechanically scanned imaging sonar to extract line features as a base for their SLAM algorithm. An approach based on side-scan sonar was introduced by Tena-Ruiz et al., 2004.
According to Bichucher et al., 2015, recent work in underwater SLAM concentrates on estimating the entire vehicle trajectory, which was first presented for an AUV by Eustice et al., 2006a, and later extended by Kim and Eustice, 2013, by considering the visual saliency of each underwater image. Another report of full trajectory smoothing was given by Fallon et al., 2011.
We will recap the statements about the usability of SLAM-based navigation algorithms and distinguish them from the methods to be discussed in this thesis in the summary section 2.4.7.
2.4.4 A Review of Filtering Techniques
As explained before, it is common that several of the necessary navigation data cannot be measured directly, while it is possible to measure quantities that are derived from the data of interest. Nevertheless, these measurements might be error-prone, noisy, and available only at low rates. Robust methods are needed to estimate the navigation data of interest based on the measurement of derived quantities, while using all available knowledge about the general vehicle behavior. It becomes clear that the observer and filter methods of control theory have the potential to make an important contribution to this problem. Therefore, in chapter 4 we will discuss the concepts and ideas behind these methods as a precondition for their usage in chapters 5-7. At this point, we will look at recent work within this area.
A precondition for the usage of these methods is that a mathematical model of the robot is available, which might, on the other hand, also be used for the controller design. Stochastic state estimators play an enormous role (Kinsey et al., 2006), especially optimal unbiased estimators. These employ knowledge of process and measurement noise for the computation of optimal gains (see chapter 4). In most cases, kinematic vehicle models are employed. A huge leap in the field of stochastic state estimators was made by Kalman, 1960a, whose work resulted in a filtering method referred to as the Kalman Filter (KF), which is widely used in various applications and domains of control theory. As an important feature of this solution, the estimation of the navigation data is not only based on the noisy measurement of derived data, but also uses a model of the system behavior and known input values, which are accessible as outputs of the control system or can easily be measured. We refer again to Figure 2-8 to illustrate that the reference values for surge speed and attitude, which are provided by the control system, can directly be employed by the navigation system. The employment of filter techniques can be considered a unique approach to use as much information as possible and to merge it into a final estimate of variables that cannot be accessed directly.
Nevertheless, the usage of the original Kalman Filter concept is limited, as it is formulated for linear systems only, while the models employed to describe mobile systems very often contain nonlinearities. Therefore, one standard approach for position estimation of mobile vehicles in general is the use of the Extended Kalman Filter (EKF) (Müller et al., 2010). It linearizes the nonlinear relation between the measured output and the position to be estimated around the current trajectory estimate. EKFs have been successfully applied to the navigation of AUVs, e.
g. in the context of SLAM, see Hernandez et al., 2009. A concept based on EKF and SLAM techniques was presented by Bayat and Aguiar, 2010, where partial measurements from an IMU, acoustic ranging from a single beacon buoy, and a monocular camera attached to the AUV are merged based on multi-model estimation techniques.
In general, a drawback of the EKF is the fact that nonlinearities lead to non-Gaussian distributions of stochastic signals, which actually violate the premises of the EKF. Approaches that take nonlinearities into account are the Particle Filter (PF) and the Unscented Kalman Filter (UKF), which is also called Sigma-Point Kalman Filter (Julier et al., 2000, van der Merwe and Wan, 2004, van der Merwe et al., 2004). Both filters have been used for the navigation of AUVs, see Maurelli et al., 2009, and Lammas et al., 2008, for the PF, and He et al., 2009, Liu et al., 2008, or Qi et al., 2007, for the UKF. It can be stated that Particle Filters are preferred if no rough guess of the initial vehicle position is available. On the other hand, the UKF requires much less computational effort, which can be considered an advantage, since computational effort requires energy, which is limited on board the vehicle.
Whereas the vehicle models can be obtained with sufficient precision (if parameters like masses, engine power, etc. are known), stochastic filters suffer from the fact that the covariances, especially those of the measurements, usually have to be adapted. Concepts for a superior adaptation module have been developed, e.g. using fuzzy rules (Grana et al., 2009, Pentzer et al., 2009) or, from outside marine robotics, based on self-calibration strategies as an initial adaptation to the sensor performance (Wachten et al., 2009). During vehicle maneuvers, classification methods can be applied to the measurements in order to classify different maneuver situations.
Adaptive classification approaches such as Kernel Methods (Shawe-Taylor and Cristianini, 2006, Rasmussen and Williams, 2006) can be deployed in order to assess environmental conditions and to allow the best suited state estimation.
Communication bandwidth is limited in the underwater environment. This may lead to significant time delays, such that location information from other vehicles is only available for past points in time. Hence, it is important to consider time delays within the state estimation for inter-vehicle navigation, as we have demonstrated in Glotzbach et al., 2012, and will continue to discuss in section 5.2. Each transmitted measurement is additionally provided with a time stamp. The estimator can use it to step back in time to the specified point in time, correct the estimate there, and then predict forward to the current point in time (Back and Forward approach, see Alcocer et al., 2007, and Alcocer, 2009). Concepts have also been developed for master-slave AUV configurations as in Yao et al., 2009a, where the filter is called Delayed Extended Kalman Filter (DEKF), or in the context of networked control systems (Ying et al., 2007).
If no measurement is available, the model-based estimator is able to continue predicting the vehicle state. This property can be used to reduce communication requirements: The transmitting vehicle checks whether an innovation has occurred and distributes its information only in this case. This leads to asynchronous filtering as proposed in Yonghui et al., 2006.
As an alternative to stochastic estimators, deterministic ones can be employed (Kinsey et al., 2006). They are based on deterministic models of system behavior and measurements. Employing methods from control theory, it can be proven that an employed estimator is asymptotically stable. The method is often used with dynamic vehicle models, see e.g. Lohmiller and Slotine, 1998, or Kinsey, 2006.
The employment of the vehicle’s nonlinear dynamics might be an advantage in comparison to stochastic estimators; however, it is a drawback that no analytical methods exist for the computation of optimal gains. Therefore, numerical simulations have to be employed.
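To make the EKF idea discussed in this section more concrete, the following minimal sketch shows one prediction step with a kinematic model and one correction step with a single range measurement. It is only an illustration under simplifying assumptions: the state consists of the two-dimensional position only, the velocity is treated as a known input, and the function names, noise parameters, and numerical values are hypothetical and not taken from the cited works.

```python
import math


def ekf_predict(x, P, v, dt, q):
    """Kinematic prediction: integrate the known velocity v over dt.

    q is an assumed process noise intensity added to the diagonal of P.
    """
    x_pred = [x[0] + v[0] * dt, x[1] + v[1] * dt]
    P_pred = [[P[0][0] + q * dt, P[0][1]],
              [P[1][0], P[1][1] + q * dt]]
    return x_pred, P_pred


def ekf_range_update(x, P, z, ref, r_var):
    """Correct the estimate with one range measurement z to the point ref."""
    dx, dy = x[0] - ref[0], x[1] - ref[1]
    h = math.hypot(dx, dy)                      # predicted range
    if h < 1e-9:
        return x, P                             # Jacobian undefined, skip
    H = [dx / h, dy / h]                        # range linearized at x
    PHt = [P[0][0] * H[0] + P[0][1] * H[1],
           P[1][0] * H[0] + P[1][1] * H[1]]
    S = H[0] * PHt[0] + H[1] * PHt[1] + r_var   # innovation variance
    K = [PHt[0] / S, PHt[1] / S]                # Kalman gain
    innovation = z - h
    x_new = [x[0] + K[0] * innovation, x[1] + K[1] * innovation]
    HP = [H[0] * P[0][0] + H[1] * P[1][0],
          H[0] * P[0][1] + H[1] * P[1][1]]
    # Covariance update P_new = (I - K H) P
    P_new = [[P[i][j] - K[i] * HP[j] for j in range(2)] for i in range(2)]
    return x_new, P_new


# Hypothetical example values: initial estimate, one prediction, one update.
x, P = [10.0, 0.0], [[25.0, 0.0], [0.0, 25.0]]
x, P = ekf_predict(x, P, v=(0.5, 0.0), dt=1.0, q=0.01)
x, P = ekf_range_update(x, P, z=12.0, ref=(0.0, 0.0), r_var=1.0)
print(x, P)
```

A single range measurement only reduces the uncertainty along the line of sight to the reference point; the uncertainty perpendicular to it stays unchanged in this step, which hints at why the geometry between target and reference objects matters.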
The examples mentioned within the last sections partially hinted at the complex of navigation methods based on interactions between cooperating marine robots, denoted as cooperative navigation. We will provide a short literature study in the next section.
2.4.5 Cooperative Navigation
With the growing availability of marine robots, cooperative navigation is increasingly coming into research focus, in the sense that one tries to exploit the availability of several robots to compensate for the missing global positioning possibilities underwater. The usage of at least one Autonomous Surface Vehicle (ASV) with GPS access, range and/or bearing measurements to one or more AUVs, and the employment of advanced methods from systems theory can result in working navigation solutions.
Several examples can be found in the literature. For example, Bahr et al., 2009, employed a bank of filters for “bookkeeping” multi-vehicle trajectories. Based on dead reckoning and range-only measurements over a sliding time window, a set of all possible solutions for the AUV trajectories is built, and the most promising one is selected by minimizing a cost function. Methods have been developed to overcome the use of a beacon network (e.g. long baseline, see section 2.5.1), e. g. by Baker et al., 2005, or Baccou et al., 2001. Systems based on acoustic measurements (de la Cruz et al., 2000), certain architectures like a leader-follower configuration (Edwards et al., 2004), and sensor data fusion algorithms (Yao et al., 2009b) have been investigated. Papadopoulos et al., 2010, reported an experiment with one ASV and one AUV comparing several estimators, i.e. EKF, Particle Filter (PF), and Nonlinear Least Squares (NLS). The movement of the ASV was fixed to a zigzag track. Another experiment, by Matsuda et al., 2014, consisted of a setup with two AUVs, between which the leader role alternated.
From these discussions it becomes clear that one focus within navigation in general, and cooperative navigation in particular, lies on the process of merging data that was acquired with different sensors at different rates and with different accuracies. This gives rise to advanced concepts from nonlinear filtering theory, which is an important part of control theory. Within this thesis, we will contribute especially to this domain with suggestions of new solutions for specific problems that will be formulated in section 3.2. To this end, we will mainly focus on range and (partially) bearing measurements and on available measurements from internal sensors, which might differ between mission scenarios.
Additionally, it is of great importance to understand the ultimate limitations of the performance that can be achieved with different filtering structures. This gives rise to studies in an area referred to as Optimal Sensor Placement, which we shall discuss next.
2.4.6 Introduction to the Problem of Optimal Sensor Placement (OSP)
In our discussions of the navigation possibilities for marine robots, we have put a focus on those methods employing range and bearing measurements between the object(s) whose positions are to be estimated and some reference objects, so far mainly represented by static beacons or passive buoys. Even for these scenarios, the question arises as to where to place the sensors to provide measurements as a base for a position estimation that is optimal in some sense. This problem is referred to as Optimal Sensor Placement in the literature. As we intend to replace the buoys in the GIB scenario with marine robots that can be actively controlled, the question can even be extended to the problem of finding optimal motion schemes.
This problem has been studied in different contexts, not limited to marine robotics. In the literature, one can find studies on the optimal spatial placement of a single sensor, e. g. for a camera in the context of a global vision system (Kay and Lou, 1992), for a point tactile sensor, including the computation of a search path for the determination of a unique pose of a known object (Ellis, 1992), or for a laser range finder to determine the next viewpoint for the 3D perception of the environment (Maver and Bajcsy, 1993).
Other studies deal with the placement of several sensors. Abel, 1990, discussed the optimal placement of sensors within a line-shaped array. Zhang, 1995, describes a method for two-dimensional Optimal Sensor Placement for underwater vehicles that is based on the interpretation of the sensor uncertainties as ellipses, computing the superimposed area of uncertainty caused by all employed sensors. In these studies, as stated by Zhang, different constraints on the placement of sensors very often need to be considered.
To employ methods of control and estimation theory, it is possible to introduce the Cramér-Rao Bound (CRB) as the cost function of an optimization problem. The CRB yields relevant information on the best performance in target positioning that can possibly be achieved with any unbiased estimator, where performance is evaluated in terms of the variance of the position estimates (van Trees, 2001). Therefore, one is interested in minimizing the CRB. As the CRB is the inverse of the Fisher Information Matrix (FIM), possible strategies include the maximization of the FIM determinant (as e. g. performed by Bishop et al., 2007, or Martinéz and Bullo, 2006) or the minimization of the trace of the CRB (see e. g. Yang and Scheuing, 2005, or Yang, 2007). We will provide an elaborate discussion of FIM and CRB in the theory chapter 4.
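The two criteria named above, the FIM determinant and the trace of the CRB, can be compared numerically for simple sensor geometries. The following minimal sketch assumes a two-dimensional setting, a known target position, and range measurements corrupted by independent zero-mean Gaussian noise with the same, distance-independent standard deviation for all sensors. Under these assumptions the FIM is the sum of the outer products of the unit vectors between sensors and target, divided by the noise variance. All function names and numerical values are hypothetical.

```python
import math


def range_fim(target, sensors, sigma):
    """2x2 FIM for range measurements with i.i.d. Gaussian noise (std sigma)."""
    f_xx = f_xy = f_yy = 0.0
    for sx, sy in sensors:
        dx, dy = target[0] - sx, target[1] - sy
        d = math.hypot(dx, dy)
        ux, uy = dx / d, dy / d         # unit vector from sensor to target
        f_xx += ux * ux
        f_xy += ux * uy
        f_yy += uy * uy
    s2 = sigma ** 2
    return [[f_xx / s2, f_xy / s2], [f_xy / s2, f_yy / s2]]


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def crb_trace(fim):
    """Trace of the CRB matrix, i.e. of the inverse of the 2x2 FIM."""
    return (fim[0][0] + fim[1][1]) / det2(fim)


def sensors_on_circle(center, radius, angles_deg):
    """Place sensors on a circle around center at the given angles."""
    return [(center[0] + radius * math.cos(math.radians(a)),
             center[1] + radius * math.sin(math.radians(a)))
            for a in angles_deg]


# Hypothetical comparison: evenly spread versus clustered sensors.
target = (0.0, 0.0)
spread = sensors_on_circle(target, 50.0, [0.0, 120.0, 240.0])
clustered = sensors_on_circle(target, 50.0, [0.0, 10.0, 20.0])

fim_spread = range_fim(target, spread, sigma=1.0)
fim_clustered = range_fim(target, clustered, sigma=1.0)
print(det2(fim_spread), crb_trace(fim_spread))
print(det2(fim_clustered), crb_trace(fim_clustered))
```

In this simplified model, the evenly spread configuration yields the larger FIM determinant and the smaller CRB trace, whereas the clustered configuration leaves the target position poorly constrained perpendicular to the common line of sight.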
See for example Taylor, 1979, as well as Porat & Nehorai, 1996, and Ucinski, 2004, for lucid presentations of the above topics. In the area of marine robotics, the work of David Moreno-Salinas shall be mentioned, see e.g. Moreno-Salinas et al., 2010, Moreno-Salinas et al., 2011, or Moreno-Salinas et al., 2013.
Some of the work presented in this thesis was mainly inspired by the work of Martinéz and Bullo. They have, among others, investigated the following question: Assuming that a wheeled robot is able to move inside a defined convex area, and that several sensors are available that can measure the distance to the robot with some superimposed measurement noise and that have to be placed on the boundary of the convex area, what would be the best angular configuration for the spreading of the sensors to maximize the determinant of the FIM?
The fact that the usage of OSP requires the target position to be known is often used in scientific discussions to question its usefulness and practical relevance. As OSP is the main focus of the discussions in chapter 6 of this thesis, we will comment on that standpoint at the beginning of that chapter.
OSP is of great relevance in the scientific parts of this thesis, which are organized in the following way: While we will report our activities on cooperative navigation with filter-based data merging in chapter 5, and on Optimal Sensor Placement in chapter 6, we will discuss the possible combination of both techniques in chapter 7. Thereby, we will provide another proof of the usefulness of Optimal Sensor Placement methods in the marine robot domain.
2.4.7 Summary of Discussions on Navigation Procedures and Methods
To summarize the discussions on maritime navigation so far, we can state the following: Navigation is a challenging subtask in the overall process of realizing autonomous behavior for marine robots.
As there is no global solution available for navigation, especially of submerged robots, several very different approaches have been employed that all have advantages and disadvantages. As a common solution is still out of reach, it is important to clearly define the requirements of a concrete mission scenario, and then realize a solution that fulfills these demands. To this end, it is possible to employ internal sensors, range and possibly bearing measurements to external objects, and mapping techniques, all of which have been discussed in detail. As these different methods deliver partial solutions at different rates and levels of accuracy, it is important to merge the data, which gives rise to advanced technologies from the control theory domain. Linear and nonlinear filtering theory in particular plays a major role.

The usage of multiple, cooperating robots may pave the way to new navigation possibilities, which was already discussed in section 2.4.5 and will be examined further in chapters 5‐7 of the thesis at hand. In this context, we will use internal navigation data as well as range and bearing measurements to external objects (and likewise to other team members) to gather data for the merging process. The usage of mapping techniques will not be discussed any further. It can be stated that the employment of range and bearing methods results in a slightly higher generality of the proposed solutions, as no requirements concerning visibility and proximity to objects with proper complexity have to be met. This should not be understood as a general criticism of these methods, as it has been shown in section 2.4.3 that valuable contributions can be achieved if the stated requirements are met. In fact, if navigation data from mapping techniques is available in a certain scenario, it can be included in the filtering concepts to be presented in chapters 5‐7. For the same reason, we will also not include the usage of DVL systems for direct measurement of velocity over ground, as this is not possible for vehicles at certain altitudes or above rough terrain (as for instance within the MORPH project).
Keeping these statements in mind, in section 2.5 we will discuss available methods for acoustic‐based range and bearing measurements in the maritime sector. The final technique, discussed in section 2.5.5, will later be the entry point for the first benchmark scenario definition in section 3.3.1 and the discussions in section 5.2.
2.5 Navigation Employing Acoustic Measurements

Within this section, we continue the discussion started in section 2.4.2 and provide an overview of existing technologies in the maritime sector that are based on these techniques. The discussions were inspired by those in Alcocer, 2009, and Kinsey et al., 2006. The procedures to be described here became necessary to support the classical INS and DR systems, as these will always exhibit a drift over time. As discussed before, high‐precision systems are available, but even these can only operate for a limited time until the position estimation error becomes too large for the estimate to be useful. Besides, high‐precision systems are very expensive, and they are usually also relatively large and consume a great deal of energy; they might therefore not be an option, especially for small, civilian marine robots. This gave rise to concepts that allow for an estimate of global position data based on acoustic range and bearing measurements to sources with known locations. We will discuss the most common concepts.

2.5.1 Long Baseline (LBL)

It is a basic concept to describe the position of an underwater target by ranges to at least three reference objects with known coordinates. This led to the so‐called Long Baseline (LBL) systems, which are a classical procedure for global underwater navigation. As shown in Figure 2‐14, several beacons with acoustic transponders are fixed at the seafloor prior to the mission of the marine robot. Usually, the robot will interrogate the beacons; that means it will send a trigger ping which will be answered by one of the beacons. Based on the overall runtime, the robot obtains a measurement for the range to the beacon. Employing a set of range measurements and the a‐priori knowledge about the beacon positions, the robot can obtain a position estimate.
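The conversion of the overall runtime into a range, as described above, can be illustrated by a minimal sketch. It assumes a constant sound speed and a fixed, known reply delay of the beacon; the function name `two_way_range`, the parameter names, and the numerical values are hypothetical and only serve as an illustration, they are not taken from a specific LBL product.

```python
def two_way_range(round_trip_time_s, turnaround_time_s=0.0, sound_speed_mps=1500.0):
    """Convert a two-way (interrogate/reply) runtime into a range in meters.

    turnaround_time_s: assumed fixed processing delay of the beacon (hypothetical).
    sound_speed_mps:   assumed constant sound speed; 1500 m/s is only a nominal value.
    """
    travel_time_s = round_trip_time_s - turnaround_time_s
    if travel_time_s < 0.0:
        raise ValueError("round-trip time is shorter than the turnaround delay")
    # The signal covers the distance twice, hence the factor 0.5.
    return 0.5 * sound_speed_mps * travel_time_s


# Example: 1.35 s overall runtime, 20 ms assumed beacon delay.
range_m = two_way_range(1.35, turnaround_time_s=0.02)
print(f"estimated range: {range_m:.1f} m")
```

In practice the sound speed is not constant, so the result should be understood as a first approximation only.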
Figure 2‐14: Long Baseline (LBL) Navigation
LBL systems are designed to operate over distances of a few kilometres. Usually, a distance of at least 100 meters is assumed between the beacons, so it can be stated that LBL systems are employed for long‐range operations, as suggested by the name. As shown in Table 2‐2, these systems typically exhibit an error in the range of several meters and a quite low update frequency, as several acoustic communication processes are necessary to obtain enough range data for a position estimate. The update frequency might become even lower if one uses this system for a mission with multiple marine robots. Systems employing higher acoustic frequencies might reach a higher precision, at the cost of a smaller operation area. Approaches for underwater navigation based on LBL systems have been described by Kinsey and Whitcomb, 2004, Whitcomb et al., 1999b, or Hunt et al., 1974, to name but a few. The position estimate can be obtained from the range measurements by collecting enough data to use a trilateration algorithm, as discussed in section 5.1, or by employing filters. For instance, Batista, 2015, describes the usage of a globally exponentially stable filter for LBL measurements.

It must be kept in mind that the usage of an LBL system is usually relatively costly. At first, the beacons must be transported to the seafloor and fixed, especially on slopes. After this, the position of the beacons must be determined with high precision. Usually, a supply ship is used for these operations, and the beacon positions are estimated based on range measurements to the supply ship at various positions, while GPS is used to measure the true global positions of the ship (see Carta, 1978, or Hunt et al., 1974). This is a time‐intensive procedure, and the daily costs of supply ships for marine robots are in the five‐digit range. This gave rise to efforts to find alternative global positioning systems for underwater targets.
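To illustrate the trilateration step mentioned above, the following sketch estimates a 2D position from ranges to beacons with known coordinates. It uses a common linearization (subtracting the first range equation from the others) and solves the resulting least‐squares problem via its normal equations. This is only one possible approach and not necessarily the algorithm discussed in section 5.1; the function name `trilaterate_2d` and all numerical values are hypothetical.

```python
import math


def trilaterate_2d(beacons, ranges):
    """Least-squares 2D position from ranges to at least 3 non-collinear beacons.

    Subtracting the squared range equation of beacon 1 from that of beacon i
    yields the linear equation
        2 (x_i - x_1) x + 2 (y_i - y_1) y
            = r_1^2 - r_i^2 + x_i^2 + y_i^2 - x_1^2 - y_1^2 .
    """
    (x1, y1), r1 = beacons[0], ranges[0]
    s_aa = s_ab = s_bb = s_ac = s_bc = 0.0  # accumulators for the normal equations
    for (xi, yi), ri in zip(beacons[1:], ranges[1:]):
        a = 2.0 * (xi - x1)
        b = 2.0 * (yi - y1)
        c = r1**2 - ri**2 + xi**2 + yi**2 - x1**2 - y1**2
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_ac += a * c
        s_bc += b * c
    det = s_aa * s_bb - s_ab**2
    if abs(det) < 1e-9:
        raise ValueError("beacons are (nearly) collinear or too few")
    # Cramer's rule for the 2x2 normal equations.
    x = (s_ac * s_bb - s_ab * s_bc) / det
    y = (s_aa * s_bc - s_ab * s_ac) / det
    return x, y


# Hypothetical beacon layout and noise-free ranges to a target at (120, 340).
beacons = [(0.0, 0.0), (500.0, 0.0), (0.0, 500.0)]
target = (120.0, 340.0)
ranges = [math.hypot(target[0] - bx, target[1] - by) for bx, by in beacons]
print(trilaterate_2d(beacons, ranges))
```

With noisy ranges, the linearized solution is typically only used as a starting point, e.g. for an iterative refinement or a filter.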
2.5.2 Single‐Beacon Navigation
Figure 2‐15: Single Beacon Navigation
In order to simplify the process of global navigation, methods have been studied that rely on range measurements to only one RO. As the knowledge of the range to a single RO with known coordinates alone still results in an infinite number of possible positions for the target, it is necessary to merge several consecutive range measurements with estimates of the vehicle’s true velocity, $\dot{x}$, which can be obtained by a bottom lock DVL, for instance. It can also be stated that the ranges are used as a fix for the drifting error of simple DVL‐based navigation. The principle is depicted in Figure 2‐15. Standard approaches for single beacon navigation have been derived from LBL systems while only employing one beacon, therefore copying the concept that the vehicle has to interrogate the beacon, resulting in a two‐way time‐of‐flight range measurement. Examples have been reported by Ross and Jouffroy, 2005, Baccou and Jouvencel, 2002, and Larsen, 2000.

In order to simplify the overall concept, efforts have been undertaken to develop systems based on one‐way time‐of‐flight ranges. This was boosted by the development of acoustic modems and is referred to as One‐Way Travel Time (OWTT) navigation. In general, this requires synchronized clocks at the vehicle and the RO. The use of surface ships as ROs has also been studied, in order to enable operation in a larger area. If the RO is at the surface, it is common to use GPS receivers to synchronize the clocks of robot and RO before the robot submerges. Discussions about this method can be found in Webster et al., 2012, Eustice et al., 2006b, or Curcio et al., 2005.

2.5.3 Short Baseline (SBL)

Another approach to bypass the time‐ and cost‐intensive mounting and calibration of the beacons at the sea floor are the Short Baseline (SBL) systems. As the term ‘long’ in Long Baseline systems was usually defined as distances of more than 100 meters between the beacons, SBL systems usually exhibit distances between 1 and 100 meters.
The basic idea was to mount a set of receiver hydrophones to a ship hull or another rigid structure, so that no calibration would be necessary. The principle is depicted in Figure 2‐16. It becomes clear that, in contrast to the LBL principle, the position of the marine agent is, firstly, estimated relative to the reference object carrying the receivers. A global position estimation is only possible if the position of the RO is known. If a surface ship is employed, GPS can be used to estimate the ship position. In this case, the translation between the GPS receiver and the hydrophones as well as the movement of the ship, especially the attitude changes caused by waves, must be considered. Secondly, it is important to mention that the position of the submerged robot is estimated at the supply ship, not at the robot itself. If the robot needs the position estimation for control purposes, it must be communicated back from the supply ship, which requires another acoustic communication in each interrogation cycle.
Figure 2‐16: Short Baseline (SBL) Navigation
SBL systems are described e.g. in Bingham et al., 2005, or Smith and Kronen, 1997. SBL systems have not become a widely used method for global position estimation. This might be due to the development of the next system to be discussed, which features even simpler handling and still delivers position data at accuracies that are sufficient for several applications.

2.5.4 Ultra‐Short Baseline (USBL)

Continuing the idea of shortening the distances between the transponders or hydrophones, and aiming at the simplification of the mounting and calibration process, Ultra‐Short Baseline (USBL) systems have been developed. In a USBL system, several acoustic transponders are mounted on one transducer head, which results in distances between them below one meter.
As one can imagine, this results in a system that is very easy to mount on a supply ship, or even on a marine robot, which makes it very interesting for cooperative navigation. USBL systems and their usage have been described e.g. by Jalving et al., 2004, or Peyronnet et al., 1998. On the other hand, due to the short baselines, advanced methods of signal processing are necessary to obtain usable measurement data. Figure 2‐17 depicts the principle. The robot to be supervised sends an acoustic message, which is received by the transducers in the receiving array. USBL systems employ both TOA and TDOA and provide a measurement of the distance as well as the bearing and altitude angles between the sender of the acoustic message and the carrier of the system. An overview of the obtained measurement data is given in Figure 2‐18.
Figure 2‐17: Ultra‐Short Baseline (USBL) navigation
Several of the properties discussed for SBL are also relevant for USBL. The position of the target is estimated relative to the RO carrying the system; therefore, global position estimates will only be available if the global positions of the RO are known. This can be assumed if the RO is at the surface and has access to GPS. In fact, state‐of‐the‐art USBL systems are often equipped with a small INS unit and accept inputs from GPS systems in order to deliver global position estimates, or they already feature an integrated GPS system (Audric, 2004). The other issue is once again the fact that the position estimates will be available at the RO, not at the target; that means another acoustic communication from the RO to the target is necessary if the target is intended to use the information for control purposes.
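How a relative USBL measurement turns into a global position estimate once the pose of the RO is known can be sketched as follows. The angle conventions are assumptions made for this illustration only: a Cartesian frame, the bearing measured counter‐clockwise from the longitudinal axis of the RO, the heading measured counter‐clockwise from the global x‐axis, and the altitude angle measured from the horizontal plane. Roll and pitch of the RO and the lever arm between GPS antenna and transducer head are neglected. The function name `usbl_to_position` is hypothetical.

```python
import math


def usbl_to_position(ro_xy, ro_heading_rad, range_m, bearing_rad, altitude_rad):
    """Global horizontal target position and vertical offset from one USBL fix.

    Assumed conventions (see lead-in): Cartesian frame, counter-clockwise
    angles, altitude angle measured from the horizontal plane.
    """
    horizontal_range = range_m * math.cos(altitude_rad)
    # The USBL head rotates with the RO, so the heading is added to the bearing.
    direction = ro_heading_rad + bearing_rad
    x = ro_xy[0] + horizontal_range * math.cos(direction)
    y = ro_xy[1] + horizontal_range * math.sin(direction)
    dz = range_m * math.sin(altitude_rad)  # vertical offset relative to the RO
    return x, y, dz


# Hypothetical example: RO at (100, 50) heading along the global y-axis,
# target 100 m straight ahead at the same depth.
print(usbl_to_position((100.0, 50.0), math.pi / 2, 100.0, 0.0, 0.0))
```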
Figure 2‐18: Range r, bearing angle, and altitude angle obtained by a USBL system carried by vehicle i (yellow)
Figure 2‐19: Comparison of the acoustic baseline systems
The second issue and the fact that USBL systems are quite small give rise to the possibility of mounting them on the target instead of the RO. Then ROs, which might for instance be firmly mounted to the seafloor, send acoustic pings, while the target is able to determine its position. This procedure is denoted as “Inverted USBL” (Morgado et al., 2006, Vickery, 1998). We will employ this idea, with a moving RO, in Benchmark Scenario II (section 3.3.2) and section 5.3.

If we summarize the discussions of section 2.5 so far, we see that every system has advantages and drawbacks. As single beacon navigation relies on a velocity measurement over ground, which is usually performed with a DVL system, and as we explicitly excluded the usage of DVL in section 2.4.7 for the solutions to be discussed later, we shall compare especially the baseline systems, which are displayed in Figure 2‐19. The LBL system achieves the best accuracy over a large area, but it is costly to deploy and cumbersome to calibrate. A USBL system, on the other hand, is easy to deploy and to calibrate. It also delivers usable position estimates with good accuracy for specific geometric formations and limited ranges between target and RO. But if one intends to cover the whole area of operation that can be obtained by employing an LBL system, the accuracy of a comparable USBL system will drop significantly. Hence, there is still a need for a system that reaches the performance of an LBL system with a deployment effort closer to that of a USBL system. A possible solution for this task is discussed in the next section.

2.5.5 GPS Intelligent Buoys (GIB)
Figure 2‐20: GPS Intelligent Buoys (GIB) Navigation
The idea of employing an LBL‐like system without the need to mount the beacons at the seafloor has led to the GPS Intelligent Buoys (GIB) concept. This system, depicted in Figure 2‐20, consists of a set of buoys equipped with GPS receivers, hydrophones, and radio modems. When receiving an acoustic message from a submerged agent, the buoys instantly send the relevant information via radio to a Command Center, which can be located on board a supply ship or, for operations close to shore, on land. This system is also referred to as “Inverted LBL”, as the LBL principle is used as the base idea, but the ROs are now free‐moving or moored surface buoys. Again, as discussed for SBL and USBL systems, this means that the position estimate is available at the Command Center; if it is needed by the submerged robot, it must be transferred by acoustic communication. One can state that this is comparable to a GPS‐like system for underwater applications, and first ideas were formulated accordingly (Youngberg, 1992). Due to the challenging problems in underwater communication and control theory to be addressed, it took a long time to develop working solutions. Systems employing surface buoys with GPS receivers and acoustic communication capabilities are reported e.g. in Freitag et al., 2001, and Thomas, 1998. A commercially available GIB system is described in Alcocer et al., 2007.

The GIB scenario marks the interface to our own work, reported in chapters 5–7. As one can imagine, the question arises of what happens when the static or only slowly moving buoys are replaced by autonomous surface craft that are able to follow the underwater target. We will choose this as the scenario to start from.
3 Problem Formulation and Definitions for the Discussions to Follow

This short chapter describes the goals of the research activities that will be discussed in chapters 5‐7. At first, we will introduce two new classification possibilities that we will employ in the definition of relevant mission scenarios. Then we will state the problem formulation and thereby introduce a uniform notation that we will use in the remainder of the thesis. At the end, we will define some benchmark scenarios that will be used to validate the research activities.
3.1 Two Different Concepts: Internal vs. External Navigation

Having already stated a possible classification of navigation concepts into absolute and relative navigation in section 2.3.5, we shall now suggest another classification that arises from the discussions so far and that will help to differentiate between the benchmark scenarios defined in section 3.3:

Definition: Internal Navigation

Internal Navigation is the process of estimating the position, orientation, and/or the velocity of an underwater object from within the object, that is, having access to all data from sensors mounted on the object, but only limited access to data from outside of the object (usually only distance and/or bearing measurements from one or more reference objects).

It is a consequence of this definition that the results of an internal navigation procedure are directly available within the underwater object and can therefore be used as an input for the control system. Note that for static reference objects whose positions were known to the underwater object before diving, internal navigation can deliver absolute navigation data. For mobile reference objects (especially other marine robots) that do not transmit their current position coordinates, the results are always relative navigation data. The scheme is depicted in Figure 3‐1.
Figure 3‐1: Internal Navigation: The pose/ velocity of the robot is measured/ estimated inside the vehicle
© Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2020 T. Glotzbach, Navigation of Autonomous Marine Robots, https://doi.org/10.1007/978-3-658-30109-5_3
Figure 3‐2: External Navigation: The pose/ velocity of the robot is measured/ estimated from outside the vehicle
Definition: External Navigation

External Navigation is the process of estimating the pose/velocity of an underwater target from outside the object, that is, being (usually) placed outside the water and simultaneously having access to distance and/or bearing measurements from one or more (static or dynamic) reference objects, while only having limited or no access to data from sensors mounted on the object.

As long as the reference objects are placed at the surface or on land, external navigation delivers absolute navigation data. This method is suited for the task of supervising a submerged object from a central station; however, if the object requires the navigation data for control purposes, an additional acoustic communication process from the central station to the object is mandatory. This is summarized in Figure 3‐2.
3.2 Problem Formulation

In this section we state the problem formulation for cooperative navigation, mainly based on range and/or bearing measurements. The formulation of the range‐based navigation problem draws inspiration from Alcocer, 2009. We assume that we are interested in estimating navigation data of a submerged marine agent. The amount of data to be estimated depends on the concrete mission and will later be defined individually for every benchmark scenario. Nevertheless, we can state that the 2D position will always be of interest, no matter whether it is needed by the guidance and control systems, for the concrete mission task (e.g. the creation of geo‐referenced maps), or to allow a human operator to supervise a robot from the central computer. With respect to the definitions made in section 3.1, it is of special importance whether one strives for an internal or external navigation solution, as this sets several limitations.

For the problem formulation, we will try to be as generic as possible. In addition to the Range‐Only Localization problem, we will also describe the situation for additional bearing measurements. To start our discussions, we shall assume that there is an underwater object, denoted as Target, and we wish to estimate its navigation data. Additionally, we have a varying number of $n$ marine robots, denoted as Reference Objects (ROs), to support the navigation task, either by measuring range and/or bearing to the target (external navigation), or by enabling the target to perform range and/or bearing measurements to them (internal navigation). Note that in a larger group with several submerged vehicles, both concepts might be mixed (see e.g. section 3.3.2 or section 5.3).

Every vehicle/object carries its own reference frame with the origin in its center of gravity (CG), and the orientation of the axes is either fixed to the body, as described in section 2.2.2, or fixed within the environment, e.g. following the NED convention. Using the index 0 for the target and 1 to $n$ for the ROs, this gives rise to $n + 1$ coordinate systems, denoted as $X_0Y_0Z_0$ to $X_nY_nZ_n$. Additionally, we introduce a local coordinate system, considered to be inertial according to the discussions in section 2.2.1, with its origin at an arbitrary user‐defined position and the orientation of the axes following either the NED or a Cartesian convention. For this system, we use the notation $XYZ$. In this section, we will use Cartesian reference frames.
Figure 3‐3: Problem formulation: Range‐based navigation
We will start the discussions with the situation where only range measurements are performed. According to Figure 3‐3, we assume that the target is currently located at the position $\mathbf{p}_0$, and $n$ ROs are located at the positions $\mathbf{p}_1$ to $\mathbf{p}_n$. As stated before, the problem might be treated in all three dimensions; however, if the depth of the involved vehicles, which can be measured easily and very precisely, can also be shared between the vehicles, depending on the communication architecture, the problem can also be handled in 2D. Between the target and the $i$th RO, the range $r_i$ can be measured. The measured value will be denoted as $\grave{r}_i$ and equals

$$\grave{r}_i = \lVert \mathbf{p}_0 - \mathbf{p}_i \rVert + v_{r,i}; \quad i = 1, \dots, n. \tag{3-1}$$
In this equation, $v_{r,i}$ is considered to be some zero mean Gaussian stationary measurement error. To be able to refer to all measurements at once, we might introduce a vector equation. Let $\grave{\mathbf{r}} = [\grave{r}_1 \dots \grave{r}_n]^T$ be the measurement vector, $\mathbf{r} = [r_1 \dots r_n]^T$ with $r_i = \lVert \mathbf{p}_0 - \mathbf{p}_i \rVert$ be the vector with the true, but unknown ranges, and let $\mathbf{v}_r = [v_{r,1} \dots v_{r,n}]^T$ be the measurement noise vector. Then the set of measurements can be written as

$$\grave{\mathbf{r}} = \mathbf{r} + \mathbf{v}_r. \tag{3-2}$$
The described measurement equations might be extended in order to consider a variance of the measurement noise that grows with distance. More details will be given in section 5.2.2.2. We can introduce an estimated range $\hat{r}_j$ as a function of a position estimate, usually that of the target. In the situation depicted in Figure 3‐3, we assume that the global position of the target is to be estimated (most probably within an external navigation procedure); hence the current position estimate is denoted as $\hat{\mathbf{p}}_0$. From this, it follows that

$$\hat{r}_j = \lVert \hat{\mathbf{p}}_0 - \mathbf{p}_j \rVert; \quad j = 1, \dots, n. \tag{3-3}$$
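The relation between true, measured, and estimated ranges introduced above can be made concrete with a short simulation sketch. It assumes a 2D scenario, independent noise with identical standard deviation for all ROs, and hypothetical coordinates; the function names `true_ranges`, `measured_ranges`, and `estimated_ranges` are chosen for this illustration only.

```python
import math
import random


def true_ranges(target, ros):
    """True ranges between the target position and every RO position."""
    return [math.dist(target, ro) for ro in ros]


def measured_ranges(target, ros, sigma, rng):
    """True range plus zero mean Gaussian noise with standard deviation sigma."""
    return [r + rng.gauss(0.0, sigma) for r in true_ranges(target, ros)]


def estimated_ranges(target_estimate, ros):
    """Ranges that would result if the target were at the estimated position."""
    return true_ranges(target_estimate, ros)


# Hypothetical 2D example values (not taken from the thesis).
ros = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
target = (30.0, 40.0)
target_estimate = (32.0, 41.0)
rng = random.Random(1)  # fixed seed for repeatability

r_meas = measured_ranges(target, ros, sigma=0.5, rng=rng)
r_est = estimated_ranges(target_estimate, ros)
# The difference between measured and estimated ranges is what an
# estimator would try to make small.
residuals = [m - e for m, e in zip(r_meas, r_est)]
print(residuals)
```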
We might also consider an internal and relative navigation procedure, where a robot denoted as $j$ is supposed to estimate the position of the target relative to its own position, that is, in a reference frame with its origin in its own center of gravity (CG). For that task, it might also need the bearing angle between the vehicles, see Figure 3‐4 and the surrounding discussion. It can be stated that, based on the current position estimate of the target, $\hat{\mathbf{p}}_0$, equation (3‐3) still holds true, yet the elements in $\mathbf{p}_j$ can be set to 0, as $\hat{\mathbf{p}}_0$ is expressed in the coordinate frame with the origin in the CG of vehicle $j$. In either of the discussed ways, it is straightforward to introduce the range estimation vector $\hat{\mathbf{r}} = [\hat{r}_1 \dots \hat{r}_n]^T$. As one might already imagine, the comparison of $\hat{\mathbf{r}}$ and $\grave{\mathbf{r}}$ will form the basis for the formulation of a mathematically tractable problem.

We have assumed that the measurement error is zero mean and follows a Gaussian distribution. According to the discussions in Alcocer, 2009, this assumption must be handled carefully. We will handle the problem as TOA; that is, we assume that the range measurement is obtained by measuring the overall runtime of the signal and converting it into a range. This conversion can be considered a major cause of the measurement error. The sound speed is usually considered to be constant, but it varies with depth, temperature, and salinity, as described by Urick, 1996, or Mackenzie, 1981. If the sound speed is also estimated, it will exhibit an error that might add a bias to the range measurement. Also, multipath propagation must be considered, that is, the sound does not only travel on a direct line between transmitter and receiver, but on multiple paths with several reflections at the sea bottom and the sea surface. The multipaths must be identified and isolated; otherwise they cause biased and non‐Gaussian disturbances (Olson et al., 2006).
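The effect of a wrongly assumed sound speed on a TOA range can be illustrated with a small numerical sketch. It assumes a straight acoustic path and a one‐way travel time; the function name `range_from_toa` and all numbers are hypothetical and only chosen to make the bias visible.

```python
def range_from_toa(travel_time_s, assumed_sound_speed_mps):
    """Convert a one-way travel time into a range using an assumed sound speed."""
    return assumed_sound_speed_mps * travel_time_s


# Hypothetical values: the true sound speed is 1520 m/s, but 1500 m/s is assumed.
true_speed_mps = 1520.0
assumed_speed_mps = 1500.0
true_range_m = 760.0

travel_time_s = true_range_m / true_speed_mps
measured_range_m = range_from_toa(travel_time_s, assumed_speed_mps)
bias_m = measured_range_m - true_range_m
# The bias grows proportionally with the range; it is not zero mean.
print(f"measured: {measured_range_m:.1f} m, bias: {bias_m:.1f} m")
```

Such a systematic offset is not captured by the zero mean assumption and would have to be modelled or estimated separately.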
As long as the assumption of a zero mean Gaussian distribution for the range measurement error holds, we can state that the expected value of the error is zero, that is,

$$\mathrm{E}\{\mathbf{v}_r\} = \mathbf{0}, \tag{3-4}$$
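and we can introduce the range measurement error covariance matrix $\mathbf{R}$ as

$$\mathbf{R} := \mathrm{E}\{\mathbf{v}_r \mathbf{v}_r^T\} = \begin{bmatrix} \sigma_{r,1}^2 & \sigma_{r,12} & \cdots & \sigma_{r,1n} \\ \sigma_{r,21} & \sigma_{r,2}^2 & \cdots & \sigma_{r,2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{r,n1} & \sigma_{r,n2} & \cdots & \sigma_{r,n}^2 \end{bmatrix}. \tag{3-5}$$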
This matrix contains the variances of the single range measurements on its main diagonal, while the other elements describe the covariances between them. Usually we will consider that all covariances are zero, and the variances on the main diagonal are all identical. In this case, equation (3‐5) simplifies to

$$\mathbf{R} = \mathbf{I}_n \sigma_r^2, \tag{3-6}$$
with $\mathbf{I}_n$ being the identity matrix of dimension $n$.

In addition to the range measurements, we will in some scenarios assume that a RO is equipped with a USBL system. In these cases, the RO will obtain bearing and altitude angle measurements with each successful acoustic communication. In the MORPH project, our experience was that it is reasonable to handle the problem in 2D. This is due to the good precision of the depth cell. The target transmits its current depth measurement with the acoustic communication; the RO can directly use this measurement data. This is far more precise than computing the target depth from range and altitude angle measurements. For this reason, the altitude angle measurement is not discussed further.

Figure 3‐4 demonstrates the situation under discussion in a Cartesian coordinate system. The RO 1 vehicle in yellow carries a USBL system, which is firmly fixed to its body. For a successful acoustic communication from the target to RO 1, the yellow robot obtains a measurement of the bearing angle $\alpha_{10}$. This angle can be computed from the $x$‐ and $y$‐coordinates of both vehicles; additionally, the heading angle $\psi$ must be considered, as the USBL system rotates with the vehicle. In general, the bearing angle between vehicles $i$ and $j$, measured from vehicle $i$, can be computed to be

$$\alpha_{ij} = -\psi_i + \operatorname{atan2}(y_j - y_i,\; x_j - x_i), \tag{3-7}$$

where the function atan2 is defined as follows:

$$\operatorname{atan2}(y, x) = \begin{cases} \arctan(y/x) & \text{for } x > 0 \\ \arctan(y/x) + \pi & \text{for } y \ge 0,\; x < 0 \\ \arctan(y/x) - \pi & \text{for } y < 0,\; x < 0 \\ +\pi/2 & \text{for } y > 0,\; x = 0 \\ -\pi/2 & \text{for } y < 0,\; x = 0 \\ \text{undefined} & \text{for } x = y = 0 \end{cases} \tag{3-8}$$
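The bearing computation can be sketched in a few lines. The sketch assumes that the bearing is the direction from vehicle i to vehicle j in the Cartesian inertial frame minus the heading of vehicle i, with all angles counted counter‐clockwise; the additional wrapping of the result into the interval from −π to π is an assumption of this illustration. The function names `wrap_angle` and `bearing` are hypothetical. Note that Python's `math.atan2(0.0, 0.0)` returns 0.0 instead of being undefined.

```python
import math


def wrap_angle(angle_rad):
    """Map an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle_rad), math.cos(angle_rad))


def bearing(xi, yi, psi_i, xj, yj):
    """Bearing of vehicle j as seen from vehicle i with heading psi_i.

    Assumed convention: counter-clockwise angles in a Cartesian frame.
    """
    return wrap_angle(math.atan2(yj - yi, xj - xi) - psi_i)


# Hypothetical example: vehicle i at the origin, heading along the y-axis;
# vehicle j straight ahead and then on the left-hand side.
print(bearing(0.0, 0.0, math.pi / 2, 0.0, 10.0))   # straight ahead
print(bearing(0.0, 0.0, math.pi / 2, -10.0, 0.0))  # 90 degrees to the left
```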
Figure 3‐4: Problem formulation: Range‐ and bearing‐based navigation
As for the range measurements, we assume that the measured bearing angle $\grave{\alpha}_{ij}$ is composed of the true bearing and a zero mean Gaussian stationary measurement error with a variance of $\sigma_{\alpha}^2$, which is a property of the employed USBL system:

$$\grave{\alpha}_{ij} = \alpha_{ij} + v_{\alpha,j}; \quad j = 1, \dots, n. \tag{3-9}$$
Based on these discussions, we can now formulate the relevant problems to be tackled within this thesis. The problem formulations for Range‐Only Localization and Range‐Only Target Tracking are based on Alcocer, 2009.

Problem 1: Range‐Only Localization

Let $\mathbf{p}_0 \in \mathbb{R}^m$ be the position of an underwater agent, denoted as target, in a local inertial frame, $m$ the number of dimensions to consider, and $\mathbf{p}_i \in \mathbb{R}^m$, $i = 1, \dots, n$ the positions of $n$ static Reference Objects (ROs). With $r_i = \lVert \mathbf{p}_0 - \mathbf{p}_i \rVert$ being the true distances between target and ROs, and the definition $\mathbf{r} = [r_1 \cdots r_n]^T$, assume that a set of measurements $\grave{\mathbf{r}} = \mathbf{r} + \mathbf{v}_r$ is available, where $\mathbf{v}_r \in \mathbb{R}^n$ is a vector of zero mean Gaussian disturbances with the covariance matrix $\mathbf{R} \in \mathbb{R}^{n \times n}$. Obtain an estimate of the target position, $\hat{\mathbf{p}}_0 \in \mathbb{R}^m$, based on the positions of the ROs and the measurement vector $\grave{\mathbf{r}}$, that is optimal in some sense.

The Range‐Only Localization Problem will be discussed in the scientific part in section 5.1. It will serve as an introduction to the topic. For concrete scenarios, it is of limited interest, especially if cooperative navigation is studied. In these cases, neither the target nor the ROs can be considered as static. This gives rise to the Range‐Only Target Tracking Problem:

Problem 2: Range‐Only Target Tracking

Let $\mathbf{p}_0(t) \in \mathbb{R}^m$ be the trajectory of an underwater agent, denoted as target, in a local inertial frame, $m$ the number of dimensions to consider, and $\mathbf{p}_i(t) \in \mathbb{R}^m$, $i = 1, \dots, n$ the trajectories of $n$ dynamic Reference Objects (ROs). With $r_i(t) = \lVert \mathbf{p}_0(t) - \mathbf{p}_i(t) \rVert$ being the true distances between target and ROs at time $t$, and the definition $\mathbf{r}(t) = [r_1(t) \cdots r_n(t)]^T$, assume that a set of measurements $\grave{\mathbf{r}}(t) = \mathbf{r}(t) + \mathbf{v}_r(t)$ is available, where $\mathbf{v}_r \in \mathbb{R}^n$ is a vector of zero mean Gaussian disturbances with the covariance matrix $\mathbf{R} \in \mathbb{R}^{n \times n}$. Obtain an optimal estimate of the target trajectory, $\hat{\mathbf{p}}_0(t) \in \mathbb{R}^m$, based on the trajectories of the ROs, the measurement vector $\grave{\mathbf{r}}(t)$, and some basic knowledge about the manoeuvrability of the target.

The Range‐Only Target Tracking Problem forms the basis for Benchmark Scenario I (section 3.3.1) and will be discussed in detail in section 5.2. In the Range‐Only Target Tracking Problem, it is assumed that the ROs can communicate with each other without limitations, that is, all measurements are available at the same time. This can only be assumed if the ROs are surface objects. For a scenario with submerged vehicles, this condition does not hold. This gives rise to a problem in which only one RO is used, which is able to obtain range and bearing measurements to one or more targets.

Problem 3: Range‐ and Angle‐Based Target Tracking

Let $\mathbf{p}_i(t) \in \mathbb{R}^m$, $i = 1, \dots, n$ be the trajectories of $n$ underwater agents, denoted as targets, in a local inertial frame, $m$ the number of dimensions to consider, and $\mathbf{p}_0(t) \in \mathbb{R}^m$ the trajectory of a dynamic Reference Object (RO). With $r_i(t) = \lVert \mathbf{p}_0(t) - \mathbf{p}_i(t) \rVert$ being the true distances between the targets and the RO at time $t$, and the definition $\mathbf{r}(t) = [r_1(t) \cdots r_n(t)]^T$, assume that a set of measurements $\grave{\mathbf{r}}(t) = \mathbf{r}(t) + \mathbf{v}_r(t)$ is available, where $\mathbf{v}_r \in \mathbb{R}^n$ is a vector of zero mean Gaussian disturbances with the covariance matrix $\mathbf{R} \in \mathbb{R}^{n \times n}$. Also, with $\alpha_{0i}(t)$ according to equation (3‐7) being the bearing angle between object 0 and $i$ at time $t$, as seen by 0, and $\boldsymbol{\alpha}(t) = [\alpha_{01}(t) \cdots \alpha_{0n}(t)]^T$, a set of measurements $\grave{\boldsymbol{\alpha}}(t) = \boldsymbol{\alpha}(t) + \mathbf{v}_{\alpha}(t)$ with the same characteristics as for $\mathbf{v}_r$ is available. Finally, assume one has access to either an altitude angle measurement or a measurement of the depth difference between the targets and the RO. Obtain an optimal estimate of the target trajectories, $\hat{\mathbf{p}}_i(t) \in \mathbb{R}^m$, based on the trajectory of the RO, the measurement vectors, and some basic knowledge about the manoeuvrability of the targets.

The Range‐ and Angle‐Based Target Tracking Problem will be employed within Benchmark Scenario II (section 3.3.2) and will be discussed in detail in section 5.3.

With the intention of basing a single target position or trajectory estimation only on range measurements, one can assume that several ROs will be necessary. At the same time, the question arises where they should be placed with respect to the target in order to optimize some objective function. We formulate this as the Static Optimal Sensor Placement Problem:

Problem 4: Static Optimal Sensor Placement

Let $\mathbf{p}_0 \in \mathbb{R}^m$ be the position of an underwater agent, denoted as target, in a local inertial frame, $m$ the number of dimensions to consider, and $\mathbf{p}_i \in \mathbb{R}^m$, $i = 1, \dots, n$ the positions of $n$ static Reference Objects (ROs). For the intent of performing Range‐Only Localization according to Problem 1, find a set of positions $\mathbf{p}_i$ as a function of $\mathbf{p}_0$ that can be considered optimal with respect to the achievable accuracy of the target position estimation.
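To give an impression of how the geometry of the ROs enters such a placement problem, the following sketch compares two RO configurations by the determinant of the Fisher Information Matrix (FIM), the criterion mentioned earlier in connection with the work of Martínez and Bullo. It assumes a 2D scenario and independent range noise with identical, distance‐independent variance, in which case the FIM of the target position is the sum of the outer products of the unit vectors from the ROs to the target, scaled by the inverse variance. The objective functions actually used in chapter 6 may differ; the function names `fim_determinant` and `ros_on_circle` are hypothetical.

```python
import math


def fim_determinant(target, ros, sigma):
    """Determinant of the 2D range-only FIM for i.i.d. Gaussian range noise."""
    fxx = fxy = fyy = 0.0
    for rx, ry in ros:
        dx, dy = target[0] - rx, target[1] - ry
        dist = math.hypot(dx, dy)
        ux, uy = dx / dist, dy / dist  # unit vector from the RO to the target
        fxx += ux * ux
        fxy += ux * uy
        fyy += uy * uy
    return (fxx * fyy - fxy * fxy) / sigma**4


def ros_on_circle(center, radius, angles_deg):
    """Place ROs on a circle around center at the given angles (helper)."""
    return [
        (center[0] + radius * math.cos(math.radians(a)),
         center[1] + radius * math.sin(math.radians(a)))
        for a in angles_deg
    ]


target = (0.0, 0.0)
spread = ros_on_circle(target, 100.0, [0.0, 120.0, 240.0])
clustered = ros_on_circle(target, 100.0, [0.0, 10.0, 20.0])
print("evenly spread:", fim_determinant(target, spread, sigma=1.0))
print("clustered:    ", fim_determinant(target, clustered, sigma=1.0))
```

Under these assumptions, the evenly spread configuration yields a clearly larger determinant than the clustered one, which reflects the intuition that ranges from similar directions carry largely redundant information.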
We will discuss this further in sections 6.2 and 6.3. From Problem 4, another question arises. Assume that target and ROs are considered as dynamic objects, like marine robots, as is the common set‐up in cooperative navigation scenarios. In these cases, one is interested in finding optimal trajectories for the ROs. Additionally, one is interested in minimising the number of ROs to reduce costs and efforts in real sea trials. The optimal solution would comprise only one RO that moves on a trajectory in such a way that it is able to perform target estimation based only on range measurements. This will be another problem we shall discuss in detail:
Problem 5: Dynamic Optimal Sensor Placement (1 RO only)
Let $\mathbf{p}_0(t) \in \mathbb{R}^m$ be the trajectory of an underwater agent, denoted as target, in a local inertial frame, $m$ the number of dimensions to consider, and $\mathbf{p}_1(t) \in \mathbb{R}^m$ the trajectory of a Reference Object (RO). For the intent of performing Range‐Only Target Tracking according to Problem 2, find a trajectory $\mathbf{p}_1(t)$ as a function of $\mathbf{p}_0(t)$ that can be considered optimal with respect to the possibly achievable accuracies for the target position estimation.
This problem will be studied in section 6.4.2. As discussed before, the practical use of Optimal Sensor Placement methods is often questioned, because the optimal position/trajectory is a function of the unknown target position/trajectory. This inspired us to study a scenario in which we perform Range‐Only Target Tracking according to Problem 2 and Dynamic Optimal Sensor Placement according to Problem 5, but, to be more realistic, we assume that the real target trajectory is unknown, so Problem 5 must be solved based on the trajectory estimation resulting from Problem 2. In other words, the position of the target is constantly estimated, and the estimate is simultaneously used to compute an optimal trajectory for the RO. Note that this concept is in some sense similar to the basic idea behind the discussed SLAM principle. For this reason, we suggest the name STAP (Simultaneous Trajectory Planning and Position Estimation) for the following problem:
Problem 6: Simultaneous Trajectory Planning and Position Estimation (STAP)
Let $\mathbf{p}_0(t) \in \mathbb{R}^m$ be the trajectory of an underwater agent, denoted as target, in a local inertial frame, $m$ the number of dimensions to consider, and $\mathbf{p}_1(t) \in \mathbb{R}^m$ the trajectory of a Reference Object (RO). While performing Range‐Only Target Tracking according to Problem 2, yielding $\hat{\mathbf{p}}_0(t) \in \mathbb{R}^m$, simultaneously find a trajectory $\mathbf{p}_1(t)$ as a function of $\hat{\mathbf{p}}_0(t)$ that can be considered optimal with respect to the possibly achievable accuracies for the target position estimation.
In our opinion, this problem is very interesting for practical use. Therefore, it might demonstrate the importance of Optimal Sensor Placement methods for real applications. The STAP method builds the base for Benchmark Scenario III (section 3.3.3) and will be discussed in detail in section 7.3.
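The range measurement model shared by the tracking problems above can be illustrated with a minimal sketch. It assumes a two‐dimensional set‐up with three ROs and independent noise of equal standard deviation on every range (that is, a diagonal covariance); the positions, the value of `sigma` and the function names are hypothetical and chosen only for illustration.

```python
import math
import random


def true_ranges(target, reference_objects):
    """True distances between the target and each RO (Euclidean norm)."""
    return [math.dist(target, ro) for ro in reference_objects]


def measure_ranges(target, reference_objects, sigma, rng):
    """Measured ranges: true ranges plus zero mean Gaussian disturbances.

    Assumption: independent noise with the same standard deviation sigma
    on every range, i.e. a diagonal covariance.
    """
    return [r + rng.gauss(0.0, sigma)
            for r in true_ranges(target, reference_objects)]


# Hypothetical 2D example (m = 2) with n = 3 reference objects.
target = (3.0, 4.0)
ros = [(0.0, 0.0), (6.0, 0.0), (3.0, 10.0)]
rng = random.Random(0)  # fixed seed for reproducibility

ranges = true_ranges(target, ros)
noisy = measure_ranges(target, ros, sigma=0.1, rng=rng)
```

An estimator for Problem 2 would receive only `noisy` and the RO positions, never `target` itself.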
3.3 Benchmark Scenarios
In the scientific part of this thesis within chapters 5 ‐ 7, we will discuss the research results employing the following benchmark scenarios.
3.3.1 Benchmark Scenario I: Supervision of a Diving Agent
Benchmark Scenario I is related to a situation in which a submerged object, denoted as target by the subscript 0, shall be supervised by three ROs, denoted with the subscripts 1 – 3. The reference objects are able to determine their inertial positions ($\mathbf{p}_1(t), \dots, \mathbf{p}_3(t)$) with the help of GPS, as they are all located at the surface. Based on acoustic communication between the vehicles, range measurements $r_1(t), \dots, r_3(t)$ will be available periodically. Also, the target measures its depth with a depth cell and transmits the current value, denoted as $\hat{z}_0(t)$, whenever it sends acoustic data. It has to be considered that the acoustic message from the target might not be received by one or more ROs. It must also be taken into consideration that the hardware clock of the target cannot be considered synchronized with the ones of the ROs. The goal is the continuous estimation of the target position in a local inertial frame, $\mathbf{p}_0(t)$. With the definitions made so far, the task to solve can be understood as a global and external navigation problem, and it is related to Problem 2 according to section 3.2. In this scenario, the movements of the vehicles and their control are not taken into consideration; that is, neither trajectory planning nor control is accounted for any vehicle. As mentioned, this problem is an enhancement of the GIB scenario. We will discuss the set‐up and the achieved results in detail in section 5.2. Figure 3‐5 provides an overview.
Figure 3‐5: Benchmark Scenario I: Global and external navigation for target vehicle 0 by three surface reference objects, denoted as vehicles 1 ‐ 3
3.3.2 Benchmark Scenario II: Aided Navigation Within a Small Robot Pack
This scenario is a part of the MORPH project, where it is referred to as “The upper MORPH part”. It is depicted in Figure 3‐6. It is assumed that the green vehicle 2, which is referred to as LSV (Leading Sonar Vehicle), is moving underwater in order to collect sonar data. It might be supported by two more camera vehicles, which are not taken into consideration within the upper MORPH part scenario. As a support for the navigation task, the red SSV (Surface Support
Vehicle) operates at the surface; therefore, it has GPS access and will follow a predefined path. The green vehicle is intended to estimate the relative position of the SSV with respect to itself, ${}^{2}\mathbf{p}_{1}(t)$, and the current inertial velocity of vehicle 1, $\mathbf{v}_1(t)$, and to move in a way that maintains a preplanned formation with vehicle 1. The control algorithms for this task are described in Abreu and Pascoal, 2015. To fulfil this task, the vehicle is aided by the yellow GCV (Global navigation and Communication Vehicle), which is equipped with a USBL system. It uses the same control algorithm to maintain formation with vehicle 1.
Figure 3‐6: Benchmark Scenario II: Relative and internal navigation for vehicles 0 and 2 with respect to the surface vehicle 1
In the communication system, vehicles 1 and 2 will transmit their inertial velocities, $\mathbf{v}_1(t)$ and $\mathbf{v}_2(t)$, and their depth ($\hat{z}_2(t)$, only vehicle 2) to vehicle 0. When vehicle 0 receives an acoustic message, the USBL system will also provide range $r$ and bearing angle to the sender. Carrying an AHRS and being able to estimate its surge speed through water based on the rotational rate of its propellers, but not using a DVL system, vehicle 0 must estimate the relative positions of vehicles 1 and 2 with respect to itself, ${}^{0}\mathbf{p}_{1}(t)$ and ${}^{0}\mathbf{p}_{2}(t)$, considering existing sea currents. Using the estimated relative positions, vehicle 0 computes an estimate for the position of vehicle 1 in the reference frame 2, ${}^{2}\mathbf{p}_{1}(t)$, and sends this information together with the current estimate of $\mathbf{v}_1(t)$ to vehicle 2. Vehicle 2, carrying the same navigation equipment as vehicle 0 (except for the USBL system), has to use the information received by acoustic communication in order to constantly estimate ${}^{2}\mathbf{p}_{1}(t)$, and to provide an estimate for its own inertial velocity, $\mathbf{v}_2(t)$, to be sent to vehicle 0. The scenario is related to Problem 3 in section 3.2, and it comprises the task of relative and internal navigation, according to the definitions made so far. It is interesting that during mission execution, the submerged vehicles do not have any information on their global positions in a local inertial frame. All the control algorithms are based on relative position estimates.
The described set‐up requires a set of different filters to merge the different data and to provide the information required by the control systems in a continuous manner, even in cases of failing communications. We will discuss the scenario in detail within section 5.3.
3.3.3 Benchmark Scenario III: Range‐Based Navigation Within a Robot Pack With a Minimal Number of Members
The third benchmark scenario is an enhancement of scenario I. The major difference is the reduction of the number of reference objects to the absolute minimum of one. Equipment and notation remain unchanged. In order to be able to estimate the position of a submerged vehicle based on range‐only measurements of a single vehicle, the trajectory of the surface craft must be planned carefully. That means, at a given moment in time, denoted as $t_0$, the surface craft has to estimate the current target position, $\mathbf{p}_0(t_0)$, and simultaneously plan its own trajectory for the future time, $\mathbf{p}_1(t > t_0)$, in order to enable future estimations of the target position in good quality. Details are provided in Figure 3‐7. Like scenario I, the current set‐up can be denoted as a global and external navigation problem. It represents the Simultaneous Trajectory Planning and Position Estimation (STAP) Problem 6 and thereby merges Problems 2 and 5 of section 3.2. We will discuss this challenging scenario in section 7.3, fusing methods of state estimation and Optimal Sensor Placement (OSP) and thereby contributing to show the usability of OSP also from a practical point of view.
Figure 3‐7: Benchmark Scenario III: Global and external navigation for vehicle 0 and simultaneous trajectory planning for the surface vehicle 1
4 Mathematical Tools Used From the Areas of Control and Systems Engineering
This chapter is dedicated to the introduction of all mathematical methods to be employed in the scientific part which starts afterwards. The purpose is to demonstrate that the author is capable of preparing the ambitious material in an understandable way to prove his teaching abilities. To this end, the discussion will start at a very low level with the introduction of the term system, the state space description and the time discretization within section 4.1. In what follows, two important concepts are introduced, discussed and compared: the observation and the estimation within dynamic systems based on mathematical models. A more detailed introduction is given at the beginning of section 4.2. In literature, the differences are not always clear. Estimation is often understood as a part of observation. In fact, we have introduced the notation ‘observers’ for the parts in the navigation system responsible for merging different data and for outputting the navigation data vector $\mathbf{n}$, which is afterwards used by the different control systems. According to the definitions we will make in this chapter, their tasks are more related to the notation ‘estimation’. Within this chapter, we will distinguish between the terms ‘observation’, which is only related to deterministic signals and systems, and ‘estimation’, which also allows for the consideration of stochastic elements. The text is intended to be used by students within a systems engineering study course who already have basic knowledge in control theory, especially in the design and evaluation of single‐loop feedback systems in the Laplace domain. Different literature was employed by the author as source for the description of the methods in this chapter. The most important ones were (in German language): Föllinger, 1994, Unbehauen, 2009, Lunze, 2010, and Brammer and Siffling, 1994. Sources in English language were Levine et al., 2011, Golnaraghi and Kuo, 2010, Fairman, 1998, and Rugh, 1995.
4.1 Basic Ideas and Concepts
In this basic section, we will define the notations ‘signal’, ‘system’, and ‘model’, introduce the state space description and demonstrate the time discretization.
4.1.1 The Terms ‘Signal’, ‘System’, and ‘Model’ and Their Most Important Features
4.1.1.1 Basic Definitions
The term ‘system’ has an important meaning within the control and systems theory. In general, we use the notation ‘system’ for an enclosed connection of components which may interact with each other and which interact with components outside of the system following clearly defined interfaces. For a proper introduction, we need to define the term ‘signal’ first. The following definitions of signal and system were inspired by the ones in Beucher, 2015, Frey and Bossert, 2008, and Werner, 2008.
Definition: Signal
A signal is the abstract description of a (usually) variable physical quantity, that is, a qualitatively definable property of a physical object, like temperature, velocity, or force, for instance. The abstract description is in most cases given as a mathematical function, in which the time $t$ or the current time step $k$ serves as the independent variable.
© Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2020 T. Glotzbach, Navigation of Autonomous Marine Robots, https://doi.org/10.1007/978-3-658-30109-5_4
Looking back at our discussions in the former chapters, we can consider the pose and velocity of marine objects, which we want to estimate in the navigation process, as signals. Based on this, the following definition can be used for the term ‘system’: Definition: System A system represents a technical or non‐technical process that leads to the transformation of signals. One can differentiate between input and output signals that serve as interfaces between the system and the environment. Input signals have a source outside of the system and must not be directly influenced by the system. They influence the output signals, which on the other hand impact on the environment, for instance as input signals for other systems.
Figure 4‐1: The system with its interfaces to the environment
According to Figure 4‐1, by letting $\mathbf{u}(t) = [u_1(t) \cdots u_p(t)]^T$ be the vector of input signals (or inputs), and $\mathbf{y}(t) = [y_1(t) \cdots y_q(t)]^T$ be the vector of output signals (or outputs), the system can also be described to be a transformation vector $\mathbf{H}$, and the following relation holds:
$$\mathbf{y}(t) = \mathbf{H}\{\mathbf{u}(t)\}. \tag{4-1}$$
Systems with multiple inputs and outputs are referred to as MIMO (Multiple Input, Multiple Output) systems. A system with only one input and one output is called SISO (Single Input, Single Output). In this case, Figure 4‐1 simplifies by only displaying the single input and output signal rather than the vectors, and the transformation is denoted by the operator $H$. For the sake of simplicity, we will look at SISO systems for the discussions of system classifications within this section. It shall be mentioned that combinations of the two system classes can of course also be defined, namely SIMO (Single Input, Multiple Output) and MISO (Multiple Input, Single Output) systems. Both for signals and systems, we can formulate models. The following definition and characterization are inspired by Stachowiak, 1973, and Brockhaus, 1999:
Definition: Model (in science and technology)
A model is a representation of an original, displaying only those properties considered as important in order to fulfill a certain task.
The following three typical features of technical models can be derived:
1. Mapping feature: Each model is an illustration of an existing or an imaginary object.
2. Reduction feature: A model only contains those attributes of the object that are considered to be of importance. As a consequence, there can never be an ‘exact’ model.
3. Pragmatic feature: Every model is designed for a specific purpose, and only needs to be usable for it. The purpose must be known when the model is built. Typical applications that
require technical models are simulation, forecast and estimation (as in this thesis), controller design, diagnosis or monitoring, or optimization of system design. Furthermore, technical models can be classified as physical models (like architectural models, model railways) and conceptual models which exist only in the mind of the designer/user. As a subgroup of the latter ones, we will employ mathematical models of the systems under discussion as a base for the estimation of navigation data.
4.1.1.2 Classification of Systems and Models
Systems (and associated models) can be classified according to the transformation process from the input to the output signals. In what follows, we shall discuss the most important classifications, and the structure a mathematical model will have.
Figure 4‐2: Continuous time (left) and discrete time (right) output signal
A first important distinction can be made between continuous time and discrete time systems. For the former ones it is assumed that input and output signals can be formulated as functions of time $t$, that is, a signal carries a certain value only for an infinitesimally short amount of time before it might change to another value. In a discrete time system, the values of the signals are only evaluated at discrete time steps, which usually exhibit a constant time step $T$ between them. With $k$ being a counter variable, only times $t = k \cdot T$ can be evaluated. Therefore, it is common to write the signals as functions of $k$, e. g. $u(k) = u(t = k \cdot T)$. Since the introduction of digital computers, the discrete time view has gained a lot of importance, as the handling of a system by a computer has to be performed in a discrete time manner. While we can assume that real world systems usually exhibit a continuous time behavior, it might be of importance to create a discrete time model for simulation and estimation on a computer, which is called time discretization. We will discuss this issue in section 4.1.3. Figure 4‐2 provides an overview of a continuous time and a discrete time output signal.
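The relation between a continuous time signal and its discrete time counterpart can be sketched in a few lines. The sine signal, the time step and the function names below are illustrative assumptions and are not taken from the thesis.

```python
import math


def sample(signal, T, n_steps):
    """Evaluate a continuous time signal at t = k*T for k = 0, ..., n_steps-1."""
    return [signal(k * T) for k in range(n_steps)]


def u(t):
    """Hypothetical continuous time signal: a sine with a period of 1 s."""
    return math.sin(2.0 * math.pi * t)


T = 0.25               # constant time step in seconds (assumed value)
u_k = sample(u, T, 5)  # samples at t = 0, 0.25, 0.5, 0.75, 1.0
```

Between two samples the discrete time view carries no information about the signal, which is why the choice of `T` matters for time discretization.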
Figure 4‐3: A stochastic signal: White noise with mean of 0 and variance of 1
Another distinction can be made between deterministic and stochastic systems. In deterministic systems, the courses of the signal are explicitly determined. For identical input
signals and identical initial conditions, one will always receive identical output signals. If a system is assumed to be stochastic, some of the signals or of the system parameters exhibit stochastic features that cannot be estimated with arbitrary accuracy. For instance, every measurement of a physical quantity will add some inaccuracy that is usually modelled as an additional stochastic signal like the one displayed in Figure 4‐3. Stochastic signals can be described by statistical properties, like mean value and variance, but they cannot be forecast exactly. The handling of measurements with stochastic inaccuracies is an important part within the navigation of marine robots. We will further discuss the procedures to deal with these issues in section 4.3. For the further discussions in this section, we will put the focus on deterministic systems, and we will use the continuous time view to introduce further classifications of systems, while the statements can also be transferred to the discrete time view. A very basic distinction can be made between static and dynamic systems, also referred to as systems without and with memory. For a static system, the current value of the output signal, $y(t)$, only depends on the current value of the input signal, $u(t)$. Hence we can write
$$y(t) = f(u(t)), \tag{4-2}$$
where $f(\cdot)$ represents an arbitrary function. For dynamic systems, the current value of the output signal can also depend on values of the input signal from other times. A dynamic system is called causal if the output signal only depends on the current and past values of the input signal, that is, if the following condition holds:
$$y(t) = H\{u(\tau)\}, \quad \tau \le t. \tag{4-3}$$
In signal and systems theory, there is a further distinction between anticausal and acausal systems. In an anticausal system, the current output value depends only on future values of the input, whereas some definitions can be found in literature (e. g. Oppenheim, 1998) that also allow for the dependence on the current input value. An acausal system exhibits an output signal where the current value may depend on past, present, and future values of the input signal. As one can imagine, processes in the real world that are treated as systems are always causal systems, as the output signal cannot be dependent on future values of the input signal. Therefore, we will restrict the further discussion to causal dynamic systems. Systems that are not causal cannot be realized in the real world. The definition of acausal and anticausal systems is of importance within the area of signal processing in the information technology domain, because often signals are processed whose complete course is already stored in the memory. Another example is image coding, where the complete two‐dimensional picture is available, so for the processing of a certain pixel, data of both preceding and subsequent pixels can be employed (Werner, 2008). Figure 4‐4 shows examples of the signal course of a causal and an anticausal system. It can be stated that static systems can be described in a mathematical manner by an algebraic equation. A dynamic system can be described by a differential equation in the continuous time case, or by a difference equation in the discrete time case.
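The difference between a causal and an acausal system can be made concrete with two simple discrete time averaging filters applied to a stored signal. This is only a sketch; the filters, the zero padding at the borders and the test signal are assumptions chosen for illustration.

```python
def causal_average(u, k):
    """Mean of the current and the previous sample; uses only u[k] and u[k-1]."""
    previous = u[k - 1] if k > 0 else 0.0
    return 0.5 * (u[k] + previous)


def acausal_average(u, k):
    """Centered mean; needs the future sample u[k+1], so it requires stored data."""
    previous = u[k - 1] if k > 0 else 0.0
    following = u[k + 1] if k + 1 < len(u) else 0.0
    return (previous + u[k] + following) / 3.0


u = [0.0, 0.0, 3.0, 0.0, 0.0]  # stored signal with a single pulse at k = 2
y_causal = [causal_average(u, k) for k in range(len(u))]
y_acausal = [acausal_average(u, k) for k in range(len(u))]
# y_causal  -> [0.0, 0.0, 1.5, 1.5, 0.0]
# y_acausal -> [0.0, 1.0, 1.0, 1.0, 0.0]
```

The acausal output already reacts at $k = 1$, one step before the pulse appears in the input, which is only possible because the complete signal is available in memory.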
Figure 4‐4: Input (dark) and output (bright) signal of a causal (left) and of an anticausal system (right)
In the discussions so far we have assumed that the relevant signals linked with a system only depend on one variable, which is usually the time $t$. Such systems are referred to as lumped systems, which can be described by so‐called Ordinary Differential Equations (ODE). Strictly speaking, this definition already includes a simplification, as a physical quantity is usually not concentrated at one point, but may change along a line, an area, or a volume. In many cases, it may be justified to imagine a relevant physical quantity as concentrated within one point. For instance, in order to model the movement of a pendulum, it is often assumed that all mass is concentrated in one point. For other systems, it might not be justified to ignore the dependency on the concrete spatial position. In this case, the notation distributed system is used, and the mathematical description requires the employment of a Partial Differential Equation (PDE), which contains functions of more than one variable and their partial derivatives. As an example, we might look at the heat sink of an electrical device like a Central Processing Unit (CPU) within a computer, whose task is to transfer the heat created at the CPU to the environment. We consider the heat sink as a system with the thermal energy provided by the CPU as input signal, and the temperature of the heat sink $\vartheta$ as output signal. It is straightforward to imagine that at every time $t$, the output signal is also a function of the spatial position $\mathbf{x}$ at the heat sink, where a position closer to the CPU will yield a higher temperature. Therefore, the output signal $\vartheta(t, \mathbf{x})$ is a function of time $t$ and position vector $\mathbf{x}$, and the describing equation contains partial derivatives with respect to these variables. As the handling of PDEs is quite complex, it is common to describe these systems with lumped models by employing methods for spatial discretization. Another important distinction is made between stable and unstable systems.
This topic is of great importance in the domain of controller synthesis, as it must be guaranteed that every control circuit is stable, even if the underlying plant has an unstable behavior. Basically, one can differentiate between two general stability conditions. If a system fulfills the condition for the so‐called Bounded‐input, Bounded‐output (BIBO) stability, and if it is considered to be free of energy at time $t = 0$ (that is, all initial conditions of the differential equation describing the relation between output and input signal are zero), then the following can be assumed: If the input signal is bounded, that is, it does not exceed a finite value, then the output signal is also bounded. Let $\delta, \varepsilon > 0$ be arbitrary numbers; a system which can be described by a differential equation of the order $m$ is called BIBO stable if the following implication holds:
$$\left.\frac{\partial^i y(t)}{\partial t^i}\right|_{t=0} = 0;\ i = 0, \dots, m-1 \ \wedge\ |u(t)| < \delta \text{ for all } t \ge 0 \ \Rightarrow\ |y(t)| < \varepsilon \text{ for all } t \ge 0. \tag{4-4}$$
There are even stricter stability definitions. Consider an autonomous dynamical system (that is, without inputs) that has an equilibrium point at which the system will no longer change its states, where ‘state’ refers to the state space representation according to 4.1.2.2. Suppose that all initial conditions do not exceed a finite value $\delta > 0$. Then, the equilibrium is said to be Lyapunov stable if for all times $t \ge 0$ the system states do not exceed a finite value $\varepsilon > 0$. In other words, if the states of the system are in the close vicinity of an equilibrium at $t = 0$ (but not exactly at the equilibrium), the equilibrium is denoted as Lyapunov stable if the states remain in the vicinity of the equilibrium for all times $t \ge 0$. A Lyapunov stable equilibrium is referred to as asymptotically stable if the states even approach the equilibrium for $t \to \infty$. It can be stated that all asymptotically stable equilibria are also Lyapunov stable, and that linear time‐invariant systems with an asymptotically stable equilibrium are also BIBO stable. For a more descriptive understanding, Figure 4‐5 depicts three different systems in which a ball is placed on a differently shaped floor (black), while the vertical position $x(t)$ of the ball serves as output signal. To investigate the stability of the system, the ball is placed at a position $\delta \neq 0$, and its behavior is observed. In the left example, the ball will always roll down the slopes and finally rest at the position $x = 0$. The system is asymptotically stable. In the middle picture, the ball will remain at the position where it is placed initially. Therefore, the system is Lyapunov stable, and it can even be stated that the relation $\delta = \varepsilon$ always holds true. In the right example, the ball would (theoretically) only remain at the position $x = 0$ if it is placed there exactly and no external disturbances like wind or vibration occur. For any starting position not equal to 0, the ball will roll down the slope, showing the instability of the system.
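The three cases of the ball example can be mimicked numerically. The sketch below does not model the ball dynamics; as a simplifying assumption it uses the scalar autonomous system $\dot{x}(t) = a \cdot x(t)$ with the equilibrium $x = 0$, integrated with the explicit Euler method, where the sign of the hypothetical parameter `a` selects the behavior.

```python
def simulate(a, x0, T=0.01, steps=1000):
    """Explicit Euler simulation of the autonomous system dx/dt = a*x.

    Returns the state after steps*T seconds, starting from x0.
    """
    x = x0
    for _ in range(steps):
        x = x + T * a * x
    return x


delta = 0.1  # initial displacement from the equilibrium x = 0

x_valley = simulate(-1.0, delta)  # a < 0: state approaches the equilibrium
x_flat = simulate(0.0, delta)     # a = 0: state stays where it was placed
x_hill = simulate(1.0, delta)     # a > 0: state moves away from the equilibrium
```

After ten simulated seconds, `x_valley` is close to zero (asymptotically stable), `x_flat` still equals `delta` (Lyapunov stable with $\delta = \varepsilon$), and `x_hill` has grown by several orders of magnitude (unstable).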
Figure 4‐5: A ball on a floor as example for stable and unstable systems
The next important distinction is between linear and nonlinear systems. A system is linear if it fulfills the Superposition principle. Assume that the input signal of a system can be expressed as a linear combination of two or more base signals. In this case, the principle is fulfilled if the overall output signal equals the same linear combination of the single output signals which would have been caused by the base signals. For any system
$$y(t) = H\{u(t)\} \quad \text{and} \quad u(t) = \sum_{i=1}^{k} c_i \cdot u_i(t), \tag{4-5}$$
the Superposition principle is fulfilled if and only if
$$y(t) = \sum_{i=1}^{k} c_i \cdot H\{u_i(t)\} \tag{4-6}$$
holds. Figure 4‐6 shows an example. On the left, the signals displayed by two bright curves are added to yield the signal displayed in dark. All signals are used as inputs for the same linear system; the associated output signals are displayed on the right. As implied by the Superposition principle, the sum of the corresponding bright output signals again yields the dark one. In fact, lots of tools and algorithms exist for computations with linear systems. Therefore, it is common to try to develop linear models by the method of linearization, even if the base system exhibits nonlinear behavior. We will come back to this issue when we introduce the Extended Kalman Filter in section 4.3.3.6.
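The Superposition principle can be checked numerically for a given pair of input signals. The following sketch compares a linear discrete time first‐order filter with a nonlinear squaring element; both systems, the test signals and the weights are assumptions made for illustration. Note that passing the check for one pair of signals does not prove linearity, whereas a single failed check does disprove it.

```python
def first_order_filter(u, a=0.5):
    """Linear discrete time system y[k] = a*y[k-1] + (1-a)*u[k], zero initial state."""
    y, out = 0.0, []
    for value in u:
        y = a * y + (1.0 - a) * value
        out.append(y)
    return out


def squarer(u):
    """Nonlinear static system y[k] = u[k]**2."""
    return [value ** 2 for value in u]


def fulfils_superposition(system, u1, u2, c1, c2, tol=1e-9):
    """Compare H{c1*u1 + c2*u2} with c1*H{u1} + c2*H{u2} sample by sample."""
    combined = [c1 * a + c2 * b for a, b in zip(u1, u2)]
    lhs = system(combined)
    rhs = [c1 * a + c2 * b for a, b in zip(system(u1), system(u2))]
    return all(abs(p - q) <= tol for p, q in zip(lhs, rhs))


u1 = [1.0, 0.0, -1.0, 2.0]
u2 = [0.5, 0.5, 0.5, 0.5]
linear_ok = fulfils_superposition(first_order_filter, u1, u2, 2.0, -3.0)
nonlinear_ok = fulfils_superposition(squarer, u1, u2, 2.0, -3.0)
```

Here `linear_ok` evaluates to `True`, while `nonlinear_ok` evaluates to `False`.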
Figure 4‐6: Input signal (left) as a combination of two parts and output signals of a linear system
The last distinction to be discussed at this point is between time‐variant and time‐invariant (TIV) systems. For TIV systems, the response of the system to a certain input signal, given identical initial conditions, is always identical. Therefore, a shift of an input signal results in an identical shift of the output signal, and the condition
$$y(t - t_0) = H\{u(t - t_0)\} \tag{4-7}$$
holds. An example is shown in Figure 4‐7. For time‐variant systems, the transformation operator $H$ is itself a function of time. For instance, a machine in material processing might comprise a tool which wears out during the process. In the course of time the resulting work pieces might differ in quality even if all settings at the machine remain identical. Systems that are both linear and time‐invariant are also denoted as LTI systems. These systems can be modelled mathematically by so‐called linear ordinary differential equations (ODE) with constant coefficients, which exhibit the following form (for a SISO system):
$$a_m y^{(m)}(t) + \cdots + a_1 \dot{y}(t) + a_0 y(t) = b_0 u(t) + b_1 \dot{u}(t) + \cdots + b_q u^{(q)}(t), \tag{4-8}$$
with $\dot{y}(t) = \frac{dy(t)}{dt}$, $y^{(i)}(t) = \frac{d^i y(t)}{dt^i}$, $a_m \neq 0$, and $a_i, b_i \in \mathbb{R}$.
In this equation, the elements $a_0, \dots, a_m, b_0, \dots, b_q$ are denoted as the coefficients. At this point, the difference between signals and coefficients shall be emphasized. Signals are usually time varying quantities that serve as interfaces between different systems. Coefficients are system internal properties and might be influenced by physical quantities like mass, density, or energy conversion efficiency, for instance. If these quantities remain constant, so will the coefficients, and the system is time‐invariant. If the quantities and therefore the coefficients
change with time, the system is time‐variant. For systems which are linear but time‐variant, the coefficients are functions of time $t$; the resulting equation is called a linear ordinary differential equation with variable coefficients.
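The shift property of time‐invariant systems can be tested in discrete time by comparing the response to a delayed input with the delayed response to the original input. The sketch below uses two hypothetical first‐order systems with zero initial state: one with constant coefficients, and one whose input gain decays with the step index, loosely imitating the wearing tool mentioned above.

```python
def time_invariant(u, a=0.5):
    """y[k] = a*y[k-1] + (1-a)*u[k] with a constant coefficient a."""
    y, out = 0.0, []
    for value in u:
        y = a * y + (1.0 - a) * value
        out.append(y)
    return out


def time_variant(u):
    """Same structure, but the input gain 1/(1+k) decays with the step k."""
    y, out = 0.0, []
    for k, value in enumerate(u):
        gain = 1.0 / (1.0 + k)
        y = 0.5 * y + gain * value
        out.append(y)
    return out


def shift(signal, k0):
    """Delay a signal by k0 steps, padding with zeros at the beginning."""
    return [0.0] * k0 + signal[:len(signal) - k0]


u = [1.0, 2.0, 0.0, 0.0, 0.0, 0.0]
k0 = 2
tiv_holds = time_invariant(shift(u, k0)) == shift(time_invariant(u), k0)
tv_holds = time_variant(shift(u, k0)) == shift(time_variant(u), k0)
```

For the system with constant coefficients the two results coincide (`tiv_holds` is `True`); for the system with the decaying gain they differ (`tv_holds` is `False`).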
Figure 4‐7: For a time‐invariant system, an identical input signal and identical initial conditions will always result in an identical output signal, independent of the time t
The handling of systems described by differential equations in the time domain, and of the Impulse Response Function (IRF) deduced from them, is difficult, especially if complex structures of systems are employed as is common in control theory, see Figure 2‐9, for instance. For this reason, it is common to transfer the system description into the frequency domain. This is usually done resorting to Fourier or Laplace transforms. The latter one is used in control theory. The Laplace transform of the IRF of a system is guaranteed to exist if the system is causal; it does not have to be stable. A sufficient condition for the existence of a Fourier transform is that the system is stable; it does not have to be causal. For this reason, the Fourier transform is of bigger importance within the information technology domain. In the frequency domain, several tools are available for typical tasks within the control theory domain like investigation of stability or controller design, especially for LTI systems. An overview of these tools can be found in literature, like Golnaraghi and Kuo, 2010, or Levine et al., 2011, to name but a few. The solution of control tasks within the frequency domain can be considered to be the classical way in the control theory domain since the 1940s. However, with the state space description an important alternative procedure has been established since the 1960s, especially based on the work of the Hungarian‐American mathematician Rudolf E. Kálmán. Both techniques are widely used to this day, because both exhibit typical advantages for concrete application scenarios. We will introduce the state space representation in the next section and distinguish it from the frequency domain approach.
4.1.2 State Space Representation
4.1.2.1 Necessity for the Introduction and Comparison With Frequency Domain Approach
The classical way to solve problems in control theory for SISO systems involves the Laplace transform of the system descriptions and the solution within the frequency domain. Problem solution in the frequency domain is usually easier to achieve for SISO systems, but if the results need to be available in the time domain, it is necessary to perform an inverse Laplace transform, which might require some cumbersome pre‐calculations. Employing the state space approach will enable us to solve problems without leaving the time domain. Figure 4‐8 displays both ways in comparison.
As it can be observed in Figure 4‐8, there is a dashed arrow leaving from the ‘Solution in Frequency Domain’ box downwards. This is to show that in some scenarios dealing with SISO systems, there might not be the need to perform the complex inverse Laplace transform. For instance, to design a controller and to compute the parameters of the controller in order to fulfil predefined quality criteria, it might be sufficient to transform the system descriptions into the frequency domain and employ an adequate procedure for controller design, e. g. the root locus method (see Golnaraghi and Kuo, 2010, chapter 9, for instance). Also, for simple systems the transform and inverse transform might be easy to perform.
Figure 4‐8: Comparison of problem solution by state space description (solid arrows) and by transfer into the frequency domain (dashed arrows)
For several scenarios, the employment of the state space description exhibits several advantages. According to Föllinger, 1994, the following aspects might justify the usage of the state space description in certain scenarios:
The state space representation offers various tools for the computation of non‐LTI‐systems, while the frequency domain is best suited for LTI‐systems.
The classical approach of describing systems with differential or difference equations based on their input/output relation does not allow for a deeper insight into what is going on inside the system. The state space representation is suited to describe the processes within the system in more detail and can deliver information on internal quantities that are not directly measurable.
The frequency domain method is best suited for SISO systems. Methods for MISO systems exist, but the complexity rises enormously. The state space representation is explicitly developed for MIMO systems. SISO systems can nevertheless be treated as a special case.
The state space representation results in a clear representation, especially for systems of higher order, in a vector differential form that can directly be processed by digital computers.
4.1.2.2 Mathematical Introduction of The State Space Representation
In the following, we will introduce the state space representation for a system with $r$ inputs and $p$ outputs. For every output, there exists a differential equation describing the relation between the $r$ inputs and that particular output. Employing the notation shown in equation (4-8) for an LTI‐system, we let $n_i$ be the order of the highest existing derivative of output $i$ of $p$, while $m_{i,j}$ is the order of the highest existing derivative of input $j$ ($j = 1, \dots, r$) in that equation. As we have limited ourselves to causal systems, the relation

$$n_i \ge m_{i,j} \tag{4-9}$$

holds true for all $j$. As a consequence, we can state that the system is described by $p$ differential equations, whose maximum order $n$ can be computed to be

$$n = \max_i n_i, \quad i = 1, \dots, p. \tag{4-10}$$
The goal of the state space representation is to replace these differential equations by $n$ first‐order differential equations, which are referred to as state equations, and $p$ simple algebraic equations, which are referred to as output equations. In order to do so, it is necessary to introduce at least $n$ so‐called state variables $x_i$, $i = 1, \dots, n$, which are collected in the state vector $\mathbf{x} = [x_1 \cdots x_n]^T \in \mathbb{R}^n$. Introducing likewise the input vector $\mathbf{u} = [u_1 \cdots u_r]^T \in \mathbb{R}^r$ as well as the output vector $\mathbf{y} = [y_1 \cdots y_p]^T \in \mathbb{R}^p$, it can be stated that every state is influenced by itself, the other states, and the inputs, following the state equations. On the other hand, the outputs are influenced by the states and possibly (albeit rarely) directly by the inputs, following the output equations, which are strictly algebraic. With the discussions made so far, the state and output equations usually exhibit the following structure in the continuous‐time domain:

$$\begin{aligned}
\dot{x}_i(t) &= f_i(x_1, \dots, x_n, u_1, \dots, u_r), \quad i = 1, \dots, n, \\
y_j(t) &= g_j(x_1, \dots, x_n, u_1, \dots, u_r), \quad j = 1, \dots, p.
\end{aligned} \tag{4-11}$$
Note that the functions $f_i(\cdot)$ and $g_j(\cdot)$ do not contain any derivatives, especially not of input variables. It can be stated that the whole dynamics of the system is captured in the states. Limiting ourselves to the handling of LTI‐systems for the time being, state and output equations can be written in the following form:

$$\begin{aligned}
\dot{x}_i(t) &= a_{i1} x_1(t) + \dots + a_{in} x_n(t) + b_{i1} u_1(t) + \dots + b_{ir} u_r(t), \quad i = 1, \dots, n, \\
y_j(t) &= c_{j1} x_1(t) + \dots + c_{jn} x_n(t) + d_{j1} u_1(t) + \dots + d_{jr} u_r(t), \quad j = 1, \dots, p.
\end{aligned} \tag{4-12}$$
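It is straightforward to summarize the large number of equations in two vector equations. To this end, the parameters are collected in the following parameter matrices: the system matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$, the input matrix $\mathbf{B} \in \mathbb{R}^{n \times r}$, the output matrix $\mathbf{C} \in \mathbb{R}^{p \times n}$, and the feedthrough (or feedforward) matrix $\mathbf{D} \in \mathbb{R}^{p \times r}$, which exhibit the following structure:

$$\begin{aligned}
\mathbf{A} &= \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}, &
\mathbf{B} &= \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1r} \\ b_{21} & b_{22} & \cdots & b_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nr} \end{bmatrix}, \\
\mathbf{C} &= \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{p1} & c_{p2} & \cdots & c_{pn} \end{bmatrix}, &
\mathbf{D} &= \begin{bmatrix} d_{11} & d_{12} & \cdots & d_{1r} \\ d_{21} & d_{22} & \cdots & d_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ d_{p1} & d_{p2} & \cdots & d_{pr} \end{bmatrix}.
\end{aligned} \tag{4-13}$$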
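With these definitions, the equations (4-12) can be transformed into (a) the state vector differential equation and (b) the vector output equation:

$$\begin{aligned}
\dot{\mathbf{x}}(t) &= \mathbf{A}\,\mathbf{x}(t) + \mathbf{B}\,\mathbf{u}(t), \\
\mathbf{y}(t) &= \mathbf{C}\,\mathbf{x}(t) + \mathbf{D}\,\mathbf{u}(t).
\end{aligned} \tag{4-14}$$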
Figure 4‐9: Block diagram of the state space representation of a LTI‐system
Figure 4‐9 displays the structure of the discussed state space representation in a block diagram. The following statements can be made at this point:
The term ‘state space’ refers to the $n$‐dimensional vector space which is associated with the state vector. Every current value of the state vector can be regarded as a point in this vector space, and the function $\mathbf{x}(t)$ represents a trajectory. For the sake of simplicity, we will use the terms input and output trajectory also for the courses of $\mathbf{u}(t)$ and $\mathbf{y}(t)$.
In many real systems, there is no direct influence of the inputs on the outputs. This is the case if, in the associated differential equation, the condition $n > m$ holds. In these cases, $\mathbf{D}$ becomes a zero matrix. This will usually be the case in the discussions to follow. A system in which the condition $n = m$ holds is also referred to as biproper.
For systems with only one input, the matrix $\mathbf{B}$ becomes a column vector $\mathbf{b} \in \mathbb{R}^{n}$. For systems with only one output, the matrix $\mathbf{C}$ becomes a row vector $\mathbf{c}^T \in \mathbb{R}^{1 \times n}$. In both cases, matrix $\mathbf{D}$ also becomes either a column or a row vector. In SISO systems, $\mathbf{D}$ even becomes a scalar $d$.
For systems which are linear but time‐variant, the same equations according to (4-14) can be used; but as the parameters are functions of time $t$ in this case, the four parameter matrices are also time‐dependent and must be replaced accordingly (e.g. $\mathbf{A}$ by $\mathbf{A}(t)$).
A system is denoted as autonomous if it does not possess any input. An autonomous system is solely driven by the initial values of the state vector, which is also denoted as free motion, in contrast to the forced motion which is caused by inputs.
4.1.2.3 Solution of The Vector State Space Differential Equation
Now we will discuss the solution of the vector state space differential equation. To that end, we will look at a (non‐vectorial) ODE of first order with the same structure, namely

$$\dot{x}(t) = a\,x(t) + b\,u(t); \quad x(0) = x_0; \quad a, b \in \mathbb{R}. \tag{4-15}$$
This equation can easily be solved after transfer into the Laplace domain:

$$\begin{aligned}
s X(s) - x_0 &= a X(s) + b\,U(s), \\
X(s) &= \frac{1}{s - a}\,x_0 + \frac{1}{s - a}\,b\,U(s).
\end{aligned} \tag{4-16}$$
By inverse Laplace transformation, we obtain the solution in the time domain:

$$x(t) = e^{a t} x_0 + \int_0^t e^{a (t - \tau)}\, b\, u(\tau)\, \mathrm{d}\tau. \tag{4-17}$$
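As a quick plausibility check of the scalar solution, consider the special case of a constant input $u(t) = u_0$ and $a \neq 0$. Both restrictions are assumptions made here purely for illustration; they are not part of the original derivation. The convolution integral can then be evaluated in closed form:

$$x(t) = e^{a t} x_0 + b\,u_0 \int_0^t e^{a (t - \tau)}\, \mathrm{d}\tau = e^{a t} x_0 + \frac{b}{a}\left(e^{a t} - 1\right) u_0 .$$

Differentiating this expression gives $\dot{x}(t) = a\,e^{a t} x_0 + b\,e^{a t} u_0 = a\,x(t) + b\,u_0$, and setting $t = 0$ returns $x(0) = x_0$, so the first-order ODE and its initial condition are both satisfied.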
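It is straightforward to employ the same computation to solve the vector state differential equation (4-14) (a), which gives rise to the following solution:

$$\mathbf{x}(t) = e^{\mathbf{A} t}\, \mathbf{x}_0 + \int_0^t e^{\mathbf{A} (t - \tau)}\, \mathbf{B}\, \mathbf{u}(\tau)\, \mathrm{d}\tau, \tag{4-18}$$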
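where the matrix exponential function is defined by the series expansion of the exponential function, which leads to the following ($\mathbf{I}$ denotes the identity matrix):

$$e^{\mathbf{A} t} = \mathbf{I} + \mathbf{A} t + \mathbf{A}^2 \frac{t^2}{2!} + \mathbf{A}^3 \frac{t^3}{3!} + \dots = \sum_{i=0}^{\infty} \mathbf{A}^i \frac{t^i}{i!}. \tag{4-19}$$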
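The series definition of the matrix exponential can be evaluated numerically by truncating the sum. The following is a minimal sketch in plain Python; the function names `mat_mul` and `mat_exp`, the truncation after 30 terms, and the diagonal test matrix are assumptions chosen for illustration and are not taken from the original text. For matrices with large norm, a plain truncated series is numerically poor, and dedicated library routines would be preferable.

```python
import math


def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def mat_exp(A, t, terms=30):
    """Approximate e^(A t) by the first `terms` summands of the power series."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current summand A^i t^i / i!, starts at I
    for i in range(1, terms):
        # Next summand: previous summand times A, scaled by t / i.
        term = [[v * t / i for v in row] for row in mat_mul(term, A)]
        result = [[r + s for r, s in zip(res_row, term_row)]
                  for res_row, term_row in zip(result, term)]
    return result


# Hypothetical example: for a diagonal A, e^(A t) is the element-wise
# exponential of the diagonal entries, which gives an easy cross-check.
A = [[-1.0, 0.0], [0.0, -2.0]]
Phi = mat_exp(A, 0.5)
expected_diagonal = [math.exp(-0.5), math.exp(-1.0)]
```

The diagonal case is used only because its exact result is known; the same function works for any square matrix of moderate norm.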
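In order to uncover another interesting property, we will again use the Laplace transformation, this time on the vector state differential equation according to equation (4-14), which yields:

$$\begin{aligned}
s \mathbf{X}(s) - \mathbf{x}_0 &= \mathbf{A} \mathbf{X}(s) + \mathbf{B}\, \mathbf{U}(s), \\
\Leftrightarrow \mathbf{X}(s) &= (s\mathbf{I} - \mathbf{A})^{-1} \mathbf{x}_0 + (s\mathbf{I} - \mathbf{A})^{-1} \mathbf{B}\, \mathbf{U}(s) \\
&= (s\mathbf{I} - \mathbf{A})^{-1} \left[ \mathbf{x}_0 + \mathbf{B}\, \mathbf{U}(s) \right].
\end{aligned} \tag{4-20}$$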
It is straightforward to say that the term $(s\mathbf{I} - \mathbf{A})^{-1}$ describes the dynamic behaviour of the states, like a generalisation of the scalar transfer function. The poles of this expression, that is, the values of $s$ for which $s\mathbf{I} - \mathbf{A}$ becomes singular, can be used to evaluate the transfer behaviour between the inputs and the states, or between the initial values of the states and the course of the states for $t > 0$, e.g. in terms of stability. Interestingly enough, in order to find the poles, we have to solve the equation

$$\det(s\mathbf{I} - \mathbf{A}) = 0. \tag{4-21}$$
The solutions of this equation are also the eigenvalues of matrix $\mathbf{A}$. We can conclude: All poles of the transfer function of a system in state space representation are eigenvalues of the system matrix $\mathbf{A}$. For the sake of completeness, it shall be mentioned that not necessarily every eigenvalue of $\mathbf{A}$ is also a pole of the transfer function, as poles might be cancelled by zeros.
4.1.2.4 Transfer of An ODE into A State Space Representation
Finally, we will discuss the transformation of an LTI system description from a representation as an ODE into a state space description. This can be done in several ways, which differ especially in the concrete selection of the states. In several applications, it is desirable to choose selected physical quantities as states, in order to be able to observe or control them later. If the states can be chosen arbitrarily, this gives rise to some canonical forms in the state space. We look at a SISO LTI‐system described by an ODE as specified in equation (4-8). For the sake of simplicity, we assume that the coefficient $a_n$ equals 1; if this is not the case, it is straightforward to divide the equation by $a_n$. Also, as the condition $n \ge m$ always holds for causal systems, we can use $n$ as the highest possible derivative of $u$. For $n > m$, it is straightforward to simply set the coefficients $b_{m+1}, \dots, b_n$ to zero. This gives rise to the following ODE, or, after Laplace transformation, to the following transfer function:

$$\begin{aligned}
y^{(n)}(t) + a_{n-1}\, y^{(n-1)}(t) + \dots + a_1\, \dot{y}(t) + a_0\, y(t) &= b_n\, u^{(n)}(t) + b_{n-1}\, u^{(n-1)}(t) + \dots + b_1\, \dot{u}(t) + b_0\, u(t), \\
G(s) = \frac{Y(s)}{U(s)} &= \frac{b_n s^n + b_{n-1} s^{n-1} + \dots + b_1 s + b_0}{s^n + a_{n-1} s^{n-1} + \dots + a_1 s + a_0}.
\end{aligned} \tag{4-22}$$
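At first, we assume that the ODE does not contain any derivative of the input, that is, $b_i = 0$ for $i > 0$, $b_0 \neq 0$. In this special case, the transfer into the state space description is trivial. We define the first state, $x_1$, as output $y$ divided by $b_0$, so that $x_1(t) = y(t)/b_0$ holds. Any further state $x_i$, $i = 2, \dots, n$, is defined to be the derivative of the former state, $x_{i-1}$. Based on this, the state equations can be written as

$$\begin{aligned}
\dot{x}_1(t) &= x_2(t) = \tfrac{1}{b_0}\, \dot{y}(t), \\
\dot{x}_2(t) &= x_3(t) = \tfrac{1}{b_0}\, \ddot{y}(t), \\
&\;\;\vdots \\
\dot{x}_{n-1}(t) &= x_n(t) = \tfrac{1}{b_0}\, y^{(n-1)}(t), \\
\dot{x}_n(t) &= \tfrac{1}{b_0}\, y^{(n)}(t) = \tfrac{1}{b_0} \left( -a_0\, y(t) - a_1\, \dot{y}(t) - \dots - a_{n-1}\, y^{(n-1)}(t) + b_0\, u(t) \right) \\
&= -a_0\, x_1(t) - a_1\, x_2(t) - \dots - a_{n-1}\, x_n(t) + u(t).
\end{aligned} \tag{4-23}$$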
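Note that for the final equation, we have simply solved the ODE in (4-22) for $y^{(n)}(t)$ and entered the result in the round brackets. In the second step, $y(t)$ and its derivatives have been replaced by the corresponding state variables, following the scheme $y^{(i)}(t) = b_0\, x_{i+1}(t)$, $i = 0, \dots, n-1$. With these definitions, the state vector equation of the system is given as:

$$\underbrace{\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \\ \vdots \\ \dot{x}_{n-1}(t) \\ \dot{x}_n(t) \end{bmatrix}}_{\dot{\mathbf{x}}(t)} =
\underbrace{\begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \\ -a_0 & -a_1 & \cdots & -a_{n-2} & -a_{n-1} \end{bmatrix}}_{\mathbf{A}}
\underbrace{\begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_{n-1}(t) \\ x_n(t) \end{bmatrix}}_{\mathbf{x}(t)} +
\underbrace{\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}}_{\mathbf{b}} u(t). \tag{4-24}$$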
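The vector output equation is trivial, as simply the first state times $b_0$ was set as output, and there is no feedthrough because $m = 0 < n$:

$$y(t) = \underbrace{\begin{bmatrix} b_0 & 0 & \cdots & 0 \end{bmatrix}}_{\mathbf{c}^T}
\underbrace{\begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_{n-1}(t) \\ x_n(t) \end{bmatrix}}_{\mathbf{x}(t)}. \tag{4-25}$$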
If the system is not considered to be free of energy at time $t = 0$, the initial state vector $\mathbf{x}(0)$ must be set to contain the defined initial value of the output and its relevant derivatives.
4.1.2.5 Controller Canonical Form
If the original ODE contains derivatives of the input signal, the described procedure cannot be used: The equation for $\dot{x}_n(t)$ in (4-23) would contain these derivatives, which would violate the general structure of the state space representation, in which all system dynamics must be represented within the states. Therefore, we now look at a system which is described by an ODE according to (4-22), and where the condition $b_i \neq 0$ is fulfilled for at least one $i \in \{1, \dots, n\}$. In order to find a suitable state space representation, we now introduce an auxiliary quantity $v(t)$. This quantity has to fulfill the following differential equation:

$$v^{(n)}(t) + a_{n-1}\, v^{(n-1)}(t) + \dots + a_1\, \dot{v}(t) + a_0\, v(t) = u(t), \tag{4-26}$$
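which does not contain any derivatives of the input. Thus it is possible to transfer this equation into a state space representation, starting with the definition $x_1(t) = v(t)$ and further on following the procedure shown in equation (4-23):

$$\begin{aligned}
\dot{x}_1(t) &= x_2(t) = \dot{v}(t), \\
\dot{x}_2(t) &= x_3(t) = \ddot{v}(t), \\
&\;\;\vdots \\
\dot{x}_{n-1}(t) &= x_n(t) = v^{(n-1)}(t), \\
\dot{x}_n(t) &= v^{(n)}(t) = -a_0\, v(t) - a_1\, \dot{v}(t) - \dots - a_{n-1}\, v^{(n-1)}(t) + u(t) \\
&= -a_0\, x_1(t) - a_1\, x_2(t) - \dots - a_{n-1}\, x_n(t) + u(t).
\end{aligned} \tag{4-27}$$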
Again, the equation for $\dot{x}_n(t)$ was found by solving equation (4-26) for $v^{(n)}(t)$.
By looking at equation (4-27) we can conclude that the corresponding state vector equation is exactly the same as the one shown in (4-24). We now have to design the output row vector and the feedthrough scalar in such a way that the overall system fulfills the original ODE from (4-22). Therefore, we have to find a relation between $y(t)$ and $v(t)$. To this end, we transfer equation (4-26) into the Laplace domain to obtain

$$U(s) = \left[ s^n + a_{n-1} s^{n-1} + \dots + a_1 s + a_0 \right] V(s), \tag{4-28}$$
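which we insert for $U(s)$ into the transfer function in (4-22). Solving for $Y(s)$, we can cancel the term in square brackets in (4-28). The result is afterwards transferred back into the time domain:

$$\begin{aligned}
Y(s) &= \left[ b_n s^n + b_{n-1} s^{n-1} + \dots + b_1 s + b_0 \right] V(s), \\
y(t) &= b_n\, v^{(n)}(t) + b_{n-1}\, v^{(n-1)}(t) + \dots + b_1\, \dot{v}(t) + b_0\, v(t).
\end{aligned} \tag{4-29}$$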
By replacing $v(t)$ and its derivatives by the corresponding elements of the state vector, one obtains:

$$y(t) = b_n\, \dot{x}_n(t) + b_{n-1}\, x_n(t) + \dots + b_1\, x_2(t) + b_0\, x_1(t). \tag{4-30}$$
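At this point, we have to distinguish between two different cases. If the system fulfils the condition $n > m$, then at least the first summand in (4-30) is zero. In this case, it is straightforward to formulate the vector output equation as

$$y(t) = \underbrace{\begin{bmatrix} b_0 & b_1 & \cdots & b_{n-1} \end{bmatrix}}_{\mathbf{c}^T}
\underbrace{\begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix}}_{\mathbf{x}(t)}. \tag{4-31}$$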
If the system is biproper, the summand $b_n\, \dot{x}_n(t)$ in equation (4-30) does not disappear, but $\dot{x}_n(t)$ is not an element of the state vector. We need to replace it according to the last line in equation (4-27):

$$\dot{x}_n(t) = -a_0\, x_1(t) - a_1\, x_2(t) - \dots - a_{n-1}\, x_n(t) + u(t), \tag{4-32}$$
which we insert in equation (4-30) to obtain

$$y(t) = (b_0 - a_0 b_n)\, x_1(t) + (b_1 - a_1 b_n)\, x_2(t) + \dots + (b_{n-1} - a_{n-1} b_n)\, x_n(t) + b_n\, u(t). \tag{4-33}$$
Figure 4‐10: Block diagram of a state space representation in the controller canonical form
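Consequently, the vector output equation can be written as:

$$y(t) = \underbrace{\begin{bmatrix} b_0 - a_0 b_n & \cdots & b_{n-1} - a_{n-1} b_n \end{bmatrix}}_{\mathbf{c}^T}
\underbrace{\begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix}}_{\mathbf{x}(t)} + \underbrace{b_n}_{d}\, u(t). \tag{4-34}$$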
The canonical form of the state space representation which was demonstrated here, especially the structure of the system matrix in equation (4-24), is referred to as controller canonical form, as it exhibits some advantages when designing controllers for state control. Also, it can easily be obtained from the ODE of the system. A block diagram of the corresponding state space description is shown in Figure 4‐10, based on a concept from Unbehauen, 2009. Note that the last row of the system matrix $\mathbf{A}$ contains the negated coefficients of the characteristic polynomial of $\mathbf{A}$.
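The construction of the controller canonical form can be sketched in a few lines of plain Python. The function name `controller_canonical`, the coefficient ordering of its arguments, and the example transfer function are assumptions made for illustration; the matrices follow the structure of (4-24) for the state equation and of (4-34) for the output equation, which reduces to (4-31) when $b_n = 0$.

```python
def controller_canonical(a, b):
    """Build (A, b_vec, c_row, d) of the controller canonical form for

        y^(n) + a[n-1] y^(n-1) + ... + a[0] y = b[n] u^(n) + ... + b[0] u.

    a: [a_0, ..., a_(n-1)], leading coefficient a_n already normalised to 1
    b: [b_0, ..., b_n], padded with zeros if the input order m is below n
    """
    n = len(a)
    if len(b) != n + 1:
        raise ValueError("b must contain n + 1 coefficients b_0 ... b_n")
    # Rows 1 .. n-1: ones on the superdiagonal; last row: negated a-coefficients.
    A = [[1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n - 1)]
    A.append([-float(a_i) for a_i in a])
    b_vec = [0.0] * (n - 1) + [1.0]
    b_n = float(b[n])
    # Output row c_i = b_i - a_i * b_n; the feedthrough scalar is d = b_n.
    c_row = [float(b[i]) - float(a[i]) * b_n for i in range(n)]
    return A, b_vec, c_row, b_n


# Hypothetical example: G(s) = (s + 3) / (s^2 + 3 s + 2), i.e. n = 2, m = 1.
A, b_vec, c_row, d = controller_canonical(a=[2, 3], b=[3, 1, 0])
```

For this example the output row is simply the numerator coefficients and the feedthrough vanishes, because the system is strictly proper.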
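4.1.2.6 Observer Canonical Form
In order to obtain the so‐called observer canonical form, we start by integrating the ODE in equation (4-22) $n$ times, which gives

$$y(t) = b_n\, u(t) + \int \big( b_{n-1}\, u(\tau) - a_{n-1}\, y(\tau) \big)\, \mathrm{d}\tau + \dots + \underbrace{\int \cdots \int}_{n \text{ times}} \big( b_0\, u(\tau) - a_0\, y(\tau) \big)\, \mathrm{d}\tau^n. \tag{4-35}$$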
Figure 4‐11: Block diagram of a state space representation in the observer canonical form
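Figure 4‐11, which again borrows from Unbehauen, 2009, shows the block diagram that can easily be developed from equation (4-35). It is straightforward to define the outputs of the integrators as states, as done in the figure. This gives rise to the following system of equations:

$$\begin{aligned}
\dot{x}_1(t) &= -a_0\, y(t) + b_0\, u(t), \\
\dot{x}_2(t) &= x_1(t) - a_1\, y(t) + b_1\, u(t), \\
&\;\;\vdots \\
\dot{x}_{n-1}(t) &= x_{n-2}(t) - a_{n-2}\, y(t) + b_{n-2}\, u(t), \\
\dot{x}_n(t) &= x_{n-1}(t) - a_{n-1}\, y(t) + b_{n-1}\, u(t),
\end{aligned} \tag{4-36}$$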
and the output equation:

$$y(t) = x_n(t) + b_n\, u(t). \tag{4-37}$$
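By inserting equation (4-37) into the system (4-36), we obtain the following state equations:

$$\begin{aligned}
\dot{x}_1(t) &= -a_0\, x_n(t) + (b_0 - b_n a_0)\, u(t), \\
\dot{x}_2(t) &= x_1(t) - a_1\, x_n(t) + (b_1 - b_n a_1)\, u(t), \\
&\;\;\vdots \\
\dot{x}_{n-1}(t) &= x_{n-2}(t) - a_{n-2}\, x_n(t) + (b_{n-2} - b_n a_{n-2})\, u(t), \\
\dot{x}_n(t) &= x_{n-1}(t) - a_{n-1}\, x_n(t) + (b_{n-1} - b_n a_{n-1})\, u(t).
\end{aligned} \tag{4-38}$$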
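Summing this up, the final vector state equation and vector output equation for the observer canonical form of the state space representation can be written as:

$$\underbrace{\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \\ \vdots \\ \dot{x}_{n-1}(t) \\ \dot{x}_n(t) \end{bmatrix}}_{\dot{\mathbf{x}}(t)} =
\underbrace{\begin{bmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & 1 & 0 & -a_{n-2} \\ 0 & \cdots & 0 & 1 & -a_{n-1} \end{bmatrix}}_{\mathbf{A}}
\underbrace{\begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_{n-1}(t) \\ x_n(t) \end{bmatrix}}_{\mathbf{x}(t)} +
\underbrace{\begin{bmatrix} b_0 - b_n a_0 \\ b_1 - b_n a_1 \\ \vdots \\ b_{n-2} - b_n a_{n-2} \\ b_{n-1} - b_n a_{n-1} \end{bmatrix}}_{\mathbf{b}} u(t), \tag{4-39}$$

$$y(t) = \underbrace{\begin{bmatrix} 0 & 0 & \cdots & 0 & 1 \end{bmatrix}}_{\mathbf{c}^T}
\underbrace{\begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_{n-1}(t) \\ x_n(t) \end{bmatrix}}_{\mathbf{x}(t)} + \underbrace{b_n}_{d}\, u(t). \tag{4-40}$$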
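The observer canonical form can be assembled from the ODE coefficients in the same way as the controller canonical form. The sketch below is plain Python; the function name `observer_canonical`, the helper `transpose`, and the example coefficients are assumptions chosen for illustration. The example also compares the result with the controller canonical form of the same hypothetical system, written out by hand, to make the transposition relationship between the two forms visible.

```python
def observer_canonical(a, b):
    """Build (A, b_vec, c_row, d) of the observer canonical form.

    a: [a_0, ..., a_(n-1)], leading coefficient a_n normalised to 1
    b: [b_0, ..., b_n], padded with zeros if the input order m is below n
    """
    n = len(a)
    if len(b) != n + 1:
        raise ValueError("b must contain n + 1 coefficients b_0 ... b_n")
    b_n = float(b[n])
    # Ones on the subdiagonal, negated a-coefficients in the last column.
    A = [[1.0 if j == i - 1 else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        A[i][n - 1] = -float(a[i])
    b_vec = [float(b[i]) - b_n * float(a[i]) for i in range(n)]
    c_row = [0.0] * (n - 1) + [1.0]
    return A, b_vec, c_row, b_n


def transpose(M):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


# Hypothetical example: G(s) = (s + 3) / (s^2 + 3 s + 2).
A_o, b_o, c_o, d_o = observer_canonical(a=[2, 3], b=[3, 1, 0])

# Controller canonical form of the same system, written out by hand.
A_c, b_c, c_c = [[0.0, 1.0], [-2.0, -3.0]], [0.0, 1.0], [3.0, 1.0]

dual = (transpose(A_o) == A_c) and (b_o == c_c) and (c_o == b_c)
```

Here `dual` evaluates to true for the chosen example: the system matrices are transposes of each other, and the input and output vectors swap roles.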
Comparing the equations (4-24), (4-34), (4-39), and (4-40), one can see that the two discussed canonical forms have a dual relationship. It is possible to transfer one form into the other by transposing the system matrix $\mathbf{A}$ and by exchanging the vectors $\mathbf{b}$ and $\mathbf{c}$. We will further exploit the observer canonical form in section 4.2.2. At this point we can observe, as we did before for the controller canonical form, that in the observer canonical form the last column of the system matrix $\mathbf{A}$ contains the negated coefficients of its own characteristic polynomial.
4.1.3 Time Discretization
So far, we have limited our discussions to continuous‐time systems and models. Due to the usage of digital computers in many real world applications, it is desirable to model the system behaviour in a discrete time manner. That means that signals and systems are only evaluated at discrete time steps, which are usually equidistant; the interval between them is the so‐called step time $T$. It is common to denote the current time as the product of the step time $T$ and a counting variable $k$. This gives rise to the following notation for a discrete time output signal, for instance:

$$y_k = y(k) = y(t = kT), \quad k = 0, 1, 2, \dots. \tag{4-41}$$
Figure 4‐12: Continuous time and discrete time signals
Figure 4‐12 provides an overview of the continuous time signal and the corresponding discrete time realisation. As one can see, it is possible to transfer an algebraic continuous equation into a discrete time representation which is exact at the sampled points in time. It is not straightforward to do this with a differential equation, because each discrete time representation is by definition discontinuous. Instead, difference equations are used. That means, the value of the signal at the current time step can be described as a function of the signal values at past times.
4.1.3.1 Discretizing Employing Difference Quotients
In order to transfer a differential equation into a difference equation, the time derivatives need to be replaced. A standard definition of the derivative employs the difference quotient according to the following equation:

$$\left. \frac{\mathrm{d}y}{\mathrm{d}t} \right|_{t = t_0} = \lim_{t_1 \to t_0} \frac{y(t_1) - y(t_0)}{t_1 - t_0}. \tag{4-42}$$
Figure 4‐13: Definition of the derivative via the difference quotient
The process is depicted in Figure 4‐13: The derivative of a function at time $t_0$ is computed as the slope of a line through $y(t_0)$ and $y(t_1)$, where $t_1$ is set as close as possible to $t_0$. For a discrete time signal, the closest possible time is one time step away. This might result in some inaccuracies, especially if the step time is large with respect to the time constants describing the signal behavior. Employing equation (4-42), we can formulate the following rule to replace first order derivatives in ODEs:

$$\frac{\mathrm{d}y}{\mathrm{d}t} \approx \frac{y(k+1) - y(k)}{(k+1)T - kT} = \frac{y(k+1) - y(k)}{T}. \tag{4-43}$$
For derivatives of higher orders, different difference quotients exist. For example, a second order derivative can be discretized in the following way:

$$\frac{\mathrm{d}^2 y}{\mathrm{d}t^2} \approx \frac{y(k+1) - 2\,y(k) + y(k-1)}{T^2}. \tag{4-44}$$
To discretize an ODE describing an LTI system, all derivatives of output and input signals have to be replaced accordingly. The discrete time description is given by the following difference equation, according to the ODE in equation (4-22):

$$y(k) + a_{1,d}\, y(k-1) + \dots + a_{n,d}\, y(k-n) = b_{0,d}\, u(k) + \dots + b_{n,d}\, u(k-n). \tag{4-45}$$
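The subscripts $d$ are used to distinguish the coefficients from those of the continuous time representation. The state space representation also exists for discrete time systems. The vector difference equation and the output equation can be written as

$$\begin{aligned}
\mathbf{x}(k+1) &= \mathbf{A}_d\, \mathbf{x}(k) + \mathbf{B}_d\, \mathbf{u}(k), \\
\mathbf{y}(k) &= \mathbf{C}\, \mathbf{x}(k) + \mathbf{D}\, \mathbf{u}(k).
\end{aligned} \tag{4-46}$$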
Again, the subscripts in $\mathbf{A}_d$ and $\mathbf{B}_d$ show that the parameter matrices contain the parameters for the discrete time representation. It is straightforward to develop the state space representation for any system from its difference equation (4-45) according to the procedures discussed before. The controller and observer canonical forms also exist for discrete time systems.
4.1.3.2 Precise Time Discretization for A System in State Space Representation
Finally, we will discuss how to transfer a continuous time state space representation according to equation (4-14) into a discrete time one according to equation (4-46). Note that we did not introduce specific discrete time versions of the output matrix $\mathbf{C}$ and the feedthrough matrix $\mathbf{D}$. This is due to the fact that these matrices do not change when being transferred from the continuous to the discrete time version, because the output equation is algebraic. For the non‐algebraic vector state space equation in the continuous form, we are looking for a discrete representation that is exact at the sample points. Therefore, it is straightforward to solve the vector state differential equation first, and then to discretize the (algebraic) solution to obtain the vector difference equation we are looking for. The general solution of the vector state equation was given in equation (4-18). It is straightforward to discretize this equation for $t = kT$ and $t = (k+1)T$ to obtain:

$$\begin{aligned}
\mathbf{x}(k) &= e^{\mathbf{A} k T}\, \mathbf{x}_0 + \int_0^{kT} e^{\mathbf{A}(kT - \tau)}\, \mathbf{B}\, \mathbf{u}(\tau)\, \mathrm{d}\tau, \\
\mathbf{x}(k+1) &= e^{\mathbf{A}(k+1)T}\, \mathbf{x}_0 + \int_0^{(k+1)T} e^{\mathbf{A}((k+1)T - \tau)}\, \mathbf{B}\, \mathbf{u}(\tau)\, \mathrm{d}\tau.
\end{aligned} \tag{4-47}$$
For what follows, we assume that the continuous input signals are discretized by a zero order hold, that is, the following relation holds:

$$\mathbf{u}(t) = \mathbf{u}(k) = \text{const.}, \quad kT \le t < (k+1)T. \tag{4-48}$$
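By splitting the integral in the above equation for $\mathbf{x}((k+1)T)$ into two parts and factoring out the term $e^{\mathbf{A}T}$ in the first part, we find an expression in which we can replace the term in round brackets by $\mathbf{x}(kT)$ according to equation (4-47). We obtain:

$$\begin{aligned}
\mathbf{x}(k+1) &= e^{\mathbf{A}T} \left( e^{\mathbf{A}kT}\, \mathbf{x}_0 + \int_0^{kT} e^{\mathbf{A}(kT - \tau)}\, \mathbf{B}\, \mathbf{u}(\tau)\, \mathrm{d}\tau \right) + \int_{kT}^{(k+1)T} e^{\mathbf{A}((k+1)T - \tau)}\, \mathbf{B}\, \mathbf{u}(k)\, \mathrm{d}\tau \\
&= e^{\mathbf{A}T}\, \mathbf{x}(k) + \int_{kT}^{(k+1)T} e^{\mathbf{A}((k+1)T - \tau)}\, \mathbf{B}\, \mathbf{u}(k)\, \mathrm{d}\tau.
\end{aligned} \tag{4-49}$$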
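The integral in the last equation expresses the contribution of the (constant) input signal during one time step. We can transform the integral by employing the substitution $\theta = (k+1)T - \tau$, which results in $\mathrm{d}\theta = -\mathrm{d}\tau$:

$$\int_{kT}^{(k+1)T} e^{\mathbf{A}((k+1)T - \tau)}\, \mathbf{B}\, \mathbf{u}(k)\, \mathrm{d}\tau = \int_0^T e^{\mathbf{A}\theta}\, \mathrm{d}\theta\; \mathbf{B}\, \mathbf{u}(k). \tag{4-50}$$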
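By this, we finally obtain:

$$\mathbf{x}(k+1) = e^{\mathbf{A}T}\, \mathbf{x}(k) + \int_0^T e^{\mathbf{A}\theta}\, \mathrm{d}\theta\; \mathbf{B}\, \mathbf{u}(k). \tag{4-51}$$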
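By comparing this equation with the discrete vector state equation in (4-46), we can find the following equations for the discrete time system and input matrices:

$$\mathbf{A}_d = e^{\mathbf{A}T}; \quad \mathbf{B}_d = \int_0^T e^{\mathbf{A}\theta}\, \mathrm{d}\theta\; \mathbf{B}. \tag{4-52}$$

If the continuous system matrix fulfills the condition $\det \mathbf{A} \neq 0$, the expression for $\mathbf{B}_d$ simplifies to

$$\mathbf{B}_d = \mathbf{A}^{-1} \left( \mathbf{A}_d - \mathbf{I} \right) \mathbf{B}. \tag{4-53}$$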
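The simplification for invertible $\mathbf{A}$ is stated without an intermediate step; one possible way to see it, sketched here as a reading aid rather than as the author's own derivation, uses the fact that $\frac{\mathrm{d}}{\mathrm{d}\theta} e^{\mathbf{A}\theta} = \mathbf{A}\, e^{\mathbf{A}\theta}$, so that $\mathbf{A}^{-1} e^{\mathbf{A}\theta}$ is an antiderivative of $e^{\mathbf{A}\theta}$:

$$\int_0^T e^{\mathbf{A}\theta}\, \mathrm{d}\theta = \mathbf{A}^{-1} \Big[ e^{\mathbf{A}\theta} \Big]_0^T = \mathbf{A}^{-1} \left( e^{\mathbf{A}T} - \mathbf{I} \right) = \mathbf{A}^{-1} \left( \mathbf{A}_d - \mathbf{I} \right).$$

If $\mathbf{A}$ is singular, the integral can still be evaluated term by term from the series of the matrix exponential:

$$\int_0^T e^{\mathbf{A}\theta}\, \mathrm{d}\theta = \sum_{i=0}^{\infty} \mathbf{A}^i \frac{T^{i+1}}{(i+1)!} = \mathbf{I}\,T + \mathbf{A} \frac{T^2}{2!} + \mathbf{A}^2 \frac{T^3}{3!} + \dots$$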
4.1.3.3 Comparison of The Discussed Approaches Using An Example
We discuss the time discretization in an example, adapted from Ament and Glotzbach, 2016b. We look at a one‐dimensional mechanical system consisting of a mass $m$, a linear damper with a viscous damping coefficient $d$, and a spring with a spring constant $c$, as displayed in Figure 4‐14. In this example, the right loose end of the spring can be moved, and the distance of movement serves as input variable $u(t)$. The position of the mass is denoted as output $y(t)$. Employing some basic mechanical knowledge, the forces $F_c(t)$ and $F_d(t)$ of spring and damper acting on the mass in the direction of positive input values can be computed to be:

$$F_c(t) = c\, \big( u(t) - y(t) \big); \quad F_d(t) = -d\, \dot{y}(t). \tag{4-54}$$
Figure 4‐14: Mechanical system
Employing Newton’s second law of motion gives:

$$m\, \ddot{y}(t) = c\, \big( u(t) - y(t) \big) - d\, \dot{y}(t). \tag{4-55}$$
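By this, we have a continuous‐time model of the system. In what follows, we will show a time discretization, using the two discussed approaches. We will use a sample time $T = 1\,\text{s}$, and will further let $m = 2\,\text{kg}$, $c = 0.25\,\text{N/m}$, and $d = 0.5\,\text{N s/m}$. The first discussed approach was the employment of difference quotients. We can replace the derivatives in the ODE we have obtained according to equations (4-43) and (4-44):

$$\begin{aligned}
& m\, \frac{y(k+1) - 2\,y(k) + y(k-1)}{T^2} = c\, \big( u(k) - y(k) \big) - d\, \frac{y(k+1) - y(k)}{T} \\
\Rightarrow\; & y(k+1) = \frac{T^2}{m + dT} \left[ \left( \frac{2m}{T^2} + \frac{d}{T} - c \right) y(k) - \frac{m}{T^2}\, y(k-1) + c\, u(k) \right], \\
\Rightarrow\; & y(k+1) = 0.4\, \big[ 4.25\, y(k) - 2\, y(k-1) + 0.25\, u(k) \big].
\end{aligned} \tag{4-56}$$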
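The difference equation obtained from the difference quotients, $y(k+1) = \frac{T^2}{m + dT}\big[(\frac{2m}{T^2} + \frac{d}{T} - c)\,y(k) - \frac{m}{T^2}\,y(k-1) + c\,u(k)\big]$, can be iterated directly. The sketch below is plain Python; the function name `simulate_difference_model`, the choice of 200 steps, and the initial conditions $y(-1) = y(0) = 0$ (system at rest) are assumptions made for illustration and are not specified in the text.

```python
def simulate_difference_model(u, m=2.0, c=0.25, d=0.5, T=1.0):
    """Iterate the difference-quotient model of the mass-spring-damper system.

    u: input samples u(0), u(1), ...
    Returns the output samples y(0), y(1), ..., y(len(u)).
    Assumption: the system starts at rest, y(-1) = y(0) = 0.
    """
    gain = T ** 2 / (m + d * T)
    k_curr = 2.0 * m / T ** 2 + d / T - c   # coefficient of y(k)
    k_prev = m / T ** 2                     # coefficient of y(k-1)
    y_prev, y_curr = 0.0, 0.0
    y = [y_curr]
    for u_k in u:
        y_next = gain * (k_curr * y_curr - k_prev * y_prev + c * u_k)
        y.append(y_next)
        y_prev, y_curr = y_curr, y_next
    return y


# Unit step input; 200 samples are more than enough for the response to settle.
y = simulate_difference_model([1.0] * 200)
first_step = y[1]    # 0.4 * 0.25 for the default parameters
final_value = y[-1]  # settles at the static gain of the system, which is 1
```

With the default parameters the bracket coefficients are 4.25, 2 and 0.25 and the leading factor is 0.4, matching the numerical form of the difference equation. The final value approaches 1 because in steady state the spring force vanishes only if the mass position equals the input displacement.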
For the second approach, we can transfer the ODE into the state space representation, introducing a state vector $\mathbf{x} = [\,y(t) \;\; v(t)\,]^T$, where $v(t) = \dot{y}(t)$ represents the velocity of the mass. From the above ODE, we can see that

$$\dot{v}(t) = -\frac{c}{m}\, y(t) - \frac{d}{m}\, \dot{y}(t) + \frac{c}{m}\, u(t), \tag{4-57}$$
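which leads to the following state space representation:

$$\dot{\mathbf{x}}(t) = \underbrace{\begin{bmatrix} 0 & 1 \\ -\dfrac{c}{m} & -\dfrac{d}{m} \end{bmatrix}}_{\mathbf{A}} \mathbf{x}(t) + \underbrace{\begin{bmatrix} 0 \\ \dfrac{c}{m} \end{bmatrix}}_{\mathbf{b}} u(t), \tag{4-58}$$

$$y(t) = \underbrace{\begin{bmatrix} 1 & 0 \end{bmatrix}}_{\mathbf{c}^T} \mathbf{x}(t). \tag{4-59}$$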
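Note that the system is time‐invariant, as all matrices are constant. Time discretization leads to a model according to equation (4-46) with the following matrix and vector:

$$\mathbf{A}_d = e^{\mathbf{A}T} = \begin{bmatrix} 0.943 & 0.867 \\ -0.108 & 0.726 \end{bmatrix}; \quad \mathbf{b}_d = \mathbf{A}^{-1} \left( \mathbf{A}_d - \mathbf{I} \right) \mathbf{b} = \begin{bmatrix} 0.057 \\ 0.108 \end{bmatrix}. \tag{4-60}$$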
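The discrete-time matrices of the example can be reproduced numerically. The following sketch in plain Python evaluates $\mathbf{A}_d = e^{\mathbf{A}T}$ and $\mathbf{B}_d = \int_0^T e^{\mathbf{A}\theta}\,\mathrm{d}\theta\,\mathbf{B}$ through truncated power series instead of the closed form with $\mathbf{A}^{-1}$; this series approach, the function names, and the truncation after 30 terms are choices made here for illustration, not the procedure used in the original work. The results should agree with the rounded values quoted for the example to within about one unit in the third decimal place.

```python
def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def discretize_zoh(A, B, T, terms=30):
    """Exact discretization under a zero-order hold via truncated series.

    A_d          = sum_i A^i T^i / i!
    integral     = sum_i A^i T^(i+1) / (i+1)!
    B_d          = integral * B
    """
    n = len(A)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    A_d = [row[:] for row in identity]
    integral = [[v * T for v in row] for row in identity]
    term = [row[:] for row in identity]  # holds A^i T^i / i!
    for i in range(1, terms):
        term = [[v * T / i for v in row] for row in mat_mul(term, A)]
        A_d = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(A_d, term)]
        integral = [[p + q * T / (i + 1) for p, q in zip(r1, r2)]
                    for r1, r2 in zip(integral, term)]
    return A_d, mat_mul(integral, B)


# Parameters of the mass-spring-damper example.
m, c, d, T = 2.0, 0.25, 0.5, 1.0
A = [[0.0, 1.0], [-c / m, -d / m]]
B = [[0.0], [c / m]]
A_d, B_d = discretize_zoh(A, B, T)
```

Because the series for the integral does not require $\mathbf{A}^{-1}$, the same function also covers systems with a singular system matrix, for example a pure integrator.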
The continuous‐time system and both discrete time models have been simulated with $u(t) = \sigma(t)$, where $\sigma(t)$ represents the unit step function. The results are shown in Figure 4‐15. It becomes clear that the model which was discretized employing the state space representation (bright solid line) is more precise, as it matches the results of the continuous‐time system exactly at the sampling instants. The other model (dashed line) exhibits a larger inaccuracy, which would grow even further for larger sample times due to the employed difference quotients.
Figure 4‐15: Step responses of the continuous‐time system and the two derived discrete‐time models
In the chapters 5‐7, we will mainly employ the discrete state space representation according to equation (4‐46). Usually, the state space models for the movement of the marine robots will directly be formulated in the discrete time domain. Nevertheless, with the discussions given so far, any continuous LTI system can be transferred into the discrete state space representation. For the sake of simplicity, we will discard the subscripts 𝑑 in the system and input matrices, as long as it is clear that all descriptions are given in the discrete time domain.
4.2 Evaluation of Observability in State Space
Summing up the discussions on the state space representation made so far, and recapitulating the problem formulation from section 3.2, it already becomes obvious that the state space representation might be suitable as a basis for the solution of the given problems. It allows us to employ models of the marine robots, where those parameters of the navigation data vectors 𝛈 and 𝐯 that are of interest in a certain scenario can be used as states. For scenarios that belong to the group of Internal Navigation problems according to the definition in section 3.1, information is available about the control commands of the actuators, which might enable us to employ forces and moments as inputs for dynamical models, or velocities as inputs for kinematic ones. In scenarios related to External Navigation, we will usually not have access to detailed control commands of the robot; however, we might still be able to consider some basic limitations in the ability of a robot to change its speed or course angle within a certain time frame. Finally, the measurable outputs, like distances in an LBL scenario or angles in a USBL scenario, can be derived as output data from the states. In this case, the task would be to determine the states, based on knowledge of the input and output values. The theory of the state space representation offers several tools especially for this task. However, systems exist for which the task is not solvable. This gives rise to the observability analysis. Within this section, we will at first introduce the term ‘observability’ for linear systems. In section 4.2.1, this term will be defined and compared with the similar concept of ‘controllability’, and it will be discussed how the observability of a system can be checked. In section 4.2.2, we will describe a possible linear observer for the described task. However, the observer theory is limited to deterministic signal considerations. In the described navigation systems, we need to be able to take measurement disturbances of stochastic behavior into consideration. 
Also, especially for External Navigation, we need to be able to express uncertainties in the behavior of the system, namely in the concrete input signals. To this end, we will need another possibility to incorporate stochastic behavior. This is the reason why the classical observer design is not the method of choice for us. We will continue to discuss the observability analysis for nonlinear systems in section 4.2.3, but for the reasons just stated we will not discuss nonlinear observers, as we need a different methodology. Nevertheless, the observability theory might contribute to the stated problems 4 – 6 in terms of Optimal Sensor Placement (OSP). For these tasks, we are looking for solutions in terms of placement and/or trajectories for sensors that result in optimal situations for the estimation of selected states. However, as we will see in sections 4.2.1 and 4.2.3, the classical observability theory is not suited to qualitatively evaluate and to compare the ‘level of observability’ of different systems. Instead, it is only possible to determine whether a system is observable or not. This is the reason why we need to apply a different approach. To this end, we introduce the Gramian matrix for linear systems in section 4.2.4, and will extend the concept to nonlinear systems in section 4.2.4.2. The empirical Gramian will later be our method of choice for the enhanced OSP scenarios. In section 4.3, we will then bring stochastics into the equation in order to develop tools for the estimation of relevant navigation data. The similarities and differences between the observer theory and the filter theory will become clear that way.
4.2.1 Observability and Controllability of Linear Systems
The introduction of the state space representation has enhanced the possibilities to describe complex system structures in great detail. In comparison to the classical control loop, this also requires new methods for the analysis of systems. In classical control theory, systems are described by their input/output relation; thus it is usually assumed that the control variable as output of the plant can be influenced by the actuating variable, and that the current value of the control variable is available to be compared with the reference variable.
4.2.1.1 Observability and Its Evaluation
In the state space representation, the relevant variables to be controlled and/or observed might be part of the state vector. It cannot be taken for granted that they can always be influenced by the inputs, or that they can be determined by knowing the inputs and outputs over a certain period of time. In fact, these two properties are important characteristics of every system in state space representation. They are referred to as controllability and observability and have been introduced by Kalman, 1960b. For our discussion, mainly the observability is of interest. In what follows, we will discuss these terms based on continuous time systems. Nevertheless, the statements are also valid for discrete time systems, as long as the employed times $t$ are replaced by their discrete equivalents $kT$. The following definition is based on Unbehauen, 2009, and Golnaraghi and Kuo, 2010:
Definition: Observability
Given a system with a state space representation according to equation (4-14), the system is referred to as completely observable, or simply observable, if for any given input $\mathbf{u}(t)$ there exists a finite time $t_1 > t_0$ such that the knowledge of $\mathbf{u}(t)$ and $\mathbf{y}(t)$, both for $t_0 \le t \le t_1$, as well as of the matrices $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, and $\mathbf{D}$ allows for the unique determination of $\mathbf{x}_0 = \mathbf{x}(t_0)$.
If a system is observable, it is possible to design an observer. The procedures will be discussed in section 4.2.2. To check whether a system is observable, it is possible to use the observability criterion of Kalman:
Theorem: Observability criterion of Kalman
Given a system in state space representation of order $n$ with $r$ outputs, the system is observable if and only if the observability matrix $\mathbf{S}_O$, defined as

$$\mathbf{S}_O := \begin{bmatrix} \mathbf{C} \\ \mathbf{C}\mathbf{A} \\ \mathbf{C}\mathbf{A}^2 \\ \vdots \\ \mathbf{C}\mathbf{A}^{n-1} \end{bmatrix} \in \mathbb{R}^{(r \cdot n) \times n}, \tag{4-61}$$

has rank $n$, that is:

$$\operatorname{rank} \mathbf{S}_O = n. \tag{4-62}$$
In this case, also the pair 𝐀, 𝐂 is said to be observable. For systems with only one output, 𝐒 is a squared matrix. In this case, the condition for observability is fulfilled if and only if the matrix is nonsingular: det 𝐒
0 .
(4‐63)
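The rank test of the Kalman criterion can be sketched numerically as follows. This is a minimal illustration, assuming NumPy is available; the function names `observability_matrix` and `is_observable` and the demo matrices are hypothetical and not taken from the text.

```python
import numpy as np


def observability_matrix(A, C):
    """Stack C, CA, ..., CA^(n-1) row-wise, as in the Kalman criterion."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    C = np.atleast_2d(np.asarray(C, dtype=float))
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)  # next block row: previous block times A
    return np.vstack(blocks)


def is_observable(A, C):
    """Rank test: observable iff the observability matrix has rank n."""
    S = observability_matrix(A, C)
    return bool(np.linalg.matrix_rank(S) == S.shape[1])


# Illustrative values only (assumed for this sketch, not from the text).
A_demo = np.array([[0.0, 1.0], [-2.0, -3.0]])
C_demo = np.array([[1.0, 0.0]])
S_demo = observability_matrix(A_demo, C_demo)
print(S_demo)
print("observable:", is_observable(A_demo, C_demo))
```

Note that `np.linalg.matrix_rank` decides the rank with a numerical tolerance, so for badly scaled systems the result should be read with some care.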
In what follows, we will prove this theorem, following the discussion in Lunze, 2010. To this end, we look at the solution of the vector state space equation, as given in equation (4‐18). According to equation (4‐14) b), it is straightforward to compute $\mathbf{y}(t)$ by multiplying (4‐18) by $\mathbf{C}$ and adding the term $\mathbf{D}\,\mathbf{u}(t)$:
4. Mathematical Tools Used From the Areas of Control and Systems Engineering
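$$\mathbf{y}(t) = \mathbf{C}\, e^{\mathbf{A} t}\, \mathbf{x}_0 + \int_0^t \mathbf{C}\, e^{\mathbf{A}(t-\tau)}\, \mathbf{B}\, \mathbf{u}(\tau)\, \mathrm{d}\tau + \mathbf{D}\, \mathbf{u}(t) . \tag{4-64}$$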
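Looking at this equation, it is obvious that the values of $\mathbf{y}(t)$ are influenced by two different causes: the forced motion $\mathbf{y}_\mathrm{forced}(t)$ of the system, caused by the input which is applied from 'outside', and the free motion $\mathbf{y}_\mathrm{free}(t)$, which is caused by the initial internal states $\mathbf{x}_0$ at time $t = 0$. Therefore we can write:

$$\begin{aligned}
\mathbf{y}(t) &= \mathbf{y}_\mathrm{free}(t) + \mathbf{y}_\mathrm{forced}(t) , \\
\mathbf{y}_\mathrm{free}(t) &= \mathbf{C}\, e^{\mathbf{A} t}\, \mathbf{x}_0 ; \quad \mathbf{y}_\mathrm{forced}(t) = \int_0^t \mathbf{C}\, e^{\mathbf{A}(t-\tau)}\, \mathbf{B}\, \mathbf{u}(\tau)\, \mathrm{d}\tau + \mathbf{D}\, \mathbf{u}(t) ; \\
\mathbf{y}_\mathrm{free}(t) &= \mathbf{y}(t) - \int_0^t \mathbf{C}\, e^{\mathbf{A}(t-\tau)}\, \mathbf{B}\, \mathbf{u}(\tau)\, \mathrm{d}\tau - \mathbf{D}\, \mathbf{u}(t) .
\end{aligned} \tag{4-65}$$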
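We can see from the last equation: By knowing $\mathbf{u}(t)$ and $\mathbf{y}(t)$, both for $t_0 \le t \le t_1$, as well as the matrices $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, and $\mathbf{D}$, we can compute $\mathbf{y}_\mathrm{free}(t)$. Therefore, the system is observable if and only if the equation

$$\mathbf{y}_\mathrm{free}(t) = \mathbf{C}\, e^{\mathbf{A} t}\, \mathbf{x}_0 \tag{4-66}$$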
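can be solved for $\mathbf{x}_0$. This proves that neither the matrices $\mathbf{B}$, $\mathbf{D}$ nor the concrete course of the input signal have any influence on the observability of a linear system. As $\mathbf{x}_0$ contains $n$ unknown variables, we need at least the same number of independent equations. The vector equation (4‐66) gives us $r$ single equations, but usually the relation $r < n$ holds for real systems. Nevertheless, the knowledge of $\mathbf{y}_\mathrm{free}(t)$ over an arbitrarily long interval allows us to use equation (4‐66) at $n$ different points in time. If we look for instance at a system with a single output, which is characterized by its system matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ and output vector $\mathbf{c}^\mathrm{T} \in \mathbb{R}^{1 \times n}$, we can write equation (4‐66) in the following form:

$$\begin{pmatrix} y_\mathrm{free}(t_1) \\ y_\mathrm{free}(t_2) \\ \vdots \\ y_\mathrm{free}(t_n) \end{pmatrix} = \mathbf{M}\, \mathbf{x}_0 ; \quad \mathbf{M} = \begin{pmatrix} \mathbf{c}^\mathrm{T} e^{\mathbf{A} t_1} \\ \mathbf{c}^\mathrm{T} e^{\mathbf{A} t_2} \\ \vdots \\ \mathbf{c}^\mathrm{T} e^{\mathbf{A} t_n} \end{pmatrix} . \tag{4-67}$$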
Now the system is observable if the $n$ points in time can be selected in a way that matrix $\mathbf{M}$ is invertible, that is, the following condition must hold:

$$\operatorname{rank} \mathbf{M} = n . \tag{4-68}$$
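The idea of evaluating the free response at $n$ points in time can be sketched numerically. The following is a minimal illustration, assuming NumPy and a diagonalizable system matrix, so that the matrix exponential can be formed from an eigendecomposition; the helper names, the matrices and the sample times are hypothetical choices for this sketch.

```python
import numpy as np


def expm_diagonalizable(A, t):
    """Matrix exponential e^(A t) via eigendecomposition (A must be diagonalizable)."""
    eigvals, V = np.linalg.eig(A)
    return (V @ np.diag(np.exp(eigvals * t)) @ np.linalg.inv(V)).real


def recover_initial_state(A, c, times, y_free):
    """Build M row by row from c^T e^(A t_i) and solve M x0 = y_free for x0."""
    M = np.vstack([c @ expm_diagonalizable(A, t) for t in times])
    return np.linalg.solve(M, y_free), M


# Illustrative single-output system (values assumed for this sketch).
A_demo = np.array([[-1.0, 0.0], [0.0, -2.0]])
c_demo = np.array([1.0, 1.0])
x0_true = np.array([2.0, -1.0])
times = [0.0, 0.5]  # n = 2 sample times

# Free response samples y_free(t_i) = c^T e^(A t_i) x0
y_samples = np.array(
    [c_demo @ expm_diagonalizable(A_demo, t) @ x0_true for t in times]
)
x0_est, M = recover_initial_state(A_demo, c_demo, times, y_samples)
print("rank M:", np.linalg.matrix_rank(M))
print("estimated x0:", x0_est)
```

If the two states shared the same eigenvalue with this output vector, the rows of `M` would become linearly dependent for every choice of sample times and `np.linalg.solve` would fail, which mirrors the rank condition above.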
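Every row of $\mathbf{M}$ contains the expression $\mathbf{c}^\mathrm{T} e^{\mathbf{A} t}$. Using the series expansion of the exponential function which was introduced in equation (4‐19), this expression can be written as an infinite sum:

$$\mathbf{c}^\mathrm{T} e^{\mathbf{A} t} = \mathbf{c}^\mathrm{T} + \mathbf{c}^\mathrm{T} \mathbf{A}\, t + \mathbf{c}^\mathrm{T} \mathbf{A}^2\, \frac{t^2}{2!} + \mathbf{c}^\mathrm{T} \mathbf{A}^3\, \frac{t^3}{3!} + \cdots \tag{4-69}$$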
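According to the Cayley‐Hamilton theorem, every square matrix satisfies its own characteristic equation. That means, for the polynomial $P(\mathbf{A})$, which is obtained from the characteristic polynomial of $\mathbf{A}$ by inserting $\mathbf{A}$ instead of $\lambda$, it holds true that

$$P(\mathbf{A}) = \mathbf{A}^n + a_{n-1} \mathbf{A}^{n-1} + \cdots + a_1 \mathbf{A}^1 + a_0 \mathbf{A}^0 = \mathbf{0} . \tag{4-70}$$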
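The Cayley‐Hamilton statement can be checked numerically. The sketch below assumes NumPy; `np.poly` returns the coefficients of the characteristic polynomial of a square matrix, and the demo matrix is an arbitrary illustrative choice with real eigenvalues.

```python
import numpy as np


def char_poly_at_matrix(A):
    """Evaluate P(A) = A^n + a_(n-1) A^(n-1) + ... + a_0 I with Horner's scheme."""
    A = np.asarray(A, dtype=float)
    coeffs = np.poly(A)  # [1, a_(n-1), ..., a_0] of det(lambda*I - A)
    result = np.zeros_like(A)
    for a in coeffs:
        result = result @ A + a * np.eye(A.shape[0])
    return result, coeffs


# Illustrative upper triangular matrix (assumed values, eigenvalues -1, -2, -3).
A_demo = np.array([[-1.0, 1.0, 0.0],
                   [0.0, -2.0, 1.0],
                   [0.0, 0.0, -3.0]])
P_of_A, coeffs = char_poly_at_matrix(A_demo)
print("max |P(A)|:", np.abs(P_of_A).max())

# Solving P(A) = 0 for A^n expresses it through the lower powers A^0 ... A^(n-1).
n = A_demo.shape[0]
A_n_from_lower = -sum(
    coeffs[n - k] * np.linalg.matrix_power(A_demo, k) for k in range(n)
)
print(np.allclose(A_n_from_lower, np.linalg.matrix_power(A_demo, n)))
```

The second part illustrates why powers of $\mathbf{A}$ from $n$ upwards add no new directions: each of them is a linear combination of the first $n$ powers.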
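As this equation can be solved for $\mathbf{A}^n$, it follows that $\mathbf{A}^n$ and any higher power of $\mathbf{A}$ can be computed as linear combinations of the powers $\mathbf{A}^0, \mathbf{A}^1, \mathbf{A}^2, \ldots, \mathbf{A}^{n-1}$. As a consequence, the infinite sum in equation (4‐69) can be written as a sum with a finite number of summands in the following form:

$$\mathbf{c}^\mathrm{T} e^{\mathbf{A} t} = c_0(t)\, \mathbf{c}^\mathrm{T} + c_1(t)\, \mathbf{c}^\mathrm{T} \mathbf{A} + c_2(t)\, \mathbf{c}^\mathrm{T} \mathbf{A}^2 + \cdots + c_{n-1}(t)\, \mathbf{c}^\mathrm{T} \mathbf{A}^{n-1} , \tag{4-71}$$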
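where $c_j(t)$, $j = 0, 1, \ldots, n-1$, are functions of time $t$. Hence, every row in $\mathbf{M}$ is a linear combination of the row vectors

$$\mathbf{c}^\mathrm{T},\ \mathbf{c}^\mathrm{T} \mathbf{A},\ \mathbf{c}^\mathrm{T} \mathbf{A}^2,\ \ldots,\ \mathbf{c}^\mathrm{T} \mathbf{A}^{n-1} . \tag{4-72}$$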
For $\mathbf{M}$ to exhibit the rank $n$, these row vectors have to be linearly independent. This is exactly what is evaluated in the observability criterion, namely in equations (4‐61) and (4‐62). This proves the observability criterion of Kalman for single output systems. If a system possesses $r$ outputs, it can be shown that every row vector in $\mathbf{M} \in \mathbb{R}^{(r \cdot n) \times n}$ is a linear combination of

$$\mathbf{c}_k^\mathrm{T},\ \mathbf{c}_k^\mathrm{T} \mathbf{A},\ \mathbf{c}_k^\mathrm{T} \mathbf{A}^2,\ \ldots,\ \mathbf{c}_k^\mathrm{T} \mathbf{A}^{n-1} , \quad k = 1, 2, \ldots, r , \tag{4-73}$$
where $\mathbf{c}_k^\mathrm{T}$ equals the $k$th row of the output matrix $\mathbf{C}$. This shows that the above mentioned theorem is also valid for multi output systems.

4.2.1.2 Controllability and Duality to Observability

Having introduced observability, we will now briefly discuss controllability. As mentioned before, we will not need this concept in our further discussions. However, there are some interesting aspects in the relation between observability and controllability. The following definition is again based on Unbehauen, 2009, and Golnaraghi and Kuo, 2010:

Definition: Controllability
Given a system in state space representation, the system is referred to as completely state controllable, or simply controllable, if there exists a control input $\mathbf{u}(t)$ that will drive any state from the initial value $\mathbf{x}_0 = \mathbf{x}(t_0)$ at $t_0$ to any final state $\mathbf{x}(t_1)$ in a finite time $t_1 - t_0 \ge 0$.

As before, there is a controllability criterion of Kalman which can be used to evaluate controllability. It is similar to the one stated before for the observability evaluation, and it can be proven in a similar way. The proof is omitted here.
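Theorem: Controllability criterion of Kalman
Given a system in state space representation of order $n$ with $m$ inputs, the system is controllable if and only if the controllability matrix $\mathbf{S}_\mathrm{C}$, defined as

$$\mathbf{S}_\mathrm{C} := \begin{bmatrix} \mathbf{B} & \mathbf{A}\mathbf{B} & \mathbf{A}^2\mathbf{B} & \cdots & \mathbf{A}^{n-1}\mathbf{B} \end{bmatrix} \in \mathbb{R}^{n \times (n \cdot m)} , \tag{4-74}$$

has the rank $n$:

$$\operatorname{rank} \mathbf{S}_\mathrm{C} = n . \tag{4-75}$$

In this case, the pair $(\mathbf{A}, \mathbf{B})$ is also said to be controllable. For systems with only one input, $\mathbf{S}_\mathrm{C}$ is a square matrix. In this case, the condition for controllability is fulfilled if and only if the matrix is nonsingular:

$$\det \mathbf{S}_\mathrm{C} \neq 0 . \tag{4-76}$$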
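Analogously to the observability test, the controllability criterion can be sketched in a few lines. This is a minimal illustration assuming NumPy and the standard block layout B, AB, ..., A^(n-1)B; function names and demo values are hypothetical.

```python
import numpy as np


def controllability_matrix(A, B):
    """Place B, AB, ..., A^(n-1)B side by side (standard Kalman layout)."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    B = np.asarray(B, dtype=float)
    if B.ndim == 1:
        B = B.reshape(-1, 1)  # single input given as a plain vector
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])  # next block column: A times previous block
    return np.hstack(blocks)


def is_controllable(A, B):
    """Rank test: controllable iff the controllability matrix has rank n."""
    S = controllability_matrix(A, B)
    return bool(np.linalg.matrix_rank(S) == S.shape[0])


# Illustrative single-input systems (values assumed for this sketch).
A_demo = np.array([[0.0, 1.0], [-2.0, -3.0]])
b_demo = np.array([0.0, 1.0])
S_demo = controllability_matrix(A_demo, b_demo)
print(S_demo, "det =", np.linalg.det(S_demo))
print(is_controllable(A_demo, b_demo))
# A diagonal system whose second state is not reached by the input:
print(is_controllable(np.diag([-1.0, -2.0]), [1.0, 0.0]))
```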
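Looking at the discussions above, it becomes clear that observability and controllability are related to each other. In fact, these properties are dual to each other. As stated by Lunze, 2010, we can look at the systems 1 and 2 described by the following equations:

$$\begin{aligned}
\text{System 1:} \quad \dot{\mathbf{x}}_1(t) &= \mathbf{A}\, \mathbf{x}_1(t) + \mathbf{B}\, \mathbf{u}_1(t) , & \mathbf{x}_1(0) &= \mathbf{x}_{1,0} , & \mathbf{y}_1(t) &= \mathbf{C}\, \mathbf{x}_1(t) , \\
\text{System 2:} \quad \dot{\mathbf{x}}_2(t) &= \mathbf{A}^\mathrm{T} \mathbf{x}_2(t) + \mathbf{C}^\mathrm{T} \mathbf{u}_2(t) , & \mathbf{x}_2(0) &= \mathbf{x}_{2,0} , & \mathbf{y}_2(t) &= \mathbf{B}^\mathrm{T} \mathbf{x}_2(t) .
\end{aligned} \tag{4-77}$$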
It can be stated that system 2 is derived from system 1 by transposing all matrices and by exchanging input and output matrix. In this case, the following statement holds true: System 2 is observable if and only if system 1 is controllable, and system 2 is controllable if and only if system 1 is observable.

4.2.1.3 Examples for Evaluation of Observability

Now that we have clearly defined the terms observability and controllability in a formal way, we will look at some examples to deepen the understanding of their meaning. Figure 4‐16, which was inspired by Unbehauen, 2009, and Golnaraghi and Kuo, 2010, shows the block diagram of a 4th order MISO system which is split into subsystems, one for every state. Let the system matrix be a diagonal matrix, that is, all elements outside of the main diagonal are zero. Let us further assume that the elements on the main diagonal are different from zero and mutually different. The output vector $\mathbf{c}^\mathrm{T}$ equals $[c_1 \ 0 \ c_3 \ 0]$. Due to the described structure of the system matrix, it can be concluded that the states do not influence each other. For this reason, we can easily see from Figure 4‐16: Subsystem 1 is influenced by the inputs; furthermore, it has an influence on the output. Therefore, it is both controllable and observable. Subsystem 2 is controllable; however, as it has no influence on the output or on any other state, it is unobservable. Note that this also limits the possibilities to design a closed‐loop control for this subsystem, even though it is controllable. Subsystem 3, on the other hand, is observable, but not controllable; subsystem 4 is both uncontrollable and unobservable.
Figure 4‐16: System split into subsystems to demonstrate (un)observable/ (un)controllable parts
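Finally, let us look at the autonomous system of 2nd order with the following structure:

$$\begin{aligned}
\dot{\mathbf{x}}(t) &= \mathbf{A}\, \mathbf{x}(t) = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} , \quad \mathbf{x}_0 = \begin{bmatrix} x_{1,0} \\ x_{2,0} \end{bmatrix} , \\
y(t) &= \mathbf{c}^\mathrm{T} \mathbf{x}(t) = \begin{bmatrix} c_1 & c_2 \end{bmatrix} \mathbf{x}(t) .
\end{aligned} \tag{4-78}$$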
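We will look at different values for the parameters in the system and output matrices, compute the associated observability matrix, according to

$$\mathbf{S}_\mathrm{O} = \begin{bmatrix} \mathbf{c}^\mathrm{T} \\ \mathbf{c}^\mathrm{T} \mathbf{A} \end{bmatrix} , \tag{4-79}$$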
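and determine the observability of the system by checking whether the condition $\det \mathbf{S}_\mathrm{O} \neq 0$ is fulfilled. For the first case, we look at the following matrices:

$$\mathbf{A} = \begin{bmatrix} -1 & 0 \\ 0 & -2 \end{bmatrix} ; \quad \mathbf{c}^\mathrm{T} = \begin{bmatrix} 1 & 0 \end{bmatrix} \ \Rightarrow\ \mathbf{S}_\mathrm{O} = \begin{bmatrix} 1 & 0 \\ -1 & 0 \end{bmatrix} ; \quad \det \mathbf{S}_\mathrm{O} = 0 . \tag{4-80}$$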
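It is straightforward to see that within this system, state 2 is unobservable. It influences neither the output nor state 1, so its initial value cannot be determined. With a small change in $\mathbf{c}^\mathrm{T}$, we obtain

$$\mathbf{A} = \begin{bmatrix} -1 & 0 \\ 0 & -2 \end{bmatrix} ; \quad \mathbf{c}^\mathrm{T} = \begin{bmatrix} 1 & 1 \end{bmatrix} \ \Rightarrow\ \mathbf{S}_\mathrm{O} = \begin{bmatrix} 1 & 1 \\ -1 & -2 \end{bmatrix} ; \quad \det \mathbf{S}_\mathrm{O} = -1 . \tag{4-81}$$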
This system is observable, as both states have an influence on the output. If we look at the system

$$\mathbf{A} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} ; \quad \mathbf{c}^\mathrm{T} = \begin{bmatrix} 1 & 2 \end{bmatrix} \ \Rightarrow\ \mathbf{S}_\mathrm{O} = \begin{bmatrix} 1 & 2 \\ -1 & -2 \end{bmatrix} ; \quad \det \mathbf{S}_\mathrm{O} = 0 , \tag{4-82}$$

we can see that it is not observable. This might come as a surprise at first sight, as both states have an influence on the output, even with different parameters in $\mathbf{c}^\mathrm{T}$. However, as both states exhibit the same dynamical behaviour due to matrix $\mathbf{A}$, they are not uniquely distinguishable. Figure 4‐17 shows the course of both states and the output for two different initial conditions. In the upper case, $\mathbf{x}_0$ was set to $[1 \ 2]^\mathrm{T}$, while in the lower case, $\mathbf{x}_0 = [3 \ 1]^\mathrm{T}$ was used. It is easy to see that in both cases the course of $y(t)$, displayed as dark solid line, is the same. When two different initial conditions result in the same output behaviour, they are not distinguishable, and the system is not observable.
Figure 4‐17: States and output of a selected system for different initial conditions
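Looking at the system

$$\mathbf{A} = \begin{bmatrix} -1 & 1 \\ 0 & -1 \end{bmatrix} ; \quad \mathbf{c}^\mathrm{T} = \begin{bmatrix} 1 & 0 \end{bmatrix} \ \Rightarrow\ \mathbf{S}_\mathrm{O} = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix} ; \quad \det \mathbf{S}_\mathrm{O} = 1 , \tag{4-83}$$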
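we see that it is observable. Even though the second state does not influence the output directly, it influences the other state; therefore it can also be detected in the output. Also, the system

$$\mathbf{A} = \begin{bmatrix} -1 & -1 \\ -1 & -1 \end{bmatrix} ; \quad \mathbf{c}^\mathrm{T} = \begin{bmatrix} 1 & 0 \end{bmatrix} \ \Rightarrow\ \mathbf{S}_\mathrm{O} = \begin{bmatrix} 1 & 0 \\ -1 & -1 \end{bmatrix} ; \quad \det \mathbf{S}_\mathrm{O} = -1 , \tag{4-84}$$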
is observable. In fact, both states have the same dynamical behaviour, but as long as the parameters in $\mathbf{c}^\mathrm{T}$ are different from each other, the knowledge of $y(t)$ allows for the unique determination of $\mathbf{x}_0$. Consequently, the system

$$\mathbf{A} = \begin{bmatrix} -1 & -1 \\ -1 & -1 \end{bmatrix} ; \quad \mathbf{c}^\mathrm{T} = \begin{bmatrix} 1 & 1 \end{bmatrix} \ \Rightarrow\ \mathbf{S}_\mathrm{O} = \begin{bmatrix} 1 & 1 \\ -2 & -2 \end{bmatrix} ; \quad \det \mathbf{S}_\mathrm{O} = 0 , \tag{4-85}$$

is unobservable.

As the concept of observability has now been introduced and explained in detail, we will continue to discuss how an observer can be designed that outputs an estimate of the current state vector.
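The six example systems can be re‐checked with a short script. This is a sketch only: it assumes NumPy, and it assumes that all diagonal and coupling entries that appear with magnitude 1 or 2 in equations (4‐80) to (4‐85) carry a negative sign, except the coupling term in (4‐83), so that the dynamics are not unstable. The helper `obsv2` is a hypothetical name.

```python
import numpy as np


def obsv2(A, c):
    """Observability matrix [c^T; c^T A] of a 2nd order single-output system."""
    A = np.asarray(A, dtype=float)
    c = np.asarray(c, dtype=float).reshape(1, -1)
    return np.vstack([c, c @ A])


# (A, c^T) per equation number; the minus signs are assumptions of this sketch.
cases = {
    "4-80": ([[-1, 0], [0, -2]], [1, 0]),
    "4-81": ([[-1, 0], [0, -2]], [1, 1]),
    "4-82": ([[-1, 0], [0, -1]], [1, 2]),
    "4-83": ([[-1, 1], [0, -1]], [1, 0]),
    "4-84": ([[-1, -1], [-1, -1]], [1, 0]),
    "4-85": ([[-1, -1], [-1, -1]], [1, 1]),
}

results = {}
for label, (A, c) in cases.items():
    S = obsv2(A, c)
    results[label] = bool(np.linalg.matrix_rank(S) == 2)
    print(label, "det S =", np.linalg.det(S), "observable:", results[label])

# Case (4-82): with A = -I the output is y(t) = exp(-t) * c^T x0 in closed form,
# so two initial states with the same c^T x0 give identical outputs.
t = np.linspace(0.0, 5.0, 11)
c_82 = np.array([1.0, 2.0])
y_a = np.exp(-t) * (c_82 @ np.array([1.0, 2.0]))
y_b = np.exp(-t) * (c_82 @ np.array([3.0, 1.0]))
print("identical outputs for both initial states:", np.allclose(y_a, y_b))
```

The last part reproduces the situation described for Figure 4‐17 under the stated assumption: both initial conditions yield the same value of $\mathbf{c}^\mathrm{T} \mathbf{x}_0$ and hence the same output trajectory.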
4.2.2 Design of Linear Observers

If a state‐space model is observable, it is possible to design an observer which is capable of computing the current state value. We will use the notation $\hat{\mathbf{x}}(t)$ for the 'estimated' state vector to distinguish it from the real one. The ability to compute $\hat{\mathbf{x}}(t)$ with good quality is important for state control, as it is necessary to have information about the current state value at all times. For the sake of simplicity, we will look at a strictly proper system, that is, a system without feedthrough matrix, as is common for most real systems. Hence, the system description reads

$$\begin{aligned}
\dot{\mathbf{x}}(t) &= \mathbf{A}\, \mathbf{x}(t) + \mathbf{B}\, \mathbf{u}(t) , \\
\mathbf{y}(t) &= \mathbf{C}\, \mathbf{x}(t) .
\end{aligned} \tag{4-86}$$
4.2.2.1 Structure of Linear Observers

Two simple solutions that might come to mind cannot be used. Firstly, one could think of solving the second equation in (4‐86) for $\mathbf{x}(t)$ and using the result as observed value:

$$\hat{\mathbf{x}}(t) = \mathbf{C}^{-1}\, \mathbf{y}(t) . \tag{4-87}$$
Figure 4‐18: Block diagram of the linear state observer (Luenberger observer)
But usually, the number of states is greater than the number of outputs, so that matrix $\mathbf{C}$ is not square and cannot be inverted. Secondly, it might seem reasonable to use the solution of the state differential equation which is given in (4‐18), and to insert the known course of the input vector $\mathbf{u}(t)$ in order to compute an observation for $\mathbf{x}(t)$. This solution does not consider the usually unknown initial states $\mathbf{x}_0$ of the real system. Additionally, it might suffer from increasing inaccuracies, as the true values of the matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ can only be identified with some uncertainty. Therefore, a better solution should combine both mentioned procedures: On the one hand, by knowing $\mathbf{u}(t)$ and having good estimates $\hat{\mathbf{A}}$, $\hat{\mathbf{B}}$, and $\hat{\mathbf{C}}$ of $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$, it is possible to obtain an estimate $\hat{\mathbf{x}}(t)$ of $\mathbf{x}(t)$. On the other hand, the quality of that observation can be improved by comparing the true measurements $\mathbf{y}(t)$ with those that would arise from the estimated state vector, denoted as $\hat{\mathbf{y}}(t)$. The difference between $\mathbf{y}(t)$ and $\hat{\mathbf{y}}(t)$ can then be used to improve the state vector observation $\hat{\mathbf{x}}(t)$. This procedure is referred to as an observer, or Luenberger observer after its inventor (Luenberger, 1971). The principle scheme is depicted in Figure 4‐18. As can be seen, the idea is to run a model of the system in parallel to the actual system. The input values of the real system are assumed to be known; they might be set by a controller or measured. That way, they can also be used as inputs for the model. The model outputs an observed state vector $\hat{\mathbf{x}}(t)$ and, by multiplying it by the assumed output matrix $\hat{\mathbf{C}}$, an 'assumed' output $\hat{\mathbf{y}}(t)$. This assumed output is compared with the real system output $\mathbf{y}(t)$. Assuming that $\hat{\mathbf{C}} \approx \mathbf{C}$, one can conclude that a large difference between $\mathbf{y}(t)$ and $\hat{\mathbf{y}}(t)$ hints at a large difference between $\mathbf{x}(t)$ and $\hat{\mathbf{x}}(t)$. Therefore, this difference is multiplied by a matrix $\mathbf{L}$ and added to $\dot{\hat{\mathbf{x}}}(t)$. The observer makes use of both the model and input information as well as the output measurements. Interestingly, the structure of the Luenberger observer is exactly the same as that of the continuous time Kalman filter, also referred to as Kalman‐Bucy filter. The only difference is the approach to compute suitable values for the gain matrix that is multiplied by the output difference. In what follows, we will discuss this task for the Luenberger observer.

4.2.2.2 Parameter Computation for Linear Observers

To this end, we will assume that we are able to identify the relevant matrices of the system with great accuracy, so that we can assume

$$\hat{\mathbf{A}} = \mathbf{A} , \quad \hat{\mathbf{B}} = \mathbf{B} , \quad \hat{\mathbf{C}} = \mathbf{C} . \tag{4-88}$$
We need to analyze which influence matrix $\mathbf{L}$ has on the quality of the observation, especially on the observation error $\mathbf{e}$ with

$$\mathbf{e}(t) = \mathbf{x}(t) - \hat{\mathbf{x}}(t) . \tag{4-89}$$
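We can read from the block diagram in Figure 4‐18 that

$$\begin{aligned}
\dot{\hat{\mathbf{x}}}(t) &= \mathbf{A}\, \hat{\mathbf{x}}(t) + \mathbf{B}\, \mathbf{u}(t) + \mathbf{L} \left( \mathbf{y}(t) - \hat{\mathbf{y}}(t) \right) \\
&= \mathbf{A}\, \hat{\mathbf{x}}(t) + \mathbf{B}\, \mathbf{u}(t) + \mathbf{L}\mathbf{C} \left( \mathbf{x}(t) - \hat{\mathbf{x}}(t) \right) .
\end{aligned} \tag{4-90}$$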
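By differentiating (4‐89) with respect to time $t$ and inserting equations (4‐86) a) and (4‐90), we obtain:

$$\begin{aligned}
\dot{\mathbf{e}}(t) &= \mathbf{A}\, \mathbf{x}(t) + \mathbf{B}\, \mathbf{u}(t) - \left[ \mathbf{A}\, \hat{\mathbf{x}}(t) + \mathbf{B}\, \mathbf{u}(t) + \mathbf{L}\mathbf{C} \left( \mathbf{x}(t) - \hat{\mathbf{x}}(t) \right) \right] \\
&= \left( \mathbf{A} - \mathbf{L}\mathbf{C} \right) \left( \mathbf{x}(t) - \hat{\mathbf{x}}(t) \right) .
\end{aligned} \tag{4-91}$$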
We can replace the last bracket according to (4‐89) to obtain:

$$\dot{\mathbf{e}}(t) = \left( \mathbf{A} - \mathbf{L}\mathbf{C} \right) \mathbf{e}(t) . \tag{4-92}$$

Note that this is nothing else but a state space representation of the observation error $\mathbf{e}(t)$ as an autonomous system, with $\mathbf{A} - \mathbf{L}\mathbf{C}$ as system matrix. Figure 4‐19 shows the structure of this system.
Figure 4‐19: Observation error displayed as autonomous system
We have seen in the discussions around equation (4‐20) that the poles of the transfer function of the system are also eigenvalues of the system matrix. Consequently, we need to do the following: First, we need to decide where the poles of the transfer function of the observer should be. This results in a vector with $n$ poles, namely $\mathbf{s} = [s_1 \ \cdots \ s_n]$. Then, we have to choose $\mathbf{L}$ in a way that the elements of $\mathbf{s}$ are the solutions of the equation

$$\det \left( s\, \mathbf{I} - \left( \mathbf{A} - \mathbf{L}\mathbf{C} \right) \right) = 0 \tag{4-93}$$

according to equation (4‐21). Note that, as the system is autonomous, the poles of the transfer function describe the dynamical transition from the initial values $\mathbf{e}(0)$ to the error function $\mathbf{e}(t)$. We want the observed values $\hat{\mathbf{x}}(t)$ to converge to the actual values $\mathbf{x}(t)$, so the relation

$$\lim_{t \to \infty} \mathbf{e}(t) = \mathbf{0} \tag{4-94}$$
has to hold true. We can achieve this by placing the poles of the transfer function in the left half of the s‐plane. This guarantees that equation (4‐94) will be fulfilled. The concrete choice of the pole locations determines the dynamical behavior of the observer. As long as a system is observable according to the definition made in section 4.2.1, it is guaranteed that an observer with arbitrary pole placement can be realized. However, especially for systems of higher order ($n > 3$), the selection of a suitable gain matrix $\mathbf{L}$ is complicated, because the solution of equation (4‐93) is not trivial. In the following we will look at two simple examples, inspired by Unbehauen, 2010, in order to understand the working principle of observers. Let us assume we have a SISO system described by equation (4‐83), with $\mathbf{B} = [0 \ 1]^\mathrm{T}$ and $u(t) = 2 \cdot \sigma(t)$, where $\sigma(t)$ represents the unit step function, and $\mathbf{x}_0 = [2 \ 1]^\mathrm{T}$. Evaluating the eigenvalues of $\mathbf{A}$, we find that the system has a double pole at $-1$. How do we have to design the gain matrix $\mathbf{L} = [l_1 \ l_2]^\mathrm{T}$ in order to realize an observer with poles at arbitrary positions, denoted as $s_1, s_2$? To this end, we insert all the defined values into equation (4‐93) to obtain

$$\begin{aligned}
&\det \left( \begin{bmatrix} s & 0 \\ 0 & s \end{bmatrix} - \begin{bmatrix} -1 & 1 \\ 0 & -1 \end{bmatrix} + \begin{bmatrix} l_1 & 0 \\ l_2 & 0 \end{bmatrix} \right) = \det \begin{bmatrix} s + 1 + l_1 & -1 \\ l_2 & s + 1 \end{bmatrix} = 0 \\
&\Leftrightarrow\ s^2 + s \left( 2 + l_1 \right) + l_1 + l_2 + 1 = 0 .
\end{aligned} \tag{4-95}$$
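On the other hand, in order for the poles to be at the defined positions, the following characteristic equation has to hold true:

$$\begin{aligned}
&\left( s - s_1 \right) \left( s - s_2 \right) = 0 \\
&\Leftrightarrow\ s^2 - \left( s_1 + s_2 \right) s + s_1 s_2 = 0 .
\end{aligned} \tag{4-96}$$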
By equating coefficients between equations (4‐95) and (4‐96), one obtains two equations which determine $l_1$ and $l_2$. This gives rise to the question where the observer poles should be placed.
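The coefficient comparison can be sketched in code. The following assumes NumPy and the example system with a double pole at $-1$, that is $\mathbf{A} = [[-1, 1], [0, -1]]$ and $\mathbf{c}^\mathrm{T} = [1 \ 0]$ (the signs are an assumption of this sketch); the function names are hypothetical.

```python
import numpy as np

A = np.array([[-1.0, 1.0], [0.0, -1.0]])  # example system matrix, signs assumed
c = np.array([[1.0, 0.0]])                # output row vector c^T


def observer_gain(s1, s2):
    """Gain [l1, l2]^T from equating coefficients of the two characteristic equations."""
    l1 = -(s1 + s2) - 2.0    # coefficient of s:  2 + l1 = -(s1 + s2)
    l2 = s1 * s2 - l1 - 1.0  # constant term:     l1 + l2 + 1 = s1 * s2
    return np.array([[l1], [l2]])


def observer_poles(L):
    """Eigenvalues of the error system matrix A - L c^T."""
    return np.linalg.eigvals(A - L @ c)


for poles in [(-2.0, -2.0), (2.0, 2.0), (-0.5, -0.5), (-10.0, -10.0), (-2.0, -3.0)]:
    L = observer_gain(*poles)
    print("desired poles", poles, "-> L =", L.ravel(),
          "-> eigenvalues of A - L c^T:", observer_poles(L))
```

For repeated poles, the numerically computed eigenvalues may deviate slightly from the desired values, because repeated eigenvalues are sensitive to rounding.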
Figure 4‐20: Observation (bright lines) of state 1 (solid line) and state 2 (dashed line) and real states (dark lines) of the example system for different pole positions of the observer
Figure 4‐20 shows the course of the states and the observation for the example system with different observer poles. Top left shows the situation for $\mathbf{L} = [2 \ 1]^\mathrm{T} \Rightarrow s_1 = s_2 = -2$, that means, the observer poles are placed slightly left of the system poles. This solution can be considered as the best of the displayed ones; the observations tend to the real values relatively quickly without overshooting. The top right picture was obtained with $\mathbf{L} = [-6 \ 9]^\mathrm{T} \Rightarrow s_1 = s_2 = 2$. As one would expect, this results in an unstable behavior; the observations and the true values no longer converge. The usage of $\mathbf{L} = [-1 \ 0.25]^\mathrm{T} \Rightarrow s_1 = s_2 = -0.5$ results in the situation displayed bottom left. The observer poles are placed in the left half of the s‐plane, but right of those of the system. As a consequence, the observer dynamics are slower than those of the system. Therefore, the convergence of observation and real state is very slow and might not be suitable for typical applications. Finally, bottom right shows the situation for $\mathbf{L} = [18 \ 81]^\mathrm{T} \Rightarrow s_1 = s_2 = -10$. In this case, the observer is much faster than the system. This might result in situations which are undesirable in real scenarios. As is visible in the picture, there is a large overshoot of the estimate of state 2 at the beginning. If the observations are used as input for a controller, this would result in a very severe reaction of the controller. If there is some noise in one of the signals, e.g. the measurements of the outputs, it would be strongly amplified. As a consequence of the discussions, we can state that the poles of the observer should be slightly left of the poles of the system in the left half of the s‐plane. If they are placed too far left, the observer will react very nervously with high overshoots. If the poles are placed right of those of the system, the observer reacts too slowly, or even exhibits an unstable behavior if at least one pole is placed in the right half of the s‐plane. These statements are summarized in Figure 4‐21, which shows an assessment of possible pole locations of the observer.
Figure 4‐21: Assessment of the pole placement of the observer
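The qualitative behavior described above can be reproduced with a simple simulation. This is a rough sketch, assuming NumPy, a forward‐Euler integration with a small fixed step, the example system with a double pole at $-1$ (signs assumed), input $u(t) = 2$ for $t \ge 0$, initial state $[2 \ 1]^\mathrm{T}$, and an observer that starts at zero; none of these numerical choices are prescribed by the text.

```python
import numpy as np

A = np.array([[-1.0, 1.0], [0.0, -1.0]])  # example system, signs assumed
B = np.array([0.0, 1.0])
c = np.array([1.0, 0.0])


def simulate_observer(L, x0, t_end=5.0, dt=1e-3):
    """Forward-Euler simulation of plant and observer; returns the error norm over time."""
    L = np.asarray(L, dtype=float)
    x = np.asarray(x0, dtype=float).copy()
    x_hat = np.zeros_like(x)  # the observer has no knowledge of x0
    u = 2.0                   # step input u(t) = 2 * sigma(t)
    errors = []
    for _ in range(int(round(t_end / dt))):
        y, y_hat = c @ x, c @ x_hat
        x_dot = A @ x + B * u
        x_hat_dot = A @ x_hat + B * u + L * (y - y_hat)  # model plus output correction
        x = x + dt * x_dot
        x_hat = x_hat + dt * x_hat_dot
        errors.append(np.linalg.norm(x - x_hat))
    return np.array(errors)


x0 = [2.0, 1.0]
err_good = simulate_observer([2.0, 1.0], x0)    # observer poles at -2
err_slow = simulate_observer([-1.0, 0.25], x0)  # observer poles at -0.5
err_fast = simulate_observer([18.0, 81.0], x0)  # observer poles at -10
err_bad = simulate_observer([-6.0, 9.0], x0)    # observer poles at +2 (unstable)
for name, err in [("-2", err_good), ("-0.5", err_slow), ("-10", err_fast), ("+2", err_bad)]:
    print(f"poles at {name}: final error {err[-1]:.3e}, peak error {err.max():.3e}")
```

Comparing the peak error of the fast observer with that of the moderate one gives a numerical impression of the overshoot discussed above, while the unstable gain shows a growing error.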
At this point, it shall be investigated what happens if one tries to design an observer for an unobservable system. We will use the unobservable system according to equation (4‐85) for this purpose. Trying to compute the characteristic equation of the observer gives:

$$\begin{aligned}
&\det \left( \begin{bmatrix} s & 0 \\ 0 & s \end{bmatrix} - \begin{bmatrix} -1 & -1 \\ -1 & -1 \end{bmatrix} + \begin{bmatrix} l_1 & l_1 \\ l_2 & l_2 \end{bmatrix} \right) = \det \begin{bmatrix} s + 1 + l_1 & 1 + l_1 \\ 1 + l_2 & s + 1 + l_2 \end{bmatrix} = 0 \\
&\Leftrightarrow\ s^2 + s \left( 2 + l_1 + l_2 \right) = 0 .
\end{aligned} \tag{4-97}$$
𝐀
𝐋𝐂
0 ⎡ 1 ⎢ ⎢⋮ ⎢0 ⎣0
0 0 ⋮ ⋯ ⋯
0 0 ⋮ 1 0
⋯ ⋯ ⋱ 0 1
𝑎 𝑎 𝑎 𝑎
𝑙 𝑙 ⋮
⎤ ⎥ ⎥ , 𝑙 ⎥ 𝑙 ⎦
(4‐98)
that means, the poles of the observer equal the eigenvalues of this matrix. As we see, the system matrix $\mathbf{A} - \mathbf{L}\mathbf{C}$ is also in observer canonical form. As discussed at the end of section 4.1.2, a system matrix in observer canonical form contains the coefficients of its own characteristic equation. Therefore, the eigenvalues of $\mathbf{A} - \mathbf{L}\mathbf{C}$ fulfill the equation

$$s^n + \left( a_{n-1} + l_n \right) s^{n-1} + \cdots + \left( a_1 + l_2 \right) s + \left( a_0 + l_1 \right) = 0 . \tag{4-99}$$
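The structure of this characteristic equation can be checked numerically, together with the gain that results for a given set of desired poles. The sketch below assumes NumPy and assumes the layout of the observer canonical form used here (ones on the subdiagonal, negative coefficients in the last column, output vector selecting the last state); the coefficients and poles are illustrative values.

```python
import numpy as np


def observer_canonical_form(a):
    """A and c^T for s^n + a[n-1] s^(n-1) + ... + a[0] (assumed canonical layout)."""
    n = len(a)
    A = np.zeros((n, n))
    A[1:, :-1] = np.eye(n - 1)                # ones on the subdiagonal
    A[:, -1] = -np.asarray(a, dtype=float)    # last column: -a_0 ... -a_(n-1)
    c = np.zeros((1, n))
    c[0, -1] = 1.0                            # output is the last state
    return A, c


a = [6.0, 11.0, 6.0]                  # (s+1)(s+2)(s+3), illustrative plant
A, c = observer_canonical_form(a)

desired_poles = [-4.0, -5.0, -6.0]    # illustrative observer poles
target = np.poly(desired_poles)       # [1, alpha_(n-1), ..., alpha_0]
# Equating coefficients: a_i + l_(i+1) = alpha_i for i = 0 ... n-1
L = (target[:0:-1] - np.asarray(a)).reshape(-1, 1)

closed_loop = A - L @ c
print("L =", L.ravel())
print("characteristic polynomial of A - L c^T:", np.poly(closed_loop))
poles = np.sort(np.linalg.eigvals(closed_loop).real)
print("observer poles:", poles)
```

Because the gain only modifies the last column, no determinant has to be expanded symbolically; each gain element follows from one coefficient difference.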
If we want the $n$ poles to be placed at $s_1, s_2, \ldots, s_n$, we simply have to evaluate the characteristic equation

$$\prod_{i=1}^{n} \left( s - s_i \right) = 0 \tag{4-100}$$
and equate coefficients between equations (4‐99) and (4‐100) in order to compute the elements of the gain matrix $\mathbf{L}$. To sum up our discussion of the design of linear observers, we have seen that the overall concept is based on deterministic system and signal descriptions. We need to be able to consider stochastic behavior, both for (sometimes) unknown input values and for measurements which have to be assumed to be corrupted by noise. We will discuss suitable ways in section 4.3. However, we might still be able to employ the observability concept in the framework of Optimal Sensor Placement tasks. But so far, we have only discussed the concept for linear systems. In what follows, we will introduce observability for nonlinear systems.

4.2.3 Observability of Nonlinear Systems

Many real world systems exhibit a nonlinear behaviour. The adequate notation in state space reads

$$\begin{aligned}
\dot{\mathbf{x}}(t) &= \mathbf{f}\left( \mathbf{x}(t), \mathbf{u}(t) \right) , \\
\mathbf{y}(t) &= \mathbf{g}\left( \mathbf{x}(t), \mathbf{u}(t) \right) ,
\end{aligned} \tag{4-101}$$
where $\mathbf{f}(\cdot)$ and $\mathbf{g}(\cdot)$ describe arbitrary nonlinear functions. In what follows, we will discuss how the term 'observability' is defined for nonlinear systems and how it can be evaluated. The procedure is much more complicated, since for nonlinear systems we will not find an algebraic necessary condition for observability, and a sufficient condition only exists for a special form of observability. Furthermore, the input function is of importance when evaluating observability for nonlinear systems, which was not the case for linear systems. The following discussions are based on Hermann & Krener, 1977, Mangold, 2016, and Adamy, 2014.

4.2.3.1 The Concept of Indistinguishable States

As we have seen in the example of equation (4‐82) and Figure 4‐17, we can immediately and without further computation declare a linear system as unobservable if we find two different initial conditions that result in the same output trajectory. We can use a similar concept for definition purposes within nonlinear systems, but we need to be careful: for linear systems, observability does not depend on the system inputs. This is not true for nonlinear systems. As introduction, we will use the following definition:

Definition: Indistinguishable states
Given a system in nonlinear state space representation according to equation (4‐101), in which the states can reach values out of the set $\mathcal{M}$, the two initial states $\mathbf{x}_{0,1}, \mathbf{x}_{0,2}$ (with $\mathbf{x}_0 = \mathbf{x}(t_0)$; $\mathbf{x}_{0,1}, \mathbf{x}_{0,2} \in \mathcal{M}$) are denoted as indistinguishable if for every admissible input $\mathbf{u}(t)$, $t_0 \le t \le t_1$, the output of the system shows identical behavior, that is:

$$\mathbf{y}\left( t, \mathbf{x}_{0,1} \right) = \mathbf{y}\left( t, \mathbf{x}_{0,2} \right) , \quad t_0 \le t \le t_1 . \tag{4-102}$$
We use the notation $I(\mathbf{x}_0)$ for the set of all states that are indistinguishable from $\mathbf{x}_0$. If equation (4‐102) holds, we can say that $I(\mathbf{x}_{0,1}) \supseteq \{ \mathbf{x}_{0,1}, \mathbf{x}_{0,2} \}$. The concept of indistinguishable states is displayed in Figure 4‐22, which was inspired by Mangold, 2016: Two different initial states result in identical output trajectories.
Figure 4‐22: Indistinguishable States
4.2.3.2 Different Concepts of Observability for Nonlinear Systems

Based on the indistinguishable states, we can define the concept of (global) observability of nonlinear systems:

Definition: (Global) observability of nonlinear systems
Given a system in nonlinear state space representation according to equation (4‐101), in which the states can reach values out of the set $\mathcal{M}$, the system is referred to as globally observable at $\mathbf{x}_0$, or simply observable at $\mathbf{x}_0$, if and only if $I(\mathbf{x}_0) = \{ \mathbf{x}_0 \}$. If the condition $I(\mathbf{x}) = \{ \mathbf{x} \}$ holds for all $\mathbf{x} \in \mathcal{M}$, the system is said to be globally observable, or simply observable.

There is an important difference to the definition of observability of linear systems, where the input trajectory was not of any importance. If a nonlinear system is said to be observable, it means that there is at least one possible input trajectory that results in the condition for observability being fulfilled. There might always be input trajectories which result in two states being indistinguishable, even if the system is observable. The defined concept of observability is referred to as global, because it makes no statement about the length of the trajectory of $\mathbf{x}(t)$ or the time necessary before it is possible to distinguish between points of $\mathcal{M}$. Especially for real applications, one is interested in a concept that can be fulfilled within a limited space. To this end, a stricter concept of local observability is introduced, which is based on a limited open neighborhood $U$ of $\mathbf{x}_0$. This gives rise to the following definitions:

Definition: $U$‐indistinguishable states
Given a system in nonlinear state space representation, in which the states can reach values out of the set $\mathcal{M}$, and let $U$ be a subset of $\mathcal{M}$. The two initial states $\mathbf{x}_{0,1}, \mathbf{x}_{0,2}$ (with $\mathbf{x}_0 = \mathbf{x}(t_0)$; $\mathbf{x}_{0,1}, \mathbf{x}_{0,2} \in \mathcal{M}$) are denoted as $U$‐indistinguishable if for every admissible input $\mathbf{u}(t)$, $t_0 \le t \le t_1$, whose trajectories $\mathbf{x}(t, \mathbf{x}_{0,1})$ and $\mathbf{x}(t, \mathbf{x}_{0,2})$, $t_0 \le t \le t_1$, both lie completely in $U$, the output of the system shows identical behavior, as defined in equation (4‐102). If equation (4‐102) holds with both trajectories lying in $U$, we can say that $I_U(\mathbf{x}_{0,1}) \supseteq \{ \mathbf{x}_{0,1}, \mathbf{x}_{0,2} \}$.

This gives rise to the stricter concept of local observability:

Definition: Local observability of nonlinear systems
Given a system in nonlinear state space representation, in which the states can reach values out of the set $\mathcal{M}$, and let $U$ be a subset of $\mathcal{M}$. The system is referred to as locally observable at $\mathbf{x}_0$ if and only if for any open neighborhood $U$ of $\mathbf{x}_0$, the condition $I_U(\mathbf{x}_0) = \{ \mathbf{x}_0 \}$ holds. If the condition $I_U(\mathbf{x}) = \{ \mathbf{x} \}$ holds for all $\mathbf{x} \in \mathcal{M}$, the system is said to be locally observable.

Due to the nonlinear character of the systems under discussion, situations might arise in which observability according to these definitions is not fulfilled, even though it might be possible to determine the initial system states under some assumptions. The following example is taken from Adamy, 2014: We look at the following nonlinear autonomous SISO system:

$$\begin{aligned}
\dot{x}(t) &= -\frac{1}{x(t)} , \\
y(t) &= x^2(t) .
\end{aligned} \tag{4-103}$$
We can easily state that the system is not observable under the definitions above, because the two initial states $x_0 = \sqrt{d}$ and $x_0 = -\sqrt{d}$, $d \in \mathbb{R}^+$, will clearly result in the same system output trajectory. However, if we have some additional information on the system, we might be able to discard one of the possible solutions. We might know that $x(t)$ can only reach positive values, e.g. for absolute temperatures or energy states. To give an example from marine robot navigation, let us look at the GIB scenario as it was described in section 2.5.5 and depicted in Figure 2‐20: Assume there is an underwater target located at $[x \ y \ z]^\mathrm{T}$ in an inertial NED frame. An arbitrary number of at least three buoys are placed on the sea surface. They are capable of determining their own position via GPS, and they can measure the distance to the target. From this scenario, and without looking deeper into the mathematics, it is straightforward to say that, because all buoys are located at the surface and therefore have the same $z$‐coordinate, there will always be two possible positions of the underwater target that lead to the same measurements: the true one, and the one at $[x \ y \ {-z}]^\mathrm{T}$. But as we can safely assume that an underwater target cannot be above the sea surface, we can easily discard the wrong result. Therefore, in this scenario, the overall system might not fulfil the strict definition of nonlinear observability as given above, but it is still observable in some sense. In other scenarios, we might have some rough guess about the initial state of the system, which might allow us to find the true solution out of several mathematically possible ones. In this respect, we introduce the concept of weak observability: It ensures the possibility to distinguish between two initial states in close vicinity to each other, but allows for indistinguishability between two states which are at some distance from each other.

Definition: Weak observability of nonlinear systems
Given a system in nonlinear state space representation, in which the states can reach values out of the set $\mathcal{M}$. The system is referred to as weakly observable at $\mathbf{x}_0$ if and only if for some open neighborhood $V \subseteq \mathcal{M}$ of $\mathbf{x}_0$, the condition $I(\mathbf{x}_0) \cap V = \{ \mathbf{x}_0 \}$ holds true. If the condition $I(\mathbf{x}) \cap V = \{ \mathbf{x} \}$ holds for all $\mathbf{x} \in \mathcal{M}$ for some open neighborhood $V \subseteq \mathcal{M}$ of $\mathbf{x}$, the system is said to be weakly observable.

Finally, we can combine the last two introduced concepts to define the local weak observability:

Definition: Local weak observability of nonlinear systems
Given a system in nonlinear state space representation, in which the states can reach values out of the set $\mathcal{M}$, and let $U$ be a subset of $\mathcal{M}$. The system is referred to as locally weakly observable at $\mathbf{x}_0$ if and only if for any open neighborhood $U$ of $\mathbf{x}_0$, there exists some neighborhood $V \subseteq U$ of $\mathbf{x}_0$, where the condition $I_U(\mathbf{x}_0) \cap V = \{ \mathbf{x}_0 \}$ holds true. If this is fulfilled for all $\mathbf{x} \in \mathcal{M}$, the system is said to be locally weakly observable.
Figure 4‐23: Overview of the different concepts of nonlinear observability and their implications
Figure 4‐23, which again borrows from Mangold, 2016, provides an overview of the different discussed concepts of nonlinear observability and their implications. The local observability is the strongest property. It requires the absence of indistinguishability between any two states in $\mathcal{M}$, while the state trajectories may only lie in a subset of $\mathcal{M}$. Local observability implies both (global) observability and local weak observability, both of which are weaker properties. (Global) observability allows for the state trajectories to cover the whole set $\mathcal{M}$, while local weak observability allows any state to have indistinguishable partners outside of an open neighborhood around it. The weakest property is the weak observability, which allows for state trajectories in the whole set $\mathcal{M}$ and for indistinguishable partners outside of an open neighborhood. Therefore, it is implied by both (global) observability and local weak observability. Interestingly enough, for linear systems, there is no differentiation between these four concepts; if an autonomous linear system is observable according to the definition made in section 4.2.1, it fulfills all of the four definitions.

4.2.3.3 Evaluation of Observability for Nonlinear Autonomous Systems

The evaluation of nonlinear observability is not straightforward; however, it is possible to derive an algebraic test to check for local weak observability. Following Adamy, 2014, we will start our discussion with the general autonomous nonlinear system

$$\begin{aligned}
\dot{\mathbf{x}}(t) &= \mathbf{f}(\mathbf{x}) , \\
y(t) &= g(\mathbf{x}) .
\end{aligned} \tag{4-104}$$
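It is straightforward to see that the derivative of the output, $\dot{y}(t)$, equals

$$\frac{\mathrm{d}y(t)}{\mathrm{d}t} = \dot{y}(t) = \frac{\partial g(\mathbf{x})}{\partial \mathbf{x}} \frac{\mathrm{d}\mathbf{x}}{\mathrm{d}t} = \frac{\partial g(\mathbf{x})}{\partial \mathbf{x}}\, \mathbf{f}(\mathbf{x}). \tag{4-105}$$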
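Employing the so-called Lie derivative $L_{\mathbf{f}}(\cdot)$, which is defined as

$$L_{\mathbf{f}}\, g(\mathbf{x}) := \frac{\partial g(\mathbf{x})}{\partial \mathbf{x}}\, \mathbf{f}(\mathbf{x}), \tag{4-106}$$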
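we can write the derivatives of $y(t)$ in the following form, introducing the multiple Lie derivatives:

$$\begin{aligned}
\dot{y}(t) &= \frac{\partial g(\mathbf{x})}{\partial \mathbf{x}}\, \mathbf{f}(\mathbf{x}) = L_{\mathbf{f}}\, g(\mathbf{x}), \\
\ddot{y}(t) &= \frac{\partial}{\partial \mathbf{x}} \left( \frac{\partial g(\mathbf{x})}{\partial \mathbf{x}}\, \mathbf{f}(\mathbf{x}) \right) \mathbf{f}(\mathbf{x}) = L_{\mathbf{f}} L_{\mathbf{f}}\, g(\mathbf{x}) =: L_{\mathbf{f}}^{2}\, g(\mathbf{x}), \\
&\;\;\vdots \\
y^{(n-1)}(t) &= L_{\mathbf{f}} L_{\mathbf{f}}^{n-2}\, g(\mathbf{x}) =: L_{\mathbf{f}}^{n-1}\, g(\mathbf{x}).
\end{aligned} \tag{4-107}$$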
Defining

$$y(t) = L_{\mathbf{f}}^{0}\, g(\mathbf{x}) = g(\mathbf{x}), \tag{4-108}$$
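we can summarize the stated functions in the following vector equation:

$$\mathbf{y} = \mathbf{q}(\mathbf{x}), \tag{4-109}$$

with

$$\mathbf{y} = \begin{bmatrix} y & \dot{y} & \cdots & y^{(n-1)} \end{bmatrix}^{T}$$

and

$$\mathbf{q}(\mathbf{x}) = \begin{bmatrix} L_{\mathbf{f}}^{0}\, g(\mathbf{x}) & L_{\mathbf{f}}^{1}\, g(\mathbf{x}) & \cdots & L_{\mathbf{f}}^{n-1}\, g(\mathbf{x}) \end{bmatrix}^{T},$$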
where $\mathbf{q}(\mathbf{x})$ represents a set of nonlinear equations. Now it is straightforward to say that if there exists the inverse function

$$\mathbf{q}^{-1}(\mathbf{y}) = \mathbf{x}, \tag{4-110}$$
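then it is possible to compute $\mathbf{x}(t)$ by knowing $\mathbf{y}$ in the interval $[t_0, t_1]$. However, the computation of $\mathbf{q}^{-1}$ might be impossible or at least cumbersome. To this end, we expand $\mathbf{q}(\mathbf{x})$ around one point $\mathbf{x}_0$, employing a Taylor series which we truncate after the linear term:

$$\begin{aligned}
\mathbf{y} &= \mathbf{q}(\mathbf{x}_0) + \left. \frac{\partial \mathbf{q}(\mathbf{x})}{\partial \mathbf{x}} \right|_{\mathbf{x} = \mathbf{x}_0} \cdot (\mathbf{x} - \mathbf{x}_0) + \cdots \\
\Rightarrow \; \mathbf{y} - \mathbf{q}(\mathbf{x}_0) &\approx \left. \frac{\partial \mathbf{q}(\mathbf{x})}{\partial \mathbf{x}} \right|_{\mathbf{x} = \mathbf{x}_0} \cdot (\mathbf{x} - \mathbf{x}_0) = \mathbf{Q}(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0).
\end{aligned} \tag{4-111}$$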
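As we can compute the left side of the final equation, it is possible to solve the equation for $\mathbf{x} - \mathbf{x}_0$ if the Jacobian matrix

$$\mathbf{Q}(\mathbf{x}_0) = \left. \frac{\partial \mathbf{q}(\mathbf{x})}{\partial \mathbf{x}} \right|_{\mathbf{x} = \mathbf{x}_0} = \left. \begin{bmatrix} \dfrac{\partial L_{\mathbf{f}}^{0}\, g(\mathbf{x})}{\partial \mathbf{x}} \\[2mm] \dfrac{\partial L_{\mathbf{f}}^{1}\, g(\mathbf{x})}{\partial \mathbf{x}} \\ \vdots \\ \dfrac{\partial L_{\mathbf{f}}^{n-1}\, g(\mathbf{x})}{\partial \mathbf{x}} \end{bmatrix} \right|_{\mathbf{x} = \mathbf{x}_0} \tag{4-112}$$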
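exhibits the rank $n$. As the discussion is based on a linearization around the point $\mathbf{x}_0$, we have to keep in mind that the results are only valid in a limited area around $\mathbf{x}_0$, which can be defined as a neighbourhood $V$:

$$V = \{ \mathbf{x} \in \mathcal{M} \mid \| \mathbf{x} - \mathbf{x}_0 \| < \rho \}, \quad \rho \in \mathbb{R}^{+}. \tag{4-113}$$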
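As a minimal sketch of the rank test on the stacked Lie derivatives, cf. equation (4-112), the following Python snippet builds $\mathbf{Q}(\mathbf{x})$ symbolically with SymPy. The example system (a harmonic oscillator with the output $y = x_1^2$) and the helper name `observability_matrix` are assumptions made for illustration only; they are not taken from the cited literature.

```python
import sympy as sp


def observability_matrix(f, g, x):
    """Stack the gradients of L_f^0 g, ..., L_f^(n-1) g, cf. (4-112)."""
    n = len(x)
    lie = g                                   # L_f^0 g = g
    rows = []
    for _ in range(n):
        grad = sp.Matrix([lie]).jacobian(x)   # 1 x n row: d(L_f^k g)/dx
        rows.append(grad)
        lie = (grad * f)[0, 0]                # next Lie derivative L_f^(k+1) g
    return sp.Matrix.vstack(*rows)


x1, x2 = sp.symbols("x1 x2", real=True)
x = sp.Matrix([x1, x2])
f = sp.Matrix([x2, -x1])   # hypothetical dynamics (harmonic oscillator)
g = x1**2                  # hypothetical nonlinear output

Q = observability_matrix(f, g, x)
rank_a = Q.subs({x1: 1, x2: 0}).rank()   # evaluated at x0 = (1, 0)
rank_b = Q.subs({x1: 0, x2: 1}).rank()   # evaluated at x0 = (0, 1)
print(Q, rank_a, rank_b)
```

For this example, the matrix has full rank wherever $x_1 \neq 0$. At points with $x_1 = 0$ the rank drops, so the test gives no statement there.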
Therefore, it becomes clear that the criterion only hints at weak observability, as this is exactly the given definition. Moreover, as the computation is based on a Jacobian matrix developed around a single point, the definition for local weak observability is fulfilled. Therefore, we can state:

Theorem: Evaluation of local weak observability for autonomous nonlinear systems
Given an autonomous system in nonlinear state space representation according to (4-104), in which the states can reach values out of the set $\mathcal{M}$, and let $U$ be a subset of $\mathcal{M}$. If the Jacobian matrix $\mathbf{Q}(\mathbf{x}_0)$, defined according to equation (4-112), exhibits the rank $n$,

$$\operatorname{rank} \mathbf{Q}(\mathbf{x}_0) = n, \tag{4-114}$$
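the system is referred to as locally weakly observable at $\mathbf{x}_0$. If equation (4-114) holds true for all $\mathbf{x}_0 \in \mathcal{M}$, the system is locally weakly observable. The matrix $\mathbf{Q}(\cdot)$ is also referred to as observability matrix. The stated condition is sufficient, but not necessary.

It is interesting to evaluate the stated condition for an autonomous linear system in the form of

$$\dot{\mathbf{x}}(t) = \mathbf{A}\, \mathbf{x}(t), \qquad y(t) = \mathbf{c}^{T} \mathbf{x}(t). \tag{4-115}$$

With $g(\mathbf{x}) = \mathbf{c}^{T} \mathbf{x}(t)$ and consequently

$$\mathbf{q}(\mathbf{x}) = \begin{bmatrix} L_{\mathbf{f}}^{0}\, g(\mathbf{x}) \\ L_{\mathbf{f}}^{1}\, g(\mathbf{x}) \\ \vdots \\ L_{\mathbf{f}}^{n-1}\, g(\mathbf{x}) \end{bmatrix} = \begin{bmatrix} \mathbf{c}^{T} \mathbf{x}(t) \\ \mathbf{c}^{T} \mathbf{A}\, \mathbf{x}(t) \\ \vdots \\ \mathbf{c}^{T} \mathbf{A}^{n-1} \mathbf{x}(t) \end{bmatrix}, \tag{4-116}$$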
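the condition according to equation (4-114) can be written as

$$\operatorname{rank} \mathbf{Q}(\mathbf{x}_0) = \operatorname{rank} \frac{\partial \mathbf{q}(\mathbf{x})}{\partial \mathbf{x}} = \operatorname{rank} \begin{bmatrix} \mathbf{c}^{T} \\ \mathbf{c}^{T} \mathbf{A} \\ \vdots \\ \mathbf{c}^{T} \mathbf{A}^{n-1} \end{bmatrix} = n, \tag{4-117}$$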
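which fully agrees with the Kalman observability criterion for linear systems, see equation (4-62). This shows that the observability of linear systems is just a particular case of nonlinear observability. The limitation to weak observability can be dropped, as for linear systems it is fully justified to truncate the Taylor series expansion in equation (4-111) after the linear term.

4.2.3.4 Evaluation of Observability for General Nonlinear Systems

Finally, we will discuss the evaluation of observability for nonlinear non-autonomous systems according to equation (4-101). The procedure is the same as for autonomous systems. We compute $n-1$ temporal derivatives of the output and introduce the auxiliary derivatives $h_i(\cdot)$ according to:

$$\begin{aligned}
\dot{y}(t) &= \frac{\partial g(\mathbf{x}, \mathbf{u})}{\partial \mathbf{x}}\, \mathbf{f}(\mathbf{x}, \mathbf{u}) + \frac{\partial g(\mathbf{x}, \mathbf{u})}{\partial \mathbf{u}}\, \dot{\mathbf{u}} =: h_1(\mathbf{x}, \mathbf{u}, \dot{\mathbf{u}}), \\
\ddot{y}(t) &= \frac{\partial h_1(\mathbf{x}, \mathbf{u}, \dot{\mathbf{u}})}{\partial \mathbf{x}}\, \mathbf{f}(\mathbf{x}, \mathbf{u}) + \frac{\partial h_1(\mathbf{x}, \mathbf{u}, \dot{\mathbf{u}})}{\partial \mathbf{u}}\, \dot{\mathbf{u}} + \frac{\partial h_1(\mathbf{x}, \mathbf{u}, \dot{\mathbf{u}})}{\partial \dot{\mathbf{u}}}\, \ddot{\mathbf{u}} =: h_2(\mathbf{x}, \mathbf{u}, \dot{\mathbf{u}}, \ddot{\mathbf{u}}), \\
&\;\;\vdots \\
y^{(n-1)}(t) &= \frac{\partial h_{n-2}}{\partial \mathbf{x}}\, \mathbf{f}(\mathbf{x}, \mathbf{u}) + \frac{\partial h_{n-2}}{\partial \mathbf{u}}\, \dot{\mathbf{u}} + \dots + \frac{\partial h_{n-2}}{\partial \mathbf{u}^{(n-2)}}\, \mathbf{u}^{(n-1)} =: h_{n-1}(\mathbf{x}, \mathbf{u}, \dots, \mathbf{u}^{(n-1)}).
\end{aligned} \tag{4-118}$$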
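Similar to equation (4-109), we can write

$$\mathbf{y} = \mathbf{q}(\mathbf{x}, \mathbf{u}, \dots, \mathbf{u}^{(n-1)}), \tag{4-119}$$

with

$$\mathbf{y} = \begin{bmatrix} y & \dot{y} & \cdots & y^{(n-1)} \end{bmatrix}^{T}$$

and

$$\mathbf{q}(\mathbf{x}, \mathbf{u}, \dots, \mathbf{u}^{(n-1)}) = \begin{bmatrix} g(\mathbf{x}, \mathbf{u}) & h_1(\mathbf{x}, \mathbf{u}, \dot{\mathbf{u}}) & \cdots & h_{n-1}(\mathbf{x}, \mathbf{u}, \dots, \mathbf{u}^{(n-1)}) \end{bmatrix}^{T}.$$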
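Employing the same arguments as before, we can define the nonlinear observability matrix for non-autonomous systems, $\mathbf{Q}(\mathbf{x}_0)$, to be

$$\mathbf{Q}(\mathbf{x}_0) := \left. \frac{\partial \mathbf{q}(\mathbf{x}, \mathbf{u}, \dots, \mathbf{u}^{(n-1)})}{\partial \mathbf{x}} \right|_{\mathbf{x} = \mathbf{x}_0} = \left. \begin{bmatrix} \dfrac{\partial g(\mathbf{x}, \mathbf{u})}{\partial \mathbf{x}} \\[2mm] \dfrac{\partial h_1(\mathbf{x}, \mathbf{u}, \dot{\mathbf{u}})}{\partial \mathbf{x}} \\ \vdots \\ \dfrac{\partial h_{n-1}(\mathbf{x}, \mathbf{u}, \dots, \mathbf{u}^{(n-1)})}{\partial \mathbf{x}} \end{bmatrix} \right|_{\mathbf{x} = \mathbf{x}_0}. \tag{4-120}$$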
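The construction of the auxiliary function $h_1$ and of the observability matrix for a system with input can be sketched for $n = 2$ as follows. The example system ($\dot{x}_1 = u\, x_2$, $\dot{x}_2 = 0$, $y = x_1$) is a hypothetical choice that makes the dependence on the input visible; the symbol `du` stands for $\dot{u}$ and is likewise only an illustration.

```python
import sympy as sp

x1, x2, u, du = sp.symbols("x1 x2 u du", real=True)
x = sp.Matrix([x1, x2])

f = sp.Matrix([u * x2, 0])   # hypothetical dynamics f(x, u)
g = x1                       # hypothetical output g(x, u)

# First output derivative: h1 = dg/dx * f + dg/du * du
h1 = (sp.Matrix([g]).jacobian(x) * f)[0, 0] + sp.diff(g, u) * du

# Observability matrix for n = 2: gradients of g and h1 w.r.t. x
Q = sp.Matrix([g, h1]).jacobian(x)

rank_excited = Q.subs(u, 1).rank()   # input switched on
rank_idle = Q.subs(u, 0).rank()      # input identically zero
print(Q, rank_excited, rank_idle)
```

In this sketch the rank condition holds only for $u \neq 0$, which illustrates why the choice of the input signal matters for non-autonomous nonlinear systems.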
This gives rise to the following statement:

Theorem: Evaluation of local weak observability for general nonlinear systems
Given a system in nonlinear state space representation according to (4-101), in which the states can reach values out of the set $\mathcal{M}$, and let $U$ be a subset of $\mathcal{M}$. If the Jacobian matrix $\mathbf{Q}(\mathbf{x}_0)$, defined according to equation (4-120), exhibits the rank $n$,

$$\operatorname{rank} \mathbf{Q}(\mathbf{x}_0) = n, \tag{4-121}$$
the system is referred to as locally weakly observable at $\mathbf{x}_0$. If equation (4-121) holds true for all $\mathbf{x}_0 \in \mathcal{M}$, the system is locally weakly observable. The matrix $\mathbf{Q}(\cdot)$ is also referred to as observability matrix. The stated condition is sufficient, but not necessary.

Summing up our discussions on nonlinear observability and its usability for the Optimal Sensor Placement scenarios, we can conclude: If the system under discussion is nonlinear, as it will be for the GIB-like scenarios due to the nonlinear relation between buoy and target position and the corresponding ranges, then we can only evaluate local weak observability for each possible set-up. We can neither evaluate global observability, nor compare different set-ups to detect which one results in an ‘optimal’ observability. To this end, we need another criterion to build upon. Therefore, we will discuss the Gramian matrices in the following section 4.2.4, both for linear and nonlinear systems.

4.2.4 Observability Gramian Matrix

According to Singh and Hahn, 2005, so-called empirical controllability and observability Gramians have been employed for model reduction of nonlinear systems, as reported e.g. in Lall et al., 2002. Up to 2005, no usage for controllability and observability analysis was reported in literature. In our discussions, we will start with the definition of the observability Gramian for linear systems, before we expand the discussions to empirical Gramians for nonlinear systems.

4.2.4.1 Linear Observability Gramian

In order to understand the concept of the observability Gramian and how it can be employed to evaluate observability, we have to think of energy transfer from the states to the output. Figure 4-24, which was inspired by a presentation by Prof. Christoph Ament, University of Augsburg, shows the concept where every state is initially charged with '1'. It is then
evaluated how the energy is transferred to the outputs, where the energy of the signals $\mathbf{y}(t)$ is measured. This approach is classical, and can be related to the properties of the observability Gramian. This subject became popular with the widespread use of balanced realizations (see e.g. Enns, 1987).
Figure 4‐24: Energy transfer from states to outputs
The following discussion is based on Fairman, 1998. Limiting ourselves to real values for states and outputs, the output energy $E$ can be written as

$$E = \int_{0}^{\infty} \mathbf{y}^{T}(t)\, \mathbf{y}(t)\, \mathrm{d}t \geq 0. \tag{4-122}$$
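We can replace $\mathbf{y}(t)$ according to equation (4-64); as we do not consider any inputs, we can set $\mathbf{u}(t) = \mathbf{0}$. Thus we obtain:

$$E = \mathbf{x}_0^{T} \left( \int_{0}^{\infty} \mathrm{e}^{\mathbf{A}^{T} t}\, \mathbf{C}^{T} \mathbf{C}\, \mathrm{e}^{\mathbf{A} t}\, \mathrm{d}t \right) \mathbf{x}_0 \geq 0, \tag{4-123}$$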
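where we define the observability Gramian $\mathbf{W}_O$:

Definition: Observability Gramian for linear systems
Given a linear system according to the definition in equation (4-14), which is assumed to be stable, the observability Gramian $\mathbf{W}_O$ is defined as:

$$\mathbf{W}_O := \int_{0}^{\infty} \mathrm{e}^{\mathbf{A}^{T} t}\, \mathbf{C}^{T} \mathbf{C}\, \mathrm{e}^{\mathbf{A} t}\, \mathrm{d}t \in \mathbb{R}^{n \times n}. \tag{4-124}$$

Therefore, we can say that for $\mathbf{x}_0 = \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix}^{T}$, the sum of the elements of $\mathbf{W}_O$ equals the output energy.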
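The Gramian $\mathbf{W}_O$ is a square, symmetric (due to the definition), non-negative (as the energy cannot be negative, even for negative initial values) matrix. We can therefore decompose it as $\mathbf{W}_O = \mathbf{M}^{T} \mathbf{M}$, which allows us to write equation (4-123) as

$$E = \mathbf{x}_0^{T} \mathbf{W}_O\, \mathbf{x}_0 = \mathbf{x}_0^{T} \mathbf{M}^{T} \mathbf{M}\, \mathbf{x}_0 = \mathbf{z}^{T} \mathbf{z} = \sum_{i=1}^{n} z_i^{2} \geq 0, \tag{4-125}$$

where $\mathbf{z} = \mathbf{M} \mathbf{x}_0$. From that it follows that $E$ is zero if and only if $\mathbf{z} = \mathbf{0}$. The equation

$$\mathbf{M}\, \mathbf{x}_0 = \mathbf{0} \tag{4-126}$$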
only has the trivial solution $\mathbf{x}_0 = \mathbf{0}$ if $\mathbf{M}$ (and therefore $\mathbf{W}_O$) is nonsingular. Thus we can conclude: If $E = 0$ for any $\mathbf{x}_0 \neq \mathbf{0}$, the observability Gramian $\mathbf{W}_O$ has to be singular, that is, it exhibits at least one zero eigenvalue. Also, according to equation (4-122) we can say that $E = 0$ if and only if $\mathbf{y}(t) = \mathbf{0}$ for all times. To summarize these statements, for a system with $\mathbf{x}_0 \neq \mathbf{0}$, the output can only be $\mathbf{y}(t) = \mathbf{0}$ for all times if $\mathbf{W}_O$ is singular. If this is the case and if there exist several possible initial state vectors that result in $\mathbf{y}(t) = \mathbf{0}$ for all times, we can clearly state that the system is not observable. Discussions in Fairman, 1998, which are not shown here, imply that for unobservable, stable systems it is possible to find several initial state vectors that result in the same output trajectories, based on compositions of eigenvectors of the system matrix. Therefore, we can draw the important conclusion that a linear system is observable if and only if its observability Gramian according to equation (4-124) is nonsingular, that is, it does not have zero as an eigenvalue.

On the other hand, the eigenvalues describe the normalized energy transfer of the single states to the output. That means that the smallest eigenvalue represents the state which is least observable. This allows for the comparison of different systems in terms of a ‘better’ observability.

In observable systems, the observability Gramian can be used to compute the initial state vector. After multiplying equation (4-124) with $\mathbf{x}_0$ from the right, it is possible to replace the term $\mathbf{C}\, \mathrm{e}^{\mathbf{A} t} \mathbf{x}_0$ according to equation (4-66) by the free motion $\mathbf{y}_{\mathrm{free}}(t)$ to obtain

$$\begin{aligned}
\mathbf{W}_O\, \mathbf{x}_0 &= \int_{0}^{\infty} \mathrm{e}^{\mathbf{A}^{T} t}\, \mathbf{C}^{T} \mathbf{C}\, \mathrm{e}^{\mathbf{A} t}\, \mathbf{x}_0\, \mathrm{d}t = \int_{0}^{\infty} \mathrm{e}^{\mathbf{A}^{T} t}\, \mathbf{C}^{T} \mathbf{y}_{\mathrm{free}}(t)\, \mathrm{d}t \\
\Rightarrow \; \mathbf{x}_0 &= \mathbf{W}_O^{-1} \int_{0}^{\infty} \mathrm{e}^{\mathbf{A}^{T} t}\, \mathbf{C}^{T} \mathbf{y}_{\mathrm{free}}(t)\, \mathrm{d}t.
\end{aligned} \tag{4-127}$$

As argued before, $\mathbf{W}_O^{-1}$ exists if the system is observable.
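As a minimal numerical sketch, the observability Gramian of a stable linear system can be obtained as the solution of the Lyapunov equation $\mathbf{A}^{T} \mathbf{W}_O + \mathbf{W}_O \mathbf{A} = -\mathbf{C}^{T} \mathbf{C}$, here with SciPy's `solve_continuous_lyapunov`. The matrices `A` and `C` below are hypothetical example values, chosen so that the result can be checked by hand.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical stable example system (eigenvalues -1 and -2), one output.
A = np.array([[-1.0, 0.0],
              [0.0, -2.0]])
C = np.array([[1.0, 1.0]])

# For stable A, the Gramian solves A^T W + W A = -C^T C.
W = solve_continuous_lyapunov(A.T, -C.T @ C)

# Output energy for x0 = [1, 1]^T equals the sum of all elements of W.
x0 = np.ones(2)
energy = x0 @ W @ x0

# W is symmetric, so eigvalsh applies; the smallest eigenvalue belongs
# to the least observable direction in state space.
eigenvalues = np.linalg.eigvalsh(W)
print(W, energy, eigenvalues)
```

For this example, the entries of the Gramian follow analytically as $W_{ij} = 1 / (-\lambda_i - \lambda_j)$ with $\lambda_1 = -1$ and $\lambda_2 = -2$. Both eigenvalues are positive, so the example system is observable.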
It is possible to compute $\mathbf{W}_O$ for linear stable systems using the Lyapunov equation. Details can be found in Fairman, 1998.

4.2.4.2 Empirical Gramian Matrix for Nonlinear Systems

The observability Gramian of a nonlinear system cannot be computed directly. Hence, it can be approximated by the so-called empirical observability Gramian, as suggested by Lall et al., 1999, Hahn and Edgar, 2000, and Hahn and Edgar, 2002. To this end, it is necessary to compute output trajectories $\mathbf{y}(t)$ by numerical simulations. As we have discussed before, the selection of the employed input signal is of particular interest in the observability analysis of nonlinear systems, if they are not considered autonomous. To cover the nonlinear character, it is necessary to perform simulations for different inputs and/or initial conditions and the
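corresponding state and output trajectories. One also speaks about a normalized/reference trajectory, and those affected by what is called excitation or perturbation. In what follows, we will define the empirical observability Gramian and relate it back to the linear one, based on discussions in Lall et al., 1999, and Hahn and Edgar, 2002. Looking at a nonlinear system according to the definition in equation (4-101), we define the following sets:

$$\begin{aligned}
\mathcal{T}^{n} &= \{ \mathbf{T}_1, \mathbf{T}_2, \dots, \mathbf{T}_r \} \text{ with } \mathbf{T}_i \in \mathbb{R}^{n \times n},\ \mathbf{T}_i^{T} \mathbf{T}_i = \mathbf{I},\ i = 1, \dots, r, \\
\mathcal{M} &= \{ c_1, c_2, \dots, c_s \} \text{ with } c_i \in \mathbb{R},\ c_i > 0,\ i = 1, \dots, s, \\
\mathcal{E}^{n} &= \{ e_1, e_2, \dots, e_n \} \text{ with } e_i \text{ standard unit vectors in } \mathbb{R}^{n}.
\end{aligned} \tag{4-128}$$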
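This leads to the following definition of the so-called empirical observability Gramian $\mathbf{W}_{O,\mathrm{emp}}$:

Definition: Empirical observability Gramian for nonlinear systems
Given a nonlinear system according to the definition in equation (4-101), which is assumed to be stable, and let $\mathcal{T}^{n}$, $\mathcal{M}$, and $\mathcal{E}^{n}$ be sets according to equation (4-128), the empirical observability Gramian $\mathbf{W}_{O,\mathrm{emp}}$ is defined as:

$$\mathbf{W}_{O,\mathrm{emp}} := \sum_{l=1}^{r} \sum_{m=1}^{s} \frac{1}{r s c_m^{2}} \int_{0}^{\infty} \mathbf{T}_l\, \boldsymbol{\Psi}^{lm}(t)\, \mathbf{T}_l^{T}\, \mathrm{d}t, \tag{4-129}$$

with $\boldsymbol{\Psi}^{lm}(t) \in \mathbb{R}^{n \times n}$ given by

$$\Psi_{ij}^{lm}(t) = \left( \mathbf{y}^{ilm}(t) - \mathbf{y}_0(t) \right)^{T} \left( \mathbf{y}^{jlm}(t) - \mathbf{y}_0(t) \right),$$

$\mathbf{y}_0(t)$: nominal/reference output trajectory,
$\mathbf{y}^{ilm}(t)$: output trajectory for the initial condition $\mathbf{x}(0) = c_m \mathbf{T}_l\, e_i$.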
It can be shown that the empirical observability Gramian degenerates into the linear one if a linear system is used. If we consider an autonomous linear system, we can say that according to equation (4-66), it holds true that

$$\mathbf{y}(t) = \mathbf{y}_{\mathrm{free}}(t) = \mathbf{C}\, \mathrm{e}^{\mathbf{A} t}\, \mathbf{x}_0. \tag{4-130}$$
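For two different realizations of $\mathbf{y}(t)$, $i$ and $j$, based on the different initial conditions $\mathbf{x}_{0,i}$ and $\mathbf{x}_{0,j}$, and assuming that the nominal initial state is $\mathbf{x}_0 = \mathbf{0}$ and hence $\mathbf{y}_0(t) = \mathbf{0}$ for all $t$, we can say according to the definitions made in equation (4-129):

$$\begin{aligned}
\Psi_{ij}^{lm}(t) &= \left( \mathbf{C}\, \mathrm{e}^{\mathbf{A} t} c_m \mathbf{T}_l\, e_i \right)^{T} \left( \mathbf{C}\, \mathrm{e}^{\mathbf{A} t} c_m \mathbf{T}_l\, e_j \right) = c_m^{2}\, e_i^{T} \mathbf{T}_l^{T} \mathrm{e}^{\mathbf{A}^{T} t}\, \mathbf{C}^{T} \mathbf{C}\, \mathrm{e}^{\mathbf{A} t}\, \mathbf{T}_l\, e_j \\
\Rightarrow \; \boldsymbol{\Psi}^{lm}(t) &= c_m^{2}\, \mathbf{T}_l^{T} \mathrm{e}^{\mathbf{A}^{T} t}\, \mathbf{C}^{T} \mathbf{C}\, \mathrm{e}^{\mathbf{A} t}\, \mathbf{T}_l \\
\Rightarrow \; \mathbf{W}_{O,\mathrm{emp}} &= \sum_{l=1}^{r} \sum_{m=1}^{s} \frac{1}{r s} \int_{0}^{\infty} \mathrm{e}^{\mathbf{A}^{T} t}\, \mathbf{C}^{T} \mathbf{C}\, \mathrm{e}^{\mathbf{A} t}\, \mathrm{d}t = \mathbf{W}_O,
\end{aligned} \tag{4-131}$$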
which proves that the empirical Gramian equals the linear one when applied to a linear system.

By performing a singular value decomposition of the empirical observability Gramian, yielding the singular values $\lambda(\mathbf{W}_{O,\mathrm{emp}})$, one can assess observability, if a ‘typical’ trajectory in the sense formulated in Lall et al., 1999, is used. In this case, the largest singular value $\lambda_{\max}(\mathbf{W}_{O,\mathrm{emp}})$ represents the energy transfer from the single state that yields the best possible observability. On the other hand, the smallest singular value $\lambda_{\min}$ quantifies the energy transfer from the least observable state to the outputs, thus characterizing the ‘bottleneck’ of observability. Comparing two systems with a similar structure, the system whose Gramian exhibits the larger smallest singular value is ‘best observable’. We will consider this method in the discussions in chapters 6 and 7. A system is unobservable if $\lambda_{\min}$ is zero. Note that for a square, symmetric matrix, singular value decomposition is equivalent to diagonalization, or solution of the eigenvalue problem (Wall et al., 2003).
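The following Python sketch approximates the empirical observability Gramian of an autonomous system by simulation, assuming a finite integration horizon and trapezoidal integration. The function name `empirical_obsv_gramian`, the horizon `t_end`, the perturbation sets and the linear test system are all assumptions made for illustration; the linear system is used only because the result can then be compared with the linear Gramian.

```python
import numpy as np
from scipy.integrate import solve_ivp, trapezoid


def empirical_obsv_gramian(f, g, n, T_set, c_set, t_end=20.0, steps=2001):
    """Approximate the empirical Gramian on [0, t_end] for x' = f(x), y = g(x)."""
    t = np.linspace(0.0, t_end, steps)

    def output(x_init):
        sol = solve_ivp(lambda _, x: f(x), (0.0, t_end), x_init,
                        t_eval=t, rtol=1e-9, atol=1e-12)
        return np.array([np.atleast_1d(g(x)) for x in sol.y.T])  # steps x p

    y_ref = output(np.zeros(n))            # nominal trajectory from x(0) = 0
    W = np.zeros((n, n))
    for T in T_set:
        for c in c_set:
            # output deviations for the initial conditions x(0) = c * T * e_i
            dev = [output(c * T[:, i]) - y_ref for i in range(n)]
            Psi = np.array([[trapezoid(np.sum(dev[i] * dev[j], axis=1), t)
                             for j in range(n)] for i in range(n)])
            W += T @ Psi @ T.T / (len(T_set) * len(c_set) * c**2)
    return W


# Hypothetical linear test system, used only to check the sketch.
A = np.array([[-1.0, 0.0], [0.0, -2.0]])
C = np.array([[1.0, 1.0]])
W_emp = empirical_obsv_gramian(lambda x: A @ x, lambda x: C @ x, n=2,
                               T_set=[np.eye(2), -np.eye(2)],
                               c_set=[0.5, 1.0])
singular_values = np.linalg.svd(W_emp, compute_uv=False)
print(W_emp, singular_values)
```

For a nonlinear system, `f` and `g` would be replaced by the nonlinear model, and the result generally depends on the chosen perturbation sizes in `c_set`.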
4.3 Parameter and Variable Estimation

In the discussions of this chapter, we have so far assumed that all systems and signals exhibit a deterministic behavior, that is, that their behavior can be predicted with any arbitrary exactness, as long as enough labor is put into a precise modelling approach. For real systems, this is not the case. Measurements of real signals might be overlaid by noise whose behavior cannot be predicted with arbitrary exactness. The level of detail with which the modelling of a system can be performed is usually limited by several issues. For instance, in any real application, the available resources for the modelling must be kept in mind. On the other hand, as discussed before, in certain applications (like the external navigation) there might be a lack of knowledge about what exactly is going on within a system with which only limited communication is possible. In all the mentioned situations, the need arises to describe and to handle the influence of nondeterministic signals.

If a signal is classified as stochastic, it means that its behavior cannot be predicted with arbitrary exactness. However, even stochastic signals might follow certain limitations. They might exhibit a constant mean value, or their rate of change might be limited. The exploitation of these characteristics might allow for an estimation of signal values, even if their exact course cannot be predicted. We use the term ‘estimation’ here to distinguish it from the ‘observation’ of deterministic signals, as it was discussed in section 4.2.

Within this section, we will start with a general introduction into the field of stochastic variables and processes. We will then introduce general possibilities to estimate values based on noisy measurements, where we will distinguish between random and non-random parameters.
This will provide us with the necessary tools to introduce a concept for a model-based estimation of system states based on noisy output measurements, which can be performed by a standard Kalman filter, if the system is linear. We will extend the discussions for nonlinear systems by introducing the Extended Kalman filter and the Unscented Kalman filter.

4.3.1 Basics of Stochastic Variables and Signals

The concept of probability theory is introduced in section 4.3.1 on a very basic level, following discussions in Brammer and Siffling, 1990 (German) as well as Ross, 2014, and Grimmett and Stirzaker, 2009 (both English).

4.3.1.1 Probability Experiments, Events, and Probability Measures

We start by looking at a probability experiment whose result cannot be predicted with certainty; however, the results can only lie within a possible set. For example, when flipping a coin, there are two possible outcomes: (H)ead or (T)ail. Both outcomes form the so-called
sample space $\mathcal{S}$, that is, a set which contains all possible results of the probability experiment. For the stated example, it holds true that

$$\mathcal{S} = \{ H, T \}. \tag{4-132}$$
If we look at the throwing of a die, there are six different outcomes; we can say that

$$\mathcal{S} = \{ 1, 2, 3, 4, 5, 6 \}. \tag{4-133}$$
Any subset of the sample space is referred to as an event and usually abbreviated by a capital letter. In the die example, this can be one concrete outcome (e.g. ‘3’), or several outcomes can be summarized into one event. For instance, one could define event $A$ as the case that the number on the die is even, or event $B$ that the number on the die is smaller than 3:

$$A = \{ x \in \mathcal{S} \mid x \text{ is even} \} = \{ 2, 4, 6 \}; \quad B = \{ x \in \mathcal{S} \mid x < 3 \} = \{ 1, 2 \}. \tag{4-134}$$
Note that after each realisation of the probability experiment, we can state for every event whether it is true (it has occurred) or it is false (it has not occurred). It is common to use the notations ‘1’ for true and ‘0’ for false. It is possible to combine events to form new ones. In the example above, we could define an event $C$ which is only true when both events $A$ and $B$ are true. This combination is called an intersection and written as

$$C = A \cap B = AB = \{ 2 \}. \tag{4-135}$$
Note that this combination is written with the operator $\cap$ or without any operator; this is because if the notation with ‘1’ and ‘0’ is used, it behaves like a multiplication. When building an intersection, it is possible that the newly formed event has an empty set of possible outcomes. For instance, if we define event $D$ as the cases where the number on the die is larger than 4, it is true that

$$D = \{ x \in \mathcal{S} \mid x > 4 \} = \{ 5, 6 \} \; \Rightarrow \; BD = \emptyset, \tag{4-136}$$

where $\emptyset$ refers to the empty set. Events whose intersection is empty are denoted as mutually exclusive. Another combination of events is referred to as union. It is true if at least one of the combined events is true:

$$E = A \cup B = A + B = \{ 1, 2, 4, 6 \}. \tag{4-137}$$
Analogously to the situation discussed above, both $\cup$ and $+$ are used as operators for a union, as the union only equals $\emptyset$ if all combined events equal $\emptyset$.
Finally, it is possible to select all outcomes of the sample space which are not contained in an event. That operation is called complement. For instance, we could define an event containing all odd numbers on the die as the complement of $A$:

$$F = \{ x \in \mathcal{S} \mid x \text{ is odd} \} = \{ 1, 3, 5 \} = \bar{A}. \tag{4-138}$$
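The event operations of the die example map directly onto set operations. The following short Python sketch reproduces the events $A$ to $F$ with built-in sets; the variable names simply follow the event names used in the text.

```python
# Sample space of a single die throw and the events from the text
S = {1, 2, 3, 4, 5, 6}
A = {x for x in S if x % 2 == 0}   # number is even
B = {x for x in S if x < 3}        # number is smaller than 3
D = {x for x in S if x > 4}        # number is larger than 4

C = A & B          # intersection: both A and B are true
E = A | B          # union: at least one of A and B is true
F = S - A          # complement of A within the sample space

# B and D share no outcome, so they are mutually exclusive
mutually_exclusive = (B & D) == set()
print(C, E, F, mutually_exclusive)
```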
Figure 4‐25: Venn diagrams of intersection, union and complement of events
The issues under discussion can be graphically represented by the so-called Venn diagrams, which show the different events and their combinations as sets within the sample space. Figure 4-25 displays these diagrams for the intersection, union and complement according to the recently discussed examples. It might be of particular interest to find a measure for the frequency with which a particular event will occur when the probability experiment is executed. It is straightforward to define a relative frequency $h_N(A)$ of a particular event $A$ as the quotient of the number of experiments in which $A$ occurred, denoted as $n(A)$, by the total number of experiments, $N$:

$$h_N(A) = \frac{n(A)}{N}. \tag{4-139}$$
With that, the probability of an event $A$, $P(A)$, also referred to as probability measure, can be defined to be

$$P(A) = \lim_{N \to \infty} h_N(A). \tag{4-140}$$
At this point, it must be kept in mind that it is neither self-evident that $h_N(A)$ will converge towards a final value for $N \to \infty$, nor that this value will always be the same. In fact, these two issues are formulated as assumptions, or axioms, and are mainly based on experience. If in a probability experiment, all possible outcomes are equally likely to occur, it is straightforward to compute the probability of an event according to the Laplacian definition of probability:

$$P(A) = \frac{\text{number of outcomes in } A}{\text{total number of outcomes in } \mathcal{S}}. \tag{4-141}$$

For the examples discussed here, it can be stated that $P(A) = 3/6 = 0.5$, $P(B) = 1/3$, $P(C) = 1/6$.
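As a small numerical illustration, the relative frequency of an event can be compared with its Laplacian probability by simulating many die throws. The helper names, the number of trials and the fixed random seed are assumptions chosen for this sketch.

```python
import random
from fractions import Fraction

S = [1, 2, 3, 4, 5, 6]   # sample space of a single die throw
A = {2, 4, 6}            # event: number is even


def laplace_probability(event, sample_space):
    """Favourable outcomes divided by all outcomes (equally likely case)."""
    return Fraction(len(event), len(sample_space))


def relative_frequency(event, n_trials, rng):
    """Share of simulated throws in which the event occurred."""
    hits = sum(rng.choice(S) in event for _ in range(n_trials))
    return hits / n_trials


rng = random.Random(0)                  # fixed seed for reproducibility
p_A = laplace_probability(A, S)
h_A = relative_frequency(A, 100_000, rng)
print(p_A, h_A)
```

With a growing number of trials, the relative frequency is expected to settle close to the Laplacian value of 0.5, although individual runs will deviate slightly.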
The Russian mathematician Kolmogoroff introduced the three axioms of probability, which assign similar properties to the notion of probability as done before by the relative frequency:

The three axioms of probability (according to Kolmogoroff):

Axiom 1: For every event $A$, there is a real non-negative probability, denoted as $P(A)$, which cannot exceed 1:

$$0 \leq P(A) \leq 1. \tag{4-142}$$

Axiom 2: The probability of the whole sample space $\mathcal{S}$ equals 1:

$$P(\mathcal{S}) = 1. \tag{4-143}$$

Axiom 3: Given any sequence of mutually exclusive events $A_1, A_2, \dots$, the probability of their union equals the sum of their individual probabilities:

$$P\left( \bigcup_{i=1}^{\infty} A_i \right) = \sum_{i=1}^{\infty} P(A_i) \quad \text{if } A_i \cap A_j = \emptyset \text{ for all } i, j \in \mathbb{N},\ i \neq j. \tag{4-144}$$
It is common to define a so-called probability space $\Omega = (\mathcal{S}, M, P)$, that is, a triple consisting of a sample space $\mathcal{S}$, a set of possible events denoted as $M$, and a probability measure $P$ as a function returning the probabilities of the elements in $M$ (Grimmett and Stirzaker, 2009).

4.3.1.2 Conditional Probability

At this point, we can introduce one of the most important concepts within probability theory. In real scenarios, we might often have some particular information about a process, which we want to use. Let us look at the events $A$ and $B$ according to equation (4-134). Let us further assume that we already know that event $B$ has occurred. What is, in this case, the probability of event $A$? We call this the conditional probability that $A$ occurs given that $B$ has occurred and write this as $P(A|B)$. In order to compute this conditional probability, we have to keep in mind that for $A|B$ to be true, $A \cap B$ has to be true, whereas we already know that $B$ is true. That means, $P(A|B)$ can be computed as $P(A \cap B)$, but for a reduced sample space, as only outcomes that are members of $B$ are possible. Also, if event $B$ can never happen, the same is true for $A|B$. Therefore, we obtain:

$$P(A|B) = \begin{cases} \dfrac{P(A \cap B)}{P(B)} & \text{for } P(B) > 0 \\[2mm] 0 & \text{for } P(B) = 0 \end{cases}. \tag{4-145}$$
In the example stated above, we can say that if $B$ has occurred, only the outcomes $\{1, 2\}$ are still possible, while it is reasonable to assume that the probability for both is equal. Now $A$ can only be true for the outcome $\{2\}$. Therefore, we can see that $P(A|B) = 0.5$. Keeping in mind that $P(B) = 1/3$ and $P(C) = P(A \cap B) = 1/6$, we obtain the same result from equation (4-145).
The order of the events in the conditional probability is of great importance. The term $P(B|A)$ represents the conditional probability that $B$ occurs given that $A$ has occurred. If $A$ has occurred, the possible outcomes are $\{2, 4, 6\}$. That means, $B$ is only true in one of three cases, resulting in $P(B|A) = 1/3$. Again, this result can be obtained by equation (4-145) with $P(A) = 1/2$. Interestingly enough, the two conditional probabilities we have computed are not equal. Employing equation (4-145) for $P(A|B)$ and $P(B|A)$, we can equate the term $P(A \cap B)$ to obtain:

$$\begin{aligned}
P(A|B) &= \frac{P(A)}{P(B)}\, P(B|A) && \text{for } P(A), P(B) > 0 \\
\Rightarrow \; P(A|B) &= P(B|A) && \text{for } P(A) = P(B).
\end{aligned} \tag{4-146}$$
This law is denoted as Bayes' theorem, named after the English mathematician Thomas Bayes. Another important law, the law of total probability, can be employed to compute the so-called total probability $P(A)$ from the conditional probabilities $P(A|B)$ and $P(A|\bar{B})$ and the total probability $P(B)$. It is straightforward to see that

$$P(A) = P(A|B)\, P(B) + P(A|\bar{B})\, P(\bar{B}). \tag{4-147}$$
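The conditional probabilities of the die example, Bayes' theorem and the law of total probability can be checked with exact fractions. The following Python sketch assumes equally likely outcomes; the helper names `P` and `P_cond` are chosen for illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space of a single die throw
A = {2, 4, 6}            # number is even
B = {1, 2}               # number is smaller than 3


def P(event):
    """Laplacian probability for equally likely outcomes."""
    return Fraction(len(event & S), len(S))


def P_cond(a, b):
    """Conditional probability P(a|b), defined as 0 if P(b) = 0."""
    return P(a & b) / P(b) if P(b) > 0 else Fraction(0)


p_a_given_b = P_cond(A, B)
p_b_given_a = P_cond(B, A)

# Bayes' theorem: P(A|B) = P(A) / P(B) * P(B|A)
bayes_rhs = P(A) / P(B) * p_b_given_a

# Law of total probability with the complement of B
B_bar = S - B
total = P_cond(A, B) * P(B) + P_cond(A, B_bar) * P(B_bar)
print(p_a_given_b, p_b_given_a, bayes_rhs, total)
```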
These findings will be of importance when introducing the Bayes Estimation in section 4.3.2.1.

4.3.1.3 Stochastic Variables
Figure 4‐26: Thought experiment for the appearance of stochastic variables and their random variates
Up to now, we have discussed distinct events that can occur or not if a probability experiment is executed. But our goal is to find quantitative measures for stochastic signals, like measurement noise. To this end, we define the stochastic variables, for which we introduce the following thought experiment:
Assume we have a probability space $\Omega = (\mathcal{S}, M, P)$. Further assume that in our case $M$ contains a large (and possibly infinite) number of elementary events, that is, elements that consist of only one possible outcome, denoted as $m_i$, $i = 1, \dots, N_m$, and which are further all mutually exclusive. Now, assume there is a variable $X$ whose current value depends on the concrete event $m_i$ which is obtained when the experiment is executed. That means, for every elementary event there is a concrete random variate $x(m_i)$ which the variable $X$ will adopt if $m_i$ occurs. The scheme is depicted in Figure 4-26 for two different stochastic variables. Note that it is possible that several elementary events can be linked to an identical random variate.

At this point, we can already distinguish between two different kinds of stochastic variables: If a variable can take a finite or at most a countably infinite number of random variates, it is denoted as discrete. A stochastic variable which can take an uncountably infinite number of random variates is called continuous. Let us look at two simple examples. If we define $X$ as the number on a die after a single throw, $X$ is discrete, as it can only take six different variates. Let $Y$ be the lifetime of an electric circuit from the time it was produced until it breaks. In this case, $Y$ can take an uncountable number of values, therefore this stochastic variable is continuous.

In the following, we will discuss measures of stochastic variables and in this context find an important difference between discrete and continuous ones. Figure 4-27, which was inspired by Brammer and Siffling, 1990 as well as Ament and Glotzbach, 2016b, summarizes the upcoming discussion and shows the different probability and distribution functions as well as the algorithms to convert them. Let us start with the discrete variables.
From the definition and the scheme in Figure 4-26, it is straightforward to state that the probability that a variable $X$ takes any concrete variate $x_i$ equals the probability of the occurrence of event $m_i$ (or the sum of the several ones, but for the sake of simplicity we shall further assume that the links between $m_i$ and $x(i)$ are mutually exclusive):

$$p(x_i) := P(X = x_i) = P(m_i). \tag{4-148}$$
The function $p(x)$ is referred to as Probability Mass Function (PMF) for a discrete stochastic variable $X$. It assigns a concrete probability value to all possible variates of $X$. A typical graph of a PMF is given in the upper left part of Figure 4-27. Note that according to the definitions made before, it holds true that

$$\sum_{i} p(X = x_i) = 1. \tag{4-149}$$
If we define $X$ as the sum of the numbers obtained by throwing two dice, one can imagine that the PMF would look the way it is depicted in the upper left part of Figure 4-27. Low and high variates exhibit a smaller probability, as they can only be achieved with a few combinations, while the middle numbers have the highest probability to occur. In practical applications, it might be of interest with which probability the value of a stochastic variable remains under an upper bound. For discrete stochastic variables, it is straightforward to compute this information, as one can simply sum up the values of the PMF for all $x(i)$ smaller than the defined bound. This gives rise to the Cumulative Distribution Function (CDF) $F(x)$, which returns for any concrete value the probability that the random variate after an
execution of the experiment is smaller than or equal to the specified value. This can be expressed as

$$F(x) := F(X \leq x) = \sum_{x_i \leq x} p(X = x_i) = \sum_{x_i \leq x} p(x_i). \tag{4-150}$$
Figure 4‐27: Probability and distribution functions of discrete and continuous stochastic variables
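For the discrete variable ‘sum of the numbers after throwing two dice’, the PMF and the CDF can be enumerated exactly. The following Python sketch assumes fair dice; the names `pmf` and `cdf` are chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely outcomes yield each sum
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# PMF: probability of each possible sum 2, ..., 12
pmf = {x: Fraction(n, 36) for x, n in sorted(counts.items())}


def cdf(x):
    """Probability that the sum is smaller than or equal to x."""
    return sum((p for xi, p in pmf.items() if xi <= x), Fraction(0))


print(pmf[4], pmf[7], cdf(4), cdf(12))
```

The PMF values sum up to one, and the CDF rises monotonically from 0 below the smallest possible sum to 1 at the largest one.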
If we consider again the example with the two dice and a maximum sum is given, one can simply add all the values of the PMF for sums smaller than or equal to this maximum. This would result in a CDF as displayed in the upper right part of Figure 4-27. Note that, as $p(x)$ is never negative, the CDF is a monotonically increasing function, which returns the values 0 and 1 for $X$ tending towards the minimum and the maximum possible value, respectively. The CDF is also defined for continuous stochastic variables, where it exhibits the same properties as just discussed. For continuous variables, there is also a counterpart of the PMF, which is called Probability Density Function (PDF) $f(x)$. Both functions can be converted into each other by

$$F(x) = \int_{-\infty}^{x} f(\tau)\, \mathrm{d}\tau; \quad f(x) = \frac{\mathrm{d}F(x)}{\mathrm{d}x}. \tag{4-151}$$
In the lower part of Figure 4-27, both functions for continuous stochastic variables are displayed. As discussed before for discrete variables in equation (4-149), it can also be stated that

$$\int_{-\infty}^{\infty} f(x)\, \mathrm{d}x = 1. \tag{4-152}$$
As mentioned, there is an important difference between the handling of discrete and continuous stochastic variables, especially regarding their PMF and PDF, respectively. For the discrete variable ‘sum of the numbers after throwing two dice’, we can directly find the probability of any random variate in the PMF. If we want to know the probability that the sum equals 4, the corresponding probability is $3/36 \approx 8.33\%$, as there are three elementary events $(1,3)$, $(2,2)$, $(3,1)$ that will result in the desired variate out of 36 possibilities. We can interpret this result as follows: if we repeat the experiment very often, we will achieve the sum of 4 in about 8.33% of all tries.

Now if we look at the example stated before, where we consider the lifetime of an electric circuit as continuous stochastic variable, the PDF might also show a course like the one in Figure 4-27. We might find a value of 20% for $x$ equal to exactly 5 years, for instance. But how can this be interpreted? For sure, if we look at a large number of these circuits, it is not plausible that exactly 20% of them will break down after exactly 5 years (and 0 days, 0 hours, 0 minutes, 0.000… seconds). Due to the fact that we have an uncountable number of random variates, we cannot assign a probability value to any concrete value of $x$. In fact, even if the PDF might return a value of 20%, the probability of any concrete value out of an uncountably infinite number of possibilities always tends towards zero. Therefore, we have to compute the probabilities of defined intervals of $x$ by employing the CDF. For instance, we could compute the probability of a breakdown of the circuit in the first 5 years of operation, or between month 59 and 61, by adapting the integration interval in equation (4-151). By multiplying the second equation in (4-151) by $\mathrm{d}x$, we can say that:

$$f(x)\, \mathrm{d}x = P(x < X \leq x + \mathrm{d}x). \tag{4-153}$$
That means, for small $\mathrm{d}x$, $f(x)\, \mathrm{d}x$ represents the probability that $X$ is between $x$ and $x + \mathrm{d}x$.
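To make the interval view concrete, the following Python sketch evaluates the two interval probabilities mentioned above for the circuit lifetime. The text does not specify a distribution, so an exponential lifetime with a mean of 10 years is assumed here purely for illustration.

```python
import math

MEAN_LIFETIME = 10.0   # years; hypothetical value for this sketch


def pdf(x):
    """Density of the assumed exponential lifetime distribution."""
    return math.exp(-x / MEAN_LIFETIME) / MEAN_LIFETIME if x >= 0 else 0.0


def cdf(x):
    """Probability that the circuit breaks within the first x years."""
    return 1.0 - math.exp(-x / MEAN_LIFETIME) if x >= 0 else 0.0


# Breakdown within the first 5 years of operation
p_first_5_years = cdf(5.0)

# Breakdown between month 59 and month 61
p_month_59_to_61 = cdf(61 / 12) - cdf(59 / 12)

# Approximation f(x) dx around x = 5 years with dx = 2 months
approx = pdf(5.0) * (2 / 12)
print(p_first_5_years, p_month_59_to_61, approx)
```

For the short two-month window, the product of density and interval width nearly coincides with the exact CDF difference, whereas the density value alone is not a probability.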
4.3.1.4 Normal (or Gaussian) Distribution

Stochastic variables can be classified by the nature of their distribution/density functions. The most important distribution in practical applications of continuous variables is called Normal (or Gaussian, after the German mathematician Carl-Friedrich Gauss) distribution. If a variable is normally distributed, then its PDF can be computed as

$$f(x) = \frac{1}{\sqrt{2 \pi}\, \sigma}\, \mathrm{e}^{-\frac{(x - \mu)^{2}}{2 \sigma^{2}}}, \tag{4-154}$$
where $\mu$ and $\sigma$ are adjustable parameters we will introduce shortly. The graph of the PDF for a normally distributed variable exhibits the shape of a bell which is symmetric about $\mu$. At $\mu$, the function also has its maximum at a value of $1 / (\sqrt{2 \pi}\, \sigma)$. Figure 4-28 on page 144 shows an example. The great importance of the normal distribution is related to the Central Limit Theorem (CLT). This theorem establishes that in real applications, many random phenomena obey the normal distribution, or can at least be approximated as normally distributed. This is especially true when several stochastic variables are summed up, and a sufficiently large number of realizations
for each variable is performed. Interestingly, the individual variables involved can exhibit arbitrary distributions; still, their sum can at least be approximated by a normal distribution. According to Ross, 2014, typical examples of real-world phenomena which obey the normal distribution are the height of a man or a woman, the velocity in any direction of a molecule in a gas, and the measurement errors of physical quantities. Especially the latter is of great importance for the applications discussed within this thesis. We are actually looking for a way to describe the influence of measurement errors on the observation process of a system model in state space representation. Therefore, the concept of normally distributed stochastic variables as a model for measurement errors will be employed in chapters 5‐7.

4.3.1.5 Expected Value and Variance

Let us look at a discrete stochastic variable $X$ with $N_e$ elementary events. Let us further assume that the underlying probability experiment has been executed $N$ times, with $N \gg N_e$. That means that $N$ values $x$ for $X$ have been obtained, which can be classified into $N_e$ groups, denoted as $x(i)$, $i = 1, \dots, N_e$. The question might arise which mean value $\bar{x}$ the variable took. We can use a gamble as a practical example: Assume that a player has to pay €5 to participate. Then, a coin is tossed. If it shows heads, the player wins €9; otherwise he gets nothing. One could be interested in whether the player will win or lose money if he plays the game for a long time, and which amount he might win or lose in each game on average. Let us assume that $X$ took $n_1$ times the variate $x(1)$, $n_2$ times the variate $x(2)$, …, and $n_{N_e}$ times the variate $x(N_e)$, so that

$$ n_1 + n_2 + \dots + n_{N_e} = N \tag{4-155} $$

holds true. Then it is straightforward to compute the mean value $\bar{x}$ as

$$ \bar{x} = \frac{1}{N}\Big( n_1\,x(1) + n_2\,x(2) + \dots + n_{N_e}\,x(N_e) \Big) = \sum_{i=1}^{N_e} x(i)\,\frac{n_i}{N} = \sum_{i=1}^{N_e} x(i)\,h\big(x(i)\big) , \tag{4-156} $$

where we have inserted the relative frequency $h(x(i))$ according to equation (4‐139) in the last step. As we have seen in equation (4‐140), the relative frequency can be replaced by the probability measure for large $N$. As we have introduced $N \gg N_e$ before, we can state that

$$ \bar{x} = \sum_{i=1}^{N_e} x(i)\,P\big(X = x(i)\big) = \sum_{i=1}^{N_e} x(i)\,p\big(x(i)\big) . \tag{4-157} $$

Also, for continuous stochastic variables, the mean value can be computed to be

$$ \bar{x} = \int_{-\infty}^{\infty} x\,f(x)\,\mathrm{d}x . \tag{4-158} $$
In the definition according to equation (4‐157) or (4‐158), respectively, the mean value is also denoted as the expected value $E(x)$ or $\mu$. The expected value can therefore be considered as the
mean value that a stochastic variable takes when the number of realizations is very high. It is also denoted as the first moment of the stochastic variable. Also, it is the parameter $\mu$ in the definition of the normal distribution in equation (4‐154). In general, one can distinguish between mean value and expected value. The expected value is a constant quantity that can be considered as a property of a stochastic variable. The mean value is always the result of a number of concrete realizations of the variable. If the number of realizations is small, there might be a significant difference between the (theoretical) expected value and the (practically calculated) mean value. As the number of realizations tends towards infinity, the mean value will tend towards the expected value. A stochastic variable is denoted as zero‐mean if and only if its expected value is zero, $E(x) = 0$. At this point it shall be mentioned that we use the notation $E(x)$ with the lower-case $x$ rather than $E(X)$, which would be more precise. For the random variables employed later, we will not distinguish between the variable and its variates and will usually employ lower-case letters for notational convenience. In our example introduced before, the expected value of the gamble can be computed to be

$$ E(x) = x(\text{Head})\,p\big(x(\text{Head})\big) + x(\text{Tail})\,p\big(x(\text{Tail})\big) = (\text{€}9 - \text{€}5)\cdot 0.5 + (-\text{€}5)\cdot 0.5 = -\text{€}0.50 . \tag{4-159} $$
That means the player loses €0.50 per game on average. Let us look at another gamble that works similarly. The player has to pay €1000 per game, and he receives €1999 if and only if the coin shows heads. Employing equation (4‐159), we can again compute the expected value, which would again be $E(x) = -\text{€}0.50$. Nevertheless, we can clearly state that the game is different. It might grant a higher win for the player if he wins the first game and then quits, but he might also lose more, especially if he has a losing streak which might quickly leave him out of money. But as the expected value is the same as for the first game, the average loss per game will be the same as for the first game if he plays for a long time. From this example, we see the need for a measure that describes how far away from the expected value the single values are placed. We can compute this quantity as a mean of the difference between the single values and the expected value. In order to prevent positive and negative differences from cancelling each other, the differences are squared. This gives rise to a quantity which is denoted as variance $\operatorname{Var}(x)$ or $\sigma^2$ and can be computed to be:

$$
\begin{aligned}
\operatorname{Var}(x) &= E\big((x-\mu)^2\big) , \\
\Rightarrow\ \operatorname{Var}(x) &= \sum_{i=1}^{N_e} \big(x(i)-\mu\big)^2\,p\big(x(i)\big) && \text{for discrete variables,} \\
\Rightarrow\ \operatorname{Var}(x) &= \int_{-\infty}^{\infty} (x-\mu)^2\,f(x)\,\mathrm{d}x && \text{for continuous variables.}
\end{aligned}
\tag{4-160}
$$
An easier way to compute the variance of a discrete variable might be the following one:
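$$
\begin{aligned}
\operatorname{Var}(x) &= \sum_{i=1}^{N_e} \big(x(i)-\mu\big)^2\,p\big(x(i)\big) = \sum_{i=1}^{N_e} \big(x(i)^2 - 2\mu\,x(i) + \mu^2\big)\,p\big(x(i)\big) \\
&= \sum_{i=1}^{N_e} x(i)^2\,p\big(x(i)\big) - 2\mu \sum_{i=1}^{N_e} x(i)\,p\big(x(i)\big) + \mu^2 \sum_{i=1}^{N_e} p\big(x(i)\big) \\
&= E(x^2) - 2\mu\,E(x) + \mu^2 = E(x^2) - 2\mu^2 + \mu^2 \\
&= E(x^2) - \mu^2 = E(x^2) - \big(E(x)\big)^2 .
\end{aligned}
\tag{4-161}
$$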
In this respect, $E(x^2)$ is denoted as the second moment of $x$, while $\operatorname{Var}(x)$ is referred to as the second central moment. Furthermore, the variance $\sigma^2$ is the second parameter of the normal distribution in equation (4‐154). In the example of the two gambles, we can employ equation (4‐160) to compute the different variances. For gamble 1, we obtain $\operatorname{Var}(x) = 20.25\ \text{€}^2$, while for gamble 2, $\operatorname{Var}(x) = 999{,}000.25\ \text{€}^2$. This clearly shows that the values of the stochastic variable ‘amount of win’ lie at a greater distance from the expected value in gamble 2. This means there is a chance for a higher win than in gamble 1 if one plays the game only a limited number of times; on the other hand, there is the same chance for a higher loss. In particular, due to the high stakes in gamble 2, the player might be bankrupt quickly if he hits a losing streak. We introduce the following notation to be used in the further course of this thesis: Let $X$ be a stochastic variable with expected value $\mu$ and variance $\sigma^2$. Then we write

$$ X \sim (\mu, \sigma^2) \quad \text{or} \quad X \sim \mathcal{N}(\mu, \sigma^2) \ \text{ if } X \text{ is normally distributed.} \tag{4-162} $$
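The expected values and variances of the two gambles discussed above can be checked with a few lines of code. The sketch below represents each gamble as a PMF over the net win (a representation chosen here for illustration) and evaluates the definitions (4‐157) and (4‐160) as well as the shortcut (4‐161), using exact fractions to avoid rounding effects.

```python
from fractions import Fraction

def expected_value(pmf):
    """E(x) = sum of x(i) * p(x(i)), cf. equation (4-157)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(x) = sum of (x(i) - mu)^2 * p(x(i)), cf. equation (4-160)."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def variance_via_second_moment(pmf):
    """Var(x) = E(x^2) - E(x)^2, cf. equation (4-161)."""
    second_moment = sum(x ** 2 * p for x, p in pmf.items())
    return second_moment - expected_value(pmf) ** 2

half = Fraction(1, 2)
# Net win in euros: prize minus stake for heads, minus stake for tails.
gamble_1 = {Fraction(9 - 5): half, Fraction(-5): half}
gamble_2 = {Fraction(1999 - 1000): half, Fraction(-1000): half}

for name, pmf in (("gamble 1", gamble_1), ("gamble 2", gamble_2)):
    print(name,
          "E(x) =", float(expected_value(pmf)),
          "Var(x) =", float(variance(pmf)))
```

Both gambles yield the same expected value, while the variances differ by several orders of magnitude, and both variance formulas return identical results.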
As we have just seen, the unit of the variance is always the square of the unit of the variable. It is straightforward to define a quantity with the same unit as the variable by taking the square root of the variance. This quantity is referred to as standard deviation $\sigma$:

$$ \sigma = \sqrt{\operatorname{Var}(x)} = \sqrt{\sigma^2} . \tag{4-163} $$

The standard deviation is just another measure for the mean distance from the mean value that the single realizations of $x$ exhibit. Interestingly enough, for normally distributed variables it holds true that approximately 68.3% of all values lie in the interval $\mu \pm \sigma$, 95.4% lie within $\mu \pm 2\sigma$, and 99.7% lie within $\mu \pm 3\sigma$. This approximation is often used as a rule of thumb in real applications. For instance, in the preprocessing of real measurement data, it is common to reject values that lie outside of the $2\sigma$‐ or $3\sigma$‐interval. They are considered outliers that might result from errors within the measuring device and are not employed for further calculations. Finally, to demonstrate the effect of variance, Figure 4‐28 shows the PDF of two normally distributed stochastic variables with identical expected value, but different variances. It becomes clearly visible that a higher variance results in the values lying at a greater distance from the expected value.
Figure 4‐28: PDF of two normally distributed stochastic variables with different variance
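The rule of thumb for the $1\sigma$, $2\sigma$ and $3\sigma$ intervals quoted above can be verified with the error function, since for a normally distributed variable $P(|X - \mu| \le k\sigma) = \operatorname{erf}(k/\sqrt{2})$. The helper name below is chosen for this illustration only.

```python
import math

def prob_within_k_sigma(k):
    """P(|X - mu| <= k * sigma) for a normally distributed variable X."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} sigma: {100.0 * prob_within_k_sigma(k):.2f} %")
```

The result does not depend on the concrete values of $\mu$ and $\sigma$, which is why the rule can be applied to any normally distributed measurement error.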
We have discussed the possibilities to compute expected value and variance of a stochastic variable, based on the idea that the stochastic variable is a function of the events that result from a probability experiment. If we introduce a function of a discrete stochastic variable, e.g. $g(x) = x^2$, it is easy to see that $g(x)$ is also a stochastic variable. The expected value can be computed to be

$$
\begin{aligned}
E\big(g(x)\big) &= \sum_{i=1}^{N_e} g\big(x(i)\big)\,p\big(x(i)\big) && \text{for discrete variables,} \\
E\big(g(x)\big) &= \int_{-\infty}^{\infty} g(x)\,f(x)\,\mathrm{d}x && \text{for continuous variables.}
\end{aligned}
\tag{4-164}
$$
We will state some more rules for calculations with expected values that can directly be derived from the given definition. For instance, the expected value of a sum equals the sum of the expected values of the summands, that is,

$$ E(x + y) = E(x) + E(y) . \tag{4-165} $$

For products, a similar rule exists that is only valid under an assumption we will introduce in the next section; see equation (4‐174). Some rules also exist for the combination of stochastic variables and deterministic quantities. For $X$ being a stochastic variable and $a$ being deterministic, it holds true that

$$ E(a\,x) = a\,E(x) . \tag{4-166} $$

Similarly, it can be stated that the expected value of a deterministic quantity is the quantity itself. As the expected value is a deterministic quantity itself, the following holds true:

$$ E(a) = a \ \Rightarrow\ E\big(E(x)\big) = E(x) . \tag{4-167} $$
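The calculation rules (4‐164) to (4‐166) can be checked on a small example. The sketch below uses fair dice as the underlying experiment and a constant $a = 3$; both are assumptions chosen for illustration. The second die is modelled as independent of the first only to obtain a simple joint PMF; the sum rule itself does not require independence.

```python
from fractions import Fraction
from itertools import product

# PMF of a single fair die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def expectation(g, pmf):
    """E(g(x)) = sum of g(x(i)) * p(x(i)), cf. equation (4-164)."""
    return sum(g(x) * p for x, p in pmf.items())

# Joint PMF of two dice (independence assumed for this example only).
joint = {(x, y): px * py
         for (x, px), (y, py) in product(die.items(), repeat=2)}

e_x = expectation(lambda x: x, die)
e_x_squared = expectation(lambda x: x ** 2, die)
e_sum = sum((x + y) * p for (x, y), p in joint.items())

a = 3  # deterministic factor
e_ax = expectation(lambda x: a * x, die)

print("E(x)     =", e_x)
print("E(x^2)   =", e_x_squared)
print("E(x + y) =", e_sum, "= E(x) + E(y) =", e_x + e_x)
print("E(a x)   =", e_ax, "= a E(x) =", a * e_x)
```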
We will now focus on the case where several stochastic variables exist that might have some dependencies between them.

4.3.1.6 Higher‐Dimensional Stochastic Variables

As was already displayed in Figure 4‐26, several stochastic variables might be influenced by the probability experiment that we constructed as a thought experiment. In this case, we call them joint. If we start by looking at the case of two discrete variables $X$ and $Y$, which are based on $N_{e,X}$ and $N_{e,Y}$ elementary events, respectively, we might define the joint CDF based on the joint PMF as

$$ F_{XY}(x, y) := P(X \le x,\ Y \le y) = \sum_{i:\,x(i) \le x}\ \sum_{j:\,y(j) \le y} p_{XY}\big(X = x(i),\ Y = y(j)\big) . \tag{4-168} $$
For continuous variables, the joint PDF can be computed to be

$$ f_{XY}(x, y) = \frac{\partial^2 F_{XY}(x, y)}{\partial x\,\partial y} . \tag{4-169} $$
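Using the definition of the expected value according to equations (4‐157) and (4‐158) as well as the computation of the expected value of a function of a stochastic variable, we can say that given a function $g(x, y)$ of two stochastic variables, the expected value can be written as

$$
\begin{aligned}
E\big(g(x, y)\big) &= \sum_{i}\sum_{j} g\big(x(i), y(j)\big)\,p_{XY}\big(x(i), y(j)\big) && \text{for discrete variables and} \\
E\big(g(x, y)\big) &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x, y)\,f_{XY}(x, y)\,\mathrm{d}x\,\mathrm{d}y && \text{for continuous variables.}
\end{aligned}
\tag{4-170}
$$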
When looking at two stochastic variables, we might be interested in the relation between them, that is, whether a change of one variable will also result in a change of the other one. We introduce the notion of independence by the following definition:

Definition: (Statistical or stochastic) Independence
Two stochastic variables $X$ and $Y$ are denoted as (statistically or stochastically) independent if the realization of one does not affect the probability distribution of the other. This is the case if and only if the following equations hold true:

$$
\begin{aligned}
p_{XY}(x, y) &= p_X(x)\,p_Y(y) && \text{for discrete variables and} \\
f_{XY}(x, y) &= f_X(x)\,f_Y(y) && \text{for continuous variables.}
\end{aligned}
\tag{4-171}
$$
By introducing the function $g(x, y) = x\,y$ and employing equation (4‐170), we can define the so‐called correlation $R_{xy}$ and the similar concept of covariance $C_{xy}$:

Definition: Correlation
The correlation $R_{xy}$ between two stochastic variables $X$ and $Y$ is a measure for the degree of linearity between them. It is computed as:

$$ R_{xy} := E(x\,y) . \tag{4-172} $$

If $R_{xy} = 0$, the two variables are referred to as uncorrelated.

Definition: Covariance
The covariance $C_{xy}$ or $\operatorname{cov}(x, y)$ between two stochastic variables $X$ and $Y$ is a measure for the degree of linearity between them. It is computed as:

$$ C_{xy} := E\Big(\big(x - E(x)\big)\big(y - E(y)\big)\Big) = E(x\,y) - E(x)\,E(y) . \tag{4-173} $$
If $C_{xy} = 0$, the two variables are referred to as uncorrelated. In this case, following from equation (4‐173), it can be stated that

$$ E(x\,y) = E(x)\,E(y) . \tag{4-174} $$
We can see from the discussions above that correlation and covariance are equal if the stochastic variables are zero‐mean. Also, we can say that independence implies non‐correlation. The converse is not true, as the term correlation only covers linear relations between variables. Therefore, if two variables are uncorrelated, it can be stated that there is no linear relation between them; but there might still be a nonlinear relation, and thus a dependence. The covariance as a measure for a linear relation between stochastic variables is of great importance. Large absolute values indicate a high level of linearity (for negative values, the relation is also negative, that is, it follows the principle of ‘the more, the less’). A value of zero indicates the absence of a linear relation. However, it is difficult to judge whether a concrete value of $C_{xy}$ is ‘high’. Therefore, the correlation coefficient $\rho(x, y)$ or $\operatorname{corr}(x, y)$ is introduced as:

$$ \rho(x, y) = \operatorname{corr}(x, y) := \frac{\operatorname{cov}(x, y)}{\sqrt{\operatorname{Var}(x)\,\operatorname{Var}(y)}} \in \{\rho \in \mathbb{R} \mid -1 \le \rho \le 1\} . \tag{4-175} $$
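The correlation coefficient is limited to the interval $[-1, 1]$. Therefore it can easily be employed to assess the level of linearity between two variables, where a value of 0 again shows that the variables are uncorrelated. Now we consider $n$ different stochastic variables collected in a vector $\mathbf{x} = [X_1\ X_2\ \cdots\ X_n]^T$, and a vector containing the expected values $\mu_i = E(x_i)$ of the variables in $\mathbf{x}$, denoted as $\boldsymbol{\mu}_x = [E(x_1)\ E(x_2)\ \cdots\ E(x_n)]^T$. If we further assume that the operation $E(\mathbf{A})$, executed on a vector or matrix $\mathbf{A}$ of stochastic variables, delivers a vector or matrix with the same dimensions which contains the expected values of the corresponding entries of the original vector or matrix, we can define the covariance matrix $\mathbf{C}_{xx}$ as

$$
\begin{aligned}
\mathbf{C}_{xx} &= E\big((\mathbf{x} - \boldsymbol{\mu}_x)(\mathbf{x} - \boldsymbol{\mu}_x)^T\big) \\
&= \begin{bmatrix}
E\big((x_1-\mu_1)(x_1-\mu_1)\big) & E\big((x_1-\mu_1)(x_2-\mu_2)\big) & \cdots & E\big((x_1-\mu_1)(x_n-\mu_n)\big) \\
E\big((x_2-\mu_2)(x_1-\mu_1)\big) & E\big((x_2-\mu_2)(x_2-\mu_2)\big) & \cdots & E\big((x_2-\mu_2)(x_n-\mu_n)\big) \\
\vdots & \vdots & \ddots & \vdots \\
E\big((x_n-\mu_n)(x_1-\mu_1)\big) & E\big((x_n-\mu_n)(x_2-\mu_2)\big) & \cdots & E\big((x_n-\mu_n)(x_n-\mu_n)\big)
\end{bmatrix} \\
&= \begin{bmatrix}
\operatorname{Var}(x_1) & \operatorname{cov}(x_1, x_2) & \cdots & \operatorname{cov}(x_1, x_n) \\
\operatorname{cov}(x_2, x_1) & \operatorname{Var}(x_2) & \cdots & \operatorname{cov}(x_2, x_n) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{cov}(x_n, x_1) & \operatorname{cov}(x_n, x_2) & \cdots & \operatorname{Var}(x_n)
\end{bmatrix} \\
&= E(\mathbf{x}\mathbf{x}^T) - \boldsymbol{\mu}_x \boldsymbol{\mu}_x^T .
\end{aligned}
\tag{4-176}
$$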
For the very last transformation, we have employed equation (4‐173) and the fact that the expected values $\boldsymbol{\mu}_x$ can be considered as deterministic quantities, so $E(\boldsymbol{\mu}_x) = \boldsymbol{\mu}_x$ holds, according to equation (4‐167). As we can see, the covariance matrix developed from a vector of stochastic variables contains on its principal diagonal the variances of the individual variables in the vector. The entry in the $i$th column and the $j$th row equals the covariance of the $i$th and $j$th variable in the vector. Therefore, every covariance matrix is symmetric. Also, every covariance matrix is positive‐semidefinite. We will employ covariance matrices to describe the relation between stochastic variables when discussing the Kalman filter in section 4.3.3. As we have seen the similarity between covariance and correlation, it is also straightforward to define a correlation matrix $\mathbf{R}_{xx}$. In the literature, the definition is usually based on the correlation coefficient according to equation (4‐175). Within this thesis, we will use the following definition, which is based on the definition of correlation employed in equation (4‐172):

$$ \mathbf{R}_{xx} = E(\mathbf{x}\mathbf{x}^T) . \tag{4-177} $$
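The two forms of the covariance matrix in equation (4‐176) and its relation to the correlation matrix (4‐177) can be compared on a small exact example. The vector chosen below, consisting of the first of two fair dice and the sum of both dice, is an assumption made for illustration; plain nested lists are used instead of a matrix library to keep the sketch self-contained.

```python
from fractions import Fraction
from itertools import product

# ASSUMED example: x = [first die, sum of both dice], all 36 outcomes
# of two fair dice being equally likely.
outcomes = [((a, a + b), Fraction(1, 36))
            for a, b in product(range(1, 7), repeat=2)]
n = 2

# Vector of expected values mu_x.
mu = [sum(x[i] * p for x, p in outcomes) for i in range(n)]

# C_xx = E((x - mu)(x - mu)^T), first form in equation (4-176).
cov_central = [[sum((x[i] - mu[i]) * (x[j] - mu[j]) * p for x, p in outcomes)
                for j in range(n)] for i in range(n)]

# R_xx = E(x x^T), equation (4-177), and C_xx = R_xx - mu mu^T.
corr_matrix = [[sum(x[i] * x[j] * p for x, p in outcomes)
                for j in range(n)] for i in range(n)]
cov_from_corr = [[corr_matrix[i][j] - mu[i] * mu[j]
                  for j in range(n)] for i in range(n)]

print("mu_x =", mu)
print("C_xx =", cov_central)
print("symmetric:", cov_central[0][1] == cov_central[1][0])
```

The off-diagonal entries are nonzero because the sum of both dice depends linearly on the first die, and both ways of computing $\mathbf{C}_{xx}$ give the same symmetric matrix.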
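In addition to the covariance matrix $\mathbf{C}_{xx}$ in equation (4‐176), it is straightforward to introduce a cross‐covariance matrix $\mathbf{C}_{xy}$ to describe the cross‐covariances between two vectors of stochastic variables $\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{y} \in \mathbb{R}^m$. Using vectors containing the expected values of the variables in $\mathbf{x}$ and $\mathbf{y}$, denoted as $\boldsymbol{\mu}_x = [E(x_1)\ E(x_2)\ \cdots\ E(x_n)]^T$ and $\boldsymbol{\mu}_y = [E(y_1)\ E(y_2)\ \cdots\ E(y_m)]^T$, the cross‐covariance matrix is given as

$$
\mathbf{C}_{xy} = E\big((\mathbf{x} - \boldsymbol{\mu}_x)(\mathbf{y} - \boldsymbol{\mu}_y)^T\big) =
\begin{bmatrix}
\operatorname{cov}(x_1, y_1) & \operatorname{cov}(x_1, y_2) & \cdots & \operatorname{cov}(x_1, y_m) \\
\operatorname{cov}(x_2, y_1) & \operatorname{cov}(x_2, y_2) & \cdots & \operatorname{cov}(x_2, y_m) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{cov}(x_n, y_1) & \operatorname{cov}(x_n, y_2) & \cdots & \operatorname{cov}(x_n, y_m)
\end{bmatrix} .
\tag{4-178}
$$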
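For normally distributed higher‐dimensional stochastic variables, stored in the vector $\mathbf{x} \in \mathbb{R}^n$, with the positive definite covariance matrix $\mathbf{C}_{xx} \in \mathbb{R}^{n \times n}$, the joint PDF is given, as a modification of equation (4‐154), by:

$$ f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^n \det(\mathbf{C}_{xx})}}\; e^{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu}_x)^T \mathbf{C}_{xx}^{-1} (\mathbf{x} - \boldsymbol{\mu}_x)} , \tag{4-179} $$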
where $\det(\mathbf{C}_{xx})$ returns the determinant of $\mathbf{C}_{xx}$. Finally, we can expand the concept of conditional probability as introduced in section 4.3.1.2 to stochastic variables, following the discussion in Ross, 2014. For discrete variables, we can define the conditional PMF of $X$ given that $Y = y$ by

$$ p_{X|Y}(x|y) = \frac{P(X = x,\ Y = y)}{P(Y = y)} = \frac{p_{XY}(x, y)}{p_Y(y)} , \tag{4-180} $$
based on equation (4‐145) and assuming that $p_Y(y) > 0$. Under the same condition and employing equation (4‐150), we can define the conditional probability distribution function by

$$ F_{X|Y}(x|y) = P(X \le x \mid Y = y) = \sum_{a \le x} p_{X|Y}(a|y) . \tag{4-181} $$
For continuous variables, it is straightforward to define the conditional PDF, for all cases where $f_Y(y) > 0$, by

$$ f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)} . \tag{4-182} $$
However, we have to be careful with the interpretation of the conditional PDF. As we have discussed at the end of section 4.3.1.3, the actual probability that a continuous variable takes any concrete variate is zero, even if the PDF is greater than zero. In order to find a reasonable interpretation of the conditional PDF, we might multiply the left side of equation (4‐182) by $\mathrm{d}x$, and the right side by $(\mathrm{d}x\,\mathrm{d}y)/\mathrm{d}y = \mathrm{d}x$, to obtain

$$
f_{X|Y}(x|y)\,\mathrm{d}x = \frac{f_{XY}(x, y)\,\mathrm{d}x\,\mathrm{d}y}{f_Y(y)\,\mathrm{d}y}
= \frac{P(x \le X \le x + \mathrm{d}x,\ y \le Y \le y + \mathrm{d}y)}{P(y \le Y \le y + \mathrm{d}y)}
= P(x \le X \le x + \mathrm{d}x \mid y \le Y \le y + \mathrm{d}y) .
\tag{4-183}
$$
Looking at this result and at the discussions around equation (4‐153), we can say that for small $\mathrm{d}x$ and $\mathrm{d}y$, $f_{X|Y}(x|y)\,\mathrm{d}x$ represents the conditional probability that $X$ is between $x$ and $x + \mathrm{d}x$, given that $Y$ is between $y$ and $y + \mathrm{d}y$. This allows us to express the relationship between the conditional PDF and the conditional probability distribution function for continuous variables by

$$ F_{X|Y}(x|y) = P(X \le x \mid Y = y) = \int_{-\infty}^{x} f_{X|Y}(\tau|y)\,\mathrm{d}\tau . \tag{4-184} $$
Finally, we need to transfer Bayes’ theorem and the law of total probability to stochastic variables. For the former, as stated in equation (4‐146), we can see that

$$ p_{X|Y}(x|y) = \frac{p_{Y|X}(y|x)\,p_X(x)}{p_Y(y)} \tag{4-185} $$

holds true for discrete variables, while for continuous ones, we can say that

$$ f_{X|Y}(x|y) = \frac{f_{Y|X}(y|x)\,f_X(x)}{f_Y(y)} . \tag{4-186} $$
The law of total probability was introduced in equation (4‐147), where we described a way to compute the probability of a single event based on conditional probabilities. To this end, the conditional probabilities $P(A|B)$ and $P(A|\bar{B})$ have been multiplied by the total probabilities $P(B)$ and $P(\bar{B})$, respectively, and summed up. Considering that a stochastic variable can usually take more than two different values, the law can easily be generalized for joint stochastic variables. For discrete variables, it holds true that

$$ p_X(x) = \sum_{y \in \{y(1),\, y(2),\, \dots,\, y(N_{e,Y})\}} p_{X|Y}(x|y)\,p_Y(y) , \tag{4-187} $$

while for continuous variables, it can be stated that

$$ f_X(x) = \int_{-\infty}^{\infty} f_{X|Y}(x|y)\,f_Y(y)\,\mathrm{d}y . \tag{4-188} $$
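The relations (4‐180), (4‐185) and (4‐187) for discrete variables can be traced on a small joint PMF. The numerical values of the table below are purely illustrative assumptions; any joint PMF with positive marginals would serve equally well.

```python
from fractions import Fraction

# ASSUMED joint PMF p_XY(x, y) of two binary variables (illustrative values).
p_xy = {
    (0, 0): Fraction(4, 10), (0, 1): Fraction(1, 10),
    (1, 0): Fraction(2, 10), (1, 1): Fraction(3, 10),
}
xs = sorted({x for x, _ in p_xy})
ys = sorted({y for _, y in p_xy})

# Marginal PMFs obtained by summing the joint PMF over the other variable.
p_x = {x: sum(p_xy[(x, y)] for y in ys) for x in xs}
p_y = {y: sum(p_xy[(x, y)] for x in xs) for y in ys}

# Conditional PMFs according to equation (4-180).
p_x_given_y = {(x, y): p_xy[(x, y)] / p_y[y] for x in xs for y in ys}
p_y_given_x = {(y, x): p_xy[(x, y)] / p_x[x] for x in xs for y in ys}

# Bayes' theorem, equation (4-185): recover p(x|y) from p(y|x).
bayes = {(x, y): p_y_given_x[(y, x)] * p_x[x] / p_y[y]
         for x in xs for y in ys}

# Law of total probability, equation (4-187): recover p(x).
total = {x: sum(p_x_given_y[(x, y)] * p_y[y] for y in ys) for x in xs}

print("p(x=1 | y=1)       =", p_x_given_y[(1, 1)])
print("Bayes consistent   :", bayes == p_x_given_y)
print("total prob. matches:", total == p_x)
```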
4.3.1.7 Stochastic Signals
Figure 4‐29: Thought experiment for the appearance of stochastic signals
In the final part of the introduction to the stochastic basics, we will define the stochastic signal or stochastic process. To this end we look at Figure 4‐29, which is adapted from Brammer and Siffling, 1990 as well as Ament and Glotzbach, 2016b, and shows a thought experiment about how we can imagine the synthesis of a stochastic signal. As was done in Figure 4‐26 for variables, each of the $N_e$ elementary events $m_i$ is now linked to an arbitrary real function of time (which may even be deterministic), denoted as $x(m_i, t)$. Depending on which elementary event is realized in the probability experiment at a concrete time $t$, the corresponding value of $x(m_i, t)$ is selected to be the current value of the stochastic signal $x(t)$. That means that any realisation at a concrete time $t$ is the same as finding a variate for a stochastic variable, as discussed in section 4.3.1.3. As for stochastic variables, it is also possible to introduce a CDF for signals. For a signal consisting of discrete variables, the CDF can be computed as

$$ F_{x,t}(x, t) := P\big(x(t) \le x\big) = \sum_{i:\,x(m_i, t) \le x} p_{x,t}\big(x(t) = x(m_i, t)\big) . \tag{4-189} $$
For continuous variables, the PDF can be computed as

$$ f_{x,t}(x, t) = \frac{\mathrm{d}F_{x,t}(x, t)}{\mathrm{d}x} . \tag{4-190} $$
At this point, two important classifications for stochastic signals can be made. A signal or process is denoted as stationary if and only if the CDF or the PDF, respectively, does not change with time. That means that the conditions under which the value of the signal at time $t$ is calculated stay the same for all $t$. The second classification, for stationary signals, is based on the mean values that can be computed from the functions of time involved. The mean value of all defined functions $x(m_i, t)$ at a defined time $t$ is referred to as ensemble average $\bar{x}(t)$. The mean value of any of the functions over the whole time $t$ is denoted as time average $\bar{x}(m_i)$. A signal is denoted as ergodic if the ensemble average at any time equals the time average of any function that may contribute to the stochastic signal (that is, the probability measure of its corresponding elementary event, $P(m_i)$, is greater than zero). If a signal is ergodic, it can be stated that its value after a long time is almost independent of its initial state. Keeping in mind our applications in the navigation of marine robots, we are interested in dealing with stochastic signals and in finding any relations as a basis for estimation and forecast. To this end, it is of interest whether the current value of a stochastic signal is somehow related to a past value. We have introduced the correlation and the covariance as measures of linearity between two stochastic variables. For a stochastic signal, it is straightforward to compute the correlation or covariance, respectively, between the signal and itself, shifted by a certain amount of time. The autocorrelation $R_{xx}$ and the autocovariance $C_{xx}$ as functions of two times $t_1$ and $t_2$ describe the correlation and covariance, respectively, between the signal $x$ at time $t_1$ and the signal $x$ at time $t_2$. Based on equations (4‐172) and (4‐173), we can say that

$$
\begin{aligned}
R_{xx}(t_1, t_2) &:= E\big(x(t_1)\,x(t_2)\big) , \\
C_{xx}(t_1, t_2) &:= E\Big(\big(x(t_1) - E(x(t_1))\big)\big(x(t_2) - E(x(t_2))\big)\Big) .
\end{aligned}
\tag{4-191}
$$
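If the stochastic signal is stationary, the absolute time is irrelevant, and introducing $\Delta t = t_2 - t_1$, we can write:

$$
\begin{aligned}
R_{xx}(\Delta t) &:= E\big(x(t)\,x(t + \Delta t)\big) , \\
C_{xx}(\Delta t) &:= E\Big(\big(x(t) - E(x(t))\big)\big(x(t + \Delta t) - E(x(t + \Delta t))\big)\Big) .
\end{aligned}
\tag{4-192}
$$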
In this case, autocorrelation and autocovariance are functions of the time interval $\Delta t$ only. They compute the correlation and covariance, respectively, of a stochastic signal with itself, shifted by $\Delta t$. For $\Delta t = 0$, the autocovariance equals the variance of the signal, as one can see by comparing equation (4‐192) with (4‐160). Also, it can be stated that both autocorrelation and autocovariance have their maximum at $\Delta t = 0$. As the autocorrelation is more commonly used, we will skip further discussions of the autocovariance. Note that for zero‐mean signals, autocorrelation and autocovariance deliver the same results. For stationary, ergodic signals, the autocorrelation can be computed to be

$$ R_{xx}(\Delta t) = \lim_{T \to \infty} \frac{1}{T} \int_{0}^{T} x(t)\,x(t + \Delta t)\,\mathrm{d}t . \tag{4-193} $$
It is also possible to realise a stochastic signal in a discrete‐time manner, e.g. by sampling a continuous stochastic signal. The corresponding methodologies have been described in section 4.1.3. For a discrete‐time signal $x(k) := x(kT)$ with $N + 1$ realizations from $t = 0$ to $t = NT$, where $T$ is the step size of the sampling, the autocorrelation equals

$$ R_{xx}(k) = \lim_{N \to \infty} \frac{1}{N} \sum_{i=0}^{N} x(i)\,x(i + k) . \tag{4-194} $$
We will look at two different stochastic signals as examples to get a better understanding of the meaning of the autocorrelation. Let us at first imagine a signal in which every realisation according to our thought experiment is completely independent of any of the past ones. That means, for a continuous‐time signal, the values at any pair of times are independent of each other. If we consider a discrete‐time signal, every single value is independent of every other value. Such a signal is denoted as white noise $\varepsilon(t)$. As any two values in a white noise are independent and therefore uncorrelated, one can imagine that the autocorrelation is zero for all $\Delta t$ or $k$, respectively, unequal to zero. For a zero‐mean white noise, the autocorrelation and the autocovariance are identical; therefore the value of the autocorrelation for $\Delta t$ or $k$ equal to zero equals the variance $\sigma^2$, and we can write:

$$ R_{\varepsilon\varepsilon}(\Delta t) = \sigma^2\,\delta(\Delta t) , \tag{4-195} $$
where $\delta(\Delta t)$ is the Dirac delta function. Note that the definition of white noise does not say anything about the distribution of the single stochastic values in the signal. A signal in which the PDF of the stochastic variables equals the normal distribution according to equation (4‐154) is denoted as Gaussian noise. The two terms are sometimes confused in the literature, but as stated, they have different meanings. Nevertheless, it is very common to employ a signal that is both white and Gaussian to model measurement inaccuracies which occur on top of the real quantities. This signal is then referred to as Additive White Gaussian Noise (AWGN). A discrete zero‐mean white Gaussian noise $\varepsilon(k)$ with $N = 10{,}000$ time steps and a variance of 1 has been generated by a computer. Figure 4‐30 top left shows the first 100 values. On the right side, the corresponding autocorrelation $R_{\varepsilon\varepsilon}(k)$ computed with equation (4‐194) is shown within the interval $k = 0, 1, \dots, 100$. The course is in accordance with equation (4‐195); the small deviations (oscillations of the curve around the value 0, first value slightly above the variance) occur because only a finite number of time steps of the signal have been realised. The lower part of Figure 4‐30 shows a so‐called autoregressive noise $\zeta(k)$. It was computed according to

$$ \zeta(k) = \varepsilon(k) + 0.9\,\zeta(k - 10) , \tag{4-196} $$
where $\varepsilon(k)$ is a zero‐mean white Gaussian noise with the same properties as before. Again, 10,000 samples have been created, and the first 100 samples are displayed in the figure. By comparing with the upper signal course, it is not possible to see a significant difference except for the higher variance (which equals 5.39). In the corresponding autocorrelation $R_{\zeta\zeta}(k)$ we see a significant difference, as there is a maximum at every 10th sample. This is due to the fact that we have used the value ten samples back to compute the current one according to equation (4‐196). As becomes visible, the autocorrelation can be used to obtain important information about the nature of a stochastic signal which cannot be seen directly in the graph of the signal.
Figure 4‐30: Stochastic signals and corresponding autocorrelations
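The experiment behind Figure 4‐30 can be reproduced approximately with the sketch below. It generates a white Gaussian noise and the autoregressive noise of equation (4‐196) and estimates both autocorrelations with a finite-sample version of equation (4‐194), dividing by the number of available products. The random seed, the reuse of the same white noise as input for the autoregressive signal, and the initial condition $\zeta(k) = \varepsilon(k)$ for $k < 10$ are assumptions of this sketch, so the numbers will differ slightly from those in the figure.

```python
import random

def autocorrelation(signal, max_lag):
    """Estimate R(k) for k = 0..max_lag as the mean of x(i) * x(i + k)."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + k] for i in range(n - k)) / (n - k)
        for k in range(max_lag + 1)
    ]

rng = random.Random(0)   # fixed seed, chosen arbitrarily for repeatability
N = 10_000

# Zero-mean white Gaussian noise with variance 1.
white = [rng.gauss(0.0, 1.0) for _ in range(N)]

# Autoregressive noise zeta(k) = eps(k) + 0.9 * zeta(k - 10).
# ASSUMPTION: zeta(k) = eps(k) for the first ten samples.
autoregressive = []
for k, eps in enumerate(white):
    past = autoregressive[k - 10] if k >= 10 else 0.0
    autoregressive.append(eps + 0.9 * past)

r_white = autocorrelation(white, 30)
r_ar = autocorrelation(autoregressive, 30)

for k in (0, 5, 10, 15, 20):
    print(f"k = {k:2d}: white {r_white[k]:6.3f}   autoregressive {r_ar[k]:6.3f}")
```

For the white noise only the value at $k = 0$ is clearly different from zero, whereas the autoregressive noise shows pronounced maxima at multiples of ten samples, which mirrors the behaviour described for Figure 4‐30.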
Finally, we can expand the concept of the autocorrelation to evaluate two stochastic signals at once. It is also possible to introduce a function that returns the magnitude of correlation between two different stochastic signals at different points in time. This function is referred to as cross‐correlation $R_{uy}$. For $u$, $y$ being stationary stochastic signals, the cross‐correlation equals

$$ R_{uy}(\Delta t) = E\big(u(t)\,y(t + \Delta t)\big) . \tag{4-197} $$
Thus it is straightforward to develop the equations for the ergodic and the discrete‐time cross‐correlation similar to the procedures shown for the autocorrelation. The same principles can also be applied to the covariances. Thus we can define a cross‐covariance function $C_{uy}(\Delta t)$, denoting the cross‐covariance between $u(t)$ and $y(t + \Delta t)$. Note that this quantity is a scalar function of $\Delta t$, which differs from the cross‐covariance matrix $\mathbf{C}_{xy}$ which we have defined in equation (4‐178). After these discussions, we have the necessary knowledge to move on to the estimation theory section.
4.3.2 Estimation Theory

Based on the stochastic introduction given in section 4.3.1, we will now proceed to discuss the estimation of variables or parameters based on noisy measurements. This is of tremendous importance in real‐life applications. As stated before, relevant quantities like the position of a robot are often not directly measurable; it is only possible to measure related quantities, like distances to a transponder. Moreover, every real measurement contains some uncertainties, which are usually modelled as additive noise. The task is to estimate the original quantity, based on noisy measurement data and (possibly) additional information. The discussions in section 4.3.2 are mainly based on Van Trees, 2001, who also inspired the picture of the estimation model displayed in Figure 4‐31. The following components can be identified:
Figure 4‐31: Model of the estimation process
1. Source: The source generates the variable or parameter $X$ that we are interested in supervising. We might or might not have information available on the source and the generation process. For what follows, we assume $X$ to be a continuous stochastic variable.
2. Parameter space: The space in which the concrete value $x$ of variable $X$ is generated is denoted as parameter space. In the simplest case, the variable is a scalar, and hence the parameter space is one‐dimensional. We do not have direct access to the parameter space.
3. Probabilistic mapping: The value $x$ has a direct influence on some other quantity $y$ in the observation space. Usually, probabilistic effects are involved in that relation. The probabilistic mapping represents the probability law that governs the effect of $x$ on $y$.
4. Observation space: The observation space contains the variable $y$ that we can access by measuring. In many cases, several measurements are influenced by the variable or parameter to be estimated. Then, the observation space is a finite‐dimensional space, and $\mathbf{y}$ represents the vector to a single point.
5. Estimation rule: After obtaining $y$, we are interested in finding an appropriate estimate of $x$, denoted as $\hat{x}$. The mapping from the observation space into an estimate is denoted as estimation rule. It delivers $\hat{x}(y)$, that is, the estimate as a function of the measured values.

In what follows, we will discuss several ways to implement the estimation rule. As is usual in engineering, we will try to formulate a mathematically tractable problem by assigning a cost function to the values of $\hat{x}$, which gives us the possibility to find an ‘optimal’ estimate as the result of the minimization of the cost function. In general, we can distinguish between two different estimation strategies; the difference is mainly related to the nature of the source as introduced above:

In what is called Bayes estimation, we assume that we have some knowledge about the process that generates the variable to be estimated. We can deploy this knowledge to improve the quality of the estimation. The approach is discussed in section 4.3.2.1.

In nonrandom estimation, we assume that we have no information on the generation of the signal; therefore we have to base our estimation solely on the observed variables within the observation space. The details will be introduced in section 4.3.2.3. They will be of great importance for some of the discussions on Optimal Sensor Placement in chapter 6.

After the introduction of both principles, we will return to the concept behind the Bayes estimation when introducing the Kalman Filter (section 4.3.3): In fact, we can assume that the navigation variables we are interested in estimating are governed by some process that we have information about, and we can use a combination of values simulated by a kinematic vehicle model and measurements of additional quantities.
4.3.2.1 Bayes Estimation: Basics and Cost Functions

As was introduced before, the principle behind the Bayes estimation is that we can build the estimation rule on two aspects: firstly, on the observations or measurements of the variable $y$ or vector $\mathbf{y}$, and secondly, on information that we have about the source and the generation of the value $x$ to be estimated. As this information is available right at the beginning, even before the variable is generated or a variable in the observation space is built, we refer to it as a priori knowledge. To be more precise, in the Bayes estimation we assume that $X$ is a stochastic variable as introduced in section 4.3.1.3, and that we know the underlying PDF, which is in this respect denoted as a priori PDF or a priori probability density $f_X(x)$. As $X$ is considered a stochastic variable, and the observed value $y$ is based on the current variate $x$ and some probabilistic mapping, $y$ can be seen as the variate of a stochastic variable $Y$. Therefore, we can introduce some more density functions describing the relation between $X$ and $Y$. For instance, we can introduce the PDF of the observation process $f_{Y|X}(y|x)$. It describes the density of the variable $Y$, given that $X = x$ has happened. It is possible to compute $f_{Y|X}(y|x)$ for different real scenarios. Let us assume we are able to directly measure a variable $x$. The measurement can be modelled by adding zero‐mean white Gaussian noise, as discussed in section 4.3.1.7, which is usually a good approximation for real measurement noise, as stated before. In this case, the PDF of the measurement noise $\varepsilon$ is the same as stated in equation (4‐154),

$$ f_\varepsilon(\varepsilon) = \frac{1}{\sqrt{2\pi\sigma_\varepsilon^2}}\; e^{-\frac{(\varepsilon - \mu_\varepsilon)^2}{2\sigma_\varepsilon^2}} , \tag{4-198} $$
4.3 Parameter and Variable Estimation
155
where 𝜇 0 (because we assumed the noise to be zero‐mean) and 𝜎 representing the variance of the measurement noise. As we are interested to compute the PDF of the observation process, that is, for 𝑋 𝑥, it is straightforward to say that the shape of the PDF will not be changed, as 𝑋 is assumed to be ‘constant’. Also, the expected value of 𝑦, 𝐸 𝑦 or 𝜇 , will equal 𝑥, as the variable 𝑋 is directly measured, and the added noise is zero‐mean. Therefore, it is straightforward to say that: 𝑓
|
1
𝑦|𝑥
2𝜋𝜎
𝑒
.
(4‐199)
Another density function of interest is the converse one, $f_{X|Y}(x|y)$, which is denoted as a posteriori PDF. The term ‘a posteriori’ shows that this PDF expresses the density of the original variable $X$ after the concrete observation $Y = y$ has been made. This function is not known to us, so we need to find a way to compute it from the functions already introduced. Following equation (4‐186), we can state that
$$ f_{X|Y}(x|y) = \frac{f_{Y|X}(y|x)\, f_X(x)}{f_Y(y)} = \frac{f_{Y|X}(y|x)\, f_X(x)}{\int_{-\infty}^{\infty} f_{Y|X}(y|x)\, f_X(x)\, \mathrm{d}x} , \tag{4-200} $$
where in the second step we have replaced the unknown function $f_Y(y)$ according to the law of total probability, as stated in equation (4‐188).
Our goal is now to develop an estimator, namely an algorithm that returns an estimate for the current value $x$, denoted as $\hat{x}(y)$, which employs the knowledge of the current observation $y$. The estimator is referred to as Bayes estimator if we assume that we also know the a priori PDF $f_X(x)$. In order to develop Bayes estimators, we introduce a cost function based on the estimation error and try to find an estimator that minimizes the expected value of this function. Therefore, we introduce the estimation error $\epsilon(y)$ as the difference between the estimate that the estimator returned based on the observation $y$ and the true current value $x$:
$$ \epsilon(y) := \hat{x}(y) - x . \tag{4-201} $$
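As an illustration of equation (4‐200), the following minimal sketch evaluates the a posteriori PDF on a grid. The prior $\mathcal{N}(0, 4)$, the noise variance 1, the measurement $y = 1.5$ and all function names are assumptions chosen for this example only; the integral in the denominator is approximated by a simple Riemann sum.

```python
import math

def gauss_pdf(z, mean, var):
    """Gaussian density with the given mean and variance."""
    return math.exp(-(z - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def posterior_on_grid(y, grid, prior, likelihood):
    """Evaluate f_{X|Y}(x|y) on an equally spaced grid following (4-200)."""
    dx = grid[1] - grid[0]
    numerator = [likelihood(y, x) * prior(x) for x in grid]
    f_y = sum(numerator) * dx            # law of total probability as a Riemann sum
    return [n / f_y for n in numerator]

# Assumed example values: prior N(0, 4), noise variance 1, measurement y = 1.5
grid = [-10.0 + 0.01 * i for i in range(2001)]
post = posterior_on_grid(
    1.5,
    grid,
    prior=lambda x: gauss_pdf(x, 0.0, 4.0),
    likelihood=lambda y, x: gauss_pdf(y, x, 1.0),
)
area = sum(post) * (grid[1] - grid[0])   # should be close to one
x_peak = grid[post.index(max(post))]     # location of the posterior maximum
```

The grid resolution limits the accuracy; for this Gaussian example the maximum lies between the prior mean and the measurement, closer to the measurement because the noise variance is smaller than the prior variance.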
Figure 4‐32: Typical cost functions for the Bayes estimation: mean‐square error (left), absolute error (middle), uniform cost function (right), based on Van Trees, 2001
Then we have to introduce a cost function that returns the costs associated with the estimation error. This can be done in different ways; the most common ones are the mean‐squared error (MS), the absolute error (ABS) or a uniform function (UNF, also tagged MAP) that is zero as long as the error stays within a defined bound and one otherwise. Figure 4‐32 provides an overview. Thus we obtain the following cost functions $C(x, y)$:
$$ \begin{aligned} \text{MS:} \quad & C(x,y) := \left(\hat{x}(y) - x\right)^2 \\ \text{ABS:} \quad & C(x,y) := \left|\hat{x}(y) - x\right| \\ \text{UNF:} \quad & C(x,y) := \begin{cases} 0 & \text{for } |\epsilon(y)| \le \Delta/2 \\ 1 & \text{for } |\epsilon(y)| > \Delta/2 \end{cases} \end{aligned} \tag{4-202} $$
4.3.2.2 Elementary Bayes Estimators
It is important to note that we cannot compute $C(x, y)$ for a concrete observation $y$, as we do not know the true $x$. As $C(x, y)$ is a function of two stochastic variables, it is natural to use its expected value. This gives rise to the so‐called Bayes risk $\mathcal{R}$ as the expected value of the cost function. Our goal is to find the $\hat{x}(y)$ that minimizes $\mathcal{R}$. In equation (4‐170), we have introduced the expected value for functions of two‐dimensional stochastic variables. Thus we can say that
$$ \mathcal{R} = E\{C(x,y)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} C(x,y)\, f_{XY}(x,y)\, \mathrm{d}x\, \mathrm{d}y . \tag{4-203} $$
The joint PDF $f_{XY}(x, y)$ can be replaced according to equation (4‐182) by the product of the a posteriori PDF, $f_{X|Y}(x|y)$, and the PDF of the variable $Y$, $f_Y(y)$. As the latter does not depend on $x$, we can write it in front of the inner integral:
$$ \mathcal{R} = \int_{-\infty}^{\infty} f_Y(y) \left[ \int_{-\infty}^{\infty} C(x,y)\, f_{X|Y}(x|y)\, \mathrm{d}x \right] \mathrm{d}y . \tag{4-204} $$
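It is straightforward to say that the inner integral and $f_Y(y)$ are non‐negative. The cost function only enters into the inner integral. Thus the minimum of $\mathcal{R}$ can be found by minimising the inner integral:
$$ \underset{\hat{x}(y)}{\operatorname{argmin}}\; \mathcal{R} = \underset{\hat{x}(y)}{\operatorname{argmin}} \int_{-\infty}^{\infty} C(x,y)\, f_{X|Y}(x|y)\, \mathrm{d}x . \tag{4-205} $$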
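Now we can insert the three cost functions defined in equation (4‐202) into (4‐205), differentiate with respect to $\hat{x}(y)$ and set the result equal to zero in order to obtain the three algorithms. For the mean‐square (MS) case, we can say that
$$ \begin{aligned} \frac{\mathrm{d}}{\mathrm{d}\hat{x}} \int_{-\infty}^{\infty} \left(\hat{x}(y) - x\right)^2 f_{X|Y}(x|y)\, \mathrm{d}x &= \frac{\mathrm{d}}{\mathrm{d}\hat{x}} \int_{-\infty}^{\infty} \left(\hat{x}^2(y) - 2\hat{x}(y)\,x + x^2\right) f_{X|Y}(x|y)\, \mathrm{d}x \\ &= 2\hat{x}(y) \int_{-\infty}^{\infty} f_{X|Y}(x|y)\, \mathrm{d}x - 2 \int_{-\infty}^{\infty} x\, f_{X|Y}(x|y)\, \mathrm{d}x . \end{aligned} \tag{4-206} $$
By setting this difference equal to zero and isolating $\hat{x}(y)$, we find the algorithm for the MS‐estimator, which is tagged by the subscript MS. Note that the first integral in the final line of (4‐206) equals one, according to equation (4‐152). In order to check the sufficient condition for a minimum, we can differentiate equation (4‐206) a second time, which yields the positive value 2.
$$ \hat{x}_{\mathrm{MS}}(y) = \int_{-\infty}^{\infty} x\, f_{X|Y}(x|y)\, \mathrm{d}x . \tag{4-207} $$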
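The result is remarkable: Comparing with equation (4‐158), we see that the MS‐estimate equals the mean of the a posteriori PDF, which is also denoted as conditional mean. We will now repeat the exercise for the ABS cost function to obtain:
$$ \begin{aligned} & \frac{\mathrm{d}}{\mathrm{d}\hat{x}} \int_{-\infty}^{\infty} \left|\hat{x}(y) - x\right| f_{X|Y}(x|y)\, \mathrm{d}x \\ &\quad = \frac{\mathrm{d}}{\mathrm{d}\hat{x}} \left[ \int_{-\infty}^{\hat{x}(y)} \left(\hat{x}(y) - x\right) f_{X|Y}(x|y)\, \mathrm{d}x + \int_{\hat{x}(y)}^{\infty} \left(x - \hat{x}(y)\right) f_{X|Y}(x|y)\, \mathrm{d}x \right] . \end{aligned} \tag{4-208} $$
By executing the differentiation and setting the result equal to zero, we find the following condition that the estimator $\hat{x}_{\mathrm{ABS}}(y)$ must satisfy:
$$ \int_{-\infty}^{\hat{x}_{\mathrm{ABS}}(y)} f_{X|Y}(x|y)\, \mathrm{d}x = \int_{\hat{x}_{\mathrm{ABS}}(y)}^{\infty} f_{X|Y}(x|y)\, \mathrm{d}x . \tag{4-209} $$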
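This means that the ABS‐estimator returns the value at which the integrals of the a posteriori PDF on the left and right side are equal. This value is also denoted as the median of the density function. For the uniform cost function, which we have tagged with the abbreviation MAP for reasons that will become obvious soon, we obtain the following integral to minimize:
$$ \begin{aligned} \hat{x}_{\mathrm{UNF}}(y) &= \underset{\hat{x}(y)}{\operatorname{argmin}} \int_{-\infty}^{\infty} \left\{ \begin{array}{ll} 0 & \text{for } |\hat{x}(y) - x| \le \Delta/2 \\ 1 & \text{for } |\hat{x}(y) - x| > \Delta/2 \end{array} \right\} f_{X|Y}(x|y)\, \mathrm{d}x \\ &= \underset{\hat{x}(y)}{\operatorname{argmin}} \left[ \int_{-\infty}^{\hat{x}(y) - \Delta/2} f_{X|Y}(x|y)\, \mathrm{d}x + \int_{\hat{x}(y) + \Delta/2}^{\infty} f_{X|Y}(x|y)\, \mathrm{d}x \right] . \end{aligned} \tag{4-210} $$
This means that we have to integrate the a posteriori PDF over the full range, except for the interval between $\hat{x}(y) - \Delta/2$ and $\hat{x}(y) + \Delta/2$. As the integral over the full range equals one according to equation (4‐152), we can write:
$$ \hat{x}_{\mathrm{UNF}}(y) = \underset{\hat{x}(y)}{\operatorname{argmin}} \left[ 1 - \int_{\hat{x}(y) - \Delta/2}^{\hat{x}(y) + \Delta/2} f_{X|Y}(x|y)\, \mathrm{d}x \right] . \tag{4-211} $$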
Figure 4‐33: Results of the three introduced Bayes estimators at a distinct a posteriori PDF
It is straightforward to see that in order to find the minimum, we have to maximize the integral. The integral equals the area under the a posteriori PDF in the interval between $\hat{x}(y) - \Delta/2$ and $\hat{x}(y) + \Delta/2$. If $\Delta$ tends towards zero, this area is at a maximum at the point where the a posteriori PDF itself reaches its maximum (see also Figure 4‐33). This is a special case of the uniform cost function, but the one of greatest interest. As its result always equals the maximum of the a posteriori PDF, it is also denoted as maximum a posteriori estimator and subscripted with MAP:
$$ \hat{x}_{\mathrm{MAP}}(y) = \underset{\hat{x}(y)}{\operatorname{argmin}} \lim_{\Delta \to 0} \left[ 1 - \int_{\hat{x}(y) - \Delta/2}^{\hat{x}(y) + \Delta/2} f_{X|Y}(x|y)\, \mathrm{d}x \right] = \underset{x}{\operatorname{argmax}}\; f_{X|Y}(x|y) . \tag{4-212} $$
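Table 4‐1: Overview of the introduced Bayes estimators
| Estimator | Cost function | Conditional equation | Interpretation of result |
|---|---|---|---|
| MS | $\left(\hat{x}(y) - x\right)^2$ | $\hat{x}_{\mathrm{MS}}(y) = \int_{-\infty}^{\infty} x\, f_{X \mid Y}(x \mid y)\, \mathrm{d}x$ | Mean of a posteriori PDF |
| ABS | $\lvert \hat{x}(y) - x \rvert$ | $\int_{-\infty}^{\hat{x}_{\mathrm{ABS}}(y)} f_{X \mid Y}(x \mid y)\, \mathrm{d}x = \int_{\hat{x}_{\mathrm{ABS}}(y)}^{\infty} f_{X \mid Y}(x \mid y)\, \mathrm{d}x$ | Median of a posteriori PDF |
| MAP | $0$ for $\lvert \epsilon(y) \rvert \le \Delta/2$, $1$ for $\lvert \epsilon(y) \rvert > \Delta/2$, with $\Delta \to 0$ | $\hat{x}_{\mathrm{MAP}}(y) = \operatorname{argmax}_x f_{X \mid Y}(x \mid y)$ | Maximum of a posteriori PDF |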
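The three estimators of Table 4‐1 can be compared numerically on a skewed density. The sketch below is only an illustration: the shape $x e^{-x}$ for $x \ge 0$ is an assumed stand‐in for an a posteriori PDF (it does not come from a particular prior and observation model), and the integrals are approximated by Riemann sums on a grid.

```python
import math

dx = 0.001
grid = [i * dx for i in range(30001)]            # x in [0, 30]
shape = [x * math.exp(-x) for x in grid]         # unnormalised, skewed density (assumed)
norm = sum(shape) * dx
post = [s / norm for s in shape]                 # stands in for f_{X|Y}(x|y)

# MS estimate: mean of the a posteriori PDF, cf. (4-207)
x_ms = sum(x * p for x, p in zip(grid, post)) * dx

# ABS estimate: median, the point where the cumulative area reaches 1/2, cf. (4-209)
area = 0.0
x_abs = grid[-1]
for x, p in zip(grid, post):
    area += p * dx
    if area >= 0.5:
        x_abs = x
        break

# MAP estimate: location of the maximum of the a posteriori PDF, cf. (4-212)
x_map = grid[post.index(max(post))]
```

For this asymmetric density the three results differ, with the maximum to the left of the median and the median to the left of the mean; for a symmetric unimodal density such as a Gaussian they would coincide.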
We will only employ the MAP estimator and neglect other UNF estimators with greater $\Delta$. Figure 4‐33 shows a distinct a posteriori PDF and the locations of the three introduced Bayes estimators. The solution of the MS‐estimator is located at the mean of the PDF and can be computed employing equation (4‐207). The solution of the ABS estimator is at the position at which the areas under the PDF on its left and right are equal. The MAP estimator returns the value at which the PDF takes its maximum. The three estimators are summarized in Table 4‐1.
Note that in some scenarios the outcome of the three estimators might be identical. For instance, if the a posteriori PDF is Gaussian, both the expected value and the median are located at the maximum of the function.
As it will be of interest in the next subsection, we will discuss a way to compute the solution for the MAP‐estimator: We compute the derivative of the a posteriori PDF with respect to $x$ and set the result equal to zero. Employing equation (4‐200), we can say that:
$$ \frac{\partial}{\partial x} f_{X|Y}(x|y) = \frac{\partial}{\partial x} \left[ \frac{f_{Y|X}(y|x)\, f_X(x)}{f_Y(y)} \right] = 0 . \tag{4-213} $$
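The computation of this derivative can be very cumbersome. However, we can ease the process by replacing the PDF by the logarithm of the PDF. As the logarithm is strictly monotonic, it does not change the position of the extrema (only their values), so this step is admissible, and we obtain:
$$ \begin{aligned} \frac{\partial}{\partial x} \ln f_{X|Y}(x|y) &= \frac{\partial}{\partial x} \ln \left[ \frac{f_{Y|X}(y|x)\, f_X(x)}{f_Y(y)} \right] = \frac{\partial}{\partial x} \left[ \ln f_{Y|X}(y|x) + \ln f_X(x) - \ln f_Y(y) \right] \\ &= \frac{\partial \ln f_{Y|X}(y|x)}{\partial x} + \frac{\partial \ln f_X(x)}{\partial x} = 0 . \end{aligned} \tag{4-214} $$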
In this sum, the first summand is related to the stochastic dependence of $y$ on $x$, while the second one refers to the a priori knowledge.
We will conclude our discussion of Bayes estimators with a simple example: A source creates a stochastic variable $X$; let us assume that the generation is based on a Gaussian distribution with mean $\bar{x}$ and variance $\sigma_x^2$. Thus we can state for the a priori PDF:
$$ f_X(x) = \frac{1}{\sqrt{2\pi\sigma_x^2}} \, e^{-\frac{(x-\bar{x})^2}{2\sigma_x^2}} . \tag{4-215} $$
The generated variable can directly be measured; however, some zero‐mean white Gaussian noise $\varepsilon$ with variance $\sigma_\varepsilon^2$ is added. We have discussed this situation at the beginning of section 4.3.2.1 and stated the PDF of the observation process in equation (4‐199). After obtaining one measurement $y$, what will be the result of the MAP estimator? We can formulate the measurement principle as:
$$ y = x + \varepsilon . \tag{4-216} $$
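To this end, we insert the PDFs into equation (4‐214) and make use of the fact that the logarithm of a product equals the sum of the logarithms of the factors:
$$ \frac{\partial}{\partial x} \left[ \ln \frac{1}{\sqrt{2\pi\sigma_\varepsilon^2}} + \ln e^{-\frac{(y-x)^2}{2\sigma_\varepsilon^2}} \right] + \frac{\partial}{\partial x} \left[ \ln \frac{1}{\sqrt{2\pi\sigma_x^2}} + \ln e^{-\frac{(x-\bar{x})^2}{2\sigma_x^2}} \right] = 0 . \tag{4-217} $$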
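In both brackets, the first summands do not depend on $x$ and vanish when performing the differentiation. In the second summands, the logarithm and the exponential function cancel each other, thus we obtain:
$$ \begin{aligned} & \frac{\partial}{\partial x} \left[ -\frac{y^2 - 2yx + x^2}{2\sigma_\varepsilon^2} - \frac{x^2 - 2x\bar{x} + \bar{x}^2}{2\sigma_x^2} \right] = 0 \\ \Leftrightarrow\; & \frac{2y - 2x}{2\sigma_\varepsilon^2} - \frac{2x - 2\bar{x}}{2\sigma_x^2} = 0 . \end{aligned} \tag{4-218} $$
We solve this for $x = \hat{x}_{\mathrm{MAP}}(y)$ to obtain:
$$ \hat{x}_{\mathrm{MAP}}(y) = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_\varepsilon^2}\, y + \frac{\sigma_\varepsilon^2}{\sigma_x^2 + \sigma_\varepsilon^2}\, \bar{x} . \tag{4-219} $$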
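The closed form (4‐219) can be cross‐checked against a brute‐force maximization of the logarithm of the a posteriori PDF. The following sketch is only a plausibility check; the numerical values and the function names are assumptions made for this example.

```python
def map_estimate(y, x_bar, var_x, var_eps):
    """Closed-form MAP estimate (4-219) for a Gaussian prior and Gaussian noise."""
    total = var_x + var_eps
    return var_x / total * y + var_eps / total * x_bar

def log_posterior(x, y, x_bar, var_x, var_eps):
    """ln f_{Y|X}(y|x) + ln f_X(x), omitting all terms that do not depend on x."""
    return -(y - x) ** 2 / (2.0 * var_eps) - (x - x_bar) ** 2 / (2.0 * var_x)

# Assumed example values
y, x_bar, var_x, var_eps = 3.0, 1.0, 4.0, 1.0
closed_form = map_estimate(y, x_bar, var_x, var_eps)

# Brute-force search over candidate values of x in [-5, 5]
grid = [-5.0 + 0.001 * i for i in range(10001)]
brute_force = max(grid, key=lambda x: log_posterior(x, y, x_bar, var_x, var_eps))

# Limiting cases: a nearly exact prior and a nearly noise-free measurement
trust_prior = map_estimate(y, x_bar, 1e-9, 1.0)
trust_measurement = map_estimate(y, x_bar, 1.0, 1e-9)
```

Both routes should agree up to the grid spacing, and the two limiting cases return values close to $\bar{x}$ and to $y$, respectively.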
The MAP estimate is a weighted sum of the measured value $y$ and the a priori known mean $\bar{x}$. It is interesting to look at two special cases: If $\sigma_x^2 \ll \sigma_\varepsilon^2$, the measurement noise has a much greater variance than the stochastic variable $X$. In this case, the measurement is hardly usable, and it seems more reasonable to rely on the a priori knowledge in terms of $\bar{x}$. This is exactly what happens according to the last equation, as the first fraction tends towards zero, cancelling the influence of $y$, while the second fraction tends towards one. In the case $\sigma_x^2 \gg \sigma_\varepsilon^2$, the influence of the measurement noise is quite small in comparison to the variations of $X$. Thus it is better to use the measured $y$ as estimate and to ignore the a priori knowledge in terms of $\bar{x}$, which is again exactly what equation (4‐219) delivers.
4.3.2.3 Nonrandom Estimation: Basics and Criteria for Comparison of Estimators
We will now discuss the case in which the unknown parameter or variable $X$ can no longer be treated as a stochastic variable. To be more precise, we are looking at the case that we have absolutely no information about how $x$ is generated. The unknown variable might be generated by a deterministic process that we have no information about, or it might still be a stochastic process that generates $x$, but one about which we do not know anything, that is, we do not know the a priori PDF $f_X(x)$, the a priori mean $\bar{x}$, or the variance $\sigma_x^2$. We can only assume that we still know the PDF of the observation process, $f_{Y|X}(y|x)$.
Let us at first try to reuse the approach employed within the Bayes estimation. In equation (4‐203) we have introduced the Bayes risk as the expected value of the cost function, which was a function of two stochastic variables, $y$ and $x$. As we no longer treat $x$ as a stochastic variable, the cost function $C(x, y)$ will only depend on the stochastic variable $y$, and the corresponding PDF is $f_{Y|X}(y|x)$.
The rule to compute the expected value of a function was introduced in equation (4‐164). Thus we obtain:
$$ \mathcal{R} = E\{C(x,y)\} = \int_{-\infty}^{\infty} C(x,y)\, f_{Y|X}(y|x)\, \mathrm{d}y . \tag{4-220} $$
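Let us now use the MS cost function according to equation (4‐202). After inserting it into the risk function just stated, we can compute the derivative with respect to $\hat{x}$ to obtain:
$$ \begin{aligned} \frac{\mathrm{d}}{\mathrm{d}\hat{x}} \int_{-\infty}^{\infty} \left(\hat{x}(y) - x\right)^2 f_{Y|X}(y|x)\, \mathrm{d}y &= \frac{\mathrm{d}}{\mathrm{d}\hat{x}} \int_{-\infty}^{\infty} \left(\hat{x}^2(y) - 2\hat{x}(y)\,x + x^2\right) f_{Y|X}(y|x)\, \mathrm{d}y \\ &= 2 \int_{-\infty}^{\infty} \hat{x}(y)\, f_{Y|X}(y|x)\, \mathrm{d}y - 2x \int_{-\infty}^{\infty} f_{Y|X}(y|x)\, \mathrm{d}y . \end{aligned} \tag{4-221} $$
By setting this result equal to zero, we obtain the conditional equation for the MS estimator for non‐random estimation. Noting that the second integral equals one, we obtain
$$ \int_{-\infty}^{\infty} \hat{x}_{\mathrm{MS}}(y)\, f_{Y|X}(y|x)\, \mathrm{d}y = x . \tag{4-222} $$
Note that the term on the left side equals the expected value of $\hat{x}_{\mathrm{MS}}$. As $x$ is considered a deterministic variable, we obtain the result:
$$ \hat{x}_{\mathrm{MS}} = x . \tag{4-223} $$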
This is appealing in a mathematical sense, but of no practical use, as we do not know the variable $x$ and are trying to estimate it. We see that the employment of cost functions might not lead to usable results for nonrandom estimation. Therefore, we need to find other ways to evaluate the quality of nonrandom estimators. As an estimate can be considered a stochastic variable, it is of interest to look at its expected value. If we consider the example at the end of section 4.3.2.2 and change the set‐up such that we are able to obtain a large number of measurements, stored as vector $\mathbf{y}$, while the underlying variable $X$ remains at a constant value, we could ask whether the estimate would tend towards the true value $x$ in a finite time, or whether there would be a remaining deviation, no matter how many measurements of the same variate $x$ we take. The remaining deviation is denoted as bias. Estimators whose expected value equals the true value $x$ are denoted as unbiased. In a mathematical formulation, the first of the following two equations characterizes an unbiased estimator, while the second one belongs to a biased one,
$$ \begin{aligned} E\{\hat{x}(y)\} &= \int_{-\infty}^{\infty} \hat{x}(y)\, f_{Y|X}(y|x)\, \mathrm{d}y = x , \\ E\{\hat{x}(y)\} &= \int_{-\infty}^{\infty} \hat{x}(y)\, f_{Y|X}(y|x)\, \mathrm{d}y = x + b(x) , \end{aligned} \tag{4-224} $$
where $b(x)$ represents the bias, which is often a function of $x$. If an estimator exhibits a constant bias, it might be possible to compute or estimate it and subtract it from the estimated value in order to get an unbiased estimator. Employing the error $\epsilon(y)$ according to equation (4‐201), it is obvious that for all estimators, the following holds true:
$$ E\{\epsilon(y)\} = E\{\hat{x}(y) - x\} = b(x) . \tag{4-225} $$
It is clearly of interest to have an estimator whose bias is as small as possible, preferably zero. We can formulate that as a first criterion to evaluate the quality of an estimator. However, we can even compare unbiased estimators by introducing the variance of the estimation error as a second criterion. For any estimator, we can say according to equation (4‐161) that
$$ \operatorname{Var}\{\epsilon(y)\} = E\left\{ \left( \epsilon(y) - E\{\epsilon(y)\} \right)^2 \right\} = E\left\{ \left( \hat{x}(y) - x - b(x) \right)^2 \right\} . \tag{4-226} $$
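Bias and error variance according to (4‐225) and (4‐226) can be approximated by Monte Carlo simulation. The sketch below compares two estimators for a constant $x$ observed through several noisy measurements; the deliberately biased ‘shrunk mean’, the helper names and all numbers are assumptions chosen purely for illustration.

```python
import random
import statistics

def monte_carlo(estimator, x_true, sigma, n_meas, n_runs, seed=0):
    """Return the empirical bias and error variance of an estimator."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_runs):
        y = [x_true + rng.gauss(0.0, sigma) for _ in range(n_meas)]
        errors.append(estimator(y) - x_true)     # realisation of the error (4-201)
    return statistics.fmean(errors), statistics.pvariance(errors)

def sample_mean(y):
    return sum(y) / len(y)

def shrunk_mean(y):
    # deliberately biased: the expected value is 0.9 x, hence b(x) = -0.1 x
    return 0.9 * sum(y) / len(y)

bias_a, var_a = monte_carlo(sample_mean, x_true=5.0, sigma=1.0, n_meas=10, n_runs=20000)
bias_b, var_b = monte_carlo(shrunk_mean, x_true=5.0, sigma=1.0, n_meas=10, n_runs=20000)
```

In this set‐up the biased estimator has a bias that depends on $x$ but a smaller error variance than the unbiased one, which illustrates why both criteria have to be considered together.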
If the variance is large, the single realisations of the estimator lie farther away from the expected value. If we can only perform a low number of realisations, we might have a larger error than for smaller variances. Therefore we would ideally wish for an unbiased estimator whose estimation error variance is as small as possible.
In what follows, we will first introduce an estimator that is suitable for non‐random estimation. Then we will judge its performance in terms of estimation error variance by discussing the question of how small the variance of any possible estimator can become.
4.3.2.4 Maximum Likelihood Estimation and Cramér‐Rao‐Bound
For the Bayes estimation, we have introduced three estimators that were based on mean, median, and maximum of the a posteriori PDF. The MAP estimator employed a very sound concept: Under the condition that a concrete $Y = y$ has occurred, which value $x$ is the most probable one? As the a posteriori PDF is not available in non‐random estimation, we look for a similar concept employing the PDF of the observation process, $f_{Y|X}(y|x)$, and ask which value $x$ would result in the observation $y$ with the highest probability. The PDF $f_{Y|X}(y|x)$, as a function of $x$, is denoted as the likelihood function $\Lambda(x)$, which is of great interest within estimation theory:
$$ \Lambda(x) = f_{Y|X}(y|x) . \tag{4-227} $$
In this context, we introduce the maximum likelihood estimate $\hat{x}_{\mathrm{ML}}(y)$ as the value of $x$ at which the likelihood function has its maximum. As we have done before, we will again work with the logarithm of the likelihood function, $\ln \Lambda(x)$, which is denoted as log likelihood function. It exhibits the same extrema, but eases the computation. We can obtain the maximum likelihood estimate by differentiating the log likelihood function with respect to $x$ and setting the result equal to zero. The resulting equation is referred to as likelihood equation:
$$ \left. \frac{\partial \ln \Lambda(x)}{\partial x} \right|_{x = \hat{x}_{\mathrm{ML}}(y)} = \left. \frac{\partial \ln f_{Y|X}(y|x)}{\partial x} \right|_{x = \hat{x}_{\mathrm{ML}}(y)} = 0 . \tag{4-228} $$
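To illustrate the maximum likelihood principle, the following sketch maximizes a log likelihood function by a plain grid search. It assumes several independent measurements of the same constant $x$, each with the Gaussian observation PDF (4‐199), so that the log likelihood is a sum over the measurements; the true value, the noise variance and the grid are assumptions made for this example.

```python
import math
import random

def log_likelihood(x, measurements, var_eps):
    """Sum of the logarithms of Gaussian observation PDFs (4-199), one per measurement."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * var_eps) - (y - x) ** 2 / (2.0 * var_eps)
        for y in measurements
    )

rng = random.Random(1)
x_true, var_eps = 2.0, 0.25                      # assumed example values
measurements = [x_true + rng.gauss(0.0, math.sqrt(var_eps)) for _ in range(50)]

grid = [0.001 * i for i in range(4001)]          # candidate values of x in [0, 4]
x_ml = max(grid, key=lambda x: log_likelihood(x, measurements, var_eps))
sample_mean = sum(measurements) / len(measurements)
```

For this Gaussian model the maximizer found on the grid agrees, up to the grid spacing, with the sample mean of the measurements, which is what setting the derivative of the log likelihood to zero yields analytically.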
It is worth noting that this is similar to the condition found for the MAP estimation, as stated in equation (4‐214), except that the second summand containing the a priori knowledge is missing for the ML estimation. This is plausible, as in non‐random estimation no a priori knowledge is available.
At this point, we need to ask how ‘good in some sense’ the ML estimator is. We have discussed that the variance of the estimation error is a good criterion to judge an estimator. The question arises whether there is an absolute lower bound for the error variance of any unbiased estimator that can never be undershot. In other words, if we can prove that such a lower bound exists, and if we can find an estimator whose error variance equals the computed lower bound, then we cannot find a ‘better’ one with respect to the employed criterion. In fact, such a lower bound exists. It was derived by Cramér, 1946, and Rao, 1945, and is therefore denoted as Cramér‐Rao (lower) bound (see also Kay, 1993):
Definition: Cramér‐Rao bound
Let $\hat{x}(y)$ be an unbiased estimate of the non‐random variable $x$, based on the measurement $y$. Further let $f_{Y|X}(y|x)$ be the likelihood function or the PDF of the observation process, and assume that
$$ \frac{\partial f_{Y|X}(y|x)}{\partial x} \quad \text{and} \quad \frac{\partial^2 f_{Y|X}(y|x)}{\partial x^2} \tag{4-229} $$
exist and are absolutely integrable. Then the variance of the estimation error $\epsilon(y) = \hat{x}(y) - x$ cannot fall below a certain bound, denoted as Cramér‐Rao Bound (CRB), so that the following inequalities always hold true:
$$ \operatorname{Var}\{\hat{x}(y) - x\} \ge \left( E\left\{ \left[ \frac{\partial \ln f_{Y|X}(y|x)}{\partial x} \right]^2 \right\} \right)^{-1} \tag{4-230} $$
respectively
$$ \operatorname{Var}\{\hat{x}(y) - x\} \ge \left( -E\left\{ \frac{\partial^2 \ln f_{Y|X}(y|x)}{\partial x^2} \right\} \right)^{-1} . \tag{4-231} $$
Any unbiased estimator that fulfils these relations with equality is referred to as efficient, which means that it is not possible to find an unbiased estimator with a lower estimation error variance.
In what follows, we will prove these statements. To this end, we will make use of Schwarz's inequality, which is also referred to as Cauchy–Schwarz–Buniakowsky inequality (Gradshteyn and Ryzhik, 2007): For two real integrable functions $g(x)$, $h(x)$ on $[a, b]$, it holds true that
$$ \left( \int_a^b g(x)\, h(x)\, \mathrm{d}x \right)^2 \le \int_a^b g^2(x)\, \mathrm{d}x \int_a^b h^2(x)\, \mathrm{d}x , \tag{4-232} $$
where the equality holds if and only if
$$ g(x) = k\, h(x) , \tag{4-233} $$
with $k$ real and independent of the integration variable. In the application below, the integration runs over $y$, so the factor may depend on $x$ and is written as $k(x)$. Another mathematical principle that we are going to exploit is the logarithmic differentiation. Due to the chain rule, it holds true that
$$ \frac{\partial \ln g(x)}{\partial x} = \frac{1}{g(x)} \frac{\partial g(x)}{\partial x} \;\Rightarrow\; \frac{\partial g(x)}{\partial x} = \frac{\partial \ln g(x)}{\partial x}\, g(x) , \tag{4-234} $$
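which can be employed to replace the derivative of a function by the derivative of its logarithm, which might be easier to compute.
We now start to prove the above theorem with the formulation of the expected value of the estimation error, which is zero as the estimate was assumed to be unbiased:
$$ E\{\hat{x}(y) - x\} = \int_{-\infty}^{\infty} f_{Y|X}(y|x) \left( \hat{x}(y) - x \right) \mathrm{d}y = 0 . \tag{4-235} $$
Note that the factors in the integral have been swapped. In the next step, we differentiate the equation with respect to $x$. Due to the condition formulated in equation (4‐229), we can move the differentiation inside the integral:
$$ \frac{\mathrm{d}}{\mathrm{d}x} \int_{-\infty}^{\infty} f_{Y|X}(y|x) \left( \hat{x}(y) - x \right) \mathrm{d}y = \int_{-\infty}^{\infty} \frac{\mathrm{d}}{\mathrm{d}x} \left[ f_{Y|X}(y|x) \left( \hat{x}(y) - x \right) \right] \mathrm{d}y = 0 . \tag{4-236} $$
We can now perform the differentiation, employing the product rule:
$$ \begin{aligned} & \int_{-\infty}^{\infty} \frac{\mathrm{d}}{\mathrm{d}x} \left[ f_{Y|X}(y|x) \left( \hat{x}(y) - x \right) \right] \mathrm{d}y \\ &\quad = \int_{-\infty}^{\infty} \frac{\mathrm{d} f_{Y|X}(y|x)}{\mathrm{d}x} \left( \hat{x}(y) - x \right) \mathrm{d}y - \int_{-\infty}^{\infty} f_{Y|X}(y|x)\, \mathrm{d}y = 0 . \end{aligned} \tag{4-237} $$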
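Note that the second integral equals 1 according to equation (4‐152). Within the first integral, we can now replace the derivative of the PDF according to equation (4‐234) to obtain:
$$ \Rightarrow \int_{-\infty}^{\infty} \frac{\mathrm{d} \ln f_{Y|X}(y|x)}{\mathrm{d}x}\, f_{Y|X}(y|x) \left( \hat{x}(y) - x \right) \mathrm{d}y = 1 . \tag{4-238} $$
In order to make use of the Cauchy–Schwarz–Buniakowsky inequality, we separate the terms within the integral into two groups, and we square both sides of the equation:
$$ \Rightarrow \left( \int_{-\infty}^{\infty} \left[ \frac{\mathrm{d} \ln f_{Y|X}(y|x)}{\mathrm{d}x} \sqrt{f_{Y|X}(y|x)} \right] \left[ \sqrt{f_{Y|X}(y|x)} \left( \hat{x}(y) - x \right) \right] \mathrm{d}y \right)^2 = 1 . \tag{4-239} $$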
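Note that the expression on the left is equivalent to the left side of equation (4‐232). Therefore, we can conclude that the expression equivalent to the right side of equation (4‐232) has to be equal to or greater than the right term in equation (4‐239), that is, 1. Thus we obtain:
$$ \Rightarrow \int_{-\infty}^{\infty} \left( \frac{\mathrm{d} \ln f_{Y|X}(y|x)}{\mathrm{d}x} \right)^2 f_{Y|X}(y|x)\, \mathrm{d}y \int_{-\infty}^{\infty} \left( \hat{x}(y) - x \right)^2 f_{Y|X}(y|x)\, \mathrm{d}y \ge 1 , \tag{4-240} $$
and according to equation (4‐233), equality holds if
$$ \frac{\mathrm{d} \ln f_{Y|X}(y|x)}{\mathrm{d}x} = \left( \hat{x}(y) - x \right) k(x) . \tag{4-241} $$
It is easy to see that the two integrals in equation (4‐240) represent expected values. Thus we can write:
$$ \Rightarrow E\left\{ \left( \frac{\mathrm{d} \ln f_{Y|X}(y|x)}{\mathrm{d}x} \right)^2 \right\} E\left\{ \left( \hat{x}(y) - x \right)^2 \right\} \ge 1 , \tag{4-242} $$
and because $\operatorname{Var}\{\hat{x}(y) - x\} = E\{(\hat{x}(y) - x)^2\}$ for an unbiased estimator:
$$ \Rightarrow \operatorname{Var}\{\hat{x}(y) - x\} \ge \left( E\left\{ \left( \frac{\mathrm{d} \ln f_{Y|X}(y|x)}{\mathrm{d}x} \right)^2 \right\} \right)^{-1} , \tag{4-243} $$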
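which is exactly the claim made in equation (4‐230). In order to prove equation (4‐231), we start with:
$$ \int_{-\infty}^{\infty} f_{Y|X}(y|x)\, \mathrm{d}y = 1 . \tag{4-244} $$
Now we have to differentiate with respect to $x$, apply equation (4‐234), and repeat these two steps:
$$ \begin{aligned} & \frac{\mathrm{d}}{\mathrm{d}x} \int_{-\infty}^{\infty} f_{Y|X}(y|x)\, \mathrm{d}y = \int_{-\infty}^{\infty} \frac{\mathrm{d} \ln f_{Y|X}(y|x)}{\mathrm{d}x}\, f_{Y|X}(y|x)\, \mathrm{d}y = 0 , \\ \Rightarrow\; & \frac{\mathrm{d}}{\mathrm{d}x} \int_{-\infty}^{\infty} \frac{\mathrm{d} \ln f_{Y|X}(y|x)}{\mathrm{d}x}\, f_{Y|X}(y|x)\, \mathrm{d}y \\ &\quad = \int_{-\infty}^{\infty} \frac{\mathrm{d}^2 \ln f_{Y|X}(y|x)}{\mathrm{d}x^2}\, f_{Y|X}(y|x)\, \mathrm{d}y + \int_{-\infty}^{\infty} \left( \frac{\mathrm{d} \ln f_{Y|X}(y|x)}{\mathrm{d}x} \right)^2 f_{Y|X}(y|x)\, \mathrm{d}y = 0 . \end{aligned} \tag{4-245} $$
Again, both integrals represent expected values, and we can write:
$$ \Rightarrow E\left\{ \left( \frac{\mathrm{d} \ln f_{Y|X}(y|x)}{\mathrm{d}x} \right)^2 \right\} = -E\left\{ \frac{\partial^2 \ln f_{Y|X}(y|x)}{\partial x^2} \right\} , \tag{4-246} $$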
and together with equation (4‐243) this proves equation (4‐231). Thus we have proven that any unbiased non‐random estimator will exhibit an estimation error variance which cannot fall below the Cramér‐Rao bound.
We will usually be interested in finding efficient estimators, that is, estimators whose error variance equals the Cramér‐Rao bound. This requires equation (4‐241) to hold. Evaluating equation (4‐241) for $x = \hat{x}_{\mathrm{ML}}(y)$ and considering equation (4‐228), it holds true that
$$ \left. \frac{\partial \ln f_{Y|X}(y|x)}{\partial x} \right|_{x = \hat{x}_{\mathrm{ML}}(y)} = 0 = \left. \left( \hat{x}(y) - x \right) k(x) \right|_{x = \hat{x}_{\mathrm{ML}}(y)} . \tag{4-247} $$
In order for this equation to hold, either $k(\hat{x}_{\mathrm{ML}}(y))$ has to equal zero (which we discard, as this solution does not depend on the data), or $\hat{x}(y) = \hat{x}_{\mathrm{ML}}(y)$ has to hold.
In conclusion, we can say that if an efficient estimator exists, then it equals the maximum likelihood estimator. If no efficient estimator exists, we have no possibility to judge the performance of the ML estimator or of any other estimator by means of the bound. We have to keep in mind that in general, all these statements are only valid for unbiased estimators.
Let us again look at the example that we used at the end of section 4.3.2.2 and that is described by equation (4‐216). We now assume that we have no a priori knowledge. In order to derive the ML estimator for this problem, we can employ equation (4‐228) and the likelihood function according to equation (4‐199) to obtain:
$$ \begin{aligned} \frac{\partial \ln f_{Y|X}(y|x)}{\partial x} &= \frac{\partial}{\partial x} \left[ \ln \frac{1}{\sqrt{2\pi\sigma_\varepsilon^2}} - \frac{y^2 - 2yx + x^2}{2\sigma_\varepsilon^2} \right] \\ &= \left. \frac{1}{2\sigma_\varepsilon^2} \left( 2y - 2x \right) \right|_{x = \hat{x}_{\mathrm{ML}}(y)} = 0 \\ \Rightarrow \hat{x}_{\mathrm{ML}}(y) &= y . \end{aligned} \tag{4-248} $$
For the described problem, the ML estimate equals the measurement value. One could say that this is a quite trivial solution. But using equation (4‐216) it holds true that
$$ E\{\hat{x}_{\mathrm{ML}}(y)\} = E\{y\} = E\{x + \varepsilon\} = x + 0 , \tag{4-249} $$
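thus the ML estimator is unbiased. Moreover, the derivative in equation (4‐248), $(y - x)/\sigma_\varepsilon^2$, has the form required by equation (4‐241) with $k(x) = 1/\sigma_\varepsilon^2$, so the estimator is efficient, and we know for sure that there is no ‘better’ unbiased estimator in terms of minimising the variance of the estimation error. By differentiating equation (4‐248) again with respect to $x$, we obtain:
$$ \frac{\partial^2 \ln f_{Y|X}(y|x)}{\partial x^2} = \frac{\partial}{\partial x} \left( \frac{y - x}{\sigma_\varepsilon^2} \right) = -\frac{1}{\sigma_\varepsilon^2} . \tag{4-250} $$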
By inserting this result in equation (4‐231) and keeping in mind that the ML estimator is efficient, we can compute the concrete value of the CRB for our example as
$$ \operatorname{Var}\{\hat{x}_{\mathrm{ML}} - x\} = \sigma_\varepsilon^2 , \tag{4-251} $$
which means that the accuracy of our estimation is limited by the accuracy of the measurement, a very sound and reasonable result.
We will employ the ML estimator and the CRB in the context of Optimal Sensor Placement in chapter 6. Especially in this task, it is natural to search for a setup of the sensors that is optimal in the sense of achievable estimation accuracy, which means that we will try to minimize the CRB. The CRB is also the inverse of what is called the Fisher Information or Fisher Information Matrix (FIM), being defined as:
$$ \mathrm{FIM}(x) = \mathrm{CRB}^{-1}(x) = E\left\{ \left( \frac{\partial \ln \Lambda(x)}{\partial x} \right)^2 \right\} = -E\left\{ \frac{\partial^2 \ln f_{Y|X}(y|x)}{\partial x^2} \right\} . \tag{4-252} $$
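For a parameter vector the FIM becomes a matrix, and a sensor setup is then usually judged by a scalar figure of merit of that matrix. The following sketch is an illustration under stated assumptions, not the method of chapter 5: it assumes a two‐dimensional target position, independent range measurements with equal Gaussian noise variance, and the well‐known FIM of that model, the sum of the outer products of the unit vectors between sensors and target divided by the noise variance. The function names and the sensor positions are hypothetical.

```python
import math

def range_fim(target, sensors, var_eps):
    """Entries (a, b, d) of the symmetric 2x2 FIM [[a, b], [b, d]] for range-only sensing."""
    a = b = d = 0.0
    for sx, sy in sensors:
        dx, dy = target[0] - sx, target[1] - sy
        r = math.hypot(dx, dy)
        ux, uy = dx / r, dy / r                  # gradient of the range w.r.t. the position
        a += ux * ux / var_eps
        b += ux * uy / var_eps
        d += uy * uy / var_eps
    return a, b, d

def criteria(fim):
    """Determinant, trace of the inverse and smallest eigenvalue of the 2x2 FIM."""
    a, b, d = fim
    det = a * d - b * b
    trace_inv = (a + d) / det
    mean, dev = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    return det, trace_inv, mean - dev

target = (0.0, 0.0)
spread = [(10.0, 0.0), (0.0, 10.0), (-10.0, 0.0)]       # sensors around the target
clustered = [(10.0, 0.0), (10.0, 1.0), (10.0, -1.0)]    # sensors in nearly one direction
det_s, trinv_s, lam_s = criteria(range_fim(target, spread, 1.0))
det_c, trinv_c, lam_c = criteria(range_fim(target, clustered, 1.0))
```

In this example the spread‐out geometry yields the larger determinant, the smaller trace of the inverse and the larger smallest eigenvalue, so all three figures of merit prefer it over the clustered geometry.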
The FIM is a measure of the amount of information an observable random variable $y$ carries about an unknown parameter/variable $x$ or parameter/variable vector $\mathbf{x}$. In the latter case, the FIM is a square matrix with dimension equal to the number of elements in $\mathbf{x}$. It is natural to look for a sensor setup that allows for retrieving the maximum possible amount of information. Several mathematically tractable problems can be stated that ‘maximize the FIM’ (and thereby ‘minimize the CRB’), which according to Ucinski, 2004, can be classified into three groups: maximization of the determinant of the FIM (denoted as D‐optimum design), minimization of the trace of the inverse of the FIM (A‐optimum design) or maximization of the smallest eigenvalue of the FIM (E‐optimum design). In chapter 5, we will employ the D‐optimum design to find an optimal placement of range measurement sensors for target position estimation. This refers to problem 4 from the problem formulation in section 3.2.
4.3.3 State Estimation
The upcoming sections are a very important part of this chapter. They introduce the Kalman filter and some of its variants, which will be of tremendous importance in the scientific part of this thesis in chapters 5‐7. So far, we have discussed important principles that will now be put together. We have seen that the state space representation offers a sound way to handle dynamic systems within the time domain, while especially enabling us to consider internal states of a system, an aspect in which it clearly outperforms the ‘classical’, frequency domain based approach in control theory, which is mainly based on the direct input/output relations of a system or system part. We have become acquainted with the concept of observability and with observers themselves, which enable us to supervise the system states permanently, even though we can usually not measure all of them.
For observable, linear systems, we have introduced the Luenberger observer that makes use both of an adequate system model as well as of measurement data. This concept is limited to deterministic systems though. As for real world systems, the influence of stochastic processes often cannot be neglected, we have introduced the basics for the handling of stochastic variables and signals. Additionally, we have discussed the estimation theory and found ways to estimate the value of a quantity based on measurement data that was generated from the quantity by some probabilistic mapping. In the section on estimation theory, we have already distinguished between scenarios in which we had to assume that the generation of the quantity to be estimated is completely unknown to us (non‐random estimation), and scenarios in which we assumed the quantity to be created by a probabilistic experiment (Bayes estimation).
We will now combine all these principles. We assume that the unknown quantity we want to supervise can be treated as a state in a state space model of a linear dynamic system. We assume that we have a model of the system, while allowing for some uncertainty in modelling. We further assume that we get measurements of the outputs, which are overlaid by some noise. The task will be to create estimates of the system's state. For this purpose, the so‐called Kalman filter can be employed. The Kalman filter is a set of mathematical equations, introduced by the Hungarian‐born American engineer R. E. Kalman (Kalman, 1960), enabling the user to perform state estimation using both available system knowledge and noisy measurements. As it is assumed that we have some a priori knowledge about the unknown quantity in terms of a system model, the Kalman filter is classified as Bayes estimator, and it is based on a minimum variance cost function.
In what follows, we will introduce the Kalman filter, starting with the discrete time approach. This is due to the fact that on the one hand, the applications stated in chapters 5‐7 will all be based on discrete‐time descriptions. On the other hand, the discrete time approach allows for an easy introduction, as derivatives and integrals are simply replaced by differences and sums. After the discrete time Kalman filter has been introduced, we will transfer the concept into the continuous‐time domain. After that, we will consider nonlinear systems and start with the Extended Kalman filter, which is based on a linearization around the current estimate, and the Unscented Kalman filter, which employs the unscented transformation to deal with the nonlinearity. The discussions in this section and the following subsections are mainly based on Simon, 2006. Additional inspiration was gained from Brammer and Siffling, 1994.
4.3.3.1 Kalman Filter: System Description and Basics
Our discussions are based on a system in discrete time state space representation according to equation (4‐46) with sample time $T$ and feedthrough matrix $\mathbf{D} = \mathbf{0}$, which is valid for most real systems. Also, to cover time variant systems, we will allow the system, input and output matrices to depend on the current time step. Additionally, we now introduce two vectors of stochastic signals that disturb the overall process: the so‐called process noise $\mathbf{w}(k)$, which has the dimension of the state vector and represents model inaccuracies, and the measurement noise $\mathbf{v}(k)$, which has the dimension of the output vector and represents the noise that is added to the true system outputs by the measurement process:
$$ \begin{aligned} \mathbf{x}(k+1) &= \mathbf{A}(k)\, \mathbf{x}(k) + \mathbf{B}(k)\, \mathbf{u}(k) + \mathbf{w}(k) , \\ \mathbf{y}(k) &= \mathbf{C}(k)\, \mathbf{x}(k) + \mathbf{v}(k) . \end{aligned} \tag{4-253} $$
For all stochastic processes included in $\mathbf{w}(k)$ and $\mathbf{v}(k)$, it is assumed that they are white, Gaussian and zero‐mean, and that all processes in $\mathbf{w}(k)$ are uncorrelated with those of $\mathbf{v}(k)$ (whereas within $\mathbf{w}(k)$ and within $\mathbf{v}(k)$, the signals are allowed to be correlated with each other). We have previously introduced the covariance matrix for vectors of stochastic variables, containing the variances of the variables on the main diagonal and the covariances between the single variables outside of the main diagonal. Therefore, we now introduce the covariance matrix $\mathbf{Q}(k)$ for the process noise $\mathbf{w}(k)$ and the covariance matrix $\mathbf{R}(k)$ for the measurement noise $\mathbf{v}(k)$. Note that for time invariant systems, both matrices can be assumed to be constant. Also note that both matrices, especially the measurement noise covariance matrix, will often be assumed to be diagonal, in cases where it can be assumed that there is no correlation between the single noises in the vector. We can summarize the stated properties of the noises in the following form:
$$ \begin{aligned} \mathbf{w}(k) &\sim \mathcal{N}\left(\mathbf{0}, \mathbf{Q}(k)\right) \\ \mathbf{v}(k) &\sim \mathcal{N}\left(\mathbf{0}, \mathbf{R}(k)\right) \\ E\left\{ \mathbf{w}(i)\, \mathbf{w}^T(j) \right\} &= \mathbf{Q}(i)\, \delta_{ij} \\ E\left\{ \mathbf{v}(i)\, \mathbf{v}^T(j) \right\} &= \mathbf{R}(i)\, \delta_{ij} \\ E\left\{ \mathbf{v}(i)\, \mathbf{w}^T(j) \right\} &= \mathbf{0} \end{aligned} \tag{4-254} $$
where $\delta_{ij} = 1$ for $i = j$ and $\delta_{ij} = 0$ otherwise, denoted as Kronecker delta.
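The system description (4‐253) with the noise properties (4‐254) can be simulated directly. The sketch below uses a scalar, time invariant instance; the coefficients, the noise variances and the constant input are assumed values chosen only for illustration.

```python
import random

def simulate(a, b, c, q, r, x0, inputs, seed=0):
    """Simulate a scalar, time invariant instance of (4-253) with Q = q and R = r."""
    rng = random.Random(seed)
    x, states, outputs = x0, [], []
    for u in inputs:
        states.append(x)
        outputs.append(c * x + rng.gauss(0.0, r ** 0.5))   # y(k) = C x(k) + v(k)
        x = a * x + b * u + rng.gauss(0.0, q ** 0.5)       # x(k+1) = A x(k) + B u(k) + w(k)
    return states, outputs

states, outputs = simulate(a=0.95, b=0.1, c=1.0, q=0.01, r=0.25,
                           x0=0.0, inputs=[1.0] * 2000)

# With c = 1 the difference between output and state is the measurement noise v(k)
residuals = [y - s for y, s in zip(outputs, states)]
mean_v = sum(residuals) / len(residuals)
var_v = sum((e - mean_v) ** 2 for e in residuals) / len(residuals)
```

The empirical mean and variance of the residuals should lie close to zero and to the chosen value of R, respectively; in a real application the true states are of course not available, which is the reason an estimator is needed.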
Figure 4‐34 shows the block diagram of the extended state space representation in discrete time. The block labelled $z^{-1}$ is a delay block that holds and delays its input for the time period $T$, according to the z‐transformation, which we have not introduced here.
Figure 4‐34: State space representation including process and measurement noise
Before we continue, it is reasonable to think for a moment about the different estimation possibilities that are available now. So far, we have assumed that the quantity to estimate is either a stochastic variable, or that nothing is known about its nature. Now, in the discrete time state space representation, the states to estimate are collected in a vector as a function of the current time step, and we assume that there is a relation between the state values of different time steps that we are able to model. Therefore, it is not reasonable to treat the state vector in every time step as ‘stand‐alone’; instead, we should make use of the relations we have found by the modelling process. By looking at the system description in equation (4‐253) and neglecting the disturbing influence of the noises for a moment, we can see that the state vector of time step $k+1$ is solely a function of values at time step $k$, which means that the state vector of $k$ is solely a function of values at time step $k-1$ (in fact, in some literature the vector state difference equation is written in that form). This gives us the opportunity to make an estimate for the state vector in a time step $k$, solely based on values of $k-1$, without having knowledge of the measurements of time step $k$. This estimate can even be made at time step $k-1$, so we can say that we predict the state vector for the next time step. In the context of Kalman filtering, this predictive estimation is referred to as a priori estimation, denoted with a superscripted minus sign: $\hat{x}^-(k)$. The term a priori is used similarly as within the Bayes estimation introduced in section 4.3.2.1; here it denotes that we make an estimate for time step $k$ without employing measurement data of time step $k$; the latest measurement we can use is the one from time step $k-1$. Therefore, the estimation from $k-1$ to $k$ is built on our a priori knowledge of the system behaviour.
Aiming for an unbiased estimator, we can say that the a priori estimate for time step $k$ is the expected value of the system state vector at time step $k$, based on the knowledge available at time step $k-1$. This can be written as:

$$\hat{x}^-(k) = E\left[x(k) \mid y(1), y(2), \ldots, y(k-1)\right]. \tag{4-255}$$
Note that we have used the vertical line in $x(k) \mid \ldots$ in the same way as before for the conditional probability: after the line, we note the elements (or events) that we assume 'have happened' or 'are available'. Now it is straightforward to introduce another estimate of $x(k)$, which we can compute after the measurement $y(k)$ is available. We refer to this estimate as a posteriori estimation $\hat{x}^+(k)$, in the same sense as for the Bayes estimation in section 4.3.2.1, namely as the estimation after the current measurements are available. We can also say that we use the information brought by the measurements to correct the previously made prediction. Therefore, the structure of the Kalman filter is also called a predictor-corrector structure. However, the algorithm that we are going to develop for the a posteriori estimation will employ both the measurements and the prediction: we have to keep in mind that the measurements are disturbed by the measurement noise; therefore our algorithm should also consider the predicted system behaviour and not rely solely on measurement values, which might be heavily corrupted. Mathematically, we can write for the a posteriori estimation:

$$\hat{x}^+(k) = E\left[x(k) \mid y(1), y(2), \ldots, y(k-1), y(k)\right]. \tag{4-256}$$
Figure 4-35 shows the difference between the two estimates. We assume that we are currently at the time step marked by the upward arrow, which means that the measurements of that time step are already available. The a posteriori estimate is the one for the current time step, for which the measurement data is already available, while the a priori estimate reaches one step into the future. The quality bar at the bottom shows the typical accuracy that can be reached, with the left end representing a better and the right end a worse accuracy. Our goal is to improve on the a priori estimate with the a posteriori one: because the latter can also be based on measurement data, it lies closer to the left end of the bar.
Figure 4‐35: Different estimates of a discrete time variable
For the sake of completeness, we will also name the two other possible classes of estimates, even though they are not employed within the Kalman filter. It is possible to estimate the state vector for time steps that lie further in the future. Assume we want to estimate the state vector at time step $k$, but currently we are only at time step $k-M$. We refer to this situation as a predicted estimate

$$\hat{x}(k \mid k-M) = E\left[x(k) \mid y(1), y(2), \ldots, y(k-M)\right] \tag{4-257}$$

of time step $k$, performed at time step $k-M$. The larger $M$ becomes, the lower is the accuracy of our estimate. This is easy to see: our predicted estimate can only be based on the vector state difference equation according to equation (4-253). In every time step, there is an additional influence of the process noise that we cannot predict. Therefore, the accuracy of our estimate gets worse with every further time step. It is also possible to estimate the state at a time step $k$ in the past, incorporating measurement data between $k$ and the current time step $k+N$. We would expect the quality of this estimate to rise, as we can use more measurement data. Such an estimate is denoted as smoothed estimate:

$$\hat{x}(k \mid k+N) = E\left[x(k) \mid y(1), y(2), \ldots, y(k), \ldots, y(k+N)\right]. \tag{4-258}$$
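The loss of accuracy with growing $M$ can be illustrated with a minimal sketch. It assumes a hypothetical scalar system $x(k+1) = a\,x(k) + w(k)$ with process noise variance $q$ and no measurements between $k-M$ and $k$, and it uses the scalar form of the variance recursion that is derived for the a priori step later in this section (equation (4-268)). The names `predicted_variance`, `p0`, `a` and `q` and all numbers are illustrative assumptions, not values from the text.

```python
def predicted_variance(p0, a, q, steps):
    """Error variance after `steps` pure prediction steps (no measurements).

    Each step applies p <- a*p*a + q, i.e. the error variance is passed
    through the state equation and the process noise variance is added.
    """
    p = p0
    history = [p]
    for _ in range(steps):
        p = a * p * a + q
        history.append(p)
    return history


# Hypothetical values: random-walk-like system (a = 1), small initial variance.
variances = predicted_variance(p0=0.01, a=1.0, q=0.05, steps=5)
for m, p in enumerate(variances):
    print(f"M = {m}: predicted error variance = {p:.2f}")
```

For this choice of `a`, every additional prediction step adds the full process noise variance, so the variance of the predicted estimate grows monotonically with `M`.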
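To come back to the Kalman filter, we have seen that we perform two estimations in every time step: an a priori estimation $\hat{\mathbf{x}}^-(k)$, which does not depend on measurement data of time step $k$, and an a posteriori estimation $\hat{\mathbf{x}}^+(k)$, which incorporates the information provided by the measurements in step $k$. As said above, the Kalman filter is based on a minimum variance cost function. That means we must be able to express the variance of the estimation error $\boldsymbol{\epsilon}(k) \in \mathbb{R}^n$. We will therefore obtain a covariance matrix, in line with the earlier discussion of covariance matrices, which is denoted as $\mathbf{P}(k) \in \mathbb{R}^{n \times n}$. As we have two estimations in every time step, we also have two estimation errors and therefore two covariance matrices, which are likewise denoted as a priori and a posteriori:

$$\begin{aligned}
\boldsymbol{\epsilon}^-(k) &= \mathbf{x}(k) - \hat{\mathbf{x}}^-(k); & \mathbf{P}^-(k) &= E\left[\boldsymbol{\epsilon}^-(k)\,\boldsymbol{\epsilon}^{-T}(k)\right], \\
\boldsymbol{\epsilon}^+(k) &= \mathbf{x}(k) - \hat{\mathbf{x}}^+(k); & \mathbf{P}^+(k) &= E\left[\boldsymbol{\epsilon}^+(k)\,\boldsymbol{\epsilon}^{+T}(k)\right].
\end{aligned} \tag{4-259}$$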
Note that in the equations for the covariance matrix, we have assumed that the expected value of the estimation error is zero, that is, that the Kalman filter is unbiased; we will show that in the process of describing the algorithm.
Figure 4‐36: A priori and a posteriori estimates of a Kalman filter with typical course of error variance
Figure 4-36 shows the computation principle of the a priori and a posteriori estimates. Each a priori estimate and the corresponding covariance matrix are computed from the a posteriori estimate and covariance matrix of the previous time step; additionally, the input values of the last time step can be used. The a posteriori estimates and matrices are computed from the a priori estimates and matrices of the same time step, together with the measurements of the current time step. In the upper half of Figure 4-36, a typical course of one of the estimation error variances is displayed. Whenever an a priori estimate is computed, we can expect the variance to be larger than it was at the last a posteriori estimation. This is due to the fact that the a priori estimation can solely be based on the vector state difference equation according to equation (4-253), while we have no information about the value of the process noise in the current time step. This raises the uncertainty of the estimation, that is, the error covariance. When we perform the a posteriori estimation, we have access to the measurement of the corresponding time step. According to the output equation in (4-253), the current outputs depend on the current states. The process noise has therefore influenced the output, and through our measurement we may be able to assess that influence. But it must be kept in mind that the measurement is corrupted by the measurement noise, so the Kalman filter has to find a compromise between relying solely on the a priori estimation (which would neglect the influence of the process noise) and relying solely on the measurements (which would neglect the effect of the measurement noise).
For the a posteriori estimation, we have access to both the a priori estimation (which brings the system model into the equation) and the measurements; therefore we usually expect the a posteriori estimation error variance to be smaller than the preceding a priori one. Before we discuss the two estimations necessary for the Kalman filter, we will look at the initialization process. We assume that the first measurement will be available at step $k = 1$. Therefore, it is reasonable to initialize the estimation vector as the a posteriori estimation for $k = 0$, set to the expected value of the original state vector at the same time step:

$$\hat{\mathbf{x}}^+(0) = E\left[\mathbf{x}(0)\right]. \tag{4-260}$$
The a posteriori covariance matrix is usually set to a diagonal matrix, possibly with identical values, to express the uncertainty of the initialization of the estimation:

$$\mathbf{P}^+(0) = c\,\mathbf{I}, \tag{4-261}$$
where $c$ is a real number which is set to a large value if no good information on $E[\mathbf{x}(0)]$ was available to initialize the estimation vector, or to a small value otherwise. Note that in real applications, it is common to perform a precise initialization whenever possible, in order to create good conditions for the filter. In the navigation of marine underwater vehicles, the initialization is often done while the vehicle is still at the surface and has GPS access, so the navigation filter can be initialized with good accuracy. For nonlinear filters, this approach is mandatory (see section 4.3.3.6). In the following, we will derive the a priori and a posteriori estimations in the form of a recursive algorithm.
4.3.3.2 A Priori Estimation

As stated before, we need to find an algorithm to compute $\hat{\mathbf{x}}^-(k)$, based on the knowledge of $\hat{\mathbf{x}}^+(k-1)$, $\mathbf{P}^+(k-1)$, and $\mathbf{u}(k-1)$. According to equation (4-255), the a priori estimation at time step $k$ equals the expected value of $\mathbf{x}$ at time step $k$, based on the measurement data up to time step $k-1$. It is straightforward to say that the measurements $\mathbf{y}(k-1)$ are incorporated in the a posteriori estimation $\hat{\mathbf{x}}^+(k-1)$. To get an algorithm for $\hat{\mathbf{x}}^-(k)$, we must compute $E[\mathbf{x}(k)]$ without employing knowledge of $\mathbf{y}(k)$. We can look at the vector state difference equation (4-253) and apply the expected value on both sides of the equation. Employing the principle obtained in equation (4-165), we can write:

$$\begin{aligned}
E\left[\mathbf{x}(k)\right] &= E\left[\mathbf{A}(k-1)\,\mathbf{x}(k-1) + \mathbf{B}(k-1)\,\mathbf{u}(k-1) + \mathbf{w}(k-1)\right] \\
&= \mathbf{A}(k-1)\,E\left[\mathbf{x}(k-1)\right] + \mathbf{B}(k-1)\,\mathbf{u}(k-1).
\end{aligned} \tag{4-262}$$
Note that $E[\mathbf{B}(k-1)\,\mathbf{u}(k-1)] = \mathbf{B}(k-1)\,\mathbf{u}(k-1)$, because both quantities are deterministic. The expected value of the process noise is zero, as we assumed it to be zero-mean. But how can we express $E[\mathbf{x}(k-1)]$? For $k = 1$, this is easy when we look at equation (4-260), as $E[\mathbf{x}(0)] = \hat{\mathbf{x}}^+(0)$. It is straightforward to employ the same principle for all $k > 1$, as long as the a posteriori estimation is unbiased, because then the relation $E[\mathbf{x}(k-1)] = \hat{\mathbf{x}}^+(k-1)$ always holds. Therefore we can write:

$$\hat{\mathbf{x}}^-(k) = E\left[\mathbf{x}(k)\right] = \mathbf{A}(k-1)\,\hat{\mathbf{x}}^+(k-1) + \mathbf{B}(k-1)\,\mathbf{u}(k-1). \tag{4-263}$$
This is the first of the Kalman equations. We see that the a priori estimation is obtained from the a posteriori estimation of the last time step by passing it through the vector state difference equation. By applying the expected value to equation (4-263), we see directly that the a priori estimator can be considered unbiased, as its expected value equals the expected value of $\mathbf{x}(k)$. Because we have no way to consider the process noise, we expect a rise in the estimation error covariance matrix, as shown in Figure 4-36. According to equation (4-259), the covariance matrix is:

$$\mathbf{P}^-(k) = E\left[\left(\mathbf{x}(k) - \hat{\mathbf{x}}^-(k)\right)\left(\mathbf{x}(k) - \hat{\mathbf{x}}^-(k)\right)^T\right]. \tag{4-264}$$
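By inserting equations (4-253) and (4-263) into (4-264), we obtain:

$$\mathbf{P}^-(k) = E\left[\left(\mathbf{A}(k-1)\,\mathbf{x}(k-1) + \mathbf{B}(k-1)\,\mathbf{u}(k-1) + \mathbf{w}(k-1) - \mathbf{A}(k-1)\,\hat{\mathbf{x}}^+(k-1) - \mathbf{B}(k-1)\,\mathbf{u}(k-1)\right)\left(\ldots\right)^T\right]. \tag{4-265}$$

Note that the two terms $\mathbf{B}(k-1)\,\mathbf{u}(k-1)$ cancel each other. After factoring $\mathbf{A}(k-1)$ out, we can execute the multiplication with $(\ldots)^T$ to obtain:

$$\begin{aligned}
\mathbf{P}^-(k) = E\Big\{ &\mathbf{A}(k-1)\left(\mathbf{x}(k-1) - \hat{\mathbf{x}}^+(k-1)\right)\left(\mathbf{x}(k-1) - \hat{\mathbf{x}}^+(k-1)\right)^T \mathbf{A}^T(k-1) \\
&+ \mathbf{A}(k-1)\left(\mathbf{x}(k-1) - \hat{\mathbf{x}}^+(k-1)\right)\mathbf{w}^T(k-1) \\
&+ \mathbf{w}(k-1)\left(\mathbf{x}(k-1) - \hat{\mathbf{x}}^+(k-1)\right)^T \mathbf{A}^T(k-1) \\
&+ \mathbf{w}(k-1)\,\mathbf{w}^T(k-1) \Big\}.
\end{aligned} \tag{4-266}$$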
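We have seen in equation (4-165) that the expected value of a sum equals the sum of the expected values of the summands. Furthermore, we can replace the differences of states and estimates by the estimation error according to equation (4-259) to obtain:

$$\begin{aligned}
\mathbf{P}^-(k) = {} & E\left[\mathbf{A}(k-1)\,\boldsymbol{\epsilon}^+(k-1)\,\boldsymbol{\epsilon}^{+T}(k-1)\,\mathbf{A}^T(k-1)\right] + E\left[\mathbf{A}(k-1)\,\boldsymbol{\epsilon}^+(k-1)\,\mathbf{w}^T(k-1)\right] \\
& + E\left[\mathbf{w}(k-1)\,\boldsymbol{\epsilon}^{+T}(k-1)\,\mathbf{A}^T(k-1)\right] + E\left[\mathbf{w}(k-1)\,\mathbf{w}^T(k-1)\right].
\end{aligned} \tag{4-267}$$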
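For the second and third summand, the following holds: both contain the product of the two stochastic variables $\boldsymbol{\epsilon}^+(k-1)$ and $\mathbf{w}(k-1)$ (or their transposes). Note that $\boldsymbol{\epsilon}^+(k-1)$ is a stochastic variable based on $\hat{\mathbf{x}}^+(k-1)$, which in turn is based on $\mathbf{y}(k-1)$ and is therefore correlated with the measurement noise $\mathbf{v}(k-1)$. But $\mathbf{v}(k-1)$ is by definition uncorrelated with $\mathbf{w}(k-1)$, see equation (4-254). The error also depends on the process noise of earlier time steps, which is uncorrelated with $\mathbf{w}(k-1)$ because the process noise is white. According to equation (4-174), the expected value of a product of two uncorrelated stochastic variables equals the product of their expected values. In this case, the expected values of both $\boldsymbol{\epsilon}^+(k-1)$ and $\mathbf{w}(k-1)$ are zero; therefore the second and third summand in equation (4-267) equal zero. In the first and fourth summand, we can replace the expected values according to equations (4-259) and (4-254), respectively, and we can write:

$$\begin{aligned}
\mathbf{P}^-(k) &= \mathbf{A}(k-1)\,E\left[\boldsymbol{\epsilon}^+(k-1)\,\boldsymbol{\epsilon}^{+T}(k-1)\right]\mathbf{A}^T(k-1) + E\left[\mathbf{w}(k-1)\,\mathbf{w}^T(k-1)\right] \\
\Rightarrow \mathbf{P}^-(k) &= \mathbf{A}(k-1)\,\mathbf{P}^+(k-1)\,\mathbf{A}^T(k-1) + \mathbf{Q}(k-1).
\end{aligned} \tag{4-268}$$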
This is the second Kalman filter equation. It computes the current a priori estimation error covariance matrix based on the a posteriori one of the last time step. We see that the covariance matrix of the process noise is directly added, so that a large $\mathbf{Q}$ also contributes to a large rise of $\mathbf{P}^-$, which is a very sound finding. Also, assume that a filter can be initialized absolutely correctly, so that $\mathbf{P}^+(0) = \mathbf{0}$. In this case, it holds true that $\mathbf{P}^-(1) = \mathbf{Q}$. This is also a reasonable result, as in this case all the uncertainty after the first step is caused solely by the process noise.

4.3.3.3 A Posteriori Estimation

Now we must answer the following question: assuming that we have an a priori estimation which contains the information of the inputs and the system model, how can we update this estimation as soon as measurements of the outputs are available? We want the resulting a posteriori estimation to be unbiased, and we would like to minimize the variance of the estimation error. As for the a priori estimation before, we would like to find a recursive algorithm that computes the current estimate based on the prior estimates. To find a solution, we start by tackling a similar, but simpler problem: imagine we have a constant parameter vector $\mathbf{x}$ that we want to estimate. We obtain measurements of the quantity $\mathbf{C}(k)\,\mathbf{x}$ in every time step $k$, overlaid by zero-mean white Gaussian noise $\mathbf{v}(k)$ with covariance matrix $\mathbf{R}(k)$. The principle is sketched in Figure 4-37. It can be stated that

$$\mathbf{y}(k) = \mathbf{C}(k)\,\mathbf{x} + \mathbf{v}(k). \tag{4-269}$$
Figure 4‐37: Simplified approach for the derivation of the a posteriori estimation
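In every time step, after the measurements are available, we are supposed to compute an estimate of the parameter vector, denoted as $\hat{\mathbf{x}}(k)$. We also introduce the covariance matrix $\mathbf{P}(k)$ related to the estimation error $\boldsymbol{\epsilon}(k)$ as

$$\mathbf{P}(k) = E\left[\boldsymbol{\epsilon}(k)\,\boldsymbol{\epsilon}^T(k)\right] = E\left[\left(\mathbf{x} - \hat{\mathbf{x}}(k)\right)\left(\mathbf{x} - \hat{\mathbf{x}}(k)\right)^T\right]. \tag{4-270}$$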
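It is straightforward to use the same initialization as we introduced for the overall Kalman filter in section 4.3.3.1:

$$\hat{\mathbf{x}}(0) = E\left[\mathbf{x}\right], \quad \mathbf{P}(0) = E\left[\left(\mathbf{x} - \hat{\mathbf{x}}(0)\right)\left(\mathbf{x} - \hat{\mathbf{x}}(0)\right)^T\right] = c\,\mathbf{I}, \tag{4-271}$$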
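where the real scalar $c$ is set according to our trust in the initial value $\hat{\mathbf{x}}(0)$. In the described situation, where we look for a way to update our estimate from $\hat{\mathbf{x}}(k-1)$ to $\hat{\mathbf{x}}(k)$ upon the measurements $\mathbf{y}(k)$, it is straightforward to employ the following recursive linear estimator:

$$\hat{\mathbf{x}}(k) = \hat{\mathbf{x}}(k-1) + \mathbf{K}(k)\left(\mathbf{y}(k) - \mathbf{C}(k)\,\hat{\mathbf{x}}(k-1)\right), \tag{4-272}$$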
where $\mathbf{K}(k)$ is a gain matrix which we can use to tune the estimator; it is also denoted as Kalman (filter) gain. We see that the term in the round brackets is the difference between the received measurements, $\mathbf{y}(k)$, and the ones we would expect based on our current estimate, $\mathbf{C}(k)\,\hat{\mathbf{x}}(k-1)$. This difference is referred to as the correction term. If the correction term exhibits large values, this might indicate that our current estimate is still far from the true values, which might result in a large correction. However, as discussed before, even if at some point our estimate $\hat{\mathbf{x}}(k-1)$ exactly equals the true vector $\mathbf{x}$, the correction term might still not be zero, as the measurement is corrupted by the measurement noise $\mathbf{v}(k)$. Therefore we need an algorithm to compute $\mathbf{K}(k)$ which considers the influence of the noise. But before we look at this, we need to investigate whether, and under which conditions on $\mathbf{K}(k)$, the estimator in equation (4-272) is unbiased. We can do this by computing the expected value of the estimation error, $\boldsymbol{\epsilon}(k) = \mathbf{x} - \hat{\mathbf{x}}(k)$, where we replace $\hat{\mathbf{x}}(k)$ according to equation (4-272). We obtain:

$$E\left[\boldsymbol{\epsilon}(k)\right] = E\left[\mathbf{x} - \hat{\mathbf{x}}(k)\right] = E\left[\mathbf{x} - \hat{\mathbf{x}}(k-1) - \mathbf{K}(k)\left(\mathbf{y}(k) - \mathbf{C}(k)\,\hat{\mathbf{x}}(k-1)\right)\right]. \tag{4-273}$$
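Note that $\mathbf{x} - \hat{\mathbf{x}}(k-1)$ can be replaced by $\boldsymbol{\epsilon}(k-1)$, and $\mathbf{y}(k)$ can be replaced according to equation (4-269):

$$\begin{aligned}
E\left[\boldsymbol{\epsilon}(k)\right] &= E\left[\boldsymbol{\epsilon}(k-1) - \mathbf{K}(k)\left(\mathbf{C}(k)\,\mathbf{x} + \mathbf{v}(k) - \mathbf{C}(k)\,\hat{\mathbf{x}}(k-1)\right)\right] \\
&= E\left[\boldsymbol{\epsilon}(k-1) - \mathbf{K}(k)\,\mathbf{C}(k)\left(\mathbf{x} - \hat{\mathbf{x}}(k-1)\right) - \mathbf{K}(k)\,\mathbf{v}(k)\right] \\
&= \left(\mathbf{I} - \mathbf{K}(k)\,\mathbf{C}(k)\right)E\left[\boldsymbol{\epsilon}(k-1)\right] - \mathbf{K}(k)\,E\left[\mathbf{v}(k)\right].
\end{aligned} \tag{4-274}$$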
As the measurement noise is zero-mean, the subtrahend in this equation equals zero. That means: the expected value of the estimation error in step $k$ is zero if the same is true at time step $k-1$. Due to the chosen initialisation according to equation (4-271), the expected value is zero at all time steps, and the estimator is unbiased. Interestingly enough, this property is independent of the chosen values for $\mathbf{K}(k)$. That means we can choose $\mathbf{K}(k)$ arbitrarily without destroying the unbiasedness. During our discussions on non-random estimations in section 4.3.2.3, we have seen that a minimal variance of the estimation error is a suitable optimization criterion for an unbiased estimator. As stated before, the Kalman filter is a minimum variance estimator. Therefore, we need to find a $\mathbf{K}(k)$ that minimizes the variances at time step $k$, and we define the cost function as the sum of the error variances of all $n$ components:

$$\begin{aligned}
C_k &= \sum_{i=1}^{n} \mathrm{Var}\left(x_i - \hat{x}_i(k)\right) = E\left[\left(x_1 - \hat{x}_1(k)\right)^2\right] + \cdots + E\left[\left(x_n - \hat{x}_n(k)\right)^2\right] \\
&= E\left[\epsilon_1^2(k) + \cdots + \epsilon_n^2(k)\right] = E\left[\boldsymbol{\epsilon}^T(k)\,\boldsymbol{\epsilon}(k)\right].
\end{aligned} \tag{4-275}$$
For the final transformation, note that the sum of the squares of the elements of a vector can be expressed as the product of the transposed vector with the original one. On the other hand, if we swap the vectors in the multiplication, we obtain the expression $\boldsymbol{\epsilon}(k)\,\boldsymbol{\epsilon}^T(k)$, which spans a matrix that has the squares of the elements on its main diagonal. The trace operation $\mathrm{tr}(\cdot)$ returns the sum of the elements on the main diagonal of a matrix. If we further consider that the expected value of $\boldsymbol{\epsilon}(k)\,\boldsymbol{\epsilon}^T(k)$ equals the covariance matrix $\mathbf{P}(k)$ according to equation (4-270), we can write:

$$C_k = E\left[\boldsymbol{\epsilon}^T(k)\,\boldsymbol{\epsilon}(k)\right] = E\left[\mathrm{tr}\left(\boldsymbol{\epsilon}(k)\,\boldsymbol{\epsilon}^T(k)\right)\right] = \mathrm{tr}\,\mathbf{P}(k). \tag{4-276}$$
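The identity between the sum of squares and the trace of the outer product can be checked numerically. The following lines are only a sketch for a single, randomly drawn error vector; the dimension 4, the seed and all variable names are arbitrary assumptions, and NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(0)
eps = rng.normal(size=(4, 1))            # one hypothetical error vector, n = 4

sum_of_squares = (eps.T @ eps).item()    # epsilon^T epsilon, a scalar
outer = eps @ eps.T                      # epsilon epsilon^T, an n x n matrix
trace_of_outer = np.trace(outer)         # sum of the main diagonal

print(sum_of_squares, trace_of_outer)
```

Both printed numbers agree up to rounding. Taking the expected value on both sides then leads from the scalar cost to the trace of the covariance matrix.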
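Now we have to find the $\mathbf{K}(k)$ that minimizes $C_k$. In order to do so, we need to express the current value of $\mathbf{P}(k)$ in a recursive algorithm based on $\mathbf{P}(k-1)$ and $\mathbf{K}(k)$. By inserting the result of equation (4-274) into (4-270) we obtain:

$$\begin{aligned}
\mathbf{P}(k) = {} & E\left[\boldsymbol{\epsilon}(k)\,\boldsymbol{\epsilon}^T(k)\right] \\
= {} & E\left\{\left[\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\boldsymbol{\epsilon}(k-1) - \mathbf{K}(k)\mathbf{v}(k)\right]\left[\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\boldsymbol{\epsilon}(k-1) - \mathbf{K}(k)\mathbf{v}(k)\right]^T\right\} \\
= {} & \left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)E\left[\boldsymbol{\epsilon}(k-1)\,\boldsymbol{\epsilon}^T(k-1)\right]\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)^T \\
& - \left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)E\left[\boldsymbol{\epsilon}(k-1)\,\mathbf{v}^T(k)\right]\mathbf{K}^T(k) \\
& - \mathbf{K}(k)\,E\left[\mathbf{v}(k)\,\boldsymbol{\epsilon}^T(k-1)\right]\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)^T \\
& + \mathbf{K}(k)\,E\left[\mathbf{v}(k)\,\mathbf{v}^T(k)\right]\mathbf{K}^T(k).
\end{aligned} \tag{4-277}$$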
We can now argue in a similar way as for equation (4-267): because the measurement noise of the current step $\mathbf{v}(k)$ is certainly uncorrelated with the estimation error $\boldsymbol{\epsilon}(k-1)$ of the last time step, we can transform the expected value of the product in the second and third summand of the last equation into the product of the expected values of the two factors. Since the measurement noise is zero-mean, the second and third summand equal zero. We can further replace the expected value in the first summand by the covariance matrix of the last time step. The expected value in the fourth summand can be replaced by the measurement noise covariance matrix $\mathbf{R}(k)$, according to equation (4-254). This results in:

$$\mathbf{P}(k) = \left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}(k-1)\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)^T + \mathbf{K}(k)\,\mathbf{R}(k)\,\mathbf{K}^T(k). \tag{4-278}$$
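We see that a large $\mathbf{R}(k)$ or a large $\mathbf{K}(k)$ might result in a rise of the values in $\mathbf{P}(k)$. This seems reasonable; a large $\mathbf{K}(k)$ will enlarge the effect of the correction term in equation (4-272), which is corrupted by the measurement noise contained in $\mathbf{y}(k)$. We need to find the $\mathbf{K}(k)$ that minimizes the cost function. By inserting equation (4-278) into (4-276), differentiating with respect to $\mathbf{K}(k)$ and setting the result equal to zero we obtain:

$$\frac{\partial C_k}{\partial \mathbf{K}(k)} = \frac{\partial\,\mathrm{tr}\,\mathbf{P}(k)}{\partial \mathbf{K}(k)} = \frac{\partial\,\mathrm{tr}\left[\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}(k-1)\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)^T + \mathbf{K}(k)\,\mathbf{R}(k)\,\mathbf{K}^T(k)\right]}{\partial \mathbf{K}(k)} = \mathbf{0}. \tag{4-279}$$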
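This equation can be solved employing some knowledge about the differentiation of matrix expressions. The following two important rules can be found e.g. in Skelton et al., 1998:

$$\frac{\partial\left(\mathbf{f}^T\mathbf{x}\right)}{\partial \mathbf{x}} = \mathbf{f}; \qquad \frac{\partial\,\mathrm{tr}\left(\mathbf{A}\,\mathbf{B}\,\mathbf{A}^T\right)}{\partial \mathbf{A}} = 2\,\mathbf{A}\,\mathbf{B} \quad \text{if } \mathbf{B} \text{ is symmetric.} \tag{4-280}$$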
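We can see that the condition for the second equation is fulfilled, as both $\mathbf{P}(k-1)$ and $\mathbf{R}(k)$ are covariance matrices, which are always symmetric (see the earlier discussion of covariance matrices). To solve equation (4-279), we can compute the derivative for both summands separately. For the second summand, we can directly apply the second equation in (4-280). In the first summand, we can set $\mathbf{A} = \mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)$, but we have to keep in mind that we need to differentiate with respect to $\mathbf{K}(k)$. Therefore, according to the chain rule, we need to multiply the result with the derivative of $\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)$ with respect to $\mathbf{K}(k)$, which equals $-\mathbf{C}^T(k)$ according to the first equation in (4-280). Thus we obtain:

$$\begin{aligned}
\frac{\partial C_k}{\partial \mathbf{K}(k)} &= 2\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}(k-1)\left(-\mathbf{C}^T(k)\right) + 2\,\mathbf{K}(k)\,\mathbf{R}(k) = \mathbf{0} \\
\Rightarrow \mathbf{K}(k)\,\mathbf{R}(k) &= \left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}(k-1)\,\mathbf{C}^T(k) \\
\Rightarrow \mathbf{K}(k)\left(\mathbf{R}(k) + \mathbf{C}(k)\,\mathbf{P}(k-1)\,\mathbf{C}^T(k)\right) &= \mathbf{P}(k-1)\,\mathbf{C}^T(k) \\
\Rightarrow \mathbf{K}(k) &= \mathbf{P}(k-1)\,\mathbf{C}^T(k)\left(\mathbf{R}(k) + \mathbf{C}(k)\,\mathbf{P}(k-1)\,\mathbf{C}^T(k)\right)^{-1}.
\end{aligned} \tag{4-281}$$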
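That the gain of equation (4-281) indeed minimizes the trace of the covariance matrix in equation (4-278) can be checked numerically. The sketch below compares the cost at the computed gain with the cost at randomly perturbed gains. The matrices `P_prev`, `C` and `R`, the perturbation size and the function names are hypothetical choices for illustration; NumPy is assumed to be available.

```python
import numpy as np


def joseph_update(P_prev, K, C, R):
    """Covariance after the update for an arbitrary gain K, equation (4-278)."""
    I = np.eye(P_prev.shape[0])
    return (I - K @ C) @ P_prev @ (I - K @ C).T + K @ R @ K.T


def optimal_gain(P_prev, C, R):
    """Gain according to equation (4-281)."""
    return P_prev @ C.T @ np.linalg.inv(R + C @ P_prev @ C.T)


# Hypothetical numbers: two parameters, one scalar measurement.
P_prev = np.array([[2.0, 0.3], [0.3, 1.0]])
C = np.array([[1.0, 0.5]])
R = np.array([[0.4]])

K_opt = optimal_gain(P_prev, C, R)
cost_opt = np.trace(joseph_update(P_prev, K_opt, C, R))

# Evaluate the cost for 100 randomly perturbed gains.
rng = np.random.default_rng(1)
perturbed_costs = [
    np.trace(joseph_update(P_prev, K_opt + 0.1 * rng.normal(size=K_opt.shape), C, R))
    for _ in range(100)
]
print("cost at computed gain:      ", cost_opt)
print("smallest cost after changes:", min(perturbed_costs))
```

Because equation (4-278) holds for any gain, it can be evaluated for the perturbed gains as well; none of them yields a smaller trace than the gain from equation (4-281).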
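To fulfil the sufficient condition for a minimum, the second derivative has to be greater than zero:

$$\frac{\partial^2 C_k}{\partial \mathbf{K}(k)^2} = \frac{\partial}{\partial \mathbf{K}(k)}\left[-2\,\mathbf{P}(k-1)\,\mathbf{C}^T(k) + 2\,\mathbf{K}(k)\,\mathbf{C}(k)\,\mathbf{P}(k-1)\,\mathbf{C}^T(k) + 2\,\mathbf{K}(k)\,\mathbf{R}(k)\right] = 2\,\mathbf{C}(k)\,\mathbf{P}(k-1)\,\mathbf{C}^T(k) + 2\,\mathbf{R}(k). \tag{4-282}$$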
The expression is positive semidefinite. This is due to the fact that every covariance matrix is positive semidefinite, and the same is true for every matrix $\mathbf{A}\mathbf{A}^T$. This shows that the computed extremum is a minimum. Equations (4-272), (4-278), and (4-281) describe our estimator for the constant parameter case. It is straightforward to go back to the case where $\mathbf{x}(k)$ is variable and may change with time. In this case, $\hat{\mathbf{x}}^+(k)$ represents the estimate and $\mathbf{P}^+(k)$ the error covariance matrix after the measurements $\mathbf{y}(k)$ are available. These correspond to $\hat{\mathbf{x}}(k)$ and $\mathbf{P}(k)$ in the constant parameter case. On the other hand, $\hat{\mathbf{x}}^-(k)$ and $\mathbf{P}^-(k)$ are used in the situation before the measurements $\mathbf{y}(k)$ are available; they can therefore be equated with $\hat{\mathbf{x}}(k-1)$ and $\mathbf{P}(k-1)$ of the constant parameter case. Therefore, in the three equations mentioned above, we have to substitute

$$\hat{\mathbf{x}}(k-1) \Rightarrow \hat{\mathbf{x}}^-(k); \quad \mathbf{P}(k-1) \Rightarrow \mathbf{P}^-(k) \quad \text{and} \quad \hat{\mathbf{x}}(k) \Rightarrow \hat{\mathbf{x}}^+(k); \quad \mathbf{P}(k) \Rightarrow \mathbf{P}^+(k) \tag{4-283}$$

in order to obtain the equations for the a posteriori estimation of the Kalman filter. In the next section, we will summarize the five important Kalman filter equations for a priori and a posteriori estimation.

4.3.3.4 Summary: Kalman Filter for Linear Discrete-Time Systems
Figure 4‐38: Block diagram of the linear discrete Kalman filter
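The linear discrete time Kalman filter

Initialization:
    $\hat{\mathbf{x}}^+(0) = E[\mathbf{x}(0)]$    (4-260)
    Set $\mathbf{P}^+(0)$ according to the reliability of the initialisation of $\hat{\mathbf{x}}^+(0)$.
    $k = 1$

Time step $k$:

Prediction (a priori estimation), using the input $\mathbf{u}(k-1)$:
    $\hat{\mathbf{x}}^-(k) = \mathbf{A}(k-1)\,\hat{\mathbf{x}}^+(k-1) + \mathbf{B}(k-1)\,\mathbf{u}(k-1)$    (4-263)
    $\mathbf{P}^-(k) = \mathbf{A}(k-1)\,\mathbf{P}^+(k-1)\,\mathbf{A}^T(k-1) + \mathbf{Q}(k-1)$    (4-268)

Correction (a posteriori estimation), using the measurement $\mathbf{y}(k)$:
    $\mathbf{K}(k) = \mathbf{P}^-(k)\,\mathbf{C}^T(k)\left(\mathbf{R}(k) + \mathbf{C}(k)\,\mathbf{P}^-(k)\,\mathbf{C}^T(k)\right)^{-1}$    (4-281)
    $\hat{\mathbf{x}}^+(k) = \hat{\mathbf{x}}^-(k) + \mathbf{K}(k)\left(\mathbf{y}(k) - \mathbf{C}(k)\,\hat{\mathbf{x}}^-(k)\right)$    (4-272)
    $\mathbf{P}^+(k) = \left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}^-(k)\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)^T + \mathbf{K}(k)\,\mathbf{R}(k)\,\mathbf{K}^T(k)$    (4-278)

Next time step: $k := k + 1$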
Figure 4‐39: Algorithm of the linear discrete time Kalman filter
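The algorithm of Figure 4-39 translates almost line by line into code. The following is a minimal sketch, not a reference implementation. It assumes NumPy and uses a hypothetical position/velocity model with sample time `T` (this is not the mechanical system of section 4.1.3.3); the function name `kalman_step`, the constant input and the choice $c = 1$ for the initial covariance are illustrative assumptions.

```python
import numpy as np


def kalman_step(x_post, P_post, u_prev, y, A, B, C, Q, R):
    """One cycle of the linear discrete-time Kalman filter (Figure 4-39).

    x_post, P_post: a posteriori estimate and covariance of step k-1
    u_prev:         input u(k-1)
    y:              measurement y(k)
    Returns the a priori and a posteriori quantities of step k.
    """
    # Prediction (a priori estimation), equations (4-263) and (4-268)
    x_prior = A @ x_post + B @ u_prev
    P_prior = A @ P_post @ A.T + Q

    # Correction (a posteriori estimation), equations (4-281), (4-272), (4-278)
    K = P_prior @ C.T @ np.linalg.inv(R + C @ P_prior @ C.T)
    x_new = x_prior + K @ (y - C @ x_prior)
    I_KC = np.eye(len(x_prior)) - K @ C
    P_new = I_KC @ P_prior @ I_KC.T + K @ R @ K.T
    return x_prior, P_prior, x_new, P_new


# Hypothetical example system: position and velocity, position is measured.
T = 0.1
A = np.array([[1.0, T], [0.0, 1.0]])
B = np.array([[0.5 * T**2], [T]])
C = np.array([[1.0, 0.0]])
Q = 0.001 * np.eye(2)
R = np.array([[0.04]])

rng = np.random.default_rng(42)
x_true = np.zeros((2, 1))
x_hat = np.zeros((2, 1))      # a posteriori estimate at k = 0
P = 1.0 * np.eye(2)           # P+(0) = c I with c = 1 (assumed)

for k in range(1, 201):
    u = np.array([[1.0]])     # constant input, chosen arbitrarily
    w = rng.multivariate_normal(np.zeros(2), Q).reshape(2, 1)
    x_true = A @ x_true + B @ u + w
    y = C @ x_true + rng.normal(0.0, np.sqrt(R[0, 0]), size=(1, 1))
    x_prior, P_prior, x_hat, P = kalman_step(x_hat, P, u, y, A, B, C, Q, R)

print("final estimation error:", (x_true - x_hat).ravel())
print("trace a priori:", np.trace(P_prior), "trace a posteriori:", np.trace(P))
```

In every cycle the trace of the a posteriori covariance is smaller than that of the a priori covariance of the same step, which reflects the sawtooth course of the error variance sketched in Figure 4-36.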
As we have seen, the linear discrete time Kalman filter is a predictor-corrector algorithm that minimizes the estimation error covariance. Figure 4-38 shows the block diagram of the filter. Figure 4-39 summarizes the algorithm as we have derived it in the previous sections. Again, the two-step approach becomes visible, with the a priori estimation, which is based on the system knowledge, and the a posteriori estimation, which adds the measurement information. When we look at the algorithm in Figure 4-39, we see that we need the covariance matrices of process and measurement noise, $\mathbf{Q}(k)$ and $\mathbf{R}(k)$. In real applications, it cannot always be guaranteed that these matrices are available. While the variances of the measurement noise might be available (e.g. in the data sheets of the employed sensors), the variances of the process noise in particular are often completely unknown. Moreover, we have to keep in mind that the process noise models the uncertainties that we have within our model as well as within the inputs. In many cases, it might not even be justified to treat the modelling error as an added
noise, or at least not as an added noise with normal distribution. All this can have a negative influence on the performance of the Kalman filter. It might also require the user to adapt the covariance matrices until the filtering result is satisfactory, which means that $\mathbf{Q}(k)$ and $\mathbf{R}(k)$ are used for empirical tuning of the filter. We will deepen the discussion of that issue using the example of the mechanical system that was established in section 4.1.3.3 (see Figure 4-14). In addition to the parts already introduced, we now assume that there is a process noise, denoted as $\mathbf{w}(k)$, and a measurement noise $v(k)$. This gives rise to the following state space representation:

$$\begin{aligned}
\mathbf{x}(k+1) &= \mathbf{A}_d\,\mathbf{x}(k) + \mathbf{b}_d\,u(k) + \mathbf{w}(k), \\
y(k) &= \mathbf{c}^T\mathbf{x}(k) + v(k), \\
\mathbf{w} &\sim \mathcal{N}(\mathbf{0}, \mathbf{Q}), \quad \mathbf{Q} = 0.001\,\mathbf{I}, \\
v &\sim \mathcal{N}(0, R), \quad R = 0.04,
\end{aligned} \tag{4-284}$$

with $\mathbf{A}_d$, $\mathbf{b}_d$, and $\mathbf{c}^T$ according to equations (4-59) and (4-60).
Figure 4‐40: States 1 (left) and 2 (right) of example systems in a numerical simulation with and without process noise
Figure 4-40 shows the results of a numerical simulation of the system. The dashed curves show the states including the process noise, and the solid ones the states without it. We have to keep in mind that the process noise represents inaccuracies in our model and the inputs; therefore, we are interested in estimating the blue curve. The stars represent the measurement values at the particular time, which are disturbed by the measurement noise. As state 2 cannot be measured directly, its stars are computed by subtracting the measurement of the last time step from the current one and dividing the result by the sample time $T$. This is the simplest way to obtain estimates for state 2 without employing a filter. However, due to the large measurement noise, this procedure leads to a strongly oscillating estimate which is of no use for any practical application.
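How strongly the differencing amplifies the measurement noise can be estimated with a short sketch. For white measurement noise with variance $R$, the noise part of the pseudo-measurement $(y(k) - y(k-1))/T$ has the variance $2R/T^2$. The code below checks this with simulated noise using only the Python standard library. The sample time `T = 0.1` is a hypothetical value, since the passage does not state it; `R` is taken from equation (4-284).

```python
import math
import random

R = 0.04      # measurement noise variance from equation (4-284)
T = 0.1       # sample time; hypothetical value, not stated in this passage

random.seed(0)
noise = [random.gauss(0.0, math.sqrt(R)) for _ in range(20000)]

# Noise contribution to the pseudo-measurement (y(k) - y(k-1)) / T
diff_noise = [(noise[k] - noise[k - 1]) / T for k in range(1, len(noise))]

mean = sum(diff_noise) / len(diff_noise)
empirical_var = sum((d - mean) ** 2 for d in diff_noise) / len(diff_noise)
theoretical_var = 2.0 * R / T**2

print(f"empirical variance:   {empirical_var:.2f}")
print(f"theoretical variance: {theoretical_var:.2f}")
```

With these assumed numbers the variance of the pseudo-measurement noise is two hundred times the variance of the original measurement noise, which explains the strongly oscillating stars for state 2.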
Figure 4‐41: Original States and Kalman filter estimation for optimal filter parameters
Figure 4‐42: Original States and Kalman filter estimation, too large values for matrix Q
Figure 4‐43: Original States and Kalman filter estimation, too large values for matrix R
Figure 4-41 shows again the simulation of the system states with process noise (blue), the measurements for state 1 and the pseudo-measurements for state 2 obtained as explained above (black), and the a posteriori estimates obtained by a discrete linear Kalman filter with the algorithm according to Figure 4-39. Additionally, the dotted red curves show the confidence interval given by the standard deviation of the estimation error, computed as $\hat{x}_i^+(k) \pm \sqrt{P_{i,i}^+(k)}$. The results are very precise; it was possible to reconstruct the influence of the process noise from the quite noisy measurements. As one can see, the real curve (blue) is mostly within the confidence interval, which means that the uncertainty of the estimate was also assessed with good accuracy. We have to keep in mind that for the computation we employed the true values of $\mathbf{Q}$ and $R$. As stated, they are often not known in real applications; we might therefore need to employ estimates, $\hat{\mathbf{Q}}$ and $\hat{R}$, in the algorithm according to Figure 4-39, and adjust their values until the Kalman filter performance meets our requirements.

In what follows, we will look at filter performances with non-optimal parameters. Figure 4-42 shows the results for $\hat{\mathbf{Q}} = 100\,\mathbf{Q}$ and $\hat{R} = R$. With these values, we 'tell' the algorithm that the process noise might be much larger than it actually is. The algorithm will therefore put more trust in the measurements and neglect the influence of the a priori estimations. As a result, as shown in the left figure, the filter estimates follow the measured values almost perfectly, and almost no system knowledge is employed. For the second state, the estimate again oscillates considerably and is drawn towards the 'pseudo-measurements'. Note the large confidence interval, based on large values of $P_{2,2}^+(k)$, that includes most of the noisy pseudo-measurements. Finally, in Figure 4-43, the other extreme is displayed: for $\hat{\mathbf{Q}} = \mathbf{Q}$ and $\hat{R} = 100\,R$, the algorithm does not trust the measurements, and the estimation is almost completely based on the system knowledge and the input values. As a consequence, the curves of the estimates are almost identical to the simulation without process noise (Figure 4-40, solid curves). As stated, in real applications it is common to adapt the covariance matrices until the result is satisfactory. This requires some experience in handling Kalman filters. Figure 4-44 shows another simulation of the system and the Kalman filter.
This time, the original covariance matrices were used, but the Kalman filter was wrongly initialized with $\hat{\mathbf{x}}^+(0) = [2 \;\; 2]^T$ instead of the true values $[0 \;\; 0]^T$. As one can see, the estimate approaches the true values quickly. It is a very beneficial property of the Kalman filter as a linear estimator that the estimate converges quickly towards the true value, even if there is a large error at some time instant. This is very helpful if no measurements are available for some time, or if the initial values are not known at all, as shown in the example.
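The effect of the tuning described above can be reproduced with a deliberately simple sketch. It does not use the mechanical system of the example but a hypothetical scalar system $x(k+1) = x(k) + w(k)$, $y(k) = x(k) + v(k)$, and it only looks at the Kalman gain after many cycles. The function name `steady_state_gain`, the initial variance and the number of iterations are assumptions; the noise variances are borrowed from equation (4-284) and used as scalars.

```python
def steady_state_gain(q, r, p0=1.0, steps=500):
    """Iterate the scalar Kalman recursions for x(k+1) = x(k) + w(k),
    y(k) = x(k) + v(k) and return the gain after `steps` cycles."""
    p_post = p0
    gain = 0.0
    for _ in range(steps):
        p_prior = p_post + q                       # prediction with a = 1
        gain = p_prior / (p_prior + r)             # Kalman gain with c = 1
        # scalar form of the covariance update in Figure 4-39
        p_post = (1.0 - gain) ** 2 * p_prior + gain ** 2 * r
    return gain


q_true, r_true = 0.001, 0.04   # values from equation (4-284), used as scalars

gain_nominal = steady_state_gain(q_true, r_true)
gain_large_q = steady_state_gain(100.0 * q_true, r_true)
gain_large_r = steady_state_gain(q_true, 100.0 * r_true)

print(f"nominal values: K = {gain_nominal:.3f}")
print(f"Q times 100:    K = {gain_large_q:.3f}")
print(f"R times 100:    K = {gain_large_r:.3f}")
```

A larger assumed process noise drives the gain towards one, so the estimate follows the measurements. A larger assumed measurement noise drives the gain towards zero, so the estimate follows the model.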
Figure 4‐44: Original States and Kalman filter estimation, wrong initialization; color scheme is the same as in the previous figures
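Before we conclude this section, it should be noted that the literature contains different forms of the Kalman filter algorithm, which are usually mathematically identical, but might exhibit typical advantages and disadvantages in real applications. As we will need two alternative forms in the coming section, we will derive them at this point. Namely, we strive to find an algorithm for the computation of $\mathbf{P}^+(k)$ without using $\mathbf{K}(k)$. To this end, we start by simplifying the equation for $\mathbf{K}(k)$ in Figure 4-39 by introducing an auxiliary variable $\mathbf{H}(k)$ to obtain:

$$\mathbf{K}(k) = \mathbf{P}^-(k)\,\mathbf{C}^T(k)\,\mathbf{H}^{-1}(k); \quad \text{with } \mathbf{H}(k) = \mathbf{R}(k) + \mathbf{C}(k)\,\mathbf{P}^-(k)\,\mathbf{C}^T(k). \tag{4-285}$$

By inserting this into the equation for $\mathbf{P}^+(k)$ in Figure 4-39, we obtain:

$$\begin{aligned}
\mathbf{P}^+(k) = {} & \left[\mathbf{I} - \mathbf{P}^-(k)\,\mathbf{C}^T(k)\,\mathbf{H}^{-1}(k)\,\mathbf{C}(k)\right]\mathbf{P}^-(k)\left[\mathbf{I} - \mathbf{P}^-(k)\,\mathbf{C}^T(k)\,\mathbf{H}^{-1}(k)\,\mathbf{C}(k)\right]^T \\
& + \mathbf{P}^-(k)\,\mathbf{C}^T(k)\,\mathbf{H}^{-1}(k)\,\mathbf{R}(k)\,\mathbf{H}^{-1}(k)\,\mathbf{C}(k)\,\mathbf{P}^-(k),
\end{aligned} \tag{4-286}$$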
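where we make use of the fact that covariance matrices as well as the matrix $\mathbf{H}(k)$ are symmetric and therefore equal to their transposes. By expanding, we obtain:

$$\begin{aligned}
\mathbf{P}^+(k) = {} & \mathbf{P}^-(k) - \mathbf{P}^-(k)\,\mathbf{C}^T(k)\,\mathbf{H}^{-1}(k)\,\mathbf{C}(k)\,\mathbf{P}^-(k) - \mathbf{P}^-(k)\,\mathbf{C}^T(k)\,\mathbf{H}^{-1}(k)\,\mathbf{C}(k)\,\mathbf{P}^-(k) \\
& + \mathbf{P}^-(k)\,\mathbf{C}^T(k)\,\mathbf{H}^{-1}(k)\,\mathbf{C}(k)\,\mathbf{P}^-(k)\,\mathbf{C}^T(k)\,\mathbf{H}^{-1}(k)\,\mathbf{C}(k)\,\mathbf{P}^-(k) \\
& + \mathbf{P}^-(k)\,\mathbf{C}^T(k)\,\mathbf{H}^{-1}(k)\,\mathbf{R}(k)\,\mathbf{H}^{-1}(k)\,\mathbf{C}(k)\,\mathbf{P}^-(k).
\end{aligned} \tag{4-287}$$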
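Note that in this sum, summands two and three are identical, and summands four and five can be combined to obtain:

$$\mathbf{P}^+(k) = \mathbf{P}^-(k) - 2\,\mathbf{P}^-(k)\,\mathbf{C}^T(k)\,\mathbf{H}^{-1}(k)\,\mathbf{C}(k)\,\mathbf{P}^-(k) + \mathbf{P}^-(k)\,\mathbf{C}^T(k)\,\mathbf{H}^{-1}(k)\left[\mathbf{C}(k)\,\mathbf{P}^-(k)\,\mathbf{C}^T(k) + \mathbf{R}(k)\right]\mathbf{H}^{-1}(k)\,\mathbf{C}(k)\,\mathbf{P}^-(k). \tag{4-288}$$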
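We can now substitute the square bracket by $\mathbf{H}(k)$, see equation (4-285), and $\mathbf{H}(k)$ cancels one of the $\mathbf{H}^{-1}(k)$. As a result, the third summand equals the (negative) half of the second summand, and we can write:

$$\mathbf{P}^+(k) = \mathbf{P}^-(k) - \mathbf{P}^-(k)\,\mathbf{C}^T(k)\,\mathbf{H}^{-1}(k)\,\mathbf{C}(k)\,\mathbf{P}^-(k). \tag{4-289}$$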
From this equation on, we split the discussion to obtain two new canonical formulations for the covariance matrix. For the first one, we can see from equation (4-285) that the expression for $\mathbf{K}(k)$ is contained in equation (4-289), and substituting it gives

$$\mathbf{P}^+(k) = \mathbf{P}^-(k) - \mathbf{K}(k)\,\mathbf{C}(k)\,\mathbf{P}^-(k) = \left(\mathbf{I} - \mathbf{K}(k)\,\mathbf{C}(k)\right)\mathbf{P}^-(k), \tag{4-290}$$
which is one of the two other canonical forms we were looking for. For the second one, we return to equation (4-289), eliminate the auxiliary variable by re-substitution, and invert both sides of the equation to obtain:

$$\left(\mathbf{P}^+(k)\right)^{-1} = \left[\mathbf{P}^-(k) - \mathbf{P}^-(k)\,\mathbf{C}^T(k)\left(\mathbf{C}(k)\,\mathbf{P}^-(k)\,\mathbf{C}^T(k) + \mathbf{R}(k)\right)^{-1}\mathbf{C}(k)\,\mathbf{P}^-(k)\right]^{-1}. \tag{4-291}$$
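We can apply the so-called matrix inversion lemma, also denoted as Woodbury matrix identity or Sherman–Morrison–Woodbury formula (see Simon, 2006, section 1.1.2 for a complete derivation). Let $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, and $\mathbf{D}$ be matrices of compatible dimensions; then the following relation holds:

$$\left(\mathbf{A} + \mathbf{B}\,\mathbf{C}\,\mathbf{D}\right)^{-1} = \mathbf{A}^{-1} - \mathbf{A}^{-1}\,\mathbf{B}\left(\mathbf{D}\,\mathbf{A}^{-1}\,\mathbf{B} + \mathbf{C}^{-1}\right)^{-1}\mathbf{D}\,\mathbf{A}^{-1}. \tag{4-292}$$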
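The matrix inversion lemma of equation (4-292) can be checked numerically for random matrices. The sketch below is such a check, not a proof. The dimensions, the seed and the choice $\mathbf{D} = \mathbf{B}^T$ with symmetric positive definite $\mathbf{A}$ and $\mathbf{C}$ are assumptions made here so that all required inverses exist; NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 4, 2


def random_spd(size):
    """Random symmetric positive definite matrix (hypothetical test data)."""
    M = rng.normal(size=(size, size))
    return M @ M.T + size * np.eye(size)


A = random_spd(n)
C = random_spd(m)
B = rng.normal(size=(n, m))
D = B.T   # with SPD A and C this choice keeps A + B C D invertible

inv = np.linalg.inv
left = inv(A + B @ C @ D)
right = inv(A) - inv(A) @ B @ inv(D @ inv(A) @ B + inv(C)) @ D @ inv(A)

print("both sides agree:", np.allclose(left, right))
```

The right-hand side only requires the inverse of an $m \times m$ matrix in addition to $\mathbf{A}^{-1}$ and $\mathbf{C}^{-1}$, which is the property exploited in the derivation.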
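We can use the matrix inversion lemma (4-292) to compute the right side of equation (4-291) to obtain:

$$\begin{aligned}
\left(\mathbf{P}^+(k)\right)^{-1} = {} & \left(\mathbf{P}^-(k)\right)^{-1} + \left(\mathbf{P}^-(k)\right)^{-1}\mathbf{P}^-(k)\,\mathbf{C}^T(k)\left[-\mathbf{C}(k)\,\mathbf{P}^-(k)\left(\mathbf{P}^-(k)\right)^{-1}\mathbf{P}^-(k)\,\mathbf{C}^T(k) + \mathbf{C}(k)\,\mathbf{P}^-(k)\,\mathbf{C}^T(k) + \mathbf{R}(k)\right]^{-1}\mathbf{C}(k)\,\mathbf{P}^-(k)\left(\mathbf{P}^-(k)\right)^{-1} \\
= {} & \left(\mathbf{P}^-(k)\right)^{-1} + \mathbf{C}^T(k)\,\mathbf{R}^{-1}(k)\,\mathbf{C}(k).
\end{aligned} \tag{4-293}$$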
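After inverting both sides again, we get our final result:

$$\mathbf{P}^+(k) = \left[\left(\mathbf{P}^-(k)\right)^{-1} + \mathbf{C}^T(k)\,\mathbf{R}^{-1}(k)\,\mathbf{C}(k)\right]^{-1}. \tag{4-294}$$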
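That the three forms of the a posteriori covariance update agree can be verified for a small numerical case. The sketch below evaluates equations (4-278) (with the a priori covariance inserted, as in Figure 4-39), (4-290) and (4-294) for the same data. The matrices `P_prior`, `C` and `R` are hypothetical values chosen for illustration, and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical a priori covariance, output matrix and measurement noise.
P_prior = np.array([[0.5, 0.1], [0.1, 0.3]])
C = np.array([[1.0, 0.0]])
R = np.array([[0.04]])
I = np.eye(2)
inv = np.linalg.inv

K = P_prior @ C.T @ inv(R + C @ P_prior @ C.T)                    # gain, (4-281)

P_joseph = (I - K @ C) @ P_prior @ (I - K @ C).T + K @ R @ K.T   # (4-278)
P_short = (I - K @ C) @ P_prior                                   # (4-290)
P_info = inv(inv(P_prior) + C.T @ inv(R) @ C)                     # (4-294)

print("(4-278) equals (4-290):", np.allclose(P_joseph, P_short))
print("(4-290) equals (4-294):", np.allclose(P_short, P_info))
```

Note that the last form needs the inverses of the a priori covariance and of the measurement noise covariance, so it can only be evaluated when both matrices are invertible.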
Equation (4-294) enables us to compute the a posteriori covariance matrix without computing the Kalman gain matrix first.

4.3.3.5 Kalman Filter for Continuous-Time Systems

The Kalman filter is also available for continuous systems as a set of differential equations. It was derived by Kalman and R.S. Bucy and is therefore also denoted as Kalman-Bucy filter. Even though we will mainly look at discrete time systems in the further course of this thesis, we discuss the continuous time Kalman filter at this point to discover an interesting relation to the discussions on observability in the corresponding section. The main difference is that we lose the predictor-corrector principle, which was based on the fact that measurements were only available at discrete time steps. In the continuous time Kalman filter, we assume that the output function $\mathbf{y}(t)$ is available at all times; therefore it is not necessary to distinguish between a priori and a posteriori estimates. Within this section, we will again clearly distinguish between the state and input matrices $\mathbf{A}_d$, $\mathbf{B}_d$ of a discrete time state space realisation, and $\mathbf{A}$, $\mathbf{B}$ for a continuous time one. We have seen that the output matrix $\mathbf{C}$ is the same in both cases. In what follows, we will show that the covariance matrices for discrete and continuous time representations are different, and we will show how they are related. To this end, we use the notation $\mathbf{Q}_d$, $\mathbf{R}_d$ for a discrete time system with sample time $T$. We will now introduce two simple autonomous discrete systems and use them to derive the relationships. The first system obeys the following description:
$$\mathbf{x}(k+1) = \mathbf{x}(k) + \mathbf{w}(k), \qquad \mathbf{w}(k) \sim \mathcal{N}(\mathbf{0}, \mathbf{Q}_d), \qquad \mathbf{x}(0) = \mathbf{0}. \tag{4‐295}$$
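It can be stated that the state of this system at time $kT$ equals the sum of $k$ single realisations of a white noise process. As the realisations are uncorrelated, all mixed expected values vanish. Therefore we can write:

$$\begin{aligned}
\mathbf{x}(k) &= \mathbf{w}(0) + \mathbf{w}(1) + \dots + \mathbf{w}(k-1), \\
\Rightarrow E\left[\mathbf{x}(k)\mathbf{x}^T(k)\right] &= E\left[\left(\mathbf{w}(0) + \mathbf{w}(1) + \dots + \mathbf{w}(k-1)\right)\left(\mathbf{w}(0) + \mathbf{w}(1) + \dots + \mathbf{w}(k-1)\right)^T\right] \\
&= E\left[\mathbf{w}(0)\mathbf{w}^T(0)\right] + E\left[\mathbf{w}(1)\mathbf{w}^T(1)\right] + \dots + E\left[\mathbf{w}(k-1)\mathbf{w}^T(k-1)\right] \\
&= k\,\mathbf{Q}_d.
\end{aligned} \tag{4‐296}$$

Now we will do the same computation for the continuous system

$$\dot{\mathbf{x}}(t) = \mathbf{w}(t), \qquad \mathbf{w}(t) \sim \mathcal{N}(\mathbf{0}, \mathbf{Q}), \qquad \mathbf{x}(0) = \mathbf{0}. \tag{4‐297}$$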
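It is straightforward to say that the continuous time process noise covariance can be computed to be

$$E\left[\mathbf{w}(t)\mathbf{w}^T(\tau)\right] = \mathbf{Q}\,\delta(t - \tau), \tag{4‐298}$$

where

$$\delta(t - \tau) = \begin{cases} \infty & \text{for } t = \tau \\ 0 & \text{otherwise} \end{cases}, \qquad \int_{-\infty}^{\infty} \delta(t - \tau)\,\mathrm{d}t = 1$$

is denoted as time‐delayed impulse response.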
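This is the continuous equivalent to the Kronecker delta employed in the discrete time representation according to equation (4‐254). The variance of the state can now be computed to be:

$$E\left[\mathbf{x}(t)\mathbf{x}^T(t)\right] = E\left[\int_0^t \mathbf{w}(a)\,\mathrm{d}a \int_0^t \mathbf{w}^T(b)\,\mathrm{d}b\right] = \int_0^t\!\!\int_0^t E\left[\mathbf{w}(a)\mathbf{w}^T(b)\right]\mathrm{d}a\,\mathrm{d}b. \tag{4‐299}$$

By inserting equation (4‐298), we obtain:

$$E\left[\mathbf{x}(t)\mathbf{x}^T(t)\right] = \int_0^t\!\!\int_0^t \mathbf{Q}\,\delta(a - b)\,\mathrm{d}a\,\mathrm{d}b. \tag{4‐300}$$

Due to the so‐called sifting property of the time‐delayed impulse response, it holds true that

$$\int_{-\infty}^{\infty} f(a)\,\delta(a - b)\,\mathrm{d}a = f(b). \tag{4‐301}$$

Therefore we can write:

$$E\left[\mathbf{x}(t)\mathbf{x}^T(t)\right] = \int_0^t \mathbf{Q}\,\mathrm{d}b = \mathbf{Q}\,t. \tag{4‐302}$$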
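Now if we assume that the system in equation (4‐295) is the discrete time representation of the system in equation (4‐297), recalling that $\mathbf{x}(t = kT) = \mathbf{x}(k)$, it becomes clear that the results of equations (4‐296) and (4‐302) have to be equal if we set $t = kT$. This gives

$$k\,\mathbf{Q}_d = \mathbf{Q}\,t\big|_{t = kT} \;\Rightarrow\; k\,\mathbf{Q}_d = \mathbf{Q}\,k\,T \;\Rightarrow\; \mathbf{Q} = \frac{\mathbf{Q}_d}{T}. \tag{4‐303}$$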
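The scaling between the discrete and the continuous process noise can be illustrated with a small Monte Carlo experiment. This is only a sketch for a scalar state; the function names, the noise intensity `q`, and the end time are assumptions made for illustration. If the discrete variance is chosen as $Q_d = Q\,T$, the state variance at a fixed time $t$ should be roughly $Q\,t$ for any sample time.

```python
import numpy as np

def discrete_process_noise(q, T):
    """Discrete-time noise variance Q_d = Q * T, cf. (4-303)."""
    return q * T

def simulated_state_variance(q, T, t_end, runs=20000, seed=0):
    """Monte Carlo variance of x(k+1) = x(k) + w(k) at time t_end = k*T."""
    rng = np.random.default_rng(seed)
    k = int(round(t_end / T))
    q_d = discrete_process_noise(q, T)
    w = rng.normal(0.0, np.sqrt(q_d), size=(runs, k))  # white noise samples
    x_end = w.sum(axis=1)                              # x(k) = w(0)+...+w(k-1)
    return x_end.var()

q, t_end = 0.5, 3.0            # hypothetical intensity and end time
for T in (0.5, 0.1):
    print(T, simulated_state_variance(q, T, t_end), "expected about", q * t_end)
```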
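In order to find a relation between $\mathbf{R}_d$ and $\mathbf{R}$, we look at the following discrete time system representation:

$$\begin{aligned}
\mathbf{x}(k+1) &= \mathbf{x}(k), \\
\mathbf{y}(k) &= \mathbf{x}(k) + \mathbf{v}(k), \qquad \mathbf{v}(k) \sim \mathcal{N}(\mathbf{0}, \mathbf{R}_d), \qquad \mathbf{x}(0) = \mathbf{0}.
\end{aligned} \tag{4‐304}$$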
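As there is no process noise in this system, it holds true that $\mathbf{P}^-(k) = \mathbf{P}^+(k-1)$. Further, as the output matrix $\mathbf{C} = \mathbf{I}$, we can adapt equation (4‐294) which gives:

$$\left(\mathbf{P}^+(k)\right)^{-1} = \left(\mathbf{P}^+(k-1)\right)^{-1} + \mathbf{R}_d^{-1} \;\Rightarrow\; \mathbf{P}^+(k) = \mathbf{P}^+(k-1)\left(\mathbf{P}^+(k-1) + \mathbf{R}_d\right)^{-1}\mathbf{R}_d. \tag{4‐305}$$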
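The transfer of the covariance matrix from $k = 0$ to $k = 1$ and $k = 2$ can therefore be written as:

$$\begin{aligned}
\mathbf{P}^+(1) &= \mathbf{P}^+(0)\left(\mathbf{P}^+(0) + \mathbf{R}_d\right)^{-1}\mathbf{R}_d, \\
\mathbf{P}^+(2) &= \mathbf{P}^+(1)\left(\mathbf{P}^+(1) + \mathbf{R}_d\right)^{-1}\mathbf{R}_d = \mathbf{P}^+(0)\left(2\,\mathbf{P}^+(0) + \mathbf{R}_d\right)^{-1}\mathbf{R}_d.
\end{aligned} \tag{4‐306}$$

Repeating this step $k$ times gives:

$$\mathbf{P}^+(k) = \mathbf{P}^+(0)\left(k\,\mathbf{P}^+(0) + \mathbf{R}_d\right)^{-1}\mathbf{R}_d. \tag{4‐307}$$

Considering again that $t = kT$, we can write:

$$\mathbf{P}^+(t) = \mathbf{P}^+(0)\left(\frac{t}{T}\,\mathbf{P}^+(0) + \mathbf{R}_d\right)^{-1}\mathbf{R}_d = \mathbf{P}^+(0)\left(t\,\mathbf{P}^+(0) + \mathbf{R}_d T\right)^{-1}\mathbf{R}_d T. \tag{4‐308}$$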
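The closed form can be compared against the step-by-step recursion. The sketch below assumes NumPy; the matrices `P0` and `R` are hypothetical. It additionally scales the discrete measurement noise inversely with the sample time (an assumption made here for the experiment) and evaluates both expressions at the same time $t$ for two different sample times.

```python
import numpy as np

def recursive_covariance(P0, R_d, k):
    """Apply P(k) = (P(k-1)^-1 + R_d^-1)^-1, cf. (4-305), k times."""
    P = P0.copy()
    R_inv = np.linalg.inv(R_d)
    for _ in range(k):
        P = np.linalg.inv(np.linalg.inv(P) + R_inv)
    return P

def closed_form_covariance(P0, R_d, k):
    """Closed form P(k) = P(0) (k P(0) + R_d)^-1 R_d, cf. (4-307)."""
    return P0 @ np.linalg.inv(k * P0 + R_d) @ R_d

P0 = np.diag([4.0, 1.0])                  # hypothetical initial covariance
R = np.array([[0.5, 0.1], [0.1, 0.3]])    # hypothetical continuous-time R
t = 2.0
for T in (0.1, 0.01):
    k = int(round(t / T))
    R_d = R / T                           # assumed scaling of the discrete noise
    print(T, np.allclose(recursive_covariance(P0, R_d, k),
                         closed_form_covariance(P0, R_d, k)))
```

With this scaling, the covariance at time $t$ comes out the same for both sample times, which is the property exploited in the text that follows.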
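Therefore, the error covariance matrix at time $t$ does not depend on the sample time if

$$\mathbf{R}_d = \frac{\mathbf{R}}{T} \;\Rightarrow\; \lim_{T \to 0} \mathbf{R}_d = \mathbf{R}\,\delta(t) \tag{4‐309}$$

holds. This expresses the relation between discrete time and continuous time measurement noise. We are now ready to derive the continuous Kalman filter. We consider a continuous time system with

$$\begin{aligned}
\dot{\mathbf{x}}(t) &= \mathbf{A}(t)\mathbf{x}(t) + \mathbf{B}(t)\mathbf{u}(t) + \mathbf{w}(t), \\
\mathbf{y}(t) &= \mathbf{C}(t)\mathbf{x}(t) + \mathbf{v}(t), \\
\mathbf{w}(t) &\sim \mathcal{N}(\mathbf{0}, \mathbf{Q}), \qquad \mathbf{v}(t) \sim \mathcal{N}(\mathbf{0}, \mathbf{R}).
\end{aligned} \tag{4‐310}$$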
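We need to compare this system with a discrete time representation with sample time $T$, as we have already derived the Kalman filter algorithms for this case. Then, we can try to let $T$ tend towards zero:

$$\begin{aligned}
\mathbf{x}(k+1) &= \mathbf{A}_d(k)\mathbf{x}(k) + \mathbf{B}_d(k)\mathbf{u}(k) + \mathbf{G}(k)\mathbf{w}(k), \\
\mathbf{y}(k) &= \mathbf{C}(k)\mathbf{x}(k) + \mathbf{v}(k), \\
\mathbf{w}(k) &\sim \mathcal{N}(\mathbf{0}, \mathbf{Q}_d = \mathbf{Q}T), \qquad \mathbf{v}(k) \sim \mathcal{N}(\mathbf{0}, \mathbf{R}_d = \mathbf{R}/T).
\end{aligned} \tag{4‐311}$$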
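Note that it is necessary to introduce a new parameter $\mathbf{G}(k)$ to use the same notation $\mathbf{w}$ in both descriptions. In the discrete case, $\mathbf{w}$ has the same units as $\mathbf{x}$, while in the continuous case, its units are the same as those of $\dot{\mathbf{x}}$. In the discretization process, we can treat $\mathbf{w}$ as an additional input with $\mathbf{B}(t) = \mathbf{I}$ to derive $\mathbf{G}(k)$. We have introduced the relations between the continuous and discrete system and input matrices in the equations (4‐52) and (4‐53). The computation involves the solution of the expression $e^{\mathbf{A}T}$, which we have defined in equation (4‐19) as a sum with an infinite number of summands. For $T \to 0$, it is justified to truncate the series after the second summand. With

$$e^{\mathbf{A}T} \approx \mathbf{I} + \mathbf{A}T \quad \text{for } T \to 0, \tag{4‐312}$$

we can state that for small $T$:

$$\begin{aligned}
\mathbf{A}_d(k) &= e^{\mathbf{A}(t)T} \approx \mathbf{I} + \mathbf{A}(t)T, \\
\mathbf{B}_d(k) &= \mathbf{A}^{-1}(t)\left(\mathbf{A}_d(k) - \mathbf{I}\right)\mathbf{B}(t) \approx \mathbf{A}^{-1}(t)\mathbf{A}(t)T\,\mathbf{B}(t) = \mathbf{B}(t)T, \\
\mathbf{G}(k) &= \mathbf{A}^{-1}(t)\left(\mathbf{A}_d(k) - \mathbf{I}\right)\mathbf{I} \approx \mathbf{I}T, \\
\mathbf{C}(k) &= \mathbf{C}(t).
\end{aligned} \tag{4‐313}$$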
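Now we can employ the discrete time Kalman equations, replace the discrete matrices by the continuous ones according to the last equation, and let $T$ tend towards 0. For the Kalman gain we have derived the equation (see Figure 4‐39):

$$\mathbf{K}(k) = \mathbf{P}^-(k)\mathbf{C}^T(k)\left[\mathbf{C}(k)\mathbf{P}^-(k)\mathbf{C}^T(k) + \mathbf{R}_d(k)\right]^{-1}, \tag{4‐314}$$

which gives:

$$\begin{aligned}
\mathbf{K}(k) &= \mathbf{P}^-(k)\mathbf{C}^T(k)\left[\mathbf{C}(k)\mathbf{P}^-(k)\mathbf{C}^T(k) + \frac{\mathbf{R}(t)}{T}\right]^{-1} \\
\Rightarrow \frac{\mathbf{K}(k)}{T} &= \mathbf{P}^-(k)\mathbf{C}^T(k)\left[\mathbf{C}(k)\mathbf{P}^-(k)\mathbf{C}^T(k)\,T + \mathbf{R}(t)\right]^{-1}.
\end{aligned} \tag{4‐315}$$

Recalling that in the continuous case there is no distinction between a priori and a posteriori estimation, we can write:

$$\mathbf{K}(t) = \lim_{T \to 0} \frac{\mathbf{K}(k)}{T} = \mathbf{P}(t)\mathbf{C}^T(t)\mathbf{R}^{-1}(t), \tag{4‐316}$$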
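which gives us the first relevant equation. Also, we can conclude that if $T$ tends towards zero, the same has to be true for $\mathbf{K}(k)$, as the quotient $\mathbf{K}(k)/T$ would tend towards infinity otherwise. For the error covariance matrix, we will start with the equation for the a priori case according to equation (4‐268), in which we substitute the matrices according to equation (4‐313) for small $T$ to obtain:

$$\begin{aligned}
\mathbf{P}^-(k+1) &= \left(\mathbf{I} + \mathbf{A}(t)T\right)\mathbf{P}^+(k)\left(\mathbf{I} + \mathbf{A}(t)T\right)^T + \mathbf{Q}(t)T \\
&= \mathbf{P}^+(k) + \left[\mathbf{A}(t)\mathbf{P}^+(k) + \mathbf{P}^+(k)\mathbf{A}^T(t) + \mathbf{Q}(t)\right]T + \mathbf{A}(t)\mathbf{P}^+(k)\mathbf{A}^T(t)\,T^2.
\end{aligned} \tag{4‐317}$$

We now substitute $\mathbf{P}^+(k) = \left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}^-(k)$ according to equation (4‐290) and write:

$$\begin{aligned}
\mathbf{P}^-(k+1) = {} & \left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}^-(k) \\
& + \left[\mathbf{A}(t)\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}^-(k) + \left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}^-(k)\mathbf{A}^T(t) + \mathbf{Q}(t)\right]T \\
& + \mathbf{A}(t)\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}^-(k)\mathbf{A}^T(t)\,T^2.
\end{aligned} \tag{4‐318}$$

We subtract $\mathbf{P}^-(k)$ and divide by $T$ to obtain:

$$\begin{aligned}
\frac{\mathbf{P}^-(k+1) - \mathbf{P}^-(k)}{T} = {} & -\frac{\mathbf{K}(k)}{T}\mathbf{C}(k)\mathbf{P}^-(k) + \mathbf{A}(t)\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}^-(k)\mathbf{A}^T(t)\,T \\
& + \left[\mathbf{A}(t)\left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}^-(k) + \left(\mathbf{I} - \mathbf{K}(k)\mathbf{C}(k)\right)\mathbf{P}^-(k)\mathbf{A}^T(t) + \mathbf{Q}(t)\right].
\end{aligned} \tag{4‐319}$$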
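Now let $T$ tend towards zero. The left side of the equation becomes $\dot{\mathbf{P}}(t)$. On the right side, in the first summand, we can replace the term $\mathbf{K}(k)/T$ according to equation (4‐316). The second summand equals zero as it is multiplied by $T$. In the bracket, all summands which contain $\mathbf{K}(k)$ become zero, as discussed above after equation (4‐316). This results in:

$$\dot{\mathbf{P}}(t) = \lim_{T \to 0} \frac{\mathbf{P}^-(k+1) - \mathbf{P}^-(k)}{T} = -\mathbf{P}(t)\mathbf{C}^T(t)\mathbf{R}^{-1}(t)\mathbf{C}(t)\mathbf{P}(t) + \mathbf{A}(t)\mathbf{P}(t) + \mathbf{P}(t)\mathbf{A}^T(t) + \mathbf{Q}(t), \tag{4‐320}$$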
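which gives a differential equation to compute the error covariance matrix. The a priori and a posteriori estimations of the discrete state vector have been shown in equation (4‐263) and Figure 4‐39. Substituting the former into the latter one and replacing the discrete matrices with the continuous ones according to equation (4‐313) for small $T$ gives:

$$\begin{aligned}
\hat{\mathbf{x}}^+(k) = {} & \mathbf{A}_d(k-1)\hat{\mathbf{x}}^+(k-1) + \mathbf{B}_d(k-1)\mathbf{u}(k-1) \\
& + \mathbf{K}(k)\left[\mathbf{y}(k) - \mathbf{C}(k)\mathbf{A}_d(k-1)\hat{\mathbf{x}}^+(k-1) - \mathbf{C}(k)\mathbf{B}_d(k-1)\mathbf{u}(k-1)\right] \\
= {} & \left(\mathbf{I} + \mathbf{A}(t)T\right)\hat{\mathbf{x}}^+(k-1) + \mathbf{B}(t)T\,\mathbf{u}(k-1) \\
& + \mathbf{K}(k)\left[\mathbf{y}(k) - \mathbf{C}(k)\left(\mathbf{I} + \mathbf{A}(t)T\right)\hat{\mathbf{x}}^+(k-1) - \mathbf{C}(k)\mathbf{B}(t)T\,\mathbf{u}(k-1)\right] \\
= {} & \hat{\mathbf{x}}^+(k-1) + \mathbf{A}(t)T\,\hat{\mathbf{x}}^+(k-1) + \mathbf{B}(t)T\,\mathbf{u}(k-1) \\
& + \mathbf{K}(k)\left[\mathbf{y}(k) - \mathbf{C}(k)\hat{\mathbf{x}}^+(k-1) - \mathbf{C}(k)\mathbf{A}(t)T\,\hat{\mathbf{x}}^+(k-1) - \mathbf{C}(k)\mathbf{B}(t)T\,\mathbf{u}(k-1)\right].
\end{aligned} \tag{4‐321}$$

Now we subtract $\hat{\mathbf{x}}^+(k-1)$ and divide by $T$ to obtain:

$$\begin{aligned}
\frac{\hat{\mathbf{x}}^+(k) - \hat{\mathbf{x}}^+(k-1)}{T} = {} & \mathbf{A}(t)\hat{\mathbf{x}}^+(k-1) + \mathbf{B}(t)\mathbf{u}(k-1) \\
& + \frac{\mathbf{K}(k)}{T}\left[\mathbf{y}(k) - \mathbf{C}(k)\hat{\mathbf{x}}^+(k-1) - \mathbf{C}(k)\mathbf{A}(t)T\,\hat{\mathbf{x}}^+(k-1) - \mathbf{C}(k)\mathbf{B}(t)T\,\mathbf{u}(k-1)\right].
\end{aligned} \tag{4‐322}$$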
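For $T$ tending towards 0, the left side becomes $\dot{\hat{\mathbf{x}}}(t)$, as there is no difference between a priori and a posteriori estimation for the continuous filter. On the right side, we can replace the quotient according to equation (4‐316), while the summands containing $T$ as a factor disappear:

$$\dot{\hat{\mathbf{x}}}(t) = \lim_{T \to 0} \frac{\hat{\mathbf{x}}^+(k) - \hat{\mathbf{x}}^+(k-1)}{T} = \mathbf{A}(t)\hat{\mathbf{x}}(t) + \mathbf{B}(t)\mathbf{u}(t) + \mathbf{K}(t)\left[\mathbf{y}(t) - \mathbf{C}(t)\hat{\mathbf{x}}(t)\right]. \tag{4‐323}$$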
The equations (4‐316), (4‐320), and (4‐323) constitute the continuous Kalman filter. Note that in order to compute the Kalman gain, the differential equation of the error covariance matrix must be solved. This equation is denoted as matrix Riccati equation, which is very laborious to solve, especially for time variant systems. This limits the practical usability of the continuous time Kalman filter. Figure 4‐45 shows the block diagram of the linear continuous Kalman filter for a time invariant system (all the matrices are constant). Interestingly enough, the structure is exactly the same as for the Luenberger observer developed in section 4.2.2.1, see Figure 4‐18. (For the Kalman filter, we did not use circumflexes on top of the state matrices in the filter part, as this is uncommon in literature, but it would be more precise because the matrices of our model can also only be estimated with some uncertainty.) So the difference between the Luenberger observer and the Kalman filter is only in the computation of the gain, which is multiplied with the difference between measured and simulated output before being added to $\dot{\hat{\mathbf{x}}}(t)$ as a correction. The theory behind the Luenberger observer enables the user to place the poles of the observer at desired locations and therefore to determine the dynamic behaviour of the observer as desired, assuming that all signals behave deterministically. The Kalman filter gives an algorithm to compute the gain in a way that results in a minimum variance of the estimation error, assuming that the deterministic system is disturbed by two normally distributed zero‐mean stochastic processes with known covariance matrices.
Figure 4‐45: Block diagram of the linear continuous Kalman filter
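The three filter equations can be integrated numerically with a simple explicit Euler scheme. The following is a minimal sketch, not a production solver: the scalar system, the step size, and the noise-free measurement are assumptions chosen so that the steady-state covariance can be compared with the analytic solution of the scalar Riccati equation.

```python
import numpy as np

def kalman_bucy_step(x_hat, P, u, y, A, B, C, Q, R, dt):
    """One explicit Euler step of the continuous Kalman filter equations."""
    K = P @ C.T @ np.linalg.inv(R)                     # gain, cf. (4-316)
    P_dot = -K @ C @ P + A @ P + P @ A.T + Q           # Riccati ODE, cf. (4-320)
    x_dot = A @ x_hat + B @ u + K @ (y - C @ x_hat)    # estimator, cf. (4-323)
    return x_hat + dt * x_dot, P + dt * P_dot, K

# Hypothetical scalar, time-invariant system (all values are assumptions).
A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
dt, steps = 1e-3, 20000
u = np.array([0.5])

x = np.array([2.0])                  # true state, simulated without noise
x_hat, P = np.array([0.0]), np.array([[1.0]])
for _ in range(steps):
    y = C @ x                        # noise-free measurement for illustration
    x_hat, P, K = kalman_bucy_step(x_hat, P, u, y, A, B, C, Q, R, dt)
    x = x + dt * (A @ x + B @ u)     # Euler step of the true system

P_steady = P[0, 0]
print(P_steady, x_hat[0] - x[0])
```

For this scalar case, setting $\dot{P} = 0$ gives $P = r\left(a + \sqrt{a^2 + q/r}\right)$, which the integration should approach. For time variant or higher-order systems, a dedicated ODE or Riccati solver would normally be preferred over a fixed-step Euler scheme.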
Concluding the discussions on linear Kalman filters, in what follows we will discuss possibilities to deal with nonlinear systems.

4.3.3.6 Extended Kalman Filter for Nonlinear Systems

In reality, many systems behave in a nonlinear manner. If we assume the scenarios defined within section 3.3, we can see that range measurements between underwater objects and reference objects can be modelled using the Pythagorean theorem based on the position data, which clearly results in a nonlinear output equation. Also, the state equation can be nonlinear. For a moving object, if we intend to use surge speed and heading as states, the course of the horizontal coordinates will be in a trigonometric relation with these quantities. In general, we will look at nonlinear systems and assume that they are influenced by zero‐mean Gaussian white process and measurement noises. Following the discussions on continuous nonlinear systems, see equation (4‐101), we can define the following discrete time system as a base for further discussion:

$$\begin{aligned}
\mathbf{x}(k+1) &= \mathbf{f}\left(\mathbf{x}(k), \mathbf{u}(k), \mathbf{w}(k), k\right), \\
\mathbf{y}(k) &= \mathbf{h}\left(\mathbf{x}(k), \mathbf{v}(k), k\right), \\
\mathbf{w}(k) &\sim \mathcal{N}\left(\mathbf{0}, \mathbf{Q}(k)\right), \qquad \mathbf{v}(k) \sim \mathcal{N}\left(\mathbf{0}, \mathbf{R}(k)\right),
\end{aligned} \tag{4‐324}$$
where $\mathbf{f}(\cdot)$ and $\mathbf{h}(\cdot)$ denote arbitrary nonlinear functions, assumed to be differentiable within the domain of the state vector. The Kalman filter presented so far can no longer be used, as it is not possible to describe a nonlinear system in the form of equation (4‐253). However, it is a common approach in control theory to linearize nonlinear system models in order to employ the manifold mathematical tools for linear systems. A standard approach is the employment of the Taylor series. A Taylor series represents a function $f(x)$ as an infinite sum of terms based on the values of the derivatives at a certain point, denoted as $x_0$ (operating point). The necessary condition for the existence of such a Taylor series representation is that $f(x)$ must be infinitely differentiable around $x_0$. According to Papula, 2001, it can be stated that

$$f(x) = f(x_0) + \left.\frac{\mathrm{d}f}{\mathrm{d}x}\right|_{x_0}\frac{(x - x_0)}{1!} + \left.\frac{\mathrm{d}^2 f}{\mathrm{d}x^2}\right|_{x_0}\frac{(x - x_0)^2}{2!} + \dots + \left.\frac{\mathrm{d}^n f}{\mathrm{d}x^n}\right|_{x_0}\frac{(x - x_0)^n}{n!} + \dots \tag{4‐325}$$
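A short numerical experiment shows how the quality of the first-order approximation depends on the distance from the operating point. The function (natural logarithm), the operating point, and the evaluation points are chosen purely for illustration.

```python
import math

def linearize(f, df, x0):
    """Return the first-order Taylor approximation of f around x0."""
    f0, slope = f(x0), df(x0)
    return lambda x: f0 + slope * (x - x0)

def f(x):
    return math.log(x)       # example nonlinear function (an assumption)

def df(x):
    return 1.0 / x           # its analytic derivative

f_lin = linearize(f, df, x0=2.0)

error_near = abs(f(2.1) - f_lin(2.1))   # close to the operating point
error_far = abs(f(6.0) - f_lin(6.0))    # far away from the operating point
print(error_near, error_far)
```

The approximation error close to the operating point is orders of magnitude smaller than the error far away from it.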
It is obvious that as long as $x$ remains close to $x_0$, it might be justified to truncate the series after the linear term. That way, we can find a linear approximation to any nonlinear function which meets the requirement stated above. It is important to notice that the usage of the linearized function is only justified in the close vicinity of the operating point $x_0$. That means, on the other hand, that a linearization can only be performed if such a point exists. In control theory, the task often arises to design a controller that keeps the output of a plant at some reference value despite all interfering disturbances. In this scenario, if the plant behaves in a nonlinear manner, it is common to use a linearized model of the plant around the reference value. Under the assumption that a good controller can be designed based on the linearized model and that the controller will keep the output value around the reference value, the employment of the linearized model can be considered justified. However, as soon as there is a significant discrepancy between real output and reference output (possibly due to a large step‐like disturbance), the controller designed for the linear model might not be able to bring the real nonlinear system’s output back to the reference value, and an unstable situation might occur. This is an important issue that we have to keep in mind. When we intend to use the method of linearization to employ the Kalman filter algorithms for a nonlinear system, we need an operating point to perform the linearization around. However, in many real world systems, the states will not remain around fixed points. For instance, a moving marine robot might exhibit a fixed surge speed, but its position will constantly change. Therefore, it is not possible to simply perform a single linearization of a nonlinear system; it is necessary to redo the linearization in every time step. However, the linearization must be developed around the current state vector, which we do not know. We only have our current estimation. The principle of the so‐called Extended Kalman Filter (EKF) is to use the current estimate as operating point to linearize the system model around; then the linearized system model is used to perform the next estimation according to the algorithms of the linear Kalman filter. Coming back to the discussions in the last paragraph, we see that this approach incorporates some dangers: If our current estimate is too far away from the true state, then the algorithm will not be able to bring the estimate back to the true state, even if ‘good’ measurement data with only little noise are available. This is due to the fact that the linearized model which was built around a ‘wrong’ operating point exhibits large inaccuracies in the area of the true states. As we have seen above, the truncation of the Taylor series is only justified in the vicinity of the operating point. Therefore, whenever there is a certain estimation error at some time in an EKF, the estimates might not go back to the true values, even if measurement and process noise were ‘switched off’. This is denoted as divergence and must be avoided at all costs in real applications. We will look at the discrete time EKF; as stated before, the discrete time domain is of biggest interest for the suggested scenarios. If we start with a system according to equation (4‐324), it is straightforward to employ the same initialization as for the linear Kalman filter according to equations (4‐260) and (4‐261). For the a priori estimation, it is straightforward to use the original nonlinear state difference equation.
For the function arguments, we use the most current available estimate of $\mathbf{x}(k)$, which is $\hat{\mathbf{x}}^+(k)$, and for $\mathbf{w}(k)$ we insert the expected value, which is $\mathbf{0}$ per definition:

$$\hat{\mathbf{x}}^-(k+1) = \mathbf{f}\left(\hat{\mathbf{x}}^+(k), \mathbf{u}(k), \mathbf{0}, k\right). \tag{4‐326}$$
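But how should we compute the a priori estimation error covariance $\mathbf{P}^-(k)$? We cannot use equation (4‐268), because there is no linear system matrix $\mathbf{A}_d$ this time. The idea is to linearize $\mathbf{f}(\cdot)$ around the best currently available estimation, $\hat{\mathbf{x}}^+(k)$. If we take the state difference equation of (4‐324) and apply the Taylor series according to (4‐325), truncating after the linear summand, we obtain:

$$\begin{aligned}
\mathbf{x}(k+1) &= \mathbf{f}\left(\mathbf{x}(k), \mathbf{u}(k), \mathbf{w}(k), k\right) \\
&\approx \mathbf{f}\left(\hat{\mathbf{x}}^+(k), \mathbf{u}(k), \mathbf{0}, k\right) + \left.\frac{\partial \mathbf{f}}{\partial \mathbf{x}}\right|_{\mathbf{x} = \hat{\mathbf{x}}^+(k)} \cdot \left(\mathbf{x}(k) - \hat{\mathbf{x}}^+(k)\right) + \left.\frac{\partial \mathbf{f}}{\partial \mathbf{w}}\right|_{\mathbf{x} = \hat{\mathbf{x}}^+(k)} \cdot \mathbf{w}(k) \\
&= \mathbf{f}\left(\hat{\mathbf{x}}^+(k), \mathbf{u}(k), \mathbf{0}, k\right) + \mathbf{F}(k)\left(\mathbf{x}(k) - \hat{\mathbf{x}}^+(k)\right) + \mathbf{G}(k)\mathbf{w}(k),
\end{aligned} \tag{4‐327}$$

where

$$\mathbf{F}(k) = \left.\frac{\partial \mathbf{f}}{\partial \mathbf{x}}\right|_{\mathbf{x} = \hat{\mathbf{x}}^+(k)} = \left.\begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial f_n}{\partial x_1} & \dfrac{\partial f_n}{\partial x_2} & \cdots & \dfrac{\partial f_n}{\partial x_n} \end{bmatrix}\right|_{\mathbf{x} = \hat{\mathbf{x}}^+(k)}, \qquad \mathbf{G}(k) = \left.\frac{\partial \mathbf{f}}{\partial \mathbf{w}}\right|_{\mathbf{x} = \hat{\mathbf{x}}^+(k)}$$

are also denoted as Jacobian matrices. With this we can continue:

$$\begin{aligned}
\mathbf{x}(k+1) &= \mathbf{f}\left(\hat{\mathbf{x}}^+(k), \mathbf{u}(k), \mathbf{0}, k\right) + \mathbf{F}(k)\left(\mathbf{x}(k) - \hat{\mathbf{x}}^+(k)\right) + \mathbf{G}(k)\mathbf{w}(k) \\
&= \mathbf{F}(k)\mathbf{x}(k) + \left[\mathbf{f}\left(\hat{\mathbf{x}}^+(k), \mathbf{u}(k), \mathbf{0}, k\right) - \mathbf{F}(k)\hat{\mathbf{x}}^+(k)\right] + \mathbf{G}(k)\mathbf{w}(k) \\
&= \mathbf{F}(k)\mathbf{x}(k) + \tilde{\mathbf{u}}(k) + \tilde{\mathbf{w}}(k),
\end{aligned} \tag{4‐328}$$

with

$$\tilde{\mathbf{u}}(k) = \mathbf{f}\left(\hat{\mathbf{x}}^+(k), \mathbf{u}(k), \mathbf{0}, k\right) - \mathbf{F}(k)\hat{\mathbf{x}}^+(k), \qquad \tilde{\mathbf{w}}(k) = \mathbf{G}(k)\mathbf{w}(k) \sim \mathcal{N}\left(\mathbf{0}, \mathbf{G}(k)\mathbf{Q}(k)\mathbf{G}^T(k)\right).$$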
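In the last step, we have collected all terms that depend neither on $\mathbf{x}(k)$ nor on $\mathbf{w}(k)$ in the square bracket and substituted them as ‘pseudo‐inputs’ $\tilde{\mathbf{u}}(k)$. Also, we have introduced a new process noise $\tilde{\mathbf{w}}(k) = \mathbf{G}(k)\mathbf{w}(k)$. Note that its covariance matrix can be computed to be

$$\tilde{\mathbf{Q}}(k) = E\left[\mathbf{G}(k)\mathbf{w}(k)\mathbf{w}^T(k)\mathbf{G}^T(k)\right] = \mathbf{G}(k)\mathbf{Q}(k)\mathbf{G}^T(k). \tag{4‐329}$$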
The system in equation (4‐328) is a linear system according to equation (4‐253), with $\mathbf{A}_d(k) = \mathbf{F}(k)$, $\mathbf{B}_d(k) = \mathbf{I}$. Hence, it is straightforward to employ equation (4‐268) or any of the two other forms in (4‐290) or (4‐294) to compute the a priori estimation error covariance matrix:

$$\mathbf{P}^-(k) = \mathbf{F}(k-1)\mathbf{P}^+(k-1)\mathbf{F}^T(k-1) + \mathbf{G}(k-1)\mathbf{Q}(k-1)\mathbf{G}^T(k-1). \tag{4‐330}$$
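We can use the same procedure to derive the algorithm for the a posteriori estimation. Linearizing the output equation around the best available estimation, $\hat{\mathbf{x}}^-(k)$, we get:

$$\begin{aligned}
\mathbf{y}(k) &= \mathbf{h}\left(\mathbf{x}(k), \mathbf{v}(k), k\right) \\
&\approx \mathbf{h}\left(\hat{\mathbf{x}}^-(k), \mathbf{0}, k\right) + \left.\frac{\partial \mathbf{h}}{\partial \mathbf{x}}\right|_{\mathbf{x} = \hat{\mathbf{x}}^-(k)} \cdot \left(\mathbf{x}(k) - \hat{\mathbf{x}}^-(k)\right) + \left.\frac{\partial \mathbf{h}}{\partial \mathbf{v}}\right|_{\mathbf{x} = \hat{\mathbf{x}}^-(k)} \cdot \mathbf{v}(k) \\
&= \mathbf{h}\left(\hat{\mathbf{x}}^-(k), \mathbf{0}, k\right) + \mathbf{H}(k)\left(\mathbf{x}(k) - \hat{\mathbf{x}}^-(k)\right) + \mathbf{M}(k)\mathbf{v}(k) \\
&= \mathbf{H}(k)\mathbf{x}(k) + \left[\mathbf{h}\left(\hat{\mathbf{x}}^-(k), \mathbf{0}, k\right) - \mathbf{H}(k)\hat{\mathbf{x}}^-(k)\right] + \mathbf{M}(k)\mathbf{v}(k) \\
&= \mathbf{H}(k)\mathbf{x}(k) + \mathbf{z}(k) + \tilde{\mathbf{v}}(k),
\end{aligned} \tag{4‐331}$$

with

$$\begin{aligned}
\mathbf{H}(k) &= \left.\frac{\partial \mathbf{h}}{\partial \mathbf{x}}\right|_{\mathbf{x} = \hat{\mathbf{x}}^-(k)}, \qquad \mathbf{M}(k) = \left.\frac{\partial \mathbf{h}}{\partial \mathbf{v}}\right|_{\mathbf{x} = \hat{\mathbf{x}}^-(k)}, \\
\mathbf{z}(k) &= \mathbf{h}\left(\hat{\mathbf{x}}^-(k), \mathbf{0}, k\right) - \mathbf{H}(k)\hat{\mathbf{x}}^-(k), \qquad \tilde{\mathbf{v}}(k) = \mathbf{M}(k)\mathbf{v}(k) \sim \mathcal{N}\left(\mathbf{0}, \mathbf{M}(k)\mathbf{R}(k)\mathbf{M}^T(k)\right).
\end{aligned}$$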
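This is again a linear output equation, where $\mathbf{z}(k)$ can be considered as a kind of feedthrough from the input. As the equation is linear, we can directly employ the a posteriori Kalman filter algorithms according to Figure 4‐39 to obtain:

Kalman gain:

$$\mathbf{K}(k) = \mathbf{P}^-(k)\mathbf{H}^T(k)\left[\mathbf{H}(k)\mathbf{P}^-(k)\mathbf{H}^T(k) + \mathbf{M}(k)\mathbf{R}(k)\mathbf{M}^T(k)\right]^{-1}. \tag{4‐332}$$

A posteriori estimation:

$$\begin{aligned}
\hat{\mathbf{x}}^+(k) &= \hat{\mathbf{x}}^-(k) + \mathbf{K}(k)\left[\mathbf{y}(k) - \mathbf{H}(k)\hat{\mathbf{x}}^-(k) - \mathbf{z}(k)\right] \\
&= \hat{\mathbf{x}}^-(k) + \mathbf{K}(k)\left[\mathbf{y}(k) - \mathbf{h}\left(\hat{\mathbf{x}}^-(k), \mathbf{0}, k\right)\right].
\end{aligned} \tag{4‐333}$$

A posteriori estimation error covariance matrix:

$$\mathbf{P}^+(k) = \left(\mathbf{I} - \mathbf{K}(k)\mathbf{H}(k)\right)\mathbf{P}^-(k)\left(\mathbf{I} - \mathbf{K}(k)\mathbf{H}(k)\right)^T + \mathbf{K}(k)\mathbf{M}(k)\mathbf{R}(k)\mathbf{M}^T(k)\mathbf{K}^T(k). \tag{4‐334}$$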
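The prediction and correction steps of the EKF can be sketched as two small functions. This is a minimal sketch under stated assumptions, not the implementation used for the results in this section: the function names, the argument order, and the demonstration model (a 2‐D position observed through its range to the origin) are hypothetical and chosen because the Jacobians are easy to verify by hand.

```python
import numpy as np

def ekf_predict(x_post, P_post, u, k, f, jac_F, jac_G, Q):
    """A priori estimate and covariance, cf. (4-326) and (4-330)."""
    x_prior = f(x_post, u, np.zeros(Q.shape[0]), k)
    F, G = jac_F(x_post, u, k), jac_G(x_post, u, k)
    P_prior = F @ P_post @ F.T + G @ Q @ G.T
    return x_prior, P_prior

def ekf_update(x_prior, P_prior, y, k, h, jac_H, jac_M, R):
    """A posteriori estimate and covariance, cf. (4-332) to (4-334)."""
    H, M = jac_H(x_prior, k), jac_M(x_prior, k)
    R_eff = M @ R @ M.T
    K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R_eff)
    x_post = x_prior + K @ (y - h(x_prior, np.zeros(R.shape[0]), k))
    I_KH = np.eye(len(x_prior)) - K @ H
    P_post = I_KH @ P_prior @ I_KH.T + K @ R_eff @ K.T
    return x_post, P_post, K

# Hypothetical model, NOT the example system of this section: 2-D position
# driven by a displacement input, observed through its range to the origin.
def f(x, u, w, k):
    return x + u + w

def h(x, v, k):
    return np.array([np.hypot(x[0], x[1])]) + v

def jac_F(x, u, k):
    return np.eye(2)

def jac_G(x, u, k):
    return np.eye(2)

def jac_H(x, k):
    return np.array([[x[0], x[1]]]) / np.hypot(x[0], x[1])

def jac_M(x, k):
    return np.eye(1)

Q, R = 0.01 * np.eye(2), np.array([[0.25]])
x_prior, P_prior = ekf_predict(np.array([2.5, 4.0]), 0.99 * np.eye(2),
                               np.array([0.5, 0.0]), 0, f, jac_F, jac_G, Q)
x_post, P_post, K = ekf_update(x_prior, P_prior, np.array([6.0]), 1,
                               h, jac_H, jac_M, R)
print(x_prior, x_post)
```

The Jacobians are re-evaluated at the current estimate in every call, which is exactly the repeated linearization that distinguishes the EKF from the linear filter.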
We will look at an example to demonstrate the operation of the EKF. We assume the following nonlinear system: 𝐱 𝑘
1
𝑥 𝑘
𝐟 𝐱 𝑘 ,𝐮 𝑘 ,𝐰
𝑥 𝑘
ln 𝑥 𝑘 𝑢 𝑘 𝑦 𝑘
ℎ 𝐱 𝑘 ,𝑣
𝐰~𝒩 𝟎, 𝐐 , 𝐐 𝑣~𝒩 𝟎, 𝑅 , 𝑅
𝑥 𝑘 𝑥 𝑘
𝑤
,
𝑤 (4‐335)
𝑣 ,
diag 0.01,0.01 , 0.25 ,
where the operator diag(∙), applied to a row vector, returns a diagonal matrix with the elements of the vector in the main diagonal. We initialize the system at $\mathbf{x}(0) = [1 \;\; 1]^T$ and the filter at $\hat{\mathbf{x}}^+(0) = [5 \;\; 5]^T$, $\mathbf{P}^+(0) = \mathbf{I}$, and assume a sampling time $T = 1$ s. The input $u(k)$ is constant at 1. In order to employ the presented algorithm, we need to compute the Jacobian matrices $\mathbf{F}$, $\mathbf{G}$, $\mathbf{H}$, and $\mathbf{M}$ according to equations (4‐327) and (4‐331). As the process and measurement noise enter into the system and output equations in a linear manner, the relevant matrices become identity matrices: $\mathbf{G} = \mathbf{I}$; $M = 1$. The remaining ones can be computed to be:
𝐅 𝑘
𝐇 𝑘
𝑥 𝑘 ⎡𝜕 𝑥 𝑘 ⎢ 𝜕𝑥 ⎢ ⎢ 𝜕 ln 𝑥 𝑘 𝑢 𝑘 ⎣ 𝜕𝑥 𝑥 ⎡ 𝑥 𝑥 ⎢ ⎢ 2 ⎢ ⎣ 𝑥 𝜕 𝑥 𝑘 𝑥 𝑘 𝜕𝑥
𝑥 𝑘 ⎤ 𝜕 𝑥 𝑘 ⎥ 𝜕𝑥 ⎥ 𝜕 ln 𝑥 𝑘 𝑢 𝑘 ⎥ 𝜕𝑥 ⎦𝐱 𝑥 ⎤ 𝑥 𝑥 ⎥ , ⎥ ⎥ 0 ⎦𝐱 𝐱 𝜕 𝑥 𝑘 𝑥 𝑘 𝜕𝑥
𝐱
(4‐336)
𝑥
𝑥
𝐱 𝐱
𝐱 𝐱
With these results, the prediction can be performed with equations (4‐326) and (4‐330), while the corrections can be computed employing equations (4‐332)‐(4‐334).
Figure 4‐46: Original States and Extended Kalman filter estimation for the nonlinear system; color scheme is the same as in Figure 4‐41 to Figure 4‐43
Figure 4‐46 shows on the left side the original and the estimated state $x_1$ for the first 20 steps of a simulation. As before, the estimation error covariance is displayed by the dotted red curves representing a confidence interval of one standard deviation of the estimation error at $\hat{x}_1^+(k) \pm \sqrt{P_{1,1}^+(k)}$. On the right, state $x_2$ is displayed at a later stage. For a later comparison, the Root Mean Square Errors (RMSE) have been computed according to the equation

$$RMSE_i = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(x_i(k) - \hat{x}_i^+(k)\right)^2}. \tag{4‐337}$$

For $N = 1000$ steps, the results were $RMSE_1 = 1.4$ and $RMSE_2 = 0.21$.
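The RMSE of a simulation run can be computed in one line once the true and estimated state histories are stored as arrays. The following sketch assumes that both histories have shape `(N, n)` with one row per time step; the helper name `rmse` and the sample numbers are hypothetical.

```python
import numpy as np

def rmse(x_true, x_est):
    """Root mean square error per state component over all time steps."""
    x_true = np.asarray(x_true, dtype=float)
    x_est = np.asarray(x_est, dtype=float)
    return np.sqrt(np.mean((x_true - x_est) ** 2, axis=0))

# Hypothetical histories of a two-dimensional state over three steps.
x_true = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
x_est = [[1.0, 0.1], [2.0, -0.1], [5.0, 0.1]]
print(rmse(x_true, x_est))
```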
4.3.3.7 Unscented Kalman Filter (UKF)

As discussed, the EKF is based on a linearization of a nonlinear system by means of a Taylor series expansion which is truncated after the linear term. This linearization can cause severe inaccuracies up to divergence, in which the estimations drift away from the true values. Specifically, as discussed in Wan and van der Merwe, 2000, it can be stated that the a posteriori state mean (which was chosen as estimation, see equation (4‐256)) and covariance updates are only accurate up to the first order of the Taylor series expansion. To find a better working solution, it is possible to employ the so‐called Unscented Transformation, as introduced by Uhlmann, 1995 as well as Julier et al., 1995. Assume that

$$\mathbf{y} = \mathbf{h}(\mathbf{x}) \tag{4‐338}$$
is a nonlinear function of the stochastic variable $\mathbf{x}$ which is defined by its expected value $E[\mathbf{x}]$ and covariance $\mathbf{P}_x$. We are now interested in computing expected value and covariance of the output variable $\mathbf{y}$, denoted as $E[\mathbf{y}]$ and $\mathbf{P}_y$. As discussed, when performing a linearization, the accuracy of the computation will be limited to the first order of the Taylor series. We can improve the quality by employing the unscented transformation, which gives us accuracy to the third order for normally distributed variables, and at least to the second order for others. To be more precise, the principle of the unscented transformation involves the usage of a minimal set of carefully chosen sample points, which are also denoted as sigma points. These points can be seen as a deterministic representation of the underlying PDF of the stochastic variable. They are then propagated through the nonlinear function, and the solution can be found by computing the mean and covariance of the points after propagation. The $2n$ sigma points (according to Simon, 2006) are selected according to:

$$\begin{aligned}
\mathbf{x}^{(i)} &= E[\mathbf{x}] + \tilde{\mathbf{x}}^{(i)}, & i &= 1, \dots, 2n, \\
\tilde{\mathbf{x}}^{(i)} &= \left(\sqrt{n\,\mathbf{P}_x}\right)_i^T, & i &= 1, \dots, n, \\
\tilde{\mathbf{x}}^{(n+i)} &= -\left(\sqrt{n\,\mathbf{P}_x}\right)_i^T, & i &= 1, \dots, n,
\end{aligned} \tag{4‐339}$$

where the square root $\sqrt{\mathbf{A}}$ of a matrix $\mathbf{A}$ fulfils the equation $\left(\sqrt{\mathbf{A}}\right)^T\sqrt{\mathbf{A}} = \mathbf{A}$ and can be computed employing the so‐called Cholesky decomposition, and $(\mathbf{A})_i$ denotes the $i$th row of $\mathbf{A}$. The sigma points are propagated through the nonlinear function, which gives:

$$\mathbf{y}^{(i)} = \mathbf{h}\left(\mathbf{x}^{(i)}\right), \qquad i = 1, \dots, 2n. \tag{4‐340}$$

The required values can be approximated by the mean and covariance of the output values of the sigma points:

$$\begin{aligned}
E[\mathbf{y}] &\approx \bar{\mathbf{y}} = \frac{1}{2n}\sum_{i=1}^{2n}\mathbf{y}^{(i)}, \\
\mathbf{P}_y &\approx \frac{1}{2n}\sum_{i=1}^{2n}\left(\mathbf{y}^{(i)} - \bar{\mathbf{y}}\right)\left(\mathbf{y}^{(i)} - \bar{\mathbf{y}}\right)^T.
\end{aligned} \tag{4‐341}$$
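The unscented transformation with $2n$ equally weighted sigma points can be sketched in a few lines. The function name `unscented_transform` and the polar‐to‐Cartesian demonstration are assumptions made for illustration; the matrix square root is taken from NumPy's Cholesky factorization, whose lower‐triangular factor $\mathbf{L}$ satisfies $\mathbf{L}\mathbf{L}^T = n\mathbf{P}_x$, so that the rows of $\mathbf{L}^T$ play the role of $\left(\sqrt{n\mathbf{P}_x}\right)_i$.

```python
import numpy as np

def unscented_transform(h, x_mean, P_x):
    """Mean and covariance of y = h(x) from 2n sigma points."""
    n = len(x_mean)
    L = np.linalg.cholesky(n * P_x)          # L @ L.T == n * P_x
    offsets = np.vstack([L.T, -L.T])         # 2n rows, cf. (4-339)
    sigma = x_mean + offsets                 # one sigma point per row
    y = np.array([h(s) for s in sigma])      # propagation, cf. (4-340)
    y_mean = y.mean(axis=0)                  # cf. (4-341)
    dy = y - y_mean
    P_y = dy.T @ dy / (2 * n)
    return y_mean, P_y

# Hypothetical example: range and bearing converted to Cartesian coordinates.
def polar_to_cartesian(x):
    r, theta = x
    return np.array([r * np.cos(theta), r * np.sin(theta)])

y_mean, P_y = unscented_transform(polar_to_cartesian,
                                  np.array([1.0, np.pi / 2]),
                                  np.diag([0.02**2, 0.35**2]))
print(y_mean)
print(P_y)
```

For a linear function $\mathbf{h}(\mathbf{x}) = \mathbf{A}\mathbf{x}$ the transformation reproduces $\mathbf{A}E[\mathbf{x}]$ and $\mathbf{A}\mathbf{P}_x\mathbf{A}^T$ exactly, which is a convenient sanity check for any implementation.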
The described concept can be used to derive a nonlinear Kalman filter, denoted as Unscented Kalman Filter (UKF), which was suggested by Wan and van der Merwe, 2000, Julier et al., 2000, and Julier and Uhlmann, 1997. It will usually show a better performance than an EKF, as the approximation of expected value and covariance of the state vector achieves a higher order accuracy than in a simple linearization approach. The computational effort is comparable with the one necessary for an EKF. The basic idea behind the UKF, as stated in a similar form in Julier et al., 2000, is the following consideration: In the framework of a Kalman filter, we have to propagate a stochastic variable, namely the state vector, through nonlinear functions. As there is no direct approach for this problem, we can do two different things: Either we propagate the original stochastic variable through approximated (i.e. linearized) nonlinear functions, which is what the EKF does. Or we approximate the probability distribution instead and propagate it through the original nonlinear function. The second solution seems intuitively easier, and that is why the UKF is based on it. We assume a nonlinear system according to equation (4‐324). At this point we need to distinguish between two scenarios. Equation (4‐324) allows for the process and measurement noise to enter into the state or output vector in a nonlinear manner. If this is the case, it is necessary to augment the state vector with the noise, that is, the noise is estimated as well. According to Julier and Uhlmann, 2004 as well as Wan and van der Merwe, 2001, the augmented state vector $\mathbf{x}^a(k)$ can be written as

$$\mathbf{x}^a(k) = \begin{bmatrix} \mathbf{x}(k) \\ \mathbf{w}(k) \\ \mathbf{v}(k) \end{bmatrix}, \tag{4‐342}$$

and it is initialized as

$$\hat{\mathbf{x}}^{a+}(0) = \begin{bmatrix} E[\mathbf{x}(0)] \\ \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \tag{4‐343}$$

$$\mathbf{P}^{a+}(0) = \begin{bmatrix} E\left[\left(\mathbf{x}(0) - E[\mathbf{x}(0)]\right)\left(\mathbf{x}(0) - E[\mathbf{x}(0)]\right)^T\right] & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}(0) & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{R}(0) \end{bmatrix}. \tag{4‐344}$$
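In many cases, it can be assumed that the noises enter in a linear manner. If the system description can be written as

$$\begin{aligned}
\mathbf{x}(k+1) &= \mathbf{f}\left(\mathbf{x}(k), \mathbf{u}(k), k\right) + \mathbf{w}(k), \\
\mathbf{y}(k) &= \mathbf{h}\left(\mathbf{x}(k), k\right) + \mathbf{v}(k),
\end{aligned} \tag{4‐345}$$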
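it is possible to simply use the normal state vector, with the initialization according to equations (4‐260) and (4‐261). For the sake of simplicity, we will assume this case in the following and explicitly point out if there is a difference between both algorithms. For the prediction, it is necessary to compute the sigma points according to equation (4‐339). It is common to add the current a posteriori estimation as one point, so that there are in sum $2n + 1$ points according to

$$\begin{aligned}
\boldsymbol{\chi}^{(0)}(k-1) &= \hat{\mathbf{x}}^+(k-1), \\
\boldsymbol{\chi}^{(i)}(k-1) &= \hat{\mathbf{x}}^+(k-1) + \left(\sqrt{(n + \lambda)\,\mathbf{P}^+(k-1)}\right)_i^T, & i &= 1, \dots, n, \\
\boldsymbol{\chi}^{(n+i)}(k-1) &= \hat{\mathbf{x}}^+(k-1) - \left(\sqrt{(n + \lambda)\,\mathbf{P}^+(k-1)}\right)_i^T, & i &= 1, \dots, n,
\end{aligned} \tag{4‐346}$$

where $\lambda = \alpha^2(n + \kappa) - n$ is a scaling parameter, incorporating the constant $\alpha$ as a measure for the spreading of the sigma points around the a posteriori estimate, which is usually set to a small value, e.g. $10^{-3}$, and $\kappa$, which is usually set to 0. In the next step, the sigma points are propagated through the nonlinear state equation which gives:

$$\mathbf{y}_x^{(i)}(k) = \mathbf{f}\left(\boldsymbol{\chi}^{(i)}(k-1), \mathbf{u}(k-1), E[\mathbf{w}(k-1)] = \mathbf{0}, k-1\right), \qquad i = 0, \dots, 2n. \tag{4‐347}$$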
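The variable $\mathbf{y}_x^{(i)}(k)$ is not an output of the overall system; it merely denotes the sigma points after being propagated through the state difference equation. As discussed for equation (4‐341), we can now use the mean and the covariance of the propagated sigma points as a priori estimation and estimation error covariance matrix. It is common to replace the quotient in front of the sums in equation (4‐341) by weighting parameters, which gives:

$$\begin{aligned}
\hat{\mathbf{x}}^-(k) &= \sum_{i=0}^{2n} w_m^{(i)}\,\mathbf{y}_x^{(i)}(k), \\
\mathbf{P}^-(k) &= \sum_{i=0}^{2n} w_c^{(i)}\left(\mathbf{y}_x^{(i)}(k) - \hat{\mathbf{x}}^-(k)\right)\left(\mathbf{y}_x^{(i)}(k) - \hat{\mathbf{x}}^-(k)\right)^T + \mathbf{Q}(k),
\end{aligned} \tag{4‐348}$$

with

$$w_m^{(0)} = \frac{\lambda}{n + \lambda}, \qquad w_c^{(0)} = \frac{\lambda}{n + \lambda} + \left(1 - \alpha^2 + \beta\right), \qquad w_m^{(i)} = w_c^{(i)} = \frac{1}{2(n + \lambda)}, \quad i = 1, \dots, 2n,$$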
where $\beta$ represents a priori knowledge on the distribution of $\mathbf{x}$, e.g. $\beta = 2$ if the distribution is Gaussian. Note that in the equation for $\mathbf{P}^-(k)$, we have added $\mathbf{Q}(k)$, as the first part only describes the propagation of the estimation error covariance through the system equation, and the covariance of the process noise occurs on top of that. However, if we work with the augmented state vector $\mathbf{x}^a(k)$, the process noise covariance is already included in $\mathbf{P}^{a-}(k)$, and we have to omit the addition of $\mathbf{Q}(k)$ at this point. Figure 4‐47 displays again the usage of the unscented transformation (ignoring the usage of the $\boldsymbol{\chi}^{(0)}$‐point for the sake of clarity) for the a priori estimation of the UKF for a two‐dimensional state vector: Starting from the a posteriori estimation of the last time step, $\hat{\mathbf{x}}^+(k-1)$, the sigma points for the current time step are computed (blue arrows), where the a posteriori estimation error covariance matrix of the last time step, $\mathbf{P}^+(k-1)$, influences how far the sigma points are away from $\hat{\mathbf{x}}^+(k-1)$. Then the sigma points are propagated through the system vector difference equation (green arrows) to obtain the propagated sigma points $\mathbf{y}_x^{(i)}(k)$. The current a priori estimation $\hat{\mathbf{x}}^-(k)$ is then computed as mean of the propagated sigma points (red arrows), and the distance of the propagated sigma points from the mean value determines the a priori estimation error covariance matrix $\mathbf{P}^-(k)$.
Legend:
Transformation acc. to equation (4‐346)
Transformation acc. to equation (4‐347)
Transformation acc. to equation (4‐348)
Figure 4‐47: The unscented transformation as employed for the a priori estimation of the UKF
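The a priori step of the UKF for additive process noise can be sketched as follows. The helper names `sigma_points` and `ukf_predict` as well as the default values for `alpha`, `kappa`, and `beta` are assumptions that mirror the typical choices mentioned in the text; the linear demonstration model is hypothetical and only serves as a check.

```python
import numpy as np

def sigma_points(x, P, alpha=1e-3, kappa=0.0, beta=2.0):
    """2n+1 scaled sigma points with mean and covariance weights."""
    n = len(x)
    lam = alpha**2 * (n + kappa) - n
    L = np.linalg.cholesky((n + lam) * P)        # L @ L.T == (n + lam) * P
    chi = np.vstack([x, x + L.T, x - L.T])       # one point per row, cf. (4-346)
    w_m = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    w_c = w_m.copy()
    w_m[0] = lam / (n + lam)
    w_c[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    return chi, w_m, w_c

def ukf_predict(x_post, P_post, u, k, f, Q):
    """A priori estimate and covariance for additive noise, cf. (4-348)."""
    chi, w_m, w_c = sigma_points(x_post, P_post)
    y_x = np.array([f(c, u, k) for c in chi])    # propagated sigma points
    x_prior = w_m @ y_x
    d = y_x - x_prior
    P_prior = (w_c[:, None] * d).T @ d + Q
    return x_prior, P_prior, y_x, w_m, w_c

# Hypothetical linear model used as a sanity check.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
Q = 0.01 * np.eye(2)
x0, P0, u = np.array([1.0, 2.0]), np.array([[2.0, 0.5], [0.5, 1.0]]), np.zeros(2)
x_prior, P_prior, y_x, w_m, w_c = ukf_predict(x0, P0, u, 0,
                                              lambda x, u, k: A @ x + u, Q)
print(x_prior)
print(P_prior)
```

For a linear state equation the result should coincide with the linear Kalman filter prediction $\mathbf{A}\hat{\mathbf{x}}^+$ and $\mathbf{A}\mathbf{P}^+\mathbf{A}^T + \mathbf{Q}$. Note that with a very small $\alpha$ the weight of the central point becomes a large negative number, so the sums are evaluated with some loss of floating-point precision.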
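In order to perform the a posteriori estimation, we need to find a way to compute two variables. To employ equation (4‐333), we need $\mathbf{K}(k)$ and a replacement for $\mathbf{h}\left(\hat{\mathbf{x}}^-(k), \mathbf{0}, k\right)$, which is the estimated output based on the a priori estimation. We will start with the latter one. It is straightforward to use the unscented transformation once more, now based on the current a priori estimate. It is possible to compute new sigma points as:

$$\begin{aligned}
\boldsymbol{\chi}^{(0)}(k) &= \hat{\mathbf{x}}^-(k), \\
\boldsymbol{\chi}^{(i)}(k) &= \hat{\mathbf{x}}^-(k) + \left(\sqrt{(n + \lambda)\,\mathbf{P}^-(k)}\right)_i^T, & i &= 1, \dots, n, \\
\boldsymbol{\chi}^{(n+i)}(k) &= \hat{\mathbf{x}}^-(k) - \left(\sqrt{(n + \lambda)\,\mathbf{P}^-(k)}\right)_i^T, & i &= 1, \dots, n.
\end{aligned} \tag{4‐349}$$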
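Alternatively, to save some computational effort at the cost of some accuracy, we can reuse the propagated sigma points of the a priori estimation:

$$\boldsymbol{\chi}^{(i)}(k) = \mathbf{y}_x^{(i)}(k), \qquad i = 0, \dots, 2n. \tag{4‐350}$$

No matter which strategy was employed, the next step is to propagate $\boldsymbol{\chi}^{(i)}(k)$ through the nonlinear output equation to obtain:

$$\mathbf{y}^{(i)}(k) = \mathbf{h}\left(\boldsymbol{\chi}^{(i)}(k), E[\mathbf{v}(k)] = \mathbf{0}, k\right), \qquad i = 0, \dots, 2n. \tag{4‐351}$$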
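Now we can compute the estimated output $\hat{\mathbf{y}}(k)$ as well as the covariance of the estimated output, $\mathbf{P}_y(k)$, according to equation (4‐341). Note that we have not used this quantity before; its usefulness at this point will become visible soon. Also, we had introduced the cross‐covariance matrix of two stochastic signals at the end of section 4.3.1.7, see (4‐178). Now, we also compute $\mathbf{P}_{xy}(k)$ as the cross‐covariance matrix between $\hat{\mathbf{x}}^-(k)$ and $\hat{\mathbf{y}}(k)$. In both cases, the necessary multiplication with $1/(2n)$ according to equation (4‐341) is realized by the weights $w_m^{(i)}$ and $w_c^{(i)}$, which can be used to tune the filter. We can write:

$$\begin{aligned}
\hat{\mathbf{y}}(k) &= \sum_{i=0}^{2n} w_m^{(i)}\,\mathbf{y}^{(i)}(k), \\
\mathbf{P}_y(k) &= \sum_{i=0}^{2n} w_c^{(i)}\left(\mathbf{y}^{(i)}(k) - \hat{\mathbf{y}}(k)\right)\left(\mathbf{y}^{(i)}(k) - \hat{\mathbf{y}}(k)\right)^T + \mathbf{R}(k), \\
\mathbf{P}_{xy}(k) &= \sum_{i=0}^{2n} w_c^{(i)}\left(\boldsymbol{\chi}^{(i)}(k) - \hat{\mathbf{x}}^-(k)\right)\left(\mathbf{y}^{(i)}(k) - \hat{\mathbf{y}}(k)\right)^T.
\end{aligned} \tag{4‐352}$$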
Note that in the computation of $\mathbf{P}_y(k)$, we have added $\mathbf{R}(k)$ to represent the measurement noise, for the same reason we added $\mathbf{Q}(k)$ in equation (4‐348). Again, if the augmented state vector $\mathbf{x}^a(k)$ is used, $\mathbf{R}(k)$ has to be omitted. The described procedure gives us the required $\hat{\mathbf{y}}(k)$, but we must still find a way to compute $\mathbf{K}(k)$. To proceed, we need to look at a statistical derivation of the Kalman filter, according to Simon, 2006. This will give us a relation between $\mathbf{K}(k)$ and the covariance and cross‐covariance matrices we have just computed. To this end, we assume that we have just performed an a priori estimation and have obtained $\hat{\mathbf{x}}^-(k)$. We are now looking for a way to find an optimal a posteriori estimation. In order for the algorithm to be linear, we might use the following approach:

$$\hat{\mathbf{x}}^+(k) = \mathbf{K}(k)\mathbf{y}(k) + \mathbf{b}(k) \;\Rightarrow\; E\left[\hat{\mathbf{x}}^+(k)\right] = \mathbf{K}(k)E[\mathbf{y}(k)] + \mathbf{b}(k) \tag{4‐353}$$
with $\mathbf{K}(k)$ and $\mathbf{b}(k)$ being (deterministic) parameters to be determined in a way that the algorithm qualifies as Kalman filter algorithm: The estimation has to be unbiased, and the trace of the estimation error covariance matrix must be minimal. Unbiasedness is fulfilled if $\hat{\mathbf{x}}^+(k) = E[\mathbf{x}(k)]$, while the current measurements $\mathbf{y}(k)$ are employed, see equation (4‐256). Looking at equation (4‐353), we see that this condition holds for

$$\mathbf{K}(k)\mathbf{y}(k) + \mathbf{b}(k) = E[\mathbf{x}(k)] \;\Rightarrow\; \mathbf{b}(k) = E[\mathbf{x}(k)] - \mathbf{K}(k)E[\mathbf{y}(k)]. \tag{4‐354}$$
In the last transformation, we applied the expected value on both sides of the equation and made use of the fact that the expected values of deterministic variables ($\mathbf{b}(k)$, $\mathbf{K}(k)$) equal the variables themselves, and that two nested expected values can be replaced by a single expected value, see equations (4‐166) and (4‐167). We can see from our computations: If $\mathbf{b}(k)$ fulfils the last equation, then the estimation is unbiased, independent of $\mathbf{K}(k)$. Therefore, we now need to find that $\mathbf{K}(k)$ that minimizes the trace of the estimation error covariance matrix $\mathbf{P}^+(k)$:

$$\mathbf{K}(k) = \underset{\mathbf{K}}{\operatorname{argmin}}\;\operatorname{tr}\,\mathbf{P}^+(k) = \underset{\mathbf{K}}{\operatorname{argmin}}\;\operatorname{tr}\,E\left[\left(\mathbf{x}(k) - \hat{\mathbf{x}}^+(k)\right)\left(\mathbf{x}(k) - \hat{\mathbf{x}}^+(k)\right)^T\right]. \tag{4‐355}$$
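We will now find a better way to express $\mathbf{P}^+(k)$. Recall that for a stochastic vector $\mathbf{z}(k)$, the covariance matrix $\mathbf{P}_z(k)$ can be computed to be:

$$\mathbf{P}_z(k) = E\left[\left(\mathbf{z}(k) - E[\mathbf{z}(k)]\right)\left(\mathbf{z}(k) - E[\mathbf{z}(k)]\right)^T\right] = E\left[\mathbf{z}(k)\mathbf{z}^T(k)\right] - E[\mathbf{z}(k)]\,E[\mathbf{z}(k)]^T. \tag{4‐356}$$

Now let $\mathbf{z}(k) = \mathbf{x}(k) - \hat{\mathbf{x}}^+(k)$ and note that $\mathbf{P}^+(k) = E\left[\mathbf{z}(k)\mathbf{z}^T(k)\right]$ holds. Thus we can write:

$$\mathbf{P}^+(k) = \mathbf{P}_z(k) + E[\mathbf{z}(k)]\,E[\mathbf{z}(k)]^T. \tag{4‐357}$$

The next step is the development of $\mathbf{P}_z(k)$:

$$\begin{aligned}
\mathbf{P}_z(k) &= E\left[\left(\mathbf{z}(k) - E[\mathbf{z}(k)]\right)\left(\mathbf{z}(k) - E[\mathbf{z}(k)]\right)^T\right] \\
&= E\left[\left(\mathbf{x}(k) - \hat{\mathbf{x}}^+(k) - E\left[\mathbf{x}(k) - \hat{\mathbf{x}}^+(k)\right]\right)\left(\dots\right)^T\right].
\end{aligned} \tag{4‐358}$$

Inserting for $\hat{\mathbf{x}}^+(k)$ according to equation (4‐353) gives:

$$\begin{aligned}
\mathbf{P}_z(k) = {} & E\left[\left(\mathbf{x}(k) - \mathbf{K}(k)\mathbf{y}(k) - \mathbf{b}(k) - E[\mathbf{x}(k)] + \mathbf{K}(k)E[\mathbf{y}(k)] + \mathbf{b}(k)\right)\left(\dots\right)^T\right] \\
= {} & E\left[\left(\mathbf{x}(k) - E[\mathbf{x}(k)] - \mathbf{K}(k)\left(\mathbf{y}(k) - E[\mathbf{y}(k)]\right)\right)\left(\dots\right)^T\right] \\
= {} & E\left[\left(\mathbf{x}(k) - E[\mathbf{x}(k)]\right)\left(\mathbf{x}(k) - E[\mathbf{x}(k)]\right)^T\right] \\
& - \mathbf{K}(k)\,E\left[\left(\mathbf{y}(k) - E[\mathbf{y}(k)]\right)\left(\mathbf{x}(k) - E[\mathbf{x}(k)]\right)^T\right] \\
& - E\left[\left(\mathbf{x}(k) - E[\mathbf{x}(k)]\right)\left(\mathbf{y}(k) - E[\mathbf{y}(k)]\right)^T\right]\mathbf{K}^T(k) \\
& + \mathbf{K}(k)\,E\left[\left(\mathbf{y}(k) - E[\mathbf{y}(k)]\right)\left(\mathbf{y}(k) - E[\mathbf{y}(k)]\right)^T\right]\mathbf{K}^T(k).
\end{aligned} \tag{4‐359}$$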
We end up with a sum of four expected values. In the first one, we might replace $E[\mathbf{x}(k)]$ with the a priori estimation $\hat{\mathbf{x}}^-(k)$, as the a posteriori estimation of the current time step is yet unknown. That way, the first expected value becomes the a priori estimation error covariance matrix $\mathbf{P}^-(k)$. Note that the second and third expected values equal the cross‐covariances $\mathbf{P}_{yx}(k)$ and $\mathbf{P}_{xy}(k)$, respectively, as discussed for equation (4‐352), and the fourth expected value equals the covariance of the output, $\mathbf{P}_y(k)$. Using the fact that $\mathbf{P}_{yx}(k) = \mathbf{P}_{xy}^T(k)$, we can write:

$$\mathbf{P}_z(k) = \mathbf{P}^-(k) - \mathbf{K}(k)\mathbf{P}_{xy}^T(k) - \mathbf{P}_{xy}(k)\mathbf{K}^T(k) + \mathbf{K}(k)\mathbf{P}_y(k)\mathbf{K}^T(k). \tag{4‐360}$$
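We insert this result into equation (4‐357) and take the trace on both sides to obtain:

$$\operatorname{tr}\,\mathbf{P}^+(k) = \operatorname{tr}\left(\mathbf{P}^-(k) - \mathbf{K}(k)\mathbf{P}_{xy}^T(k) - \mathbf{P}_{xy}(k)\mathbf{K}^T(k) + \mathbf{K}(k)\mathbf{P}_y(k)\mathbf{K}^T(k)\right) + \operatorname{tr}\left(E[\mathbf{z}(k)]\,E[\mathbf{z}(k)]^T\right). \tag{4‐361}$$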
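Our next goal is to split the first trace into two, of which only one depends on the adjustable variable $\mathbf{K}(k)$. To this end, we will add one more summand in the trace, which is also subtracted to keep the equation balanced. For the second trace, we replace $\mathbf{z}(k)$ by $\mathbf{x}(k) - \hat{\mathbf{x}}^+(k)$ and $E\left[\hat{\mathbf{x}}^+(k)\right]$ according to equation (4‐353):

$$\begin{aligned}
\operatorname{tr}\,\mathbf{P}^+(k) = {} & \operatorname{tr}\Big(\mathbf{P}^-(k) - \mathbf{P}_{xy}(k)\mathbf{P}_y^{-1}(k)\mathbf{P}_{xy}^T(k) + \mathbf{P}_{xy}(k)\mathbf{P}_y^{-1}(k)\mathbf{P}_{xy}^T(k) \\
& \qquad - \mathbf{K}(k)\mathbf{P}_{xy}^T(k) - \mathbf{P}_{xy}(k)\mathbf{K}^T(k) + \mathbf{K}(k)\mathbf{P}_y(k)\mathbf{K}^T(k)\Big) \\
& + \operatorname{tr}\left(\left(E[\mathbf{x}(k)] - \mathbf{K}(k)E[\mathbf{y}(k)] - \mathbf{b}(k)\right)\left(\dots\right)^T\right).
\end{aligned} \tag{4‐362}$$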
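We will now split the first trace into two: The first two summands will build the first new one, and the remaining four summands will build the second one, where we factor out the term $\mathbf{P}_{yy,k}$. For the last trace in the former equation, note that the trace of the outer product of a vector with itself equals the squared norm of that vector, $\operatorname{tr}(\mathbf{a}\mathbf{a}^T) = \|\mathbf{a}\|^2$:

$$\begin{aligned}
\operatorname{tr}\mathbf{P}_k^+ &= \operatorname{tr}\left(\mathbf{P}_k^- - \mathbf{P}_{xy,k} \mathbf{P}_{yy,k}^{-1} \mathbf{P}_{xy,k}^T\right) + \operatorname{tr}\left(\left(\mathbf{K}_k - \mathbf{P}_{xy,k} \mathbf{P}_{yy,k}^{-1}\right) \mathbf{P}_{yy,k} \left(\mathbf{K}_k - \mathbf{P}_{xy,k} \mathbf{P}_{yy,k}^{-1}\right)^T\right) \\
&\quad + \left\|E\{\mathbf{x}_k\} - \mathbf{K}_k E\{\mathbf{y}_k\} - \mathbf{b}_k\right\|^2.
\end{aligned} \tag{4-363}$$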
We need to keep in mind that we are still trying to find the $\mathbf{K}_k$ that minimizes this expression. Note that the first trace in the above equation contains neither $\mathbf{K}_k$ nor $\mathbf{b}_k$; therefore we cannot influence its value. The squared norm at the end will become zero if we set $\mathbf{b}_k$ according to equation (4‐354), which we have to do anyway in order to obtain an unbiased estimator. The second trace, which can never be negative due to its quadratic structure, becomes zero if we use the following expression for $\mathbf{K}_k$:

$$\mathbf{K}_k = \mathbf{P}_{xy,k} \mathbf{P}_{yy,k}^{-1}. \tag{4-364}$$
This is our required result. We can further transform our a posteriori estimation according to equation (4‐353) by replacing $\mathbf{b}_k$ according to equation (4‐354) to obtain:

$$\hat{\mathbf{x}}_k^+ = \mathbf{K}_k \mathbf{y}_k + E\{\mathbf{x}_k\} - \mathbf{K}_k E\{\mathbf{y}_k\}. \tag{4-365}$$
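Note that the expected value for $\mathbf{y}_k$ equals the estimation $\hat{\mathbf{y}}_k$ which we can perform based on the knowledge of the outputs up to $\mathbf{y}_{k-1}$. As discussed above, we can also replace $E\{\mathbf{x}_k\}$ by $\hat{\mathbf{x}}_k^-$ to obtain:

$$\hat{\mathbf{x}}_k^+ = \hat{\mathbf{x}}_k^- + \mathbf{K}_k \left(\mathbf{y}_k - \hat{\mathbf{y}}_k\right). \tag{4-366}$$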
Note that this equation will become the one derived for the linear Kalman filter according to Figure 4‐39 if we replace $\hat{\mathbf{y}}_k$ by $\mathbf{C}_k \hat{\mathbf{x}}_k^-$, and the one for the EKF according to equation (4‐333) if we replace $\hat{\mathbf{y}}_k$ by $\mathbf{h}(\hat{\mathbf{x}}_k^-, \mathbf{0})$. But what will be the result for $\mathbf{P}_k^+$ in this case? If we look at equation (4‐357), we see from the recent discussions that the second summand of the right side becomes zero if we set $\mathbf{b}_k$ according to equation (4‐354), so that $\mathbf{P}_k^+$ equals $\mathbf{P}_{z,k}$ according to equation (4‐360). If we substitute equation (4‐364) into (4‐360), the second, third and fourth summand all individually become $\mathbf{P}_{xy,k} \mathbf{P}_{yy,k}^{-1} \mathbf{P}_{xy,k}^T$. Two summands have a negative sign, so we can finally use either of the following expressions for $\mathbf{P}_k^+$:

$$\mathbf{P}_k^+ = \mathbf{P}_k^- - \mathbf{P}_{xy,k} \mathbf{P}_{yy,k}^{-1} \mathbf{P}_{xy,k}^T, \qquad \mathbf{P}_k^+ = \mathbf{P}_k^- - \mathbf{K}_k \mathbf{P}_{yy,k} \mathbf{K}_k^T. \tag{4-367}$$
Therefore, the correction step of the UKF is given by the equations (4‐364), (4‐365), and (4‐367), where the necessary quantities $\hat{\mathbf{y}}_k$, $\mathbf{P}_{yy,k}$, and $\mathbf{P}_{xy,k}$ can be computed using equation (4‐352).
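The correction step can be sketched in a few lines of Python with NumPy. This is a minimal illustration and not the implementation used for the results below: the function name `correction_step` and the small linear test case are assumptions made here, and the predicted output and the covariances are simply passed in rather than computed from sigma points. The state update is written in the form of equation (4‐366).

```python
import numpy as np

def correction_step(x_prior, P_prior, y, y_pred, P_xy, P_yy):
    """Correction step following equations (4-364), (4-366) and (4-367)."""
    # K = P_xy P_yy^{-1}; a linear solve avoids forming the inverse explicitly
    K = np.linalg.solve(P_yy.T, P_xy.T).T
    x_post = x_prior + K @ (y - y_pred)     # equation (4-366)
    P_post = P_prior - K @ P_yy @ K.T       # second form of equation (4-367)
    return x_post, P_post, K

# Hypothetical linear test case y = C x + v, for which the required
# quantities are known in closed form (illustrative numbers only).
C = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
x_prior = np.array([1.0, 2.0])
P_prior = np.array([[2.0, 0.3],
                    [0.3, 1.0]])
P_xy = P_prior @ C.T            # cross-covariance between state and output
P_yy = C @ P_prior @ C.T + R    # covariance of the output
y = np.array([1.8])

x_post, P_post, K = correction_step(x_prior, P_prior, y, C @ x_prior, P_xy, P_yy)
```

In this linear special case the result coincides with the familiar Kalman filter update $(\mathbf{I} - \mathbf{K}_k \mathbf{C}_k)\mathbf{P}_k^-$, which offers a quick plausibility check of the general formulas.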
Figure 4‐48: Original States and Unscented Kalman filter estimation for the nonlinear system; color scheme is the same as in Figure 4‐41 to Figure 4‐43
The described filter algorithms have been applied to the nonlinear system according to equation (4‐335), with the same initialization used with the EKF in the last section. Also, the same variates were used for process and measurement noise. For the UKF, the following parameters have been employed: $\alpha = 10^{-3}$, $\kappa = 0$, and $\beta = 2$. The results are shown in Figure 4‐48 in the very same way as before for the EKF in Figure 4‐46. The results look similar; however, the performance of the UKF was better, as the two RMSE values computed according to equation (4‐337) were 1.17 and 0.07.

This concludes our discussions on state estimation. We have introduced the Kalman filter, a powerful tool for state estimation of linear systems. For nonlinear systems, we have extended the filter to the EKF and the UKF. These filters will be employed in chapters 5 and 7 for the cooperative navigation problems. Readers who are interested in further filtering concepts, like the Particle filter or the H∞ filter, can find information in Simon, 2006, or Ristic et al., 2004.
4.4 Comparison Between Observation and Estimation

In this chapter, we have discussed some important issues of the control and system theory domain. In particular, we looked at the task of retrieving information about the internal states of a system. In this context, we compared the concepts of (deterministic) observation and (stochastic) estimation. Both are similar in the following respects:

• In both cases, we assume that we possess an adequate model of the system of interest.
• The inputs are considered as known; as in control theory one of the main tasks is to design a controller for a given plant, the inputs of the plant can be computed by the controller algorithms.
• We assume that we are able to measure the outputs.
• Employing the stated issues, we strive to find information on the internal states of the system, especially on those which cannot be measured directly.
We have seen an example of this similarity in the fact that the Luenberger observer and the continuous Kalman filter exhibit the same structure, while they differ only in the algorithms used to compute the gain factor, which employ different optimization strategies. However, both concepts also exhibit some important differences. First of all, the observation concept is strictly deterministic. That means it is assumed that all signals and system descriptions behave deterministically and can theoretically be modelled with arbitrary accuracy, according to the concrete requirements in a specific mission scenario. The estimation explicitly incorporates stochastic behavior, which means it is assumed that certain signals or system parts behave in a completely unforeseeable way. We have discussed the available basic mechanisms to deal with stochastic occurrences, namely to employ their main moments, the expected value and the variance. We found ways to incorporate this knowledge into the estimation algorithms, which gave us tractable mathematical problems and enabled us to find solutions that are optimal with respect to defined quality measures, namely cost functions.

Figure 4‐49 displays the general idea behind the observation concept. In a truly deterministic system, we employ the knowledge of inputs, outputs, and the modelled system behavior to obtain information on the states, or, to be more precise, on the deterministic value added behind the integral/delay block. This value can be considered as a deterministic disturbance, mostly in the form of unknown initial states $\mathbf{x}_0$. If the system is observable, it is possible to determine the initial states after a finite time from the knowledge of input and output values. This idea was extended to design an observer that can deliver a continuous state observation, while its dynamic behavior can be controlled by setting the value of the gain block.
Figure 4‐49: The principle of observation: Employ knowledge on model, inputs and outputs to obtain information on states, mainly in terms of unknown initial states
Figure 4‐50 displays the principle of estimation, focusing mainly on Kalman filtering. In comparison to the observation concept, two important additions have been made in terms of the process and the measurement noise, $\mathbf{w}$ and $\mathbf{v}$, which are stochastic signals. The former represents all inaccuracies in the modelling process and possibly in the knowledge of the input values. The latter models the measurement error that inevitably occurs when measuring the outputs. It is important to distinguish between the two noises in terms of their meaning for the overall concept. The measurement noise can be considered as disturbing. It is responsible for the fact that we do not have access to the original outputs of the system. We would prefer to know the outputs without this noise, but this is not feasible for most real systems. The process noise, on the other hand, represents the inaccuracies in our modelling and input determination, so it expresses where the true system differs from our model. That means the derivative of the state vector $\dot{\mathbf{x}}(t)$, or the state vector of the next time step $\mathbf{x}(k+1)$, defined by equation (4‐253) or (4‐310), which includes the process noise, is considered to be closer to the ‘real’ state than the a priori prediction according to equation (4‐263), which does not include the process noise. In contrast to the situation with outputs and measurement noise, we are therefore very interested in knowing the states including the process noise. This is the reason why in Figure 4‐50 the green arrows point at the process noise: It is this noise whose influence we want to know but cannot predict, as the noise is a stochastic signal. Therefore, we need to make use of the measurements, because they are based on the true states including the process noise. But we have to consider the measurement noise.

In this respect, the two‐step approach of the discrete Kalman filter becomes clearer: With the a priori estimation, we can only estimate the states without the influence of the process noise. Therefore, we need to incorporate the measurements within the a posteriori estimate. However, we cannot completely rely on the measurements due to the measurement noise. Therefore, we perform the a priori estimation first, which brings our (limited) knowledge about the system behavior and the inputs into the equation. By employing the noise covariance matrices $\mathbf{Q}$ and $\mathbf{R}$, we are able to tune the algorithm according to our trust in the model and in the measurements, respectively.
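The tuning effect of $\mathbf{Q}$ and $\mathbf{R}$ can be made tangible with a scalar example. The following sketch is an illustration under assumptions made here, not a system from this thesis: it uses a random‐walk model $x(k+1) = x(k) + w(k)$ with output $y(k) = x(k) + v(k)$, and the function name `scalar_kalman` as well as all numeric values are chosen for demonstration only.

```python
import random

def scalar_kalman(measurements, q, r, x0=0.0, p0=1.0):
    """Discrete Kalman filter for x(k+1) = x(k) + w(k), y(k) = x(k) + v(k)."""
    x_hat, p = x0, p0
    estimates, gains = [], []
    for y in measurements:
        # A priori step: the model predicts no change, the uncertainty grows by q
        x_prior = x_hat
        p_prior = p + q
        # A posteriori step: blend the prediction with the noisy measurement
        k = p_prior / (p_prior + r)
        x_hat = x_prior + k * (y - x_prior)
        p = (1.0 - k) * p_prior
        estimates.append(x_hat)
        gains.append(k)
    return estimates, gains

# Simulated data (assumed noise levels, for illustration only)
random.seed(0)
true_x, ys = 0.0, []
for _ in range(200):
    true_x += random.gauss(0.0, 0.1)            # process noise
    ys.append(true_x + random.gauss(0.0, 1.0))  # measurement noise

# Small q, large r: trust the model, so the gain stays small
_, gains_trust_model = scalar_kalman(ys, q=1e-4, r=1.0)
# Large q, small r: trust the measurements, so the gain approaches one
_, gains_trust_meas = scalar_kalman(ys, q=1.0, r=1e-2)
```

With a small ratio of $q$ to $r$, the gain settles near zero and the estimate follows the model; with a large ratio, the gain approaches one and the estimate follows the measurements.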
Figure 4‐50: The principle of estimation: Employ knowledge on model, inputs and measurements to obtain information on states, mainly in terms of the influences of the process noise
It might also be worth mentioning that, in the context of observability, we called a system ‘observable’ if its initial states can be determined from a finite observation of inputs and outputs. Note that for the nonlinear estimation, in terms of the EKF, the initial states should be known with good accuracy for the filter initialization; otherwise there is the risk that the filter will diverge. In real applications, the operator has to decide whether the stochastic elements of the measurement/process are small in comparison to the deterministic parts and can therefore be neglected. In this case, the methodologies from observation theory can be employed. In the scenarios of interest for this thesis, stochastic elements have to be considered. Nevertheless, the observation methodologies will play an important role in the discussions on Optimal Sensor Placement.

Figure 4‐51 displays the usage of estimation and observation theory within this thesis. The basic task is the estimation of internal system states, namely navigation data. As the stochastic elements cannot be neglected, estimation is preferred over observation. Based on measurements and system knowledge, estimation rules will be derived that are based on the minimization of a cost function. Cost functions will usually be based on estimation errors. As $\mathbf{x}$ is the parameter of interest, it is straightforward to build a cost function on the estimation error $\mathbf{e}_x$. In general, this is only possible if the original vector $\mathbf{x}$ is available, or if a recursive approach is employed, see e.g. the derivation of the a posteriori estimation for the discrete Kalman filter, starting with equation (4‐273). Alternatively, the estimated vector $\hat{\mathbf{x}}$ can be passed through a system model to obtain an estimated output, $\hat{\mathbf{y}}$, which can be used to compute an alternative output error $\mathbf{e}_y$. The disadvantage of this concept lies in the fact that $\mathbf{e}_y$ also contains the measurement noise of $\mathbf{y}$; however, this approach leads to a tractable problem and is used if the other discussed method is not available.
Figure 4‐51: Observation and Estimation and their usage within this thesis
5 Methods for Cooperative Navigation

We have now introduced the basic methodologies for the discussions on Cooperative Navigation of marine robots. In the following three chapters, the basic scientific ideas are introduced and evaluated. To help the reader understand the principal topic discussed in a certain chapter and therefore follow the common thread, each chapter will be opened by a drawing displaying the general content of the discussion. For the current chapter, the introduction picture is displayed in Figure 5‐1.
Figure 5‐1: Introduction to chapter 5
According to the discussions so far, we can summarize the general task on which the current chapter focuses: to estimate the position of an underwater target, which cannot be directly measured. As shown in Figure 5‐1, it is an important precondition that the instance responsible for the target position estimation (right block in the figure) does not know the true target position. However, it is assumed that the instance has access to a model of the target and the ROs (middle block), and it might even know what the target is in general intended to do (denoted by the controller block on the left). While we will not discuss possible control strategies for target or ROs at this point, it might be part of our a priori knowledge how the control is performed. In any case, we get some information on the target in the form of noisy range measurements, and we have a model of the movement possibilities which can be employed for the navigation task.

In detail, in this chapter we will develop concrete solutions to the navigation tasks that make use of several cooperating agents. The starting point will be the GIB set‐up, as described in section 2.5.5. It is related to the task of estimating the position of an underwater target based on range‐only measurements, performed by surface buoys with access to GPS measurements. In section 5.1, we will lay the theoretical foundation for this task, limited to a static scenario, that is, the position of the target is to be estimated based on the range measurements, without considering movement of target or buoys. These discussions are based on similar ones from literature. They are related to problem 1 in the problem formulation of section 3.2.

We will extend these discussions in section 5.2, where we will additionally allow movement of both buoys and underwater targets. As we try to perform navigation for an underwater target from the position of the surface ROs, this is related to the ‘external navigation’, according to the definition made in section 3.1. As we explicitly allow for movement of the buoys, it is straightforward to replace them later by surface robots. Following this principle requires a modeling strategy that incorporates knowledge about the possible performance of the target in terms of maximum acceleration, change of turning rate, etc., and that also considers the acoustic communication between the team members and the fact that they move while the communication takes place. The solution of this problem is based on a GIB scenario from literature, which was extended by the author of this thesis to fit the described benchmark scenario I in section 3.3. The work was done in the framework of the research project CONMAR (see section 1.4.2), so the underlying scenario is based on a diver who is assisted by three surface robots, as defined in benchmark scenario I. The problem to be solved refers to problem 2 according to section 3.2. We will describe the methodology with respect to the discussions in chapter 4, and show some comparative simulation results as well as the results from real sea trials, which were performed in the framework of the CONMAR project.

In section 5.3, we will propose a solution for ‘internal navigation’ that was developed by the author in the framework of the research project MORPH (see section 1.4.3). In this scenario, a team of three marine robots, of which two are moving under water, has to be enabled to perform cooperative navigation as a basis for their controllers to maintain a close formation. In this scenario, it is not possible to perform range measurements to several ROs simultaneously, as before. Therefore, the measurement of the bearing angles between the robots by means of a USBL system is used. This relates to problem 3 of the problem formulation, and the corresponding benchmark scenario II. We will discuss a general solution to the problem and present the results of Hardware‐In‐The‐Loop simulations performed during the MORPH project. Additionally, we will discuss the usage of an advanced filter concept and compare the results based on simulations in MATLAB.

At the end of this chapter, we will summarize the results discussed before and explain how the next two chapters will continue the discussions on cooperative navigation.

© Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2020 T. Glotzbach, Navigation of Autonomous Marine Robots, https://doi.org/10.1007/978-3-658-30109-5_5
5.1 Static Navigation Problem

In this section, we will describe the range‐only localization problem as a static problem, following the discussions in Alcocer, 2009, especially in sections 5.1.3 to 5.1.5. We refer to the problem definition and notation introduced in section 3.2, especially in Figure 3‐3. The most important issues are summarized again: We want to estimate the position of a stationary underwater agent, denoted as target, by employing range‐only measurements performed by $n$ stationary surface reference objects (ROs). We assume that the positions of the ROs are known, and that they can communicate with each other without any significant data loss or time delay. However, the communication between target and ROs is limited. The target sends an acoustic ping at defined time instants, allowing the ROs to measure their individual range to the target, based on the TOA measurements and an estimated speed of sound. The target might or might not be able to measure its own depth and to send this information instead of the ping via the acoustic channel. If this is possible, both the ranges between target and ROs and the depth of the target can be assumed as known, which reduces the computation to a two‐dimensional problem.
It can be stated that this approach solely uses the last available range measurements for the purpose of position estimation; it does not consider any dynamics of the involved vehicles, or measurements from earlier times. We will expand this idea to a dynamic situation in section 5.2. Here, the intention is to provide an introduction to the topic and to show which kinds of problems need to be solved for the static approach.

5.1.1 Problem Formulation

Recapitulating the discussions from the problem formulation in section 3.2, we can state that we consider a target located at $\mathbf{p}_0$, and $n$ ROs at positions $\mathbf{p}_1$ ‐ $\mathbf{p}_n$, with $\mathbf{p}_j \in \mathbb{R}^m$, $j = 0, \ldots, n$, where $m = 2 \vee m = 3$ defines the geometrical dimensions to consider. We introduce the position matrix $\mathbf{P} \in \mathbb{R}^{m \times n}$ which contains all RO positions as

$$\mathbf{P} = \left[\mathbf{p}_1 \ldots \mathbf{p}_n\right]. \tag{5-1}$$

Due to the static character of the scenario, we do not need to write these coordinates as functions of time. With $r_i$ being the distance between target and RO number $i$, we can state that

$$r_i = \left\|\mathbf{p}_0 - \mathbf{p}_i\right\|; \quad i = 1, \ldots, n, \tag{5-2}$$

and we can introduce the vector with the real ranges $\mathbf{r} = \left[r_1 \ldots r_n\right]^T$. Each range measurement $\grave{r}_i$ will be overlaid by zero mean Gaussian white noise $v_{r,i}$:

$$\grave{r}_i = r_i + v_{r,i}; \quad i = 1, \ldots, n, \tag{5-3}$$

which gives rise to the measurement vector $\grave{\mathbf{r}} = \left[\grave{r}_1 \ldots \grave{r}_n\right]^T$ and the disturbance vector $\mathbf{v}_r = \left[v_{r,1} \ldots v_{r,n}\right]^T$.
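The measurement model of equations (5‐1) to (5‐3) can be simulated directly. The snippet below is a minimal sketch using NumPy: the geometry, the noise standard deviation `sigma`, and the helper name `true_ranges` are assumptions made for illustration, and the noise is taken to be independent with identical variance for all ROs.

```python
import numpy as np

def true_ranges(p0, P):
    """Ranges r_i = ||p_0 - p_i|| for RO positions stored as the columns of P."""
    return np.linalg.norm(P - p0[:, None], axis=0)

rng = np.random.default_rng(seed=1)

p0 = np.array([10.0, 5.0, 20.0])      # hypothetical target position (m = 3)
P = np.array([[0.0, 30.0, 0.0],       # x coordinates of n = 3 ROs
              [0.0, 0.0, 30.0],       # y coordinates
              [0.0, 0.0, 0.0]])       # z = 0: ROs at the surface
sigma = 0.5                           # assumed noise standard deviation in metres

r = true_ranges(p0, P)                             # equation (5-2)
r_meas = r + rng.normal(0.0, sigma, size=r.shape)  # equation (5-3)
```

Storing the RO positions as columns mirrors the definition of the position matrix in equation (5‐1), so that the number of columns equals the number of range measurements.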
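Our task will be to find an estimate for the target position vector, $\hat{\mathbf{p}}_0$, as a function of the measurement vector $\grave{\mathbf{r}}$. This procedure is also referred to as trilateration. While we would be interested in minimizing the estimation error, $\mathbf{p}_0 - \hat{\mathbf{p}}_0$, this approach is not suitable, as the true $\mathbf{p}_0$ is not known. Therefore, we can introduce the vector of the estimated ranges $\hat{\mathbf{r}} = \left[\hat{r}_1 \ldots \hat{r}_n\right]^T$ with

$$\hat{r}_j = \left\|\hat{\mathbf{p}}_0 - \mathbf{p}_j\right\|; \quad j = 1, \ldots, n, \tag{5-4}$$

and we will be able to find a mathematically tractable approach by comparing this vector with the measurement vector $\grave{\mathbf{r}}$. For the measurement disturbance vector $\mathbf{v}_r$, we will assume an additive white Gaussian noise, as introduced in section 4.3.1.7, with an expected value of zero and a covariance matrix $\mathbf{R}$ given by

$$\mathbf{R} := E\left\{\mathbf{v}_r \mathbf{v}_r^T\right\} = \begin{bmatrix} \sigma_{1,1}^2 & \sigma_{1,2}^2 & \cdots & \sigma_{1,n}^2 \\ \sigma_{2,1}^2 & \sigma_{2,2}^2 & \cdots & \sigma_{2,n}^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n,1}^2 & \sigma_{n,2}^2 & \cdots & \sigma_{n,n}^2 \end{bmatrix}. \tag{5-5}$$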
Under certain conditions that we have yet to discuss, this matrix simplifies to

$$\mathbf{R} = \sigma^2 \mathbf{I}_n, \tag{5-6}$$

where $\sigma^2$ is the common measurement noise variance for all ROs, and $\mathbf{I}_n$ is the identity matrix of dimension $n$. The task we have to solve is to find an estimation $\hat{\mathbf{p}}_0$ that is optimal in some sense, based on noisy range measurements according to equation (5‐3). This task is referred to as parameter estimation, which we will look at in the following subsection, before returning to the problem just formulated in section 5.1.3.

5.1.2 On Parameter Estimation

The methods of parameter estimation support the task of finding parameters within a system model based on noisy measurements of the outputs. In the discussions in this section and the corresponding subsections, we are referring to Glotzbach and Ament, 2016a. The general principle is depicted in Figure 5‐2. We assume there is a MISO system, representing the transformation of an input vector $\mathbf{u}$ to an output $y$, described by some function $f$ which might be linear or nonlinear. This function contains a number of parameters, summarized in a parameter vector $\mathbf{x}$. For a linear system, $f$ would be represented by an ordinary differential equation, while $\mathbf{x}$ would contain the coefficients of this ODE. Our task is the following: By controlling or at least observing the input vector $\mathbf{u}$ for a certain amount of time, by measuring the output $y$, which is overlaid by a zero‐mean Gaussian measurement noise $v$, and assuming we have a good knowledge of the true system structure $f$, formulated as $\hat{f}$, how can we find an estimation of the true parameter vector, denoted as $\hat{\mathbf{x}}$, which is optimal in some sense? The measuring of input and output values can be performed in a continuous or discrete time manner; for the sake of simplicity, we will concentrate on the discrete time option in the ongoing discussion. Summing up, we can write:

$$y(k, \mathbf{x}) = f(\mathbf{x}, \mathbf{u}, k) + v(k). \tag{5-7}$$
Figure 5‐2: The process of parameter estimation
The task described here is closely related to the methods of estimation theory, as discussed in section 4.3.2. As we assume we have some basic knowledge of the system and also of the input vector, the estimation to be performed would be classified as Bayes estimation. As depicted in Figure 5‐2, we might be able to run a system model based on our a priori knowledge $\hat{f}$ and some estimated parameter vector $\hat{\mathbf{x}}$ in parallel to the real system, employing the same input vector. The model will provide an estimated output $\hat{y}$ as a function of $\hat{\mathbf{x}}$. We shall further assume that we took $n$ measurements with identical sample time $T$. Clearly we are interested in minimizing the estimation error $\mathbf{e}_x = \mathbf{x} - \hat{\mathbf{x}}$, but this approach is not realizable, as the true $\mathbf{x}$ is unknown. Therefore, we might employ the output error

$$e_y(k, \hat{\mathbf{x}}) = y(k) - \hat{y}(k, \hat{\mathbf{x}}), \quad k = 1, \ldots, n. \tag{5-8}$$
We have to keep in mind that this error would be unequal to zero even if we managed to estimate the true $\mathbf{x}$, so that $\mathbf{e}_x = \mathbf{0}$ held. This is due to the measurement noise. We need to try and compensate for that effect by performing a large number of experiments. According to Figure 5‐2, the output error is used by the assessment block to compute a cost function that shall then be minimized by the selection of a proper $\hat{\mathbf{x}}$. We have discussed different typical cost functions in section 4.3.2.1. Here, we will employ the sum of the squared errors. Thus, it holds true that

$$J(\hat{\mathbf{x}}) = \sum_{i=1}^{n} e_y^2(i, \hat{\mathbf{x}}). \tag{5-9}$$

The estimated parameter vector can then be found as

$$\hat{\mathbf{x}} = \underset{\hat{\mathbf{x}}}{\operatorname{argmin}}\ J(\hat{\mathbf{x}}), \tag{5-10}$$
which can be classified as a Least Squares (LS) approach. In order to solve this optimization problem, we need to distinguish between two possible situations: If all parameters in $\hat{\mathbf{x}}$ enter into the model equation in a linear manner, it is possible to find a direct solution; that means that the loop ‘model’, ‘assessment’, ‘optimization’ according to Figure 5‐2 is only executed once. The stated condition is fulfilled for all linear models, but also for nonlinear ones, if the parameters to estimate do not belong to the nonlinear part. The latter are referred to as parameter linear. For instance, the model approach

$$\hat{y}(k, \hat{\mathbf{x}}) = \hat{x}_1 + \hat{x}_2 \sin(\omega k T) \tag{5-11}$$
is indubitably nonlinear. However, if $\omega$ can somehow be determined in advance, and the parameter vector to be estimated using measurements only contains $\hat{x}_1$ and $\hat{x}_2$, the model is parameter linear, and a direct solution can be found. For models that are also parameter nonlinear, no direct solution can be found. In these cases, one usually employs iterative approaches to converge to the optimal solution in several steps. In relation to Figure 5‐2, one can state that the loop ‘model’, ‘assessment’, ‘optimization’ is executed several times. In the example according to equation (5‐11), it is straightforward to say that if $\omega$ is unknown and shall be estimated together with $\hat{x}_1$ and $\hat{x}_2$, an iterative solution has to be employed. We will discuss the direct approach in section 5.1.2.1, and iterative solutions in 5.1.2.2.

5.1.2.1 Direct Solution

For the following discussions, we assume that a system is described by a model $\hat{y}(k, \hat{\mathbf{x}})$ with $m$ parameters, $\hat{\mathbf{x}} = \left[\hat{x}_1\ \hat{x}_2\ \cdots\ \hat{x}_m\right]^T$, all of which enter into the model linearly, so that the model is parameter linear. We now strive to find an algorithm to compute an optimal $\hat{\mathbf{x}}$ that minimizes the cost function according to equation (5‐9). Given a parameter linear model, the approach can always be written in the following form:

$$\hat{y}(k, \hat{\mathbf{x}}) = m_1(k)\,\hat{x}_1 + m_2(k)\,\hat{x}_2 + \cdots + m_m(k)\,\hat{x}_m = \mathbf{m}(k)\,\hat{\mathbf{x}}, \tag{5-12}$$
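where $\mathbf{m}(k)$ is a row vector containing all elements that are multiplied with the parameters in the current time step (like input values, known coefficients etc.). For instance, if the model described by equation (5‐11) possesses the parameter vector $\hat{\mathbf{x}} = \left[\hat{x}_1\ \hat{x}_2\right]^T$, it can be written as

$$\hat{y}(k, \hat{\mathbf{x}}) = \begin{bmatrix} 1 & \sin(\omega k T) \end{bmatrix} \begin{bmatrix} \hat{x}_1 \\ \hat{x}_2 \end{bmatrix}. \tag{5-13}$$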
Based on our assumption that our knowledge $\hat{f}$ of the true system structure $f$ is very precise, we can write the system equation of the parameter linear system as a special case of the general expression in equation (5‐7):

$$y(k, \mathbf{x}) = \mathbf{m}(k)\,\mathbf{x} + v(k). \tag{5-14}$$
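We can now insert equation (5‐12) into (5‐8) and afterwards write the $n$ equations for the $n$ taken measurements in vector form:

$$\begin{aligned}
e_y(k, \hat{\mathbf{x}}) &= y(k) - \mathbf{m}(k)\,\hat{\mathbf{x}}, \quad k = 1, \ldots, n, \\
\Rightarrow \mathbf{e}_y(\hat{\mathbf{x}}) &= \mathbf{y} - \mathbf{M}\,\hat{\mathbf{x}}, \\
\text{with } \mathbf{e}_y(\hat{\mathbf{x}}) &= \begin{bmatrix} e_y(1, \hat{\mathbf{x}}) \\ \vdots \\ e_y(n, \hat{\mathbf{x}}) \end{bmatrix},\ \mathbf{y} = \begin{bmatrix} y(1) \\ \vdots \\ y(n) \end{bmatrix},\ \mathbf{M} = \begin{bmatrix} \mathbf{m}(1) \\ \vdots \\ \mathbf{m}(n) \end{bmatrix}.
\end{aligned} \tag{5-15}$$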
Thus we have introduced vectors containing the $n$ single values of errors and outputs, as well as the matrix $\mathbf{M}$ which contains all quantities that influence the true output values, especially the inputs. $\mathbf{M}$ is usually referred to as design matrix, as we can influence its contents by the selection of input values. From the last discussed equations, we can introduce the following vector equations:

$$\begin{aligned}
\hat{\mathbf{y}}(\hat{\mathbf{x}}) &= \mathbf{M}\,\hat{\mathbf{x}}, \\
\mathbf{y}(\mathbf{x}) &= \mathbf{M}\,\mathbf{x} + \begin{bmatrix} v(1) \\ \vdots \\ v(n) \end{bmatrix} = \mathbf{M}\,\mathbf{x} + \mathbf{v}; \quad \mathbf{v} \sim \mathcal{N}(\mathbf{0}, \mathbf{R}),
\end{aligned} \tag{5-16}$$

where we have also summarized the noise in the stochastic vector $\mathbf{v}$, which is zero‐mean and is described by the covariance matrix $\mathbf{R}$. As discussed in section 4.3.1.7, it is common to consider typical measurement noise as Additive White Gaussian Noise (AWGN). In this case, there are no covariances between the noises at different time steps, and $\mathbf{R}$ is a diagonal matrix. If we further assume that the noise process is stationary, the variance $\sigma^2$ is constant, and we can write:

$$\mathbf{R} = \sigma^2\,\mathbf{I}. \tag{5-17}$$
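Looking at equation (5‐9) and noting that the sum of the squared elements of a vector equals the product of the transposed vector with itself, and employing equation (5‐15), the value of the cost function can be written as

$$\begin{aligned}
J(\hat{\mathbf{x}}) &= \mathbf{e}_y^T(\hat{\mathbf{x}})\,\mathbf{e}_y(\hat{\mathbf{x}}) = \left(\mathbf{y} - \mathbf{M}\hat{\mathbf{x}}\right)^T\left(\mathbf{y} - \mathbf{M}\hat{\mathbf{x}}\right) \\
&= \mathbf{y}^T\mathbf{y} - \mathbf{y}^T\mathbf{M}\hat{\mathbf{x}} - \hat{\mathbf{x}}^T\mathbf{M}^T\mathbf{y} + \hat{\mathbf{x}}^T\mathbf{M}^T\mathbf{M}\hat{\mathbf{x}},
\end{aligned} \tag{5-18}$$

where we employed the fact that for two matrices $\mathbf{a}$ and $\mathbf{b}$, it holds true that $(\mathbf{a}\mathbf{b})^T = \mathbf{b}^T\mathbf{a}^T$.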
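We now need to determine the $\hat{\mathbf{x}}$ that minimizes $J(\hat{\mathbf{x}})$. The necessary condition for the minimum of a function is that its first derivative has to be zero. We have to compute

$$\frac{\partial}{\partial \hat{\mathbf{x}}} J(\hat{\mathbf{x}}) = \mathbf{0} \tag{5-19}$$

and solve the equation for $\hat{\mathbf{x}}$. For the differentiation of scalar‐valued matrix expressions with respect to a vector, the following relations hold, the last one for a symmetric matrix $\mathbf{A}$:

$$\frac{\partial}{\partial \mathbf{x}}\left(\mathbf{a}^T\mathbf{x}\right) = \frac{\partial}{\partial \mathbf{x}}\left(\mathbf{x}^T\mathbf{a}\right) = \mathbf{a} \quad \text{and} \quad \frac{\partial}{\partial \mathbf{x}}\left(\mathbf{x}^T\mathbf{A}\,\mathbf{x}\right) = 2\,\mathbf{A}\,\mathbf{x}. \tag{5-20}$$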
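Applying these relations to equation (5‐18) gives:

$$\begin{aligned}
\frac{\partial}{\partial \hat{\mathbf{x}}} J(\hat{\mathbf{x}}) &= \mathbf{0} - \mathbf{M}^T\mathbf{y} - \mathbf{M}^T\mathbf{y} + 2\,\mathbf{M}^T\mathbf{M}\,\hat{\mathbf{x}} = \mathbf{0} \\
\Rightarrow \mathbf{M}^T\mathbf{M}\,\hat{\mathbf{x}} &= \mathbf{M}^T\mathbf{y}.
\end{aligned} \tag{5-21}$$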
At this point, one might be tempted to multiply by the inverse of $\mathbf{M}^T$ from the left, followed by a multiplication by the inverse of $\mathbf{M}$, again from the left, to obtain the final solution $\hat{\mathbf{x}} = \mathbf{M}^{-1}\mathbf{y}$. However, this solution is only suitable if $\mathbf{M}$ (and consequently $\mathbf{M}^T$) is invertible, for which it necessarily has to be square. By inspecting the definition of $\mathbf{M}$ and its content in equations (5‐12) and (5‐15), we see that the number of rows is determined by the number of experiments executed, $n$, while the number of columns relates to the number $m$ of parameters to estimate in the parameter vector $\hat{\mathbf{x}}$. It is easy to see that solving the problem currently under discussion for $n = m$ is identical to solving a linear equation system with $n$ equations and $m$ unknown parameters. Under the premise that the $n$ equations are linearly independent, $\hat{\mathbf{x}} = \mathbf{M}^{-1}\mathbf{y}$ is indeed the right solution. However, we have to consider the measurement noise $v$ which is added to the system output according to Figure 5‐2. In order to minimize its influence, we will usually execute more measurements than we have parameters to estimate, so that $n > m$, but the resulting equation system is inconsistent due to the noise. We need to find an approach that will adapt the parameters to all available measurements. By multiplying equation (5‐21) by the inverse of $\mathbf{M}^T\mathbf{M}$ from the left side, we obtain the following solution to our problem:

$$\hat{\mathbf{x}} = \left(\mathbf{M}^T\mathbf{M}\right)^{-1}\mathbf{M}^T\mathbf{y}. \tag{5-22}$$
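Note that in comparison to the simple solution $\hat{\mathbf{x}} = \mathbf{M}^{-1}\mathbf{y}$, the inverse of $\mathbf{M}$, which might not exist in all cases, was replaced by the term $\left(\mathbf{M}^T\mathbf{M}\right)^{-1}\mathbf{M}^T$, which is also referred to as the Moore–Penrose pseudoinverse (see Moore, 1920, and Penrose, 1955), written as $\mathbf{M}^+$. The pseudoinverse fulfills the following equations:

$$\mathbf{M}\,\mathbf{M}^+\mathbf{M} = \mathbf{M} \quad \text{and} \quad \mathbf{M}^+\mathbf{M}\,\mathbf{M}^+ = \mathbf{M}^+. \tag{5-23}$$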
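The direct solution can be tried out on the example model of equation (5‐11). The sketch below uses NumPy and assumes that $\omega$ and $T$ are known; the true parameter values, the noise level, and all variable names are hypothetical choices for illustration. It computes the estimate via the normal equations of (5‐22), via the pseudoinverse, and via NumPy's own least‐squares routine, which is a library call rather than a method from this text.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

omega, T, n = 2.0, 0.1, 50          # assumed known frequency, sample time, sample count
x_true = np.array([1.5, -0.7])      # hypothetical true parameters
k = np.arange(1, n + 1)

# Design matrix: row k is m(k) = [1, sin(omega k T)], cf. equation (5-13)
M = np.column_stack([np.ones(n), np.sin(omega * k * T)])

# Noisy measurements according to equation (5-14)
y = M @ x_true + rng.normal(0.0, 0.05, size=n)

# Equation (5-22), solved as a linear system instead of forming the inverse
x_normal = np.linalg.solve(M.T @ M, M.T @ y)

# The same estimate via the Moore-Penrose pseudoinverse
M_pinv = np.linalg.pinv(M)
x_pinv = M_pinv @ y

# Library least-squares solver, numerically more robust for ill-conditioned M
x_lstsq, *_ = np.linalg.lstsq(M, y, rcond=None)
```

All three routes should agree up to rounding errors as long as the columns of $\mathbf{M}$ are linearly independent. In practice, a dedicated least‐squares solver is often preferred over the explicit normal equations, because forming $\mathbf{M}^T\mathbf{M}$ squares the condition number of the problem.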
By that, we can handle situations where $n > m$. The pseudoinverse can be computed in the stated way if $\left(\mathbf{M}^T\mathbf{M}\right)^{-1}$ exists, that is, if the columns of $\mathbf{M}$ are linearly independent. This can be achieved by a proper selection of the single row vectors $\mathbf{m}(k)$ in $\mathbf{M}$, which usually depend on the input signal(s). As stated above for a linear equation system, at least $m$ linearly independent equations are necessary, so the input values have to be selected accordingly. We still have to prove that the solution of equation (5‐22) is really a minimum of the cost function. To this end, we compute the second derivative of $J(\hat{\mathbf{x}})$:

$$\frac{\partial^2}{\partial \hat{\mathbf{x}}^2} J(\hat{\mathbf{x}}) = \frac{\partial}{\partial \hat{\mathbf{x}}}\left(-2\,\mathbf{M}^T\mathbf{y} + 2\,\mathbf{M}^T\mathbf{M}\,\hat{\mathbf{x}}\right) = 2\,\mathbf{M}^T\mathbf{M}. \tag{5-24}$$
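For any real matrix $\mathbf{M}$, the product $\mathbf{M}^T\mathbf{M}$ is always positive semi‐definite. If $\mathbf{M}$ was selected properly according to the above discussions, then $\mathbf{M}^T\mathbf{M}$ is invertible, which implies that all eigenvalues differ from zero. Combining these two statements, the term $2\,\mathbf{M}^T\mathbf{M}$ is always positive definite, which proves that the extremum according to equation (5‐21) is indeed a minimum. The algorithm according to equation (5‐22) is also referred to as linear regression. We should study its properties, especially whether it is an unbiased estimator. This is the case if the expected value of the estimation error is zero. Employing at first equation (5‐21) and afterwards the second equation of (5‐16), we can write:

$$\begin{aligned}
E\{\mathbf{x} - \hat{\mathbf{x}}\} &= E\left\{\mathbf{x} - \left(\mathbf{M}^T\mathbf{M}\right)^{-1}\mathbf{M}^T\mathbf{y}\right\} = E\left\{\mathbf{x} - \left(\mathbf{M}^T\mathbf{M}\right)^{-1}\mathbf{M}^T\left(\mathbf{M}\mathbf{x} + \mathbf{v}\right)\right\} \\
&= E\left\{\mathbf{x} - \mathbf{x} - \left(\mathbf{M}^T\mathbf{M}\right)^{-1}\mathbf{M}^T\mathbf{v}\right\} = -\left(\mathbf{M}^T\mathbf{M}\right)^{-1}\mathbf{M}^T E\{\mathbf{v}\}.
\end{aligned} \tag{5-25}$$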
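As we have assumed that the measurement noise is zero‐mean, it holds true that $E\{\mathbf{v}\} = \mathbf{0}$, which proves that the estimator is unbiased. Its covariance matrix $\mathbf{P}$ can be computed to be

$$\begin{aligned}
\mathbf{P} &= E\left\{\left(\mathbf{x} - \hat{\mathbf{x}}\right)\left(\mathbf{x} - \hat{\mathbf{x}}\right)^T\right\} = E\left\{\left(\mathbf{M}^T\mathbf{M}\right)^{-1}\mathbf{M}^T\mathbf{v}\mathbf{v}^T\mathbf{M}\left(\mathbf{M}^T\mathbf{M}\right)^{-1}\right\} \\
&= \left(\mathbf{M}^T\mathbf{M}\right)^{-1}\mathbf{M}^T E\left\{\mathbf{v}\mathbf{v}^T\right\}\mathbf{M}\left(\mathbf{M}^T\mathbf{M}\right)^{-1}, \\
\text{with } &\left(\left(\mathbf{M}^T\mathbf{M}\right)^{-1}\right)^T = \left(\mathbf{M}^T\mathbf{M}\right)^{-1}.
\end{aligned} \tag{5-26}$$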
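Note that $E\{\mathbf{v}\mathbf{v}^T\}$ is exactly the covariance matrix $\mathbf{R}$ of the measurement noise. If $\mathbf{R}$ can be expressed according to equation (5-17), that is, $\mathbf{R} = \sigma^2\mathbf{I}$, we can write:

$$\mathbf{P}_{\hat{\mathbf{x}}} = (\mathbf{M}^T\mathbf{M})^{-1}\mathbf{M}^T\,\sigma^2\mathbf{I}\,\mathbf{M}(\mathbf{M}^T\mathbf{M})^{-1} = \sigma^2(\mathbf{M}^T\mathbf{M})^{-1} . \tag{5-27}$$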
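As an illustration of the linear regression according to equation (5-22), the following minimal sketch fits a straight line $y = x_1 u + x_2$ to a handful of samples by solving the normal equations $(\mathbf{M}^T\mathbf{M})\hat{\mathbf{x}} = \mathbf{M}^T\mathbf{y}$ in closed form. The model, the function name `fit_line`, and the sample values are assumptions chosen for this example only; they do not appear in the original text.

```python
def fit_line(inputs, outputs):
    """Least-squares fit of y = x1*u + x2 via the normal equations (M^T M) x = M^T y.

    Each row of M is [u_k, 1]; the 2x2 system is inverted in closed form.
    """
    n = len(inputs)
    s_uu = sum(u * u for u in inputs)                     # entry (1,1) of M^T M
    s_u = sum(inputs)                                     # entries (1,2) and (2,1) of M^T M
    s_uy = sum(u * y for u, y in zip(inputs, outputs))    # first entry of M^T y
    s_y = sum(outputs)                                    # second entry of M^T y
    det = s_uu * n - s_u * s_u                            # det(M^T M)
    if det == 0:
        # Happens when all inputs coincide: the columns of M are linearly dependent.
        raise ValueError("M^T M is singular; choose linearly independent inputs")
    slope = (n * s_uy - s_u * s_y) / det
    offset = (s_uu * s_y - s_u * s_uy) / det
    return slope, offset


# Hypothetical noisy samples of a line with slope close to 2 and offset close to 1.
slope, offset = fit_line([0.0, 1.0, 2.0, 3.0], [1.1, 2.9, 5.2, 6.8])
```

The singularity check corresponds to the requirement stated above that the columns of $\mathbf{M}$ be linearly independent, which here means that the inputs must not all be identical.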
Summarizing our discussions, we can state that the linear regression is an unbiased estimator with a minimal variance of the estimation error. As was shown, it fits a given model approach $f(\mathbf{x})$ to given measurements in an optimal manner, where optimality is achieved in terms of the minimal sum of the squared errors between measured and estimated output values. But it has to be kept in mind that no statement is possible about the general eligibility of the chosen model approach $f$, which might differ tremendously from the true system structure.

5.1.2.2 Iterative Solution

The approach discussed before cannot be employed if the parameters enter into the model approach in a nonlinear manner. Using the same analogy as before, we can say that a parameter-nonlinear model will result in a nonlinear equation system for which no analytical solution exists. Instead, a numerical approach has to be used. That means an initialization is performed, starting from an initial parameter estimate $\hat{\mathbf{x}}_0$ based on educated guessing, and the iteration counter $i$ is set to 0. Afterwards, an iterative loop is executed several times; in every pass, a new estimate $\hat{\mathbf{x}}_{i+1}$ is computed from the current estimate $\hat{\mathbf{x}}_i$, until a certain quality criterion is met or a maximum number of executions $i_{\max}$ is reached.

Among the several available methods, we will discuss the line search strategy. It consists of three steps that have to be executed within every loop iteration: At first, a descent direction is computed that is supposed to point from the current estimate $\hat{\mathbf{x}}_i$ in a direction where $J(\hat{\mathbf{x}})$ is descending. Afterwards, it is decided how far the estimate is moved in that descent direction. Finally, the new estimate $\hat{\mathbf{x}}_{i+1}$ is computed, and $i$ is increased. In the following, we will discuss these three steps.

To compute a suitable descent direction, several methods can be employed. In the simplest case, the so-called gradient descent is employed. It is based on the idea of computing the gradient of $J(\hat{\mathbf{x}})$ at the current estimate $\hat{\mathbf{x}}_i$ and using the negative gradient as the direction in which to proceed. That way, the descent direction $\boldsymbol{\eta}_i \in \mathbb{R}^m$ can be computed to be:
$$\boldsymbol{\eta}_i = -\nabla J(\hat{\mathbf{x}})\Big|_{\hat{\mathbf{x}} = \hat{\mathbf{x}}_i} , \tag{5-28}$$
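where $\nabla J(\hat{\mathbf{x}}_i)$ is the gradient of $J$, developed at $\hat{\mathbf{x}}_i$. With $\mathrm{d}J(\hat{\mathbf{x}}) \in \mathbb{R}$ being the first differential of $J$ at point $\hat{\mathbf{x}}$,

$$\mathrm{d}J(\hat{\mathbf{x}}) = \frac{\partial J(\hat{\mathbf{x}})}{\partial\hat{\mathbf{x}}^T}\,\mathrm{d}\hat{\mathbf{x}} , \tag{5-29}$$

we can apply the so-called first identification theorem according to Magnus and Neudecker, 1999, which gives

$$\nabla J(\hat{\mathbf{x}}) = \frac{\mathrm{d}J(\hat{\mathbf{x}})}{\mathrm{d}\hat{\mathbf{x}}} , \tag{5-30}$$

to compute $\nabla J(\hat{\mathbf{x}})$. Especially for a differential given as $\mathbf{a}^T\mathrm{d}\hat{\mathbf{x}}$, it is easy to compute the gradient:

$$\mathrm{d}J(\hat{\mathbf{x}}) = \mathbf{a}^T\mathrm{d}\hat{\mathbf{x}} \;\Rightarrow\; \nabla J(\hat{\mathbf{x}}) = \mathbf{a} . \tag{5-31}$$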
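As one can see, this method requires the cost function to be differentiable at $\hat{\mathbf{x}}_i$. If it can be guaranteed that $J(\hat{\mathbf{x}})$ is even twice differentiable at $\hat{\mathbf{x}}_i$, the so-called Newton's method can be employed, which usually finds a better-suited descent direction and therefore accelerates the iteration at the cost of higher computational requirements. In this method, $\boldsymbol{\eta}_i$ is given as:

$$\boldsymbol{\eta}_i = -\left[\nabla^2 J(\hat{\mathbf{x}})\right]^{-1}\nabla J(\hat{\mathbf{x}})\Big|_{\hat{\mathbf{x}} = \hat{\mathbf{x}}_i} , \tag{5-32}$$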
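where $\mathbf{H}(\hat{\mathbf{x}}) = \nabla^2 J(\hat{\mathbf{x}}) \in \mathbb{R}^{m\times m}$ is the Hessian matrix containing the second-order partial derivatives of the cost function:

$$\mathbf{H}(\hat{\mathbf{x}}) = \begin{bmatrix} \dfrac{\partial^2 J(\hat{\mathbf{x}})}{\partial\hat{x}_1^2} & \dfrac{\partial^2 J(\hat{\mathbf{x}})}{\partial\hat{x}_1\,\partial\hat{x}_2} & \cdots & \dfrac{\partial^2 J(\hat{\mathbf{x}})}{\partial\hat{x}_1\,\partial\hat{x}_m} \\ \dfrac{\partial^2 J(\hat{\mathbf{x}})}{\partial\hat{x}_2\,\partial\hat{x}_1} & \dfrac{\partial^2 J(\hat{\mathbf{x}})}{\partial\hat{x}_2^2} & \cdots & \dfrac{\partial^2 J(\hat{\mathbf{x}})}{\partial\hat{x}_2\,\partial\hat{x}_m} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 J(\hat{\mathbf{x}})}{\partial\hat{x}_m\,\partial\hat{x}_1} & \dfrac{\partial^2 J(\hat{\mathbf{x}})}{\partial\hat{x}_m\,\partial\hat{x}_2} & \cdots & \dfrac{\partial^2 J(\hat{\mathbf{x}})}{\partial\hat{x}_m^2} \end{bmatrix} . \tag{5-33}$$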
Newton's method only provides a descent direction if $\mathbf{H}(\hat{\mathbf{x}}_i)$ is positive definite. If this is not the case, the modified Newton's method can be used, which is not discussed in detail here.

The second step is to find an optimal step size, represented by the scalar $\alpha_i$, that will be multiplied with $\boldsymbol{\eta}_i$ to determine how far the estimate will be moved in the computed descent direction. The selection of this variable is of great importance for the performance of the algorithm. Even if $\boldsymbol{\eta}_i$ points perfectly in the direction from the current estimate $\hat{\mathbf{x}}_i$ to the global minimum of the cost function, the following might happen: If $\alpha_i$ is chosen too small, the improvement of the estimate in the current iteration step will be quite small, and the overall algorithm might need too much time to find the minimum of the cost function. On the other hand, if $\alpha_i$ is chosen too big, the new estimate might overshoot the minimum and end up at a position where the cost function is clearly bigger. The perfect solution would be to find the $\alpha_i$ that results in the lowest cost value on the line that starts at $\hat{\mathbf{x}}_i$ and extends in the direction of $\boldsymbol{\eta}_i$. This is, strictly speaking, another nonlinear optimization problem that can be formulated as:

$$\alpha_i = \arg\min_{\alpha} J\left(\hat{\mathbf{x}}_i + \alpha\,\boldsymbol{\eta}_i\right) . \tag{5-34}$$
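In order to save computational effort, it is common to employ the so-called Armijo rule, which returns a step size that might not be optimal in terms of equation (5-34), but guarantees a reasonable descent of the cost function in the step to be executed. The step size is computed as follows: With $\beta, \rho \in (0,1)$ being tuning parameters, the step size is set to $\alpha_i = \beta^l$, where $l \in \mathbb{Z}$ is the smallest integer to fulfill the following inequality:

$$J\left(\hat{\mathbf{x}}_i + \beta^l\boldsymbol{\eta}_i\right) \le J(\hat{\mathbf{x}}_i) + \rho\,\beta^l\,\frac{\partial J(\hat{\mathbf{x}})}{\partial\hat{\mathbf{x}}^T}\bigg|_{\hat{\mathbf{x}} = \hat{\mathbf{x}}_i}\boldsymbol{\eta}_i . \tag{5-35}$$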
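It is common to start with $l = 1$. If the inequality is fulfilled, $l$ is decreased until it is not fulfilled; otherwise it is increased until it is fulfilled. Finally, the new estimate is computed as

$$\hat{\mathbf{x}}_{i+1} = \hat{\mathbf{x}}_i + \alpha_i\,\boldsymbol{\eta}_i \quad\text{with}\quad \alpha_i = \beta^l . \tag{5-36}$$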
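The three steps of the line search (descent direction, step size, update) can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the quadratic cost function, the tuning values for $\beta$ and $\rho$, and all function names are assumptions made for the example. For simplicity, the exponent $l$ is searched upwards from 0 (plain backtracking) instead of starting at $l = 1$ and moving in both directions, and the loop stops once the norm of the descent direction falls below a small threshold.

```python
import math


def cost(x):
    # Hypothetical smooth cost function with its minimum at (1.0, -0.5).
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2


def gradient(x):
    # Analytic gradient of the hypothetical cost function above.
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)]


def armijo_step(x, eta, grad, beta=0.5, rho=0.1, max_l=50):
    """Return alpha = beta**l for the first l = 0, 1, 2, ... fulfilling inequality (5-35)."""
    slope = sum(g * e for g, e in zip(grad, eta))  # gradient^T * eta, negative for a descent direction
    j0 = cost(x)
    for l in range(max_l):
        alpha = beta ** l
        trial = [xi + alpha * ei for xi, ei in zip(x, eta)]
        if cost(trial) <= j0 + rho * alpha * slope:
            return alpha
    return beta ** max_l  # fallback: very small step


def line_search(x0, eps=1e-8, max_iter=1000):
    x = list(x0)
    for _ in range(max_iter):
        grad = gradient(x)
        eta = [-g for g in grad]                       # gradient descent direction, equation (5-28)
        if math.sqrt(sum(e * e for e in eta)) < eps:   # stop when the direction becomes negligible
            break
        alpha = armijo_step(x, eta, grad)
        x = [xi + alpha * ei for xi, ei in zip(x, eta)]  # update, equation (5-36)
    return x


estimate = line_search([0.0, 0.0])
```

Replacing `eta` by the Newton direction of equation (5-32) would only change one line of `line_search`; the step-size rule and the update stay the same.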
This concludes the current iteration loop. The loop is entered again with an increased $i$, as long as the abort criterion has not been reached. It is common to compare the norm of the descent direction against a predefined small value $\varepsilon$ and to terminate if

$$\|\boldsymbol{\eta}_i\| < \varepsilon \tag{5-37}$$
holds true. As we have now introduced methods both for linear and nonlinear optimization, we can return to our primary problem of static range-only localization.

5.1.3 Position Estimation Based on Squared Range Measurements

As stated before, sections 5.1.3 to 5.1.5 borrow heavily from Alcocer, 2009, and the references therein, where the topic is discussed at a more detailed level. At this point, we strive to give an overview of possible solutions to the static estimation problem, before shifting over to the dynamic one. Also, the methods introduced in section 5.1.4 will later be employed to validate the theoretical discussions on Optimal Sensor Placement in section 6.3.3.
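In the situation at hand, we can say that the vector with the unknown target position $\mathbf{p}_0$ acts as the parameter vector. We are looking for an optimal estimate $\hat{\mathbf{p}}_0$. Both the real system description according to equation (3-1) and the employed model according to equation (5-4) are parameter nonlinear, as the parameter vector is inside a norm, which includes squaring and extracting the root. As a consequence, one could use the iterative method as described above. To be able to find a direct solution, we might try to use the squared ranges instead of the 'regular' ones, which would already cancel the square root. But at first we have to investigate the properties of the squared range measurements.

5.1.3.1 Properties of Squared Range Measurements

Let

$$d_i = r_i^2 = \|\mathbf{p}_0 - \mathbf{p}_i\|^2 = (\mathbf{p}_0 - \mathbf{p}_i)^T(\mathbf{p}_0 - \mathbf{p}_i) = \mathbf{p}_0^T\mathbf{p}_0 - 2\mathbf{p}_i^T\mathbf{p}_0 + \mathbf{p}_i^T\mathbf{p}_i ;\quad i = 1,\dots,n \tag{5-38}$$

be the squared ranges,

$$\hat{d}_i = \hat{r}_i^2 = \|\hat{\mathbf{p}}_0 - \mathbf{p}_i\|^2 ;\quad i = 1,\dots,n \tag{5-39}$$

be the squared estimated ranges, and

$$\grave{d}_i = \grave{r}_i^2 = (r_i + v_i)^2 = r_i^2 + 2 r_i v_i + v_i^2 ;\quad i = 1,\dots,n \tag{5-40}$$

be the squared measured ranges, which can easily be obtained from the true measurement values. It can be stated that we generate a kind of pseudo-measurements. The question arises: If we write the pseudo-measurements in the same style as the true ones, namely real value plus added noise, given as

$$\grave{d}_i = d_i + \omega_i ;\quad i = 1,\dots,n , \tag{5-41}$$

can we then assume that the pseudo measurement noise $\omega_i$ is still zero-mean white Gaussian noise? This is a precondition for the employment of the direct solution for the parameter estimation problem according to section 5.1.2.1. By transferring equations (5-40) and (5-41) into vector form, we obtain

$$\grave{\mathbf{d}} = \mathbf{d} + \boldsymbol{\omega} = \mathbf{r}\circ\mathbf{r} + 2\,\mathbf{r}\circ\mathbf{v} + \mathbf{v}\circ\mathbf{v} \;\Rightarrow\; \boldsymbol{\omega} = 2\,\mathbf{r}\circ\mathbf{v} + \mathbf{v}\circ\mathbf{v} ,$$

where, for $\mathbf{A}, \mathbf{B} \in \mathbb{R}^{m\times n}$, the Hadamard product $\mathbf{A}\circ\mathbf{B}$ is given by

$$\mathbf{A}\circ\mathbf{B} = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \circ \begin{bmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & \ddots & \vdots \\ b_{m1} & \cdots & b_{mn} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} & \cdots & a_{1n}b_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1}b_{m1} & \cdots & a_{mn}b_{mn} \end{bmatrix} ,$$

and for any pair of vectors $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$, it holds true that $\mathbf{a}\circ\mathbf{b} = \operatorname{diag}(\mathbf{a})\,\mathbf{b}$, thus:

$$\boldsymbol{\omega} = 2\operatorname{diag}(\mathbf{r})\,\mathbf{v} + \mathbf{v}\circ\mathbf{v} . \tag{5-42}$$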
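By looking at the equation for $\boldsymbol{\omega}$, one might already suspect that it is not Gaussian, as it contains the squares of the Gaussian variables $\mathbf{v}$. Computation of the expected value yields:

$$E\{\boldsymbol{\omega}\} = E\{2\,\mathbf{r}\circ\mathbf{v} + \mathbf{v}\circ\mathbf{v}\} = 2\,\mathbf{r}\circ E\{\mathbf{v}\} + E\{\mathbf{v}\circ\mathbf{v}\} = E\left\{\begin{bmatrix} v_1^2 \\ \vdots \\ v_n^2 \end{bmatrix}\right\} = \begin{bmatrix} \sigma_1^2 \\ \vdots \\ \sigma_n^2 \end{bmatrix} = \operatorname{diag}(\mathbf{R}) , \tag{5-43}$$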
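where the operation $\operatorname{diag}(\cdot)$, applied to a square matrix, returns a vector containing the elements on the main diagonal of the matrix, and $E\{\mathbf{v}\} = \mathbf{0}$ by definition. That already proves that the squared error is not zero-mean. For the sake of completeness, the covariance matrix $\mathbf{P}_{\boldsymbol{\omega}}$ can be computed to be:

$$\mathbf{P}_{\boldsymbol{\omega}} = E\left\{(\boldsymbol{\omega} - E\{\boldsymbol{\omega}\})(\boldsymbol{\omega} - E\{\boldsymbol{\omega}\})^T\right\} = E\{\boldsymbol{\omega}\boldsymbol{\omega}^T\} - E\{\boldsymbol{\omega}\}E\{\boldsymbol{\omega}\}^T = E\{\boldsymbol{\omega}\boldsymbol{\omega}^T\} - \operatorname{diag}(\mathbf{R})\operatorname{diag}(\mathbf{R})^T , \tag{5-44}$$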
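according to the definition of the covariance matrix, and using equation (5-43) in the final transformation. The first summand gives:

$$E\{\boldsymbol{\omega}\boldsymbol{\omega}^T\} = E\left\{\left(2\operatorname{diag}(\mathbf{r})\mathbf{v} + \mathbf{v}\circ\mathbf{v}\right)\left(2\operatorname{diag}(\mathbf{r})\mathbf{v} + \mathbf{v}\circ\mathbf{v}\right)^T\right\} = 4\operatorname{diag}(\mathbf{r})E\{\mathbf{v}\mathbf{v}^T\}\operatorname{diag}(\mathbf{r}) + E\left\{(\mathbf{v}\circ\mathbf{v})(\mathbf{v}\circ\mathbf{v})^T\right\} + 2\operatorname{diag}(\mathbf{r})E\left\{\mathbf{v}(\mathbf{v}\circ\mathbf{v})^T\right\} + 2E\left\{(\mathbf{v}\circ\mathbf{v})\mathbf{v}^T\right\}\operatorname{diag}(\mathbf{r}) . \tag{5-45}$$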
To proceed, we make use of Isserlis' theorem (Isserlis, 1918). It states that, for a multivariate zero-mean normally distributed vector $\mathbf{X} = [X_1\; X_2\; \cdots\; X_{2n}]^T$, $n \in \mathbb{N}$, the following holds true:

$$E\{X_1 X_2 \cdots X_{2n}\} = \sum\prod E\{X_i X_j\} , \qquad E\{X_1 X_2 \cdots X_{2n-1}\} = 0 , \tag{5-46}$$

where $\sum\prod E\{X_i X_j\}$ returns the sum over all distinct ways to group $X_1 X_2 \cdots X_{2n}$ into pairs $X_i X_j$, of which always $n$ are multiplied. This results in the expected value of a product of three zero-mean multivariate normally distributed variables being zero. As a consequence, the third and fourth summand in equation (5-45) become zero. For a product of four variables, it follows from equation (5-46) that

$$E\{X_1 X_2 X_3 X_4\} = E\{X_1 X_2\}E\{X_3 X_4\} + E\{X_1 X_3\}E\{X_2 X_4\} + E\{X_1 X_4\}E\{X_2 X_3\} . \tag{5-47}$$
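We can continue to transform equation (5-45) to obtain:

$$E\{\boldsymbol{\omega}\boldsymbol{\omega}^T\} = 4\operatorname{diag}(\mathbf{r})E\{\mathbf{v}\mathbf{v}^T\}\operatorname{diag}(\mathbf{r}) + E\left\{(\mathbf{v}\circ\mathbf{v})(\mathbf{v}\circ\mathbf{v})^T\right\} = 4\operatorname{diag}(\mathbf{r})E\{\mathbf{v}\mathbf{v}^T\}\operatorname{diag}(\mathbf{r}) + E\left\{\begin{bmatrix} v_1^4 & \cdots & v_1^2 v_n^2 \\ \vdots & \ddots & \vdots \\ v_n^2 v_1^2 & \cdots & v_n^4 \end{bmatrix}\right\} . \tag{5-48}$$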
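Employing equation (5-47), we see that $E\{v_i^4\} = 3\sigma_i^4$ and $E\{v_i^2 v_j^2\} = \sigma_i^2\sigma_j^2 + 2\left(E\{v_i v_j\}\right)^2$. With $R_{ij} = E\{v_i v_j\}$ denoting the entries of $\mathbf{R}$ (so that $R_{ii} = \sigma_i^2$), this gives:

$$E\{\boldsymbol{\omega}\boldsymbol{\omega}^T\} = 4\operatorname{diag}(\mathbf{r})E\{\mathbf{v}\mathbf{v}^T\}\operatorname{diag}(\mathbf{r}) + \begin{bmatrix} 3\sigma_1^4 & \cdots & \sigma_1^2\sigma_n^2 + 2R_{1n}^2 \\ \vdots & \ddots & \vdots \\ \sigma_n^2\sigma_1^2 + 2R_{n1}^2 & \cdots & 3\sigma_n^4 \end{bmatrix} = 4\operatorname{diag}(\mathbf{r})E\{\mathbf{v}\mathbf{v}^T\}\operatorname{diag}(\mathbf{r}) + 2\,\mathbf{R}\circ\mathbf{R} + \operatorname{diag}(\mathbf{R})\operatorname{diag}(\mathbf{R})^T . \tag{5-49}$$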
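Together with equation (5-44), this gives:

$$\mathbf{P}_{\boldsymbol{\omega}} = 4\operatorname{diag}(\mathbf{r})E\{\mathbf{v}\mathbf{v}^T\}\operatorname{diag}(\mathbf{r}) + 2\,\mathbf{R}\circ\mathbf{R} . \tag{5-50}$$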
To check whether $\boldsymbol{\omega}$ is normally distributed, its PDF has been generated based on numerical simulations of $\mathbf{v}$ with $\sigma^2 = 1\,\mathrm{m}^2$, followed by the application of equation (5-42) with $r = 5\,\mathrm{m}$. The resulting PDF is displayed in blue in the left diagram of Figure 5-3. It is easy to see that it is not symmetric around its maximum; therefore, it is not a Gaussian distribution.
Figure 5-3: Probability density functions, based on numerical simulations, for real and simplified squared range measurement errors, with different relations between range $r$ and single range measurement error variance $\sigma^2$
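A numerical experiment of this kind can be sketched with the standard library alone. The snippet below draws scalar noise samples $v$, forms $\omega = 2rv + v^2$ as in equation (5-42), and returns the sample mean, variance, and skewness. The function name, the sample count, and the fixed seed are assumptions of this sketch; it is not the simulation code behind Figure 5-3. For a single range with noise variance $\sigma^2$, equations (5-43) and (5-50) predict a mean of $\sigma^2$ and a variance of $4r^2\sigma^2 + 2\sigma^4$, and a clearly non-zero skewness indicates the asymmetry of the distribution.

```python
import random


def simulate_squared_range_noise(r=5.0, sigma=1.0, samples=200_000, seed=0):
    """Monte Carlo statistics of omega = 2*r*v + v**2 for v ~ N(0, sigma**2)."""
    rng = random.Random(seed)  # fixed seed so that repeated runs give the same numbers
    omega = []
    for _ in range(samples):
        v = rng.gauss(0.0, sigma)
        omega.append(2.0 * r * v + v * v)
    mean = sum(omega) / samples
    var = sum((w - mean) ** 2 for w in omega) / samples
    third_moment = sum((w - mean) ** 3 for w in omega) / samples
    skewness = third_moment / var ** 1.5  # zero for a symmetric (e.g. Gaussian) distribution
    return mean, var, skewness
```

Calling `simulate_squared_range_noise(r=50.0)` instead illustrates the regime in which the range is much larger than the noise: the skewness shrinks towards zero, while the mean stays at $\sigma^2$.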
However, for realistic scenarios, it can usually be assumed that the measured ranges are much larger than the measurement error, as otherwise the measurement data could be considered useless. Under the assumption $r_i \gg \sigma_i$, we may introduce a simplified pseudo measurement noise $\tilde{\boldsymbol{\omega}}$ as

$$\tilde{\boldsymbol{\omega}} = 2\,\mathbf{r}\circ\mathbf{v} . \tag{5-51}$$
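It is easy to see that this noise is zero-mean, and that it features a Gaussian distribution. Figure 5-3 displays the PDFs of the simplified error as dashed lines. In the left diagram, where the demanded condition is not fulfilled, one can note a clear difference between the PDFs for the real and the simplified error. In the right diagram, both PDFs are almost identical. We therefore require $r_i \gg \sigma_i$ to hold, and assume the pseudo measurement noise to be zero-mean and Gaussian. As we did in section 5.1.1, we also introduce the vectors $\mathbf{d} = [d_1 \dots d_n]^T$, $\hat{\mathbf{d}} = [\hat{d}_1 \dots \hat{d}_n]^T$, $\grave{\mathbf{d}} = [\grave{d}_1 \dots \grave{d}_n]^T$, and $\boldsymbol{\omega} = [\omega_1 \dots \omega_n]^T$. From equation (5-38), it follows that

$$\mathbf{d} = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix} = \begin{bmatrix} \mathbf{p}_0^T\mathbf{p}_0 - 2\mathbf{p}_1^T\mathbf{p}_0 + \mathbf{p}_1^T\mathbf{p}_1 \\ \vdots \\ \mathbf{p}_0^T\mathbf{p}_0 - 2\mathbf{p}_n^T\mathbf{p}_0 + \mathbf{p}_n^T\mathbf{p}_n \end{bmatrix} = \|\mathbf{p}_0\|^2\begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} - 2\begin{bmatrix} \mathbf{p}_1^T \\ \vdots \\ \mathbf{p}_n^T \end{bmatrix}\mathbf{p}_0 + \begin{bmatrix} \mathbf{p}_1^T\mathbf{p}_1 \\ \vdots \\ \mathbf{p}_n^T\mathbf{p}_n \end{bmatrix} . \tag{5-52}$$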
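Looking at the second summand and at equation (5-1), it becomes clear that the term in the square bracket equals the transpose of the position matrix $\mathbf{P}$. The third summand is a vector containing the squared norms of the single position vectors of the ROs. We find the same expressions on the main diagonal of the matrix $\mathbf{P}^T\mathbf{P}$. If we replace $\mathbf{p}_0$ by $\hat{\mathbf{p}}_0$, we retrieve an equation to compute the estimated squared ranges $\hat{\mathbf{d}}$:

$$\mathbf{d} = \|\mathbf{p}_0\|^2\mathbf{1}_n - 2\mathbf{P}^T\mathbf{p}_0 + \operatorname{diag}(\mathbf{P}^T\mathbf{P}) , \qquad \hat{\mathbf{d}} = \|\hat{\mathbf{p}}_0\|^2\mathbf{1}_n - 2\mathbf{P}^T\hat{\mathbf{p}}_0 + \operatorname{diag}(\mathbf{P}^T\mathbf{P}) . \tag{5-53}$$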
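Note that the second equation expresses the estimated pseudo-measurements with respect to the unknown parameters to estimate and is therefore related to the first equation in (5-16). The next step is to introduce the estimation error $\mathbf{e}$ between the measured and the estimated squared ranges according to equation (5-15):

$$\mathbf{e} = \grave{\mathbf{d}} - \hat{\mathbf{d}} = \grave{\mathbf{d}} - \|\hat{\mathbf{p}}_0\|^2\mathbf{1}_n + 2\mathbf{P}^T\hat{\mathbf{p}}_0 - \operatorname{diag}(\mathbf{P}^T\mathbf{P}) = \left(\grave{\mathbf{d}} - \operatorname{diag}(\mathbf{P}^T\mathbf{P})\right) - \left(-2\mathbf{P}^T\hat{\mathbf{p}}_0 + \|\hat{\mathbf{p}}_0\|^2\mathbf{1}_n\right) . \tag{5-54}$$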
In the last step, we have already separated the terms on the right side into those without the parameters to estimate, $\grave{\mathbf{d}} - \operatorname{diag}(\mathbf{P}^T\mathbf{P})$, and those that contain the yet unknown estimate. The first group corresponds to $\mathbf{y}$ in equation (5-15), the second group to $\mathbf{M}\hat{\mathbf{x}}$. We are almost set for the employment of the linear regression algorithm, but there is still the problem that the $\mathbf{M}\hat{\mathbf{x}}$-part contains both the parameters to estimate, $\hat{\mathbf{p}}_0$, and their squared norm, $\|\hat{\mathbf{p}}_0\|^2$. In the following two subsections, we will discuss two possible ways to deal with this fact.

5.1.3.2 Unconstrained Least Squares Algorithm

One possible strategy to overcome the described problem is to treat $\hat{\mathbf{p}}_0$ and $\|\hat{\mathbf{p}}_0\|^2$ as independent parameters and to neglect their relation. Thus we formulate the parameter vector $\hat{\mathbf{x}} \in \mathbb{R}^{m+1}$ as

$$\hat{\mathbf{x}} = \begin{bmatrix} \hat{\mathbf{p}}_0 \\ \|\hat{\mathbf{p}}_0\|^2 \end{bmatrix} . \tag{5-55}$$
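Then we can reformulate equation (5-54) as

$$\mathbf{e} = \grave{\mathbf{d}} - \hat{\mathbf{d}} = \left(\grave{\mathbf{d}} - \operatorname{diag}(\mathbf{P}^T\mathbf{P})\right) - \begin{bmatrix} -2\mathbf{P}^T & \mathbf{1}_n \end{bmatrix}\begin{bmatrix} \hat{\mathbf{p}}_0 \\ \|\hat{\mathbf{p}}_0\|^2 \end{bmatrix} = \mathbf{y} - \mathbf{M}\hat{\mathbf{x}} ,$$

with

$$\mathbf{y} = \grave{\mathbf{d}} - \operatorname{diag}(\mathbf{P}^T\mathbf{P}) = \begin{bmatrix} \grave{d}_1 - \|\mathbf{p}_1\|^2 \\ \vdots \\ \grave{d}_n - \|\mathbf{p}_n\|^2 \end{bmatrix} ; \qquad \mathbf{M} = \begin{bmatrix} -2\mathbf{P}^T & \mathbf{1}_n \end{bmatrix} = \begin{bmatrix} -2\mathbf{p}_1^T & 1 \\ \vdots & \vdots \\ -2\mathbf{p}_n^T & 1 \end{bmatrix} . \tag{5-56}$$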
Note that the equation for the error $\mathbf{e}$ is now identical to the one in equation (5-15), which allows us to employ the described algorithm with the final solution according to equation (5-22). This will finally deliver an estimate $\hat{\mathbf{x}}$ with values for $\hat{\mathbf{p}}_0$ and $\|\hat{\mathbf{p}}_0\|^2$. As stated before, we neglect the relation between these quantities, which leads to the notation 'unconstrained' for the approach currently under discussion. We can multiply $\hat{\mathbf{x}}$ from the left side with the matrix $\mathbf{N} = [\mathbf{I}_m \;\; \mathbf{0}]$ to finally obtain the Unconstrained Least Square (LS-U) estimate of the target position, $\hat{\mathbf{p}}_{0,\text{LS-U}}$:

$$\hat{\mathbf{p}}_{0,\text{LS-U}} = \mathbf{N}(\mathbf{M}^T\mathbf{M})^{-1}\mathbf{M}^T\mathbf{y} . \tag{5-57}$$
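A minimal two-dimensional sketch of the LS-U estimate may help to make the construction of $\mathbf{y}$ and $\mathbf{M}$ concrete. The function names, the small Gaussian elimination helper, and the RO coordinates used below are assumptions of this example and are not taken from the original text. The unknown vector has three entries: the two target coordinates and the auxiliary parameter standing in for the squared norm.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("M^T M is singular; check the placement of the ROs")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def ls_u_estimate(anchors, measured_ranges):
    """Unconstrained least squares position estimate in 2D, following (5-56) and (5-57)."""
    # Rows of M: [-2*x_i, -2*y_i, 1]; entries of y: measured squared range minus ||p_i||^2.
    rows = [[-2.0 * ax, -2.0 * ay, 1.0] for ax, ay in anchors]
    y = [r * r - (ax * ax + ay * ay) for (ax, ay), r in zip(anchors, measured_ranges)]
    mtm = [[sum(row[i] * row[j] for row in rows) for j in range(3)] for i in range(3)]
    mty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(3)]
    x_hat = solve_linear(mtm, mty)
    return x_hat[:2]  # multiplication by N = [I 0] keeps only the position part


# Hypothetical set-up: four ROs and noise-free ranges to a target at (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
ranges = [((3.0 - ax) ** 2 + (4.0 - ay) ** 2) ** 0.5 for ax, ay in anchors]
position = ls_u_estimate(anchors, ranges)
```

With noise-free ranges, the linear model is exact and the estimate coincides with the true position up to rounding; with noisy ranges, the third entry of `x_hat` generally no longer equals the squared norm of the first two, which is exactly the relation the unconstrained approach neglects.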
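In order to improve the estimation result and to employ a priori knowledge about the measurement error, it is possible to add a weighting matrix $\mathbf{W} \in \mathbb{R}^{n\times n}$, which has to be positive definite and symmetric. The matrix is entered into the cost function according to equation (5-18), which gives:

$$J(\hat{\mathbf{x}}) = \mathbf{e}^T(\hat{\mathbf{x}})\,\mathbf{W}\,\mathbf{e}(\hat{\mathbf{x}}) = (\mathbf{y} - \mathbf{M}\hat{\mathbf{x}})^T\mathbf{W}(\mathbf{y} - \mathbf{M}\hat{\mathbf{x}}) = \mathbf{y}^T\mathbf{W}\mathbf{y} - \hat{\mathbf{x}}^T\mathbf{M}^T\mathbf{W}\mathbf{y} - \mathbf{y}^T\mathbf{W}\mathbf{M}\hat{\mathbf{x}} + \hat{\mathbf{x}}^T\mathbf{M}^T\mathbf{W}\mathbf{M}\hat{\mathbf{x}} , \tag{5-58}$$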
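employing the fact that $\mathbf{W}^T = \mathbf{W}$. Redoing the computations according to equations (5-19) to (5-22) and combining the result with (5-57), we obtain the Unconstrained Weighted Least Square (LS-UW) estimate $\hat{\mathbf{p}}_{0,\text{LS-UW}}$:

$$\hat{\mathbf{p}}_{0,\text{LS-UW}} = \mathbf{N}(\mathbf{M}^T\mathbf{W}\mathbf{M})^{-1}\mathbf{M}^T\mathbf{W}\mathbf{y} . \tag{5-59}$$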
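For the weighting matrix $\mathbf{W}$, it is common to use the inverse of the covariance matrix of the squared measurement error, $\mathbf{P}_{\boldsymbol{\omega}}$, which fulfills the demands on $\mathbf{W}$. As the vector with the true distances $\mathbf{r}$ is unknown, equation (5-50) cannot be used for the computation. For this reason, it is common to replace $\mathbf{r}$ by the measurement vector $\grave{\mathbf{r}}$. Using again the assumption that $r_i \gg \sigma_i$, it is straightforward to estimate the covariance matrix as:

$$\mathbf{P}_{\boldsymbol{\omega}} \approx \hat{\mathbf{P}}_{\boldsymbol{\omega}} = 4\operatorname{diag}(\grave{\mathbf{r}})\,E\{\mathbf{v}\mathbf{v}^T\}\operatorname{diag}(\grave{\mathbf{r}}) . \tag{5-60}$$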
As the LS-U and LS-UW estimates are stochastic variables, it is reasonable to study the mean and covariance of the estimation error $\mathbf{e}_{\text{LS-UW}} = \hat{\mathbf{p}}_{0,\text{LS-UW}} - \mathbf{p}_0$. We will do this for the LS-UW case, which includes the LS-U case if $\mathbf{W}$ is set to be an identity matrix. We already performed the computation of the expected error of the linear regression in equation (5-25), assuming that the expected value of the measurement error, $E\{\mathbf{v}\}$, is zero. This condition is no longer fulfilled, as we have seen in equation (5-43) that $E\{\boldsymbol{\omega}\} = \operatorname{diag}(\mathbf{R})$. Combining this information with the introduction of the matrices $\mathbf{N}$ and $\mathbf{W}$, the expected value can be computed to be:

$$E\{\mathbf{e}_{\text{LS-UW}}\} = \mathbf{N}(\mathbf{M}^T\mathbf{W}\mathbf{M})^{-1}\mathbf{M}^T\mathbf{W}\operatorname{diag}(\mathbf{R}) , \tag{5-61}$$
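which is non-zero. However, under the condition $r_i \gg \sigma_i$, we can treat it as zero. We use this for the computation of the covariance matrix of the estimation error, $\mathbf{P}_{\text{LS-UW}}$, which is given as:

$$\mathbf{P}_{\text{LS-UW}} = E\left\{(\mathbf{e}_{\text{LS-UW}} - E\{\mathbf{e}_{\text{LS-UW}}\})(\mathbf{e}_{\text{LS-UW}} - E\{\mathbf{e}_{\text{LS-UW}}\})^T\right\} \approx E\left\{\mathbf{e}_{\text{LS-UW}}\mathbf{e}_{\text{LS-UW}}^T\right\} = \cdots = \mathbf{N}(\mathbf{M}^T\mathbf{W}\mathbf{M})^{-1}\mathbf{M}^T\mathbf{W}\,E\{\boldsymbol{\omega}\boldsymbol{\omega}^T\}\,\mathbf{W}\mathbf{M}(\mathbf{M}^T\mathbf{W}\mathbf{M})^{-1}\mathbf{N}^T . \tag{5-62}$$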
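We can insert $\mathbf{P}_{\boldsymbol{\omega}} \approx E\{\boldsymbol{\omega}\boldsymbol{\omega}^T\}$. If we further set $\mathbf{W}$ to be the inverse of the squared range measurement error covariance matrix, $\mathbf{P}_{\boldsymbol{\omega}}^{-1}$, as discussed above, with $\mathbf{P}_{\boldsymbol{\omega}}$ being approximated according to equation (5-60), we can write:

$$\mathbf{P}_{\text{LS-UW}} = \mathbf{N}(\mathbf{M}^T\mathbf{W}\mathbf{M})^{-1}\mathbf{M}^T\mathbf{W}\mathbf{M}(\mathbf{M}^T\mathbf{W}\mathbf{M})^{-1}\mathbf{N}^T = \mathbf{N}(\mathbf{M}^T\mathbf{W}\mathbf{M})^{-1}\mathbf{N}^T . \tag{5-63}$$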
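5.1.3.3 Centered Least Squares Algorithm

As stated before, we will discuss another way to deal with the unwanted relation between $\hat{\mathbf{p}}_0$ and $\|\hat{\mathbf{p}}_0\|^2$ in the estimation error equation (5-54). It is possible to multiply the equation with a matrix $\boldsymbol{\Gamma} \in \mathbb{R}^{n\times n}$ that has $\mathbf{1}_n$ in its null space, which will erase the unwanted parameter $\|\hat{\mathbf{p}}_0\|^2$. A possible selection would be

$$\boldsymbol{\Gamma} = \mathbf{I}_n - \frac{1}{n}\mathbf{1}_n\mathbf{1}_n^T = \frac{1}{n}\begin{bmatrix} n-1 & -1 & \cdots & -1 \\ -1 & n-1 & \cdots & -1 \\ \vdots & \vdots & \ddots & \vdots \\ -1 & -1 & \cdots & n-1 \end{bmatrix} . \tag{5-64}$$

Note that for this matrix, it holds true that $\boldsymbol{\Gamma}\mathbf{1}_n = \mathbf{0}$, $\boldsymbol{\Gamma}^T = \boldsymbol{\Gamma}$, and $\boldsymbol{\Gamma}\boldsymbol{\Gamma} = \boldsymbol{\Gamma}$. Multiplication of equation (5-54) by $\boldsymbol{\Gamma}$ from the left yields:

$$\boldsymbol{\Gamma}\mathbf{e} = \boldsymbol{\Gamma}\left(\grave{\mathbf{d}} - \operatorname{diag}(\mathbf{P}^T\mathbf{P})\right) + 2\boldsymbol{\Gamma}\mathbf{P}^T\hat{\mathbf{p}}_0 = \mathbf{y} - \mathbf{M}\hat{\mathbf{x}} , \quad\text{with}\quad \mathbf{y} = \boldsymbol{\Gamma}\left(\grave{\mathbf{d}} - \operatorname{diag}(\mathbf{P}^T\mathbf{P})\right) ;\; \mathbf{M} = -2\boldsymbol{\Gamma}\mathbf{P}^T = -2\mathbf{P}_c^T ;\; \hat{\mathbf{x}} = \hat{\mathbf{p}}_0 , \tag{5-65}$$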
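where we introduce the centered position matrix $\mathbf{P}_c = \mathbf{P}\boldsymbol{\Gamma}$. Its name comes from the fact that it contains the coordinates of the ROs, but in a frame with its origin at the centroid of the ROs. This can be shown by introducing the coordinates of the described centroid, $\mathbf{c} \in \mathbb{R}^m$, as

$$\mathbf{c} = \frac{1}{n}\sum_{i=1}^{n}\mathbf{p}_i = \frac{1}{n}\mathbf{P}\mathbf{1}_n , \tag{5-66}$$

and computing

$$\mathbf{P}_c = \mathbf{P}\boldsymbol{\Gamma} = \mathbf{P}\left(\mathbf{I}_n - \frac{1}{n}\mathbf{1}_n\mathbf{1}_n^T\right) = \mathbf{P} - \frac{1}{n}\mathbf{P}\mathbf{1}_n\mathbf{1}_n^T = \mathbf{P} - \mathbf{c}\mathbf{1}_n^T = [\mathbf{p}_1 \dots \mathbf{p}_n] - [\mathbf{c} \dots \mathbf{c}] = [\mathbf{p}_1 - \mathbf{c} \;\dots\; \mathbf{p}_n - \mathbf{c}] . \tag{5-67}$$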
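The effect of the centering matrix can be checked numerically with a few lines of code. The sketch below builds $\boldsymbol{\Gamma}$ for a given $n$ and multiplies a small position matrix by it; the helper names and the three RO positions are assumptions of this example only.

```python
def centering_matrix(n):
    """Gamma = I_n - (1/n) * 1_n * 1_n^T as a list of rows, see equation (5-64)."""
    return [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


# Hypothetical position matrix P (2 x 3): one column per RO, rows are x and y coordinates.
P = [[0.0, 10.0, 0.0],
     [0.0, 0.0, 10.0]]
gamma = centering_matrix(3)
P_c = matmul(P, gamma)  # columns are p_i - c, with c the centroid of the three ROs
```

Every row of `P_c` sums to zero, which is the statement $\mathbf{P}_c\mathbf{1}_n = \mathbf{0}$, and multiplying `gamma` by itself reproduces `gamma`.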
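Going back to equation (5-65) and treating $\boldsymbol{\Gamma}\mathbf{e}$ as the error to minimize, we can redo the computations in equations (5-18) to (5-22) to obtain the so-called Centered Least Square (LS-C) estimate $\hat{\mathbf{p}}_{0,\text{LS-C}}$ as:

$$\hat{\mathbf{p}}_{0,\text{LS-C}} = (\mathbf{M}^T\mathbf{M})^{-1}\mathbf{M}^T\mathbf{y} = \left(4\,\mathbf{P}\boldsymbol{\Gamma}\boldsymbol{\Gamma}\mathbf{P}^T\right)^{-1}(-2\,\mathbf{P}\boldsymbol{\Gamma})\,\boldsymbol{\Gamma}\left(\grave{\mathbf{d}} - \operatorname{diag}(\mathbf{P}^T\mathbf{P})\right) = -\frac{1}{4}\left(\mathbf{P}_c\mathbf{P}_c^T\right)^{-1}2\,\mathbf{P}_c\boldsymbol{\Gamma}\left(\grave{\mathbf{d}} - \operatorname{diag}(\mathbf{P}^T\mathbf{P})\right) . \tag{5-68}$$

With $\mathbf{P}_c\boldsymbol{\Gamma} = \mathbf{P}_c$ (due to $\boldsymbol{\Gamma}\boldsymbol{\Gamma} = \boldsymbol{\Gamma}$), we can write:

$$\hat{\mathbf{p}}_{0,\text{LS-C}} = \boldsymbol{\Theta}\left(\operatorname{diag}(\mathbf{P}^T\mathbf{P}) - \grave{\mathbf{d}}\right) \quad\text{with}\quad \boldsymbol{\Theta} = \frac{1}{2}\left(\mathbf{P}_c\mathbf{P}_c^T\right)^{-1}\mathbf{P}_c . \tag{5-69}$$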
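For the two-dimensional case, the LS-C estimate of equation (5-69) reduces to the solution of a 2x2 system, which the following sketch writes out explicitly. The function name and the example geometry are assumptions of this illustration; the determinant check mirrors the requirement that $\mathbf{P}_c\mathbf{P}_c^T$ be invertible.

```python
def ls_c_estimate(anchors, measured_ranges):
    """Centered least squares position estimate in 2D, following equation (5-69)."""
    n = len(anchors)
    cx = sum(a[0] for a in anchors) / n
    cy = sum(a[1] for a in anchors) / n
    centered = [(ax - cx, ay - cy) for ax, ay in anchors]  # columns of P_c
    # Entries of the symmetric 2x2 matrix P_c * P_c^T.
    sxx = sum(x * x for x, _ in centered)
    sxy = sum(x * y for x, y in centered)
    syy = sum(y * y for _, y in centered)
    # delta = diag(P^T P) - measured squared ranges.
    delta = [ax * ax + ay * ay - r * r for (ax, ay), r in zip(anchors, measured_ranges)]
    bx = sum(c[0] * dl for c, dl in zip(centered, delta))  # first entry of P_c * delta
    by = sum(c[1] * dl for c, dl in zip(centered, delta))  # second entry of P_c * delta
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("P_c P_c^T is singular; the ROs lie on a straight line")
    px = 0.5 * (syy * bx - sxy * by) / det
    py = 0.5 * (sxx * by - sxy * bx) / det
    return px, py


# Hypothetical set-up: three ROs forming a triangle, noise-free ranges to a target at (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
ranges = [((3.0 - ax) ** 2 + (4.0 - ay) ** 2) ** 0.5 for ax, ay in anchors]
position = ls_c_estimate(anchors, ranges)
```

Passing three collinear ROs triggers the error branch, which is the numerical counterpart of the geometric condition on the RO placement.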
The following can be concluded from this result: In order for the LS-C solution to be well defined, a set of at least $m+1$ ROs is required that must not lie on an affine lower-dimensional subspace of $\mathbb{R}^m$. This is due to the requirement of $\mathbf{P}_c\mathbf{P}_c^T$ being invertible, which requires that the matrix $\mathbf{P}_c^T$ has full column rank $m$. This already makes it necessary that $n \ge m$. If the ROs are placed on an affine subspace of dimension smaller than $m$, the centered version would yield a set of linearly dependent vectors, which would result in a rank smaller than $m$. Also, as $m$ points define an affine proper subspace of dimension $m-1$, it becomes clear that $m+1$ ROs are necessary. For a two-dimensional problem, that means that at least three ROs are necessary which do not lie on a straight line. If three dimensions are to be considered, one requires at least four ROs which must not lie within a plane.

This brings some complications, as in scenarios where surface objects are used as ROs (buoys or surface craft), they will always be within the plane spanned by the sea surface. However, as discussed in section 2.4.1, precise and cheap depth sensors are available, and it is common to send the measured depth of the target via the acoustic channel, thus reducing the overall localization task to a two-dimensional problem. From a mathematical point of view, it is necessary to state that no solution obtained with this approach can distinguish between the target being placed below or above the water surface at the same absolute distance from it. From a practical point of view, we can exclude the solution with the underwater object flying above the surface.
With $\grave{z}_0$ being the measured and communicated depth of the target, and $\grave{r}_{i,3D}$ being the measured 3-dimensional distance between the target and the $i$th RO, the corresponding 2-dimensional distance $\grave{r}_{i,2D}$ can be computed to be

$$\grave{r}_{i,2D} = \sqrt{\grave{r}_{i,3D}^2 - \grave{z}_0^2} . \tag{5-70}$$
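Equation (5-70) is a one-line computation, but with noisy measurements the term under the square root can become negative when the target is almost directly below an RO. The sketch below makes this case explicit; the function name and the choice to raise an error (rather than, for example, clipping to zero) are assumptions of this example.

```python
import math


def horizontal_range(slant_range, depth):
    """Project a measured 3D range onto the horizontal plane, see equation (5-70)."""
    radicand = slant_range ** 2 - depth ** 2
    if radicand < 0.0:
        # Can occur with noisy data when the measured depth exceeds the measured range.
        raise ValueError("measured depth exceeds measured slant range")
    return math.sqrt(radicand)


# Hypothetical values: 5 m slant range, 3 m depth.
r_2d = horizontal_range(5.0, 3.0)
```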
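Thus, the problem can be solved as a 2-dimensional localization problem. Even if no measurement of the depth is available, the problem can be formulated as a 2-dimensional localization task. Let $\mathbf{p}_{i,3D}$ and $\mathbf{p}_{i,2D}$, $i = 0,\dots,n$, be the 3- or 2-dimensional position vectors of the involved objects, $z_0$ be the true depth of the target, and $z_j = 0$, $j = 1,\dots,n$, be the depth of the ROs. Then it holds true that:

$$d_{i,3D} = \|\mathbf{p}_{0,3D} - \mathbf{p}_{i,3D}\|^2 = (x_0 - x_i)^2 + (y_0 - y_i)^2 + (z_0 - z_i)^2 = \|\mathbf{p}_{0,2D} - \mathbf{p}_{i,2D}\|^2 + z_0^2 = \mathbf{p}_{0,2D}^T\mathbf{p}_{0,2D} + z_0^2 - 2\mathbf{p}_{i,2D}^T\mathbf{p}_{0,2D} + \mathbf{p}_{i,2D}^T\mathbf{p}_{i,2D} . \tag{5-71}$$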
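At this point, we redo the steps from equation (5-38) to (5-53), which gives:

$$\mathbf{d}_{3D} = \left(\|\mathbf{p}_{0,2D}\|^2 + z_0^2\right)\mathbf{1}_n - 2\mathbf{P}_{2D}^T\mathbf{p}_{0,2D} + \operatorname{diag}\left(\mathbf{P}_{2D}^T\mathbf{P}_{2D}\right) . \tag{5-72}$$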
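Note that the unknown parameter $z_0^2$ is now added to the $\|\mathbf{p}_{0,2D}\|^2$-term and multiplied by $\mathbf{1}_n$. If we redo the computations of equation (5-54) and those done within this subsection, the whole summand will disappear when multiplied with $\boldsymbol{\Gamma}$, as $\boldsymbol{\Gamma}$ has $\mathbf{1}_n$ in its null space. Thus we can solve the problem like a 2-dimensional one from here on.

As was done before in the LS-U case, we can again introduce a weighting matrix $\mathbf{W}$ to the LS-C solution. This gives rise to the Centered Weighted Least Square (LS-CW) estimate

$$\hat{\mathbf{p}}_{0,\text{LS-CW}} = \boldsymbol{\Theta}_W\left(\operatorname{diag}(\mathbf{P}^T\mathbf{P}) - \grave{\mathbf{d}}\right) \quad\text{with}\quad \boldsymbol{\Theta}_W = \frac{1}{2}\left(\mathbf{P}_c\mathbf{W}\mathbf{P}_c^T\right)^{-1}\mathbf{P}_c\mathbf{W} . \tag{5-73}$$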
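In the LS-UW case, the inverse of the squared error covariance matrix, or rather its estimate according to equation (5-60), has been employed as weighting matrix. Now, however, we have to consider that, due to the multiplication with $\boldsymbol{\Gamma}$ performed in equation (5-65), the error equals $\boldsymbol{\Gamma}\mathbf{e}$; thus the covariance can be approximated (for $r_i \gg \sigma_i$) as:

$$\mathbf{P}_{\boldsymbol{\Gamma}\boldsymbol{\omega}} \approx 4\,\boldsymbol{\Gamma}\operatorname{diag}(\grave{\mathbf{r}})\,\mathbf{R}\operatorname{diag}(\grave{\mathbf{r}})\,\boldsymbol{\Gamma} , \tag{5-74}$$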
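which is a singular matrix, because $\boldsymbol{\Gamma}$ is rank deficient. In section 5.1.2.1 we have introduced the Moore–Penrose pseudoinverse for cases in which the inverse of a singular matrix is needed. The algorithm derived in equation (5-22) cannot be employed in this case, as $\boldsymbol{\Gamma}^T\boldsymbol{\Gamma}$ is not invertible. An alternative way to find the pseudoinverse of a matrix is based on the Singular Value Decomposition (SVD). Let the eigenvalue decomposition of $\mathbf{A}$ be

$$\mathbf{A} = \mathbf{V}\begin{bmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{bmatrix}\mathbf{V}^{-1} = \mathbf{V}\mathbf{B}\mathbf{V}^{-1} , \tag{5-75}$$

where $\lambda_i$ represent the eigenvalues of $\mathbf{A}$. In this case, its pseudoinverse $\mathbf{A}^+$ is given by:

$$\mathbf{A}^+ = \mathbf{V}\begin{bmatrix} \theta_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \theta_n \end{bmatrix}\mathbf{V}^{-1} ; \qquad \theta_i = \begin{cases} 1/\lambda_i & \text{if } \lambda_i \ne 0 \\ 0 & \text{if } \lambda_i = 0 \end{cases} . \tag{5-76}$$

As a special case, if $\mathbf{A}$ is a real symmetric matrix, as can be assumed here, $\mathbf{A}$ possesses real eigenvalues, and it is possible to choose the eigenvectors in such a way that they are orthonormal. As for orthogonal matrices the transpose is equal to the inverse, we can write:

$$\mathbf{A}^+ = \mathbf{V}\mathbf{B}^+\mathbf{V}^T . \tag{5-77}$$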
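As a small worked example (chosen for illustration and not taken from the original text), consider the centering matrix for $n = 2$, which is real, symmetric, and singular. Its eigenvalues are 1 and 0, and applying equations (5-76) and (5-77) shows that it is its own pseudoinverse:

```latex
\mathbf{A} = \boldsymbol{\Gamma}\big|_{n=2}
  = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix},
\qquad
\lambda_1 = 1,\;\; \lambda_2 = 0,
\qquad
\mathbf{V} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix},
\\[1ex]
\mathbf{A}^{+} = \mathbf{V}\,\mathbf{B}^{+}\,\mathbf{V}^{T}
  = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}
    \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
    \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}
  = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}
  = \mathbf{A} .
```

Since this matrix is idempotent, the two conditions $\mathbf{A}\mathbf{A}^+\mathbf{A} = \mathbf{A}$ and $\mathbf{A}^+\mathbf{A}\mathbf{A}^+ = \mathbf{A}^+$ from equation (5-23) are fulfilled immediately.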
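Summing up, the LS-CW estimate with the weighting matrix set to be the pseudoinverse of the pseudo measurement covariance matrix equals:

$$\hat{\mathbf{p}}_{0,\text{LS-CW}} = \boldsymbol{\Theta}_W\left(\operatorname{diag}(\mathbf{P}^T\mathbf{P}) - \grave{\mathbf{d}}\right) \quad\text{with}\quad \boldsymbol{\Theta}_W = \frac{1}{2}\left(\mathbf{P}_c\mathbf{P}_{\boldsymbol{\Gamma}\boldsymbol{\omega}}^+\mathbf{P}_c^T\right)^{-1}\mathbf{P}_c\mathbf{P}_{\boldsymbol{\Gamma}\boldsymbol{\omega}}^+ . \tag{5-78}$$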
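To compute the pseudoinverse, note that $\mathbf{P}_{\boldsymbol{\Gamma}\boldsymbol{\omega}}$ according to equation (5-74) is positive semidefinite as long as all measured ranges are larger than zero. Also, as it contains the matrix $\boldsymbol{\Gamma}$, it can be stated that $\mathbf{P}_{\boldsymbol{\Gamma}\boldsymbol{\omega}}$ has rank $n-1$, and it has the vector $\mathbf{1}_n$ in its null space. Thus $\mathbf{P}_{\boldsymbol{\Gamma}\boldsymbol{\omega}}$ can be written as

$$\mathbf{P}_{\boldsymbol{\Gamma}\boldsymbol{\omega}} = \bar{\mathbf{V}}\begin{bmatrix} \lambda_1 & \cdots & \cdots & 0 \\ \vdots & \ddots & & \vdots \\ \vdots & & \lambda_{n-1} & \vdots \\ 0 & \cdots & \cdots & 0 \end{bmatrix}\bar{\mathbf{V}}^T = \begin{bmatrix} \mathbf{V} & \tfrac{1}{\sqrt{n}}\mathbf{1}_n \end{bmatrix}\begin{bmatrix} \mathbf{B} & \mathbf{0} \\ \mathbf{0}^T & 0 \end{bmatrix}\begin{bmatrix} \mathbf{V} & \tfrac{1}{\sqrt{n}}\mathbf{1}_n \end{bmatrix}^T = \mathbf{V}\mathbf{B}\mathbf{V}^T , \tag{5-79}$$

such that the columns of $\mathbf{V} \in \mathbb{R}^{n\times(n-1)}$ build an orthonormal basis for the orthogonal complement $\mathbf{1}_n^{\perp}$, and it holds true that $\mathbf{V}^T\mathbf{V} = \mathbf{I}_{n-1}$ and $\mathbf{V}^T\mathbf{1}_n = \mathbf{0}$. This results in the following pseudoinverse:

$$\mathbf{P}_{\boldsymbol{\Gamma}\boldsymbol{\omega}}^+ = \bar{\mathbf{V}}\begin{bmatrix} 1/\lambda_1 & \cdots & \cdots & 0 \\ \vdots & \ddots & & \vdots \\ \vdots & & 1/\lambda_{n-1} & \vdots \\ 0 & \cdots & \cdots & 0 \end{bmatrix}\bar{\mathbf{V}}^T = \mathbf{V}\mathbf{B}^{-1}\mathbf{V}^T . \tag{5-80}$$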
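Note that, because $\mathbf{P}_c\mathbf{1}_n = \mathbf{0}$, it can be stated that the columns of $\mathbf{P}_c^T$ belong to the orthogonal complement $\mathbf{1}_n^{\perp}$; therefore they can be written as a linear combination of the columns of $\mathbf{V}$. As a consequence, it is possible to decompose the centered position matrix as $\mathbf{P}_c^T = \mathbf{V}\mathbf{D}$, where $\mathbf{D} \in \mathbb{R}^{(n-1)\times m}$ has full rank if $\mathbf{P}_c$ also does.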
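To conclude this section, we will briefly discuss the expected value and the covariance matrix of the estimation error $\mathbf{e}_{\text{LS-CW}} = \hat{\mathbf{p}}_{0,\text{LS-CW}} - \mathbf{p}_0$, which can easily be transferred to a statement about $\mathbf{e}_{\text{LS-C}}$ by replacing $\boldsymbol{\Theta}_W$ with $\boldsymbol{\Theta}$. Combining equations (5-25), (5-61), (5-68), and (5-69) yields:

$$E\{\mathbf{e}_{\text{LS-CW}}\} = -\frac{1}{2}\left(\mathbf{P}_c\mathbf{W}\mathbf{P}_c^T\right)^{-1}\mathbf{P}_c\mathbf{W}\,E\{\boldsymbol{\Gamma}\boldsymbol{\omega}\} = -\boldsymbol{\Theta}_W\boldsymbol{\Gamma}\,E\{\boldsymbol{\omega}\} = -\boldsymbol{\Theta}_W\boldsymbol{\Gamma}\operatorname{diag}(\mathbf{R}) . \tag{5-81}$$

For the computation of the covariance matrix, we make use of equations (5-26), (5-62), (5-68), and (5-69):

$$\mathbf{P}_{\text{LS-CW}} = E\left\{(\mathbf{e}_{\text{LS-CW}} - E\{\mathbf{e}_{\text{LS-CW}}\})(\mathbf{e}_{\text{LS-CW}} - E\{\mathbf{e}_{\text{LS-CW}}\})^T\right\} = \cdots = E\left\{\boldsymbol{\Theta}_W\boldsymbol{\Gamma}\boldsymbol{\omega}\boldsymbol{\omega}^T\boldsymbol{\Gamma}\boldsymbol{\Theta}_W^T\right\} - \boldsymbol{\Theta}_W\boldsymbol{\Gamma}\operatorname{diag}(\mathbf{R})\operatorname{diag}(\mathbf{R})^T\boldsymbol{\Gamma}\boldsymbol{\Theta}_W^T = \boldsymbol{\Theta}_W\boldsymbol{\Gamma}\left(4\operatorname{diag}(\mathbf{r})\mathbf{R}\operatorname{diag}(\mathbf{r}) + 2\,\mathbf{R}\circ\mathbf{R}\right)\boldsymbol{\Gamma}\boldsymbol{\Theta}_W^T \approx 4\,\boldsymbol{\Theta}_W\boldsymbol{\Gamma}\operatorname{diag}(\mathbf{r})\,\mathbf{R}\operatorname{diag}(\mathbf{r})\,\boldsymbol{\Gamma}\boldsymbol{\Theta}_W^T , \tag{5-82}$$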
employing the already discussed assumption that $r_i \gg \sigma_i$ holds. In the following section, we will discuss an alternative way to perform the desired position estimation.

5.1.4 Position Estimation by Minimizing the Maximum Likelihood Function

Another approach to solve the problem at hand can be found by recapitulating the discussions in section 4.3.2.3. Note that we are in a similar situation as was assumed for the nonrandom estimation; see also Figure 4-31: We are looking for an unknown parameter $\mathbf{p}_0$ on the basis of noisy range measurements, which constitute a kind of probabilistic mapping, and we have no further information on the source that created the unknown parameter. It seems to be a sound idea to employ the concept of maximum likelihood estimation, as described in section 4.3.2.4. This will result in an optimal estimate, in terms of minimal variance of the estimation error, given that the estimate is unbiased. To this end, we need to describe the probabilistic mapping in a mathematical manner.

Recall the interpretation of the likelihood function: Under the condition that a certain parameter has a specific value, what is the probability that a certain observation, that is, a measurement, is made? With $\grave{\mathbf{r}}$ being the measurement vector as defined before, we introduce $\mathbf{p} \in \mathbb{R}^m$ as the $x$-, $y$-, and possibly $z$-coordinates of a place where the target could be. This gives rise to a range vector $\mathbf{r}(\mathbf{p}) \in \mathbb{R}^n$, containing the true distances between the target, if it is at $\mathbf{p}$, and the $n$ ROs. We now need to formulate the likelihood function, which is the PDF of the conditional probability that $\grave{\mathbf{r}}$ has occurred, given that $\mathbf{p}$ are the true target coordinates. We have defined that the measurement error can be classified as zero-mean Gaussian noise with the covariance matrix $\mathbf{R}$. As the measurement noise is zero-mean, it holds true that $E\{\grave{\mathbf{r}}\} = \mathbf{r}(\mathbf{p})$. As there is a total number of $n$ measurements available, we can treat $\grave{\mathbf{r}}$ as a multivariate Gaussian stochastic variable.
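We have introduced the PDF for this kind of variable in equation (4-179). The likelihood function $\Lambda(\mathbf{p})$ can therefore be written as:

$$\Lambda(\mathbf{p}) = f_{\grave{\mathbf{r}}|\mathbf{p}}(\grave{\mathbf{r}}\,|\,\mathbf{p}) = \frac{1}{(2\pi)^{n/2}\det(\mathbf{R})^{1/2}}\,\exp\left(-\frac{1}{2}\left(\grave{\mathbf{r}} - \mathbf{r}(\mathbf{p})\right)^T\mathbf{R}^{-1}\left(\grave{\mathbf{r}} - \mathbf{r}(\mathbf{p})\right)\right) . \tag{5-83}$$

The maximum likelihood estimate $\hat{\mathbf{p}}_{0,\text{ML}}$ equals that $\mathbf{p}$ at which this function has its maximum:

$$\hat{\mathbf{p}}_{0,\text{ML}} = \arg\max_{\mathbf{p}}\Lambda(\mathbf{p}) . \tag{5-84}$$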
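In order to ease the computation, it is common to work with the natural logarithm of $\Lambda(\mathbf{p})$, also referred to as the log-likelihood function. As the logarithm is strictly increasing, it does not change the position of a maximum of a function when it is applied. For a Gaussian distribution according to equation (5-83), the employment of the logarithm will turn the product into a sum and remove the exponential function to obtain:

$$\log\Lambda(\mathbf{p}) = \ln\left(\frac{1}{(2\pi)^{n/2}\det(\mathbf{R})^{1/2}}\right) - \frac{1}{2}\left(\grave{\mathbf{r}} - \mathbf{r}(\mathbf{p})\right)^T\mathbf{R}^{-1}\left(\grave{\mathbf{r}} - \mathbf{r}(\mathbf{p})\right) = c - f(\mathbf{p}) , \quad\text{with}\quad c = \ln\left(\frac{1}{(2\pi)^{n/2}\det(\mathbf{R})^{1/2}}\right) ;\; f(\mathbf{p}) = \frac{1}{2}\left(\grave{\mathbf{r}} - \mathbf{r}(\mathbf{p})\right)^T\mathbf{R}^{-1}\left(\grave{\mathbf{r}} - \mathbf{r}(\mathbf{p})\right) . \tag{5-85}$$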
Note that $c$ is a constant which is not affected by $\mathbf{p}$, and the result of $f(\mathbf{p})$ is non-negative, as it is a quadratic form of a vector with a positive definite matrix. For that reason, the desired estimate is the one that minimizes $f(\mathbf{p})$:

$$\hat{\mathbf{p}}_{0,\text{ML-R}} = \arg\min_{\mathbf{p}} f(\mathbf{p}) . \tag{5-86}$$
As becomes obvious, this gives rise to the minimization of a nonlinear function. Adequate methods have been discussed in section 5.1.2.2. We will have a closer look at the computation process for two different approaches.

5.1.4.1 Maximum Likelihood With Ranges (ML-R)

The approach according to equations (5-85) and (5-86) is also referred to as the Maximum Likelihood with Ranges (ML-R) estimate. In order to use the algorithm described in section 5.1.2.2, we need to derive a descent direction, which can be the negative gradient or be computed according to Newton's method based on the Hessian matrix. We will discuss the computation of the gradient at this point. With $\mathrm{d}f(\mathbf{p})$ denoting the first differential of $f$ at $\mathbf{p}$, and with $f(\mathbf{p})$ according to equation (5-85), we can write:

$$\mathrm{d}f(\mathbf{p}) = -\frac{1}{2}\,\mathrm{d}\mathbf{r}(\mathbf{p})^T\mathbf{R}^{-1}\left(\grave{\mathbf{r}} - \mathbf{r}(\mathbf{p})\right) - \frac{1}{2}\left(\grave{\mathbf{r}} - \mathbf{r}(\mathbf{p})\right)^T\mathbf{R}^{-1}\,\mathrm{d}\mathbf{r}(\mathbf{p}) = -\left(\grave{\mathbf{r}} - \mathbf{r}(\mathbf{p})\right)^T\mathbf{R}^{-1}\,\mathrm{d}\mathbf{r}(\mathbf{p}) , \tag{5-87}$$
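where the last transformation can be performed because $\mathbf{R}$ is symmetric. The differential of $\mathbf{r}(\mathbf{p})$ is a vector containing the single differentials of the pseudo distances between the possible target position $\mathbf{p}$ and the ROs. Its $i$th element can be computed to be:

$$\mathrm{d}r_i(\mathbf{p}) = \mathrm{d}\,\|\mathbf{p} - \mathbf{p}_i\| = \mathrm{d}\left((\mathbf{p} - \mathbf{p}_i)^T(\mathbf{p} - \mathbf{p}_i)\right)^{1/2} = \frac{1}{2}\left((\mathbf{p} - \mathbf{p}_i)^T(\mathbf{p} - \mathbf{p}_i)\right)^{-1/2}\mathrm{d}\left((\mathbf{p} - \mathbf{p}_i)^T(\mathbf{p} - \mathbf{p}_i)\right) = \frac{1}{2 r_i(\mathbf{p})}\left(\mathrm{d}\mathbf{p}^T(\mathbf{p} - \mathbf{p}_i) + (\mathbf{p} - \mathbf{p}_i)^T\mathrm{d}\mathbf{p}\right) = \frac{1}{2 r_i(\mathbf{p})}\,2\,(\mathbf{p} - \mathbf{p}_i)^T\mathrm{d}\mathbf{p} = \frac{1}{r_i(\mathbf{p})}(\mathbf{p} - \mathbf{p}_i)^T\mathrm{d}\mathbf{p} . \tag{5-88}$$

This gives for the complete vector $\mathrm{d}\mathbf{r}(\mathbf{p})$: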
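$$\mathrm{d}\mathbf{r}(\mathbf{p}) = \begin{bmatrix} \mathrm{d}r_1(\mathbf{p}) \\ \vdots \\ \mathrm{d}r_n(\mathbf{p}) \end{bmatrix} = \begin{bmatrix} \dfrac{1}{r_1(\mathbf{p})}(\mathbf{p} - \mathbf{p}_1)^T\mathrm{d}\mathbf{p} \\ \vdots \\ \dfrac{1}{r_n(\mathbf{p})}(\mathbf{p} - \mathbf{p}_n)^T\mathrm{d}\mathbf{p} \end{bmatrix} = \begin{bmatrix} \dfrac{1}{r_1(\mathbf{p})} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \dfrac{1}{r_n(\mathbf{p})} \end{bmatrix}\begin{bmatrix} (\mathbf{p} - \mathbf{p}_1)^T \\ \vdots \\ (\mathbf{p} - \mathbf{p}_n)^T \end{bmatrix}\mathrm{d}\mathbf{p} = \operatorname{diag}(\mathbf{r}(\mathbf{p}))^{-1}\boldsymbol{\Phi}^T\mathrm{d}\mathbf{p} , \quad\text{with}\quad \boldsymbol{\Phi} = \mathbf{p}\mathbf{1}_n^T - \mathbf{P} . \tag{5-89}$$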
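Inserting this result into equation (5-87) yields:

$$\mathrm{d}f(\mathbf{p}) = -\left(\grave{\mathbf{r}} - \mathbf{r}(\mathbf{p})\right)^T\mathbf{R}^{-1}\operatorname{diag}(\mathbf{r}(\mathbf{p}))^{-1}\boldsymbol{\Phi}^T\mathrm{d}\mathbf{p} , \tag{5-90}$$

which is in the form $\mathrm{d}f(\mathbf{p}) = \mathbf{a}^T\mathrm{d}\mathbf{p}$. Thus we can compute the gradient of $f(\mathbf{p})$ according to equation (5-31) to be

$$\nabla f(\mathbf{p}) = -\boldsymbol{\Phi}\operatorname{diag}(\mathbf{r}(\mathbf{p}))^{-1}\mathbf{R}^{-1}\left(\grave{\mathbf{r}} - \mathbf{r}(\mathbf{p})\right) , \tag{5-91}$$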
which enables us to employ the algorithm presented in section 5.1.2.2 to find the ML-R estimate. Due to the high nonlinearity of the employed cost function, there is a significant risk that the iterative algorithm might get stuck in a local minimum. In general, the performance of the algorithm depends strongly on the position of the reference objects and on the chosen starting point. To lower these risks, alternative approaches can be used, as discussed in the following.

5.1.4.2 Maximum Likelihood With Squared Ranges (ML-SR)

As was done in section 5.1.3 and its subsections, the usage of the squared ranges might improve the performance of the estimation algorithm. Employing again $\grave{\mathbf{d}}$ as the pseudo measurements, which are the sum of the true squared ranges $\mathbf{d}$ and the artificial squared error $\boldsymbol{\omega}$, we will again approximate the error covariance $\mathbf{P}_{\boldsymbol{\omega}}$ according to equation (5-60). Following the same discussions as at the beginning of section 5.1.4, we can find the Maximum Likelihood with Squared Ranges (ML-SR) estimate by minimizing the following cost function:

$$\hat{\mathbf{p}}_{0,\text{ML-SR}} = \arg\min_{\mathbf{p}} f(\mathbf{p}) ; \qquad f(\mathbf{p}) = \frac{1}{2}\left(\grave{\mathbf{d}} - \mathbf{d}(\mathbf{p})\right)^T\hat{\mathbf{P}}_{\boldsymbol{\omega}}^{-1}\left(\grave{\mathbf{d}} - \mathbf{d}(\mathbf{p})\right) , \tag{5-92}$$
where $\mathbf{d}(\mathbf{p})$ denotes the true squared ranges that would result for a target positioned at $\mathbf{p}$. It can be shown that the cost function is convex under certain conditions, mainly requiring the target to be placed close to the centroid of the ROs. In order to compute the gradient of the cost function, we have to redo the computations done in the last section, mainly replacing $r$ by $d$. Following (5-87), we obtain:

$$\mathrm{d}f(\mathbf{p}) = -\left(\grave{\mathbf{d}} - \mathbf{d}(\mathbf{p})\right)^T\hat{\mathbf{P}}_{\boldsymbol{\omega}}^{-1}\,\mathrm{d}\mathbf{d}(\mathbf{p}) . \tag{5-93}$$
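The computation of the differential of $\mathbf{d}(\mathbf{p})$ is even easier to perform:

$$\mathrm{d}d_i(\mathbf{p}) = \mathrm{d}\,\|\mathbf{p} - \mathbf{p}_i\|^2 = \mathrm{d}\left((\mathbf{p} - \mathbf{p}_i)^T(\mathbf{p} - \mathbf{p}_i)\right) = \mathrm{d}\mathbf{p}^T(\mathbf{p} - \mathbf{p}_i) + (\mathbf{p} - \mathbf{p}_i)^T\mathrm{d}\mathbf{p} = 2\,(\mathbf{p} - \mathbf{p}_i)^T\mathrm{d}\mathbf{p} , \tag{5-94}$$

and

$$\mathrm{d}\mathbf{d}(\mathbf{p}) = \begin{bmatrix} \mathrm{d}d_1(\mathbf{p}) \\ \vdots \\ \mathrm{d}d_n(\mathbf{p}) \end{bmatrix} = 2\begin{bmatrix} (\mathbf{p} - \mathbf{p}_1)^T \\ \vdots \\ (\mathbf{p} - \mathbf{p}_n)^T \end{bmatrix}\mathrm{d}\mathbf{p} = 2\,\boldsymbol{\Phi}^T\mathrm{d}\mathbf{p} . \tag{5-95}$$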
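Thus we obtain

$$\mathrm{d}f(\mathbf{p}) = -2\left(\grave{\mathbf{d}} - \mathbf{d}(\mathbf{p})\right)^T\hat{\mathbf{P}}_{\boldsymbol{\omega}}^{-1}\boldsymbol{\Phi}^T\mathrm{d}\mathbf{p} , \tag{5-96}$$

and finally

$$\nabla f(\mathbf{p}) = -2\,\boldsymbol{\Phi}\hat{\mathbf{P}}_{\boldsymbol{\omega}}^{-1}\left(\grave{\mathbf{d}} - \mathbf{d}(\mathbf{p})\right) . \tag{5-97}$$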
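5.1.4.3 Maximum Likelihood With Centered Squared Ranges (ML-CSR)

The following approach guarantees a convex cost function, independent of the position of target and ROs. To this end, we copy the procedures that led to the LS-C and LS-CW approaches in the former section. The minimization of the cost function

$$f(\mathbf{p}) = \frac{1}{2}\left(\grave{\mathbf{d}} - \mathbf{d}(\mathbf{p})\right)^T\boldsymbol{\Gamma}\,\mathbf{P}_{\boldsymbol{\Gamma}}^{-1}\,\boldsymbol{\Gamma}\left(\grave{\mathbf{d}} - \mathbf{d}(\mathbf{p})\right) , \tag{5-98}$$

with $\boldsymbol{\Gamma}$ selected according to equation (5-64), gives the Maximum Likelihood with Centered Squared Ranges (ML-CSR) estimate $\hat{\mathbf{p}}_{0,\text{ML-CSR}}$. The selection of the covariance matrix $\mathbf{P}_{\boldsymbol{\Gamma}}$ will be discussed later. To ease the computation of the gradient, we can recall from equations (5-42) and (5-53):

$$\grave{\mathbf{d}} = \mathbf{d} + \boldsymbol{\omega} = \|\mathbf{p}_0\|^2\mathbf{1}_n - 2\mathbf{P}^T\mathbf{p}_0 + \operatorname{diag}(\mathbf{P}^T\mathbf{P}) + \boldsymbol{\omega} , \qquad \mathbf{d}(\mathbf{p}) = \|\mathbf{p}\|^2\mathbf{1}_n - 2\mathbf{P}^T\mathbf{p} + \operatorname{diag}(\mathbf{P}^T\mathbf{P}) . \tag{5-99}$$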
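Subtracting both equations and multiplying with $\boldsymbol{\Gamma}$ from the left gives:

$$\boldsymbol{\Gamma}\left(\grave{\mathbf{d}} - \mathbf{d}(\mathbf{p})\right) = 2\,\mathbf{P}_c^T(\mathbf{p} - \mathbf{p}_0) + \boldsymbol{\Gamma}\boldsymbol{\omega} . \tag{5-100}$$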
1 2𝐏 2 1 4 𝐩 2 2 𝐩
𝐩
𝐩
𝚪𝛚
𝚪𝐏
𝚪
2𝐏
𝐩
𝐏 𝐏
𝐏
𝐩
𝐩
2 𝐩
2𝛚
𝚪𝐏
𝐏
𝐩
𝐩
𝛚
𝐩 𝐏 𝐏 1 𝛚 𝚪𝐏 2
𝐏
𝐩
𝐩
2 𝐩
𝐩 𝐩
𝐩 𝐏 𝐏
𝚪𝐏 𝐩
𝚪𝛚 𝚪𝛚
𝚪𝛚 𝐏 𝐏
𝚪𝛚
𝚪 𝛚 .
(5‐101)
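As this expression directly contains $\mathbf{p}$ instead of $\mathbf{d}(\mathbf{p})$, it is easy to compute the gradient, employing equation (5-20), to obtain:

$$\nabla f(\mathbf{p}) = 4\,\mathbf{P}_c\mathbf{P}_{\boldsymbol{\Gamma}}^{-1}\mathbf{P}_c^T(\mathbf{p} - \mathbf{p}_0) + 2\,\mathbf{P}_c\mathbf{P}_{\boldsymbol{\Gamma}}^{-1}\boldsymbol{\Gamma}\boldsymbol{\omega} . \tag{5-102}$$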
For the Hessian, one further differentiation yields:

$$\nabla^{2} f(\mathbf{p}) = 4\,\mathbf{P}_{c}\mathbf{P}_{\Gamma\omega}^{-1}\mathbf{P}_{c}^{T}, \tag{5-103}$$

which is always positive semidefinite. This shows that the cost function is convex. It can also be shown that for setting $\mathbf{P}_{\Gamma\omega} = \mathbf{I}$, the minimum of the cost function is at the same position as the LS‐C estimate according to equation (5‐69), while for $\mathbf{P}_{\Gamma\omega}$ selected according to equation (5‐74), the minimum of the cost function is at the same location as the LS‐CW estimate according to equation (5‐73).

5.1.5 Comparison and Evaluation

We have discussed seven different approaches for the target tracking based on range measurements only. In this section, we will summarize the results and compare the performance. As described in the previous section, the results of ML‐CSR are identical with those of LS‐C or LS‐CW, depending on the selection of the covariance matrix $\mathbf{P}_{\Gamma\omega}$. In the following, some results of the different approaches are shown that have been obtained by Monte Carlo simulations. We use different numbers $n$ of ROs and placements to demonstrate the effect. In all cases, 100 range measurements are simulated between the
target and each RO, based on the true distance overlaid with AWGN which is assumed zero‐mean with covariance matrix $\mathbf{R} = \mathbf{I}$. For the iterative ML approaches, a position close to the origin has been selected as initial value for the first estimate, while for all following ones, the final result of the last iteration was employed. We treated the situation in two dimensions.

We are going to study two different scenarios. For the first one, the general set‐up is depicted on the left side of Figure 5‐4: Three ROs are placed in a triangular shape, while the target is located near the centroid of the ROs. On the right side, the estimates according to the discussed approaches are depicted, together with the $3\sigma$ confidence ellipsoids based on the data. Actually, all the LS methods delivered exactly the same results, so only the black dots of the LS‐CW approach are visible. Both ML approaches again delivered identical results, yet slightly different from those of the LS methods.

These interesting results motivate a discussion whether and under which conditions the different LS methods deliver identical results. We will start with the LS‐U algorithm according to equations (5‐56)/(5‐57):

$$\hat{\mathbf{p}}_{\text{LS-U}} = \mathbf{N}\left(\mathbf{M}^{T}\mathbf{M}\right)^{-1}\mathbf{M}^{T}\mathbf{y}, \quad \text{with } \mathbf{y} = \grave{\mathbf{d}} - \operatorname{diag}\!\left(\mathbf{P}^{T}\mathbf{P}\right);\; \mathbf{M} = \begin{bmatrix} -2\,\mathbf{P}^{T} & \mathbf{1} \end{bmatrix};\; \mathbf{N} = \begin{bmatrix} \mathbf{I} & \mathbf{0} \end{bmatrix}, \tag{5-104}$$
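and compare it with the LS‐C algorithm according to equations (5‐68)/(5‐69):

$$\hat{\mathbf{p}}_{\text{LS-C}} = \boldsymbol{\Theta}_{\text{C}}\left(\operatorname{diag}\!\left(\mathbf{P}^{T}\mathbf{P}\right) - \grave{\mathbf{d}}\right), \quad \text{with } \boldsymbol{\Theta}_{\text{C}} = \frac{1}{2}\left(\mathbf{P}_{c}\mathbf{P}_{c}^{T}\right)^{-1}\mathbf{P}_{c};\; \mathbf{P}_{c} = \mathbf{P}\boldsymbol{\Gamma};\; \boldsymbol{\Gamma} = \mathbf{I} - \frac{1}{n}\,\mathbf{1}\mathbf{1}^{T}. \tag{5-105}$$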
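It is easy to see that both deliver the same results if

$$\mathbf{N}\left(\mathbf{M}^{T}\mathbf{M}\right)^{-1}\mathbf{M}^{T} = -\boldsymbol{\Theta}_{\text{C}}, \tag{5-106}$$

which we will verify in the following. We start by writing the term $\mathbf{M}^{T}\mathbf{M}$ as:

$$\mathbf{M}^{T}\mathbf{M} = \begin{bmatrix} -2\,\mathbf{P} \\ \mathbf{1}^{T} \end{bmatrix} \begin{bmatrix} -2\,\mathbf{P}^{T} & \mathbf{1} \end{bmatrix} = \begin{bmatrix} 4\,\mathbf{P}\mathbf{P}^{T} & -2\,\mathbf{P}\mathbf{1} \\ -2\,\mathbf{1}^{T}\mathbf{P}^{T} & n \end{bmatrix}. \tag{5-107}$$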
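An inversion of a matrix which is partitioned into four blocks can be computed as

$$\begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{bmatrix}^{-1} = \begin{bmatrix} \left(\mathbf{A} - \mathbf{B}\mathbf{D}^{-1}\mathbf{C}\right)^{-1} & -\left(\mathbf{A} - \mathbf{B}\mathbf{D}^{-1}\mathbf{C}\right)^{-1}\mathbf{B}\mathbf{D}^{-1} \\ -\mathbf{D}^{-1}\mathbf{C}\left(\mathbf{A} - \mathbf{B}\mathbf{D}^{-1}\mathbf{C}\right)^{-1} & \left(\mathbf{D} - \mathbf{C}\mathbf{A}^{-1}\mathbf{B}\right)^{-1} \end{bmatrix}, \tag{5-108}$$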
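given that all necessary matrices and matrix combinations are invertible, see Lütkepohl, 1996. Note that for the scenario currently under discussion, it holds true that

$$\mathbf{A} - \mathbf{B}\mathbf{D}^{-1}\mathbf{C} = 4\,\mathbf{P}\mathbf{P}^{T} - 4\,\mathbf{P}\mathbf{1}\,\frac{1}{n}\,\mathbf{1}^{T}\mathbf{P}^{T} = 4\,\mathbf{P}\left(\mathbf{I} - \frac{1}{n}\,\mathbf{1}\mathbf{1}^{T}\right)\mathbf{P}^{T} = 4\,\mathbf{P}\boldsymbol{\Gamma}\mathbf{P}^{T} = 4\,\mathbf{P}_{c}\mathbf{P}_{c}^{T}. \tag{5-109}$$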
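This gives

$$\left(\mathbf{M}^{T}\mathbf{M}\right)^{-1} = \begin{bmatrix} \dfrac{1}{4}\left(\mathbf{P}_{c}\mathbf{P}_{c}^{T}\right)^{-1} & \dfrac{1}{2n}\left(\mathbf{P}_{c}\mathbf{P}_{c}^{T}\right)^{-1}\mathbf{P}\mathbf{1} \\ \dfrac{1}{2n}\,\mathbf{1}^{T}\mathbf{P}^{T}\left(\mathbf{P}_{c}\mathbf{P}_{c}^{T}\right)^{-1} & \left(n - \mathbf{1}^{T}\mathbf{P}^{T}\left(\mathbf{P}\mathbf{P}^{T}\right)^{-1}\mathbf{P}\mathbf{1}\right)^{-1} \end{bmatrix}, \tag{5-110}$$

and finally:

$$\begin{aligned} \mathbf{N}\left(\mathbf{M}^{T}\mathbf{M}\right)^{-1}\mathbf{M}^{T} &= \begin{bmatrix} \dfrac{1}{4}\left(\mathbf{P}_{c}\mathbf{P}_{c}^{T}\right)^{-1} & \dfrac{1}{2n}\left(\mathbf{P}_{c}\mathbf{P}_{c}^{T}\right)^{-1}\mathbf{P}\mathbf{1} \end{bmatrix} \begin{bmatrix} -2\,\mathbf{P} \\ \mathbf{1}^{T} \end{bmatrix} \\ &= -\frac{1}{2}\left(\mathbf{P}_{c}\mathbf{P}_{c}^{T}\right)^{-1}\mathbf{P} + \frac{1}{2n}\left(\mathbf{P}_{c}\mathbf{P}_{c}^{T}\right)^{-1}\mathbf{P}\mathbf{1}\mathbf{1}^{T} \\ &= -\frac{1}{2}\left(\mathbf{P}_{c}\mathbf{P}_{c}^{T}\right)^{-1}\left(\mathbf{P} - \frac{1}{n}\,\mathbf{P}\mathbf{1}\mathbf{1}^{T}\right) \\ &= -\frac{1}{2}\left(\mathbf{P}_{c}\mathbf{P}_{c}^{T}\right)^{-1}\mathbf{P}\left(\mathbf{I} - \frac{1}{n}\,\mathbf{1}\mathbf{1}^{T}\right) = -\boldsymbol{\Theta}_{\text{C}}. \end{aligned} \tag{5-111}$$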
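This proves that the LS‐U and LS‐C estimates are always identical. The results of the LS‐C and LS‐CW estimates are identical if $\boldsymbol{\Theta}_{\text{C}}$ according to equation (5‐69) and $\boldsymbol{\Theta}_{\text{CW}}$ according to equation (5‐78) are equal. Assume that $\mathbf{P}_{c}$ has full rank, and that we fulfill the minimum requirement of ROs for $m$ dimensions to consider, that is, $n = m + 1$. Then we can write the weighting matrix of the LS‐CW approach, written here as $\mathbf{P}_{\mathrm{W}}$, as $\mathbf{P}_{\mathrm{W}} = \mathbf{V}\mathbf{B}\mathbf{V}^{T}$ according to equation (5‐80) with $\mathbf{V} \in \mathbb{R}^{n \times (n-1)}$, and according to the discussions following the stated equation it holds true that $\mathbf{P}_{c} = \mathbf{D}\mathbf{V}^{T}$, where $\mathbf{D} \in \mathbb{R}^{m \times m}$ is a square and nonsingular matrix. Then we can write:

$$\boldsymbol{\Theta}_{\text{C}} = \frac{1}{2}\left(\mathbf{P}_{c}\mathbf{P}_{c}^{T}\right)^{-1}\mathbf{P}_{c} = \frac{1}{2}\left(\mathbf{D}\mathbf{V}^{T}\mathbf{V}\mathbf{D}^{T}\right)^{-1}\mathbf{D}\mathbf{V}^{T} = \frac{1}{2}\left(\mathbf{D}\mathbf{D}^{T}\right)^{-1}\mathbf{D}\mathbf{V}^{T} = \frac{1}{2}\,\mathbf{D}^{-T}\mathbf{D}^{-1}\mathbf{D}\mathbf{V}^{T} = \frac{1}{2}\,\mathbf{D}^{-T}\mathbf{V}^{T}. \tag{5-112}$$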
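For the LS‐CW case with weighting matrix $\mathbf{P}_{\mathrm{W}} = \mathbf{V}\mathbf{B}\mathbf{V}^{T}$, we obtain:

$$\begin{aligned} \boldsymbol{\Theta}_{\text{CW}} &= \frac{1}{2}\left(\mathbf{P}_{c}\mathbf{P}_{\mathrm{W}}\mathbf{P}_{c}^{T}\right)^{-1}\mathbf{P}_{c}\mathbf{P}_{\mathrm{W}} = \frac{1}{2}\left(\mathbf{D}\mathbf{V}^{T}\mathbf{V}\mathbf{B}\mathbf{V}^{T}\mathbf{V}\mathbf{D}^{T}\right)^{-1}\mathbf{D}\mathbf{V}^{T}\mathbf{V}\mathbf{B}\mathbf{V}^{T} \\ &= \frac{1}{2}\left(\mathbf{D}\mathbf{B}\mathbf{D}^{T}\right)^{-1}\mathbf{D}\mathbf{B}\mathbf{V}^{T} = \frac{1}{2}\,\mathbf{D}^{-T}\mathbf{B}^{-1}\mathbf{D}^{-1}\mathbf{D}\mathbf{B}\mathbf{V}^{T} = \frac{1}{2}\,\mathbf{D}^{-T}\mathbf{V}^{T}. \end{aligned} \tag{5-113}$$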
This proves that the LS‐C and the LS‐CW algorithm deliver the same result if $\mathbf{P}_{c}$ has full rank and if the number of ROs $n$ equals $m + 1$.

The results of the second scenario are depicted in Figure 5‐5. As general setup, we chose a larger number of ROs than necessary to fulfill the requirement, $n > m + 1$, and the target was placed away from the centroid of the ROs. We used the same parameters as for the first scenario, but we omitted the unconstrained algorithms, as it has been shown
that their results are equal to the centered ones. On the right side of Figure 5‐5, it becomes obvious that now the LS‐C and the LS‐CW approach deliver different results, and evaluating the numbers, it can be stated that LS‐CW results in smaller estimation error variances. The ML approaches, which again delivered identical results, clearly outperform the LS approaches.
Figure 5‐4: Position of RO (red triangles) and target (green dot) for scenario 1, right: zoom into the area around the target with position estimations employing different approaches
Figure 5‐5: Position of RO (red triangles) and target (green dot) for scenario 2, right: zoom into the area around the target with position estimations employing different approaches
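The identity of the LS‐U and LS‐C estimates shown in (5‐104) to (5‐111) can also be checked numerically. The following sketch implements both estimators directly from their definitions; the RO positions, the target position and the noise level are hypothetical values chosen for illustration only.

```python
import numpy as np

def ls_u_estimate(P, d_meas):
    """Unconstrained LS estimate (5-104): N (M^T M)^-1 M^T y."""
    m, n = P.shape
    y = d_meas - np.sum(P ** 2, axis=0)            # y = d - diag(P^T P)
    M = np.hstack([-2.0 * P.T, np.ones((n, 1))])   # M = [-2 P^T, 1]
    N = np.hstack([np.eye(m), np.zeros((m, 1))])   # N = [I, 0]
    theta = np.linalg.solve(M.T @ M, M.T @ y)
    return N @ theta

def ls_c_estimate(P, d_meas):
    """Centered LS estimate (5-105): Theta_C (diag(P^T P) - d)."""
    m, n = P.shape
    Gamma = np.eye(n) - np.ones((n, n)) / n
    Pc = P @ Gamma
    Theta_c = 0.5 * np.linalg.solve(Pc @ Pc.T, Pc)
    return Theta_c @ (np.sum(P ** 2, axis=0) - d_meas)

# Hypothetical 2D scenario with five ROs and noisy squared ranges.
rng = np.random.default_rng(2)
P = np.array([[0.0, 10.0, 10.0, 0.0, 5.0],
              [0.0, 0.0, 10.0, 10.0, 14.0]])
p_true = np.array([12.0, 3.0])
d_true = np.sum((p_true[:, None] - P) ** 2, axis=0)
d_meas = d_true + rng.normal(0.0, 2.0, size=P.shape[1])

p_ls_u = ls_u_estimate(P, d_meas)
p_ls_c = ls_c_estimate(P, d_meas)
print(p_ls_u, p_ls_c)
```

Both functions solve linear systems instead of forming explicit inverses, which is numerically preferable but does not change the result.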
In order to compare the different ML approaches, we have studied the course of the cost functions of the ML‐R, ML‐SR, and ML‐CSR algorithms, according to equations (5‐87), (5‐92), and (5‐98). We used identity covariance matrices, $\mathbf{R} = \mathbf{P}_{\omega} = \mathbf{I}$, for the sake of simplicity, and we again realized two scenarios, with the target close to and away from the centroid of the ROs. Again, the problem was treated in two dimensions. Figure 5‐6 shows the general setup of the first scenario, together with a three‐dimensional plot and a contour plot of the cost functions in the area around the ROs, computed with the equations stated above. We have proved above that the ML‐CSR cost function is always convex, which can also be seen in the figure. The cost function of the ML‐R method might not be exactly convex, especially around the ROs; yet it can be stated that for an initialization within the area spanned by the ROs, but not directly at one of them, one can expect that the minimization algorithm will be able to find the global minimum.
Figure 5‐6: Setup of ROs (red triangles) and target (green dot) and display of the cost functions with contour map below of the three different ML cost functions for scenario 3
For the second scenario, whose results are depicted in Figure 5‐7, the situation is different. We chose a poor placement of the ROs and positioned the target away from their centroid. It can clearly be seen that both the ML‐R and the ML‐SR cost functions exhibit numerous minima. Therefore, depending on the selection of the initial estimate, the optimization process might run into a wrong minimum. Only the ML‐CSR cost function is again convex.

In section 6.3.3, we will need to employ one of the discussed algorithms for Monte Carlo simulations to validate our theoretical results on Optimal Sensor Placement. As for the scenario under discussion it can be guaranteed that the target is exactly in the centroid of the ROs, we will use the ML‐R algorithm, as this one is the simplest of the ML approaches and should be adequate due to the described target placement.

Numerical simulation and further analysis performed in Alcocer, 2009 suggest that the results of the LS‐C, LS‐CW, ML‐R, and ML‐SR estimates are similar to each other, and that the variance of the estimation error is also close to the Cramér‐Rao bound (CRB), according to the discussions in section 4.3.2.4, if the target position is close to the centroid of the ROs. However, the farther the target moves away from the centroid, the more the ML‐R and ML‐SR approaches outperform the LS approaches, while still remaining close to the CRB. Among the LS approaches, LS‐CW performs slightly better than LS‐C, yet both can no longer be considered efficient if the target position is significantly away from the centroid.

Summing up the discussions made so far, we have shown and compared numerous possibilities for target position estimation based on noisy range measurements, and can therefore conclude Problem 1, Range‐Only Localization, according to the problem formulation in section 3.2. The question might arise whether the discussed approaches could also be employed for
Problem 2, Range‐Only Target Tracking, and therefore the Benchmark Scenario I according to section 3.3.1. We refrain from that idea mainly for two reasons: Firstly, we intend to bring available information about the maneuverability of the involved ROs and target into the equation. This is beyond the scope of the algorithms discussed so far. Secondly, it is straightforward to notice the following: For economic reasons, one will often be interested in operating with the minimum number of ROs possible. For the discussed algorithms, this requires three objects for two‐dimensional and four objects for three‐dimensional scenarios. However, as we have discussed in chapter 2, the acoustic communication as the basis for the range measurements is very error‐prone. The success rate of a communication (and therefore of a measurement) is about 50% to 90%, depending on the equipment, the environment, and a lot of other issues. If we assume a success rate of 80% and a scenario with three ROs, this means that only in 51.2% of all cases measurements from all three ROs will be available. In all other cases, no estimation can be performed, even though at least some measurements are available in most of them. This is a waste of precious information and will result in a bad overall performance.
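The figure of 51.2% can be reproduced with a short calculation. The sketch below assumes that the individual transmissions succeed independently of each other with the same probability, which is implicit in the number quoted above; the function name is chosen for illustration.

```python
from math import comb

def prob_exactly_k_successes(n_ro, k, p_success):
    """Binomial probability that exactly k out of n_ro independent range measurements succeed."""
    return comb(n_ro, k) * p_success ** k * (1.0 - p_success) ** (n_ro - k)

p_success = 0.8   # assumed success rate per measurement
n_ro = 3          # minimum number of ROs for a two-dimensional scenario

p_all = prob_exactly_k_successes(n_ro, n_ro, p_success)
p_partial = sum(prob_exactly_k_successes(n_ro, k, p_success) for k in range(1, n_ro))
p_none = prob_exactly_k_successes(n_ro, 0, p_success)

print(f"all three measurements available: {p_all:.3f}")
print(f"one or two measurements available: {p_partial:.3f}")
print(f"no measurement available:          {p_none:.3f}")
```

Under these assumptions, a snapshot algorithm that needs all three ranges would have to discard the partial information in almost half of the measurement cycles.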
Figure 5‐7: Setup of ROs (red triangles) and target (green dot) and display of the cost functions with contour map below of the three different ML cost functions for scenario 4
In the remaining part of this chapter we will discuss possibilities to enhance the performance by considering these problems. This will give rise to the Kalman filter concept as discussed in the last chapter, which allows us to easily include information on the maneuverability of target and ROs via the employed system model, as well as an improved handling of noisy measurements.
5.2 External Navigation: Supervision of a Diver by Three Surface Robots

Continuing our discussions from section 5.1, we will now allow for a movement of the target and the reference objects. This gives rise to Problem 2, Range‐Only Target Tracking, according to the definitions in section 3.2. Precisely, we will study the benchmark scenario I, as introduced in section 3.3.1. We will give a detailed description in the following subsection, using the setup that was employed in the CONMAR research project according to section 1.4.2. As it is easy to see, the overall scenario is similar to the GIB scenario we have discussed in section 2.5.5. Therefore, a promising strategy is to mimic the basic idea when formulating the system model and the estimator. We will therefore introduce a possible general solution in section 5.2.2, which is based on the discussions in Alcocer, 2009 as well as Alcocer et al., 2007. However, the solution in the described literature cannot completely be copied, as an important condition is not given in the CONMAR scenario. We will discuss this situation precisely in section 5.2.3 and suggest two possible approaches to deal with the problem. Strictly speaking, it is necessary to develop a new measurement model of the overall system to deal with the situation. We will introduce a first simplistic approach to cover the problem, followed by a more advanced one that was developed to improve the overall performance. Both approaches will be compared in simulation in section 5.2.4. The results of the employment of the advanced method in real sea trials will be discussed in section 5.2.5.

The realization of the overall system is a part of the scientific work of the author. As described, the general idea was adopted from a similar scenario in literature. Especially the improvements described in section 5.2.3 are the author's own work. Together with the validations described in sections 5.2.4 and 5.2.5, they have been published in Glotzbach et al., 2012.
5.2.1 General Setup

The application that motivated the scenario under discussion is the localization of a diver by a group of autonomous surface vehicles, based on noisy range measurements. In addition to the static navigation problem discussed before, we now assume that there is a continuous movement of the diver, also denoted as target, as well as of the surface vehicles, also denoted as Reference Objects (ROs). The latter ones have access to a GNSS like GPS and are therefore able to determine their position with high precision. All members carry acoustic modems that enable a noisy measurement of ranges between sender and receiver whenever a successful communication occurs. One can see the similarity to the described GIB concept. For the classical GIB, the target transmits an acoustic ping at fixed time intervals. This allows the ROs to measure the times of arrival (TOA) and, due to the fixed, known transmitting times, to compute the overall signal runtime and the ranges, employing the sound speed. Practical problems with this approach have been discussed in section 3.2.

To stress again the difference to the static navigation problem, it shall be noticed that at the time when an acoustic signal arrives at an RO, the target is no longer at the position at which it sent the signal, as it has most likely moved in between. This fact precludes the employment of a simple positioning algorithm. It is straightforward to model the movement of the target and to combine a priori knowledge of the target movement with the real measurements. This will give rise to estimation concepts such as the ones discussed in section 4.3.3. On top of that, the problem just described demands the employment of a back and forward propagation strategy, which will be described in section 5.2.2.3.
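The range computation of the classical GIB principle described above can be illustrated with a few lines of code. This is an idealized sketch: it assumes perfectly synchronized clocks, a known emission time and a constant sound speed, for which a nominal value of 1500 m/s is used here as an assumption; the function name and the numerical values are hypothetical.

```python
def ranges_from_toa(toa_s, emission_time_s, sound_speed_mps=1500.0):
    """Convert times of arrival (s) at the ROs into ranges (m) for a ping emitted at a known time."""
    return [sound_speed_mps * (t - emission_time_s) for t in toa_s]

# Hypothetical example: ping emitted at t = 10.0 s, received by three ROs.
toa = [10.020, 10.035, 10.050]
ranges = ranges_from_toa(toa, emission_time_s=10.0)
for i, r in enumerate(ranges, start=1):
    print(f"RO {i}: range = {r:.1f} m")
```

Note that each range computed this way refers to the target position at the emission time, while the ROs receive the signal at different later times; this is exactly the effect that motivates the back and forward propagation strategy mentioned above.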
At first, we will return to the concrete scenario under discussion, where a group of surface craft has the task to localize a human diver. As stated, this was the main application under discussion in the research project CONMAR (see section 1.4.2) or the similar joint project CO3AUV (see Birk et al., 2011, for instance). For the latter one, it was assumed that the diver has planned a desired path which he/she intends to execute in the following dive. During the dive, the diver carries special equipment, which contains a device for acoustic communication, an inertial measurement unit (see section 2.4.1), and computational hardware for the computations to be performed. As described before, the ROs have to measure their distances to the diver and to estimate his/her position. In return, they provide suggestions for heading corrections via the acoustic channel to the diver in order for him/her to stay on the preplanned path. The information can be presented to the diver by means of an array of light emitting diodes (LEDs) installed on his/her goggles, instructing him/her to change direction to the left or right. Figure 5‐8 illustrates the described scenario.
Figure 5‐8: Scenario for a diver assistant system (Glotzbach et al., 2012)
5.2.2 Solution Copied from the GIB Concept

In the following, we refer to the definitions and variables according to the introduction given in section 3.2, especially for the position of target and ROs, ranges, and measurement noise. In what follows, we describe the method employed by a typical GIB system, based on discussions in Alcocer, 2009 as well as Alcocer et al., 2007. We will start with the model for the target, formulated in a discrete‐time state space representation, referring to the state equation. The measurement model will be related to the output equation of the state space model. Due to the specific situation under discussion, we will need to apply a so‐called back‐and‐forward approach, which will be discussed before we finally introduce the Extended Kalman filter employed for the estimation of the target’s navigation data.

5.2.2.1 Target Model

The target is described by a discrete‐time kinematic model, denoted as Random Walk with Constant Turning Rate (RWCTR). This approach allows for the adaptation of circular movement paths. We assume a constant sampling time $t_{s}$ and use the counting variable $k \in \mathbb{N}$ to describe the time instance, $\mathbf{x}(k) = \mathbf{x}(t = k \cdot t_{s})$. As discussed before, we assume that the equipment of the diver contains a standard depth cell, which provides access to very precise depth measurements, and the measured values can be transferred over the acoustic link to the ROs. This makes it possible to treat the overall problem as 2D, while the depth value can either directly be taken according to the measurements, or a separate linear Kalman filter might be employed that assumes in the state model that the depth remains constant. The
latter method allows incorporating the maximum possible diving and submerging rates for the target by the covariance matrix of the process noise, and additionally it will result in smoother estimates in cases of communication losses. We will concentrate on the filter for the estimation of the horizontal navigation data.

With the approach just described, the state vector of a kinematic 2D RWCTR model, $\mathbf{x}(k)$, contains the following five quantities: $x(k)$ and $y(k)$ describe the target position in the local Cartesian XY‐frame, $v(k)$ is the magnitude of the linear velocity vector, $\chi(k)$ represents the course angle, that is, according to the discussions in section 2.2.5, the angle of the total velocity vector with respect to the x‐inertial axis, and $r(k)$ is the rate of change of $\chi$. As discussed in section 2.2.5, the course angle might differ from the heading angle $\psi$ in cases where significant sea currents are acting upon the target.

With these definitions, it is straightforward to formulate the target model as state difference equation in order to describe the state vector of the next time step as function of the state vector of the current time step. However, we must keep in mind that in the scenario under discussion, we are ‘located’ at the ROs and do not know what the diver does at any moment. To be more precise, he/she might change his/her velocity, resulting in changes in the values of $v(k)$, $\chi(k)$, and $r(k)$. On the other hand, he/she cannot immediately change his/her position coordinates, $x(k)$ and $y(k)$. Classically, one would incorporate the changes of $v(k)$, $\chi(k)$, and $r(k)$ into the input vector $\mathbf{u}(k)$. However, as we are currently in an external navigation scenario according to section 3.1, the model is executed on a central computer which might be located on one of the ROs, on a supply ship or on the nearby shore, and there is no access to the concrete actions of the target.
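The ‘trick’ used by RWCTR is to treat the state space system as autonomous, that is, not to use any inputs, and to describe all possible changes of the velocity as impact of the process noise. To this end, we introduce the stochastic processes $w_{v}$, $w_{\chi}$, and $w_{r}$, which are assumed to be stationary, independent, zero‐mean, and Gaussian, with constant standard deviations $\sigma_{v}$, $\sigma_{\chi}$, and $\sigma_{r}$, respectively. Consequently, we can write the model equations as

$$\mathbf{x}(k+1) = \mathbf{f}\left(\mathbf{x}(k), \mathbf{w}(k)\right): \quad \begin{cases} x(k+1) = x(k) + t_{s} \cdot v(k) \cdot \cos\chi(k) \\ y(k+1) = y(k) + t_{s} \cdot v(k) \cdot \sin\chi(k) \\ v(k+1) = v(k) + w_{v}(k) \\ \chi(k+1) = \chi(k) + t_{s} \cdot r(k) + w_{\chi}(k) \\ r(k+1) = r(k) + w_{r}(k) \end{cases}, \tag{5-114}$$

with state vector $\mathbf{x}(k) = \begin{bmatrix} x(k) & y(k) & v(k) & \chi(k) & r(k) \end{bmatrix}^{T}$ and disturbance vector $\mathbf{w}(k) = \begin{bmatrix} w_{v}(k) & w_{\chi}(k) & w_{r}(k) \end{bmatrix}^{T}$.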
The target model is displayed in Figure 5‐9. In order to separate the process noise from the state transition matrix, we can write the state model as vector equation. It is easy to see that the model is nonlinear. In section 4.3.3.6, we have introduced the notation for nonlinear systems and applied the Taylor series in order to linearize. This gave rise to the Jacobian matrices $\mathbf{F}$ and $\mathbf{G}$, see equation (4‐327). For the time being, we will keep the nonlinear notation, but introduce the state‐dependent matrix $\mathbf{F}(\mathbf{x}(k))$. Note that the process noise enters into the state equations in a linear manner, therefore $\mathbf{G}(\mathbf{x}(k)) = \mathbf{G}$.
Figure 5‐9: Discrete‐time kinematic target model
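As a summary, the state equation admits the representation:

$$\mathbf{x}(k+1) = \mathbf{F}(\mathbf{x}(k))\,\mathbf{x}(k) + \mathbf{G}\,\mathbf{w}(k),$$

$$\text{with } \mathbf{F}(\mathbf{x}(k)) = \begin{bmatrix} 1 & 0 & t_{s} \cdot \cos\chi(k) & 0 & 0 \\ 0 & 1 & t_{s} \cdot \sin\chi(k) & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & t_{s} \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \text{ and } \mathbf{G} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{5-115}$$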
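The RWCTR model can be sketched in a few lines to show that the component form (5‐114) and the matrix form (5‐115) describe the same state transition. The function names, the sampling time and the noise standard deviations below are assumptions chosen for illustration; they are not the values used in the CONMAR project.

```python
import numpy as np

def rwctr_step(x, w, ts):
    """One step of the RWCTR model in component form (5-114).
    x = [x, y, v, chi, r], w = [w_v, w_chi, w_r], ts = sampling time."""
    px, py, v, chi, r = x
    w_v, w_chi, w_r = w
    return np.array([
        px + ts * v * np.cos(chi),
        py + ts * v * np.sin(chi),
        v + w_v,
        chi + ts * r + w_chi,
        r + w_r,
    ])

def rwctr_matrices(x, ts):
    """State-dependent matrix F(x(k)) and constant noise input matrix G of (5-115)."""
    chi = x[3]
    F = np.array([
        [1.0, 0.0, ts * np.cos(chi), 0.0, 0.0],
        [0.0, 1.0, ts * np.sin(chi), 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0, ts],
        [0.0, 0.0, 0.0, 0.0, 1.0],
    ])
    G = np.array([
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0],
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ])
    return F, G

# Hypothetical values: diver at 0.5 m/s on a slow left turn, sampled every second.
ts = 1.0
x0 = np.array([0.0, 0.0, 0.5, 0.3, 0.02])
rng = np.random.default_rng(3)
w0 = rng.normal(0.0, [0.05, 0.01, 0.005])   # assumed standard deviations of w_v, w_chi, w_r

x1_component = rwctr_step(x0, w0, ts)
F, G = rwctr_matrices(x0, ts)
x1_matrix = F @ x0 + G @ w0
print(x1_component, x1_matrix)
```

With all process noise set to zero, the speed and the turning rate stay constant and the model traces a circular path, which is the nominal behaviour the name of the model refers to.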
5.2.2.2 Measurement Model

To get a first impression of the problem, we will look at a possible solution for the measurement model as a first approach. In the later sections, this concept will be further developed to give a better fit for the stated problem. For the time being, let us assume that the used equipment allows for a noisy range measurement $\grave{r}_{i}(k)$ between target and the $i$th RO at time $t_{k}$:

$$\grave{r}_{i}(k) = r_{i}(k) + \left(1 + \eta \cdot r_{3\mathrm{D},i}(k)\right) \cdot v_{r,i}(k). \tag{5-116}$$
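In the following, we will explain this equation and the variables in detail. It can be stated that $r_{i}(k)$ refers to the true distance between target and RO,

$$r_{i}(k) = \left\|\mathbf{p}(k) - \mathbf{p}_{i}(k)\right\| = \sqrt{\left(x(k) - x_{i}(k)\right)^{2} + \left(y(k) - y_{i}(k)\right)^{2}}. \tag{5-117}$$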
It shall be noticed that the described approach is 2D, in contrast to the 3D character of reality. But as we assumed that the measurement of the vehicle depth is uncomplicated, the problem can be formulated in 2D with little computational effort.
In the second part of equation (5‐116), $v_{r,i}(k)$ represents the measurement noise, which is assumed to be white, Gaussian, and zero‐mean with variance ${\sigma'}^{2}$. Furthermore, practical experience as well as physical considerations lead us to the assumption that the true variance of the measurement noise might grow with the distance. To this end, we adopt a model where ${\sigma'}^{2}$ represents the variance if the range approaches zero. To bring the range dependency into the equation, $v_{r,i}$ is multiplied by the factor $1 + \eta \cdot r_{3\mathrm{D},i}(k)$, where $\eta$ [m⁻¹] expresses the rate of growth of the measurement error with respect to distance, and $r_{3\mathrm{D},i}(k)$ is the true three‐dimensional distance. In certain scenarios, depending on the target depth and position of the ROs, one might replace $r_{3\mathrm{D},i}(k)$ by the two‐dimensional $r_{i}(k)$ for the sake of simplicity. All in all, it can be stated that the true variance $\sigma_{i}^{2}(k)$ can be expressed as:

$$\sigma_{i}^{2}(k) = f\left(r_{3\mathrm{D},i}(k)\right) = \left(1 + \eta \cdot r_{3\mathrm{D},i}(k)\right)^{2} \cdot {\sigma'}^{2}. \tag{5-118}$$
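The range‐dependent noise model of (5‐116) and (5‐118) can be illustrated by a small simulation. The sketch assumes that the same distance is used for the range and for the scaling factor (that is, the two‐dimensional simplification mentioned above); the parameter values for $\eta$ and $\sigma'$ are hypothetical and only serve to show the scaling of the standard deviation with distance.

```python
import numpy as np

def range_noise_std(r_true, eta, sigma0):
    """Standard deviation of the range error at distance r_true, i.e. the square root of (5-118)."""
    return (1.0 + eta * r_true) * sigma0

def simulate_range_measurements(r_true, eta, sigma0, n_samples, rng):
    """Noisy range measurements according to (5-116), using r_true also in the scaling factor."""
    v = rng.normal(0.0, sigma0, size=n_samples)   # zero-mean Gaussian noise with std sigma0
    return r_true + (1.0 + eta * r_true) * v

eta = 0.01      # assumed growth rate of the error in 1/m
sigma0 = 0.5    # assumed standard deviation in m for a range approaching zero
rng = np.random.default_rng(4)

for r_true in (10.0, 100.0, 300.0):
    samples = simulate_range_measurements(r_true, eta, sigma0, 20000, rng)
    print(f"r = {r_true:5.0f} m: model std = {range_noise_std(r_true, eta, sigma0):.3f} m, "
          f"sample std = {np.std(samples):.3f} m")
```

In an Extended Kalman filter, `range_noise_std` would typically be evaluated at the predicted range to set the measurement noise variance of each RO individually; this usage is a design suggestion and not prescribed by the text above.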
It shall be noted at this point that the impact of the dependency between range and range measurement error variance is still under discussion. Experiences have shown that the effect can be neglected at small ranges (