Neural Network Control of Robot Manipulators and Nonlinear Systems
F.L. LEWIS, Automation and Robotics Research Institute, The University of Texas at Arlington
S. JAGANNATHAN, Systems and Controls Research, Caterpillar, Inc., Mossville
A. YEŞILDIREK, Manager, New Product Development, Depsa, Panama City
UK: Taylor & Francis Ltd, 1 Gunpowder Square, London EC4A 3DE
USA: Taylor & Francis Inc., 325 Chestnut Street, Philadelphia PA 19106

Copyright © Taylor & Francis 1999
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, electrostatic, magnetic tape, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.
British Library Cataloguing in Publication Data A catalogue record for this book is available from the British Library. ISBN 0-7484-0596-8 (cased) Library of Congress Cataloguing-in-Publication Data are available Cover design by Amanda Barragry
Contents

List of Tables of Design Equations  xi
List of Figures  xiii
Series Introduction  xix
Preface  xxi

1 Background on Neural Networks  1
  1.1 NEURAL NETWORK TOPOLOGIES AND RECALL  2
      1.1.1 Neuron Mathematical Model  2
      1.1.2 Multilayer Perceptron  7
      1.1.3 Linear-in-the-Parameter (LIP) Neural Nets  10
      1.1.4 Dynamic Neural Networks  14
  1.2 PROPERTIES OF NEURAL NETWORKS  24
      1.2.1 Classification, Association, and Pattern Recognition  25
      1.2.2 Function Approximation  30
  1.3 NEURAL NETWORK WEIGHT SELECTION AND TRAINING  32
      1.3.1 Direct Computation of the Weights  33
      1.3.2 Training the One-Layer Neural Network - Gradient Descent  35
      1.3.3 Training the Multilayer Neural Network - Backpropagation Tuning  43
      1.3.4 Improvements on Gradient Descent  53
      1.3.5 Hebbian Tuning  56
      1.3.6 Continuous-Time Tuning  57
  1.4 REFERENCES  60
  1.5 PROBLEMS  63

2 Background on Dynamic Systems  67
  2.1 DYNAMICAL SYSTEMS  67
      2.1.1 Continuous-Time Systems  68
      2.1.2 Discrete-Time Systems  71
  2.2 SOME MATHEMATICAL BACKGROUND  75
      2.2.1 Vector and Matrix Norms  75
      2.2.2 Continuity and Function Norms  76
  2.3 PROPERTIES OF DYNAMICAL SYSTEMS  77
      2.3.1 Stability  78
      2.3.2 Passivity  80
      2.3.3 Observability and Controllability  83
  2.4 FEEDBACK LINEARIZATION AND CONTROL SYSTEM DESIGN  86
      2.4.1 Input-Output Feedback Linearization Controllers  87
      2.4.2 Computer Simulation of Feedback Control Systems  92
      2.4.3 Feedback Linearization for Discrete-Time Systems  96
  2.5 NONLINEAR STABILITY ANALYSIS AND CONTROLS DESIGN  97
      2.5.1 Lyapunov Analysis for Autonomous Systems  97
      2.5.2 Controller Design Using Lyapunov Techniques  103
      2.5.3 Lyapunov Analysis for Non-Autonomous Systems  106
      2.5.4 Extensions of Lyapunov Techniques and Bounded Stability  109
  2.6 REFERENCES  115
  2.7 PROBLEMS  116

3 Robot Dynamics and Control  123
      3.0.1 Commercial Robot Controllers  123
  3.1 KINEMATICS AND JACOBIANS  124
      3.1.1 Kinematics of Rigid Serial-Link Manipulators  125
      3.1.2 Robot Jacobians  128
  3.2 ROBOT DYNAMICS AND PROPERTIES  129
      3.2.1 Joint Space Dynamics and Properties  130
      3.2.2 State Variable Representations  134
      3.2.3 Cartesian Dynamics and Actuator Dynamics  135
  3.3 COMPUTED-TORQUE (CT) CONTROL AND COMPUTER SIMULATION  136
      3.3.1 Computed-Torque (CT) Control  136
      3.3.2 Computer Simulation of Robot Controllers  138
      3.3.3 Approximate Computed-Torque Control and Classical Joint Control  143
      3.3.4 Digital Control  145
  3.4 FILTERED-ERROR APPROXIMATION-BASED CONTROL  147
      3.4.1 A General Controller Design Framework Based on Approximation  154
      3.4.2 Computed-Torque Control Variant  156
      3.4.3 Adaptive Control  156
      3.4.4 Robust Control  162
      3.4.5 Learning Control  165
  3.5 CONCLUSIONS  167
  3.6 REFERENCES  168
  3.7 PROBLEMS  169
4 Neural Network Robot Control  173
  4.1 ROBOT ARM DYNAMICS AND TRACKING ERROR DYNAMICS  176
  4.2 ONE-LAYER FUNCTIONAL-LINK NEURAL NETWORK CONTROLLER  179
      4.2.1 Approximation by One-Layer Functional-Link NN  180
      4.2.2 NN Controller and Error System Dynamics  181
      4.2.3 Unsupervised Backpropagation Weight Tuning  182
      4.2.4 Augmented Unsupervised Backpropagation Tuning - Removing the PE Condition  187
      4.2.5 Functional-Link NN Controller Design and Simulation Example  190
  4.3 TWO-LAYER NEURAL NETWORK CONTROLLER  191
      4.3.1 NN Approximation and the Nonlinearity in the Parameters Problem  194
      4.3.2 Controller Structure and Error System Dynamics  196
      4.3.3 Weight Updates for Guaranteed Tracking Performance  198
      4.3.4 Two-Layer NN Controller Design and Simulation Example  206
  4.4 PARTITIONED NN AND SIGNAL PREPROCESSING  206
      4.4.1 Partitioned NN  206
      4.4.2 Preprocessing of Neural Net Inputs  209
      4.4.3 Selection of a Basis Set for the Functional-Link NN  209
  4.5 PASSIVITY PROPERTIES OF NN CONTROLLERS  212
      4.5.1 Passivity of the Tracking Error Dynamics  212
      4.5.2 Passivity Properties of NN Controllers  213
  4.6 CONCLUSIONS  216
  4.7 REFERENCES  217
  4.8 PROBLEMS  219
5 Neural Network Robot Control: Applications and Extensions  221
  5.1 FORCE CONTROL USING NEURAL NETWORKS  222
      5.1.1 Force Constrained Motion and Error Dynamics  223
      5.1.2 Neural Network Hybrid Position/Force Controller  225
      5.1.3 Design Example for NN Hybrid Position/Force Controller  232
  5.2 ROBOT MANIPULATORS WITH LINK FLEXIBILITY, MOTOR DYNAMICS, AND JOINT FLEXIBILITY  233
      5.2.1 Flexible-Link Robot Arms  233
      5.2.2 Robots with Actuators and Compliant Drive Train Coupling  238
      5.2.3 Rigid-Link Electrically-Driven (RLED) Robot Arms  244
  5.3 SINGULAR PERTURBATION DESIGN  245
      5.3.1 Two-Time-Scale Controller Design  246
      5.3.2 NN Controller for Flexible-Link Robot Using Singular Perturbations  249
  5.4 BACKSTEPPING DESIGN  258
      5.4.1 Backstepping Design  258
      5.4.2 NN Controller for Rigid-Link Electrically-Driven Robot Using Backstepping  262
  5.5 CONCLUSIONS  267
  5.6 REFERENCES  270
  5.7 PROBLEMS  272
6 Neural Network Control of Nonlinear Systems  277
  6.1 SYSTEM AND TRACKING ERROR DYNAMICS  278
      6.1.1 Tracking Controller and Error Dynamics  279
      6.1.2 Well-Defined Control Problem  281
  6.2 CASE OF KNOWN FUNCTION g(x)  281
      6.2.1 Proposed NN Controller  282
      6.2.2 NN Weight Tuning for Tracking Stability  283
      6.2.3 Illustrative Simulation Example  286
  6.3 CASE OF UNKNOWN FUNCTION g(x)  287
      6.3.1 Proposed NN Controller  287
      6.3.2 NN Weight Tuning for Tracking Stability  289
      6.3.3 Illustrative Simulation Examples  296
  6.4 CONCLUSIONS  301
  6.5 REFERENCES  303
7 NN Control with Discrete-Time Tuning  305
  7.1 BACKGROUND AND ERROR DYNAMICS  306
      7.1.1 Neural Network Approximation Property  306
      7.1.2 Stability of Systems  308
      7.1.3 Tracking Error Dynamics for a Class of Nonlinear Systems  308
  7.2 ONE-LAYER NEURAL NETWORK CONTROLLER DESIGN  310
      7.2.1 Structure of the One-Layer NN Controller and Error System Dynamics  311
      7.2.2 One-Layer Neural Network Weight Updates  312
      7.2.3 Projection Algorithm  316
      7.2.4 Ideal Case: No Disturbances or NN Reconstruction Errors  321
      7.2.5 One-Layer Neural Network Weight Tuning Modification for Relaxation of Persistency of Excitation Condition  321
  7.3 MULTILAYER NEURAL NETWORK CONTROLLER DESIGN  327
      7.3.1 Structure of the NN Controller and Error System Dynamics  330
      7.3.2 Multilayer Neural Network Weight Updates  331
      7.3.3 Projection Algorithm  338
      7.3.4 Multilayer Neural Network Weight Tuning Modification for Relaxation of Persistency of Excitation Condition  340
  7.4 PASSIVITY PROPERTIES OF THE NN  350
      7.4.1 Passivity Properties of the Tracking Error System  350
      7.4.2 Passivity Properties of One-Layer Neural Networks and the Closed-Loop System  352
      7.4.3 Passivity Properties of Multilayer Neural Networks  353
  7.5 CONCLUSIONS  354
  7.6 REFERENCES  354
  7.7 PROBLEMS  356
8 Discrete-Time Feedback Linearization by Neural Networks  359
  8.1 SYSTEM DYNAMICS AND THE TRACKING PROBLEM  360
      8.1.1 Tracking Error Dynamics for a Class of Nonlinear Systems  360
  8.2 NN CONTROLLER DESIGN FOR FEEDBACK LINEARIZATION  362
      8.2.1 NN Approximation of Unknown Functions  363
      8.2.2 Error System Dynamics  364
      8.2.3 Well-Defined Control Problem  366
      8.2.4 Proposed Controller  367
  8.3 SINGLE-LAYER NN FOR FEEDBACK LINEARIZATION  367
      8.3.1 Weight Updates Requiring Persistence of Excitation  368
      8.3.2 Projection Algorithm  375
      8.3.3 Weight Updates not Requiring Persistence of Excitation  376
  8.4 MULTILAYER NEURAL NETWORKS FOR FEEDBACK LINEARIZATION  383
      8.4.1 Weight Updates Requiring Persistence of Excitation  384
      8.4.2 Weight Updates not Requiring Persistence of Excitation  390
  8.5 PASSIVITY PROPERTIES OF THE NN  402
      8.5.1 Passivity Properties of the Tracking Error System  405
      8.5.2 Passivity Properties of One-Layer Neural Network Controllers  406
      8.5.3 Passivity Properties of Multilayer Neural Network Controllers  407
  8.6 CONCLUSIONS  409
  8.7 REFERENCES  409
  8.8 PROBLEMS  411

9 State Estimation Using Discrete-Time Neural Networks  413
  9.1 IDENTIFICATION OF NONLINEAR DYNAMICAL SYSTEMS  415
  9.2 IDENTIFIER DYNAMICS FOR MIMO SYSTEMS  415
  9.3 MULTILAYER NEURAL NETWORK IDENTIFIER DESIGN  418
      9.3.1 Structure of the NN Controller and Error System Dynamics  418
      9.3.2 Three-Layer Neural Network Weight Updates  420
  9.4 PASSIVITY PROPERTIES OF THE NN  425
  9.5 SIMULATION RESULTS  427
  9.6 CONCLUSIONS  428
  9.7 REFERENCES  428
  9.8 PROBLEMS  430
List of Tables

1.3.1 Basic Matrix Calculus and Trace Identities  38
1.3.2 Backpropagation Algorithm Using Sigmoid Activation Functions: Two-Layer Net  47
1.3.3 Continuous-Time Backpropagation Algorithm Using Sigmoid Activation Functions  59
3.2.1 Properties of Robot Arm Dynamics  131
3.3.1 Robot Manipulator Control Algorithms  137
3.4.1 Filtered-Error Approximation-Based Control Algorithms  153
4.1.1 Properties of Robot Arm Dynamics  176
4.2.1 FLNN Controller for Ideal Case, or for Nonideal Case with PE  184
4.2.2 FLNN Controller with Augmented Tuning to Avoid PE  188
4.3.1 Two-Layer NN Controller for Ideal Case  199
4.3.2 Two-Layer NN Controller with Augmented Backprop Tuning  201
4.3.3 Two-Layer NN Controller with Augmented Hebbian Tuning  204
5.0.1 Properties of Robot Arm Dynamics  222
5.1.1 NN Force/Position Controller  228
5.3.1 NN Controller for Flexible-Link Robot Arm  255
5.4.1 NN Backstepping Controller for RLED Robot Arm  266
6.2.1 Neural Net Controller with Known g(x)  284
6.3.1 Neural Net Controller with Unknown f(x) and g(x)  290
7.2.1 Discrete-Time Controller Using One-Layer Neural Net: PE Required  313
7.2.2 Discrete-Time Controller Using One-Layer Neural Net: PE not Required  322
7.3.1 Discrete-Time Controller Using Three-Layer Neural Net: PE Required  332
7.3.2 Discrete-Time Controller Using Three-Layer Neural Net: PE not Required  344
8.3.1 Discrete-Time Controller Using One-Layer Neural Net: PE Required  368
8.3.2 Discrete-Time Controller Using One-Layer Neural Net: PE not Required  377
8.4.1 Discrete-Time Controller Using Multilayer Neural Net: PE Required  384
8.4.2 Discrete-Time Controller Using Multilayer Neural Net: PE not Required  392
9.3.1 Multilayer Neural Net Identifier  425
List of Figures

1.1.1 Neuron anatomy. From B. Kosko (1992).  2
1.1.2 Mathematical model of a neuron.  3
1.1.3 Some common choices for the activation function.  4
1.1.4 One-layer neural network.  5
1.1.5 Output surface of a one-layer NN. (a) Using sigmoid activation function. (b) Using hard limit activation function. (c) Using radial basis function.  6
1.1.6 Two-layer neural network.  8
1.1.7 EXCLUSIVE-OR implemented using two-layer neural network.  8
1.1.8 Output surface of a two-layer NN. (a) Using sigmoid activation function. (b) Using hard limit activation function.  11
1.1.9 Two-dimensional separable gaussian functions for an RBF NN.  13
1.1.10 Receptive field functions for a 2-D CMAC NN with second-order splines.  13
1.1.11 Hopfield dynamical neural net.  14
1.1.12 Continuous-time Hopfield net hidden-layer neuronal processing element (NPE) dynamics.  15
1.1.13 Discrete-time Hopfield net hidden-layer NPE dynamics.  15
1.1.14 Continuous-time Hopfield net in block diagram form.  16
1.1.15 Hopfield net functions. (a) Symmetric sigmoid activation function. (b) Inverse of symmetric sigmoid activation function.  18
1.1.16 Hopfield net phase-plane plots; x2(t) versus x1(t).  19
1.1.17 Lyapunov energy surface for an illustrative Hopfield net.  20
1.1.18 Generalized continuous-time dynamical neural network.  21
1.1.19 Phase-plane plot of discrete-time NN showing attractor.  22
1.1.20 Phase-plane plot of discrete-time NN with modified V weights.  23
1.1.21 Phase-plane plot of discrete-time NN with modified V weights showing limit-cycle attractor.  23
1.1.22 Phase-plane plot of discrete-time NN with modified A matrix.  24
1.1.23 Phase-plane plot of discrete-time NN with modified A matrix and V matrix.  25
1.2.1 Decision regions of a simple one-layer NN.  26
1.2.2 Types of decision regions that can be formed using single- and multi-layer NN. From R.P. Lippmann (1987).  27
1.2.3 Output error plots versus weights for a neuron. (a) Error surface using sigmoid activation function. (b) Error contour plot using sigmoid activation function. (c) Error surface using hard limit activation function.  29
1.3.1 Pattern vectors to be classified into 4 groups: +, o, x, *. Also shown are the initial decision boundaries.  40
1.3.2 NN decision boundaries. (a) After three epochs of training. (b) After six epochs of training.  42
1.3.3 Least-squares NN output error versus epoch.  43
1.3.4 The adjoint (backpropagation) neural network.  49
1.3.5 Function y = f(x) to be approximated by two-layer NN and its samples for training.  49
1.3.6 Samples of f(x) and actual NN output. (a) Using initial random weights. (b) After training for 50 epochs.  51
1.3.6 Samples of f(x) and actual NN output (cont'd). (c) After training for 200 epochs. (d) After training for 873 epochs.  52
1.3.7 Least-squares NN output error as a function of training epoch.  53
1.3.8 Typical 1-D NN error surface e = y − σ(Vᵀx).  54
1.5.1 A dynamical neural network with internal neuron dynamics.  63
1.5.2 A dynamical neural network with outer feedback loops.  64
2.1.1 Continuous-time single-input Brunovsky form.  69
2.1.2 Van der Pol oscillator time history plots. (a) x1(t) and x2(t) versus t. (b) Phase-plane plot x2 versus x1 showing limit cycle.  72
2.1.3 Discrete-time single-input Brunovsky form.  73
2.3.1 Illustration of uniform ultimate boundedness (UUB).  79
2.3.2 System with measurement nonlinearity.  81
2.3.3 Two passive systems in feedback interconnection.  83
2.4.1 Feedback linearization controller showing PD outer loop and nonlinear inner loop.  89
2.4.2 Simulation of feedback linearization controller, T = 10 sec. (a) Actual output y(t) and desired output yd(t). (b) Tracking error e(t). (c) Internal dynamics state x2(t).  94
2.4.3 Simulation of feedback linearization controller, T = 0.1 sec. (a) Actual output y(t) and desired output yd(t). (b) Tracking error e(t). (c) Internal dynamics state x2(t).  95
2.5.1 Sample trajectories of system with local asymptotic stability. (a) x1(t) and x2(t) versus t. (b) Phase-plane plot of x2 versus x1.  100
2.5.2 Sample trajectories of SISL system. (a) x1(t) and x2(t) versus t. (b) Phase-plane plot of x2 versus x1.  102
2.5.3 A function satisfying the condition xc(x) > 0.  103
2.5.4 Signum function.  104
2.5.5 Depiction of a time-varying function L(x, t) that is positive definite (L0(x) < L(x, t)) and decrescent (L(x, t) ≤ L1(x)).  107
2.5.6 Sample trajectories of Mathieu system. (a) x1(t) and x2(t) versus t. (b) Phase-plane plot of x2 versus x1.  110
2.5.7 Sample closed-loop trajectories of UUB system.  114
3.1.1 Basic robot arm geometries. (a) Articulated arm, revolute coordinates (RRR). (b) Spherical coordinates (RRP). (c) SCARA arm (RRP).  126
3.1.1 Basic robot arm geometries (cont'd.). (d) Cylindrical coordinates (RPP). (e) Cartesian arm, rectangular coordinates (PPP).  127
3.1.2 Denavit-Hartenberg coordinate frames in a serial-link manipulator.  127
3.2.1 Two-link planar robot arm.  132
3.3.1 PD computed-torque controller.  138
3.3.2 PD computed-torque controller, Part I.  139
3.3.3 Joint tracking errors using PD computed-torque controller under ideal conditions.  142
3.3.4 Joint tracking errors using PD computed-torque controller with constant unknown disturbance.  142
3.3.5 PD classical joint controller.  144
3.3.6 Joint tracking errors using PD-gravity controller.  145
3.3.7 Joint tracking errors using classical independent joint control.  146
3.3.8 Digital controller, Part I: Routine robot.m, Part I.  148
3.3.9 Joint tracking errors using digital computed-torque controller, T = 20 msec.  151
3.3.10 Joint 2 control torque using digital computed-torque controller, T = 20 msec.  151
3.3.11 Joint tracking errors using digital computed-torque controller, T = 100 msec.  152
3.3.12 Joint 2 control torque using digital computed-torque controller, T = 100 msec.  152
3.4.1 Filtered error approximation-based controller.  155
3.4.2 Adaptive controller.  160
3.4.3 Response using adaptive controller. (a) Actual and desired joint angles. (b) Mass estimates.  161
3.4.4 Response using adaptive controller with incorrect regression matrix, showing the effects of unmodelled dynamics. (a) Actual and desired joint angles. (b) Mass estimates.  162
3.4.5 Robust controller.  166
3.4.6 Typical behavior of robust controller.  167
3.7.1 Two-link polar robot arm.  170
3.7.2 Three-link cylindrical robot arm.  171
4.0.1 Two-layer neural net.  174
4.1.1 Filtered error approximation-based controller.  178
4.2.1 One-layer functional-link neural net.  179
4.2.2 Neural net control structure.  181
4.2.3 Two-link planar elbow arm.  190
4.2.4 Response of NN controller with backprop weight tuning: actual and desired joint angles.  191
4.2.5 Response of NN controller with backprop weight tuning: representative weight estimates.  192
4.2.6 Response of NN controller with improved weight tuning: actual and desired joint angles.  192
4.2.7 Response of NN controller with improved weight tuning: representative weight estimates.  193
4.2.8 Response of controller without NN: actual and desired joint angles.  193
4.3.1 Multilayer NN controller structure.  197
4.3.2 Response of NN controller with improved weight tuning: actual and desired joint angles.  207
4.3.3 Response of NN controller with improved weight tuning: representative weight estimates.  207
4.4.1 Partitioned neural net.  209
4.4.2 Neural subnet for estimating M(q)ζ1(t).  211
4.4.3 Neural subnet for estimating Vm(q, q̇)ζ2(t).  211
4.4.4 Neural subnet for estimating G(q).  211
4.4.5 Neural subnet for estimating F(q̇).  212
4.5.1 Two-layer neural net closed-loop error system.  214
5.1.1 Two-layer neural net.  226
5.1.2 Neural net hybrid position/force controller.  227
5.1.3 Closed-loop position error system.  231
5.1.4 Two-link planar elbow arm with circle constraint.  232
5.1.5 NN force/position controller simulation results. (a) Desired and actual motion trajectories q1d(t) and q1(t). (b) Force trajectory λ(t).  234
5.2.1 Acceleration/deceleration torque profile r(t).  238
5.2.2 Open-loop response of flexible arm: tip position qr(t) (solid) and velocity (dashed).  239
5.2.3 Open-loop response of flexible arm: flexible modes qh(t), qh(t).  239
5.2.4 Two canonical control problems with high-frequency modes. (a) Flexible-link robot arm. (b) Flexible-joint robot arm.  241
5.2.5 DC motor with shaft compliance. (a) Electrical subsystem. (b) Mechanical subsystem.  242
5.2.6 Step response of DC motor with no shaft flexibility. Motor speed in rad/s.  244
5.2.7 Step response of DC motor with very flexible shaft.  245
5.3.1 Neural net controller for flexible-link robot arm.  252
5.3.2 Response of flexible arm with NN and boundary layer correction. Actual and desired tip positions and velocities, ε = 0.26.  256
5.3.3 Response of flexible arm with NN and boundary layer correction. Flexible modes, ε = 0.26.  256
5.3.4 Response of flexible arm with NN and boundary layer correction. Actual and desired tip positions and velocities, ε = 0.1.  257
5.3.5 Response of flexible arm with NN and boundary layer correction. Flexible modes, ε = 0.1.  258
5.4.1 Backstepping controller.  259
5.4.2 Backstepping neural network controller.  264
5.4.3 Response of RLED controller with only PD control. (a) Actual and desired joint angle q1(t). (b) Actual and desired joint angle q2(t). (c) Tracking errors e1(t), e2(t). (d) Control torques.
5.4.4

for every ε > 0 there exists a δ(ε, t0) > 0 such that ||x0 − xe|| < δ(ε, t0) implies that ||x(t) − xe|| < ε for t ≥ t0. The stability is said to be uniform (e.g. uniformly SISL) if δ(·) is independent of t0; that is, the system is SISL for all t0. It is extremely interesting to compare these definitions to those of function continuity and uniform continuity. SISL is a notion of continuity for dynamical
systems. Note that for SISL there is a requirement that the state x(t) be kept arbitrarily close to xe by starting sufficiently close to it. This is still too strong a requirement for closed-loop control in the presence of unknown disturbances. Therefore, a practical definition of stability to be used as a performance objective for feedback controller design in this book is as follows.

Figure 2.3.1: Illustration of uniform ultimate boundedness (UUB).

Boundedness. This definition is illustrated in Fig. 2.3.1. The equilibrium point xe is said to be uniformly ultimately bounded (UUB) if there exists a compact set S ⊂ ℝⁿ so that for all x0 ∈ S there exists a bound B and a time T(B, x0) such that ||x(t) − xe|| ≤ B for all t ≥ t0 + T. The intent here is to capture the notion that for all initial states in the compact set S, the system trajectory eventually reaches, after a lapsed time of T, a bounded neighborhood of xe. The difference between UUB and SISL is that in UUB the bound B cannot be made arbitrarily small by starting closer to xe. In fact, the van der Pol oscillator in Example 2.1.2 is UUB but not SISL. In practical closed-loop systems, B depends on the disturbance magnitudes and other factors. If the controller is suitably designed, however, B will be small enough for practical purposes. The term uniform indicates that T does not depend on t0. The term ultimate indicates that the boundedness property holds after a time lapse T. If S = ℝⁿ the system is said to be globally UUB (GUUB).

A Note on Autonomous Systems and Linear Systems. If the system is autonomous so that

    ẋ = f(x)                                           (2.3.2)

where f(x) is not an explicit function of time, then the state trajectory is independent of the initial time. This means that if an equilibrium point is stable by any of the three definitions, the stability is automatically uniform. Non-uniformity is only a problem with non-autonomous systems. If the system is linear so that

    ẋ = A(t)x                                          (2.3.3)
with A(t) an n × n matrix, then the only possible equilibrium point is the origin. For linear time-invariant (LTI) systems, matrix A is time-invariant. Then, the system poles are given by the roots of the characteristic equation

    Δ(s) = |sI − A| = 0,                               (2.3.4)

with |·| the matrix determinant and s the Laplace transform variable. For LTI systems, AS corresponds to the requirement that all system poles be in the open left-half plane (i.e. none are allowed on the jω-axis). SISL corresponds to marginal stability, that is, all the poles in the closed left-half plane, with any poles on the jω-axis nonrepeated.

2.3.2 Passivity
Passive systems are important in robust control where a feedback control system must be designed to offset the effects of bounded disturbances or unmodelled dynamics. Since we intend to define some new passivity properties of NN, we discuss here some notions of passivity (Goodwin and Sin 1984; Landau 1979; Lewis, Abdallah, and Dawson 1993; Slotine and Li 1991). Passivity is extensively used in the theory of networks and n-port devices.

2.3.2.1 Passivity of Continuous-Time Systems
A continuous-time system (e.g. (2.1.1)) with input u(t) and output y(t) is said to be passive if it verifies an equality of the so-called power form

    L̇(t) = yᵀu − g(t)                                  (2.3.5)

for some L(t) that is lower bounded and some g(t) ≥ 0. That is (see Problems section),

    ∫_0^T yᵀ(τ)u(τ) dτ ≥ −γ²                           (2.3.6)

for all T ≥ 0 and some γ ≥ 0. Often, L(t) is the total energy, kinetic plus potential; then, the power input to the system is yᵀu and g(t) is the dissipated power. We say the system is dissipative if it is passive and in addition

    ∫_0^T yᵀ(τ)u(τ) dτ ≠ 0 implies ∫_0^T g(τ) dτ > 0   (2.3.7)

for all T > 0. A special sort of dissipativity occurs if g(t) is a monic quadratic function of ||x|| with bounded coefficients, where x(t) is the internal state of the system. We call this state strict passivity (SSP). Then,

    ∫_0^T yᵀ(τ)u(τ) dτ ≥ ∫_0^T (||x||² + LOT) dτ − γ²  (2.3.8)

for all T ≥ 0 and some γ ≥ 0, where LOT denotes lower-order terms in ||x||. Then, the L2 norm of the state is overbounded in terms of the L2 inner product of output and input (i.e. the power delivered to the system).
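The power-form bookkeeping above is easy to check numerically. The sketch below uses an illustrative system of our own choosing (not one from the text): ẋ = −x + u with output y = x, storage L = x²/2 and dissipation g = x², so that L̇ = yu − g holds exactly. A forward-Euler integration then confirms the energy balance L(T) − L(0) ≈ ∫(yu − g) dt and the passivity inequality (2.3.6) with γ² = L(0).

```python
# Illustrative check of the continuous-time power form (2.3.5)-(2.3.6).
# System (our own choice, not from the text): x_dot = -x + u, y = x,
# storage L = x^2/2, dissipation g = x^2, so L_dot = y*u - g exactly.
import math

def simulate(T=10.0, dt=1e-3, x0=2.0):
    x = x0
    L0 = 0.5 * x0 * x0
    supply = 0.0          # running integral of y*u
    dissip = 0.0          # running integral of g = x^2
    t = 0.0
    while t < T:
        u = math.sin(t)   # arbitrary bounded input
        y = x
        supply += y * u * dt
        dissip += x * x * dt
        x += (-x + u) * dt    # forward-Euler step
        t += dt
    return L0, 0.5 * x * x, supply, dissip

L0, LT, supply, dissip = simulate()
# Energy balance: L(T) - L(0) = supply - dissipation (up to Euler error)
print(abs((LT - L0) - (supply - dissip)))
# Passivity inequality (2.3.6) with gamma^2 = L(0):
print(supply >= -L0)
```

Since L ≥ 0 and g ≥ 0, the supplied energy ∫yu dt = L(T) − L(0) + ∫g dt can never fall below −L(0), which is exactly the bound (2.3.6).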
Figure 2.3.2: System with measurement nonlinearity.

Somewhat surprisingly, the concept of SSP has not been extensively used in the literature (Lewis, Liu, and Yeşildirek 1993; Seron et al. 1994), though see Goodwin and Sin (1984) where input and output strict passivity are defined. We use SSP to advantage in subsequent chapters to conclude some internal boundedness properties of neural network controllers without the usual assumptions of observability (e.g. persistence of excitation) that are required in standard adaptive control approaches (see Chapter 4).

Example 2.3.1 (Passivity of System with Nonlinearity)
Many practical systems have nonlinear and/or discontinuous measurements or actuators, including backlash, deadzones, saturation limits, and so on. The time-varying system with nonlinear measurements (Slotine and Li 1991)

    ẋ = −λ(t)x + u
    y = h(x)                                           (2.3.9)

with λ(t) ≥ 0, is depicted in Fig. 2.3.2. The nonlinearity h(x) satisfies the positivity condition xh(x) > 0 for x ≠ 0, which means it has the same sign as its argument. Otherwise, it is arbitrary and can even be discontinuous. Select the trial function

    L = ∫_0^x h(z) dz ≥ 0

and, using Leibniz' rule, differentiate to determine

    L̇ = h(x)ẋ = yu − h(x)λ(t)x = yu − g(t),

which is in power form. Therefore, the system is passive. Since the condition (2.3.7) holds if λ(t) is not identically zero, the system is also dissipative. □
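Example 2.3.1 can be simulated directly. In the sketch below we pick the hypothetical instances h(x) = x³ (which satisfies xh(x) > 0 for x ≠ 0) and λ(t) = 1 + sin²t ≥ 0; neither choice comes from the text. With storage L = ∫₀ˣ h(z) dz = x⁴/4 and g = h(x)λ(t)x ≥ 0, the balance L(T) − L(0) = ∫(yu − g) dt should hold up to integration error.

```python
# Simulation of Example 2.3.1 with hypothetical choices (not from the text):
# h(x) = x^3, satisfying x*h(x) > 0 for x != 0, and lambda(t) = 1 + sin(t)^2.
# Storage: L(x) = integral_0^x z^3 dz = x^4/4.
import math

def h(x):
    return x ** 3                  # measurement nonlinearity

def lam(t):
    return 1.0 + math.sin(t) ** 2  # nonnegative time-varying gain

def L(x):
    return 0.25 * x ** 4           # trial (storage) function

dt, T = 1e-3, 10.0
x, t = 1.5, 0.0
L0, supply, dissip = L(x), 0.0, 0.0
while t < T:
    u = math.cos(2.0 * t)          # arbitrary bounded input
    y = h(x)
    supply += y * u * dt           # integral of y*u
    dissip += h(x) * lam(t) * x * dt   # integral of g(t) = h(x)*lambda(t)*x >= 0
    x += (-lam(t) * x + u) * dt    # Euler step of x_dot = -lambda(t)*x + u
    t += dt

print(abs((L(x) - L0) - (supply - dissip)))  # small Euler discretization error
print(dissip >= 0.0)
```

Because xh(x) = x⁴ ≥ 0 and λ(t) > 0 here, the dissipation integral is strictly positive whenever the state moves, illustrating why the example is dissipative and not merely passive.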
2.3.2.2 Passivity of Discrete-Time Systems
The passivity notions defined here are used later in Lyapunov proofs of stability. Discrete-time Lyapunov proofs are considerably more complex than their continuous-time counterparts; therefore, the required passivity notions in discrete-time are more complex.
Define the first difference of a function L(k): Z₊ → ℝ as

    ΔL(k) := L(k + 1) − L(k).                          (2.3.10)

A discrete-time system (e.g. (2.1.11)) with input u(k) and output y(k) is said to be passive if it verifies an equality of the power form

    ΔL(k) = yᵀ(k)Su(k) + uᵀ(k)Ru(k) − g(k)             (2.3.11)

for some L(k) that is lower bounded, some function g(k) ≥ 0, and appropriately defined matrices R, S. That is (see Problems section),

    Σ_{k=0}^T (yᵀ(k)Su(k) + uᵀ(k)Ru(k)) ≥ Σ_{k=0}^T g(k) − γ²    (2.3.12)

for all T ≥ 0 and some γ ≥ 0. We say the system is dissipative if it is passive and in addition

    Σ_{k=0}^T (yᵀ(k)Su(k) + uᵀ(k)Ru(k)) ≠ 0 implies Σ_{k=0}^T g(k) > 0    (2.3.13)

for all T > 0. A special sort of dissipativity occurs if g(k) is a monic quadratic function of ||x|| with bounded coefficients, where x(k) is the internal state of the system. We call this state strict passivity (SSP). Then,

    Σ_{k=0}^T (yᵀ(k)Su(k) + uᵀ(k)Ru(k)) ≥ Σ_{k=0}^T (||x||² + LOT) − γ²    (2.3.14)

for all T ≥ 0 and some γ ≥ 0, where LOT denotes lower-order terms in ||x||. Then, the l2 norm of the state is overbounded in terms of the l2 inner product of output and input (i.e. the power delivered to the system). We use SSP to conclude some internal boundedness properties of the system without the usual assumptions of observability (e.g. persistence of excitation) that are required in standard adaptive control approaches.
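These discrete-time definitions can also be checked numerically. For the hypothetical scalar system x(k+1) = ax(k) + u(k), y(k) = x(k) with |a| < 1 (our own example, not from the text), taking storage L(k) = x(k)² and expanding L(k+1) − L(k) gives the power form (2.3.11) with S = 2a, R = 1 and g(k) = (1 − a²)x(k)² ≥ 0. The sketch verifies this identity at every step and the summed inequality (2.3.12) with γ² = L(0).

```python
# Check of the discrete power form (2.3.11)-(2.3.12) on a hypothetical scalar
# system (not from the text): x(k+1) = a*x(k) + u(k), y(k) = x(k), |a| < 1.
# With storage L(k) = x(k)^2, algebra gives S = 2a, R = 1 and
# g(k) = (1 - a^2) * x(k)^2 >= 0.
import math

a = 0.7
S, R = 2.0 * a, 1.0
x = 1.0
L0 = x * x                        # storage at k = 0
exact = True
supply, dissip = 0.0, 0.0
for k in range(200):
    u = math.sin(0.3 * k)         # arbitrary bounded input
    y = x
    dL_pred = y * S * u + u * R * u - (1.0 - a * a) * x * x
    x_next = a * x + u
    exact = exact and abs((x_next * x_next - x * x) - dL_pred) < 1e-12
    supply += y * S * u + u * R * u
    dissip += (1.0 - a * a) * x * x
    x = x_next

print(exact)                      # power form (2.3.11) holds identically
print(supply >= dissip - L0)      # inequality (2.3.12) with gamma^2 = L(0)
```

Unlike the continuous-time check, no integration error appears here: the first difference of L is an exact algebraic identity, which is one reason the discrete power form carries the extra S and R matrices.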
2.3.2.3 Interconnections of Passive Systems
To get an indication of the importance of passivity, suppose two passive systems are placed into a feedback configuration as shown in Fig. 2.3.3. Then

L̇₁(t) = y₁ᵀu₁ − g₁(t),  L̇₂(t) = y₂ᵀu₂ − g₂(t),  u₁ = u − y₂,  u₂ = y₁    (2.3.15)

and it is very easy to verify (see Problems section) that

L̇₁(t) + L̇₂(t) = y₁ᵀu − (g₁(t) + g₂(t)).    (2.3.16)
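The algebraic step behind this can be checked numerically; the following Python sketch (used here instead of the book's MATLAB, with S = I and R = 0 assumed for simplicity) verifies that under the feedback interconnection u₁ = u − y₂, u₂ = y₁ the individual supply rates sum to the supply rate of the loop:

```python
import numpy as np

def interconnection_supply_rates(u, y1, y2):
    """Feedback interconnection of Fig. 2.3.3: u1 = u - y2, u2 = y1.
    Returns (sum of individual supply rates, supply rate of the loop)."""
    u1 = u - y2
    u2 = y1
    return float(y1 @ u1 + y2 @ u2), float(y1 @ u)

rng = np.random.default_rng(0)
u, y1, y2 = rng.standard_normal((3, 4))
s_parts, s_loop = interconnection_supply_rates(u, y1, y2)
assert abs(s_parts - s_loop) < 1e-12   # y1'u1 + y2'u2 = y1'u
```

The cross terms −y₁ᵀy₂ and y₂ᵀy₁ cancel exactly, which is why the dissipated powers g₁ and g₂ simply add in (2.3.16).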
2.3. PROPERTIES OF DYNAMICAL SYSTEMS
Figure 2.3.3: Two passive systems in feedback interconnection.

That is, the feedback configuration is also in power form and hence passive. Properties that are preserved under feedback are extremely important for controller design. If both systems in Fig. 2.3.3 are state strict passive, then the closed-loop system is SSP. However, if only one subsystem is SSP and the other only passive, the combination is only passive and not generally SSP (see Problems section). It also turns out that parallel combinations of systems in power form are still in power form. These results are particular cases of what is known in circuit theory as Tellegen's power conservation theorem (Slotine and Li 1991). Series interconnection does not generally preserve passivity (see Problems section).

2.3.3 Observability and Controllability
Observability and controllability are properties of the open-loop system; when they hold, it is possible to design feedback controllers that fulfill desired closed-loop performance specifications (e.g. track a reference trajectory while keeping all internal states stable). The discussion in this subsection centers on the nonlinear continuous-time system
ẋ = f(x) + g(x)u    (2.3.17)
y = h(x),    (2.3.18)

which is said to be affine in the control input u(t), and the linear time-invariant (LTI) system

ẋ = Ax + Bu    (2.3.19)
y = Cx,    (2.3.20)

which is denoted (A, B, C). Let x ∈ ℝⁿ, u ∈ ℝᵐ, y ∈ ℝᵖ. The definitions extend in a straightforward manner to discrete-time systems.

2.3.3.1 Observability
Observability properties refer to the suitability of the measurements taken in a system; that is, the suitability of the choice of the measurement function h(·) in the
output equation (2.3.18). A system with zero input u(t) = 0 is (locally) observable at an initial state x₀ if there exists a neighborhood S of x₀ such that, given any other state x₁ ∈ S, the output over an interval [t₀, T] corresponding to initial condition x(t₀) = x₀ is different from the output corresponding to initial condition x(t₀) = x₁. Then, the initial state can be reconstructed from output measurements over a time interval [t₀, T]. Consider the time-varying system

ẋ = A(t)x    (2.3.21)
y = C(t)x    (2.3.22)

and define the state-transition matrix Φ(t, t₀) ∈ ℝⁿˣⁿ by

(d/dt)Φ(t, t₀) = A(t)Φ(t, t₀),  Φ(t₀, t₀) = I.    (2.3.23)

The key object in observability analysis is the observability gramian given by

N(t₀, T) = ∫_{t₀}^{T} Φᵀ(τ, t₀)Cᵀ(τ)C(τ)Φ(τ, t₀) dτ.    (2.3.24)
The system is said to be uniformly completely observable (UCO) (Sastry and Bodson 1989) if there exist positive constants δ, a₁, a₂ such that

a₁I ≤ N(t₀, t₀ + δ) ≤ a₂I    (2.3.25)

for all t₀ ≥ 0. In the linear time-invariant case (A, B, C), observability tests are straightforward (Kailath 1980). The state transition matrix for linear systems is

Φ(t, t₀) = e^{A(t−t₀)}    (2.3.26)

and the observability gramian is

N(t₀, T) = ∫_{t₀}^{T} e^{Aᵀ(τ−t₀)} Cᵀ C e^{A(τ−t₀)} dτ.    (2.3.27)
The system is observable if, and only if, N(t₀, T) has full rank n. This observability condition can be shown equivalent to the requirement that the observability matrix

V = [ C
      CA
      CA²
      ⋮
      CA^{n−1} ]    (2.3.28)

have full rank n. Note that matrix B does not enter into these requirements. Matrix V is of full rank n if, and only if, the discrete-time observability gramian

G₀ = VᵀV    (2.3.29)
is nonsingular. If the system is observable and the input u(t) is zero, the initial state can be reconstructed from the output y(t) measured over the interval [t₀, T] using the functional operator

x(t₀) = N⁻¹(t₀, T) ∫_{t₀}^{T} Φᵀ(τ, t₀)Cᵀ(τ) y(τ) dτ.    (2.3.30)

Persistence of Excitation. An important property related to observability is persistence of excitation (PE). A vector w(t) ∈ ℝᵖ is said to be PE if there exist positive constants δ, a₁, a₂ such that

a₁I ≤ ∫_{t₀}^{t₀+δ} w(τ)wᵀ(τ) dτ ≤ a₂I    (2.3.31)
for all t₀ ≥ 0. The integral may be interpreted as a gramian, and PE can be compared to the definition of uniform complete observability. PE is a notion of a time signal containing 'sufficient richness' so that the matrix defined by the L₂ outer product in the definition is nonsingular. Note that the p × p vector outer product matrix w(t)wᵀ(t) has rank of only one for any given t. However, when it is integrated over the interval [t₀, t₀ + δ] the requirement is that the resulting matrix be nonsingular. Roughly speaking, if w(t) is a p-vector, it should have at least p distinct complex frequencies to be PE. For example, if p = 4, w(t) could be the sum of four real exponentials, or contain sinusoidal components at two frequencies, etc.
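The PE condition (2.3.31) can be checked by approximating the gramian numerically. The Python sketch below (signals chosen for illustration) contrasts a 2-vector with two distinct frequencies against one whose components are linearly dependent:

```python
import numpy as np

# Numerical PE test: approximate the gramian of w(t) over [t0, t0 + delta]
# by a Riemann sum and inspect its smallest eigenvalue, per (2.3.31).
def pe_gramian(w, t0=0.0, delta=2 * np.pi, steps=2000):
    t = np.linspace(t0, t0 + delta, steps)
    W = np.array([w(tk) for tk in t])        # steps x p samples of w(t)
    return (W.T @ W) * (t[1] - t[0])         # approximate integral of w w'

rich = pe_gramian(lambda t: np.array([np.sin(t), np.cos(t)]))
poor = pe_gramian(lambda t: np.array([np.sin(t), 2 * np.sin(t)]))
assert np.linalg.eigvalsh(rich).min() > 0.1   # PE: gramian positive definite
assert np.linalg.eigvalsh(poor).min() < 1e-6  # not PE: a rank-deficient direction
```

Each sample w(t)wᵀ(t) is rank one, as noted above, but integrating the sufficiently rich signal over a full interval fills out all p directions.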
2.3.3.2 Controllability
Controllability properties refer to the suitability of the control inputs selected for a system; that is, the suitability of the choice of the input function g(·) in the state equation (2.3.17). A system is (locally) controllable at an equilibrium state xₑ if there exists a neighborhood S of xₑ such that, given any initial state x(t₀) ∈ S, there exists a final time T and a control input u(t) on [0, T] that drives the state from x(t₀) to xₑ. A system is (locally) reachable at a given initial state x(t₀) if there exists a neighborhood S of x(t₀) such that, given any prescribed final state x_d(T) ∈ S, there exists a final time T and a control input u(t) on [0, T] that drives the state from x(t₀) to x_d(T) (Vidyasagar 1993). In the linear time-invariant case (A, B, C) one may give tests for controllability and reachability that are easy to perform (Kailath 1980). Then, local and global controllability properties are the same. For continuous LTI systems, reachability and controllability are the same, and the key object in analysis is the controllability gramian

M(t₀, T) = ∫_{t₀}^{T} e^{A(T−τ)} B Bᵀ e^{Aᵀ(T−τ)} dτ.    (2.3.32)

The system is controllable (equivalently reachable) if M(t₀, T) has full rank. It can be shown that if M(t₀, T) has full rank for any T > t₀, then it has full rank for
all T > t₀. This controllability condition can be shown to reduce to a full-rank condition on the controllability matrix

U = [B  AB  A²B  …  A^{n−1}B].    (2.3.33)

Note that matrix C does not enter into these requirements. Matrix U is of full rank n if, and only if, the discrete-time controllability gramian

G_c = UUᵀ    (2.3.34)

is nonsingular. If the system is controllable, the initial state x(t₀) can be driven to any desired final state x_d(T) using the control input computed according to (see Problems section)

u(t) = Bᵀ e^{Aᵀ(T−t)} M⁻¹(t₀, T) [x_d(T) − e^{A(T−t₀)} x(t₀)].    (2.3.35)

In the case of discrete LTI systems a similar analysis holds, but then reachability is stronger than controllability. That is, for discrete systems it is easier to drive nonzero initial states to zero than it is to drive them to prescribed nonzero final values.
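The dual rank test of (2.3.33)-(2.3.34) is just as direct; this Python sketch checks controllability of a double integrator (an illustrative choice of A, B):

```python
import numpy as np

# Controllability test for (A, B): U = [B, AB, ..., A^(n-1)B], per (2.3.33),
# with Gc = U U' the discrete-time controllability gramian of (2.3.34).
def ctrb(A, B):
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator
B = np.array([[0.0], [1.0]])             # force enters the velocity state
U = ctrb(A, B)
Gc = U @ U.T
assert np.linalg.matrix_rank(U) == 2
assert abs(np.linalg.det(Gc)) > 1e-12    # nonsingular, so controllable
```

A single force input suffices here because it moves the velocity state, which in turn moves the position state through the integrator chain.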
2.4 FEEDBACK LINEARIZATION AND CONTROL SYSTEM DESIGN
For linear time-invariant (LTI) systems there are a wide variety of controller design techniques that achieve a range of performance objectives, including state regulation, tracking of desired trajectories, and so on. Design techniques include the linear quadratic regulator, H-infinity and other robust control techniques, adaptive control, classical approaches such as root-locus and Bode design, and so on. Generally, as long as the system is controllable it is possible to design a controller using full state feedback that gives good closed-loop performance. Some problems occur with non-minimum-phase systems, but several techniques are now available for confronting these. If only the outputs can be measured, then good performance can be achieved by using a dynamic regulator as long as the system is both controllable and observable (e.g. linear quadratic gaussian design using a Kalman filter). Unfortunately, for nonlinear systems controller design is much more complex. There are no universal techniques that apply for all nonlinear systems; each nonlinear system must generally be considered as a separate design problem. Though there are techniques available such as Lyapunov, passivity, hyperstability, and variable-structure (e.g. sliding mode) approaches, considerable design insight is still required. Feedback linearization techniques offer a widely applicable set of design tools that are useful for broad classes of nonlinear systems. They function by converting the nonlinear problem into a related linear controls design problem. They are more powerful, where they apply, than standard classical linearization techniques such as Lyapunov's indirect method, where the nonlinear system is linearized using Jacobian techniques. References for this section include (Slotine and Li 1991, Isidori 1989, Khalil 1992, Vidyasagar 1993).
2.4.1 Input-Output Feedback Linearization Controllers
There are basically two feedback linearization techniques: input-state feedback linearization and input-output (i/o) feedback linearization. The former requires a complex set of mathematical tools, including Frobenius' Theorem and Lie algebra. The control laws derived are often complex due to the need to determine nonlinear state-space transformations and their inverses. On the other hand, i/o feedback linearization is direct to apply and represents more of an engineering approach to control systems design. It is very useful for large classes of nonlinear control problems, including those treated in this book, which encompass robot manipulators, mechanical systems, and other Lagrangian systems.

2.4.1.1 Feedback Linearization Controller Design
Here, we discuss i/o feedback linearization as a controller design technique for systems of the form

ẋ = F(x, u)
y = h(x).    (2.4.1)
The technique is introduced through a sample design.

Sample Plant and Problem Specification. Given the system or plant dynamics

ẋ₁ = x₁x₂ + x₃
ẋ₂ = −2x₂ + x₁u
ẋ₃ = sin x₁ + 2x₁x₂ + u    (2.4.2)

it is desired to design a tracking controller that causes x₁(t) to follow a desired trajectory y_d(t) which is prescribed by the user. This is a complex design problem that is not approachable using any of the LTI techniques mentioned above.

I/O Feedback Linearization Step. The tracking problem is easily approached using i/o feedback linearization. The procedure is as follows. Select the output

y(t) = x₁(t).    (2.4.3)
Note that the output is defined by the performance specifications. Differentiate y(t) repeatedly and substitute state derivatives from (2.4.2) until the control input u(t) appears. This step yields

ẏ = ẋ₁ = x₁x₂ + x₃
ÿ = ẋ₁x₂ + x₁ẋ₂ + ẋ₃ = [sin x₁ + x₂x₃ + x₁x₂²] + [1 + x₁²]u ≡ f(x) + g(x)u.    (2.4.4)

Now define variables as z₁ = y, z₂ = ẏ, so that

ż₁ = z₂
ż₂ = f(x) + g(x)u.    (2.4.5)
This may be converted to a linear system by redefinition of the input as

v(t) = f(x) + g(x)u(t)    (2.4.6)

so that

u(t) = (1/g(x))(−f(x) + v(t)),    (2.4.7)

for then one obtains

ż₁ = z₂
ż₂ = v,    (2.4.8)

which is equivalent to

ÿ = v.    (2.4.9)
This is known as the feedback linearized system.

Controller Design Step. Standard linear system techniques can now be used to design a tracking controller for the feedback linearized system. For instance, one possibility is the proportional-plus-derivative (PD) tracking control

v(t) = ÿ_d + K_d ė + K_p e,    (2.4.10)

where the tracking error is defined as

e(t) := y_d(t) − y(t).    (2.4.11)
Substituting this control v(t) into (2.4.9) yields the closed-loop system

ë + K_d ė + K_p e = 0,    (2.4.12)

or equivalently, in state-space form,

d/dt [e; ė] = [0 1; −K_p −K_d][e; ė].    (2.4.13)

As long as the PD gains are positive, the tracking error converges to zero. The PD gains should be selected for suitable percent overshoot and rise time. According to (2.4.7) and (2.4.10) the complete controller implied by this technique is given by

u(t) = (1/g(x)) [−f(x) + ÿ_d + K_d ė + K_p e],    (2.4.14)

where the nonlinear functions f(x), g(x) are defined in (2.4.4).
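The gain selection step amounts to placing the roots of s² + K_d s + K_p. A quick Python check (using the gains that appear later in Example 2.4.1) shows they correspond to near-critical damping:

```python
import numpy as np

# Closed-loop error dynamics e'' + Kd e' + Kp e = 0: the PD gains place the
# poles of s^2 + Kd s + Kp, with Kp = wn^2 and Kd = 2*zeta*wn.
Kp, Kd = 100.0, 14.14            # gains used in Example 2.4.1
poles = np.roots([1.0, Kd, Kp])
wn = np.sqrt(Kp)                 # natural frequency = 10 rad/s
zeta = Kd / (2.0 * wn)           # damping ratio
assert np.all(poles.real < 0)    # stable for any positive gains
assert abs(zeta - 0.707) < 1e-3  # near-critical damping
```

Larger K_p speeds the response; K_d is then chosen to keep the damping ratio near 0.707 for a small percent overshoot.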
2.4.1.2 Structure of I/O Feedback Linearization Controller

The structure of the i/o feedback linearization controller is depicted in Fig. 2.4.1, where e = [e ė]ᵀ, y = [y ẏ]ᵀ, y_d = [y_d ẏ_d]ᵀ. It consists of a PD outer tracking loop plus a nonlinear inner linearization loop. The function of the inner feedback loop is to linearize the plant so that the system, between the points shown, looks like 1/sⁿ (in this example, n = 2). Then the PD controller, the design of which is based on the system 1/sⁿ, achieves tracking behavior. Note that the controller
Figure 2.4.1: Feedback linearization controller showing PD outer loop and nonlinear inner loop.

incorporates feedforward acceleration compensation through the term ÿ_d; this is a form of predictive control. A major advantage of the feedback linearization controller is that it contains a unity-gain outer tracking loop, which provides robustness and is highly desirable in many practical applications (e.g. aircraft control system design). This design technique also decouples the nonlinear compensation design step from the tracking performance specification design step. Note that the feedback linearization controller generally requires full state feedback in computing g(x), f(x). It is to be emphasized that the feedback linearization controller generally has performance far exceeding that of classical linearization controllers based on Jacobian linearization techniques. No approximation is involved in feedback linearization design.
2.4.1.3 Ill-Defined Relative Degree
For this technique to work, the function g(x) multiplying the control input u(t) in (2.4.5) must never be zero (see (2.4.7)). In this example g(x) = 1 + x₁². If g(x) can be zero in a particular plant, the plant is said to have ill-defined relative degree. Otherwise, it has well-defined relative degree. Even if the relative degree is ill defined, i/o feedback linearization may still be applied under some circumstances. In fact, it can be shown that as long as the system

ż₁ = z₂
ż₂ = f(x) + g(x)u
ζ = g(x)    (2.4.15)

with output ζ(t) is observable, a modification of (2.4.7) still works (Commuri and Lewis 1994). The observability requirement means that the control influence coefficient g(x) may only be small, so that control effectiveness is reduced, when the state x(t) is also small.
2.4.1.4 Internal Dynamics and Zero Dynamics
In the sample problem, the linearized dynamics (2.4.8) are only of order two while the plant (2.4.2) is of order three. Therefore, something has been neglected in the controls design procedure. A complete dynamical description of the closed-loop system can be obtained by adding to [e ė]ᵀ some additional state components independent of z₁(t), z₂(t). In this example, one choice is x₂, so that the closed-loop system is

d/dt [e; ė] = [0 1; −K_p −K_d][e; ė]    (2.4.16)
ẋ₂ = −2x₂ + (x₁/(1 + x₁²))[−f(x) + ÿ_d + K_d ė + K_p e].    (2.4.17)

Note that the control u(t) has been substituted into the equation for ẋ₂(t). The additional dynamics neglected in the feedback linearization design are made unobservable by this design procedure; that is, with respect to the design output y(t) = x₁(t) they are unobservable. They are known as the internal dynamics. A different choice of y(t) results in different internal dynamics (see Problems section). The zero dynamics is defined as the internal dynamics when the control input is selected to keep the output y(t) equal to zero. If the zero dynamics are unstable, the system is said to be non-minimum phase. The zero dynamics extends to nonlinear systems the concept of system zeros. In the LTI case, the zero dynamics define the system zeros. For i/o feedback linearization to function correctly, the internal dynamics must be stable. Then, the controller given in Fig. 2.4.1 performs in an adequate manner. In this example the zero dynamics are

ẋ₂ = −2x₂,    (2.4.18)

which says that x₂(t) = e^{−2t}x₂(0); this is an asymptotically stable (AS) system. Therefore, the i/o feedback linearization controller performs correctly. Specifically, the internal dynamics (2.4.17) represent an AS system driven by additional signals. If the PD feedback linearization controller operates correctly, these additional signals are all bounded. The state of an AS system with bounded input is also bounded, so all is well. One can see that, as the desired acceleration ÿ_d(t) increases, the magnitude of state x₂(t) will increase, so that internal dynamics can fundamentally limit the performance capabilities of nonlinear systems. If the pole at s = −2 were further to the left in the s-plane, the situation would be improved and faster prescribed trajectories could be followed.
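The decay of the zero dynamics (2.4.18) is easy to confirm numerically; this Python sketch integrates ẋ₂ = −2x₂ with forward Euler (step size chosen for illustration) and compares against the closed form:

```python
import numpy as np

# Zero dynamics (2.4.18): x2' = -2*x2, so x2(t) = exp(-2*t) * x2(0).
def simulate_zero_dynamics(x2_0, t_end=2.0, dt=1e-4):
    x2 = x2_0
    for _ in range(int(t_end / dt)):
        x2 += dt * (-2.0 * x2)      # forward Euler step
    return x2

x2_final = simulate_zero_dynamics(1.0)
assert abs(x2_final - np.exp(-4.0)) < 1e-3   # matches e^{-2*2} * x2(0)
assert x2_final < 0.02                       # asymptotically stable decay
```

The pole at s = −2 sets the decay rate; a pole further left would let the internal state track faster commanded trajectories without growing large.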
2.4.1.5 Modelling Errors, Disturbances, and Robustness
In the sample design it was assumed that all the dynamics are exactly known and there are no disturbances. However, in practical situations there can be unmodelled dynamics or unknown disturbances. Such effects degrade the performance of the feedback linearization controller and are investigated in the Problems section. However, the controller is surprisingly robust to such effects. This is in great measure due to the outer PD tracking loop. The robustness properties can be improved
even further by using various adaptive control techniques, or by adding a specially designed robustifying signal to the control input u(t). Such techniques are discussed in subsequent chapters of the book. In fact, the function of neural networks when used in closed-loop control is exactly to compensate for unmodelled dynamics and unknown disturbances.

2.4.1.6 Proportional-Integral-Derivative (PID) Outer Tracking Loop
The steady-state error and disturbance rejection capabilities of the controller can be improved in many situations by using a proportional-integral-derivative (PID) controller in the outer tracking loop instead of the PD controller (2.4.10). Then, instead of (2.4.14) one has

ε̇ = e
u(t) = (1/g(x))[−f(x) + ÿ_d + K_d ė + K_p e + K_i ε].    (2.4.19)

This is an important example of a controller with its own dynamics: the controller now has a state ε(t) associated with it, so that it has some internal memory.
2.4.1.7 Feedback Linearization for Systems in Brunovsky Form
If the system is in Brunovsky form, i/o feedback linearization is very easy. For the system

ẋ₁ = x₂
ẋ₂ = x₃
⋮
ẋₙ = f(x) + g(x)u    (2.4.20)
y = x₁    (2.4.21)

the design is direct, for the linearization step is not needed. In fact, the system is already in the chained form (2.4.5). Therefore, the input redefinition

u(t) = (1/g(x))(−f(x) + v)    (2.4.22)

is the feedback linearization controller, for then the closed-loop system is simply

y⁽ⁿ⁾ = v,    (2.4.23)

where superscript (j) denotes the j-th derivative. The outer-loop control v(t) may be selected using an extension of PD or PID control to obtain a control structure like that in Fig. 2.4.1. One possibility is

u(t) = (1/g(x))[−f(x) + y_d⁽ⁿ⁾ + Kₙe⁽ⁿ⁻¹⁾ + … + K₁e],    (2.4.24)

with y_d(t) the desired trajectory and e(t) = y_d(t) − y(t) the tracking error. Note that feedback of the error and its first n − 1 derivatives is needed, along with feedforward of y_d⁽ⁿ⁾(t). The closed-loop dynamics is

e⁽ⁿ⁾ + Kₙe⁽ⁿ⁻¹⁾ + … + K₁e = 0,    (2.4.25)
which is stable if the feedback gains [Kₙ … K₁] are suitably selected. Here, there are no internal dynamics. One can write this in matrix form by defining the error vector

ē = [e ė … e⁽ⁿ⁻¹⁾]ᵀ    (2.4.26)

and the gain vector

K = [K₁ K₂ … Kₙ].    (2.4.27)

Then the closed-loop error dynamics may be written as

ē̇ = (A − bK)ē
ξ̇ = w(ē, ξ, y_d),    (2.4.28)

where A, b are the canonical form matrices (2.1.6). An equation has been added for the internal dynamics ξ(t), which may be present in some examples. A similar procedure may be used to design feedback linearization for multivariable Brunovsky canonical form systems (2.1.7).
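For the canonical form matrices, A − bK is a companion matrix, so the gains can be read directly off a desired characteristic polynomial. A Python sketch with illustrative poles:

```python
import numpy as np

# Closed-loop matrix A - b*K for the Brunovsky (chained-integrator) form:
# the last row of A - b*K is [-K1 -K2 ... -Kn], so the gains are exactly the
# coefficients of the desired polynomial s^n + Kn s^(n-1) + ... + K1.
n = 3
A = np.diag(np.ones(n - 1), k=1)          # chain of integrators
b = np.zeros((n, 1)); b[-1, 0] = 1.0
# Desired poles at -2, -3, -4: (s+2)(s+3)(s+4) = s^3 + 9 s^2 + 26 s + 24,
# giving K1 = 24, K2 = 26, K3 = 9.
K = np.array([[24.0, 26.0, 9.0]])
Acl = A - b @ K
poles = np.sort(np.linalg.eigvals(Acl).real)
assert np.allclose(poles, [-4.0, -3.0, -2.0], atol=1e-6)
```

This is why, for Brunovsky-form systems, selecting stable gains in (2.4.25) is a purely algebraic step.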
2.4.2 Computer Simulation of Feedback Control Systems
In Example 2.1.2 it was shown how to simulate an open-loop nonlinear system using the state-space description and a Runge-Kutta integrator. A conscientious engineering procedure for the design of feedback control systems involves analysis of the open-loop dynamics to determine the system properties, then controller design, then computer simulation of the closed-loop system prior to implementing the controller on the actual plant. The simulation step is essential to verify closed-loop performance and ensure that nothing has been overlooked in the design step. Continuous-time dynamical systems with feedback controllers are straightforward to simulate. A Runge-Kutta integrator is required, such as ODE23 in MATLAB. Then, a subroutine must be written that has two parts: the computation of the control input u(t) followed by the plant dynamics. The technique is illustrated using the sample design problem.

Example 2.4.1 (Simulation of Feedback Linearization Controller)
For the sample design problem (2.4.2), the complete closed-loop description is given by the plant dynamics
ẋ₁ = x₁x₂ + x₃
ẋ₂ = −2x₂ + x₁u
ẋ₃ = sin x₁ + 2x₁x₂ + u

and the controller

f(x) = sin x₁ + x₂x₃ + x₁x₂²
g(x) = 1 + x₁²
y = x₁
e = y_d − y
u(t) = (1/g(x))[−f(x) + ÿ_d + K_d ė + K_p e],
where the desired trajectory y_d(t) is prescribed by the user. All this information must be written into a MATLAB M file that is called by ODE23. Suppose that it is desired for the plant output y(t) to follow the desired trajectory

y_d = sin(2πt/T).

Then, the MATLAB M file for ODE23 is:

% MATLAB file for closed-loop system simulation
function xdot= fblinct(t,x)
global T
% Computation of the desired trajectory
yD= sin(2*pi*t/T) ;
yDdot= (2*pi/T) * cos(2*pi*t/T) ;
yDddot= -(2*pi/T)^2 * sin(2*pi*t/T) ;
% Computation of the control input
kp= 100 ; kd= 14.14 ;
f = sin(x(1)) + x(2)*x(3) + x(1)*x(2)^2 ;
g = 1 + x(1)^2 ;
y = x(1) ;
ydot= x(1)*x(2) + x(3) ;
e = yD - y ;
edot= yDdot - ydot ;
u = (-f + yDddot + kd*edot + kp*e) / g ;
% Plant dynamics
xdot(1) = x(1)*x(2) + x(3) ;
xdot(2) = -2*x(2) + x(1)*u ;
xdot(3) = sin(x(1)) + 2*x(1)*x(2) + u ;

Note how closely the structure of the M file follows the controller and plant equations. The selection of the PD gains as
ẋ b(ẋ) > 0 for ẋ ≠ 0,  x c(x) > 0 for x ≠ 0,

which simply require that they represent physically meaningful damping and spring effects, respectively. Note that these assumptions mean that b(0) = 0, c(0) = 0.

a. Lyapunov Analysis. Select as a Lyapunov function candidate the positive definite function

L = ½ ẋ² + ∫₀ˣ c(z) dz,

which is the sum of the kinetic and potential energy of the system. Differentiating using Leibniz's rule yields

L̇ = ẍẋ + c(x)ẋ = −b(ẋ)ẋ ≤ 0,

where ẍ was eliminated by substitution from the system dynamics. The entire state is [x ẋ]ᵀ; however, only ẋ appears explicitly in L̇, which is therefore only negative semidefinite. Therefore, this shows the system is SISL. In the Problems section it is shown that the
system is actually AS by a technique based on Barbalat's Lemma introduced in Section 2.5.4.1. The quantity b(ẋ)ẋ = −L̇ is the power dissipated in the system.

Figure: Time histories (a) and phase-plane plot (b) of the SISL system.
b. Passivity. Lyapunov functions are closely related to the passivity notions introduced in Section 2.3.2. Considering the forced system

ẍ + b(ẋ) + c(x) = F,

one sees that

L̇ = ẋF − b(ẋ)ẋ,

which is of the power form (2.3.5). Thus, the system from input F to output ẋ is passive. □
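The energy argument can be checked by simulation; the Python sketch below (with the hypothetical choices b(v) = 0.5v and c(x) = x, which satisfy the damping and spring conditions) integrates the unforced system and confirms the energy is dissipated:

```python
import numpy as np

# Energy Lyapunov function for x'' + b(x') + c(x) = 0 with b(v) = 0.5*v,
# c(x) = x: L = 0.5*v^2 + 0.5*x^2 and Ldot = -b(v)*v <= 0 along trajectories.
dt, steps = 1e-3, 10000
x, v = 1.0, 0.0
L = []
for _ in range(steps):
    L.append(0.5 * v**2 + 0.5 * x**2)        # kinetic + potential energy
    x, v = x + dt * v, v + dt * (-0.5 * v - x)   # forward Euler step
L = np.array(L)
assert L[-1] < 0.05 * L[0]               # energy dissipated over 10 s
assert np.max(np.diff(L)) < 1e-4         # essentially non-increasing
```

The small tolerance on the second check absorbs the discretization error of the Euler integrator; the continuous-time L̇ is exactly −b(ẋ)ẋ ≤ 0.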
2.5.2 Controller Design Using Lyapunov Techniques
Though we have presented Lyapunov analysis only for unforced systems in the form (2.5.1), which have no control input, these techniques also provide a powerful set of tools for designing feedback control systems for systems of the form

ẋ = f(x) + g(x)u.    (2.5.7)

Thus, select a Lyapunov function candidate L(x) > 0 and differentiate along the system trajectories to obtain

L̇(x) = (∂L/∂x)ẋ = (∂L/∂x)[f(x) + g(x)u].    (2.5.8)

Then, it is often possible to ensure that L̇ ≤ 0 by appropriate selection of u(t). When this is possible, it generally yields controllers in state-feedback form, that is, where u(t) is a function of the states x(t). Practical systems with actuator limits and saturation often contain discontinuous functions, including the signum function defined for scalars x ∈ ℝ as

sgn(x) = { 1, x ≥ 0;  −1, x < 0 }
(Denoted L(x, t) > 0.) Locally positive semidefinite if L(x, t) ≥ L₀(x) for some time-invariant positive semidefinite L₀(x), for all t ≥ 0 and x ∈ S. (Denoted L(x, t) ≥ 0.) Locally negative definite if L(x, t) ≤ L₀(x) for some time-invariant negative definite L₀(x), for all t ≥ 0 and x ∈ S. (Denoted L(x, t) < 0.) Locally negative semidefinite if L(x, t) ≤ L₀(x) for some time-invariant negative semidefinite L₀(x), for all t ≥ 0 and x ∈ S. (Denoted L(x, t) ≤ 0.) Thus, for definiteness of time-varying functions, a time-invariant definite function must be dominated. All these definitions are said to hold globally if S = ℝⁿ. A time-varying function L(x, t) : ℝⁿ × ℝ⁺ → ℝ is said to be decrescent if L(0, t) = 0, and there exists a time-invariant positive definite function L₁(x) such that

L(x, t) ≤ L₁(x), ∀t ≥ 0.    (2.5.24)
The notions of decrescence and positive definiteness for time-varying functions are depicted in Fig. 2.5.5.

Example 2.5.5 (Decrescent Function)
Consider the time-varying function

L(x, t) = x₁² + x₂²/(2 + sin t).

Note that 1 ≤ 2 + sin t ≤ 3, so that

L(x, t) ≥ L₀(x) := x₁² + x₂²/3,
and L(x, t) is globally positive definite. Also,

L(x, t) ≤ L₁(x) := x₁² + x₂²,

so that it is decrescent. □
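The sandwich bounds of Example 2.5.5 are easy to verify numerically; this Python sketch samples random states and times:

```python
import numpy as np

# Example 2.5.5: L(x,t) = x1^2 + x2^2/(2 + sin t) satisfies
# L0(x) = x1^2 + x2^2/3  <=  L(x,t)  <=  L1(x) = x1^2 + x2^2.
rng = np.random.default_rng(1)
for _ in range(1000):
    x1, x2 = rng.standard_normal(2)
    t = 100.0 * rng.random()
    L = x1**2 + x2**2 / (2.0 + np.sin(t))
    assert x1**2 + x2**2 / 3.0 <= L + 1e-12    # positive definite lower bound
    assert L <= x1**2 + x2**2 + 1e-12          # decrescent upper bound
ok = True
```

The lower bound establishes global positive definiteness; the upper bound is the time-invariant dominating function required for decrescence.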
To evaluate the non-autonomous candidate Lyapunov function derivative L̇(x, t) along the system trajectories, as required in the next result, one must use

L̇(x, t) = ∂L/∂t + (∂L/∂x)ẋ = ∂L/∂t + (∂L/∂x)f(x, t).    (2.5.25)
Now the following stability results may be stated.

Theorem 2.5.5 (Lyapunov Results for Non-Autonomous Systems)
a. Lyapunov Stability. If, for system (2.5.23), there exists a function L(x, t) with continuous partial derivatives, such that for x in a compact set S ⊂ ℝⁿ

L(x, t) is positive definite, L(x, t) > 0    (2.5.26)
L̇(x, t) is negative semidefinite, L̇(x, t) ≤ 0    (2.5.27)

then the equilibrium point is SISL.
b. Asymptotic Stability. If, furthermore, condition (2.5.27) is strengthened to

L̇(x, t) is negative definite, L̇(x, t) < 0    (2.5.28)

then the equilibrium point is AS.

Various extensions of these results allow one to determine more about the stability properties by further examining the deeper structure of the system dynamics (Slotine and Li 1991).

2.5.4.1 Barbalat's Lemma Extension of Lyapunov Analysis
The first result is based on Barbalat's Lemma (Section 2.2.2) applied to the Lyapunov derivative L̇. It gives a condition under which L̇ → 0.

Theorem 2.5.6 (Barbalat's Lemma Lyapunov Extension)
Let L(x, t) be a Lyapunov function so that L(x, t) > 0 and L̇(x, t) ≤ 0, with L̇(x, t) uniformly continuous. Then L̇(x, t) → 0 as t → ∞.
W₁₁ = a₁²(q̈_d₁ + λ₁ė₁) + a₁g cos q₁
W₁₂ = (a₂² + 2a₁a₂ cos q₂ + a₁²)(q̈_d₁ + λ₁ė₁) + (a₂² + a₁a₂ cos q₂)(q̈_d₂ + λ₂ė₂)
      − a₁a₂q̇₂(q̇_d₁ + λ₁e₁) sin q₂ − a₁a₂(q̇₁ + q̇₂)(q̇_d₂ + λ₂e₂) sin q₂
      + a₂g cos(q₁ + q₂) + a₁g cos q₁
W₂₁ = 0
W₂₂ = (a₂² + a₁a₂ cos q₂)(q̈_d₁ + λ₁ė₁) + a₂²(q̈_d₂ + λ₂ė₂)
      + a₁a₂(q̇_d₁ + λ₁e₁) sin q₂ + a₂g cos(q₁ + q₂).
A function M file was written in MATLAB (1994) to simulate the adaptive controller; this procedure is very direct. The adaptive controller is an easy modification of the function routine given in Example 3.3.1. The PD CT controller equations in Fig. 3.3.2 are simply replaced by the adaptive controller equations (3.4.13)-(3.4.14). The lines of code implementing the adaptive controller are given in Fig. 3.4.2. The arm parameters were taken as a₁ = a₂ = 1 m, m₁ = 0.8 kg, m₂ = 2.3 kg. The desired trajectory was selected as q_d₁(t) = sin t, q_d₂(t) = cos t. The controller parameters were selected as Kv = diag{20, 20}, Γ = diag{10, 10}, Λ = diag{λ₁, λ₂} = diag{5, 5}. The dialog required to run the adaptive controller is the same as in Example 3.3.1. Remember to modify the output function robout.m to reflect the new desired trajectory parameters!

a. Adaptive Control. The response with the adaptive controller is given in Fig. 3.4.3, which is excellent even though the masses m₁, m₂ are unknown by the controller. It is seen that, after an initial error, the actual joint angles closely match the desired joint angles. In adaptive control, the controller dynamics allow for learning of the unknown parameters, so that the performance improves over time. Note that the parameter (mass) estimates converge to the correct constant values.

b. Adaptive Control with Unmodelled Dynamics. To show how important it is to model all the dynamics in the regression matrix, the simulation was now repeated using the incorrect value
CHAPTER 3. ROBOT DYNAMICS AND CONTROL

% file robadapt.m, to be called by MATLAB routine ode23
function xdot= robadapt(t,x)
% compute desired trajectory as in Fig. 3.3.2
% Use period= 2*pi ; amp1= 1 ; amp2= 1 ;
% Adaptive control input
m1= 0.8 ; m2= 2.3 ; a1= 1 ; a2= 1 ; g= 9.8 ;      % arm parameters
Kv= 20*eye(2) ; lam= 5*eye(2) ; gam= 10*eye(2) ;  % controller parameters
% tracking errors
e= qd - [x(1) x(2)]' ;
ep= qdp - [x(3) x(4)]' ;
r= ep + lam*e ;
% compute regression matrix
f= qdpp + lam*ep ;
W(1,1)= a1^2*f(1) + a1*g*cos(x(1)) ;
W(1,2)= (a2^2 + 2*a1*a2*cos(x(2)) + a1^2)*f(1) + (a2^2 + a1*a2*cos(x(2)))*f(2) ...
        - a1*a2*x(4)*(qdp(1) + lam(1,1)*e(1))*sin(x(2)) ...
        - a1*a2*(x(3)+x(4))*(qdp(2) + lam(2,2)*e(2))*sin(x(2)) ...
        + a2*g*cos(x(1)+x(2)) + a1*g*cos(x(1)) ;
W(2,1)= 0 ;
W(2,2)= (a2^2 + a1*a2*cos(x(2)))*f(1) + a2^2*f(2) ...
        + a1*a2*(qdp(1) + lam(1,1)*e(1))*sin(x(2)) ...
        + a2*g*cos(x(1)+x(2)) ;
% control torques. Parameter estimates are [x(5) x(6)]'
tau= Kv*r + W*[x(5) x(6)]' ;
tau1= tau(1) ; tau2= tau(2) ;
% parameter updates
phidot= gam*W'*r ;
xdot(5)= phidot(1) ;
xdot(6)= phidot(2) ;
% ROBOT ARM DYNAMICS from Fig. 3.3.2
Figure 3.4.2: Adaptive controller.
3.4. FILTERED-ERROR APPROXIMATION-BASED CONTROL
Figure 3.4.3: Response using adaptive controller. (a) Actual and desired joint angles. (b) Mass estimates.
Figure 3.4.4: Response using adaptive controller with incorrect regression matrix, showing the effects of unmodelled dynamics. (a) Actual and desired joint angles. (b) Mass estimates.

the rest of W(x) was not changed. The simulation results shown in Fig. 3.4.4 reveal that the performance of the adaptive controller is very bad. This corresponds to unmodelled dynamics, which generally destroys the performance of adaptive controllers. Several techniques are available for making adaptive controllers robust to unmodelled dynamics, including the e-modification (Narendra and Annaswamy 1987), the σ-modification (Ioannou and Kokotovic 1984), and deadzone techniques (Kreisselmeier and Anderson 1986). These all involve adding correction terms to the tuning algorithm (3.4.14). □
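As an illustration of the leakage idea behind these correction terms, the Python sketch below shows assumed forms of the σ- and e-modifications added to the basic update φ̇ = ΓWᵀr of (3.4.14); the gain κ is a hypothetical small design parameter, and the specific forms shown should be checked against the cited references:

```python
import numpy as np

# Sketch of robustified tuning laws (assumed forms): both add a leakage term
# to the basic parameter update phi_dot = Gam @ W.T @ r from (3.4.14).
def sigma_mod(Gam, W, r, phi_hat, kappa=0.01):
    return Gam @ W.T @ r - kappa * Gam @ phi_hat

def e_mod(Gam, W, r, phi_hat, kappa=0.01):
    return Gam @ W.T @ r - kappa * np.linalg.norm(r) * Gam @ phi_hat

Gam = 10.0 * np.eye(2)
W = np.eye(2)
phi_hat = np.array([5.0, 5.0])
r0 = np.zeros(2)
# With zero tracking error the leakage pulls the estimates toward zero,
# which prevents unbounded parameter drift under unmodelled dynamics:
assert np.all(sigma_mod(Gam, W, r0, phi_hat) < 0)
assert np.allclose(e_mod(Gam, W, r0, phi_hat), 0.0)  # e-mod leakage scales with ||r||
```

The trade-off is a small bias: the leakage keeps the estimates bounded at the cost of exact parameter convergence.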
3.4.4 Robust Control

Robust controllers for robot arms (Spong et al. 1987, Corless 1989, Dawson et al. 1990) comprise a large class of controllers, some of which are based on approximation techniques. Referring to Equation (3.4.8) and Fig. 3.4.1, in adaptive controllers the primary design effort goes into selecting a dynamic estimate f̂ for the nonlinear function (3.4.7) that is tuned on-line. By contrast, in robust controllers, the primary design effort goes into selecting the robust term v(t). An advantage of standard
robust controllers is that they have no dynamics, so they are generally simpler to implement. On the other hand, adaptive controllers are somewhat more refined in that the dynamics are learned on-line, and less control effort is usually needed. Furthermore, in adaptive control it is necessary to compute a regression matrix W(x), while in robust control it is necessary to compute a bounding function F(x). In some modern techniques, robust and adaptive techniques are combined to provide the advantages of each class. These are called adaptive-robust controllers, or robust-adaptive controllers. Two popular robust controllers are now described.

Robust Saturation Controller. A robust saturation controller is given by
\[ \tau = \hat f + K_v r - v, \qquad v = \begin{cases} -r\,\dfrac{F(x)}{\|r\|}, & \|r\| \ge \varepsilon \\[4pt] -r\,\dfrac{F(x)}{\varepsilon}, & \|r\| < \varepsilon, \end{cases} \tag{3.4.16} \]
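As a concrete illustration, the saturated robust term in (3.4.16) can be written in a few lines. This is a minimal sketch assuming the bounding value F(x) has already been evaluated; the function name and sample values are our own, not from the text.

```python
import numpy as np

def robust_term(r, F_x, eps=0.1):
    """Robust control term v of (3.4.16): full-magnitude direction -r*F(x)/||r||
    outside the deadband (||r|| >= eps), linearly scaled -r*F(x)/eps inside it,
    which avoids division by a vanishing ||r||."""
    nr = np.linalg.norm(r)
    if nr >= eps:
        return -r * F_x / nr
    return -r * F_x / eps
```

Outside the deadband the term has magnitude exactly F(x); inside, it shrinks smoothly to zero with r, which is what trades chattering for the small residual tracking error of Theorem 3.4.3.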
where \(\hat f\) is an estimate for f(x) that is not changed on-line; for instance, a PD-gravity-based robust controller would use \(\hat f = G(q)\), ignoring the other nonlinear terms. In computing the robust control term v(t), ε is a small design parameter, \(\|\cdot\|\) denotes the norm, and F(x) is a known scalar function that bounds the uncertainties \(\tilde f = f - \hat f\) so that
\[ \|\tilde f\| \le F(x). \tag{3.4.17} \]
The intent is that F(x) is a simplified function that can be computed using the bounding properties in Table 3.2.1 even if the exact value of the complicated nonlinear function f(x) is unknown. The performance of the robust controller is described in the next theorem and illustrated in a subsequent example.

Theorem 3.4.3 (Robust Saturation Controller) Suppose the disturbance \(\tau_d(t)\) in (3.4.1) is zero and the desired trajectory \(q_d(t)\) is bounded according to (3.4.5). Then, using the robust controller (3.4.16), the tracking error norm \(\|r(t)\|\) is eventually bounded to the neighborhood of ε.

Proof: Using the proposed control yields the closed-loop error dynamics
\[ M\dot r = -V_m r - K_v r + \tilde f + v, \]
with \(\|\tilde f\| \le F(x)\) the known function. Select the Lyapunov function candidate
\[ L = \tfrac12 r^T M(q) r. \]
Differentiate to discover
\[ \dot L = \tfrac12 r^T\dot M r + r^T M\dot r, \]
whence substitution from the error dynamics yields
\[ \dot L = \tfrac12 r^T\dot M r - r^T V_m r - r^T K_v r + r^T(\tilde f + v) = \tfrac12 r^T(\dot M - 2V_m)r - r^T K_v r + r^T(\tilde f + v), \]
so that skew-symmetry property P3 yields
\[ \dot L = -r^T K_v r + r^T(\tilde f + v). \]

with \(F = F^T > 0\), \(G = G^T > 0\), and \(\kappa > 0\) a small scalar design parameter. Then the filtered tracking error r(t) and NN weight estimates \(\hat V\), \(\hat W\) are UUB, with the bounds given specifically by (4.3.49) and (4.3.50). Moreover, the tracking error may be kept as small as desired by increasing the gains \(K_v\) in (4.3.18).

Proof: Let the NN approximation property (4.3.7) hold for the function f(x) given in (4.3.6) with a given accuracy \(\varepsilon_N\) for all x in the compact set \(S_x = \{x \mid \|x\| < b_x\}\), with \(b_x > q_B\). Define \(S_r = \{r \mid \|r\| < (b_x - q_B)/(c_0 + c_2)\}\). Let \(r(0) \in S_r\). Then the approximation property holds. Let the Lyapunov function candidate be
\[ L = \tfrac12 r^T M(q) r + \tfrac12\,\mathrm{tr}\{\tilde W^T F^{-1}\tilde W\} + \tfrac12\,\mathrm{tr}\{\tilde V^T G^{-1}\tilde V\}. \tag{4.3.47} \]
Differentiating with respect to time along the solution of the error system dynamics (4.3.20) yields
\[ \dot L = -r^T K_v r + \tfrac12 r^T(\dot M - 2V_m)r + \mathrm{tr}\bigl\{\tilde W^T\bigl(F^{-1}\dot{\tilde W} + \hat\sigma r^T\bigr)\bigr\} + \mathrm{tr}\bigl\{\tilde V^T\bigl(G^{-1}\dot{\tilde V} + x r^T\hat W^T\hat\sigma'\bigr)\bigr\} + r^T\tilde W^T\hat\sigma'\hat V^T x + r^T v + r^T(\tau_d + \varepsilon). \]
Use the NN weight update laws and skew-symmetry to obtain
\[ \dot L = -r^T K_v r + \kappa\|r\|\,\mathrm{tr}\{\tilde W^T(W - \tilde W)\} + \kappa\|r\|\,\mathrm{tr}\{\tilde V^T(V - \tilde V)\} + r^T\tilde W^T\hat\sigma'\hat V^T x + r^T v + r^T(\tau_d + \varepsilon) = -r^T K_v r + \kappa\|r\|\,\mathrm{tr}\{\tilde Z^T(Z - \tilde Z)\} + r^T\tilde W^T\hat\sigma'\hat V^T x + r^T v + r^T(\tau_d + \varepsilon). \]
Since \(\mathrm{tr}\{A^T B\} \le \|A\|_F\|B\|_F\) and
\[ \mathrm{tr}\{\tilde Z^T(Z - \tilde Z)\} = \langle\tilde Z, Z\rangle_F - \|\tilde Z\|_F^2 \le \|\tilde Z\|_F\|Z\|_F - \|\tilde Z\|_F^2, \]
there results
\[ \dot L \le -r^T K_v r + \kappa\|r\|\,\|\tilde Z\|_F\bigl(Z_B - \|\tilde Z\|_F\bigr) + r^T v + r^T(\tau_d + \varepsilon). \]
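The trace inequality just used (Cauchy–Schwarz for the Frobenius inner product) can be spot-checked numerically. The helper below is an illustrative sketch of ours, not from the text.

```python
import numpy as np

def trace_gap(Z, Z_tilde):
    """Return (lhs, rhs) of tr{Z~^T (Z - Z~)} <= ||Z~||_F ||Z||_F - ||Z~||_F^2,
    the bound used when completing the squares in the UUB proof."""
    lhs = np.trace(Z_tilde.T @ (Z - Z_tilde))
    nzt = np.linalg.norm(Z_tilde, 'fro')
    rhs = nzt * np.linalg.norm(Z, 'fro') - nzt**2
    return lhs, rhs
```

Random matrices of any shape satisfy the inequality, since the Frobenius inner product obeys Cauchy–Schwarz.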
\[ \tau = \hat W^T\sigma(\hat V^T x) + K_v L r + J^T[\lambda_d + K_f\tilde\lambda] - v, \tag{5.1.23} \]
where \(\hat V\), \(\hat W\) are the current weights in the actual NN as provided by the weight tuning algorithms. The tuning algorithms and the robustifying term v(t) must be derived to guarantee system stability. The NN controller is shown in Fig. 5.1.2; it has the basic structure of the NN controllers in Chapter 4, but with an additional inner force control loop.

5.1.2.1 Closed-Loop Error Dynamics Using NN Controller
The closed-loop error dynamics using this control algorithm are found by substitution of (5.1.23) into (5.1.17), and subsequent use of (5.1.13), to be
\[ M\dot r = -L^T K_v L r - V_1 r + L^T\bigl[W^T\sigma(V^T x) - \hat W^T\hat\sigma(\hat V^T x)\bigr] + L^T[\tau_d + \varepsilon + v]. \tag{5.1.24} \]
Now the manipulations closely follow those in Chapter 4. By expanding \(\sigma(V^T x)\) in a Taylor series about \(\sigma(\hat V^T x)\) and some other manipulations, one expresses the closed-loop error dynamics (see Problems section) as
\[ M\dot r = -L^T K_v L r - V_1 r + L^T\bigl[\tilde W^T(\hat\sigma - \hat\sigma'\hat V^T x) + \hat W^T\hat\sigma'\tilde V^T x\bigr] + L^T[w + v], \tag{5.1.25} \]
where the disturbance term is
\[ w(t) = \tilde W^T\hat\sigma'V^T x + W^T O(\tilde V^T x)^2 + \tau_d + \varepsilon. \tag{5.1.26} \]
The expression \(O(z)^2\) denotes terms of order two in z. In these expressions, the NN weight estimation errors are given by
\[ \tilde V = V - \hat V, \qquad \tilde W = W - \hat W, \tag{5.1.27} \]
Figure 5.1.2: Neural net hybrid position/force controller.

and
\[ \hat\sigma \equiv \sigma(\hat V^T x), \tag{5.1.28} \]
\[ \hat\sigma' \equiv \frac{d\sigma(z)}{dz}\Big|_{z=\hat V^T x}. \tag{5.1.29} \]
To bound the disturbance term w(t), it is now assumed that various bounding assumptions hold. In particular, the bounds in Table 5.0.1 are needed, along with the next assumptions, which hold in practical situations. Recall the definition of the Frobenius norm \(\|\cdot\|_F\) from Chapter 2.

Assumption 5.1.1 (Desired Trajectory and NN Target Weight Bounds)

a. The desired motion trajectory is bounded so that
\[ \|q_d(t)\| \le q_B, \tag{5.1.30} \]
with \(q_B\) a known scalar bound.

b. Define the matrix of ideal target NN weights as
\[ Z = \begin{bmatrix} W & 0 \\ 0 & V \end{bmatrix}. \tag{5.1.31} \]
Then, the ideal NN weights are constant and bounded so that
\[ \|Z\|_F \le Z_B, \tag{5.1.32} \]
with \(Z_B\) a known bound. □
Table 5.1.1: NN Force/Position Controller.

Control Input: the NN controller (5.1.23).

NN Weight/Threshold Tuning Algorithms:
\[ \dot{\hat W} = F\hat\sigma(\hat V^T x)(Lr)^T - F\hat\sigma'\hat V^T x\,(Lr)^T - \kappa F\|Lr\|\hat W \]
\[ \dot{\hat V} = G x\bigl(\hat\sigma'^T\hat W(Lr)\bigr)^T - \kappa G\|Lr\|\hat V \]

Design parameters: F, G positive definite matrices and \(\kappa > 0\) a small parameter.

Robustifying signal:
\[ v(t) = -K_z\bigl(\|\hat Z\|_F + Z_B\bigr)r \]
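The robustifying signal from Table 5.1.1 is a one-liner; the sketch below is ours, and the default gain value is hypothetical.

```python
import numpy as np

def robustifying(Z_hat_fro, Z_B, r, Kz=10.0):
    """Robustifying signal v(t) = -Kz (||Z_hat||_F + Z_B) r of Table 5.1.1.
    Z_hat_fro is the Frobenius norm of the current weight estimate matrix,
    Z_B the known ideal-weight bound; v opposes the filtered error r."""
    return -Kz * (Z_hat_fro + Z_B) * np.asarray(r)
```

Because the magnitude grows with the current weight-estimate norm, this term dominates the weight-dependent part of the disturbance bound (5.1.34) in the stability proof.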
Define also the overall NN weight estimate and error matrices
\[ \hat Z = \begin{bmatrix}\hat W & 0\\ 0 & \hat V\end{bmatrix}, \qquad \tilde Z = \begin{bmatrix}\tilde W & 0\\ 0 & \tilde V\end{bmatrix}. \tag{5.1.33} \]
Now, it can be shown that the disturbance term w(t) is bounded by
\[ \|w(t)\| \le C_0 + C_1\|\tilde Z\|_F + C_2\|\tilde Z\|_F\|r\|, \tag{5.1.34} \]
with \(C_0\), \(C_1\), \(C_2\) computable constants expressed in terms of various bounds. For instance, one can compute \(C_0\) as in (5.1.35). This interesting expression shows that the disturbances w(t) driving the error system increase as the disturbance torque \(\tau_d(t)\) increases, or as the NN functional estimation error \(\varepsilon(t)\) increases (which occurs, for instance, if the number of hidden-layer neurons is decreased, e.g. to save on computations).
5.1.2.2 NN Weight Tuning Algorithms for Guaranteed Stability

Based on the machinery just developed, it is now possible to complete the design of the hybrid position/force controller in Fig. 5.1.2 by providing NN weight tuning algorithms for \(\hat W\), \(\hat V\) and also a selection for the robustifying signal v(t) such that tracking performance and internal stability are both guaranteed. The next theorem provides the details; the NN controller it describes is given in Table 5.1.1. The complete proof of the theorem is given in Kwan and Lewis (1994).
Theorem 5.1.1 (NN Hybrid Position/Force Controller): Given the dynamics (5.1.1) for a robot exerting a force normal to a prescribed surface, let Assumption 5.1.1 hold, so that the desired trajectory \(q_d(t)\) is bounded and the target NN approximating weights are constant and bounded. Take the control input as (5.1.23) with the robustifying term
\[ v(t) = -K_z\bigl(\|\hat Z\|_F + Z_B\bigr)r \tag{5.1.36} \]
and the NN weight tuning algorithms
\[ \dot{\hat W} = F\hat\sigma(\hat V^T x)(Lr)^T - F\hat\sigma'\hat V^T x\,(Lr)^T - \kappa F\|Lr\|\hat W, \tag{5.1.37} \]
\[ \dot{\hat V} = G x\bigl(\hat\sigma'^T\hat W(Lr)\bigr)^T - \kappa G\|Lr\|\hat V, \tag{5.1.38} \]
with F, G positive definite matrices and \(\kappa > 0\). Then the position tracking error \(\|Lr(t)\|\) is bounded to the neighborhood of the bound \(b_r\) given in (5.1.39),
the force tracking error \(\tilde\lambda(t)\) is bounded, and the NN weight estimates \(\hat W\), \(\hat V\) are bounded. Moreover, the position tracking error can be made as small as desired by increasing the gain \(K_v\). The closed-loop error dynamics under this control are
\[ M L\dot r = -K_v L r - V_1 r + \tilde W^T(\hat\sigma - \hat\sigma'\hat V^T x) + \hat W^T\hat\sigma'\tilde V^T x + J^T[I + K_f]\tilde\lambda + (w + v). \]
This may be written as
\[ J^T[I + K_f]\tilde\lambda = M L\dot r + K_v L r + V_1 r - \tilde W^T(\hat\sigma - \hat\sigma'\hat V^T x) - \hat W^T\hat\sigma'\tilde V^T x - (w + v) \equiv B(r, x, \tilde Z), \]
where all quantities on the right-hand side are bounded. Therefore, premultiply by J to derive
\[ \tilde\lambda = [I + K_f]^{-1}(JJ^T)^{-1} J\, B(r, x, \tilde Z), \]
where we have used the fact that \(JJ^T\) is nonsingular. This expression shows that the force tracking error \(\tilde\lambda(t)\) is bounded and can be made as small as desired by increasing the force tracking gain \(K_f\). □
This proof has shown that the Lyapunov derivative is negative if \(\|r\|\) or \(\|\tilde Z\|_F\) are above some bounds, namely \(b_r\) in (5.1.39) and \(b_Z\) respectively. Thus, \(b_r\), \(b_Z\) may be interpreted as practical bounds for the filtered position tracking error and the NN weights. The form of the bounds shows that \(b_r\) may be made as small as desired by increasing the position feedback gains \(K_v\). Note that the dependence of \(b_r\) on \(C_0\) in (5.1.35) shows that the tracking errors increase if either the robot arm disturbance torque \(\tau_d(t)\) increases, or the NN approximation inaccuracy \(\varepsilon_N\) increases. The latter occurs, for instance, if a smaller NN is used to avoid computations. However, these effects may be offset by increasing \(K_v\). As in Chapter 4, a complete analysis shows that this controller is local in the sense that the initial tracking errors must be in a certain set of allowable initial conditions. This initial condition set depends on the speed of the desired trajectory and the size of the compact set over which the approximation property (5.1.21) holds. Approximation accuracy generally increases with the number of hidden-layer neurons in the NN. Technically, this leads also to a minimum requirement on the PD gain (see Chapter 4).

Some remarks about the NN hybrid position/force controller in Table 5.1.1 are in order. First, the NN weights can be initialized at zero, so that in Fig. 5.1.2 the PD loop initially stabilizes the system until the NN begins to learn. Therefore, no off-line training phase is needed for this NN controller. Next, note that the NN controller is very much like those in Chapter 4, but using Lr instead of r for position tracking. The first terms in tuning algorithms (5.1.37)-(5.1.38) are backpropagation-through-time terms, with the modified filtered error Lr(t) the signal being backpropagated. The Jacobian required, namely \(\hat\sigma'\), is very easily computed in terms of signals measurable in the feedback system. The last terms in the tuning algorithms correspond to Narendra's e-modification (Narendra and Annaswamy 1987), which is well known to be effective in standard adaptive control algorithms in providing robustness and avoiding the need for persistence of excitation. The middle term in the tuning algorithm for \(\hat W\) is new; it is required due to the nonlinear dependence of \(\hat f\) on the first-layer NN weights \(\hat V\).
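A hedged sketch of one Euler step of a (5.1.37)-style update may clarify the three terms: the backpropagation term, the correction for the nonlinear V-dependence, and the e-mod leakage. The discretization, step size, and gain values are hypothetical, not from the text.

```python
import numpy as np

def w_update(W_hat, sigma, sigma_p_Vx, Lr, F, kappa, dt=1e-3):
    """One Euler step of a (5.1.37)-style first-layer tuning law:
    F*sigma*(Lr)^T  (backprop term),
    -F*(sigma' V^T x)*(Lr)^T  (correction for nonlinear V-dependence),
    -kappa*F*||Lr||*W_hat  (e-mod leakage, keeps weights bounded)."""
    dW = F @ (np.outer(sigma, Lr) - np.outer(sigma_p_Vx, Lr)) \
         - kappa * np.linalg.norm(Lr) * (F @ W_hat)
    return W_hat + dt * dW
```

With zero activation signals, only the leakage term acts and the weight norm decays — the mechanism by which e-mod rules out weight drift without persistence of excitation.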
Figure 5.1.3: Closed-loop position error system.

Note finally that boundedness of the tracking position error, the force error, and the NN weights guarantees that the proposed control input (5.1.23) is bounded. More discussion on NN controllers of this form is provided in Chapter 4.

5.1.2.3 Passivity of the NN Hybrid Position/Force Controller
The NN hybrid position/force controller has the same passivity properties as the NN controllers in Chapter 4. The closed-loop error system (5.1.25) has the structure shown in Fig. 5.1.3, where all blocks are interconnected in a feedback configuration and
\[ \zeta_w = \tilde W^T(\hat\sigma - \hat\sigma'\hat V^T x), \tag{5.1.40} \]
\[ \zeta_v = \hat W^T\hat\sigma'\tilde V^T x. \tag{5.1.41} \]
Passivity and state-strict passivity (SSP) were defined in Chapter 2. It was shown in Chapter 3 that the filtered error dynamics are a state-strict passive system; it is easily shown that the constrained filtered position error system in the top block in the figure is also state-strict passive. The next result details the passivity properties of the position/force controller. The proof is given in Kwan and Lewis (1994) and closely follows similar proofs in Chapter 4.

Theorem 5.1.2 (Passivity Properties of NN Force Controller): Under the hypotheses of Theorem 5.1.1, the NN weight tuning algorithms in Table 5.1.1 make the maps from Lr(t) to \(\zeta_w\) and from Lr(t) to \(\zeta_v\) both SSP maps. □
Since both the filtered error dynamics and the tuning algorithms are SSP, it follows that the overall closed-loop system is SSP. Therefore, all signals and internal states (e.g. the NN weights) are bounded. The SSP property explains why no persistence of excitation (PE) is needed for this controller, as SSP guarantees weight boundedness without an observability-like condition. It should be noted that if one uses only backpropagation weight tuning, which basically amounts to using only the first terms in the tuning algorithms in Table 5.1.1, the closed-loop system is not SSP, but only passive. Therefore, an additional PE condition is needed for the controller to perform correctly.
Figure 5.1.4: Two-link planar elbow arm with circle constraint.

5.1.3 Design Example for NN Hybrid Position/Force Controller
Example 5.1.1 (NN Hybrid Position/Force Controller): The two-link planar manipulator in Fig. 5.1.4 has dynamics (see Chapter 3) given by
\[ M(q) = \begin{bmatrix} \alpha + \beta + 2\eta\cos q_2 & \beta + \eta\cos q_2 \\ \beta + \eta\cos q_2 & \beta \end{bmatrix}, \]
where \(\alpha = (m_1 + m_2)a_1^2\), \(\beta = m_2 a_2^2\), \(\eta = m_2 a_1 a_2\), \(e_1 = g/a_1\). The specific arm considered in this example has \(\alpha = 0.8\), \(\beta = 0.32\), \(\eta = 0.4\), \(a_1 = 1\), \(a_2 = 0.8\). Friction is not included in this example.
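The inertia matrix above, with the stated example parameters, can be coded directly; the function name is our own.

```python
import numpy as np

ALPHA, BETA, ETA = 0.8, 0.32, 0.4   # example arm parameters from the text

def inertia_matrix(q2):
    """Inertia matrix M(q) of the two-link planar elbow arm.
    Only the elbow angle q2 enters M for this arm."""
    c2 = np.cos(q2)
    return np.array([[ALPHA + BETA + 2*ETA*c2, BETA + ETA*c2],
                     [BETA + ETA*c2,           BETA]])
```

A quick check confirms the properties the control proofs rely on: M(q) is symmetric and positive definite for these parameter values.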
a. Constraint Surface and Jacobians. The constraint surface is the circle in Cartesian space \((y_1, y_2)\) shown in Fig. 5.1.4, which can be expressed as
\[ \phi(y) = y_1^2 + y_2^2 - r^2 = 0, \]
where \(y := [y_1\; y_2]^T\). The transformation from joint space to Cartesian space is given by the arm forward kinematics. Therefore, in joint space the constraint is expressed as a relation that has a unique constant solution for \(q_2\).
The Jacobian J(q) and the extended Jacobian follow by differentiating the kinematics and the constraint. Using these constructions, the constrained dynamics (5.1.11) are obtained in reduced form.
b. Simulation. Let the desired motion trajectory be
\[ q_{1d}(t) = \begin{cases} -90 + 52.5(1 - \cos 1.26t)\ \mathrm{m}, & t < 2.5\ \mathrm{sec} \\ 15\ \mathrm{m}, & t > 2.5\ \mathrm{sec} \end{cases} \]
and the desired force be \(\lambda_d = 10\) nt. Take the initial conditions as \(q_1(0) = -70\) deg, \(q_2(0) = 80\) deg, \(\dot q_1(0) = 0\), \(\dot q_2(0) = 0\). The controller parameters are selected as \(\Lambda = 20\).
where \(s > 0\) is a design parameter, \(\gamma < \ln 2/s\), and the gain in (6.1.13) is time-varying and given by (6.3.11); \(u_c\) in (6.3.10) is as defined in (6.1.12).
with the multiplier \(K_2 > \max\{C_2,\, C_4/(\sqrt\kappa\,\theta_{gm})\}\) a design parameter. The robustifying control term is
\[ u_r = -\mu\,\frac{|\hat g|}{\underline g}\,|u_c|\,\mathrm{sgn}(r), \qquad \mu \ge 2, \tag{6.3.12} \]
and the indicator function I is defined as
\[ I = \begin{cases} 1, & \text{if } \hat g \ge \underline g \text{ and } |u_c| \le s \\ 0, & \text{otherwise,} \end{cases} \]
where sgn(·) is the signum function. Note that I = 0 when the estimate \(\hat g\) is too small or \(u_c(t)\) is too large, either one corresponding to an undesirable situation. It is important to note that u is well-defined for all \(\hat g\), even when I = 0. Therefore, \(\hat g \to 0\) does not generate any unbounded control signal. The intuition behind this controller is as follows. When \(\hat g \ge \underline g\) and \(|u_c| < s\), the total control action is set to \(u_c\); otherwise control is switched to the auxiliary input \(u_r\). This controller structure is shown in Fig. 6.3.1. Thus the resulting control action is well-defined everywhere, and the uniform ultimate boundedness (UUB) of the closed-loop system can be shown with suitable NN weight tuning algorithms, as in the next subsection. Our proposed controller in (6.3.10) is formed from this idea with extra terms added for a smooth transition between \(u_c\) and \(u_r\), so that existence of solutions is guaranteed.
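The switching between u_c and u_r with exponential blending can be sketched as follows. The blend form is an assumption reconstructed from the transition terms appearing later in the stability proof, and the function and argument names are hypothetical.

```python
import numpy as np

def total_control(u_c, u_r, g_hat, g_lower, s, gamma):
    """Blended control in the style of Table 6.3.1: use u_c when the
    g-estimate is healthy and u_c is not too large (I = 1), otherwise fall
    back to the robust term u_r; the exponential weight makes the two
    branches agree at the switching surface |u_c| = s (continuity)."""
    I = (g_hat >= g_lower) and (abs(u_c) <= s)
    if I:
        return u_c + 0.5*(u_r - u_c)*np.exp(gamma*(abs(u_c) - s))
    return u_r - 0.5*(u_r - u_c)*np.exp(-gamma*(abs(u_c) - s))
```

At |u_c| = s both branches give (u_c + u_r)/2, and well inside the I = 1 region the output is essentially u_c — the smooth transition invoked for existence of solutions.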
6.3.2 NN Weight Tuning for Tracking Stability

The next theorem shows how to tune the weights in the two NNs so that tracking performance and internal stability are guaranteed. The resulting controller is shown in Table 6.3.1. It is important for the existence of solutions to the closed-loop system to note that the trajectory generated for \(\hat\Theta_g(t)\) is continuous.
Table 6.3.1: Neural Net Controller with Unknown f(x) and g(x).

NN Controller:
\[ u = \begin{cases} u_c + \dfrac{u_r - u_c}{2}\,e^{\gamma(|u_c|-s)}, & \text{if } I = 1 \quad (6.3.13)\\[4pt] u_r - \dfrac{u_r - u_c}{2}\,e^{-\gamma(|u_c|-s)}, & \text{if } I = 0 \quad (6.3.14)\end{cases} \]

Auxiliary Control Input:
\[ u_r = -\mu\,\frac{|\hat g|}{\underline g}\,|u_c|\,\mathrm{sgn}(r) \tag{6.3.15} \]

PD Term: (6.3.16)

Time-Varying Gain: (6.3.17)

Indicator Function:
\[ I = \begin{cases} 1, & \text{if } \hat g \ge \underline g \text{ and } |u_c| \le s\\ 0, & \text{o.w.} \end{cases} \tag{6.3.18} \]

NN Weight Tunings: (6.3.19), (6.3.20)

Signals:
\[ e(t) = x(t) - x_d(t) \qquad \text{Tracking error} \tag{6.3.21} \]
\[ r(t) = \Lambda^T e(t) \qquad \text{Filtered tracking error} \tag{6.3.22} \]
\[ y_d = -x_d^{(n)} + [0\;\;\Lambda^T]\,e \qquad \text{Desired trajectory feedforward signal} \tag{6.3.23} \]

Design Parameters: constant gains \(\Lambda > 0\), \(K_N > 0\), \(K_2 > \max\{C_2,\, C_4/(\sqrt\kappa\,\theta_{gm})\}\); tuning matrices \(M_i\), \(N_i\) symmetric and positive definite; scalars \(\kappa > 0\), \(s > 0\), \(\gamma < \ln 2/s\), \(\mu \ge 2\).
Theorem 6.3.1 (NN Tracking Controller for Unknown f and g): Assume that the feedback linearizable system has a representation in the Brunovsky canonical form and let all assumptions hold. Let the control input be given by (6.3.10). Let the weight update laws for the f(x) NN be (6.3.24), and the weight updates for the g(x) NN be provided by
\[ \dot{\hat W}_g = \begin{cases} M_g\bigl[(\hat\sigma_g - \hat\sigma_g'\hat V_g^T x)u_c r - \kappa|r|\,|u_c|\hat W_g\bigr], & \text{if } I = 1\\ 0, & \text{o.w.,}\end{cases} \qquad \dot{\hat V}_g = \begin{cases} N_g x\bigl(\hat\sigma_g'^T\hat W_g u_c r\bigr)^T - \kappa|r|\,|u_c| N_g\hat V_g, & \text{if } I = 1\\ 0, & \text{o.w.,}\end{cases} \tag{6.3.25} \]
where \(M_i\) and \(N_i\), i = f, g, are positive definite matrices. Then the filtered tracking error r(t), neural net weight errors \(\tilde\Theta_f(t)\), \(\tilde\Theta_g(t)\), and control input u(t) are UUB, with constant specific bounds given in (6.3.32). Moreover, the filtered tracking error r(t) can be made arbitrarily small by increasing the gain \(K_N\).

Proof: Let the Lyapunov function candidate be
\[ L = \tfrac12 r^2 + \tfrac12\,\mathrm{tr}\Bigl\{\sum_{i=f,g} \tilde W_i^T M_i^{-1}\tilde W_i + \tilde V_i^T N_i^{-1}\tilde V_i\Bigr\}. \tag{6.3.26} \]
We will study the derivative of (6.3.26) in two mutually exclusive and exhaustive regions.

Region 1: \(|\hat g| \ge \underline g\) and \(|u_c| \le s\).

Substitution of the functional approximation errors (6.3.8) into the error system dynamics (6.1.14) yields
\[ \dot r = -K_v r + \tilde W_f^T(\hat\sigma_f - \hat\sigma_f'\hat V_f^T x) + \hat W_f^T\hat\sigma_f'\tilde V_f^T x + \bigl[\tilde W_g^T(\hat\sigma_g - \hat\sigma_g'\hat V_g^T x) + \hat W_g^T\hat\sigma_g'\tilde V_g^T x\bigr]u_c + d + w_f + w_g u_c + g u_d. \]
The time derivative of (6.3.26) is
\[ \dot L = r\dot r + \mathrm{tr}\Bigl\{\sum_{i=f,g} \tilde W_i^T M_i^{-1}\dot{\tilde W}_i + \tilde V_i^T N_i^{-1}\dot{\tilde V}_i\Bigr\}. \tag{6.3.27} \]
Substitute now \(\dot r\) into (6.3.27) and perform a simple manipulation (i.e., using \(x^T y = \mathrm{tr}\{x^T y\} = \mathrm{tr}\{y x^T\}\), one can place weight matrices inside a trace operator). Then
\[ \dot L = -K_v r^2 + r(d + w_f) + r g u_d + r w_g u_c + \mathrm{tr}\bigl\{\tilde W_f^T\bigl((\hat\sigma_f - \hat\sigma_f'\hat V_f^T x)r + M_f^{-1}\dot{\tilde W}_f\bigr)\bigr\} + \mathrm{tr}\bigl\{\tilde V_f^T\bigl(x r\hat W_f^T\hat\sigma_f' + N_f^{-1}\dot{\tilde V}_f\bigr)\bigr\} + \mathrm{tr}\bigl\{\tilde W_g^T\bigl((\hat\sigma_g - \hat\sigma_g'\hat V_g^T x)u_c r + M_g^{-1}\dot{\tilde W}_g\bigr)\bigr\} + \mathrm{tr}\bigl\{\tilde V_g^T\bigl(x u_c r\hat W_g^T\hat\sigma_g' + N_g^{-1}\dot{\tilde V}_g\bigr)\bigr\}. \]
With the update rules given in (6.3.24) and (6.3.25),
\[ \dot L = -K_v r^2 + r(d + w_f) + r w_g u_c + r g u_d + \kappa|r|\,\mathrm{tr}\{\tilde\Theta_f^T\hat\Theta_f\} + \kappa|r|\,|u_c|\,\mathrm{tr}\{\tilde\Theta_g^T\hat\Theta_g\}, \]
and from the inequality
\[ \mathrm{tr}\{\tilde\Theta^T\hat\Theta\} = \mathrm{tr}\{\tilde\Theta^T(\Theta - \tilde\Theta)\} \le \|\tilde\Theta\|\bigl(\theta_m - \|\tilde\Theta\|\bigr), \]
it follows that
\[ \dot L \le -K_v r^2 + r(d + w_f) + r w_g u_c + r g u_d + \kappa|r|\,\|\tilde\Theta_f\|\bigl(\theta_{fm} - \|\tilde\Theta_f\|\bigr) + \kappa|r|\,\|\tilde\Theta_g\|\bigl(\theta_{gm} - \|\tilde\Theta_g\|\bigr)|u_c|. \]
Substitute the upper bounds of \(w_f\) and \(w_g\) according to (6.3.9) and \(K_v\) from (6.3.11). Picking \(K_2 > C_2\), since \(|\hat g| \ge \underline g\) and \(\mu \ge 2\), the last term in this inequality is always negative. Now we can write the final form by completing the squares as (6.3.28), where \(C_f\) and \(C_1\) are computable constants. Observe that the terms in braces in (6.3.28) define a conic ellipsoid, a compact set around the origin of \((r, \|\tilde\Theta_f\|, \|\tilde\Theta_g\|)\). We can therefore deduce from (6.3.28) that if \(|r| > \delta_r\), then \(\dot L \le 0\) for all \(\|\tilde\Theta_f\|\) and \(\|\tilde\Theta_g\|\), where \(\delta_r\) is given by (6.3.29); or, if \(\|\tilde\Theta_f\| > \delta_f\), then \(\dot L \le 0\) for all \(|r|\) and \(\|\tilde\Theta_g\|\), where
\[ \delta_f = \frac{C_1}{2} + \sqrt{\frac{C_f}{\kappa}} \]
(since the quadratic terms dominate the linear terms after the certain points \(\delta_{r_1}\) and \(\delta_{f_1}\)).
For the weights of g(x) we can only claim an upper bound \(\delta_{g_1}\), a computable constant, when \(|u_c| \ge c_u\) for any positive \(c_u\). However, it is not straightforward to show a bound on \(\tilde\Theta_g\) through (6.3.28) when \(|u_c| < c_u\). Since
\[ \|\hat\Theta_g\| = \|\Theta_g - \tilde\Theta_g\| \le \|\tilde\Theta_g\| + \theta_{gm}, \]
this implies another upper bound \(\delta_{g_2}\) on \(\|\tilde\Theta_g\|\).
Figure 6.3.2: Illustration of the upper bound on \(\|\tilde\Theta_g\|\).

We have shown two upper bounds on \(\|\tilde\Theta_g\|\); one can therefore establish a finite upper bound on \(\|\tilde\Theta_g\|\) for all \(|u_c| < s\) as \(\delta_g = \min\{\delta_{g_1}, \delta_{g_2}\}\). The symbolic variation of these bounds is illustrated in Fig. 6.3.2. This shows the boundedness of r, \(\tilde\Theta_f\), \(\tilde\Theta_g\), and \(u_r\); since \(|u_c| \le s\), this implies that \(u \in L_\infty\).
Region 2: \(\hat g < \underline g\) or \(|u_c| > s\).

With the update rules corresponding to this region, \(\dot L\) becomes (6.3.30). Observe that \(u_c\) may not be defined in this region; though, for notational simplicity, we will use it in the \(\hat g u_c\) or \(u_c e^{-\gamma(|u_c|-s)}\) forms, which are bounded when \(\hat g = 0\). Now define
\[ L_g \equiv r\tilde g u_c + r g u_d = -r\hat g u_c + r g u. \]
Substitution of the controller corresponding to this region yields
\[ L_g = -r\hat g u_c - r g u_r\Bigl(1 - \tfrac12 e^{-\gamma(|u_c|-s)}\Bigr) + \tfrac12 r g u_c\, e^{-\gamma(|u_c|-s)}. \]
Assume at time \(t_0\) that \(\hat g\) is in a compact set and invoke Assumption 6.3.1; then, using \(e^{\gamma s} \le 2\), the corresponding terms are bounded. The other case occurs when \(|u_c| > s\); then, if \(|r| > \delta_{r_2}\) or \(\|\tilde\Theta_f\| > \delta_{f_2}\), \(\dot L \le 0\). This implies that x and \(\tilde\Theta_f\) stay in a compact set, and so does g(x). This shows the boundedness of r and \(\tilde\Theta_f\); together with bounded \(\tilde\Theta_g\), this implies that \(u_r \in L_\infty\), hence \(u \in L_\infty\).

Reprise: Combining the results from regions one and two, one can readily set
\[ \delta_r = \max\{\delta_{r_1}, \delta_{r_2}\}, \qquad \delta_f = \max\{\delta_{f_1}, \delta_{f_2}\}, \qquad \delta_g = \min\{\delta_{g_1}, \delta_{g_2}\}. \tag{6.3.32} \]
Thus for both regions, if \(|r| > \delta_r\) or \(\|\tilde\Theta_f\| > \delta_f\) or \(\|\tilde\Theta_g\| > \delta_g\), then \(\dot L \le 0\) and \(u \in L_\infty\). Let us denote \((|r|, \|\tilde\Theta_f\|, \|\tilde\Theta_g\|)\) by new coordinate variables \((\xi_1, \xi_2, \xi_3)\) and define the region
\[ V \equiv \{\xi \mid \xi_i < \delta_i\}; \]
then \(\delta_i < \bar\delta_i\) implies that \(V \subset \Omega\). These sets are shown in Fig. 6.3.3. We have proved that whenever \(\xi_i > \delta_i\), then \(L(\xi)\) will not increase. This implies that ξ will decrease; in other words, it will stay in the region Ω, which is an invariant set. Therefore all the signals in the closed-loop system remain bounded. This concludes the proof. □
See the remarks following Theorem 6.2.1. The tuning algorithms presented in the theorem are augmented versions of backpropagation through time. Simplified Hebbian tuning algorithms are given in (Yeşildirek 1994) (cf. Chapter 4). Also note the following.

Remarks:
Figure 6.3.3: Illustration of the invariant set.

1. For practical purposes, (6.3.32) can be considered as bounds on \(|r|\), \(\|\tilde\Theta_f\|\), and \(\|\tilde\Theta_g\|\) in the sense that excursions above these bounds will be small.
2. Note from the definitions of \(\delta_{r_1}\) and \(\delta_{r_2}\) that the bound on the tracking error may be kept arbitrarily small by selecting the control gain \(K_N\) large enough.
3. Note that the tuning of the NN that estimates g(x) is interrupted if \(\hat g\) becomes too small. If the switching parameter s is chosen too small, it will limit the control input and result in a large tracking error, which gives undesirable closed-loop performance. If it is too large, the control actuator may saturate as u(t) increases in magnitude.

4. Stability of the closed-loop system is shown without making any assumptions on the initial NN weight values. The NNs can easily be initialized as \(\hat\Theta_f(0) = 0\) and \(\hat\Theta_g(0)\) chosen so that \(\hat g(0) \ge \underline g\). It is crucial to note that the NN need not be trained off-line before use in the closed loop. No assumptions of the initial weights being in an invariant set, or a region of attraction, or a feasible region are needed.
6.3.3 Illustrative Simulation Examples

Example 6.3.1 (Van der Pol System): As an example consider a Van der Pol system
\[ \dot x_1 = x_2, \qquad \dot x_2 = (1 - x_1^2)x_2 - x_1 + (1 + x_1^2 + x_2^2)u, \tag{6.3.33} \]
which is in the controllability canonical form and has \(g(x) \ge 1 \equiv \underline g\) for all x. The neural nets used for \(\hat f\) and \(\hat g\) consist of 10 neurons each. The function sgn(r) is approximated by a hyperbolic tangent function in simulations. Design parameters are set to s = 10, γ = 0.05, \(K_N = 20\), Λ = 5, \(M_i = N_i = 20\), µ = 4, and the rest are set equal to 1. Initial
conditions are \(\hat\Theta_f(0) = 0\) and \(\hat\Theta_g(0) = 0.4\), so that \(\hat g(0) > 1\), and \(x_1(0) = x_2(0) = 1\). The desired trajectory is defined as \(y_d(t) = \sin t\). The actual and desired outputs obtained by simulation are shown in Fig. 6.3.4 and the control action is shown in Fig. 6.3.5. Note that almost perfect tracking is obtained in less than one second. □

Figure 6.3.4: Actual and desired states.

Figure 6.3.5: Control input.

Figure 6.3.6: Actual and desired states.

Example 6.3.2 (Ill-Defined Relative Degree System): Let us change the above Van der Pol system so that it is ill-defined when \(x_1(t) = 0\):
\[ \dot x_1 = x_2, \qquad \dot x_2 = (1 - x_1^2)x_2 - x_1 + x_1^2\, u. \tag{6.3.34} \]
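For orientation, a baseline simulation of the nominal Van der Pol plant (6.3.33) under exact feedback linearization — f and g assumed known, which the NN controller does not require — can be sketched. Gains, step size, and function names are hypothetical.

```python
import numpy as np

def f(x):
    return (1 - x[0]**2)*x[1] - x[0]

def g(x):
    return 1 + x[0]**2 + x[1]**2   # >= 1, the lower bound g_lower of the text

def simulate(T=20.0, dt=1e-3, lam=5.0, Kv=20.0):
    """Track y_d = sin t for the plant (6.3.33) with f, g known exactly,
    using u = (-f + ydd - lam*edot - Kv*r)/g, which gives rdot = -Kv*r.
    Returns the final absolute tracking error."""
    x = np.array([1.0, 1.0])
    for k in range(int(T/dt)):
        t = k*dt
        yd, yd1, yd2 = np.sin(t), np.cos(t), -np.sin(t)
        e, ed = x[0] - yd, x[1] - yd1
        r = lam*e + ed                      # filtered tracking error
        u = (-f(x) + yd2 - lam*ed - Kv*r) / g(x)
        x = x + dt*np.array([x[1], f(x) + g(x)*u])
    return abs(x[0] - np.sin(T))
```

With exact cancellation the filtered error obeys \(\dot r = -K_v r\), so the error decays exponentially; the NN controller of Table 6.3.1 must recover this behaviour while learning f and g on-line.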
We took the same NN controller parameters as in Example 6.3.1. Although g(x) is not lower bounded, we treated \(\underline g\) as a design parameter and set \(\underline g = 0.1\). The objective is to show that our NN controller can give good performance even with systems that are not well-defined in relative degree. Simulation results showing the performance of the NN controller are given in Figs. 6.3.6 and 6.3.7. Observe that around the singularity points (\(t = n\pi\) for \(n = \ldots, -1, 0, 1, \ldots\), after tracking is satisfied) the control needed to linearize the system reaches its peak, which is set by the design parameters \(\underline g\) and s. That is, when \(u \gg s\), \(u \to u_r\), which is proportional to \(\underline g^{-1}\). There is a trade-off between the amount of control signal which can be applied and the bound on the tracking error in the neighborhood of those singular points. By choosing the lower bound \(\underline g\), the amounts of control and tracking error are decided. Further discussion on control of systems with ill-defined relative degree is given in (Commuri and Lewis 1994). □

Example 6.3.3 (Chemical Stirred-Tank Reactor): Chemical systems, in general, can be very difficult to control because of their strong nonlinearities, which are difficult to model, even though they may have few variables. They offer a good application for NN-based control.
Figure 6.3.7: Control input.

Figure 6.3.8: Chemical stirred-tank reactor (CSTR) process.
a. CSTR Dynamical Model. There are many different models for bioreactors that contain liquid in a tank and cells that consume substrates and yield desired products and undesired byproducts. As an example, we take the continuously stirred tank reactor (CSTR), which is described as a benchmark problem in adaptive control (Ungar 1990) when the flow is considered as the control. A more complete and realistic model using a cooler around the tank is shown in Fig. 6.3.8. Let the reactor vessel volume be V. We assume the concentration \(C_A\) inside the tank is uniform, with temperature T. The inlet reactant is supplied with concentration \(C_{Af}\), feed rate F, and temperature \(T_f\). The control input is the coolant temperature \(T_c\). A dynamic model for the CSTR temperature control is then given by Ray (1975) as the mass and energy balances (6.3.35), where the system parameters \(\rho C_p\), h, A, E, \(k_0\), and ΔH are assumed to be constant. Normalize the time scale by the residence time and define a transformation as in (6.3.36), with \(T_{fd}\) the nominal inlet temperature. Note that the normalized \(x_1(t)\) is the difference between inlet and reactor concentrations, and it is always less than or equal to 1 (since \(C_{Af} \ge C_A \ge 0\)).
Suppose the objective is to control the temperature T of the reactor. Then, (6.3.35) can be rewritten in terms of dimensionless variables as
\[ \dot x_1 = -x_1 + D_a(1 - x_1)\,e^{x_2/(1 + x_2/\gamma)}, \]
\[ \dot x_2 = -x_2 + B D_a(1 - x_1)\,e^{x_2/(1 + x_2/\gamma)} - \beta(x_2 - x_{2c}) + d + \beta u, \tag{6.3.37} \]
with constants
\[ D_a = \frac{k_0 V}{F}\,e^{-\gamma}, \quad B = \frac{-\Delta H\, C_{Af}\,\gamma}{\rho C_p T_{fd}}, \quad \gamma = \frac{E}{R T_{fd}}, \quad \beta = \frac{hA}{\rho C_p F}, \quad x_{2c} = \gamma\,\frac{T_{cd} - T_{fd}}{T_{fd}}, \quad x_{2d} = \gamma\,\frac{T_d - T_{fd}}{T_{fd}}, \]
where \(T_{cd}\) and \(T_d\) are the nominal design values of coolant and reactor temperatures respectively. Although it is assumed that \(D_a\), γ, B, and β are constants, there are some uncertainties in them. The Damköhler constant \(D_a\) is, in fact, a function of the reactor catalyst, and the dimensionless heat transfer constant β is a slowly-varying parameter. Note that this is a feedback linearizable model, though the less complete and realistic flow control CSTR model in Ungar (1990) is not. The behavior of the bioreactor is usually studied for three typical cases associated with three sets of plant parameters:

Case    B    β    D_a
1       7    0.5  0.110
2       8    0.3  0.072
3       11   1.5  0.135

with γ = 20. The corresponding regions have been studied previously (Liu and Lewis 1994, Ray 1975). The desired inlet reactant temperature \(T_{fd}\) is here selected as 300 °K, and the control objective is to keep the reactor temperature T at this level. Then, the tank concentration will converge to some constant steady-state value.
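The dimensionless CSTR model can be coded directly with the Case 1 parameters. The exponential term follows the standard Arrhenius form assumed in our reconstruction of (6.3.37), and \(x_{2c} = 0\) (coolant at the nominal inlet temperature) is an assumption.

```python
import numpy as np

# Case 1 parameter values from the text; X2C = 0 is an assumption.
GAMMA, B, BETA, DA, X2C = 20.0, 7.0, 0.5, 0.110, 0.0

def cstr(x, u=0.0, d=0.0):
    """Dimensionless CSTR dynamics (6.3.37): x1 is the normalized
    concentration difference, x2 the normalized temperature, u the
    coolant-temperature control, d a disturbance."""
    x1, x2 = x
    rate = DA*(1.0 - x1)*np.exp(x2/(1.0 + x2/GAMMA))
    return np.array([-x1 + rate,
                     -x2 + B*rate - BETA*(x2 - X2C) + d + BETA*u])
```

The temperature equation is driven through the same reaction-rate term as the concentration, scaled by B, which is why a temperature disturbance propagates into both states in the open-loop response discussed next.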
Figure 6.3.9: CSTR open-loop response to a disturbance.
b. Open-Loop Response. The open-loop response of the bioreactor is shown in Fig. 6.3.9. In order to show the sensitivity to variations in the inlet reactant temperature, at time t = 40 units we added a disturbance effect of a 5 °K increase in the inlet reactant temperature. Observe the response of the reactor for the different cases. Clearly, the temperature T fluctuates widely, an undesirable state of affairs. The tank temperature T increases by about 30 °K in Case 1 and twice as much in Case 2, and exhibits oscillatory behavior in Case 3.

c. NN Controller Design and Simulation. The NN control structure in Table 6.3.1 was now used to regulate the CSTR temperature T. We use six neurons in the hidden layers for both \(\hat f\) and \(\hat g\). The design parameters were selected as

with η > 0, δ > 0 and ρ > 0 design parameters. Then the filtered tracking error r(k) and the NN weight estimates \(\hat W_f(k)\) and \(\hat W_g(k)\) are UUB, with the bounds specifically given by Equations (8.3.89) or (8.3.110), (8.3.93) or (8.3.114), and (8.3.97), provided the following conditions hold:
(1) \(\beta\|\varphi_g(k)u_c(k)\| = \beta\|\varphi_g(k)\|^2 < 1\), (8.3.67)
(2) \(\alpha\|\varphi_f(k)\|^2 < 1\), (8.3.68)
(3) \(\eta + \max(P_1, P_3, P_4) < 1\), (8.3.69)
(4) \(0 < \delta < 1\), (8.3.70)
(5) \(0 < \rho < 1\), (8.3.71)
(6) \(\max(a_2, b_0) < 1\), (8.3.72)

with \(P_1\), \(P_3\), and \(P_4\) constants which depend upon η, δ and ρ, where
\[ \eta = \begin{cases} \alpha\|\varphi_f(k)\|^2 + \beta\|\varphi_g(k)u_c(k)\|^2 = \alpha\|\varphi_f(k)\|^2 + \beta\|\varphi_g(k)\|^2, & I = 1,\\ \alpha\|\varphi_f(k)\|^2, & I = 0, \end{cases} \tag{8.3.73} \]
and \(a_2\), \(b_0\) design parameters chosen using the gain matrix \(k_v\). Note: the parameters α, β and η depend upon the trajectory.

Proof: Region I: \(|g(x(k))| \ge \underline g\) and \(|u_c(k)| \le s\). Select the Lyapunov function candidate (8.3.10), whose first difference is given by (8.3.11). The error dynamics for the weight update laws are given for this region as
\[ \tilde W_f(k+1) = \bigl(I - \alpha\varphi_f(k)\varphi_f^T(k)\bigr)\tilde W_f(k) - \alpha\varphi_f(k)\bigl(k_v r(k) + e_g(k) + g(x)u_d(k) + \varepsilon(k) + d(k)\bigr)^T + \delta\|I - \alpha\varphi_f(k)\varphi_f^T(k)\|\,\hat W_f(k) \tag{8.3.74} \]
and
\[ \tilde W_g(k+1) = \bigl(I - \beta\varphi_g(k)\varphi_g^T(k)\bigr)\tilde W_g(k) - \beta\varphi_g(k)\bigl(k_v r(k) + e_f(k) + g(x)u_d(k) + \varepsilon(k) + d(k)\bigr)^T + \rho\|I - \beta\varphi_g(k)\varphi_g^T(k)\|\,\hat W_g(k). \tag{8.3.75} \]
Substituting (8.3.74) and (8.3.75) in (8.3.11), combining, rewriting, completing the squares, and simplifying, we get
\[
\begin{aligned}
\Delta J ={}& -r(k)^T\bigl[I - (2+\eta)k_v^T k_v\bigr]r(k) + 2(2+\eta)\bigl(k_v r(k)\bigr)^T\bigl(g(x)u_d(k) + \varepsilon(k) + d(k)\bigr)\\
&+ (2+\eta)\bigl(g(x)u_d(k) + \varepsilon(k) + d(k)\bigr)^T\bigl(g(x)u_d(k) + \varepsilon(k) + d(k)\bigr)\\
&+ 2P_2\|k_v r(k) + g(x)u_d(k) + \varepsilon(k) + d(k)\| + 2\eta\bigl(g(x)u_d(k)\bigr)^T\bigl(\varepsilon(k) + d(k)\bigr) + 2\eta\varepsilon(k)d(k)\\
&- (1 - \eta - P_3)\|e_f(k)\|^2 - (1 - \eta - P_4)\|e_g(k)\|^2 - 2(1 - \eta - P_1)\|e_f(k)\|\,\|e_g(k)\|\\
&- \tfrac1\eta\|I - \alpha\varphi_f(k)\varphi_f^T(k)\|^2\bigl[\delta(2-\delta)\|\tilde W_f(k)\|^2 - 2\delta(1-\delta)\|\tilde W_f(k)\|W_{f\max} - \delta^2 W_{f\max}^2\bigr]\\
&- \tfrac1\eta\|I - \beta\varphi_g(k)\varphi_g^T(k)\|^2\bigl[\rho(2-\rho)\|\tilde W_g(k)\|^2 - 2\rho(1-\rho)\|\tilde W_g(k)\|W_{g\max} - \rho^2 W_{g\max}^2\bigr],
\end{aligned}
\tag{8.3.76}
\]
where η is given in (8.3.69), \(P_1\) in (8.3.77), and
\[ P_2 = 2\bigl(\delta\|I - \alpha\varphi_f(k)\varphi_f^T(k)\|\,W_{f\max}\varphi_{f\max} + \rho\|I - \beta\varphi_g(k)\varphi_g^T(k)\|\,W_{g\max}\varphi_{g\max}\bigr), \tag{8.3.78} \]
\[ P_3 = \bigl(\eta + \delta\|I - \alpha\varphi_f(k)\varphi_f^T(k)\|\bigr)^2, \tag{8.3.79} \]
\[ P_4 = \bigl(\eta + \rho\|I - \beta\varphi_g(k)\varphi_g^T(k)\|\bigr)^2. \tag{8.3.80} \]
Now in this region, the bound on \(u_d(k)\) can be obtained as
\[ \|u_d(k)\| \le \|u(k) - u_c(k)\| \le \Bigl\|\frac{u_r(k) - u_c(k)}{2}\,e^{-\gamma(|u_c(k)|-s)}\Bigr\|. \tag{8.3.81} \]
In this region, since \(|u_c(k)| \le s\) and the auxiliary input \(u_r(k)\) is given by (8.2.31), the bound in (8.3.81) can be taken as a constant, since all the terms on the right side are bounded; this bound is denoted as in (8.3.82).
8.3. SINGLE-LAYER NN FOR FEEDBACK LINEARIZATION
Then the bound for $g(x)u_d(k)$ can be written as (8.3.29). Using this bound in (8.3.76) and completing the squares for $\|\tilde W_f(k)\|$ and $\|\tilde W_g(k)\|$, we obtain
\[
\begin{aligned}
\Delta J <{}& -(1-a_2)\,\|r(k)\|^2 + 2a_3\,\|r(k)\| + a_4\\
&- (1-\eta-P_3)\,\|e_f(k)\|^2 - (1-\eta-P_4)\,\|e_g(k)\|^2
- 2(1-\eta-P_1)\,\|e_f(k)\|\,\|e_g(k)\|\\
&- \big\|\big(\sqrt{P_3}\,e_f(k) + \sqrt{P_4}\,e_g(k)\big)
- \big(k_v r(k) + g(x)u_d(k) + \varepsilon(k) + d(k)\big)\big\|^2\\
&- \frac{1}{\eta}\,\big\|I - \alpha\varphi_f(k)\varphi_f^T(k)\big\|^2\,\delta(2-\delta)
\Big[\|\tilde W_f(k)\| - \frac{(1-\delta)}{(2-\delta)}\,W_{f\max}\Big]^2\\
&- \frac{1}{\eta}\,\big\|I - \beta\varphi_g(k)\varphi_g^T(k)\big\|^2\,\rho(2-\rho)
\Big[\|\tilde W_g(k)\| - \frac{(1-\rho)}{(2-\rho)}\,W_{g\max}\Big]^2,
\end{aligned}
\tag{8.3.83}
\]
where

(8.3.84)

\[
\begin{aligned}
a_3 ={}& (1+\eta)k_{v\max}(\varepsilon_N + d_M + C_0) + P_2 k_{v\max} + P_2 C_1
+ \eta C_1(\varepsilon_N + d_M)\\
&+ \tfrac{1}{2}(2+\eta)C_1(\varepsilon_N + d_M + C_0) + 2k_{v\max}(\varepsilon_N + d_M + C_0),
\end{aligned}
\tag{8.3.85}
\]

\[
a_{44} = 2P_2(\varepsilon_N + d_M + C_0) + 2\eta C_0(\varepsilon_N + d_M)
+ (2+\eta)(\varepsilon_N + d_M + C_0)^2 + 2\eta\,\varepsilon_N d_M,
\tag{8.3.86}
\]

(8.3.87)

and

\[
a_4 = a_{44} + \frac{1}{\eta}\,\big\|I - \alpha\varphi_f(k)\varphi_f^T(k)\big\|^2\,
\frac{\delta^2}{(2-\delta)}\,W_{f\max}^2
+ \frac{1}{\eta}\,\big\|I - \beta\varphi_g(k)\varphi_g^T(k)\big\|^2\,
\frac{\rho^2}{(2-\rho)}\,W_{g\max}^2.
\tag{8.3.88}
\]
All the terms in (8.3.83) are always negative except the first term, as long as conditions (8.3.67) through (8.3.72) hold. Since $a_2$, $a_3$ and $a_4$ are positive constants, $\Delta J \le 0$ as long as (8.3.67) through (8.3.72) hold with
\[
\|r(k)\| > \delta_{r1},
\tag{8.3.89}
\]
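The first line of (8.3.83) is a scalar quadratic in $\|r(k)\|$, so the threshold $\delta_{r1}$ in (8.3.89) is its positive root. A quick numerical sanity check, with purely illustrative constants (not values derived from the book):

```python
import math

def uub_threshold(a2, a3, a4):
    """Positive root of -(1 - a2)*r**2 + 2*a3*r + a4 = 0.
    For ||r(k)|| above this value the quadratic bound on DeltaJ is negative."""
    assert 0 < a2 < 1 and a3 > 0 and a4 > 0
    return (a3 + math.sqrt(a3**2 + (1 - a2) * a4)) / (1 - a2)

a2, a3, a4 = 0.5, 1.0, 2.0          # illustrative constants only
r_star = uub_threshold(a2, a3, a4)
# Slightly above the threshold, -(1-a2)*r^2 + 2*a3*r + a4 < 0:
assert -(1 - a2) * (1.01 * r_star)**2 + 2 * a3 * (1.01 * r_star) + a4 < 0
```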
where

\[
\delta_{r1} = \frac{a_3 + \sqrt{a_3^2 + (1-a_2)\,a_4}}{(1-a_2)}.
\tag{8.3.90}
\]

Similarly, completing the squares for $\|r(k)\|$ and $\|\tilde W_g(k)\|$ using (8.3.76) yields

\[
\begin{aligned}
\Delta J ={}& -(1-a_2)\Big[\|r(k)\| - \frac{a_3}{(1-a_2)}\Big]^2
- (1-\eta-P_3)\,\|e_f(k)\|^2 - (1-\eta-P_4)\,\|e_g(k)\|^2\\
&- 2(1-\eta-P_1)\,\|e_f(k)\|\,\|e_g(k)\|
- \big\|\big(\sqrt{P_3}\,e_f(k) + \sqrt{P_4}\,e_g(k)\big)
- \big(k_v r(k) + g(x)u_d(k) + \varepsilon(k) + d(k)\big)\big\|^2\\
&- \frac{1}{\eta}\,\big\|I - \alpha\varphi_f(k)\varphi_f^T(k)\big\|^2 \cdots
\end{aligned}
\]

with $\delta_{if} > 0;\ i = 1,2,3$ and $\rho_{ig} > 0;\ i = 1,2,3$ design parameters. Then the filtered tracking error $r(k)$ and the NN weight estimates $\hat W_{if}(k);\ i = 1,2,3$ and $\hat W_{ig}(k);\ i = 1,2,3$ are UUB, with the bounds specifically given by Equations (8.4.86) or (8.4.101), (8.4.92) or (8.4.105), and (8.4.96), provided the following conditions hold:

(1)
\[
\beta_{3g}\,\|\hat\varphi_{3g}(k)u_c(k)\|^2 = \beta_{3g}\,\|\bar\varphi_{3g}(k)\|^2 < 1,
\tag{8.4.58}
\]
(2)
(8.4.60)
(4)

\[
\alpha_{if}\,\|\hat\varphi_{if}(k)\|^2 < 2;\ i = 1,2,\qquad
\beta_{ig}\,\|\hat\varphi_{ig}(k)\|^2 < 2;\ i = 1,2,\qquad
\eta + \max(P_1, P_3, P_4) < 1,
\tag{8.4.61}
\]
(5)

\[
\begin{cases}
\alpha_{3f}\,\|\hat\varphi_{3f}(k)\|^2 + \beta_{3g}\,\|\hat\varphi_{3g}(k)u_c(k)\|^2
= \alpha_{3f}\,\|\hat\varphi_{3f}(k)\|^2 + \beta_{3g}\,\|\bar\varphi_{3g}(k)\|^2, & I = 1,\\[2pt]
\alpha_{3f}\,\|\hat\varphi_{3f}(k)\|^2, & I = 0,
\end{cases}
\tag{8.4.65}
\]
and $a_5$ and $b_6$ design parameters chosen using the gain matrix $k_v$.

Proof: Region I: $|g(x(k))| \ge \underline{g}$ and $|u_c(k)| \le s$. Select the Lyapunov function candidate (8.4.16), whose first difference is given by (8.4.17). The error dynamics of the weight update laws in this region are
\[
\begin{aligned}
\tilde W_{if}(k+1) ={}& \big(I - \alpha_{if}\hat\varphi_{if}(k)\hat\varphi_{if}^T(k)\big)\tilde W_{if}(k)
- \alpha_{if}\hat\varphi_{if}(k)\big(\hat y_{if} + B_{if}k_v r(k)\big)^T\\
&- \delta_{if}\,\big\|I - \alpha_{if}\hat\varphi_{if}(k)\hat\varphi_{if}^T(k)\big\|\,\hat W_{if}(k);\quad i = 1,2,
\end{aligned}
\tag{8.4.66}
\]

\[
\begin{aligned}
\tilde W_{ig}(k+1) ={}& \big(I - \beta_{ig}\hat\varphi_{ig}(k)\hat\varphi_{ig}^T(k)\big)\tilde W_{ig}(k)
- \beta_{ig}\hat\varphi_{ig}(k)\big(\hat y_{ig} + B_{ig}k_v r(k)\big)^T\\
&- \rho_{ig}\,\big\|I - \beta_{ig}\hat\varphi_{ig}(k)\hat\varphi_{ig}^T(k)\big\|\,\hat W_{ig}(k);\quad i = 1,2,
\end{aligned}
\tag{8.4.67}
\]

\[
\begin{aligned}
\tilde W_{3f}(k+1) ={}& \big(I - \alpha_{3f}\hat\varphi_{3f}(k)\hat\varphi_{3f}^T(k)\big)\tilde W_{3f}(k)
- \alpha_{3f}\hat\varphi_{3f}(k)\big(k_v r(k) + e_g(k) + g(x(k))u_d(k) + \varepsilon(k) + d(k)\big)^T\\
&- \delta_{3f}\,\big\|I - \alpha_{3f}\hat\varphi_{3f}(k)\hat\varphi_{3f}^T(k)\big\|\,\hat W_{3f}(k),
\end{aligned}
\tag{8.4.68}
\]

and

\[
\begin{aligned}
\tilde W_{3g}(k+1) ={}& \big(I - \beta_{3g}\hat\varphi_{3g}(k)\hat\varphi_{3g}^T(k)\big)\tilde W_{3g}(k)
- \beta_{3g}\hat\varphi_{3g}(k)\big(k_v r(k) + e_f(k) + g(x(k))u_d(k) + \varepsilon(k) + d(k)\big)^T\\
&- \rho_{3g}\,\big\|I - \beta_{3g}\hat\varphi_{3g}(k)\hat\varphi_{3g}^T(k)\big\|\,\hat W_{3g}(k).
\end{aligned}
\tag{8.4.69}
\]
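Each update above contracts the weight error through the factor $(I - \alpha_{if}\hat\varphi_{if}\hat\varphi_{if}^T)$, which is where conditions such as $\alpha_{if}\|\hat\varphi_{if}(k)\|^2 < 2$ in (8.4.61) originate. A small design-time checker can verify such gain conditions numerically (the function name and argument layout are illustrative assumptions, not from the book):

```python
import numpy as np

def gains_admissible(alphas, phis_f, betas, phis_g, eta, P1, P3, P4):
    """Check gain conditions of the form in (8.4.61):
    alpha_i*||phi_i||^2 < 2, beta_i*||phi_i||^2 < 2, eta + max(P1,P3,P4) < 1."""
    ok_f = all(a * float(np.dot(p, p)) < 2.0 for a, p in zip(alphas, phis_f))
    ok_g = all(b * float(np.dot(p, p)) < 2.0 for b, p in zip(betas, phis_g))
    return ok_f and ok_g and (eta + max(P1, P3, P4) < 1.0)
```

Note that the activation norms depend on the state, so in practice such a check would be run over the bounded activations $\varphi_{\max}$ rather than a single sample.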
Substituting (8.4.66) through (8.4.69) into (8.4.17), combining terms, substituting for $g(x)u_d(k)$ from (8.3.29), rewriting, and completing the squares for $\|\tilde W_{if}(k)\|;\ i = 1,2,3$ and $\|\tilde W_{ig}(k)\|;\ i = 1,2,3$, one obtains
\[
\begin{aligned}
\Delta J \le{}& -(1-a_2)\,\|r(k)\|^2 + 2a_3\,\|r(k)\| + a_4\\
&- (1-\eta-P_3)\,\|e_f(k)\|^2 - (1-\eta-P_4)\,\|e_g(k)\|^2
- 2(1-\eta-P_1)\,\|e_f(k)\|\,\|e_g(k)\|\\
&- \big\|\big(\sqrt{P_3}\,e_f(k) + \sqrt{P_4}\,e_g(k)\big)
- \big(k_v r(k) + g(x)u_d(k) + \varepsilon(k) + d(k)\big)\big\|^2\\
&- \sum_{i=1}^{2}\big(2 - \alpha_{if}\hat\varphi_{if}^T(k)\hat\varphi_{if}(k)\big)
\Big\|\tilde W_{if}^T(k)\hat\varphi_{if}(k)
- \frac{\big[(1 - \alpha_{if}\hat\varphi_{if}^T(k)\hat\varphi_{if}(k))
- \delta_{if}\,\|I - \alpha_{if}\hat\varphi_{if}(k)\hat\varphi_{if}^T(k)\|\big]
\big(\hat y_{if} + B_{if}k_v r(k)\big)}
{\big(2 - \alpha_{if}\hat\varphi_{if}^T(k)\hat\varphi_{if}(k)\big)}\Big\|^2\\
&+ P_5\,\|r(k)\|^2 + 2P_6\,\|r(k)\| + P_7,
\end{aligned}
\tag{8.4.70}
\]
where

(8.4.71)

\[
\begin{aligned}
a_3 ={}& (1+\eta)k_{v\max}(\varepsilon_N + d_M + C_0) + P_2 k_{v\max} + P_2 C_1
+ \eta C_1(\varepsilon_N + d_M)\\
&+ \tfrac{1}{2}(2+\eta)C_1(\varepsilon_N + d_M + C_0) + 2k_{v\max}(\varepsilon_N + d_M + C_0),
\end{aligned}
\tag{8.4.72}
\]

\[
a_{44} = 2P_2(\varepsilon_N + d_M + C_0) + 2\eta C_0(\varepsilon_N + d_M)
+ (2+\eta)(\varepsilon_N + d_M + C_0)^2 + 2\eta\,\varepsilon_N d_M
\]