Advances in Industrial Control
Cosmin Copot Clara Mihaela Ionescu Cristina I. Muresan
Image-Based and Fractional-Order Control for Mechatronic Systems Theory and Applications with MATLAB®
Advances in Industrial Control Series Editors Michael J. Grimble, Industrial Control Centre, University of Strathclyde, Glasgow, UK Antonella Ferrara, Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy Editorial Board Graham Goodwin, School of Electrical Engineering and Computing, University of Newcastle, Callaghan, NSW, Australia Thomas J. Harris, Department of Chemical Engineering, Queen’s University, Kingston, ON, Canada Tong Heng Lee, Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore Om P. Malik, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada Kim-Fung Man, City University Hong Kong, Kowloon, Hong Kong Gustaf Olsson, Department of Industrial Electrical Engineering and Automation, Lund Institute of Technology, Lund, Sweden Asok Ray, Department of Mechanical Engineering, Pennsylvania State University, University Park, PA, USA Sebastian Engell, Lehrstuhl für Systemdynamik und Prozessführung, Technische Universität Dortmund, Dortmund, Germany Ikuo Yamamoto, Graduate School of Engineering, University of Nagasaki, Nagasaki, Japan
Advances in Industrial Control is a series of monographs and contributed titles focusing on the applications of advanced and novel control methods within applied settings. This series has worldwide distribution to engineers, researchers and libraries. The series promotes the exchange of information between academia and industry, to which end the books all demonstrate some theoretical aspect of an advanced or new control method and show how it can be applied either in a pilot plant or in some real industrial situation. The books are distinguished by the combination of the type of theory used and the type of application exemplified. Note that “industrial” here has a very broad interpretation; it applies not merely to the processes employed in industrial plants but to systems such as avionics and automotive brakes and drivetrain. This series complements the theoretical and more mathematical approach of Communications and Control Engineering. Indexed by SCOPUS and Engineering Index. Proposals for this series, composed of a proposal form downloaded from this page, a draft Contents, at least two sample chapters and an author cv (with a synopsis of the whole project, if possible) can be submitted to either of the: Series Editors Professor Michael J. Grimble Department of Electronic and Electrical Engineering, Royal College Building, 204 George Street, Glasgow G1 1XW, United Kingdom e-mail: [email protected] Professor Antonella Ferrara Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Via Ferrata 1, 27100 Pavia, Italy e-mail: [email protected] or the In-house Editor Mr. Oliver Jackson Springer London, 4 Crinan Street, London, N1 9XW, United Kingdom e-mail: [email protected] Proposals are peer-reviewed. Publishing Ethics Researchers should conduct their research from research proposal to publication in line with best practices and codes of conduct of relevant professional bodies and/or national and international regulatory bodies. For more details on individual ethics matters please see: https://www.springer.com/gp/authors-editors/journal-author/journal-author-helpdesk/publishing-ethics/14214
More information about this series at http://www.springer.com/series/1412
Cosmin Copot · Clara Mihaela Ionescu · Cristina I. Muresan
Image-Based and Fractional-Order Control for Mechatronic Systems Theory and Applications with MATLAB®
Cosmin Copot Department of Electromechanics University of Antwerp Antwerpen, Belgium
Clara Mihaela Ionescu Department of Electromechanics, Systems and Metals Engineering Ghent University Ghent, Oost-Vlaanderen, Belgium
Cristina I. Muresan Department of Automation Technical University of Cluj-Napoca Cluj Napoca, Romania
ISSN 1430-9491 ISSN 2193-1577 (electronic) Advances in Industrial Control ISBN 978-3-030-42005-5 ISBN 978-3-030-42006-2 (eBook) https://doi.org/10.1007/978-3-030-42006-2 © Springer Nature Switzerland AG 2020 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Series Editor’s Foreword
The subject of control engineering is viewed very differently by researchers and those who must implement and maintain systems. The former group develops general algorithms with a strong underlying mathematical basis, whilst the latter has more local concerns over the limits of equipment, quality of control and plant downtime. The series Advances in Industrial Control attempts to bridge this divide and hopes to encourage the adoption of more advanced control techniques when they are likely to be beneficial. The rapid development of new control theory and technology has an impact on all areas of control engineering and applications. This monograph series encourages the development of more targeted control theory that is driven by the needs and challenges of applications. A focus on applications is important if the different aspects of the “control design” problem are to be explored with the same dedication that “control synthesis” problems have received in the past. The series provides an opportunity for researchers to present an extended exposition of new work on industrial control and applications. The series raises awareness of the substantial benefits that advanced control can provide but should also temper that enthusiasm with a discussion of the challenges that can arise. This monograph covers some areas of control systems design that are not so well known outside of specialist communities. Image-based control is an emerging area with applications in areas like robotics, and with rather different control design problems because of the nature of the visual sensing system. For example, the selection of visual features is important in vision-based robot control systems, since these features determine the performance and accuracy of the robot control system. The second important topic covered is fractional-order control, based on ideas from fractional calculus, also an area of current research where there is a rapidly growing number of research papers. The early chapters provide an introduction to the different topics of visual servo systems and there is useful introductory material on modelling and control problems in robotics. Chapter 3 describes detection algorithms for image feature extraction and evaluation. This is an important aspect of the use of visual control architectures.
The subject of fractional-order systems sometimes evokes strong reactions for or against this approach to modelling and control. Fractional-order control problems are discussed in Chap. 4, where there is a basic introduction to the design of position- and velocity-based control systems. Tuning procedures for the ubiquitous PI/PD/PID forms of controller are of course always of interest and are described here for fractional-order control systems. Although the mathematics may not be entirely familiar, the ideas are illustrated by simple examples. Potential benefits including performance and robustness are suggested from the application examples. Chapter 5 takes a small step into the problems of multivariable systems by dealing with processes that have two inputs and two outputs. One of the more familiar topics covered is the relative gain array, often used for structure assessment in the process industries. In the second part of the text, typical applications of vision-based robot control and of fractional-order control are discussed. Simulators for image-based control architectures are fairly essential for most robotic control application design studies, and some of the usual tools are described in Chap. 6. The real-time implementation of industrial robot manipulators is also covered and should be particularly valuable for applications engineers. The application of fractional-order control to real-time targets is described in Chap. 7. The claim is that a fractional-order control system will outperform a classical control algorithm, and this is illustrated using experimental results. The implementation of a fractional-order controller on a field-programmable gate array is also described. For implementation, the fractional-order controllers, which theoretically have infinite memory, are approximated by integer-order transfer functions. The experimental results suggest the fractional-order controllers can outperform a classical controller, which is, of course, the main justification for accepting the added theoretical complexity, complexity that might otherwise affect its potential for use in real applications. Bringing the different ideas together, the use of fractional-order controllers for visual servoing is considered in Chap. 8. The experimental results and the design processes follow traditional lines, but of course using fractional-order descriptions. The experiments involve a ball-and-plate system and also an image-based control law for a vision-based robot control system. Sliding-mode control methods, covered in Chap. 9, are of course of wide interest in robot applications and are of growing importance in a range of others. Chapter 9 covers sliding-mode control for a class of robotic arm problems. It is shown that the algorithms require relatively few operations and can be implemented very efficiently on microcontrollers, programmable logic controllers, etc. This text covers some areas of control engineering that are not very well known in most industrial sectors but may have a valuable role in specialist applications. This text is, therefore, a very timely and a very welcome addition to the series on Advances in Industrial Control.
Glasgow, UK
December 2019
Michael J. Grimble
Contents
1 Introduction  1
  1.1 Visual Servoing System  1
  1.2 Visual Features  2
  1.3 Control Architectures for Visual Servoing System  2
  1.4 Fractional-Order Control  3
  1.5 Book Summary  5

Part I  Visual Servoing Systems and Fractional-Order Control

2 Visual Servoing Systems  11
  2.1 Rigid Body Pose  11
    2.1.1 2D Pose Description  11
    2.1.2 3D Pose Description  12
      2.1.2.1 Orientation  13
      2.1.2.2 Combining Translation and Orientation  15
  2.2 Vision-Based Control Architectures  16
    2.2.1 Image-Based Control Architecture  17
    2.2.2 Position-Based Control Architecture  19
  2.3 Advanced Visual Servoing and Applications  20
  2.4 Image-Based Controller  21
    2.4.1 Proportional Control Law  22
    2.4.2 The Robot Model: VCMD Model  23
  2.5 Summary  28

3 Image Feature Extraction and Evaluation  29
  3.1 Introduction  29
  3.2 Point Features  30
    3.2.1 Harris Corner Detector  30
    3.2.2 SIFT Descriptor  32
  3.3 Image Moments in Visual Servoing Applications  38
    3.3.1 Interaction Matrix for Image Moments  41
  3.4 Image Features Evaluation  45
    3.4.1 Performance Analysis of Point Features  46
      3.4.1.1 Stability  46
      3.4.1.2 Accuracy  47
      3.4.1.3 Repeatability  47
      3.4.1.4 Feature Spread  49
    3.4.2 Performance Analysis of Image Moments  50
    3.4.3 Applications  52
  3.5 Summary  60

4 Fractional-Order Control: General Aspects  63
  4.1 The Concept of Fractional Calculus for Control  63
    4.1.1 Definitions of Fractional-Order Operators  65
    4.1.2 Stability of Fractional-Order Systems  66
  4.2 Tuning Rules for Fractional-Order PID Controllers  69
    4.2.1 The Fractional-Order PI Controller  70
    4.2.2 Example: FO-PI for a First-Order Process  72
    4.2.3 The Fractional-Order PD Controller  75
    4.2.4 Example: FO-PD for an Integrative Process  76
    4.2.5 The Fractional-Order PID Controller  79
  4.3 Advantages of Fractional-Order PID Controllers over Classical Integer-Order PIDs  81
    4.3.1 Controlling an Unstable System  81
    4.3.2 A Comparison for Velocity Systems  85
    4.3.3 A Comparison for Position Systems  86
  4.4 Summary  88

5 Fractional-Order Control for TITO Systems  89
  5.1 Multivariable Processes  89
    5.1.1 Classical Control of MIMO Processes  89
    5.1.2 Measuring Interaction: The Relative Gain Array  91
  5.2 Fractional-Order Control for MIMO Systems  92
  5.3 Summary  96

Part II  Applications of Visual Servoing and Fractional-Order Control

6 Simulators for Image-Based Control Architecture  99
  6.1 Image-Based Simulator for Servoing System  99
    6.1.1 Visual Servoing Toolbox  99
    6.1.2 Simulator Using Point Features  102
    6.1.3 Simulator Using Image Moments  103
  6.2 Real-Time Implementation for Industrial Robot Manipulators  106
    6.2.1 Ethernet Interface for ABB Robot Controller  106
      6.2.1.1 Control Architecture  109
      6.2.1.2 Robot Driver Interface  110
    6.2.2 Ethernet Interface for Fanuc Robot Controller  116
  6.3 Summary  122

7 Application of Fractional-Order Control on Real-Time Targets  123
  7.1 Fractional-Order PI Controller for a Modular Servo System  123
    7.1.1 Characteristics of the Experimental Set-up  123
    7.1.2 Tuning of the FO-PI Controller and Experimental Results  125
  7.2 Implementation of Fractional-Order Controllers on FPGAs  129
    7.2.1 Analog Versus Digital Approximations  129
    7.2.2 A Digital Implementation for FPGA Target  132
  7.3 Summary  138

8 Fractional-Order Controller for Visual Servoing Systems  139
  8.1 Fractional Control of a Stewart Platform  139
    8.1.1 Stewart Platform: Ball and Plate System  139
      8.1.1.1 Hardware Design and System Set-up  139
      8.1.1.2 Inverse Kinematics  140
      8.1.1.3 Modelling  142
      8.1.1.4 Visual Feedback  146
    8.1.2 Fractional-Order Controller Design  147
    8.1.3 Controller Implementation  149
  8.2 Fractional Control of a Manipulator Robot  152
    8.2.1 Models for a Visual Servoing Architecture  152
      8.2.1.1 Visual Sensor  153
      8.2.1.2 Inner-Loop Model  154
      8.2.1.3 PIλ-Image-Based Controller  154
      8.2.1.4 Open-Loop Model  155
    8.2.2 Fractional-Order Controller Design  156
    8.2.3 Simulation and Experimental Results  158
      8.2.3.1 Simulation Results  158
      8.2.3.2 Real-Time Results  159
  8.3 Summary  163

9 Sliding-Mode Control for a Class of Robotic Arms  165
  9.1 Introduction  165
  9.2 Generic Model Description  167
  9.3 Static Non-linear Characteristic  171
  9.4 Linear Controllers  171
  9.5 Sliding-Mode Controller Design  177
  9.6 Comparison  182
  9.7 Summary  186

10 Conclusions  189

Appendix  193
References  197
Chapter 1
Introduction
1.1 Visual Servoing System

Visual servoing is a branch of research that combines results from several fields, such as artificial vision, robotics and real-time application design, and it has become a major area of interest over the last decade of research. Visual servoing refers to the use of visual features to control the movement of a manipulator robot. Visual features are defined as properties of the objects that make up an image. Images are acquired using a visual sensor that is mounted either in a fixed position in the work environment or on the last joint of the robot; the first configuration is called eye-to-hand, and the second is called eye-in-hand. A complete description of the two fundamental architectures of visual servoing systems, the position-based architecture and the image-based architecture, was first given in [1]. Each of the two architectures presents advantages and disadvantages when dealing with real-time processes [20, 49]. A position-based architecture computes an error represented in the Cartesian system and requires both an object model (usually of CAD type) and a perfectly calibrated camera in order to estimate the position and orientation of an object [2]. The image-based architecture avoids the use of an object model by measuring an error signal in the image plane, i.e. a signal mapped directly to the controls of the execution element of the robot [20]. Both architectures use visual features in order to describe the properties of an object in the image plane. While the position-based architecture uses the features to characterize positions by correlating the image plane with the 3D space, in the image-based architecture these features lead to the formation of the Jacobian matrix, which represents the mapping between the velocities of the objects projected in the image plane and the working environment of the robot. The singularities that may occur in the Jacobian matrix, together with the impossibility of having direct control over the velocities developed by the robot in the 3D space, lead to difficulties in designing the loop regulator. For the image-based architecture, the approach is to reduce the error between a set of current features and a set of desired features. An error function is
defined in terms of quantities that can be directly measured in an image, and the control law is constructed so that this error maps directly to the robot's motion. Depending on the type of control architecture used, an object can be characterized by its position or by the visual features extracted from the image. With overall stability as one of the main goals of servoing systems, it has been found that the position-based architecture suffers from significant limitations in terms of robustness and of the mathematical description of the models required for physical implementation. Hence, so-called hybrid methods were created to combine the advantages of the two architectures [3, 4].
1.2 Visual Features

Choosing an appropriate set of visual features is necessary to ensure the most accurate correlation between the dynamics in the image space and the dynamics in the task space. Candidate features include point features (centroids, corners), image moments, the areas of the projected regions, the orientation of the lines joining two points, the lengths of edges, the parameterization of lines, etc. [23, 38, 39]. These types of visual features can be used to generate the image-based control law. The visual features most commonly encountered in practice are point features and those based on image moments. Point features can be 1D (edges) or 2D (corners). The main advantage of using point features is that the calculation of the interaction matrix is relatively simple, because the point coordinates are known. Their disadvantage in servoing applications is reduced stability: they are not invariant to transformations of the object in the working scene. This problem can be eliminated by using image moments to generate the control law. The idea of using moments in servoing applications is relatively old, but the bottleneck was that the interaction matrix was not available for arbitrary object types. In [23] a new method was proposed which allows the calculation of the interaction matrix related to image moments, and this type of visual feature became popular in visual servoing applications.
1.3 Control Architectures for Visual Servoing System

Robot control studies the structure and operation of the robot control system. Based on the geometric and dynamic models and on the robot task, which is further converted into the trajectory to be followed, the necessary commands are established for the actuating elements, using hardware and software elements and the feedback signals obtained from the visual sensor [5]. This explains the complexity of the control system of a robot. It frequently has a hierarchical organization: on the upper level sits the decision part regarding the action to be taken, and on the lower level the
control and action elements of the joints. These commands also have to take into account the desired performance for the robot; here the dynamic model of the robot intervenes. The typical structure of the control system contains a computer on the upper level and a system with one or more microcontrollers controlling the actuating elements in the joints. Selecting an appropriate set of visual features and designing the visual control law are the main aspects in achieving a high-performance control architecture. The most common problems in servoing systems that influence the process of selecting the visual features are those related to local minima, singularities and visibility. In general, local-minimum problems occur only in certain configurations [23]. A local minimum arises when the camera velocity is zero (vc = 0) while the error between the current features f and the desired features f* is different from zero, so that the system converges to a final configuration that is different from the desired one. If the vector of visual features f is composed of three points of interest, the same image of the three points can be seen from different camera poses, such that for different configurations of the camera we have f = f*, corresponding to several global minima. To obtain a unique position it is necessary to use at least four points. By using four points, the control law tries to impose eight trajectory restrictions in the image plane while the system has only six degrees of freedom; in this case, the control law can generate unattainable movements in the image plane, indicating a local minimum [142]. Different control strategies for visual servoing systems have been proposed to eliminate the local-minimum problem: for example, a hybrid architecture was proposed in [6], and a planning strategy in [2]. The visual features used in servoing systems can also leave the view of the camera during the servoing task [156]. Therefore, the control laws used must be able to maintain the visual features in the field of view of the camera in order to obtain correct feedback during the servoing process. To minimize the likelihood of visual features leaving the camera view, planning strategies [2] or marking strategies [7] can be used. The increase in the number of degrees of freedom of robots and the increased complexity of the objects in the working environment have led to the need for new methods of designing the control law [55]. Thus, one of the proposed solutions is predictive control [54], aimed at increasing the convergence and the stability of the servoing system.
1.4 Fractional-Order Control

The non-integer-order derivative has been a topic of discussion for more than 300 years, and it is now known as fractional calculus. It corresponds to the generalization of ordinary differentiation and integration to arbitrary (non-integer) order [64]. In the past decade, fractional-order control (FOC) has received extensive attention from the research community in several fields, including robotics, mechatronics, biology and physics, among others. Compared with the classical integer-order controllers,
FOC techniques have achieved more impressive results in many control systems in terms of improving robustness during wind gusts, payload variations, disturbances due to friction, modelling uncertainties, etc. [8, 9]. Robot motion tracking systems represent one of the most challenging control applications in the field of manipulator robots due to their highly non-linear and time-varying dynamics. Recently, FOC of non-linear systems has started to attract interest in different kinds of applications [10]. A FOC strategy for visual servoing systems is presented in [158], where the image-based control is designed using point features and a fractional-order PI controller. The system used to evaluate the performance of the proposed control strategy is composed of a manipulator robot with 6 degrees of freedom with an eye-in-hand camera. In [11], a two-degree-of-freedom fractional-order proportional–integral–derivative (2-DOF FOPID) controller structure is proposed for a two-link planar rigid robotic manipulator with payload, for a trajectory tracking task. The tuning of the controller parameters is realized using the cuckoo search algorithm (CSA). The performance of the proposed 2-DOF FOPID controllers is compared with that of their integer-order designs, i.e. 2-DOF PID controllers, and with the classical PID controllers. Another work, for robot manipulators with continuous fractional-order nonsingular terminal sliding mode (CFONTSM) based on time delay estimation (TDE), is presented in [12]. The simulation and experiment results show that the proposed control design can ensure higher tracking precision and faster convergence compared with the TDE-based continuous integer-order NTSM (CIONTSM) design in a wide range of speeds; better performance is also observed compared with TDE-based IONTSM and FONTSM control designs using the boundary layer technique. A non-linear adaptive fractional-order fuzzy proportional–integral–derivative (NLA-FOFPID) controller to control a 2-link planar rigid robotic manipulator with payload is studied in [13]. The gains of the controllers are optimized using a backtracking search algorithm. Several simulations were performed to assess the performance of the NLA-FPID (integer case) and NLA-FOFPID controllers in servo and regulatory mode. It has been observed that the NLA-FOFPID controller outperforms the NLA-FPID controller by offering much better performance; particularly, in an uncertain environment it offered very robust behaviour compared to the NLA-FPID controller. Similarly, a fractional-order self-organizing fuzzy controller (FOSOFC) to control a two-link electrically-driven rigid robotic (EDRR) manipulator system is proposed in [14]. The simulation results show the effective behaviour of the FOSOFC; the obtained performance is compared with a fractional-order fuzzy proportional–integral and derivative (FOFPID) controller for trajectory tracking as well as in a disturbance rejection study. A recent work on fuzzy fractional-order control of robotic manipulators with PID error manifolds is proposed in [15]. The simulation results demonstrated the reliability of the proposed structure, and comparisons with respect to classical, discontinuous and continuous sliding-mode controllers highlighted the superiority of the proposed method. On the other hand, mechatronic systems represent one of the most challenging control applications due to their interdisciplinary nature. For example, in [91] a method for tuning and designing fractional-order controllers for a class of
second-order unstable processes, using stability analysis principles, is studied. The experimental results suggest that the fractional-order controller is able to reduce the oscillatory behaviour and achieve a fast settling time and a zero steady-state error. Similarly, the performance of a magnetic levitation system (MLS) with a fractional-order proportional–integral–derivative (FO-PID) controller and an integer-order PID controller for a particular position of the levitated object is studied in [16]. The controller parameters are tuned using the dynamic particle swarm optimization (dPSO) technique, and the effectiveness of the proposed control scheme is verified by simulation and experimental results. A recent article presents a new model-free fractional-order sliding-mode control (MFFOSMC) based on an extended state observer (ESO) for quarter-car active suspension systems [17]. Its main goal is to increase ride comfort while the dynamic wheel load and the suspension deflection remain within safety-critical bounds. The simulation results demonstrate the effectiveness of the proposed controller; a comparison with classical PID, time delay estimation control and intelligent PID controllers has been performed.
1.5 Book Summary

In this book, we present a comprehensive and unified approach to using vision-based feedback in fractional-order control design algorithms to achieve intrinsic loop robustness and enable loop performance specifications. The book is structured in 10 chapters, ordered as follows after this introductory chapter.

Chapter 2
In this chapter, an overview of the visual servoing concept is presented. The first part shows the representation of position and orientation of an object in 2- and 3-dimensional environments. A coordinate frame is used to describe a set of points attached to a specific object. The pose of a coordinate frame can be described with respect to other coordinate frames, and the transformation between two frames is given by a homogeneous transformation matrix. In the second part of this chapter, the fundamentals of vision-based control architectures are presented. From the perspective of visual sensor location, we can identify two configurations: (i) eye-in-hand, where the camera is mounted on the robot's TCP; (ii) eye-to-hand, where the camera is fixed in the working environment. Based on the type of the control structure there are two main architectures: (i) Image-Based Visual Servoing, where the controller is designed directly in the image plane, and (ii) Position-Based Visual Servoing, where the controller is designed using pose estimation based on camera calibration and a 3D target model.

Chapter 3
The performance of the visual servoing system is highly connected with the type and the performance of the visual features. In this chapter, geometric and photometric visual features are considered. In order to extract the geometric visual features, two point feature operators are presented: the Harris operator and the SIFT descriptor,
while photometric features are described by image moments. From the class of geometric features, point features are the most used in visual servoing applications. One of the main disadvantages of this type of feature is that point feature operators do not provide stable features for these types of applications: the number of point features from the first image should remain the same over the entire sequence of images. Apart from stability, there are other criteria (repeatability, accuracy, feature spread) that need to be fulfilled in order to obtain good performance. This problem of point features can be eliminated by using image moments, which allow a general representation of an image and also allow the description of more complex objects. The performance evaluation of this type of feature can be realized using a criterion based on the Hausdorff distance.

Chapter 4
Fractional-order controllers have seen a rapid increase in research interest due to the numerous advantages they offer, including better closed-loop dynamics and robustness. In this chapter, the basic concepts of fractional systems are covered as an introduction to more complex issues tackled in the subsequent chapters of the book. Tuning rules for fractional-order PI/PD/PID controllers are given, as well as illustrative examples covering applications from mechatronic systems. To demonstrate the advantages of using a fractional-order controller instead of the traditional integer-order controllers, three examples are covered: an unstable magnetic levitation system, a velocity system and a position system. The comparative closed-loop simulation results show that the fractional-order controllers indeed provide improved performance and increased robustness.

Chapter 5
Here, the tuning rules detailed and exemplified in Chap. 4 are extended to the case of multivariable systems, more specifically to the case of two-inputs-two-outputs systems. First, a short review of classical control techniques for multivariable processes is presented, followed by a brief analysis of interaction and proper pairing of input–output signals. Then, a multivariable fractional-order controller design is presented for a two-inputs-two-outputs process, with the tuning rules and methodology based on a modification of the classical integer-order multivariable PID controller design.

Chapter 6
In this chapter, a new proportional control law which includes the dynamic model of the manipulator robot, modelled as a 'Virtual Cartesian Motion Device', is developed. Performance evaluation of the servoing system is performed with respect to the type of visual features used to design the visual control law. The implementation, testing and validation of the control algorithm is achieved through the development of a simulator for servoing systems. The simulators developed within this chapter allow the use of both types of visual features, point features as well as image moments. Starting from this simulator, a control architecture for real-time servoing applications has been developed. In the first stage, the real-time control architecture was designed for an ABB-IRB2400 robot manipulator with an eye-in-hand camera configuration and makes use of point features extracted with Harris and
SIFT operators. In the second phase, a real-time control architecture for a FANUC Arc Mate 120 robot manipulator was implemented. Although the proportional controller is one of the simplest controllers that can be designed, the experimental results indicate that the proportional control law achieves satisfactory performance for servoing applications. In order to improve the performance of the visual servoing system, a more complex control architecture such as a predictive controller [147, 150, 159–161] can be designed, but this was out of the scope of this book.

Chapter 7
The chapter presents two case studies regarding the implementation of fractional-order controllers on real-time targets. The chapter also includes the implementation steps for digital control systems and discusses pitfalls in the actual implementation of fractional-order controllers on dedicated real-time devices. The advantages of using fractional-order controllers over the traditional integer-order controllers are explained and validated using two examples, including an open-loop unstable system. A full fractional-order PI controller design is detailed in the last part on a real-life application, i.e. a modular servo system. The experimental results indicate the accuracy, efficiency and robustness of the tuned fractional-order controller, which outperforms the classical control. In this chapter, a digital implementation on a field-programmable gate array (FPGA) device is also presented for the control of a velocity system, using a fractional-order PI controller.

Chapter 8
In this chapter, two case studies regarding the implementation of a fractional-order controller on servoing systems are presented. In the first case, fractional-order PD controllers are implemented on a ball and plate system. By changing the experimental conditions, i.e. changing the ball, we show that the fractional-order controller is more suitable for this process. This result is very important since, in real-life applications, as with most mechatronic systems, the practical process can differ from the theoretical models. In the second case, an image-based control law based on fractional calculus applied to visual servoing systems is presented. The considered control architecture consists of a 6 d.o.f. manipulator robot with an eye-in-hand configuration. To evaluate the performance of the proposed fractional-order controller, a real-time implementation using MATLAB® and a Fanuc robot was performed. Different experiments for planar static objects were conducted using both fractional-order PIμ and integer-order PI controllers. For both case studies considered in this chapter, the experimental results show that the control law based on fractional order obtains better performance in comparison with the classical control law.

Chapter 9
In this chapter, an application of a sliding-mode control strategy with a generalized dynamic reference trajectory is presented. The results have been tested in realistic simulations from a field test case of a combine harvester with spout angle control for optimal filling of crop reservoirs. While the implementation of the sliding-mode control strategy is not (much) more demanding than that of linear controllers, its
performance is much more robust in changing dynamic conditions of the real process. Other applications with similar features can be considered, without loss of generality.

Chapter 10
Here we give a summary of the main features of the book in terms of scientific content. The given theoretical background and exemplified applications are integrated into a broader concept of bringing forward new emerging tools in the area of mechatronics.

Acknowledgements The authors would like to thank Oliver Jackson and all editors for the support with the review, editing and production steps. Cosmin Copot would like to thank Prof. Corneliu Lazar and Dr. Adrian Burlacu from "Gheorghe Asachi" Technical University of Iasi for the professional support during his Ph.D., which represents the basis of Chaps. 2, 3, 6 and 8. Cristina Muresan was funded by a research grant of the Romanian National Authority for Scientific Research and Innovation, CNCS/CCCDI-UEFISCDI, project number PN-III-P1-1.1-TE-2016-1396, TE 65/2018. Clara Ionescu would like to thank colleagues from the EEDT-Decision and Control core group of Flanders Make for the research and funding opportunities.
Part I
Visual Servoing Systems and Fractional-Order Control
Chapter 2
Visual Servoing Systems
2.1 Rigid Body Pose

A fundamental requirement in robotics and computer vision is to represent the position and orientation of objects in an environment, where the environment may be static or dynamic. Such objects may include robots, cameras, workpieces, obstacles and paths. A frame of coordinates, also known as a Cartesian coordinate system, is a set of orthogonal axes which intersect at a point defined as the origin. A rigid body (under the assumption that, for any transformation, its constituent points maintain a constant relative position with respect to the object's coordinate frame) is completely described in space by its position and orientation (in short further defined as pose) with respect to a reference frame (i.e. a fixed coordinate frame of the environment).
2.1.1 2D Pose Description

A 2-dimensional (2D) environment, or plane, is usually described by a Cartesian coordinate frame with orthogonal axes denoted by X and Y, typically drawn with the X-axis horizontal and the Y-axis vertical. Unit vectors parallel to the axes are denoted by $\vec{i}$ and $\vec{j}$. A point is represented by its X- and Y-coordinates $[p_X \; p_Y]^T$ or as a bound vector:
$$\mathbf{p} = p_X \vec{i} + p_Y \vec{j} \qquad (2.1)$$
Regarding orientation description, the most common approach is to use rotations, which are rigid transformations. For the 2D environment, rotations about the origin can be represented by 2 × 2 matrices of the form:
$$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \qquad (2.2)$$
The main properties of rotation matrices can be summarized as follows:
• a rotation of 0 radians has no effect, because the rotation matrix reduces to the identity matrix $R = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$;
• two successive rotations compose as a matrix multiplication;
• the inverse of a rotation matrix is its transpose, $RR^T = I$;
• the determinant of a rotation matrix is equal to 1, which means that R belongs to the special orthogonal group of dimension 2, or equivalently $R \in SO(2) \subset \mathbb{R}^{2\times 2}$.
It is interesting to observe that instead of representing an angle, which is a scalar, we have used a 2 × 2 matrix that comprises four elements, although these elements are not independent: each column has a unit magnitude, which provides two constraints, and the columns are orthogonal, which provides another constraint. Four elements and three constraints leave effectively one independent value. The rotation matrix is an example of a non-minimum representation, and the disadvantages, such as the increased memory it requires, are outweighed by its advantages, such as composability [48, 49]. Regarding position description, we need to analyse translations. A translation can be represented by a vector. If the translation vector is $\mathbf{t} = [t_X \; t_Y]^T$, then
$$\mathbf{p}' = \mathbf{p} + \mathbf{t} \qquad (2.3)$$
gives the new position of the point under analysis. A general rigid body transformation is given by a pair (R(θ), t). These pairs have the following effect on a position vector:
$$\mathbf{p}' = R(\theta)\mathbf{p} + \mathbf{t} \qquad (2.4)$$
or, if we prefer the homogeneous coordinates representation:
$$\begin{bmatrix} p'_X \\ p'_Y \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & t_X \\ \sin\theta & \cos\theta & t_Y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} p_X \\ p_Y \\ 1 \end{bmatrix} \qquad (2.5)$$
The matrix has a very specific structure and belongs to the special Euclidean group of dimension 2.
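As a quick numerical illustration of (2.4) and (2.5), the following MATLAB sketch (our own illustrative code, not taken from the book) builds a 2D homogeneous transformation and verifies that it agrees with the rotate-then-translate form:

```matlab
% 2D rigid body transformation, illustrating (2.4) and (2.5)
theta = pi/6;                 % rotation angle [rad]
t     = [1; 2];               % translation vector [tX; tY]
p     = [0.5; -0.3];          % point in the original frame

R = [cos(theta) -sin(theta);
     sin(theta)  cos(theta)];        % rotation matrix, (2.2)

p_rigid = R*p + t;                   % rotate then translate, (2.4)

T = [R t; 0 0 1];                    % homogeneous transformation in SE(2)
p_hom = T*[p; 1];                    % applied in homogeneous coordinates, (2.5)

disp(norm(p_rigid - p_hom(1:2)))     % ~0: the two forms agree
```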
2.1.2 3D Pose Description

The 3-dimensional (3D) case is an extension of the 2D case discussed in the previous section. We add an extra coordinate axis, typically denoted by Z, that is orthogonal to both the X- and Y-axes. The direction of the Z-axis obeys the right-hand rule and
forms a right-handed coordinate frame. Unit vectors parallel to the axes are denoted $\vec{i}$, $\vec{j}$ and $\vec{k}$, such that
$$\vec{k} = \vec{i} \times \vec{j} \qquad (2.6)$$
A point p is represented by its $[p_X, p_Y, p_Z]$ coordinates or as a bound vector:
$$\mathbf{p} = p_X \vec{i} + p_Y \vec{j} + p_Z \vec{k} \qquad (2.7)$$

2.1.2.1 Orientation
In order to describe the orientation of the rigid body, it is convenient to consider an orthonormal frame attached to the body and express its unit vectors with respect to the reference frame. There is a set of techniques that can be used to describe orientation in 3D; among them we find orthonormal rotation matrices, Euler angles and quaternions, all described hereafter. In 3D any rotation is about a fixed axis, thus we have to specify the angle of rotation θ and the unit vector of the rotation axis u. The orthonormal rotation matrices for rotation about the X-, Y- and Z-axes are
$$R(\theta, \vec{i}) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}, \quad R(\theta, \vec{j}) = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}, \quad R(\theta, \vec{k}) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
Let us consider two orthonormal coordinate frames {A} and {B}. Just as in the 2D case, we can represent the orientation of a coordinate frame by its unit vectors expressed in terms of the reference coordinate frame. Each unit vector has three elements and they form the columns of a 3 × 3 orthonormal matrix ${}^A R_B$ with
$${}^A\mathbf{p} = {}^A R_B \, {}^B\mathbf{p} \qquad (2.8)$$
which rotates a vector defined with respect to frame {A} to a vector with respect to {B}. The orientation description through rotation matrix is governed by the Euler theorem: Any two independent orthonormal coordinate frames can be related by a sequence of rotations (not more than three) about coordinate axes, where no two successive rotations may be about the same axis. By recalling the meaning of a rotation matrix in terms of the orientation of a current frame with respect to a fixed
frame, it can be recognized that its columns are the direction cosines of the axes of the current frame with respect to the fixed frame. Similarly, its rows (the columns of its transpose and inverse) are the direction cosines of the axes of the fixed frame with respect to the current frame. An important issue in the composition of rotations is that the matrix product is not commutative; hence two rotations, in general, do not commute, and their composition depends on the order of the single rotations. Rotation matrices give a redundant description of frame orientation: they are characterized by nine elements which are not independent, being related by six constraints due to the orthogonality conditions. This implies that three parameters are sufficient to describe the orientation of a rigid body in space. A representation of orientation in terms of three independent parameters constitutes a minimal representation, and can be obtained by using a set of three angles $\phi = [\alpha \; \beta \; \gamma]^T$. Consider the rotation matrix expressing the elementary rotation about one of the coordinate axes as a function of a single angle. A generic rotation matrix can then be obtained by composing a suitable sequence of three elementary rotations, while guaranteeing that two successive rotations are not made about parallel axes. This implies that 12 distinct sets of angles are allowed out of all 27 possible combinations; each set represents a triplet of Euler angles. An important set of Euler angles are the ZYX angles, also called Roll, Pitch and Yaw angles. In this case, the angles α, β, γ represent rotations defined with respect to a fixed coordinate frame attached to the centre of mass of the rigid body:
• rotate the reference frame by the angle α about axis X (roll);
• rotate the reference frame by the angle β about axis Y (pitch);
• rotate the reference frame by the angle γ about axis Z (yaw).
The resulting frame orientation is obtained by composition of rotations with respect to the fixed frame, and can then be computed via pre-multiplication of the matrices of elementary rotation (with $c_x = \cos x$ and $s_x = \sin x$):
$$R(\phi) = R_Z(\gamma) R_Y(\beta) R_X(\alpha) = \begin{bmatrix} c_\gamma c_\beta & c_\gamma s_\beta s_\alpha - s_\gamma c_\alpha & c_\gamma s_\beta c_\alpha + s_\gamma s_\alpha \\ s_\gamma c_\beta & s_\gamma s_\beta s_\alpha + c_\gamma c_\alpha & s_\gamma s_\beta c_\alpha - c_\gamma s_\alpha \\ -s_\beta & c_\beta s_\alpha & c_\beta c_\alpha \end{bmatrix} \qquad (2.9)$$
A fundamental problem with the three-angle representations just described is singularity. This occurs when the rotational axis of the middle term in the sequence becomes parallel to the rotation axis of the first or third term (this is also known as the problem of gimbal lock). In the same way as complex numbers, the quaternions can be defined by introducing abstract symbols i, j, k which satisfy the rules $i^2 = j^2 = k^2 = ijk = -1$ and the usual algebraic rules, except the commutative law of multiplication. When quaternions are used in geometry, it is more convenient to define them as
$$\hat{q} = q_0 + \mathbf{q} = q_0 + q_1 i + q_2 j + q_3 k \qquad (2.10)$$
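The composition in (2.9) and the rotation-matrix properties listed earlier can be checked numerically; the MATLAB sketch below is our own illustration, not code from the book:

```matlab
% Elementary rotations and their fixed-frame composition, (2.9)
Rx = @(a) [1 0 0; 0 cos(a) -sin(a); 0 sin(a) cos(a)];   % roll,  about X
Ry = @(b) [cos(b) 0 sin(b); 0 1 0; -sin(b) 0 cos(b)];   % pitch, about Y
Rz = @(g) [cos(g) -sin(g) 0; sin(g) cos(g) 0; 0 0 1];   % yaw,   about Z

alpha = 0.2; beta = -0.4; gamma = 1.1;   % example angles [rad]
R = Rz(gamma)*Ry(beta)*Rx(alpha);        % pre-multiplication, (2.9)

% Rotation matrix properties: R'*R = I and det(R) = 1
disp(norm(R'*R - eye(3)))                % ~0
disp(det(R))                             % ~1
```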
where the imaginary part $q_1 i + q_2 j + q_3 k$ behaves like a vector $\mathbf{q} = (q_1, q_2, q_3)$ in $V^3$, and the real part $q_0$ behaves like a scalar in $\mathbb{R}$. The conjugate of a quaternion is denoted $\hat{q}^* = q_0 - \mathbf{q}$. Let $Q_u$ be the ensemble of quaternions having unit length ($q_0^2 + q_1^2 + q_2^2 + q_3^2 = 1$). If $\hat{q} \in Q_u$, then rotating a vector $\vec{r}$ around a direction axis $\vec{\eta}$ by an angle α can be expressed by
$$\hat{\rho} = \hat{q} \odot \hat{r} \odot \hat{q}^* \qquad (2.11)$$
where $\odot$ is the quaternion multiplication, $\hat{r} = 0 + \vec{r}$, $\hat{\rho} = 0 + \vec{\rho}$ and $\hat{q} = \cos\frac{\alpha}{2} + \vec{\eta}\sin\frac{\alpha}{2}$. The transformation that links the orientation expressed by a rotation matrix to the orientation expressed by quaternions is:
$$q_1 = \pm\frac{1}{2}\sqrt{1 + r_{11} - r_{22} - r_{33}}\,; \quad q_0 = \frac{1}{4q_1}(r_{32} - r_{23}); \quad q_2 = \frac{1}{4q_1}(r_{12} + r_{21}); \quad q_3 = \frac{1}{4q_1}(r_{13} + r_{31}) \qquad (2.12)$$
where $r_{ij}$ are the components of the rotation matrix from (2.9).
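The conversion (2.12) translates directly into MATLAB; the sketch below is our own illustration, not the book's code, and it assumes q1 is not close to zero (robust implementations first select the largest of the four quaternion components):

```matlab
% Rotation matrix -> unit quaternion qhat = q0 + q1*i + q2*j + q3*k, per (2.12)
a = 0.2; b = -0.4; g = 1.1;                        % roll, pitch, yaw [rad]
Rx = [1 0 0; 0 cos(a) -sin(a); 0 sin(a) cos(a)];
Ry = [cos(b) 0 sin(b); 0 1 0; -sin(b) 0 cos(b)];
Rz = [cos(g) -sin(g) 0; sin(g) cos(g) 0; 0 0 1];
R  = Rz*Ry*Rx;                                     % rotation matrix, (2.9)

q1 = 0.5*sqrt(1 + R(1,1) - R(2,2) - R(3,3));       % assumes q1 ~= 0
q0 = (R(3,2) - R(2,3))/(4*q1);
q2 = (R(1,2) + R(2,1))/(4*q1);
q3 = (R(1,3) + R(3,1))/(4*q1);

disp(norm([q0 q1 q2 q3]))                          % ~1: unit quaternion
```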
2.1.2.2 Combining Translation and Orientation

Usually, the position of a rigid body in space is expressed in terms of the position of a suitable point on the body with respect to a reference frame, defined as translation. Its orientation is expressed in terms of the components of the unit vectors of a frame attached to the body (with origin in the above point) with respect to the same reference frame, defined as rotation. In the previous subsection we discussed several representations of orientation, and now we need to combine this information with translation to create a tangible representation of relative pose. The two most practical representations are the quaternion-vector pair and the 4 × 4 homogeneous transformation matrix, both described hereafter. Taking into account that a coordinate frame can always be attached to a rigid object, a homogeneous transformation matrix describes either the pose of a frame with respect to a reference frame, or the displacement of a frame into a new pose. In the first case, the upper-left 3 × 3 matrix represents the orientation of the object, while the right-hand 3 × 1 column describes its position (e.g. the position of its mass centre). The last row of the homogeneous transformation matrix is always [0, 0, 0, 1]:
$$T = \begin{bmatrix} R & \mathbf{d} \\ \mathbf{0} & 1 \end{bmatrix}; \quad R \in SO(3); \; \mathbf{d} \in \mathbb{R}^3 \qquad (2.13)$$
In the case of object displacement, the upper-left matrix corresponds to the rotation and the right-hand column corresponds to the translation of the object. Since R is orthogonal, it is straightforward to observe that the inverse transformation is given by
$$T^{-1} = \begin{bmatrix} R^T & -R^T\mathbf{d} \\ \mathbf{0} & 1 \end{bmatrix} \qquad (2.14)$$
In order to apply a homogeneous transformation on a 3D point p, we need to consider its homogeneous representation $\begin{bmatrix} \mathbf{p} \\ 1 \end{bmatrix}$ and the corresponding homogeneous transform ${}^B T_A$ between two orthonormal frames:
$$\begin{bmatrix} {}^B\mathbf{p} \\ 1 \end{bmatrix} = {}^B T_A \begin{bmatrix} {}^A\mathbf{p} \\ 1 \end{bmatrix} \qquad (2.15)$$
As a conclusion to this section, a homogeneous transformation matrix expresses the coordinate transformation between two frames in a compact form. If the frames have the same origin, it reduces to the rotation matrix previously defined. If the frames have distinct origins, it allows the notation with superscripts and subscripts to be kept, which directly characterizes the current frame and the fixed frame.
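A short MATLAB check of the closed-form inverse (2.14) against the numerical inverse (our own illustrative sketch, not from the book):

```matlab
% Homogeneous transformation and its closed-form inverse, (2.13)-(2.14)
theta = 0.7;
R = [cos(theta) -sin(theta) 0; sin(theta) cos(theta) 0; 0 0 1];  % rotation about Z
d = [0.3; -1.2; 2.0];                                            % translation

T     = [R d; 0 0 0 1];            % homogeneous transformation, (2.13)
T_inv = [R' -R'*d; 0 0 0 1];       % closed-form inverse, (2.14)

disp(norm(T_inv - inv(T)))         % ~0: matches the numerical inverse
```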
2.2 Vision-Based Control Architectures

Visual servoing can be seen as a sensor-based control scheme driven by a vision sensor. Most, if not all, visual servoing tasks can be expressed as the regulation towards zero of an error e(t), defined by
$$\mathbf{e}(t) = \mathbf{f}(\mathbf{m}(t), \mathbf{a}) - \mathbf{f}^*(t) \qquad (2.16)$$
where m(t) is a set of image measurements (e.g. the image coordinates of interest points, or the area, the centre of gravity and other geometric characteristics of an object) and a is a set of parameters that represent potential additional knowledge about the system (e.g. coarse camera intrinsic parameters or a 3D model of objects). The vector f*(t) contains the desired value of the features, which can be either constant in the case of a fixed goal or varying if the task consists in following a specified trajectory. The classical designs for servoing architectures are named image-based visual servoing (IBVS), in which f consists of a set of 2D parameters that are directly expressed in the image, and pose-based visual servoing (PBVS), in which f consists of a set of 3D parameters related to the pose between the camera and the target.
Fig. 2.1 A general representation of the image-based visual servoing architecture
Fig. 2.2 Perspective projection of a 3D point
2.2.1 Image-Based Control Architecture Image-based visual servoing (IBVS) uses direct image measurements as feedback to control the motion of a robot, see Fig. 2.1. The robot’s positioning task is expressed as an image-based error function to be minimized using a suitable control law. As IBVS does not explicitly solve for the Cartesian pose of the target object, its performance does not depend on the accuracy of a priori models. However, since the domain of the control law is in image space, there is no direct control over the Cartesian or joint-space trajectory of the robot end-effector. Interaction Matrix for Point Feature Extraction Let p = (X, Y, Z ) be a 3D point defined in the camera space, its projection in the image plane (Fig. 2.2) is the point x of coordinates (x, y), where
$$x = \iota X/Z, \qquad y = \iota Y/Z \qquad (2.17)$$
and ι represents the focal length, which is considered equal to 1. The time derivative of a 2D point is defined as
$$\dot{x} = \dot{X}/Z - x\dot{Z}/Z, \qquad \dot{y} = \dot{Y}/Z - y\dot{Z}/Z \qquad (2.18)$$
Fig. 2.3 a An eye-in-hand camera configuration; b An eye-to-hand camera configuration
The visual sensor configuration can be either eye-in-hand (the camera is mounted on the TCP of the robot) or eye-to-hand (the camera is fixed in the working space); the two types of camera configurations are illustrated in Fig. 2.3. Regardless of the configuration, the derivative of the point p with respect to the camera velocity vc can be calculated from the fundamental kinematics equation:
$$\dot{p} = -v - \omega \times p = -v + [p]_\times\, \omega \qquad (2.19)$$
where $v_c = [v^T, \omega^T]^T = [v_x, v_y, v_z, \omega_x, \omega_y, \omega_z]^T$ is the camera velocity, with v and ω the linear and angular velocity, respectively, and $[p]_\times$ is the antisymmetric matrix defined as
$$[p]_\times = \begin{bmatrix} 0 & -Z & Y \\ Z & 0 & -X \\ -Y & X & 0 \end{bmatrix} \qquad (2.20)$$
Substituting (2.20) in (2.19), the derivative of a point p with respect to the visual sensor is
$$\begin{bmatrix} \dot{X} \\ \dot{Y} \\ \dot{Z} \end{bmatrix} = -\begin{bmatrix} v_x \\ v_y \\ v_z \end{bmatrix} + \begin{bmatrix} 0 & -Z & Y \\ Z & 0 & -X \\ -Y & X & 0 \end{bmatrix} \begin{bmatrix} \omega_x \\ \omega_y \\ \omega_z \end{bmatrix} = -\begin{bmatrix} v_x \\ v_y \\ v_z \end{bmatrix} + \begin{bmatrix} -\omega_y Z + \omega_z Y \\ \omega_x Z - \omega_z X \\ -\omega_x Y + \omega_y X \end{bmatrix} \qquad (2.21)$$
thus
$$\begin{cases} \dot{X} = -v_x - \omega_y Z + \omega_z Y \\ \dot{Y} = -v_y + \omega_x Z - \omega_z X \\ \dot{Z} = -v_z - \omega_x Y + \omega_y X \end{cases} \qquad (2.22)$$
If we substitute (2.22) in (2.18), the velocity of a 2D image feature $\dot{x} = (\dot{x}, \dot{y})^T$ can be written as
$$\begin{aligned} \dot{x} &= -v_x/Z + x v_z/Z + xy\,\omega_x - (1+x^2)\,\omega_y + y\,\omega_z \\ \dot{y} &= -v_y/Z + y v_z/Z + (1+y^2)\,\omega_x - xy\,\omega_y - x\,\omega_z \end{aligned} \qquad (2.23)$$
Writing the previous equation in matrix form, we have
$$\dot{x} = L_x v_c \qquad (2.24)$$
where $L_x$ is the interaction matrix related to x. Combining (2.23) and (2.24), the interaction matrix for a 2D point x of coordinates (x, y) is
$$L_x = \begin{bmatrix} -1/Z & 0 & x/Z & xy & -(1+x^2) & y \\ 0 & -1/Z & y/Z & 1+y^2 & -xy & -x \end{bmatrix} \qquad (2.25)$$
where Z is the depth of the point relative to the camera frame. For a set of n point features, the interaction matrix can be computed as
$$L_n = \begin{bmatrix} L_{x_1} \\ L_{x_2} \\ \vdots \\ L_{x_n} \end{bmatrix} \qquad (2.26)$$
The interaction matrix of the ith feature point of coordinates $(u_i, v_i)$ in the image plane is given by
$$L_i = \begin{bmatrix} -\iota/z_i & 0 & u_i/z_i & u_i v_i/\iota & -(u_i^2+\iota^2)/\iota & v_i \\ 0 & -\iota/z_i & v_i/z_i & (v_i^2+\iota^2)/\iota & -u_i v_i/\iota & -u_i \end{bmatrix} \qquad (2.27)$$
where $z_i$ is the depth of the ith point and ι, as mentioned before, is the focal length.
2.2.2 Position-Based Control Architecture

The information used to design position-based visual servoing (PBVS) architectures is a set of 3D parameters related to the pose between the camera and the target, see Fig. 2.4. These 3D parameters have to be estimated from the image measurements, either through a pose estimation process using the knowledge of the 3D target model, through a partial pose estimation process using the properties of the epipolar geometry between the current and the desired images, or through a triangulation process if a stereovision system is considered. In the PBVS case, the error e(t) is defined in the pose space, $d \in \mathbb{R}^3$, $R \in SO(3)$. If the goal pose is given by d = 0, R = I, then the role of the computer vision system is to provide, in real time, a measurement of the pose error. Let u, θ be the axis/angle parameterization of R. The error is then given by
Fig. 2.4 A general representation of the position-based visual servoing architecture
$$e(t) = \begin{bmatrix} d \\ u\theta \end{bmatrix} \qquad (2.28)$$
and its derivative is given by
$$\dot{e} = \begin{bmatrix} R & 0 \\ 0 & L_\omega(u, \theta) \end{bmatrix} V_c \qquad (2.29)$$
where
$$L_\omega(u, \theta) = I - \frac{\theta}{2}[u]_\times + \left(1 - \frac{\operatorname{sinc}\theta}{\operatorname{sinc}^2\frac{\theta}{2}}\right)[u]_\times^2 \qquad (2.30)$$
and sinc x is the cardinal sine, defined such that x sinc x = sin x and sinc 0 = 1. With respect to robustness, the feedback action is computed using estimated quantities that are a function of the image measurements and of the system calibration parameters.
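To make the axis/angle error terms concrete, the following MATLAB® sketch extracts (u, θ) from a rotation matrix and evaluates (2.28) and (2.30); the pose values are illustrative, and the extraction formula assumes θ is neither 0 nor π.

% Sketch of the axis/angle error terms (2.28)-(2.30); R and d stand for
% the current pose error, here with illustrative values.
R = expm([0 -0.3 0.2; 0.3 0 -0.1; -0.2 0.1 0]);  % a rotation in SO(3)
d = [0.1; -0.05; 0.2];
theta = acos((trace(R) - 1)/2);                  % rotation angle
u = 1/(2*sin(theta)) * [R(3,2)-R(2,3);           % rotation axis
                        R(1,3)-R(3,1);
                        R(2,1)-R(1,2)];
e = [d; u*theta];                                % pose error, Eq. (2.28)
ux = [0 -u(3) u(2); u(3) 0 -u(1); -u(2) u(1) 0]; % [u]x
sinc = @(x) sin(x)./x;   % unnormalized sinc; valid here since theta ~= 0
Lw = eye(3) - (theta/2)*ux + ...
     (1 - sinc(theta)/sinc(theta/2)^2) * ux^2;   % Eq. (2.30)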
2.3 Advanced Visual Servoing and Applications

This section builds on the previous one and discusses some advanced visual servo techniques and related applications. Wide-angle cameras such as fisheye lenses and catadioptric cameras have significant advantages for visual servoing. Next, we briefly show how IBVS can be reformulated for polar rather than Cartesian image-plane coordinates. This is directly relevant to fisheye lenses, but it also gives improved rotational control when using a perspective camera. In polar coordinates, an image point (u, v) is expressed by the pair (r, φ), where r is the distance from the principal point:
$$r = \sqrt{u^2 + v^2} \qquad (2.31)$$
The angle from the u-axis to the line joining the principal point to the image point is
$$\phi = \tan^{-1}\frac{v}{u} \qquad (2.32)$$
For polar coordinates, the interaction matrix that links the time derivative of (r, φ) to the camera velocity is
$$L_{polar} = \begin{bmatrix} -\dfrac{\iota}{Z}\cos\phi & -\dfrac{\iota}{Z}\sin\phi & \dfrac{r}{Z} & \dfrac{\iota^2+r^2}{\iota}\sin\phi & -\dfrac{\iota^2+r^2}{\iota}\cos\phi & 0 \\[2mm] \dfrac{\iota}{rZ}\sin\phi & -\dfrac{\iota}{rZ}\cos\phi & 0 & \dfrac{\iota}{r}\cos\phi & \dfrac{\iota}{r}\sin\phi & -1 \end{bmatrix} \qquad (2.33)$$
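Before discussing the structure of (2.33), a direct MATLAB® transcription of it is given below; the point coordinates and depth are illustrative, and the focal length ι is taken as 1, as earlier in the chapter.

% Minimal sketch of the polar interaction matrix (2.33); iota = 1 and
% the point/depth values are illustrative.
u = 0.2; v = 0.1; Z = 1.5;
r   = sqrt(u^2 + v^2);                % Eq. (2.31)
phi = atan2(v, u);                    % Eq. (2.32), atan2 resolves the quadrant
iota = 1;
Lpolar = [-iota/Z*cos(phi), -iota/Z*sin(phi), r/Z, ...
          (iota^2+r^2)/iota*sin(phi), -(iota^2+r^2)/iota*cos(phi), 0;
          iota/(r*Z)*sin(phi), -iota/(r*Z)*cos(phi), 0, ...
          iota/r*cos(phi), iota/r*sin(phi), -1];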
This interaction matrix is unusual in that it has three constant elements. In the first row, the zero indicates that the radius r is invariant to rotation about the z-axis. In the second row, the zero indicates that the polar angle is invariant to translation along the optical axis, and the −1 indicates that the angle of a feature (with respect to the u-axis) decreases with positive camera rotation. As for Cartesian point features, the translational part of the Jacobian (the first three columns) is proportional to 1/Z. Note also that the Jacobian is undefined for r = 0, that is, for a point at the image centre.

Potential applications of visual servoing are numerous. It can be used as soon as a vision sensor is available and a task is assigned to a dynamic system to control its motion. Among the applications for which the use of visual servoing techniques has been reported, we find the following:

• The control of a pan–tilt–zoom camera;
• Grasping using a robot arm;
• Locomotion and dexterous manipulation with a humanoid robot;
• Micro- or nanomanipulation of MEMS or biological cells;
• Pipe inspection by an underwater autonomous vehicle;
• UAV formation control;
• Autonomous navigation of a mobile robot in indoor or outdoor environments;
• Aircraft landing;
• Autonomous satellite rendezvous;
• Biopsy using ultrasound probes or heart motion compensation in medical robotics;
• Virtual cinematography in animation.
2.4 Image-Based Controller

The necessity of designing flexible and versatile systems is one of the most challenging and popular trends in robotics research. In the first part of this section, a simulator for servoing applications using point features and image moments as visual features is presented. Including visual servoing techniques in an existing robotic system is a very demanding task. Thus, in the second part, a solution for extending the capabilities of a 6 DOF (degrees of freedom) manipulator robot towards visual servoing
system development is presented. An image-based control architecture is designed, and a real-time implementation on an ABB robot and a Fanuc robot is developed. The image acquisition and processing, together with the computation of the image-based control law, were implemented in MATLAB®. A new type of robot driving interface that links the robot's controller with the MATLAB® environment is proposed. The robustness and stability of the control laws based on visual features (both point features and image moments) are tested in multiple experiments. The experimental results outline the very good performance of the real-time visual servoing system.
2.4.1 Proportional Control Law

Feedback control architectures based on image information use visual features in real time, as they are extracted from the system during operation, e.g. to control the motion of the robot. The main goal of the image-based controller is to minimize the error e(t) between the current set of visual features f(k) and the desired configuration of the visual features f*, thus generating the reference velocity $v_c^*$ for the visual sensor; this is also referred to as path planning, motion planning or reference guidance. A set of visual features f can be used in servoing applications if it is defined by a differentiable mapping of the relative pose, such as
$$f = f(r(t)) \qquad (2.34)$$
where r(t) describes the relative situation/transformation between the visual sensor and the working space at the moment t. The movement of the camera or of the objects induces a variation of the visual feature values. These variations can be correlated with the camera velocity in the working space using
$$\dot{f} = \frac{\partial f}{\partial r}\dot{r} = L_f v_c \qquad (2.35)$$
where
• $L_f$ is a Jacobian matrix, referred to as the interaction matrix attached to f. The value of this matrix depends on the visual features f and also on the relative position between the camera and the object.
• $v_c = [v, \omega]^T$ represents the camera velocity with respect to the working space.

As aforementioned, the goal of the image-based controller is to minimize the error e(t) in the image plane, defined as
$$e(t) = f(t) - f^* \qquad (2.36)$$
Relations (2.35) and (2.36) deliver the relationship between the time variation of the error and the camera velocity:
$$\dot{e} = \dot{f} = L_f v_c \qquad (2.37)$$
Taking into account the robot dynamics, the previous equation becomes
$$\dot{e} = L_f\, G(z^{-1})\, v_c^* \qquad (2.38)$$
In order to have an exponential decrease of the error,
$$\dot{e} = -\lambda e \qquad (2.39)$$
and considering $v_c^*$ as the command signal for the robot controller, the following control law is obtained:
$$v_c^* = -\lambda\, G^{-1}(z^{-1})\, L_f^+ \left( f(t) - f^* \right) \qquad (2.40)$$
where $L_f^+$ is the pseudo-inverse of the interaction matrix $L_f$, defined as
$$L_f^+ = \left( L_f^T L_f \right)^{-1} L_f^T \qquad (2.41)$$
Control systems based on visual feedback are non-linear and have coupled dynamics. The stability of these types of systems can be analysed using Lyapunov functions; for instance, a detailed analysis of the stability of visual servoing systems can be found in [46]. If the visual features used to describe the object are image moments, then, following the same steps as for point features, the image-based proportional control law is
$$v_c^* = -\lambda\, G^{-1}(z^{-1})\, L_{f_m}^+ \left( f_m(t) - f_m^* \right) \qquad (2.42)$$
where $L_{f_m}^+$ is the pseudo-inverse of the interaction matrix $L_{f_m}$.
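A single step of the proportional law (2.40) can be sketched in MATLAB® as follows; for clarity, the robot transfer matrix G is taken as the identity, the depths are assumed known, and interaction_matrix is the sketch given after (2.27). All numerical values are illustrative.

% One step of the proportional image-based law (2.40), with G = I.
pts   = [0.1 0.2; -0.3 0.1; 0.2 -0.2; -0.1 -0.1];  % current points (x, y)
Z     = [1.0; 1.2; 0.9; 1.1];                      % estimated depths
fstar = [0.15; 0.25; -0.25; 0.15; 0.25; -0.15; -0.05; -0.05];
f     = reshape(pts', [], 1);             % stack features as (x1,y1,x2,y2,...)
L     = interaction_matrix(pts, Z);       % Eq. (2.26), sketch above
lambda = 0.5;                             % convergence gain
vc = -lambda * pinv(L) * (f - fstar);     % Eq. (2.40); pinv realizes Eq. (2.41)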
2.4.2 The Robot Model: VCMD Model

Generally, an image-based servoing system consists of a manipulator robot, a visual sensor and an image-based controller. Figure 2.5 illustrates a schematic representation of an image-based control architecture for manipulator robots with 6 DOF. The core of the architecture is the image-based controller, which requires a priori information about the system behaviour in order to minimize the error between the current configuration of the visual features f and the desired configuration f*. To enable the modelling of the dynamic response of the system
Fig. 2.5 Visual servoing architecture to control manipulator robots
in open loop, the two entities which have to be analysed are the manipulator robot and the visual sensor. Further on, an eye-in-hand configuration of the robot and visual sensor ensemble is considered. There are two possibilities to model the robot: kinematic and dynamic modelling. Of the two, the robot dynamics play a key role with respect to the performance of the visual servoing system. For servoing applications, the manipulator robot can be modelled as a VCMD [54]. As seen in Fig. 2.5, the input signal of the VCMD is the output of the visual-based controller, $v_c^*$. This signal is the reference velocity of the camera and has the form $v_c^* = [v^{*T}\ \omega^{*T}]^T$, where $v^* = [v_x^*\ v_y^*\ v_z^*]^T$ is the linear velocity and $\omega^* = [\omega_x^*\ \omega_y^*\ \omega_z^*]^T$ is the angular velocity. The input signal $v_c^*$ is represented in Cartesian space, and in order to be applied to the manipulator robot a homogeneous transformation is needed. If we denote by $[t_1, t_2, t_3, t_4, t_5, t_6]^T$ the pose obtained by integrating $v_c^*$, then the Jacobian of the robot can be defined as
$$J_r = \begin{bmatrix} \dfrac{\partial t_1}{\partial q_1} & \cdots & \dfrac{\partial t_1}{\partial q_6} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial t_6}{\partial q_1} & \cdots & \dfrac{\partial t_6}{\partial q_6} \end{bmatrix} \qquad (2.43)$$
where $q_i$, $i = 1, \ldots, 6$, represent the states of the robot joints. Therefore, the transformation of $v_c^*$ from Cartesian space to the robot joint space can be obtained via $J_r^{-1}$. Next, two methods used to describe the robot dynamics as a VCMD are discussed. In [54], a linearized model for a multivariable (MIMO) VCMD is proposed, starting from the hypothesis that each joint has its own velocity control loop (see Fig. 2.6). The controllers $C_i$, $i = 1, \ldots, 6$, corresponding to the control loops are designed so as to ensure the decoupling of the joints (this being valid for most manipulator robots). Due to the structure of the robot Jacobian matrix and to the dependence of the robot dynamics on the joint positions, the result is a non-linear VCMD model. For linearization purposes, the following assumptions are made: the non-linear effects generated by the Coriolis force, the inertial force and the gravitational torque have an
Fig. 2.6 The VCMD model of a manipulator robot
Fig. 2.7 The linearized model for VCMD
insignificant disturbance effect on the control loop, which can be rejected by an adequate design of the controllers $C_i$, $i = 1, \ldots, 6$. The non-linearities generated by all these forces become significant only when the robot moves at high speed. Because the joint control loops are very fast, they can be designed in continuous time; a zero-order hold (ZOH) block therefore has to be included, as depicted in Fig. 2.6. To apply the reference velocity $v_c^*$ in the robot joint space, the inverse of the robot Jacobian, $J_r^{-1}$, is used. Taking into account that the joint inertia varies little with the robot position $x_q$, the inertia matrix can be considered constant around a given position of the robot. For servoing systems based on visual feedback, the joint speeds are controlled individually and the controllers $C_i$, $i = 1, \ldots, 6$, are designed to reject the non-linear effects produced by the disturbances. Starting from these assumptions and considering that the dynamic model of the robot and also the Jacobian $J_r$ are constant around a given position $x_q$, the control loops of the robot can be modelled using the transfer function G(s):
$$v_q(s) = G(s)\, v_q^*(s) \qquad (2.44)$$
The discrete linear model of the VCMD is illustrated in Fig. 2.7 and described by
$$G_{VCMD}(z^{-1}) = (1 - z^{-1})\, \mathcal{Z}\!\left\{ \frac{J_r\, G(s)\, J_r^{-1}}{s^2} \right\} \qquad (2.45)$$
where $\mathcal{Z}$ denotes the z-transform. In [54], the dynamics of the visual sensor are modelled as pure delays. These delays are given by: (i) the time necessary for image acquisition and image formation, (ii) the time needed to transfer the image from the camera to the computer and (iii) the time required for image processing, see Fig. 2.8.
Fig. 2.8 Visual sensor model
In order to determine the model of the visual sensor, we can assume that the time required for the acquisition and formation of the image, $T_{ach}$, is equal to the sampling period $T_s$; that the time necessary to transfer the image, $T_{trans}$, is very small in comparison with the sampling period; and that the time for image processing, $T_{proc}$, is equal to the sampling period. Hence, the visual sensor can be modelled as a dead-time element equal to $2T_s$. Using the camera model together with the VCMD model and knowing the initial pose of the camera, denoted by $x_0$, we can develop a control structure as in Fig. 2.9, which can then be successfully applied in real-time servoing applications. Assuming that G(s) is diagonal, the discrete transfer matrix can be written as
$$G_p(z^{-1}) = z^{-2} (1 - z^{-1})\, J_r\, \mathcal{Z}\!\left\{ \frac{G(s)}{s^2} \right\} J_r^{-1} = z^{-2} J_r\, \mathrm{diag}\!\left\{ \frac{b_i(z^{-1})}{a_i(z^{-1})} \right\} J_r^{-1}, \quad i = 1, \ldots, 6 \qquad (2.46)$$
The autoregressive integrated moving average with exogenous inputs (ARIMAX) model has the following deterministic form:
$$z^{-1} A^{-1}(z^{-1})\, B(z^{-1}) = G_p(z^{-1}) = z^{-2} J_r\, \mathrm{diag}\!\left\{ \frac{b_i(z^{-1})}{a_i(z^{-1})} \right\} J_r^{-1} \qquad (2.47)$$
where
$$A(z^{-1}) = \mathrm{diag}\{ a_i(z^{-1}) \}\, J_r^{-1}, \qquad B(z^{-1}) = z^{-1}\, \mathrm{diag}\{ b_i(z^{-1}) \}\, J_r^{-1} \qquad (2.48)$$
Another method that can be used to model the dynamics of the manipulator robot is presented in [55]. The proposed method models the VCMD using a disturbance observer (DOB) to estimate the object's movement from the current sampling period to the next one, Fig. 2.10. The VCMD model is non-linear because the robot Jacobian $J_r$ is multivariable and coupled, and the model of the robot dynamics is a function of the joint angular positions $x_q$. One way to achieve a decoupled system is to employ a robust control strategy based on the joint-space disturbance observer; each joint axis can then be considered decoupled below the cutoff frequency of the DOB [55]. The velocity controller is considered to be a diagonal gain matrix:
$$K_v = \mathrm{diag}\{k_v, k_v, \ldots, k_v\} \qquad (2.49)$$
Fig. 2.9 Control architecture for a servoing system using VCMD model
Fig. 2.10 The robot modelled as VCMD using DOB
If the non-singularity of the robot Jacobian $J_r$ in the camera coordinates is assured, the transfer function from the acceleration command $\dot{v}_c^*$ to the velocity $v_c$ can be considered an integrator in the frequency region below the cutoff frequency. The typical sampling period of the VCMD system is 0.2–1 ms and the typical cutoff frequency is 150–300 rad/s. Since the velocity controller is usually a proportional one with gain $k_v$, the inner-loop system can be expressed, in the frequency region below the cutoff frequency, as
$$v_c(s) = G(s)\, v_c^*(s) = \frac{k_v}{s + k_v} I_6\, v_c^*(s) \qquad (2.50)$$
The discrete model of the inner velocity loop presented in Fig. 2.10 is described by
$$G(z^{-1}) = (1 - z^{-1})\, \mathcal{Z}\{ G(s)/s \} \qquad (2.51)$$
while the VCMD model is
$$G_{VCMD}(z^{-1}) = (1 - z^{-1})\, \mathcal{Z}\{ G(s)/s^2 \} \qquad (2.52)$$
Fig. 2.11 Control scheme of a servoing system when the robot is modelled using DOB

The control structure of a servoing system when the robot dynamics are modelled using the DOB is illustrated in Fig. 2.11. In this approach, as in the one presented above, the visual sensor is modelled as a dead-time element $z^{-d}$, resulting in the following discrete-time model:
$$G_p(z^{-1}) = z^{-d}\, G(z^{-1}) \qquad (2.53)$$
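The discrete models (2.50)-(2.53) can be obtained numerically as in the following sketch, which assumes the Control System Toolbox and uses illustrative values for the gain, sampling period and dead time; note that zero-order-hold discretization of G(s) is exactly the operation $(1 - z^{-1})\mathcal{Z}\{G(s)/s\}$.

% Sketch of the inner-loop and VCMD models (2.50)-(2.53); assumes the
% Control System Toolbox, with illustrative gain, period and delay.
kv = 200;                      % velocity loop gain [rad/s]
Ts = 1e-3;                     % sampling period, within the 0.2-1 ms range
G  = tf(kv, [1 kv]);           % inner velocity loop, Eq. (2.50)
Gd = c2d(G, Ts, 'zoh');        % Eq. (2.51): (1 - z^-1) Z{G(s)/s}
Gvcmd = c2d(G * tf(1, [1 0]), Ts, 'zoh');  % Eq. (2.52): (1 - z^-1) Z{G(s)/s^2}
d  = 2;                        % visual sensor dead time [samples]
Gp = Gd;  Gp.InputDelay = d;   % Eq. (2.53): Gp(z^-1) = z^-d G(z^-1)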
2.5 Summary

In this chapter, an overview of the visual servoing concept was presented. The first part showed the representation of the position and orientation of an object in 2D and 3D environments. A coordinate frame is used to describe a set of points attached to a specific object. The pose of a coordinate frame can be described with respect to another coordinate frame, and the transformation between two frames is given by a homogeneous transformation matrix. In the second part of this chapter, the fundamentals of visual-based control architectures were presented. From the perspective of the visual sensor location, one can identify two configurations: (i) eye-in-hand, where the camera is mounted on the robot's TCP, and (ii) eye-to-hand, where the camera is fixed in the working environment. Based on the type of the control structure, there are two main architectures: (i) image-based visual servoing, where the controller is designed directly in the image plane, and (ii) position-based visual servoing, where the controller is designed using pose estimation based on camera calibration and a 3D target model.
Chapter 3
Image Feature Extraction and Evaluation
3.1 Introduction

Visual features extracted from images acquired using visual sensors represent the inputs of a visual control architecture. The visual sensor used for image acquisition can be, for instance, a conventional camera, commonly used in servoing applications, an ultrasonic camera or an omnidirectional camera [18, 19]. The selection of visual features is an important aspect of visual servoing systems, since these features directly affect the performance and accuracy of the servoing system. The minimum number of visual features used in the control architecture to drive the movement of a robot manipulator depends on the number of degrees of freedom (DOF) of the robot. Therefore, it is necessary to have a correlation between the visual features and the motion of the visual sensor. Visual features that can be used as input to the control architecture can be divided into: (i) geometric features, i.e. features of an object constructed from a set of geometric elements such as points, lines, curves and surfaces, and (ii) photometric features, i.e. the luminance of all pixels in the image. Geometric features are: (i) 2D, used to describe the geometric content of a working area, and (ii) 3D, used to perform the correlation between the coordinate axes attached to the robot and the coordinate axes attached to the object. Both 2D and 3D features can be used simultaneously in designing the control architecture, resulting in a hybrid control architecture. The 2D visual features are extracted from the image plane and represent the coordinates of point features, parameters that define lines or ellipses, regions of interest and contours [20–22]. Photometric features which can be used in visual servoing applications are image moments [23, 24]. Using image moments as visual features brings significant improvements in the performance of servoing systems, since these features allow a general representation of the image and the description of complex objects. Another advantage of these features is that they can be used to design a decoupled servoing system and, at the same time, to minimize the non-linearities introduced by the interaction matrix.
3.2 Point Features

A point feature is a point in the image which has a well-defined position and can be precisely detected; it can also be defined as the intersection of two edges. The performance of point feature operators is analysed based on their ability to detect the same point features in multiple images which are similar but not identical, for example, the same image under different translations, rotations and/or other geometric transformations.
3.2.1 Harris Corner Detector

The Harris operator is a point feature detector based on the autocorrelation matrix. The underlying algorithm was first proposed by Moravec [26] and was improved in 1988 by Harris and Stephens [25], who introduced the autocorrelation function. The complete algorithm consists of two stages. Firstly, features are detected via the autocorrelation matrix computed for each pixel in the image. Secondly, the structure of the local image is characterized by calculating the local maxima of the autocorrelation function in a neighbourhood area defined by the user. Points associated with these local maxima are considered interest points (i.e. point features). Given an image function I(x, y), the autocorrelation matrix is computed using the following equation:
$$A = \sum_x \sum_y g(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} = \begin{bmatrix} \overline{I_x^2} & \overline{I_x I_y} \\ \overline{I_x I_y} & \overline{I_y^2} \end{bmatrix} \qquad (3.1)$$
where g(x, y) represents a smoothing circular mask, for example, the Gaussian kernel $g(x, y) = e^{-(x^2+y^2)/\sigma^2}$, and $I_x$, $I_y$ represent the gradients of the image function in the x and y directions, defined as
$$I_x = \frac{\partial I}{\partial x} = I * (-1, 0, 1), \qquad I_y = \frac{\partial I}{\partial y} = I * (-1, 0, 1)^T \qquad (3.2)$$
The eigenvalues of the autocorrelation matrix represent the principal signal changes in two orthogonal directions in a neighbourhood around the point. Since the matrix A is symmetric, both of its eigenvalues are real, and a graphical representation of the eigenvalues is illustrated in Fig. 3.1. Based on the properties of the autocorrelation matrix, three cases can be distinguished:

• if both eigenvalues are small, then no point feature is detected;
• if one eigenvalue is small and the other one is large, then an edge is detected;
• if the image signal varies significantly in both directions (i.e. both eigenvalues are large), then a point feature is found at that location in the image.
Fig. 3.1 Geometric representation of the autocorrelation function
The basic idea of the Harris operator is to analyse the function C(A) defined as
$$C(A) = \det(A) - \delta\, \mathrm{trace}^2(A) \qquad (3.3)$$
where det(A) is the determinant of the matrix A, δ is a tuning parameter and trace(A) is the trace of the autocorrelation matrix. From these equations, it follows that high cornerness values correspond to large eigenvalues. A point feature is detected if the matrix A has rank 2 and large eigenvalues. Based on the Harris algorithm, a cornerness value is computed for each pixel in the image using
$$c_v = \frac{\mathrm{trace}(A)}{\det(A)} = \frac{\overline{I_x^2} + \overline{I_y^2}}{\overline{I_x^2}\;\overline{I_y^2} - \left(\overline{I_x I_y}\right)^2} \qquad (3.4)$$
where $\overline{I_x^2}$, $\overline{I_y^2}$, $\overline{I_x I_y}$ are the products of the image gradients filtered with a Gaussian kernel. A pixel is declared a point feature if the value $c_v$ is below a threshold defined by the user; by comparing $c_v$ with the threshold value, it is possible to control the number of detected point features. The point features extracted using the Harris operator are invariant to transformations such as translation and rotation, and partially invariant to small changes of the luminosity. The performance (stability and repeatability) of the Harris operator is quite high with respect to other point feature detectors. An example of MATLAB® code for this operator is given below; note that it scores pixels with the reciprocal ratio det(A)/trace(A) and therefore keeps those above the threshold, which is equivalent to the test based on (3.4).

function h_features = harrisd(image)
% Harris point feature detector
h = fspecial('gaussian');            % Gaussian smoothing kernel g(x,y)
threshold = 9200;                    % threshold value defined by the user
mask_y = [-1 0 1; -1 0 1; -1 0 1];   % gradient mask, y direction
mask_x = mask_y';                    % gradient mask, x direction
Ix = conv2(image, mask_x, 'same');   % image gradients, Eq. (3.2)
Iy = conv2(image, mask_y, 'same');
Ix2 = conv2(Ix.^2, h, 'same');       % smoothed gradient products, Eq. (3.1)
Iy2 = conv2(Iy.^2, h, 'same');
Ixy = conv2(Ix.*Iy, h, 'same');
% cornerness as det(A)/trace(A); large values indicate corners
cp = ((Ix2.*Iy2) - (Ixy.^2))./(Ix2 + Iy2);
imshow(uint8(image)); hold on;
sze = 25;                            % non-maximum suppression window
mx = ordfilt2(cp, sze^2, ones(sze)); % local maxima of the cornerness map
cp = (cp == mx) & (cp > threshold);
[r, c] = find(cp);
hp = plot(c, r, 'or'); set(hp, 'LineWidth', 3);
h_features = [c, r];
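Assuming a greyscale intensity image is already available, the detector can be invoked as, for instance, corners = harrisd(double(rgb2gray(imread('peppers.png')))); the image file name is purely illustrative, and the threshold typically needs retuning for other images.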
3.2.2 SIFT Descriptor

In the previous section, one of the most important point feature detectors was introduced. However, the point features extracted using the Harris operator keep their properties only for image sequences with the same scaling factor (i.e. the distance between the motion plane of the object and the visual sensor is constant). In reality, most servoing applications contain sequences of images with a variable scaling factor, requiring a point feature detector that is invariant to scale transformations. Among the scale-invariant point feature detectors, the most used methods are: (i) SIFT (Scale-Invariant Feature Transform) and (ii) the Harris–Laplace algorithm. The detection of scale-invariant features can be realized using the SIFT algorithm proposed by Lowe [27, 28]; the algorithm consists of four stages:

1. Extremum (minimum and/or maximum) detection in the scale space;
2. Feature localization;
3. Magnitude and orientation calculation;
4. SIFT descriptor computation.
The first two stages are used for extremum detection in scale space and accurate localization of keypoints; the orientation assignment and the computation of the descriptor are realized in the last two stages. The first stage in determining the point features is to identify the positions in the scale space which represent extreme points of a particular function. The detection of positions which are invariant to image scale is based on the idea of finding stable point features at each scale level. The scale space is computed using a sequence of filtering operations with a given kernel [28]. It has been shown that the only kernel which can be used to generate a 3D scale space is the Gaussian kernel [29, 30]:
$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2+y^2)/2\sigma^2} \qquad (3.5)$$
where $\sigma^2$ represents the dispersion of the Gaussian kernel. The scale space $L(x, y, \sigma)$ is defined as the convolution between the image function I(x, y) and the Gaussian kernel $G(x, y, \sigma)$:
$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y) \qquad (3.6)$$
The function $L(x, y, \sigma)$ generates a sequence of filtered images with respect to the parameter σ. A scale space can be defined if the parameter σ is considered as a function
$$\sigma : [o_{min}, o_{max}] \times [0, S-1] \to \mathbb{R} \qquad (3.7)$$
defined by
$$\sigma(o, s) = \sigma_0\, 2^{\,o + s/S} \qquad (3.8)$$
where o represents an octave of the scale space, S the number of levels per octave and s the index of the level within octave o. The computation of the scale space attached to an image I(x, y) can be implemented recursively through successive filtering and changes of the image size:

1. Compute $L(x, y, \sigma(o, s))$ using Eq. (3.6);
2. Compute $\sigma(o, s+1)$ using Eq. (3.8);
3. Compose $L(x, y, \sigma(o, s+1))$;
4. If the condition s + 1 = S is true, the image is resized for the next octave.
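The recursion above can be sketched in MATLAB® for a single octave as follows; imgaussfilt and imresize from the Image Processing Toolbox are assumed, the initial blur of the input image is neglected, and the incremental filtering uses the fact that the variances of cascaded Gaussian filters add up. The base scale and the file name are illustrative.

% Sketch of the scale-space recursion for one octave; assumes the
% Image Processing Toolbox. sigma0 and the image are illustrative.
S = 5; o = 0; sigma0 = 1.6;
I = im2double(imread('cameraman.tif'));
levels = cell(1, S);
L = imgaussfilt(I, sigma0 * 2^(o + 0/S));    % level s = 0, Eq. (3.8)
levels{1} = L;
for s = 0:S-2
    sig_cur  = sigma0 * 2^(o + s/S);         % Eq. (3.8)
    sig_next = sigma0 * 2^(o + (s+1)/S);
    % cascaded Gaussians add in variance, so only the increment is applied
    L = imgaussfilt(L, sqrt(sig_next^2 - sig_cur^2));
    levels{s+2} = L;
end
% When s + 1 = S, the last level is downsampled to seed the next octave:
I_next = imresize(levels{end}, 0.5);         % Eq. (3.10)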
At this moment, we have to choose the starting parameters, which according to the algorithm proposed in [28] have the following values: the number of levels per octave S = 5, the maximum octave $o_{max} = 4$ and the minimum octave $o_{min} = -1$. Since the minimum octave is equal to −1, the image resolution is doubled at the beginning of the algorithm:
$$N_o \times M_o = 2N_I \times 2M_I \qquad (3.9)$$
where $N_o \times M_o$ represents the resolution of the image per octave and $N_I \times M_I$ denotes the size of the original image. The algorithm executes the image filtering recursively until the condition s + 1 = S is fulfilled, at which moment the image resolution becomes
$$N_{o+i} \times M_{o+i} = \frac{N_{o+i-1}}{2} \times \frac{M_{o+i-1}}{2} \qquad (3.10)$$
After the scale space is completed ($o = o_{max}$), the difference-of-Gaussians (DoG) space is computed using the following equation:
$$D(x, y, \sigma) = \left( G(x, y, k\sigma) - G(x, y, \sigma) \right) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma) \qquad (3.11)$$
The DoG space can be interpreted as a derivative of the scale space; each of its octaves has one level fewer
Fig. 3.2 a Scale representation for 3 levels of one octave; b the 3 × 3 window around the point P
than the corresponding octave of the scale space. In the DoG space, we look for the positions $P(x_p, y_p, \sigma_p)$ where the value of the function D(P) is larger or smaller than the value at each of the 26 neighbours in the 3 × 3 × 3 cube centred at the point P (Fig. 3.2a). If the value of the function is larger or smaller than the values at all 26 neighbours, then $P(x_p, y_p, \sigma_p)$ is an extreme point [28]. All the extreme points form the set of local extrema of the function D(P) in the scale space, denoted by
$$\Lambda = \left\{ P(x, y, \sigma) \;\middle|\; \forall P' \in V_{3\times3\times3}(x, y, \sigma),\; |D(P)| > |D(P')| \right\} \qquad (3.12)$$
Regarding the point feature localization problem, the idea is to refine the set of extrema by using a stability criterion in the neighbourhood of each element $P \in \Lambda$. The process consists in determining the extreme points of the function $D(x, y, \sigma)$ in the vicinity of the points P. By applying a Taylor series expansion up to the quadratic term to the function $D(x, y, \sigma)$, shifted such that the origin is at the point P, we can analyse the offset $\Delta P$ of P [31]:
$$D(\Delta P) = D + \frac{\partial D^T}{\partial P} \Delta P + \frac{1}{2} \Delta P^T \frac{\partial^2 D}{\partial P^2} \Delta P \qquad (3.13)$$
Taking the derivative of this function and setting it to zero, the location of the extremum is determined by
$$\hat{P} = -\left( \frac{\partial^2 D}{\partial P^2} \right)^{-1} \frac{\partial D}{\partial P} \qquad (3.14)$$
The stability conditions imposed by Lowe refer to the value of the offset $\hat{P}$, which must be smaller than 0.5 in each of the three dimensions (x, y, σ), and to the rejection of the extreme points which are unstable to contrast variation. The function value
$$D(\hat{P}) = D + \frac{1}{2} \frac{\partial D^T}{\partial P} \hat{P} \qquad (3.15)$$
is used to reject the unstable extrema with low contrast. For an image $I(x, y) \in [0, 1]$, we can use the inequality $|D(\hat{P})| < 0.03$ to reject unstable points. In order to reject
the edge response, the eigenvalues of the Hessian matrix can be used:
$$H = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix} \qquad (3.16)$$
These eigenvalues are proportional to the principal curvatures of the function D. Let $\lambda_L$ and $\lambda_S$ be the eigenvalues of larger and smaller magnitude, respectively, and let r be the value of the ratio $\lambda_L/\lambda_S$; we can then write the following equation:
$$\frac{\mathrm{trace}(H)^2}{\det(H)} = \frac{(\lambda_L + \lambda_S)^2}{\lambda_L \cdot \lambda_S} = \frac{(r+1)^2}{r} \qquad (3.17)$$
where
$$\mathrm{trace}(H) = D_{xx} + D_{yy} = \lambda_L + \lambda_S, \qquad \det(H) = D_{xx} D_{yy} - D_{xy}^2 = \lambda_L \lambda_S \qquad (3.18)$$
Using the inequality
$$\frac{\mathrm{trace}(H)^2}{\det(H)} < \frac{(r+1)^2}{r} \qquad (3.19)$$
the point features for which the ratio between the principal curvatures is larger than r are eliminated. As aforementioned, the SIFT algorithm consists of four stages. The first two, extremum detection in scale space and feature localization, were presented above and are used to extract the point features. The next two steps, magnitude and orientation calculation and descriptor development, are used to compute the SIFT descriptor. The descriptor of a point feature can be represented by assigning an orientation and a magnitude calculated from the local properties of the image function [32]. The scale of the feature is used to select the right level of the image filtering with the Gaussian kernel. Hence, each keypoint is assigned an orientation and a magnitude based on the local image properties:
$$\begin{aligned} M(x, y) &= \sqrt{ \left( L_{x+1,y} - L_{x-1,y} \right)^2 + \left( L_{x,y+1} - L_{x,y-1} \right)^2 } \\ \theta(x, y) &= \tan^{-1} \left( (L_{x,y+1} - L_{x,y-1}) / (L_{x+1,y} - L_{x-1,y}) \right) \end{aligned} \qquad (3.20)$$
The orientation histogram collects the gradient orientations of all points within a circular mask around the point feature. It has 36 bins covering the 360° range of orientations; then, for each block, the orientations and the magnitudes are converted to 8 bins dividing the trigonometric circle, see Fig. 3.3. Hitherto, the location, the scale and the orientation of each point feature are known, and the next step is to compose the descriptor for a local region of the image. To improve the stability of the detected points and to achieve better performance, the magnitude and orientation are computed for the whole 16 × 16 neighbourhood around each point feature, and not only for the feature itself, Fig. 3.4. In Fig. 3.4a,
Fig. 3.3 a The orientation histogram; b the fundamental orientations
Fig. 3.4 a The point features extracted with SIFT operator; b the orientation and the magnitude for all 256 pixels from the 16 × 16 neighbourhood; c the point feature descriptor
the point features detected using the SIFT operator are presented. First, the magnitude and the orientation of all 256 points from the 16 × 16 neighbourhood around each point feature are computed, see Fig. 3.4b. Using the scale of the point feature, the level of the Gaussian kernel is selected. For computational efficiency, the gradient function is computed for all the levels of the pyramid using (3.20). The 16 × 16 neighbourhood is divided into blocks of 4 × 4 dimensions, and for each block a histogram is computed, see Fig. 3.4c. The orientation of a histogram is divided into b indices as in Fig. 3.3, and the index magnitude is calculated using the following equation:
$$h_{r(l,m)}(k) = \sum_{x, y \in r(l,m)} M(x, y) \left( 1 - |\theta(x, y) - c_k| / \Delta_k \right) \qquad (3.21)$$
where $c_k$ denotes the orientation of each index, $\Delta_k$ is a constant equal to 360/(2b) and (x, y) are the point coordinates from the subregion $r(l,m)$. For every point feature, the descriptor is formed as a vector which contains the values of all the orientation histograms; the length of the vector is $bn^2$, where b is the number of orientations in the histogram and n is the dimension of the block grid. In this case, we have 4 × 4 blocks with 8 orientations for each of them, so the length of the vector is 4 × 4 × 8 = 128.
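A possible MATLAB® reading of (3.21) for one block is sketched below; M and theta are assumed to hold the magnitudes and orientations (in degrees) of the block's pixels from (3.20), the clamping of negative weights to zero is our interpretation of the bin assignment, and wraparound at 360° is ignored for brevity.

% Sketch of the index-magnitude computation (3.21) for one 4-by-4 block;
% M and theta (in degrees) are assumed precomputed via Eq. (3.20).
b = 8;                                  % orientations per histogram
ck = (0:b-1) * 360/b;                   % bin centre orientations c_k
Dk = 360/(2*b);                         % the constant Delta_k
h = zeros(1, b);
for k = 1:b
    w = 1 - abs(theta(:) - ck(k))/Dk;   % linear weighting of Eq. (3.21)
    w(w < 0) = 0;                       % only orientations near the bin centre count
    h(k) = sum(M(:) .* w);
end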
Finally, the vector is normalized to improve the invariance of the response to different transformations such as illumination, rotation and translation changes. The MATLAB® routine below computes the magnitude and orientation maps of Eq. (3.20) around each detected feature; filter_gaussian is the Gaussian filtering helper used with this listing.

function [A, Or] = SIFT_descriptor(img, features)
% A:  gradient magnitude maps, one layer per feature, Eq. (3.20)
% Or: gradient orientation maps, one layer per feature
[m, n] = size(img);
A  = zeros(m, n, length(features(:,1)));
Or = zeros(m, n, length(features(:,1)));
for k = 1:length(features(:,1))
    % number of filtering levels derived from the feature scale
    Lmax = floor(min(log(2*size(img)/12)/log(features(k,3))));
    for w1 = 1:Lmax
        if w1 == 1
            Lk = filter_gaussian(img, 7, .5);  % slightly filter bottom level
        end
        Lk = filter_gaussian(Lk, 7, 1.5);
    end
    Lk = double(Lk);
    xk = round(features(k,1));
    yk = round(features(k,2));
    for i = xk-7:xk+8                          % 16 x 16 neighbourhood
        for j = yk-7:yk+8
            deltax = Lk(i+1,j) - Lk(i-1,j);
            deltay = Lk(i,j+1) - Lk(i,j-1);
            if (deltax ~= 0) && (deltay ~= 0)
                A(i,j,k) = sqrt((Lk(i,j+1) - Lk(i,j-1))^2 + ...
                                (Lk(i+1,j) - Lk(i-1,j))^2);
                % quadrant-aware orientation (atan2-style)
                if (deltax > 0) && (deltay > 0)
                    Or(i,j,k) = atan(deltay/deltax);
                end
                if (deltax < 0) && (deltay > 0)
                    Or(i,j,k) = atan(deltay/deltax) + pi;
                end
                if (deltax