Lecture Notes in Electrical Engineering 720
Oleg Gusikhin Kurosh Madani Janan Zaytoon Editors
Informatics in Control, Automation and Robotics 16th International Conference, ICINCO 2019 Prague, Czech Republic, July 29–31, 2019, Revised Selected Papers
Lecture Notes in Electrical Engineering Volume 720
Series Editors Leopoldo Angrisani, Department of Electrical and Information Technologies Engineering, University of Napoli Federico II, Naples, Italy Marco Arteaga, Departament de Control y Robótica, Universidad Nacional Autónoma de México, Coyoacán, Mexico Bijaya Ketan Panigrahi, Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, India Samarjit Chakraborty, Fakultät für Elektrotechnik und Informationstechnik, TU München, Munich, Germany Jiming Chen, Zhejiang University, Hangzhou, Zhejiang, China Shanben Chen, Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai, China Tan Kay Chen, Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore Rüdiger Dillmann, Humanoids and Intelligent Systems Laboratory, Karlsruhe Institute for Technology, Karlsruhe, Germany Haibin Duan, Beijing University of Aeronautics and Astronautics, Beijing, China Gianluigi Ferrari, Università di Parma, Parma, Italy Manuel Ferre, Centre for Automation and Robotics CAR (UPM-CSIC), Universidad Politécnica de Madrid, Madrid, Spain Sandra Hirche, Department of Electrical Engineering and Information Science, Technische Universität München, Munich, Germany Faryar Jabbari, Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA, USA Limin Jia, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Alaa Khamis, German University in Egypt El Tagamoa El Khames, New Cairo City, Egypt Torsten Kroeger, Stanford University, Stanford, CA, USA Qilian Liang, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA Ferran Martin, Departament d’Enginyeria Electrònica, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain Tan Cher Ming, College of Engineering, Nanyang Technological University, Singapore, Singapore Wolfgang Minker, Institute of Information Technology, University of Ulm, Ulm, Germany Pradeep Misra, Department of Electrical Engineering, Wright State University, Dayton, OH, USA Sebastian Möller, Quality and Usability Laboratory, TU Berlin, Berlin, Germany Subhas Mukhopadhyay, School of Engineering & Advanced Technology, Massey University, Palmerston North, Manawatu-Wanganui, New Zealand Cun-Zheng Ning, Electrical Engineering, Arizona State University, Tempe, AZ, USA Toyoaki Nishida, Graduate School of Informatics, Kyoto University, Kyoto, Japan Federica Pascucci, Dipartimento di Ingegneria, Università degli Studi “Roma Tre”, Rome, Italy Yong Qin, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China Gan Woon Seng, School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore, Singapore Joachim Speidel, Institute of Telecommunications, Universität Stuttgart, Stuttgart, Germany Germano Veiga, Campus da FEUP, INESC Porto, Porto, Portugal Haitao Wu, Academy of Opto-electronics, Chinese Academy of Sciences, Beijing, China Junjie James Zhang, Charlotte, NC, USA
The book series Lecture Notes in Electrical Engineering (LNEE) publishes the latest developments in Electrical Engineering - quickly, informally and in high quality. While original research reported in proceedings and monographs has traditionally formed the core of LNEE, we also encourage authors to submit books devoted to supporting student education and professional training in the various fields and applications areas of electrical engineering. The series cover classical and emerging topics concerning:
• Communication Engineering, Information Theory and Networks
• Electronics Engineering and Microelectronics
• Signal, Image and Speech Processing
• Wireless and Mobile Communication
• Circuits and Systems
• Energy Systems, Power Electronics and Electrical Machines
• Electro-optical Engineering
• Instrumentation Engineering
• Avionics Engineering
• Control Systems
• Internet-of-Things and Cybersecurity
• Biomedical Devices, MEMS and NEMS
For general information about this book series, comments or suggestions, please contact leontina. [email protected]. To submit a proposal or request further information, please contact the Publishing Editor in your country:
China: Jasmine Dou, Associate Editor ([email protected])
India, Japan, Rest of Asia: Swati Meherishi, Executive Editor ([email protected])
Southeast Asia, Australia, New Zealand: Ramesh Nath Premnath, Editor ([email protected])
USA, Canada: Michael Luby, Senior Editor ([email protected])
All other countries: Leontina Di Cecco, Senior Editor ([email protected])
** Indexing: Indexed by Scopus. **
More information about this series at http://www.springer.com/series/7818
Oleg Gusikhin · Kurosh Madani · Janan Zaytoon
Editors
Informatics in Control, Automation and Robotics 16th International Conference, ICINCO 2019 Prague, Czech Republic, July 29–31, 2019, Revised Selected Papers
Editors Oleg Gusikhin Ford Research and Advanced Engineering Dearborn, MI, USA
Kurosh Madani University of Paris-EST Créteil (UPEC) Vitry-sur-Seine, France
Janan Zaytoon University of Reims Champagne-Ardenne Reims, France
ISSN 1876-1100 ISSN 1876-1119 (electronic) Lecture Notes in Electrical Engineering ISBN 978-3-030-63192-5 ISBN 978-3-030-63193-2 (eBook) https://doi.org/10.1007/978-3-030-63193-2 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The present book includes extended and revised versions of a set of selected papers from the 16th International Conference on Informatics in Control, Automation and Robotics (ICINCO 2019), held in Prague, Czech Republic, from July 29 to 31, 2019. The purpose of the 16th edition of this annual conference was to bring together researchers, engineers and practitioners interested in the application of informatics to control, automation and robotics. Four simultaneous tracks were held, covering intelligent control systems, optimization, robotics, automation, signal processing, sensors, systems modeling and control, and industrial informatics. Informatics applications are pervasive in many areas of control, automation and robotics.

ICINCO 2019 received 213 paper submissions from 44 countries, of which 5% were included in this book. The papers were selected by the event chairs, and their selection was based on a number of criteria that include the classifications and comments provided by the program committee members, the session chairs' assessment and also the program chairs' global view of all papers included in the technical program. The authors of selected papers were then invited to submit a revised and extended version of their papers containing at least 30% innovative material. The selection also aimed at a balanced representation of the four aforementioned conference tracks.

We would like to thank all the authors for their contributions and to express our gratitude to the reviewers who have helped to ensure the quality of this post-conference publication.

July 2019
Oleg Gusikhin Kurosh Madani Janan Zaytoon
Organization
Conference Co-chairs
Kurosh Madani: University of Paris-EST Créteil (UPEC), France
Janan Zaytoon: University of Reims Champagne-Ardenne, France
Program Chair
Oleg Gusikhin: Ford Motor Company, USA
Program Committee Samir Aberkane El-Houssaine Aghezzaf Eugenio Aguirre Carlos Aldana Mihail Antchev Rui Araujo T. Asokan M. Amine Atoui Alfonso Baños Guilherme Barreto Marcelo Becker Laxmidhar Behera Carsten Behn Faïz Ben Amar
Centre de Recherche en Automatique de Nancy (CRAN), France Ghent University, Faculty of Engineering and Architecture, Belgium University of Granada, Spain University of Guadalajara, Mexico Technical University, Sofia, Bulgaria University of Coimbra, Portugal Indian Institute of Technology Madras, India UBS, France Universidad de Murcia, Spain Universidade Federal do Ceará, Brazil Escola de Engenharia de São Carlos, Brazil Indian Institute of Technology Kanpur, India Schmalkalden University of Applied Sciences, Germany Université Pierre et Marie Curie, Institut Systèmes Intelligents et de Robotique, France
Lyes Benyoucef Karsten Berns Reinaldo Bianchi Mauro Birattari Jean-Louis Boimond Magnus Boman Glen Bright Marvin Bugeja Kevin Burn Kenneth Camilleri Angelo Cangelosi Giuseppe Carbone Alessandro Casavola Riccardo Cassinis Marco Castellani Jaime Cerda Kuo-Ming Chang Yuh-Min Chen Albert Cheng Sung-Bae Cho Ryszard Choras Paul Christodoulides Chengbin Chu Feng Chu Daisuke Chugo Carlos Coello James Conrad Erik Cuevas José Cunha Prithviraj Dasgupta Angel Del Pobil Kyriakos Deliparaschos Mingcong Deng Paolo Di Giamberardino Hans Doran António Dourado Vasile Dragan Venky Dubey
Aix-Marseille University, France University of Kaiserslautern, Germany Centro Universitario da FEI, Brazil Université Libre de Bruxelles, Belgium ISTIA, LARIS, France The Royal Institute of Technology, Sweden University of KwaZulu-Natal, South Africa University of Malta, Malta University of Sunderland, UK University of Malta, Malta University of Plymouth, UK Università della Calabria, Italy University of Calabria, Italy University of Brescia, Italy University of Birmingham, UK Universidad Michoacana de San Nicolas de Hidalgo, Mexico National Kaohsiung University of Applied Sciences, Taiwan, Republic of China National Cheng Kung University, Taiwan, Republic of China University of Houston, USA Yonsei University, Korea UTP University of Sciences and Technology, Poland Cyprus University of Technology, Cyprus ESIEE Paris, Université Paris-Est, France University of Evry Val d’Essonne, France Kwansei Gakuin University, Japan CINVESTAV-IPN, Mexico University of North Carolina at Charlotte, USA Universidad de Guadalajara, Mexico University of Trás-os-Montes and Alto Douro, Portugal University of Nebraska, USA Universitat Jaume I, Spain Cyprus University of Technology, Cyprus Tokyo University of Agriculture and Technology, Japan Sapienza University of Rome, Italy Zurich University of Applied Sciences, Switzerland University of Coimbra, Portugal Romanian Academy, Romania Bournemouth University, UK
Ioan Dumitrache Richard J. Duro Marc Ebner Mohammed El-Abd Simon Fabri Mohammed Fadali Christophe Farges Paolo Fiorini Juan Flores Mauro Franceschelli Georg Frey Toyomi Fujita Mauro Gaggero Iván Garcíia Daza Maria Garcia-Planas Paulo Gil Giuseppina Gini Arthur Gómez Amr Goneid Lucian Grigorie Lech Grzesiak Jan Haase Maki Habib Kensuke Harada Jennifer Harding Arturo Hernández - Aguirre Noelia Hernández Parra Yasuhisa Hirata Wladyslaw Homenda Jonathan How Yisheng Huang Daniela Iacoviello Gianluca Ippoliti Sarangapani Jagannathan Michael Jenkin Myong Jeong Wootae Jeong
University POLITEHNICA of Bucharest, Romania Universidade Da Coruña, Spain Ernst-Moritz-Arndt-Universität Greifswald, Germany American University of Kuwait, Kuwait University of Malta, Malta UNR, USA Bordeaux University, IMS – UMR 5218 CNRS, France University of Verona, Italy University of Michoacan, Mexico Università di Cagliari, Italy Chair of Automation and Energy Systems, Saarland University, Germany Tohoku Institute of Technology, Japan National Research Council of Italy, Italy University of Alcalá, Spain Universitat Politecnica de Catalunya, Spain Universidade NOVA de Lisboa, Portugal Politecnico di Milano, Italy Universidade do Vale do Rio dos Sinos, Brazil The American University in Cairo, Egypt Military Technical Academy “Ferdinand I”, Bucharest, Romania Warsaw University of Technology, Poland Helmut Schmidt University of the Federal Armed Forces Hamburg, Germany The American University in Cairo, Egypt Osaka University, Japan Loughborough University, UK Centre for Research in Mathematics, Mexico Universidad de Alcalá, Spain Tohoku University, Japan Warsaw University of Technology, Poland Massachusetts Institute of Technology, USA Nat. Ilan Univ., Taiwan, Republic of China Sapienza University of Rome, Italy Università Politecnica delle Marche, Italy Missouri University of Science and Technology, USA York University, Canada Rutgers University, USA Korea Railroad Research Institute, Korea, Republic of
Isabel Jesus Márcio José da Cunha Marc Jungers Fabrício Junqueira Michail Kalogiannakis Tohru Kawabe Rafael Kelly DaeEun Kim Waree Kongprawechnon Krzysztof Kozlowski Ondrej Krejcar Kolja Kühnlenz Nikolay Kuznetsov Sébastien Lahaye Jose Lastra Jimmy Lauber Dimitri Lefebvre Kauko Leiviskä Roland Lenain Fabrizio Leonardi Tsai-Yen Li Christos Liambas Gordon Lightbody Huei-Yung Lin Ángel Llamazares Antonio Lopes Sérgio Lopes Gonzalo Lopez-Nicolas Ping Lou Yu-Shen Lu Eric Lucet Martin Lukac Damian Lyons Danilo Maccio José Machado Douglas Macharet
ISEP/IPP - School of Engineering, Polytechnic Institute of Porto, Portugal Federal University of Uberlândia, Brazil CNRS, Université de Lorraine, France University of São Paulo (USP), Brazil University of Crete, Greece University of Tsukuba, Japan CICESE Research Center, Mexico Yonsei University, Korea Thammasat University, Thailand Poznan University of Technology, Poland University of Hradec Kralove, Czech Republic Coburg University of Applied Sciences and Arts, Germany Saint Petersburg State University, Russian Federation ISTIA, LARIS, France Tampere University of Technology, FAST Laboratory, Finland University of Valenciennes, France GREAH, Université Le Havre, France University of Oulu, Finland Irstea, France FEI University, Brazil National Chengchi University, Taiwan, Republic of China Aristotle University of Thessaloniki, Greece University College Cork, Ireland National Chung Cheng University, Taiwan, Republic of China Universidad de Alcalá, Spain University of Porto, Portugal University of Minho, Portugal Universidad de Zaragoza, Spain Wuhan University of Technology, China National Taiwan Normal University, Taiwan, Republic of China CEA - French Alternative Energies and Atomic Energy Commission, France Nazarbayev University, Kazakhstan Fordham University, USA National Research Council of Italy, Italy Institute of Engineering, Polytechnic of Porto, Portugal Universidade Federal De Minas Gerais, Brazil
Anthony Maciejewski Frederic Maire Om Malik Tariq Masood Seán McLoone Joaquim Melendez Nadhir Messai Konstantinos Michail Agnieszka Miguel Vladimir Mostyn Vassilis Moulianitis George Moustris Riccardo Muradore Cairo Nascimento Jr. Andreas Nearchou Juan J. Nieto Emmanuel Nuño Ortega Fernando Osorio Ju H. Park Igor Paromtchik Ignacio Parra-Alonso Pierre Payeur Dariusz Pazderski Qingjin Peng Fernando Pereira D. Pham Nirvana Popescu Raul Marin Prades Radu-Emil Precup José Ragot Oscar Reinoso Mihailo Ristic Juha Röning Hubert Roth António Ruano Jocelyn Sabatier Christophe Sabourin Priti Sajja Antonio Sala Abdel-Badeeh Salem Roberto Sanchis Llopis
Colorado State University, USA Queensland University of Technology, Australia University of Calgary, Canada University of Cambridge, UK Queen’s University Belfast, Ireland University of Girona, Spain CReSTIC, France Cyprus University of Technology, Cyprus Seattle University, USA VSB - Technical University of Ostrava, Czech Republic University of the Aegean, Greece National Technical University of Athens, Greece University of Verona, Italy Instituto Tecnológico de Aeronáutica, Brazil University of Patras, Greece University of Santiago de Compostela, Spain University of Guadalajara, Mexico USP - Universidade de Sao Paulo, Brazil Yeungnam University, Korea Inria, France University of Alcalá, Spain University of Ottawa, Canada Poznan University of Technology, Poland University of Manitoba, Canada University of Porto, Portugal University of Birmingham, UK University POLITEHNICA of Bucharest, Romania Jaume I University, Spain Politehnica University of Timisoara, Romania Centre de Recherche en Automatique de Nancy, France Miguel Hernandez University, Spain Imperial College London, UK University of Oulu, Finland University of Siegen, Germany University of Algarve, Portugal IMS Laboratory, Bordeaux University, France IUT Sénart, University Paris-Est Creteil (UPEC), France Sardar Patel University, India Universidad Politecnica de Valencia, Spain Ain Shams University, Egypt Universitat Jaume I, Spain
Jurek Sasiadek Daniel Schmidt Gerhard Schweitzer João Sequeira Roman Sergienko Madhavan Shanmugavel Jinhua She Michael Short Manuel Silva Vasile Sima Dan Simon Daniel Simon Andrzej Sluzek Burkhard Stadlmann Olaf Stursberg Raúl Suárez Les Sztandera Tianhao Tang Tomasz Tarczewski Abdelhamid Tayebi Daniel Thalmann Gui Tian Germano Torres Wa-Muzemba Tshibangu Avgoustos Tsinakos
Antonios Tsourdos Ali Turgut Bartlomiej Ufnalski Roberto Valenti Angel Valera Antonis Vardulakis César Vargas Rosales Alexandra Velasco Ramiro Velazquez Hugo Vieira Neto Arnoud Visser
Carleton University, Canada SEW-EURODRIVE GmbH, Germany ETH Zurich, Switzerland Instituto Superior Ténico, Portugal Gini GmbH, Germany SRM Institute of Science and Technology, India Tokyo University of Technology, Japan Teesside University, UK ISEP-Instituto Superior de Engenharia do Porto, Portugal National Institute for Research-Development in Informatics, Romania Cleveland State University, USA Inria, France Khalifa University, UAE Upper Austria University of Applied Sciences, Wels, Austria University of Kassel, Germany Universitat Politecnica de Catalunya (UPC), Spain Jefferson (Philadelphia University + Thomas Jefferson University), USA Shanghai Maritime University, China Nicolaus Copernicus University, Poland Lakehead University, Canada Ecole Polytechnique Federale de Lausanne, Switzerland Newcastle University, UK PS Solutions, Brazil Morgan State University, USA University of Kavala Institute of Technology/Athabasca University of Canada, Greece Cranfield University, UK Middle East Technical University, Turkey Warsaw University of Technology, Poland MathWorks, USA Universidad Politécnica de Valencia, Spain Aristotle University of Thessaloniki, Greece ITESM - Monterrey, Mexico Universidad Militar Nueva Granada, Colombia Universidad Panamericana, Mexico Federal University of Technology - Paraná, Brazil Universiteit van Amsterdam, Netherlands
Damir Vrancic Bernardo Wagner Mattias Wahde Axel Walthelm Feng Wan Jingyu Wang Long Wang James Whidborne Min Wu Farouk Yalaoui Qinmin Yang Weiwei Yu Marek Zaremba Janan Zaytoon Jie Zhang Primo Zingaretti Alois Zoitl
Jožef Stefan Institute, Slovenia Leibniz Universität Hannover, Germany Chalmers University of Technology, Sweden Siemens AG, Germany University of Macau, Macau Northwestern Polytechnical University (NPU), China Peking University, China Cranfield University, UK China University of Geosciences, China University of Technology of Troyes, France Zhejiang University, China Northwestern Polytechnical University, China Université du Québec (UQO), Canada University of Reims Champagne-Ardenne, France Newcastle University, UK Università Politecnica delle Marche, Italy Johannes Kepler University, Austria
Additional Reviewers
Vítor Barros: UFMG, Brazil
Cheng Guo: Colorado State, USA
Marcin Kielczewski: Poznan University of Technology, Poland
Adriano Siqueira: USP - EESC, Brazil
Invited Speakers
Dimitar Filev: Research & Advanced Engineering, Ford Motor Company, USA
Ali Zolghadri: University of Bordeaux, CNRS, France
Evangelos Theodorou: Georgia Institute of Technology, USA
Contents
Towards a PGD-Based Computational Vademecum for Robot Path Planning
Nicolás Montés, Francisco Chinesta, Antonio Falcó, Marta C. Mora, Lucia Hilario, Enrique Nadal, and Jean Louis Duval . . . 1

Optimal Reachability and Grasping for a Soft Manipulator
Simone Cacace, Anna Chiara Lai, and Paola Loreti . . . 16

On Local Observer Design for LQR Problems with Tracking
Paolo Di Giamberardino and Daniela Iacoviello . . . 35

6-DoF Extrinsic Calibration Between a Manipulator and a 2D Laser Scanner
Jan Alberts, Sebastian P. Kleinschmidt, and Bernardo Wagner . . . 61

Simultaneous Control of Two Robotics Arms Sharing Workspace via Visual Predictive Control
E. Le Flécher, A. Durand-Petiteville, V. Cadenat, and T. Sentenac . . . 79

Advanced Algorithm for Interpolation with Wendland Functions
Hjortur Bjornsson and Sigurdur Hafstein . . . 99

An Algorithm for Identifying High-Rank Singularities of Arbitrary Redundant Robots
Ahmad A. Almarkhi . . . 118

Adaptive Interaction Control of a Very Flexible Parallel Robot Manipulator
Fatemeh Ansarieshlaghi and Peter Eberhard . . . 133

Bayesian Transfer Learning Between Uniformly Modelled Bayesian Filters
Ladislav Jirsa, Lenka Kuklišová Pavelková, and Anthony Quinn . . . 151

Multi-robot Cooperation for Assembly: Automated Planning and Optimization
Ludwig Nägele, Andreas Schierl, Alwin Hoffmann, and Wolfgang Reif . . . 169

Author Index . . . 193
Towards a PGD-Based Computational Vademecum for Robot Path Planning

Nicolás Montés1(B), Francisco Chinesta2, Antonio Falcó1, Marta C. Mora3, Lucia Hilario1, Enrique Nadal4, and Jean Louis Duval5

1 Department of Mathematics, Physics and Technological Sciences, University CEU Cardenal Herrera, C/San Bartolomé, 55, Alfara del Patriarca, Valencia, Spain
{nicolas.montes,afalco,luciah}@uchceu.es
2 PIMM, ENSAM, ParisTech, ESI GROUP, Chair on Advanced Modeling and Simulation of Manufacturing Processes, Paris, France
[email protected]
3 Department of Mechanical Engineering and Construction, Universitat Jaume I, Castellón de la Plana, Castellón, Spain
[email protected]
4 Centro de Investigacion en Ingenieria Mecanica, Universitat Politecnica de Valencia, Valencia, Spain
[email protected]
5 ESI Group, Rungis Cedex, France
[email protected]

Abstract. The present paper describes the use of the technique known as PGD-Vademecum in potential-based path planning for mobile robots. The basic idea of the method is to obtain a Vademecum with all the possible robot paths from any start to any goal derived from a harmonic potential field in a predefined map, including any possible dynamic obstacle positions. The PGD is a numerical technique that offers three important advantages over other methods. First, the ability to calculate all the possible Poisson equation solutions for all start, goal and dynamic obstacle combinations in a map, ensuring that the resulting potential field does not have local minima. Second, the PGD-Vademecum can be expressed as a sum of uncoupled multiplied terms: the geometric map and the start and goal configurations. Consequently, the harmonic potential field for any start and goal robot positions can be reconstructed very fast, in an almost negligible computational time, allowing real-time path planning. Third, just a few uncoupled parameters are required to reconstruct the potential field with a minimum discretization error. The abilities of this technique are demonstrated by means of simulation results.

Keywords: Model order reduction techniques · PGD · Path planning · Robot potential field · Laplace equation

1 Introduction
An essential task in robotics is to guide the robot safely from a start to a goal configuration among a set of obstacles. To this aim, a collision-free path must
be calculated, which implies a computationally hard geometric path planning unfeasible in real-time (RT) applications [27]. In the literature, this problem is known as the motion planning problem or the piano mover’s problem and its complexity has motivated a lot of research in the field of robot path planning. Some researchers have studied subproblems of the general approach [1]. Other works have considered simplified planning paradigms under some realistic assumptions such as, for instance, sampling-based planners, grid-based searches, interval-based searches, geometric algorithms, etc. [1]. One of the most frequently used algorithm for robot path planning is the Artificial Potential Field method (APF), [28]. This technique defines an artificial potential field in the configuration space (C-space) that generates a path from a start to a goal configuration. This method is very fast for real-time applications, being its main drawback the possibility for the robot to be trapped in a local minimum of the potential function. This problem can be addressed using harmonic functions in the generation of the potential field [2]. Harmonic functions satisfy the Laplace equation in the C-space and completely remove the local minima as they comply with the Min-Max principle [3]. They were initially proposed in [4] and used for robot path planning in [5,6,8,10,14,19,21]. The main problem of this technique, tackled by [14], is that the solution must be numerically computed in a discrete mesh and, therefore, the computational cost increases exponentially with the mesh resolution. In the study [22] where a potential field is derived from a harmonic function in a scanned environment composed of 1500 triangles, the computational cost was 19.2 seconds in a Dell Latitude laptop with an Intel Core 2 Processor, 2 GB of RAM, which is not feasible for real-time applications. Lately, in the field of advanced computational methods, a novel numerical alternative called Proper Generalized Decomposition (PGD) has appeared with the aim of computing the Laplace equation. It is a totally new approach for solving classic high-complexity problems [23,24]. Many complex problems can be efficiently converted into a multidimensional framework, which opens up new possibilities to solve old and new problems with approaches not foreseen until now. In a PGD framework, the model is solved only once to obtain a general solution that includes all the solutions for every value of the parameters, that is, a Computational Vademecum. The goal of this work is to develop the calculation of a PGD-based computational Vademecum (PGD-Vademecum for short) for the use of the potential flow theory based on harmonic functions in two real-time path planning applications. In our previous work [28], the use of a PGD-Vademecum in a potential-based robot path planner was presented for the first time but only taking into account the start and goal robot positions as extra parameters. The present paper extends the previous work including dynamic obstacles also as extra parameters. This paper is structured as follows. Section 2 describes the potential flow theory and derives a harmonic function from the Laplace equation. In Sect. 3, a particular PGD-Vademecum for robot path planning is obtained, with start and goal positions included as parameters. Section 4 provides the guidelines for the
introduction of dynamic obstacles as extra parameters in the PGD formulation. Section 5 provides simulation examples with and without dynamic obstacles. Finally, Sect. 6 draws conclusions and future works.
2 Previous Knowledge

2.1 Potential Flow Theory for Robot Path Planning
In recent years, the potential flow theory has been the basis of many path planning techniques [5,6,8,10,14,19,21,22], mainly focused on the resolution of the Laplace equation. Let us first outline the mathematical model that describes the flow of an inviscid incompressible fluid. If a steady state irrotational flow in the Eulerian framework is assumed, the velocity υ meets the relation

∇ × υ = 0,    (1)
and, therefore, the velocity is the gradient of a scalar potential function, i.e. υ = ∇u. Hence, the potential u is a solution of the Laplace equation: Δ u = 0.
(2)
Using a 2.5D mould filling model similar to [25] it is possible to insert a localized fluid source (respectively, sink) modelled by a Dirac term δS (respectively, −δT ) added to the right hand side of (2). For this purpose, let us assume a unit amount of fluid injected at point S during a unit of time and the same unit withdrawn at point T . The velocity of the fluid is then the solution of the Poisson equation, that includes the source term f = δS − δT as: − Δu = δS − δT .
(3)
Equation (3) must be complemented by appropriate boundary conditions. In this sense, the fluid cannot flow through the boundaries, a condition expressed by υ · n = 0 (n being a vector normal to the boundary Γ). On Γ the velocity must verify

−∇u · n = 0,    (4)

which is the usual Neumann boundary condition, expressed by

∂u/∂n |_Γ = 0.    (5)
The resolution of the Poisson equation under these conditions produces a potential field from the Starting point S (source) to the Target point T (sink), without deadlocks [22].
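As an illustration of the boundary value problem (3)-(5), and only as an addition to this text (it is not part of the original work), the Poisson equation with a smoothed source/sink pair and zero-flux boundaries can be solved directly on a small grid with finite differences. The grid size, the Gaussian bumps used in place of the Dirac terms and the pinning of one node (needed to fix the additive constant of a pure Neumann problem) are choices made here for illustration only.

```python
import numpy as np

def harmonic_potential(n=41, h=1.0 / 40, start=(0.2, 0.8), target=(0.8, 0.2), r=0.1):
    """Finite-difference solution of -Laplace(u) = f on a square grid with
    homogeneous Neumann boundary conditions. The source term f is a Gaussian
    bump at the start point minus a Gaussian bump at the target point
    (a smooth stand-in for the Dirac terms delta_S - delta_T)."""
    x = np.linspace(0.0, 1.0, n)
    X, Y = np.meshgrid(x, x, indexing="ij")
    gauss = lambda c: np.exp(-((X - c[0])**2 + (Y - c[1])**2) / (2 * r**2))
    f = gauss(start) - gauss(target)
    f -= f.mean()                        # compatibility with pure Neumann data

    N = n * n
    A = np.zeros((N, N))
    idx = lambda i, j: i * n + j
    for i in range(n):
        for j in range(n):
            k = idx(i, j)
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < n and 0 <= jj < n:      # existing neighbour
                    A[k, k] += 1.0 / h**2
                    A[k, idx(ii, jj)] -= 1.0 / h**2
                # a missing neighbour corresponds to a zero-flux (Neumann) face
    A[0, :] = 0.0; A[0, 0] = 1.0          # pin one node to fix the additive constant
    b = f.reshape(-1).copy(); b[0] = 0.0
    u = np.linalg.solve(A, b).reshape(n, n)
    return X, Y, u

X, Y, u = harmonic_potential()
print(u.min(), u.max())
```

Such a direct solve for every new start/target pair is exactly what the PGD-Vademecum of Sect. 3 avoids repeating on-line.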
2.2 PGD Solution of the Poisson Equation
Consider the two dimensional Poisson equation Δu(x, y) = f (x, y)
(6)
over a two-dimensional rectangular domain Ω_X = Ω_x × Ω_y ⊂ R² with boundary condition ∂u/∂n |_Γ = q. For all suitable test functions u*, the weighted residual form reads

∫_{Ω_X} u* · (Δu − f) dΩ_X = 0.    (7)

The classical way of accounting for Neumann conditions is to integrate by parts the weighted residual form and implement the flux condition as a so-called natural boundary condition:

∫_{Ω_X} ∇u* · ∇u dΩ_X = ∫_{Ω_X} u* · f dΩ_X − ∫_{Ω_X} u*(x, y = Γ) · q dΩ_X.    (8)
Our goal is to obtain a PGD approximate solution to (8) in the separated form

u(Ω_X) = Σ_{i=1}^{N} X_i(x) · Y_i(y).    (9)
We shall do so by computing each term of the expansion one at a time, thus enriching the PGD approximation until a suitable convergence criterion is satisfied. At each enrichment step n (n ≥ 1), we have already computed the n − 1 first terms of the PGD approximation (9):

u^{n−1}(Ω_X) = Σ_{i=1}^{n−1} X_i(x) · Y_i(y).    (10)
We now wish to compute the next term X_n(x) · Y_n(y) to obtain the enriched PGD solution

u^n(x, y) = u^{n−1}(x, y) + X_n(x) · Y_n(y) = Σ_{i=1}^{n−1} X_i(x) · Y_i(y) + X_n(x) · Y_n(y).    (11)
Both functions X_n(x) and Y_n(y) are unknown at the current enrichment step n and an alternative iterative scheme is applied. We use index p to denote a particular iteration:

u^{n,p}(x, y) = u^{n−1}(x, y) + X_n^p(x) · Y_n^p(y).    (12)
The simplest one is an alternating direction strategy that computes X_n^p(x) from Y_n^{p−1}(y) and then Y_n^p(y) from X_n^p(x). An arbitrary initial guess Y_n^0(y) is specified to start the iterative process. The non-linear iterations proceed until reaching a fixed point within a user-specified tolerance, see [23,24]. The convergence of the above procedure to the weak solution of (6) is proved in [26].
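The enrichment loop just described can be sketched as follows. This is an illustrative implementation written for this text, not the authors' solver: it assumes homogeneous Dirichlet data instead of the Neumann condition of (8), a source already given in separated form, and lumped 1D mass and stiffness matrices.

```python
import numpy as np

def pgd_poisson_2d(fx_modes, fy_modes, n=60, n_terms=15, n_fp=25, tol=1e-8):
    """Greedy PGD enrichment with an alternating-direction fixed point for
    -Laplace(u) = f on the unit square, homogeneous Dirichlet data, with the
    source given in separated form f(x, y) = sum_j fx_j(x) * fy_j(y).
    Returns lists of 1D modes X_i, Y_i such that u ~ sum_i X_i(x) Y_i(y)."""
    h = 1.0 / (n + 1)
    K = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h          # 1D stiffness (Dirichlet)
    M = h * np.eye(n)                                # lumped 1D mass matrix
    Xs, Ys = [], []
    for _ in range(n_terms):
        R, S = np.random.rand(n), np.random.rand(n)
        for _ in range(n_fp):
            R_old = R.copy()
            # solve for R(x) with S(y) frozen
            rhs = sum((S @ M @ fy) * (M @ fx) for fx, fy in zip(fx_modes, fy_modes))
            rhs -= sum((S @ M @ Y) * (K @ X) + (S @ K @ Y) * (M @ X)
                       for X, Y in zip(Xs, Ys))
            R = np.linalg.solve((S @ M @ S) * K + (S @ K @ S) * M, rhs)
            # solve for S(y) with R(x) frozen
            rhs = sum((R @ M @ fx) * (M @ fy) for fx, fy in zip(fx_modes, fy_modes))
            rhs -= sum((R @ M @ X) * (K @ Y) + (R @ K @ X) * (M @ Y)
                       for X, Y in zip(Xs, Ys))
            S = np.linalg.solve((R @ M @ R) * K + (R @ K @ R) * M, rhs)
            if np.linalg.norm(R - R_old) < tol * np.linalg.norm(R):
                break
        Xs.append(R); Ys.append(S)
    return Xs, Ys

# example source: f(x, y) = sin(pi x) * sin(pi y), a single separated mode
xg = np.linspace(0.0, 1.0, 62)[1:-1]
Xs, Ys = pgd_poisson_2d([np.sin(np.pi * xg)], [np.sin(np.pi * xg)])
```

Each greedy term requires only one-dimensional solves, which is what makes the separated representation (9) attractive.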
3 PGD-Vademecum for Robot Path Planning in a 2D Environment
The preceding section has presented a simple example application of the resolution of the Poisson equation using PGD in a case where the 2D space is decomposed in X and Y. [23,24] demonstrate that parameters in a model can be set as additional coordinates when using the PGD approach. In the following sections, a path planning example is presented where these additional coordinates are all the possible combinations of the start and target positions and can be included in the source term of the Poisson equation (6).

3.1 Source Term Definition
First of all, it is necessary to assume that a constant source term f in Eq. (6) is actually a non-uniform source term f(Ω_X, Ω_S, Ω_T), where Ω_X = Ω_x × Ω_y, Ω_S = Ω_r × Ω_s and Ω_T = Ω_r × Ω_t. In this definition, the start and target points S and T are defined by means of Gaussian models with mean and variance: S = (s, r) and T = (t, r), respectively. In these models, s and t are the mean values located in specific points X = (x, y) in each separated space Ω_S, Ω_T and r is the variance. Gaussian models are used instead of Delta Dirac models because they provide much better results in a PGD-Vademecum than Delta Dirac models, as explained in [24]. In order to define the source term, the next two matrices must be constructed first:

f(X, S) = [ f(x_1, s_1) ... f(x_1, s_N) ]
          [     ...     ...     ...     ]
          [ f(x_N, s_1) ... f(x_N, s_N) ]
                                              (13)
f(X, T) = [ f(x_1, t_1) ... f(x_1, t_N) ]
          [     ...     ...     ...     ]
          [ f(x_N, t_1) ... f(x_N, t_N) ]
Applying the Singular Value Decomposition (SVD) method to these matrices, the result is the decomposition of the source term in the form:

f(X, S) = Σ_{j=1}^{F} α_j^S · F_j^S(X) · G_j^S(S),
g(X, T) = Σ_{j=1}^{F} α_j^T · F_j^T(X) · G_j^T(T).    (14)
Thus, the Poisson equation to be solved results in the form: Δu(x, y) = f (X, S) + g(X, T )
(15)
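A minimal sketch of the separation step (13)-(14) is given below; it is an addition to this text, it assumes for simplicity a single spatial coordinate paired with a single source coordinate (the full construction works on the 2D spaces Ω_X and Ω_S), and the grid, the variance and the truncation threshold are arbitrary choices.

```python
import numpy as np

# Illustrative separation of the source term (13)-(14) along one coordinate pair.
n = 50
x = np.linspace(0.0, 5.0, n)                 # spatial nodes
s = np.linspace(0.0, 5.0, n)                 # candidate source locations
r = 1.2
F = np.exp(-(x[:, None] - s[None, :])**2 / (2 * r**2))   # matrix f(x_i, s_j)

U, sing, Vt = np.linalg.svd(F, full_matrices=False)
keep = sing > 1e-8 * sing[0]                 # discard negligible singular values
alpha, FX, GS = sing[keep], U[:, keep], Vt[keep, :]

# rank-F separated representation: f(x, s) ~ sum_j alpha_j * FX[:, j] * GS[j, :]
F_sep = (FX * alpha) @ GS
print("number of terms:", int(keep.sum()), " max error:", np.abs(F - F_sep).max())
```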
3.2 Calculation of the PGD-Vademecum
For all suitable test functions u*, the weighted residual form reads

∫_{Ω_{X,S,T}} u* · (Δu − f) dΩ_{X,S,T} = 0,    (16)

where f is in the form obtained in (15):

f = f(X, S) + g(X, T).    (17)

Now, Eq. (8) reads

∫_{Ω_{X,S,T}} ∇u* · ∇u dΩ_{X,S,T} = ∫_{Ω_{X,S,T}} u* · f dΩ_{X,S,T} − ∫_{Ω_{X,S,T}} u*(x, y = Γ) · q dΩ_{X,S,T},    (18)
N
Ri (X) · Wi (S) · Ki (T )
(19)
i=1
From now on, it will be assumed that the common variance r takes a fixed value and the construction of the Vademecum will be obtained considering the solution of (15) for that value r and all the values of (X = (x, y); S = (s1 , s2 ); T = (t1 , t2 )) ∈ ΩX × ΩS × ΩT . Then, the PGD-Vademecum solution is constructed considering that un−1 (X, S, T ) =
n−1
Ri (X) · Wi (S) · Ki (T )
(20)
i=1
where the enrichment step is given by un = un−1 + R(X) · W (S) · K(T ).
(21)
The key point is to find a rank-one function in the form R(X) · W (S) · K(T ) = R1 (x) · R2 (y) · W1 (s1 ) · W2 (s2 ) · K1 (t1 ) · K2 (t2 ) which satisfies ΩX ×ΩS ×ΩT
u∗ · (Δun − f ) dΩX,S,T = 0
(22)
Towards a PGD-Based Computational Vademecum for Robot Path Planning
7
for all u∗ in the linear space of functions R(X) · W (S) · K ∗ (T ) + R(X) · W ∗ (S) · K(T ) + R∗ (X) · W (S) · K(T ), where K ∗ (T ) is orthogonal to K(T ),W ∗ (S) is orthogonal to W (S) and R∗ (X) is orthogonal to R(X). In the appendix an alternating direction algorithm is provided to construct the separated representation. A prior step implies the discretization of (20) by means of the Finite-Element Method (FEM) with Nx · Ny + Ns1 · Ns2 + Nt1 · Nt2 degrees of freedom and assuming that N = Nx = Ny = Ns1 = Ns2 = Nt1 = Nt2 . Therefore, at each iteration step, the algorithm computes the 3N 2 terms of the rank-one update (22) and, after n iterations, the PGD approximation of the solution u is given by (21). 3.3
A Numerical Example
With the aim of testing the advantages of the PGD framework, a simple example is developed in this subsection. Let us consider a domain ΩX of 5m × 5m square. Let us select a discretization of the domain using Nx = Ny = 50 nodes on each side, that is, 2500 degrees of freedom and the variance r set to 1.2. Figure 1 shows the resulting PGD reconstruction for n = 200 terms. The computational cost of the PGD reconstruction is 0.0101 s in a Mac with an Intel Core 2 Duo Processor (3.06 GHz) and 4 GB RAM. It is worth noting the difference between this negligible value and the cost of a FEM approximation to solve a standard linear system, where the computational cost increases to 4.7 s. 3.4
Error of the PGD Approximation
The error of the PGD approximation versus the number of terms used can be measured by means of different techniques. A very appropriate error estimator in this case is the L2 (ΩX × ΩS × ΩT )-residual R(n), that can be obtained inserting the PGD-Vademecum approximation in the Poisson equation, resulting in R(n) = (Δun − f ) · (Δun − f ) dΩX,S,T (23) ΩX ×ΩS ×ΩT
Figure 2 shows one of the most important features of the PGD: the relevant information is stored in the first terms of the approximation.
8
N. Mont´es et al.
Fig. 1. PGD reconstruction of the potential field for given start and goal positions: (up) potential field, (down) streamlines.
Towards a PGD-Based Computational Vademecum for Robot Path Planning
9
Fig. 2. Residual error versus number of PGD terms, [28].
3.5
Calculation of Streamlines
As described in Sect. 1, the use of harmonic functions solve the local minima problem present in potential field-based techniques. The harmonic functions used in this work are based on flow dynamics and are described by the Poisson equation, where the potential field is free of deadlocks and produces a set of flow lines [5,6,8,10,14,19,21,22]. These streamlines are time-independent and indicate the motion direction of a massless fluid element traveling from an initial to a final position and following a velocity field obtained from the gradient of the potential field, as modeled by Eq. (24). vx =
du du , vy = − dx dy
(24)
Then, from the velocity field, any interpolation technique can be used to compute the streamlines as, for instance, linear, cubic, splines, etc. Figure 1(down) displays the streamlines that result from using linear interpolation for the PGD reconstruction of the potential field showed in Fig. 1(up).
4
Towards a Potential-Based Robot Path Planning with Dynamic Obstacles Using the PGD-Vademecum
Let us consider a dynamic environment with dynamic obstacles. The mobile robot needs to avoid each obstacle in real time. This situation implies that the domain changes because the path of the mobile robot must change in order to
10
N. Mont´es et al.
Fig. 3. Simulation results (4, 1), (3, 4), (2, 1), (4, 3).
from
[28]:
the
robot
visits
the
targets
T
=
Towards a PGD-Based Computational Vademecum for Robot Path Planning
11
Fig. 4. Simulation results: the robot visits two targets avoiding an obstacle in the environment.
12
N. Mont´es et al.
Fig. 5. Simulation results: modification of streamlines by a dynamic obstacle.
avoid the dynamic obstacle. For that reason, we introduce K(x, p) to change the domain and now the equation to solve is: (K(X, p) u) = f,
(25)
Towards a PGD-Based Computational Vademecum for Robot Path Planning
13
where K(X, p) =
N
K(X)K p (p)
(26)
i=1
and p is the parameter associated to the obstacle. The PGD-solution is expressed as follows: N u(X, p) = Xi (X)Pi (p) (27) i=1
5
Simulation Results
Some real-time simulations in Matlab have been developed with the aim of testing the advantages offered by the PGD-Vademecum approach. An omnidirectional mobile robot has been modelled that navigates in a 5 × 5 m square environment guided by a harmonic potential field approximated by the PGD with 50 × 50 nodes, r = 0.7 and n = 200. For a realistic implementation, only a small Region Of Interest (ROI) is reconstructed in each algorithm execution. The ROI is composed of the surrounding nodes of the current robot position and its size depends on the maximum robot velocity. In the following examples, for specific Start and Target configurations, the robot selects the shortest streamline, which is a straight line heading to the Target. Figure 3 depicts different trajectories followed by the robot from the Start point S = (1, 4) to subsequent Target points T = (4, 1), (3, 4), (2, 1), (4, 3). A second example is also presented in the same 5 × 5 m square environment, with the same potential function and parameters. In this occasion, a square static obstacle is located in the center of the environment and the robot visits two consecutive targets or goals. For a specific Start and Target configuration, it can be seen that the robot selects the shortest streamline heading to the Target and avoiding the obstacle. Figure 4 depicts the simulation results. Finally, Fig. 5 displays the modification of the streamlines of the potential field due to the variation in the location of the dynamic obstacle at different instants of time. Thus, the shortest path to the goal is different at every time instant.
6
Conclusions and Future Work
The present paper develops the application of the numerical technique known as PGD-Vademecum in the potential-based global path planning problem for mobile robots. This method generates a sort of Computational Vademecum containing all the possible robot paths for any Start-Target-Obstacle combinations in a predefined map. This PGD-Vademecum must be computed off-line and stored in the robot memory in order to be reconstructed on-line for any particular combination of start, target and obstacle configurations. It is extremely fast for real-time applications as its formulation results in a simple sum of products.
Additionally, during navigation, only the surrounding nodes of the robot location must be reconstructed in every algorithm execution. As a result, the computational costs in real-time are very low and almost negligible. The obtained robot paths are based on the Laplace/Poisson equation and, thus, are free of deadlocks when the variance is properly adjusted in the PGD approximated solution. This property makes the proposed approach a promising method to solve the piano mover’s problem in robotics. Acknowledgements. This paper has been supported by the project GVA/2019/124 from Generalitat Valenciana.
References 1. Khatib, S.: Handbook of Robotics. Springer (2008). https://doi.org/10.1007/9783-319-32552-1 1 2. Canny, J.F.: The Complexity of Robot Motion Planning. MIT Press, Cambridge (1988). https://doi.org/10.1017/S0263574700000151 3. Rimon, E., Koditschek, D.: Exact robot navigation using artificial potential functions. IEEE Trans. Robot. Autom. 8(5), 501–518 (1992). https://doi.org/10.1109/ 70.163777 4. Zachmanoglou, E., Thoe, D.W.: Introduction to Partial Differential Equations with Applications. Dover Publications, Inc., (1986). ISBN 0486652513 5. Connolly, C.I., Burns, J.B., Weiss, R.: Path planning using Laplace’s equation. In: Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2102–2106 (1990). https://doi.org/10.1109/ROBOT.1990.126315 6. Kim, J., Khosla, P.: Real-time obstacle avoidance using harmonic potencial functions. IEEE Trans. Robot. Autom. 8(3), 338–349 (1992). https://doi.org/10.1109/ 70.143352 7. Akishita, S., Kawamura, S., Hayashi, K.: New navigation function utilizing hydrodynamic potential for mobile robot. In: Proceedings of the IEEE International Workshop on Intelligent Motion Control, vol. 2, pp. 413–417 (1990). https://doi. org/10.1109/IMC.1990.687354 8. Akishita, S., Hisanobu, T., Kawamura, S.: Fast path planning available for moving obstacle avoidance by use of Laplace potential. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 1, pp. 673 –678 (1993). https://doi.org/10.1109/IROS.1993.583188 9. Gulder, J., Utkin, V.I.: Sliding mode control for an obstacle avoidance strategy based on a harmonic potential field. In: IEEE International Conference on Decision and Control, pp. 424–429 (1993). https://doi.org/10.1109/CDC.1993.325112 10. Guldner, J., Utkin, V.I., Hashimoto, H.: Robot obstacle avoidance in n-dimensional space using planar harmonic artificial fields. J. Dyn. Syst. Measur. Control 119, 160–166 (1997). https://doi.org/10.1115/1.2801228 11. Keymeulen, D., Decuyper, J.: The fluid dynamics applied to mobile robot motion: the stream field method. In: Proceedings of the IEEE International Conference on Robotics and Automation, pp. 378–385 (1994). https://doi.org/10.1109/ROBOT. 1994.351266 12. Tarassenko, I., Blake, A.: Analogue computation of collision-free paths. In: Proceedings of IEEE International Conference on Robotics Automation, pp. 540–545 (1991). https://doi.org/10.1109/ROBOT.1991.131636
13. Sato, K.: Deadlock-free motion planning using the Laplace potential field. Adv. Robot. 7(5), 449–461 (1993). https://doi.org/10.1163/156855393X00285 14. Waido, S.: Vehicle motion planning using stream functions. In: Proceedings of IEEE International Conference on Robotics and Automation, vol. 2, pp. 2484– 2491 (2003). https://doi.org/10.1109/ROBOT.2003.1241966 15. Sullivan, J., Waydo, S., Campbell, M.: Using stream functions for complex behavior and path generation. In: AIAA Guidance, Navigation, and Control Conference (2003). https://doi.org/10.2514/6.2003-5800 16. Barraquand, J., Latombe, J.C.: Robot motion planning: a distributed representation approach. Report no. STAN-CS-89-1257, Department of Computer Science, Standford University (1989). https://doi.org/10.1177/027836499101000604 17. Janglov, D.: Collision-free path of autonomous mobile vehicle. In: Proceedings of the International Symposium on Measurement and Control in Robotics (1995). ISBN 8022707600 18. Connolly, C.I., Grupen, R.: The application of harmonic functions to robotics. J. Robot. Syst. 10(7), 931–946 (1993). https://doi.org/10.1002/rob.4620100704 19. Rosell, J., Iniguez, P.: A hierarchical and dynamic method to compute harmonic functions for constrained motion planning. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 3, pp. 2335–2340 (2002). https://doi.org/10.1109/IRDS.2002.1041616 20. Li, Z.X., Bui, T.D.: Robot path planning using fluid model. J. Intell. Robot. Syst. 21, 29–50 (1998). https://doi.org/10.1023/A:100796340 21. Saudi, A., Sulaiman, J.: Path planing for mobile robots using 4EGSOR via ninepoint Laplacian (4EGSOR9L) iterative method. Int. J. Comput. Appl. 53(16), 38–42 (2012). https://doi.org/10.5120/8509-2568 22. Gingras, D., Dupuis, E., Payre, G., Lafontaine, J.: Path planning based on fluid mechanics for mobile robots used unstructured terrain models. In: IEEE International Conference on Robotics and Automation, Anchorage, Alaska, USA (2010). https://doi.org/10.1109/ROBOT.2010.5509679 23. Chinesta, F., Keunings, R., Leygue, A.: The Proper Generalized Decomposition for Advanced Numerical Simulations. Springer Briefs in Applied Science and Technology (2014). ISBN 978-3-319-02865-1 24. Chinesta, F., Leygue, A., Bordeu, F., Aguado, J.V., Cueto, E., Gonzalez, D., Alfaro, I., Ammar, A., Huerta, A.: PGD-based computational vademecum for efficient design, optimization and control. Arch. Comput. Methods Eng. 20(1), 31–49 (2013). https://doi.org/10.1007/s11831-013-9080-x 25. Domenech, L., Falc´ o, A., Garc´ıa, V., S´ anchez, F.: Towards a 2.5D geometric model in mold filling simulation. J. Comput. Appl. Math. 291, 183–196 (2016). https:// doi.org/10.1016/j.cam.2015.02.043 26. Falc´ o, A., Nouy, A.: Proper generalized decomposition for nonlinear convex problems in tensor Banach spaces. Numer. Math. 121(3), 503–530 (2012). https://doi. org/10.1007/s00211-011-0437-5 27. Khatib, O.: Real-time obstacle avoidance for manipulators and mobile robots. Int. J. Robot. Res. 5(1), 90–98 (1986). https://doi.org/10.1109/ROBOT.1985.1087247 28. Mont´es, N., Chinesta, F., Falc´ o, A., Mora, M.C., Hilario, L., Nadal, E., Duval, J.L.: A PGD-based method for robot global path planning: a primer. In: Proceedings of the 16th International Conference on Informatics in Control, Automation and Robotics (ICINCO 2019), pp. 31–39 (2019). https://doi.org/10.5220/ 0007809000310039
Optimal Reachability and Grasping for a Soft Manipulator

Simone Cacace1, Anna Chiara Lai2(B), and Paola Loreti2

1 Dipartimento di Matematica e Fisica, Università degli studi Roma Tre, Rome, Italy
[email protected]
2 Dipartimento di Scienze di Base e Applicate per l'Ingegneria, Sapienza Università di Roma, Rome, Italy
{anna.lai,paola.loreti}@sbai.uniroma1.it

Abstract. We investigate optimal reachability and grasping problems for a planar soft manipulator, from both a theoretical and numerical point of view. The underlying control model describes the evolution of the symmetry axis of the device, which is subject to inextensibility and curvature constraints, a bending moment and a curvature control. Optimal control strategies are characterized with tools coming from the optimal control theory of PDEs. We run some numerical tests in order to validate the model and to synthesize optimal control strategies.

Keywords: Soft manipulators · Octopus arm · Control strategies · Reachability · Motion planning · Grasping optimization

1 Introduction
In this paper we investigate reachability and grasping problems for planar soft manipulators in the framework of optimal control theory. The manipulator we are modeling is a three-dimensional body with an axial symmetry, non-uniform thickness and a fixed endpoint. The device is characterized by a structural resistance to bending, the bending moment, and its soft material allows for elastic distortions only below a fixed threshold, modelled via a bending constraint. Moreover, the bending can be enforced by a pointwise bending control, an internal angular force representing the control term of the system. Finally, we assume an inextensibility constraint on the manipulator: its structure does not allow for longitudinal stretching. Using the key morphological assumption of axial symmetry, we restrict our investigation to the evolution of the symmetry axis of the manipulator, modeled as a planar curve q(s, t) : [0, 1] × [0, T] → R², where s is an arclength coordinate and T > 0 is the final time. Physically, q is modelled as an inextensible beam, whose mass represents the mass of the whole manipulator. The evolution is determined by suitable reaction and friction forces, encompassed in a nonlinear, fourth order system of evolutive, controlled PDEs, generalizing the Euler-Bernoulli beam equation. In particular, the aforementioned bending
constraints and controls of the manipulator on its symmetry axis are encoded as curvature constraints and controls, enforced by angular reaction forces. The optimal control problems that we address are formulated as a constrained minimization of appropriate cost functionals, involving a quadratic cost on the controls, see (7), (10) for the reachability problem and (14) for the grasping problem. The control model underlying our study and a first investigation of the related optimal reachability strategies were originally developed in [1], then extended in [2] to the case in which only a portion of the device is controlled. Here, beside a deeper investigation of the model and of its connection with hyper-redundant manipulators, we address the important problem of grasping a prescribed object. In our investigation the grasping problem is studied (and numerically solved) in a stationary setting. We point out that our approach, involving the theory of optimal control of PDEs, allows in a quite natural way for an extension to fully dynamic optimization: here we present the key, spatial component of the numerical solution of this challenging problem, while postponing its evolutive counterpart to future works. It is almost impossible to give an exhaustive state of the art even on the particular framework of our paper, i.e., the motion planning of highly articulated and soft manipulators. Then we limit ourselves to refer to the papers that mostly inspired our work: the seminal papers [3,4] were hyper-redundant manipulator were originally introduced, the paper [5] for the kinematic aspects and [6,7] for a discrete dynamical model. The parameter setting of our tests is motivated by the number theoretical approach presented in [9–11]. Finally, we refer to the book [14] for an introduction to our theoretical background, that is, optimal control theory of PDEs. The paper is organized as follows. Section 2 is devoted to an overview of the model under exam, and Sect. 3 to static and dynamic optimal reachability problems. In Sect. 4 we investigate an optimal grasping problem for the stationary case. Finally in Sect. 5 we draw our conclusions.
2 A Control Model for Planar Manipulators
We revise, from a robotic perspective, the main steps of the modeling procedure in [1]. In particular, we provide an analytical description of the constraints as the formal limit of a discrete particle system, which in turn models a hyper-redundant manipulator.

2.1 From a Discrete Hyper-redundant to a Soft Manipulator
Hyper-redundant manipulators are characterized by a very large, possibly infinite, number of actuable degrees of freedom: their investigation in the framework of soft robotics is motivated by the fact that they can be viewed as discretizations of continuum robots. Following this idea, we consider here a device composed of N links and N + 1 joints, whose position in the plane is given by the array
(q_0, ..., q_N). We denote by m_k the mass of the k-th joint, with k = 0, ..., N, and we consider negligible the mass of the corresponding links. Moreover, given vectors v_1, v_2 ∈ R², we assume standard notations for the Euclidean norm and the dot product, respectively |v_1| and v_1 · v_2, and we define v_1 × v_2 := v_1 · v_2^⊥, where v_2^⊥ denotes the clockwise orthogonal vector to v_2. Finally, the positive part function is denoted by (·)_+. The manipulator is endowed with the following physical features:

– Inextensibility Constraint. The links are rigid and their lengths are given by ℓ, so that the discrete counterpart of the inextensibility constraint is

|q_k − q_{k−1}| = ℓ,   k = 1, ..., N.    (1)

This constraint is imposed exactly, by defining for k = 1, ..., N

F_k(q, σ) := σ_k ( |q_k − q_{k−1}|² − ℓ² ),

where σ_k is a Lagrange multiplier.
– Curvature Constraint. We assume that two consecutive links, say the k-th and the (k+1)-th, cannot form an angle larger than a fixed threshold α_k. This can be formalized by the following condition on the joints:

(q_{k+1} − q_k) · (q_k − q_{k−1}) ≥ ℓ² cos(α_k).

This constraint is imposed via penalty method. In particular, for k = 1, ..., N we consider the elastic potential:

G_k(q) := ν_k ( cos(α_k) − (1/ℓ²) (q_{k+1} − q_k) · (q_k − q_{k−1}) )_+²,

where ν_k ≥ 0 is a penalty parameter playing the role of an elastic constant.
– Bending Moment. Modeling an intrinsic resistance to leave the position at rest, corresponding to null relative angles, is described by the following equality constraint:

(q_{k+1} − q_k) × (q_k − q_{k−1}) = 0.

The related elastic potential with penalty parameter ε_k > 0 is

B_k(q) := ε_k ( (q_{k+1} − q_k) × (q_k − q_{k−1}) )².

– Curvature Control. A curvature control in a discrete setting amounts to imposing the exact angle between the joints q_{k−1}, q_k, q_{k+1}, i.e., the following equality constraint:

(q_{k+1} − q_k) × (q_k − q_{k−1}) = ℓ² sin(α_k u_k),
where u_k ∈ [−1, 1] is the control term. Note that the control set [−1, 1] is chosen in order to be consistent with the curvature constraint. Also in this case we enforce the constraint by penalty method, by considering:

H_k(q) := μ_k ( sin(α_k u_k) − (1/ℓ²) (q_{k+1} − q_k) × (q_k − q_{k−1}) )²,

where μ_k ≥ 0 is a penalty parameter. Note that setting μ_k = 0 corresponds to deactivating the control of the k-th joint and letting it evolve according to the above constraints only. Note that the definition of G_k, B_k and H_k in the cases k = 0 and k = N is made consistent by considering two ghost joints q_{−1} := q_0 + ℓ(0, 1) and q_{N+1} := q_N + (q_N − q_{N−1}) at the endpoints. We are now in a position to build the Lagrangian associated to the hyper-redundant manipulator, composed of a kinetic energy term and suitably rescaled elastic potentials:

L_N(q, q̇, σ) := Σ_{k=0}^{N} [ (1/2) m_k |q̇_k|² − (1/(2ℓ)) F_k(q, σ) − (1/(2ℓ³)) G_k(q) − (1/(2ℓ⁵)) B_k(q) − (1/(2ℓ)) H_k(q) ].
Assume now that the hyper-redundant manipulator is indeed a discretization of our continuous manipulator, that is, there exist smooth functions ν, μ, ε, ρ, ω : [0, 1] → R+ and, for T > 0, smooth functions q : [0, 1] × [0, T ] → R2 , σ : [0, 1] × [0, T ] → R and u : [0, 1] × [0, T ] → [−1, 1] such that, for all N , k = 1, ..., N and t ∈ [0, T ] νk = ν(k),
μk = μ(k),
ρk := mk / = ρ(k), qk (t) = q(k, t),
εk = ε(k),
ωk := αk / = ω(k),
σk (t) = σ(k, t),
uk (t) = u(k, t).
Fixing the total length of the manipulator equal to 1, so that = 1/N , we define the Lagrangian associated to the soft, continuous manipulator by the formal limit (see [1] for details) lim LN (q, q, ˙ σ) = L(q, q, ˙ σ).
N →+∞
where
L(q, σ) := 0
1
1
2 1 1 1 ρ|qt |2 − σ(|qs |2 − 1) − ν |qss |2 − ω 2 + − ε|qss |2 2 2 4 2 1 2 − μ (ωu − qs × qss ) ds 2
(2)
and qt , qs , qss denote partial derivatives in time and space respectively. A comparison between discrete and continuous constraints and related potentials is displayed in Table 1 and Table 2, respectively.
20
S. Cacace et al. Table 1. Exact constraint equations in both discrete and continuous settings. Constraint
Constraint equation Discrete
Continuous
Inextensibility
|qk − qk−1 | =
|qs | = 1
Curvature
(qk+1 − qk ) · (qk − qk−1 ) ≥ 2 cos(αk )
|qss | ≤ ω
Bending moment (qk+1 − qk ) × (qk − qk−1 ) = 0 Control
|qss | = 0
(qk+1 − qk ) × (qk − qk−1 ) = 2 sin(αk uk ) qs × qss = ωu
Table 2. Potentials derived from penalty method in both discrete and continuous settings. The functions ν and μ represent non-uniform elastic constants. Constraint
Penalization elastic potential Discrete
Inextensibility
None 2 Curvature νk cos(αk ) − 12 (qk+1 − qk ) · (qk − qk−1 ) + 2 Bending moment εk (qk+1 − qk ) × (qk − qk−1 ) 2 Control μk sin(αk uk ) − 12 (qk+1 − qk ) × (qk − qk−1 )
2.2
Continuous None ν(|qss |2 − ω 2 )2+ ε|qss |2 μ (ωu − qs × qss )2
Equations of Motion
Equations of motion for both the discrete and the continuous models can be derived applying the least action principle to the corresponding Lagrangians. Here we report only the continuous case, which is the building block for the optimal control problems we address in the next sections. Taking into account also some friction forces, we obtain the following system of nonlinear, evolutive, fourth order PDEs – see [1] for details:
⊥ − Gqss + Hqs⊥ ss − βqt − γqsssst . ρqtt = σqs − Hqss s (3) |qs |2 = 1 for (s, t) ∈ (0, 1) × (0, T ). We remark that the map G := G[q, ν, ε, ω] = ε + ν |qss |2 − ω 2 + encodes the bending moment and the curvature constraint, while the map H := H[q, μ, u, ω] = μ (ωu − qs × qss ) corresponds to the control term. The term −βqt represents an environmental viscous friction proportional to the velocity; the term −γqsssst an internal viscous friction, proportional to the change in time of the curvature. The system is completed with suitable initial data and with the following boundary conditions for t ∈ (0, T ): ⎧ ⎪ q(0, t) = (0, 0), (anchor point) ⎪ ⎪ ⎪ ⎪ ⎪ q (0, t) = −(0, 1) (fixed tangent) s ⎨ (4) (zero bending moment) qss (1, t) = 0 ⎪ ⎪ ⎪qsss (1, t) = 0 (zero shear stress) ⎪ ⎪ ⎪ ⎩σ(1, t) = 0 (zero tension).
Optimal Reachability and Grasping for a Soft Manipulator
21
Note that the first two conditions are a modeling choice, whereas the free endpoint conditions emerge from the stationarity of the Lagrangian L. For reader’s convenience, we summarize the model parameters in Table 3. Table 3. A summary of the quantities and the functions involved in the soft manipulator control equation (3). Name Description
Type
q σ
Axis parametrization Inextensibility constraint multiplier
Unknown of the equation
u
Curvature control
Control
ρ ω
Mass distribution Maximal curvature
Physical parameter
ε ν μ
Bending elastic constant Penalty parameter Curvature constraint elastic constant Curvature control elastic constant G[q, ν, ε, ω] = ε + ν |qss |2 − ω 2 + Reaction force
G H β γ
H[q, μ, u, ω] = μ (ωu − qs × qss )
Control term
Enviromental friction Internal friction
Friction coefficient
Remark 1 (Deactivated Controls). We recall that the control term in (3) is the elastic force ⊥ )s − (Hqs⊥ )ss F [q, μ, u, ω] := −(Hqss where H = μ(ωu − qs × qss ). Therefore, to neglect the penalty parameter μ can be used to model the deactivation of the controls in a subregion of the device. The scenarios we have in mind include mechanical breakdowns, voluntary deactivation for design or energy saving purposes and, as we see below, modeling applications to hyper-redundant systems. For instance, to choose μ ≡ 0 yields the uncontrolled dynamics: ρqtt = (σqs )s − (Gqss )ss .
(5)
More generally, to set μ ≡ 0 in a closed set I ⊂ [0, 1] means that the portion of q corresponding to I is uncontrolled and it evolves according to (5) – with suitable (time dependent, controlled) boundary conditions. We conclude this section by focusing on the tuning of the mass distribution ρ and on the curvature constraint ω in order to encompass some morphological properties of the original three dimensional model. Following [2], we consider a three dimensional manipulator, endowed with axial symmetry and uniform mass density ρv . In particular, axial symmetry implies that the cross section
22
S. Cacace et al.
of the manipulator, at any point s ∈ [0, 1] of its symmetry axis, is a circle Ω(s) of radius, say, d(s). Therefore, to set ρ(s) := πρv d2 (s) corresponds to concentrate the mass of Ω(s) on its barycenter. The curvature constraint ω(s) can be chosen starting from the general consideration that a bending induces a deformation of the elastic material composing the body of the manipulator, for which assume an uniform yield point. In other words, in order to prevent inelastic deformations, the pointwise elastic forces acting on the material must be bounded by an uniform constant Fmax . In view of the axial symmetry, the maximal angular elastic force F (s) in any cross-section Ω(s) is attained on its boundary and it reads F (s) = e|qss |d(s), where e is the elastic constant of the material. Therefore to impose F (s) ≤ Fmax is equivalent to the curvature constraint |qss | ≤ ω(s) := Fmax /(ed(s)).
3
Optimal Reachability
In this section we focus on an optimal reachability problem in both stationary and dynamic settings. The problem is to steer the end-effector (or tip) of the device, parametrized by q(1, t), towards a target in the plane q ∗ , while optimizing accuracy, steadiness and energy consumption. Our approach is based on optimal control theory, that is, we recast the problem as a constrained minimization of a cost functional. 3.1
Static Optimal Reachability
We begin by focusing on the optimal shape of the soft manipulator at the equilibrium. Equilibria (q, σ) of the soft manipulator Eq. (3) where explicitly characterized in [1], and generalized to the case of uncontrolled regions of the manipulator in [2]. In particular, assuming the technical condition μ(1) = μs (1) = 0, the shape of the manipulator is the solution q of the following second order ODE: ⎧ qss = ω ¯ uqs⊥ in (0, 1) ⎪ ⎪ ⎨ 2 in (0, 1) |qs | = 1 (6) q(0) = (0, 0) ⎪ ⎪ ⎩ qs (0) = (0, −1). where ω ¯ := μω/(μ + ε). Remark 2 (Equilibria with deactivated controls). If q is uncontrolled in I ⊂ [0, 1], i.e., μ(s) = 0 for all s ∈ I, then (6) implies |qss | = 0 in I, that is the corresponding portion of the device at the equilibrium is arranged in a straight line. We consider a cost functional involving a quadratic cost on the controls, modeling the energy required to force a prescribed curvature, and a tip-target distance term: 1 1 |q(1) − q ∗ |2 , u2 ds + (7) J s := 2 [0,1]\I 2τ
Optimal Reachability and Grasping for a Soft Manipulator
23
where I := {s ∈ (0, 1) | μ(s) = 0}. Note that the domain of the control u is in general the whole interval [0, 1] and thus it is independent from the parameter μ, which is deputed to quantify the (possibly null) dynamic effects of u. To cope with this, the domain of integration of J s is restricted to the regions in which the controls are really actuated. Note that, due to the quadratic dependence of the cost in u, we may equivalently extend the domain of integration of J s to the whole [0, 1] by setting by default u(s) ≡ 0 for s ∈ I. Also remark that, tuning the penalization parameter τ allows to prioritize either the reachability task or the energy saving. The stationary optimal control problem then reads min J s ,
subject to (6) and to |u| ≤ 1.
(8)
Following [2], we can restate (8) in terms of the Euler’s elastica type variational problem 1 1 1 2 ∗ 2 min |q(1) − q | |qss | ds + 2 (0,1)\I ω ¯2 2τ ⎧ in (0, 1) |qs |2 = 1 ⎪ (9) ⎪ ⎨ ¯ in (0, 1) |qss | ≤ ω subject to q(0) = (0, 0) ⎪ ⎪ ⎩ qs (0) = (0, −1). Indeed, by differentiating the constraint |qs | = 1 we obtain the relation qss ·qs = 0 and, consequently, |qss · qs⊥ | = |qss | · |qs⊥ | = |qss |. Then, by dot multiplying q ⊥ in ¯ |u| and, consequently, both sides of the first equation of (6), we deduce |qss | = ω u2 = ω¯12 |qss |2 in (0, 1)\I. Numerical Tests for Static Optimal Reachability. We consider numerical solutions of the optimal problem (9) in the following scenarios: I = ∅, i.e., the deviced is fully controlled, I = [0.35, 0.65], i.e., the device is uncontrolled in its median section, I = [0.25, 0.4] ∪ [0.6, 0.75] and, finally, I = [0, 1]\{0, 0.25, 0.5, 0.75}. Slight variations (in terms of dicretization step) of the first three tests were earlier discussed in [2], while the last test is completely new. Table 4. Control deactivation settings related to μI . Test
Control deactivation region
Test 1 I = ∅ Test 2 I = [0.35, 0.65] Test 3 I = [0.25, 0.4] ∪ [0.6, 0.75] Test 4 I = [0, 1]\{0, 0.25, 0.5, 0.75}
24
S. Cacace et al.
q(s)
κ(s)
Fig. 1. On the first, second and third row are reported the results of Test 1, 2 and 3, respectively. In the first column are depicted the optimal solutions and, in the second column, the related signed curvatures (bold line) and curvature constraints ±¯ ω (thin lines).
Optimal Reachability and Grasping for a Soft Manipulator
25
Table 5. Global parameter settings. Parameter description
Setting
Bending moment ε(s) = 10−1 (1 − 0.9s) s2 Curvature control penalty μ(s) = (1 − s) exp(−0.1 1−s 2) 2 ω(s) = 4π(1 + s ) Curvature constraint Target point Target penalty
q ∗ = (0.3563, −0.4423) τ = 10−4
Discretization step
Δs = 0.02
q(s)
κ(s)
(a)
(b)
Fig. 2. In (a) the solution q(s) of Tests 4 and in (b) the related signed curvature κ(s).
In the numerical implementation, the constraints |qs | = 1 and |qss | ≤ ω ¯ are enforced via an augmented Lagrangian method, and the problem is discretized using a finite difference scheme. This yields a discrete system of equations, whose non-linear terms are treated via a quasi Newton’s method. In what follows we depict the solution q(s) of the problem (9) and the associated signed curvature κ(s) := qs (s) × qss (s) in the cases reported in Table 4. Note that, due to the choice of arclenght coordinates, the (unsigned) curvature of q is |κ| = |qss |. We report the results of Test 1, Test 2 and Test 3 –see Table 5– in Fig. 1: regions corresponding to in-actuated control are depicted in white. Test 4 requires a few more words of discussion. In this case the manipulator is completely uncontrolled, with the exception of four points. The aim is to recover, from a merely static point of view, optimal configurations for a (hyperredundant) rigid manipulator from the soft manipulator model: the controlled points represent indeed the joints of the manipulator. Using the fact that uncontrolled regions arrange in straight lines at the equilibrium, we obtain an optimal
26
S. Cacace et al.
configuration modeling at once the equilibrium of the soft manipulator as well as the solution of an optimal inverse kinematic problem for a hyper-redudant manipulator – see Fig. 2(a). Note that concentrating the actuation of controls on singletons yields the support of the resulting optimal curvature to concentrate on the same isolated points – see Fig. 2(b). To cope with this expected phenomenon, in Test 4 the curvature constraint is dropped. We finally note that the particular choice of q ∗ guarantees reachability in all these four cases, but clearly, with different optimal solutions. 3.2
Dynamic Optimal Reachability
We now consider the dynamic version of the reachability problem discussed in Sect. 3.1: given q ∗ ∈ R2 and T, τ > 0, we look for a time-varying optimal control u∗ : [0, 1] × [0, T ] → [−1, 1] minimizing the functional T 1 T 12 1 |q(1, t) − q ∗ |2 dt + u ds dt 2τ 0 2 0 0 1 1 + ρ(s)|qt (s, T )|2 ds, 2 0
J (q, u) =
(10)
subject to the soft manipulator dynamics (3). The terms representing the tiptarget distance and the quadratic cost on the control are now declined in a dynamic sense: they are minimized during the whole evolution of the system. The third term of J is deputed to steadiness, it represents the kinetic energy of the whole manipulator at final time. Our approach is based on the study of first-order optimality conditions, i.e., on the solution of the so-called optimality system. The unknowns are the stationary points (q, σ, u) of J (subject to (3)), and the related multipliers (¯ q, σ ¯ ) which are called the adjoint states. Roughly speaking, if a control u∗ is optimal, then the optimality system admits a solution ¯ ). The optimality system is composed by two PDEs, of the form (q, σ, u∗ , q¯, σ describing the evolution of the adjoint states, the equations of motion (3) and a variational inequality for the control. Following [1,2], the adjoint states equations are: ⎧ ⊥ ρ¯ qtt = σ q¯s − H q¯ss − G¯ qss + H q¯s⊥ ss ⎪ ⎪ s ⎪ ⎨ ¯ ⊥ − Gq ¯ ss + Hq ¯ ⊥ + σ ¯ qs − Hq ss s s ss ⎪ + β q¯t + γ q¯sssst ⎪ ⎪ ⎩ q¯s · qs = 0
(11)
for (s, t) ∈ (0, 1) × (0, T ). The maps G and H are defined in Sect. 2 and the maps ¯ and H ¯ are their linearizations, respectively: G ¯ q¯, ν, ω] = g[q, ν, ω]qss · q¯ss , G[q, ¯ q¯, μ] = μ (¯ H[q, qs × qss + qs × q¯ss ) ,
Optimal Reachability and Grasping for a Soft Manipulator
27
where g[q, ν, ω] = 2ν1(|qss |2 − ω 2 ) and 1(·) stands for the Heaviside function, i.e. 1(x) = 1 for x ≥ 0 and 1(x) = 0 otherwise. Optimality conditions also yield final and boundary conditions on (¯ q, σ ¯ ). As a consequence of the fact that the optimization takes into account the whole time interval (0, T ), initial conditions on q correspond to final conditions on its adjoint state q¯: q¯(s, T ) = −qt (s, T ),
q¯t (s, T ) = 0
for s ∈ (0, 1).
Boundary conditions are reported in Table 6. Note that the zero, the first and the second order conditions on q¯ display an essential symmetry with those on q reported in (4). On the other hand, both the third order condition on q¯ and the adjoint tension boundary condition on σ ¯ show a dependence on the difference vector between the tip and the target q ∗ : this phenomenon represents the fact that the tip is forced towards q ∗ . Table 6. Boundary conditions on (¯ q, σ ¯ ) for t ∈ (0, T ). Fixed endpoint q¯(0, t) = 0 q¯s (0, t) = 0 Free endpoint
q¯ss (1, t) = 0 q¯sss (1, t) = τ1ε (q − q ∗ ) · qs⊥ qs⊥ (1, t) σ ¯ (1, t) = − τ1 (q − q ∗ ) · qs (1, t)
Finally, the variational inequality for the control is T 1 ¯ q¯] (v − u)ds dt ≥ 0. u + ω H[q, 0
(12)
0
for every v : [0, 1]×[0, T ] → [−1, 1], which provides, in a weak sense, the variation of the functional J with respect to u, subject to the constraint |u| ≤ 1. We remark that, due to friction forces, the system (3) is dissipative. This implies that if we plug in (3) a constant in time control us , that we call static optimal control, given by the solution of the stationary control problem (8), then the system converges as T → +∞ to the optimal stationary equilibrium q described in Sect. 3.1 - see (6). Numerical Tests for Dynamic Optimal Reachability. In the following simulations optimal static controls are used as initial guess for the search of dynamic optimal controls, as well as benchmarks for dynamic optimization. In particular, the optimality system is discretized using a finite difference scheme in space, and a velocity Verlet scheme in time. Then the optimal solution is computed iteratively, using an adjoint-based gradient descent method. More precisely, starting from the optimal static control us as initial guess for the controls, we first solve
28
S. Cacace et al.
(3) forward in time. The solution-control triplet (q, σ, u) is plugged into (11), which in turn is solved backward in time. We obtain then the solution-control vector (q, σ, u, q¯, σ ¯ ) which is used in (12) to update the value of u. This routine is iterated up to convergence in u. Table 7. Dynamic parameter settings. Parameter description
Setting
Mass distribution
ρ(s) = exp(−s)
Curvature constraint penalty ν(s) = 10−3 (1 − 0.09s) Environmental friction
β(s) := 2 − s
Internal friction
γ(s) := 10−6 (2 − s)
Final time
T =2
Time discretization step
Δt = 0.001
Jq∗
Ju
Jv
Fig. 3. Time evolution of target, control and kinetic energy for the stationary (first row) and dynamic (second row) optimal control.
We recall from [2] the investigation of the dynamic counterpart of Test 2, a scenario in which the median section of the device is uncontrolled – see Sect. 3.1 and, in particular, Table 4. Parameters are set in Table 5 and, for the dynamic aspects, in Table 7. In what follows, we compare the performances of static and
Optimal Reachability and Grasping for a Soft Manipulator
29
dynamic optimal controls. In particular, we consider the dynamic optimal control ud , i.e., the numerical solution of (10), and the static optimal control us , which we recall is constant in time and it coincides with the solution of (9), see Fig. 1. Given a solution q(s, t) of (3) associated with either the static or dynamic optimal control, in Fig. 3 we plot the time evolution of the three components of the cost functional J : – Tip-target Distance: Jq∗ (t) := |q(1, t) − q ∗ |2 . We prioritized the reaching of the target by assigning an high value to the weight τ – see Table 5. As a consequence, in agreement with the theoretical setting, this is the component displaying the most prominent gap between the static and dynamic controls. We in particular see how, up to some oscillations, the dynamic control steers and keeps close to the target the tip q(1, t) for most of the time of the evolution. 1 – Control Energy: Ju (t) := 0 u2 (s, t)ds. In the case of static controls, Jus (t) is constant by construction. The dynamic optimal controls are subject to the greatest variations in the beginning of the evolution and then stabilize around the static control energy, i.e., in agreement with the dissipative nature of the system, they actually converge to the optimal controls at the equilibrium. Finally note that dynamic controls perform better than static controls in terms of energy cost. In other words, T T Jud (t) is smaller than 0 Jus (t). 0 1 – Kinetic Energy: Jv (t) := 12 0 ρ(s)|qt (s, t)|2 ds. Note that the evolutions of the Jv are comparable, but at final time T , dynamical optimal controls yield a kinetic energy Jv (T ) slightly slower than the one associated to the stationary optimal controls. This is consistent with the fact that the cost functional J depends on the kinetic energy only at final time. Remark 3. We remark that in our tests the mass distribution ρ has an exponential decay, modeling a three-dimensional device with exponentially decaying thickness, see Sect. 2. This choice is motivated by the fact that such structure can be viewed as an interpolation of self-similar, discrete hyper-redundant manipulators [12], that is, robots composed by identical, rescaled modules. Besides the advantage in terms of design, self-similarity gives access to fractal geometry techniques [8], allowing for a detailed investigation of inverse kinematics of the self-similarity manipulator, see for instance [13].
4
Optimal Grasping
In this section we address a static optimal grasping problem, i.e., we look for stationary solutions of (3) minimizing the distance from a target object and, at the same time, an integral quadratic cost on the associated curvature controls. More formally, let us denote by Ω0 an open subset of R2 representing the object to be grasped, and by dist(·, ∂Ω0 ) the distance function from its boundary ∂Ω0 . Moreover, we denote by χΩ0 (·) and χΩ0c (·), respectively the characteristic
30
S. Cacace et al.
functions of Ω0 and its complement Ω0c in R2 . We consider the optimal control problem min G, subject to (6) and to |u| ≤ 1. (13) where 1 G(q, u) := 2
u2 ds [0,1]\I 1
1 + 2τ
dist (q(s), ∂Ω0 ) χΩ0 (q(s)) + μ0 (s)χΩ0c (q(s)) ds 2
0
,
(14)
is the cost functional for the grasping problem. The first integral term of G is a quadratic cost on the controls, computed only on the controlled region [0, 1]\I, see Sect. 3.1. The second term of G aims to minimize the distance from the boundary of the target object without compenetration. More precisely, the first contribution, given by χΩ0 , acts as an obstacle, forcing all the points of the manipulator to move outside Ω0 . The second term attracts points outside Ω0 on its boundary, according to μ0 , a non negative weight describing which parts of the manipulator are preferred for grasping. In particular, if Ω0 is just a point and μ0 is a Dirac delta concentrated at s = 1, we formally recover the tip-target distance. On the other hand, different choices of μ0 allow one to obtain very different behaviors, as shown in the following tests – see Table 8. Numerical Tests for Static Optimal Grasping. We first consider the case in which Ω0 is a circle of radius r0 , and we set μ0 (s) = χ[s0 ,1] (s) for some 0 ≤ s0 ≤ 1. In this way, we expect the manipulator to surround the circle using only its terminal part of length 1 − s0 . Figure 4 shows the results for Test 5, corresponding to the choice r0 = 0.1 and s0 = 0.55. Note that the length of the active part is 1 − s0 = 0.45, namely less than the circumference 2πr0 0.62 of Ω0 , hence the manipulator can not grasp the whole circle. Nevertheless, the ¯ on the curvature of Ω0 is equal to r10 = 10, which is below the upper bound ω curvature κ of the manipulator. This results in a good grasping, indeed we can observe a plateau in the graph of κ right around the value 10. Table 8. Settings for the target Ω0 and for the grasping weight, μ0 , where s0 = 0.55 and r0 = 0.1. The barycenter of Ω0 is q ∗ = (0.3563, −0.4423), that is the target point for the tests in Sect. 3.1. Test
Ω0
μ0
Test 5 Circle of radius r0 χ[s0 ,1] Test 6 Circle of radius r0 δs0 + δ1 Test 7 Square of side 2r0 χ[s0 ,1] Test 8 Square of side 2r0 δs0 + δ s0 +1 + δ1 2
Optimal Reachability and Grasping for a Soft Manipulator
(a)
31
(b)
Fig. 4. In (a) the solution q of Test 5, in (b) the related signed curvature κ(s).
In Test 6, we choose only two active points at the extrema of the interval [s0 , 1], namely we set μ0 (s) = δs0 (s)+δ1 (s). In Fig. 5, we show the corresponding solution. Note that the maximum of κ is now slightly larger than 10, while the curvature energy is better optimized, since the arc between the active points no longer needs to be attached to Ω0 .
(a)
(b)
Fig. 5. In (a) the solution q of Test 6, in (b) the related signed curvature κ(s).
We now consider Test 7, that is the case in which Ω0 is a square of side 2r0 , and we set again μ0 (s) = χ[s0 ,1] (s) for s0 = 0.55 and r0 = 0.1. This example is more challenging than the previous one, since a good grasping at the sharp corners of the square implies the divergence of the curvature of the manipulator.
32
S. Cacace et al.
(a)
(b)
Fig. 6. In (a) the solution q of Test 7, in (b) the related signed curvature κ(s).
That is why we neglect the curvature constraints, reporting the results in Fig. 6. We clearly observe the presence of three spikes in the graph of κ, corresponding to the contact points at three corners. Moreover, we recognize some nearly flat parts of κ corresponding to the sides of the square. Finally, in Test 8 we choose only three active equispaced points in the interval [s0 , 1], namely setting μ0 (s) = δs0 (s)+ δ s0 +1 (s)+δ1 (s). Moreover, we restore the 2 constraint on the maximal curvature of the manipulator. In Fig. 7, we observe that the optimized configuration results from a non trivial balance between the contact energy and the curvature energy.
(a)
(b)
Fig. 7. In (a) the solution q of Test 8, in (b) the related signed curvature κ(s).
Optimal Reachability and Grasping for a Soft Manipulator
5
33
Conclusions
In this paper we investigated a control model for the symmetry axis of a planar soft manipulator. The key features of the model (inextensibility, bending moment and curvature constraints and control) determine, together with internal and enviromental reaction forces, an evolution described by a system of fourth order nonlinear, evolutive PDEs. In particular, the equations of motion are derived as a formal limit of a discrete system, which in turn models a planar hyper-redundant manipulator subject to analogous, discrete, constraints. The comparison between the discrete (and rigid) and continuous (soft) model is a novelty in the investigation of the model, earlier introduced in [1]. We then addressed optimal reachability problems in both a static and dynamic setting: we tested the model in the case of uncontrolled regions and we showed that the model is also suitable for the investigation at the equilibrium of optimal inverse kinematics for hyper-redundant manipulators. We then turned to optimal grasping problems, i.e., the problem of grasping an object (without compenetrating it) while minimizing a quadratic cost on the curvature controls. Our setting is a generalization of the earlier discussed reachability problem, and it allows for the investigation of optimal grasping for a quite large variety of objects. Moreover, the model allows to prioritize the preferred contact regions of the manipulator with the grasped object: we presented several numerical tests to illustrate this feature. Future perspectives include stationary grasping problems for more complex target objects, possibly characterized by concavities and irregular boundaries. Our approach sets the ground for the search of optimal dynamic controls: we plan to explore this in the future also with other optimization techniques, including model predictive control and machine learning algorithms.
References 1. Cacace, S., Lai, A.C., Loreti, P.: Modeling and optimal control of an octopus tentacle. SIAM J. Control Optim. 58(1), 59–84 (2020) 2. Cacace, S., Lai, A.C., Loreti, P.: Control strategies for an octopus-like soft manipulator. In: Proceedings of the 16th International Conference on Informatics in Control, Automation and Robotics - Volume 1, ICINCO, pp. 82–90. INSTICC, SciTePress (2019). https://doi.org/10.5220/0007921700820090 3. Chirikjian, G.S., Burdick, J.W.: An obstacle avoidance algorithm for hyperredundant manipulators. In: Proceedings of the IEEE International Conference on Robotics and Automation, pp. 625–631. IEEE (1990) 4. Chirikjian, G.S., Burdick, J.W.: The kinematics of hyper-redundant robot locomotion. IEEE Trans. Robot. Autom. 11(6), 781–793 (1995) 5. Jones, B.A., Walker, I.D.: Kinematics for multisection continuum robots. IEEE Trans. Robot. 22(1), 43–55 (2006) 6. Kang, R., Kazakidi, A., Guglielmino, E., Branson, D.T., Tsakiris, D.P., Ekaterinaris, J.A., Caldwell, D.G.: Dynamic model of a hyper-redundant, octopus-like manipulator for underwater applications. In: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4054–4059. IEEE (2011)
34
S. Cacace et al.
7. Kazakidi, A., Tsakiris, D.P., Angelidis, D., Sotiropoulos, F., Ekaterinaris, J.A.: CFD study of aquatic thrust generation by an octopus-like arm under intense prescribed deformations. Comput. Fluids 115, 54–65 (2015) 8. Lai, A.C.: Geometrical aspects of expansions in complex bases. Acta Math. Hung. 136(4), 275–300 (2012) 9. Lai, A.C., Loreti, P.: Robot’s hand and expansions in non-integer bases. Discrete Math. Theor. Comput. Sci. 16(1) (2014). https://dmtcs.episciences.org/3913 10. Lai, A.C., Loreti, P.: Discrete asymptotic reachability via expansions in non-integer bases. In: 2012 9th International Conference on Informatics in Control, Automation and Robotics (ICINCO), vol. 2, pp. 360–365. IEEE (2012) 11. Lai, A.C., Loreti, P., Vellucci, P.: A model for robotic hand based on Fibonacci sequence. In: 2014 11th International Conference on Informatics in Control, Automation and Robotics (ICINCO), vol. 2, pp. 577–584. IEEE (2014) 12. Lai, A.C., Loreti, P., Vellucci, P.: A continuous Fibonacci model for robotic octopus arm. In: 2016 European Modelling Symposium (EMS), pp. 99–103. IEEE (2016) 13. Lai, A.C., Loreti, P., Vellucci, P.: A Fibonacci control system with application to hyper-redundant manipulators. Math. Control Sig. Syst. 28(2), 15 (2016) 14. Tr¨ oltzsch, F.: Optimal Control of Partial Differential Equations: Theory, Methods, and Applications, vol. 112. American Mathematical Society (2010)
On Local Observer Design for LQR Problems with Tracking Paolo Di Giamberardino(B)
and Daniela Iacoviello
Department of Computer, Control and Management Engineering Antonio Ruberti, Sapienza University of Rome, Rome, Italy {paolo.digiamberardino,daniela.iacoviello}@uniroma1.it https://www.diag.uniroma1.it/digiamb/website https://www.diag.uniroma1.it/iacoviel
Abstract. The paper addresses the problem of an observer design for a nonlinear system for which a linear approach is followed for the control synthesis. The linear context driven by the control design allows to focus the observers design in the class of local, i.e. linear, observers. It is shown that when the control contains an external reference, the solution obtained working with the linear approximation to get local solutions produces non consistent results in terms of local regions of convergence for the system and for the observer. The case of a control law which solves a LQR problem with tracking is addressed and two different approaches with respect to the classical one for the observer design are studied. The results are applied to an epidemic spread control to check the differences in the performances for the two different approaches.
Keywords: Nonlinear systems LQR epidemic spread
1
· Linear observer · Optimal control ·
Introduction
The problem of the state measure for dynamical systems plays an important role in control theory since state feedback solutions are often obtained in the design procedures. When the state of a dynamics is not measurable, the design of a state observer becomes a necessary step. The history of solutions to this problem begins with the case of linear dynamics [9] and, less than ten years later, it is enriched with the first results for nonlinear ones. Several solutions have been presented in literature for the design of state observers, many of them, especially in the more recent literature, dealing with the case of nonlinear dynamics, for which nonlinear solutions have been proposed. Some of the results are represented by nonlinear solutions which mainly follows the idea initially proposed in [9] for linear systems: an observer can be designed starting from a copy of the dynamics with corrective terms added to get the stabilization of the linear approximation of the observer and of the full c The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 O. Gusikhin et al. (Eds.): ICINCO 2019, LNEE 720, pp. 35–60, 2021. https://doi.org/10.1007/978-3-030-63193-2_3
36
P. Di Giamberardino and D. Iacoviello
interconnected system. Examples are [2,16] and [15], where autonomous dynamics are considered. The importance of starting with a local solution is usually put in evidence separating explicitly the linear component of the system from the remaining nonlinear terms, to better highlight the local behaviours, [8]. A further example of a solution based on the possibility of linearising the error dynamics is represented by [12]. The explicit presence of the input in the nonlinear dynamics may complicate the approaches, since suitable bounding conditions must be given; a different solution can be obtained modelling the input as the output of an exosystem with known structure [14]. The list of references could be very long, till nowadays with, for example, [13], where an approximated linearising feedback for the system dynamics is introduced. When observers are part of a feedback control schemes for systems with non measurable state, the control design and the observer determinations are two problems that must be solved at the same time. In the linear case, where the Separation Principle holds, it is possible to design the state feedback control and the observer separately, since it can be proved that the addition of an observer in the control scheme does not change the dynamical characteristics of the controlled system. In the nonlinear case, in general the Separation Principle does not hold but can be invoked once local liner approximations are considered, so restricting to local solutions. This paper addresses the problem introduced in the local observer design for a nonlinear system when the nominal state feedback introduced changes the equilibrium point of the controlled system. In this case, an ambiguity arises since the concept of local for the control and for the observer refers to two different equilibrium points: the initial one, considered for the linearised control design, and the new one, the equilibrium for the controlled dynamics and then the actual working point. The classical approach refers to the computation of local solutions referring to the open loop equilibrium point; it always works for linear systems, and works in nonlinear contest when the control does not change such a point. When the control law contains a tracking term, the equilibrium point changes and the meaning of local must be discussed. A tracking term always appears when a LQR problem with tracking term is defined for the control law design. This is the case addressed in the paper to better define the class of problems considered. Two possible approaches to the control and the observer design are reported. The first one aims at working in the new equilibrium point for the controlled dynamics and the local solution for the observer design are referred to such a point. The second one tries to introduce the observer design at the same time as the control one, changing the LQR problem formulation. Since this paper is an extended version of [7] presented at the ICINCO2019 conference, the first approach is introduced and described in the cited paper, along with the discussion of some simulation results, and is here shortly recalled. The extension is represented by the introduction of the second approach which, after its presentation, discussion and implementation on the same case study adopted in [7], is compared to the previous one.
On Local Observer Design for LQR Problems with Tracking
37
The proposed procedures are applied to a case study, represented by the control of an epidemic spread of a virus, the one responsible of HIV/AIDS infections, to compare their performances. The model proposed in [3,4] is adopted and, following [5], the control problem is formulated in the framework of optimal control theory introducing a quadratic cost index. In Sect. 2, after the general problem definition, the differences between a linear and nonlinear cases introduced by the tracking term are illustrated. A first approach to overcome such a problem is presented and discussed in Sect. 3 while a different solution obtained by changing the problem definition, designing the controller and the observer together, is discussed in Sect. 4. The procedure is then applied to the case study in Sect. 5 for comparative purpose and some results of numerical simulations are reported in Sect. 6 to validate the proposed solution. Concluding remarks in Sect. 7 end the paper.
2
Problem Definition
Given the nonlinear dynamics x˙ = f (x) + g(x)u y = h(x)
(1) (2)
with x ∈ n , u ∈ m , y ∈ p , and one equilibrium point xe (f (xe ) = 0, g(xe ) = 0, h(xe ) = 0), define an optimal control problem introducing the cost function ∞ T x Qx + uT Ru dt (3) J= 0
to be minimised. Despite the nonlinearity of the system, the quadratic form in (3) suggests that if the dynamics were linear, the problem would be solved using a classical LQR design, with the additional advantage to obtain a closed loop state feedback solution. Then, the idea followed in [5] is to linearise the dynamics in a neighbourhood of one equilibrium point and solve the problem in a first order approximated way. The linear approximation of (1)–(2) can be computed, getting x ˜˙ = A˜ x + Bu y˜ = C x ˜ where, as usual, x ˜ = x − xe and ∂f A= , ∂x x=xe
B = g(xe ),
(4)
C=
∂h ∂x x=xe
(5)
In order to have homogeneous expressions in the optimal control problem, once the local referred state variable x ˜ = x − xe is introduced, the cost function to be minimised must be rewritten as ∞ (˜ x + xe )T Q(˜ x + xe ) + uT Ru dt (6) J= 0
38
P. Di Giamberardino and D. Iacoviello
The problem can be faced as a LQR problem with a reference term. Classical theory shows that the stationary Riccati equation can be introduced, KBR−1 B T K − KA − AT K − Q = 0
(7)
whose solution K gives the optimal control −1 u = −R−1 B T K x ˜ − R−1 B T KBR−1 B T − AT Qxe
(8)
The form of the feedback control (8) is the same as the one considered in [7], u = Fx ˜+r
(9)
The linear term F x ˜ satisfies the local stability of the controlled system in a neighbourhood of the equilibrium point. The additional presence of a forcing constant term r in (9) is present. In this case, the closed loop dynamics under state measurement becomes x ˜˙ = (A + BF )˜ x + Br
(10)
As performed in [5,7], the necessity of a state observer is considered. The procedure followed in [5] used a classical Luenberger linear state observer [9] z˜˙ = (A − GC)˜ z + Bu + Gy
(11)
designed on the basis of the linear approximated dynamics (4) for which the optimal state feedback (8) has been designed. The control law (8) becomes u = F z˜ + r
(12)
The consequent dynamics of the error e = z˜ − x ˜, locally in a neighbourhood of xe , is described by e˙ = (A − GC)˜ z + Bu + Gy − A˜ x − Bu = (A − GC)˜ z + GC x ˜ − A˜ x = (A − GC)e
(13)
asymptotically convergent to zero once σ(A − GC) ∈ C − . Then, the asymptotic z−x ˜ = 0 holds and it can be rewritten as lim ˜ z−x ˜ = condition lim ˜ t→∞
t→∞
lim ˜ z + xe − x = 0 showing that if z˜ is the estimate of x ˜, then z = z˜ + xe is t→∞ the estimate of the original state x. Remaining in the approximated context, the whole system obtained using the state reconstructed by the observer in the control law, (12), is described by x ˜˙ = A˜ x + BF z˜ + Br z˜˙ = (A − GC)˜ z + BF z˜ + Br + GC x ˜
(14)
and, replacing the observer dynamics with the one of the estimation error e = z˜ − x ˜, the full dynamics is given by x ˜˙ = (A + BF )˜ x + BF e + Br e˙ = (A − GC)e
(15)
On Local Observer Design for LQR Problems with Tracking
39
that is the proof of the Separation Principle. A weakness of the procedure described above has been put in evidence in [7] and arises once the solution is applied to the original nonlinear model. In order to analyse the effects of each contribution in the whole controlled system, the use of the nominal state feedback (9) is firstly introduced. The controlled dynamics can be written as x˙ = f (x) + g(x) (F (x − xe ) + r) = Fc (x, r)
(16)
Computing the equilibrium points, denoted as xce to put in evidence its origin from the controlled dynamics, one has f (xce ) + g(xce ) (K(xce − xe ) + r) = Fc (xce , r) = 0
(17)
It is easy to verify that if r = 0, xce = xe . Otherwise, the new equilibrium point xce is different from xe . This change implies that, at steady state, the system is in the equilibrium point xce . The introduction of an observer to estimate the state for a feedback implementation must preserve this asymptotic behaviour, as it happens in the linear case, and the equilibrium point must remain xce . The fulfilment of this condition can be verified analysing the whole system obtained introducing the estimated state given by the observer for the state feedback (8) in the compact form (12) applied to system (1). On the basis of the relationships between the local state x ˜ and its estimate z˜, as well as between the original state x and its estimate z, the control law (9) can be expressed, in the original coordinates, as u = F z − F xe + r
(18)
and the observer dynamics (11) can assume the form z˙ = (A − GC)(z − xe ) + Bu + Gy = (A − GC)z − (A − GC)xe + Bu + Gy (19) The full interconnected dynamics is then described by x˙ = f (x) + g(x)F z − g(x)F xe + g(x)r z˙ = (A − GC + BF )z − (A + BF )xe + GCx + Br
(20)
In order to check the effectiveness of the controlled system (20), as a first step the error dynamics like (13) can be computed to verify its convergence to zero. e˙ = (A − GC + BF − g(x)F )e + (A + BF − g(x)F )x − f (x) − (A + BF − g(x)F )xe + (B − g(x))r
(21)
40
P. Di Giamberardino and D. Iacoviello
Approximating (21) in a neighbourhood of x = xe , recalling (5), one gets e˙ = (A − GC)e
(22)
the same as in (15) as expected. That is, locally around xe , the estimation error goes to zero. The problem is that, for the full dynamics (20), xe is not an equilibrium point. In fact, to compute the equilibrium points the system f (x) + g(x)F z − g(x)F xe + g(x)r = 0 (A − GC + BF )z − (A + BF )xe + GCx + Br = 0
(23)
must be solved. It is easy to verify that substituting x = xe , as well as z = xe since the estimation error goes to zero, in (23), the condition Br = 0
(24)
is obtained, clearly impossible. This means that it is no more guaranteed that the dynamics evolves in a neighbourhood of xe . On the other hand, not even x = xce , and then z = xce , are equilibrium conditions for the two subsystems because, by substitution in (23), the expressions − g(xce )F xe = 0 (A + BF )(xce − xe ) + Br = 0
(25)
are obtained, once again impossible. It is possible to conclude that this approach cannot work properly because i. the insertion of the observer dynamics interferes with the characteristics of the controlled system, changing the equilibrium point; ii. the observer does not work as expected, since the manifold in which the local convergence is assured does not coincide with a neighbourhood of the new equilibrium point. In [7], an improvement in the procedure recalled above has been introduced, remaining in the locally linearised approximated context but trying to avoid the undesired effects i. and ii. previously mentioned. Such a new procedure is shorty recalled in next Sect. 3.
3
The Improved Design Procedure
The idea for the solution proposed in [7] is based on the possibility of designing a state observer in such a way that the equilibrium point of the controlled system is the same both when the state is supposed to be measured and when its estimate provided by the observer is used. Starting from the system (1)–(2), suppose it has been defined a linear state feedback control with a regulation term of the form (9), expressed in the original coordinates, u = Fx + r (26)
On Local Observer Design for LQR Problems with Tracking
41
The controlled dynamics is described by x˙ = f (x) + g(x)F x + g(x)r = Φ(x, r)
(27)
with output (2). Using the same notation previously adopted, be xce the equilibrium point for the controlled system (27), Φ(xce , r) = 0. The design technique is again based on a linear observer and on local convergence of the estimation error, but preserving the convergence of the system to xce . To this aim, the linear approximation of (27) in a neighbourhood of xce is computed as ¯ (28) x ¯˙ = Ac x where Ac = ∂Φ(x,r) and x ¯ = x − xce . Now, a linear observer is designed on ∂x x=xce
the basis of the closed loop system, i.e. an observer for the state of (27). The structure is the same as in (11), so that it has the form
where y = Cc x ¯ and Cc
z¯˙ = (Ac − GCc )¯ z + Gy (29) = ∂h(x) are defined as in the previous case ∂x c x=xe
for a different equilibrium point. Conditions under which the estimation error converges asymptotically to zero for the so defined problem are trivial, being σ(Ac − GCc ) ∈ C − . The so obtained observer is used in the full closed loop system to provide a state estimation for the state feedback (26). Clearly, since z¯ is the estimation of x − z¯ = 0, z = z¯+xce is the estimation of x; in fact lim ¯ x − z¯ = x ¯, that is lim ¯ t→∞
t→∞
lim (x − xce ) − (z − xce ) = lim x − z = 0 t→∞ t→∞ In order to study the effect of such a control scheme, the full closed loop dynamics has to be written. One has x˙ = f (x) + g(x)F z + g(x)r = Φ(x, r) + g(x)F (z − x) z˙ = (Ac − GCc )z − Ac xce + GCc x
(30)
If the dynamics of the error e = z − x is computed, the expression e˙ = (Ac − GCc )(e + x) − Ac xce + GCc x − Φ(x, r) − g(x)F e = (Ac − GC − g(x)F )e + Ac (x − xce ) − Φ(x, r)
(31)
is obtained. Its approximation in a neighbourhood of x = xce , for which Φ(x, r) = Ac (x − xce ),
Bc = g(xce )
(32)
yields to e˙ = (Ac − GCc − Bc F )e
(33)
42
P. Di Giamberardino and D. Iacoviello
which converges, if the pair (Ac − Bc F, Cc ) is detectable, once matrix G is computed to have σ(Ac − Bc F − GCc ) ∈ C − . At the same time, once the equilibrium points of (30) are computed, it is easy to verify, by straightforward substitution, that x = xce and z = xce are solutions. In fact Φ(xce , r) + g(xce )F (xce − xce ) = 0 (Ac − GCc )xce − (Ac − GCc )xce = 0 Rewriting (30) in the new coordinates (x, e) x˙ = Φ(x, r) + g(x)F e e˙ = (Ac − GC − g(x)F )e + Ac (x − xce ) − Φ(x, r)
(34)
and computing its linear approximation in a neighbourhood of x = xce as x˙ = Ac x + Bc Ke − Ac xce e˙ = (Ac − GC − Bc K)e
(35)
the dynamical matrix AT OT =
Ac Bc F 0 (Ac − Bc F ) − GCc
(36)
is obtained. It shows that, even if the Separation Principle does not hold strictly, the design and the application of the linear observer after the synthesis of the state feedback control does not affect the controlled dynamics which keeps, in its linear approximation, the dynamical matrix Ac . This procedure is applied in Sect. 5 and the results of numerical simulations are reported in Sect. 6.
4
A Different Problem Formulation
The approach described above is a possible solution to the problem of designing a linear observer after that a state feedback with a drift term has been computed for a nonlinear dynamics, like the ones that are obtained solving a LQR problem with a reference term. A different formulation can be adopted for the same problem, aiming at maintaining the validity of the Separation Principle and, at the same time, at taking into account the presence of the observer in the full controlled closed loop dynamics. Then, in view of the use of a linear observer for the solution of the LQR problem for a nonlinear system, the full dynamics composed by the connection between the nonlinear dynamics and the linear observer can be considered in the design problem; the feedback control law can be requested to be a feedback from the observer state only, and the estimation error can be considered as a further term in the cost function.
On Local Observer Design for LQR Problems with Tracking
43
Under these hypothesis, the optimal control problem can be formulated as follows. Given a nonlinear dynamics of the form x˙ = f (x) + g(x)u z˙ = Lz + M u + N y
(37)
with y = h(x), f (xe ) = 0, g(xe ) = 0 and h(xe ) = 0, find the control u which minimises the cost function ∞ T x Qx + uT Ru + (z − x)T P (z − x) dt (38) J= 0
with P positive definite matrix. During the design procedure, choose properly the matrices L, M and N . Following the same consideration as in Sect. 2, if a linear approximation of the dynamics is considered, the problem can be solved in the framework of the LQR theory. Assuming the same approximation as in (4) for the first part of the dynamics and thanks to the predefined linear structure for the observer dynamics, (37) can be approximated, in a neighbourhood of (xe , xe ), as x ˜˙ = A˜ x + Bu z˜˙ = L˜ z + M u + N Cx ˜
(39)
ˆ + Bu ˆ ξ˙ = Aξ
(40)
or, in a more compact form, where ξ=
x ˜ , z˜
Aˆ =
A 0 , NC L
ˆ= B
B M
(41)
Under these positions, also the cost function (38) can be rewritten in the compact form ∞ ˆ − d) + uT Ru dt (ξ − d)T Q(ξ J= (42) 0
with ˆ= Q
Q + P −P −P P
,
d=−
xe xe
(43)
ˆ is positive definite. The Riccati equation It is easy to verify that the matrix Q ˆT K ˆ −K ˆ Aˆ − AˆT K ˆ −Q ˆ=0 ˆ BR ˆ −1 B K
(44)
ˆ Following the can be solved, obtaining the positive definite symmetric matrix K. ˆ can partition induced by the structure of the matrices involved, the matrix K be written as K11 K12 ˆ K= (45) K12 K22
44
P. Di Giamberardino and D. Iacoviello
with K11 , K12 and K22 symmetric positive definite. The block structure of the matrices involved in (44) may help to simplify the computations. Then, (44) can be written as T K11 K12 B K11 K12 −1 T B M R K12 K22 K12 K22 M −
K11 K12 K12 K22
T T T A 0 K11 K12 Q + P −P A C N − − = 0 (46) K12 K22 NC L −P P 0 LT
Once the solution of the Riccati equation has been computed, the optimal control, due to the presence of the tracking term, assumes the state feedback form (as in (8)) −1 ˆ BR ˆ −1 B ˆ T Kξ ˆ + R−1 B ˆT K ˆ T − AˆT ˆ = Fˆ ξ + rˆ Qd u = −R−1 B
(47)
The first term is the optimal stabilising state feedback while the second one is the forcing term for the reference tracking. Focusing on the gain matrix Fˆ , its structure can be put in evidence writing the matrices with their block partition. One has K11 K12 ˆT K ˆ = −R−1 B T M T Fˆ = −R−1 B K12 K22 T −1 T T B K11 + M K12 B K12 + M T K22 = −R (48) The full system (39), under the feedback control, becomes x ˜˙ = (A − BR−1 (B T K11 + M T K12 ))˜ x − BR−1 (B T K12 + M T K22 )˜ z + Bˆ r ˙z˜ = (L − M R−1 (B T K12 + M T K22 ))˜ z + (N C − M R−1 (B T K11 + M T K12 ))˜ x + M rˆ (49) I 0 Multiplying the dynamical matrix of (49) by on the left and by −I I −1 I0 I 0 = on the right, one gets the equivalent matrix II −I I I 0 A−BR−1 (B T K11 +M T K12 ) −BR−1 (B T K12 +M T K22 ) I 0 (I I ) = −I I N C−M R−1 (B T K11 +M T K12 ) L−M R−1 (B T K12 +M T K22 ) −1 T T −1 T T A−BR (B (K11 +K12 )+M (K12 +K22 )) −BR (B K12 +M K22 ) L+N C−A−(M −B)R−1 (B T (K11 +K12 )+M T (K12 +K22 )) L−(M −B)R−1 (B T K12 +M T K22 )
(50) On the basis of these expressions, two different hypothesis can be formulated in order to have the feedback control depending on the observer state only.
On Local Observer Design for LQR Problems with Tracking
4.1
45
First Hypothesis
According to the classical observer based control schemes, it can be introduced the constraint that the control law is a feedback from the estimated state z˜ only; this means that in (48) it must be verified that B T K11 + M T K12 = 0
(51)
It is interesting to notice that this position makes the observer dynamics independent from the not measurable state x ˜, since the term N C x ˜ = N y is measurable. Under condition (51), the control law is u = −BR−1 (B T K12 + M T K22 )˜ z + Bˆ r
(52)
where the forcing term rˆ, written putting in evidences the block decomposition of the matrices involved, assumes the expression rˆ = −R−1 B T M T × −1 −C T N T Qxe −AT (53) 0 (K12 B + K22 M )R−1 B T (K12 B + K22 M )R−1 M T − LT In (53), it should be verified, as it will be done later, that the matrix −C T N T −AT (K12 B + K22 M )R−1 B T (K12 B + K22 M )R−1 M T − LT
(54)
is still full rank, after the design procedure, even with the introduction of the constraint (51). If (51) holds, Riccati equation (46) is rewritten as 0 0 K11 A + K12 N C K12 L − K12 A + K22 N C K22 L 0 (K12 B + K22 M )R−1 (B T K12 + M T K22 ) T T T T T T A K11 + C N K12 A K12 + C N K22 Q + P −P − − =0 (55) −P P LT K12 LT K22 From (55), four equations can be obtained − (K11 A + K12 N C) − (AT K11 + C T N T K12 ) − (Q + P ) = 0
(56)
− K12 L − (AT K12 + C T N T K22 ) + P = 0
(57)
− (K12 A + K22 N C) − LT K12 + P = 0
(58)
(K12 B + K22 M )R
−1
(B K12 + M K22 ) − K22 L − L K22 − P = 0 T
T
T
(59)
The sum of the four equations gives (K12 B + K22 M )R−1 (B T K12 + M T K22 ) − (K12 + K22 )(L + N C) − (L + N C)T (K12 + K22 ) − (K11 + K12 )A − AT (K11 + K12 ) − Q = 0
(60)
46
P. Di Giamberardino and D. Iacoviello
An analysis of (60), in which both the term K12 B + K22 M and the term K12 + K22 are present, suggests that if the position M = B is set, the whole expression (60) simplifies. On the other hand, M = B means that in the observer design, the same effect of the input as in the original dynamics is chosen, so that in the observation error the contribution of the input is neglected. It corresponds to the choice of a Luenberger classic observer. Moreover, if M = B is used in the expression of the matrix (50), in addition to (51), one gets A − BR−1 B T (K12 + K22 ) −BR−1 B T (K12 + K22 ) (61) L + NC − A L With M = B, constraint (51) yields to K11 + K12 = 0 ↔ K12 = −K11
(62)
and the four equations (56)–(59) simplify as − K11 (A − N C) − (A − N C)T K11 − (Q + P ) = 0
(63)
K11 L + (AT K11 − C T N T K22 ) + P = 0
(64)
(K11 A − K22 N C) + LT K11 + P = 0
(65)
−1
B (−K11 + K22 ) − K22 L − L K22 − P = 0
(66)
(−K11 + K22 )BR−1 B T (−K11 + K22 ) −(−K11 + K22 )(L + N C) − (L + N C)T (−K11 + K22 ) − Q = 0
(67)
(−K11 + K22 )BR
T
T
and (60) becomes
Setting L + N C = A, equation (67) corresponds to the Riccati equation which arises when the LQR problem is defined for the system (4), the plant part of (37), and with cost function (6). In addition, it must be observed that the gain matrix in (52) reduces to −R−1 B T (−K11 + K22 ), that is the same obtained solving (67). Then, the relation L + N C = A is defined and the LQR problem with full state measurement for (4) and (6) can be firstly solved, finding the positive definite symmetric matrix Ks = −K11 + K22 and using it in the full problem. Moreover, under all these hypothesis, matrix (61) becomes A − BR−1 B T (−K11 + K22 ) −BR−1 B T (−K11 + K22 ) (68) 0 A − NC from which it can be concluded that the closed loop system (49) is asymptotically stable if A − N C has all its eigenvalues with negative real part, since the block A − BR−1 B T (−K11 + K22 ) is asymptotically stable from the reduced LQR problem solution.
On Local Observer Design for LQR Problems with Tracking
47
Finally, it is possible to rewrite the optimal control law (47) in a simpler form, particularising (52) and (53). One gets u = −R−1 B T Ks z˜ −1 −(N C)T Qxe −AT −1 T −R B I I 0 Ks BR−1 B T Ks BR−1 B T − LT
(69)
−AT −(N C)T Ks BR−1 B T Ks BR−1 B T − LT 0 II I −I −AT + Ks BR−1 B T = 0I 0 I −LT Ks BR−1 B T
Since
(70) (71)
computing the inverse of both members, one has that =
I −I 0 I
−(N C)T −AT −1 T Ks BR B Ks BR−1 B T − LT
−1 (72)
II 0I (73) and L are invertible by
−(AT − Ks BR−1 B T )−1 0 T −1 −(L ) (Ks BR−1 B T )(AT − Ks BR−1 B T )−1 −(LT )−1
which can be computed since both AT − Ks BR−1 B T construction. With all these positions, the expression for the optimal control (69) becomes u = −R−1 B T Ks z˜ −(AT − Ks BR−1 B T )−1 0 Qxe − R−1 B T I 0 0 ∗ −(LT )−1 = −R−1 B T Ks z˜ + R−1 B T (AT − Ks BR−1 B T )−1 Qxe
(74)
Comparing the expression (8) for the control when the state feedback without observer is considered (obtained in Sect. 2), with (74) obtained with the proposed procedure, it clear that (74) corresponds to (8) with the state x ˜ substituted by the estimated state z˜. 4.2
Second Hypothesis
A different choice can be performed, instead of introducing the constraint (51) to have the feedback control law depending on the observer state only. It can be assumed that, once the control (47) is computed, the state x ˜ is substituted by its estimate z˜. With this position, dynamics (49) becomes z + Bˆ r x ˜˙ = A˜ x − BR−1 (B T K11 + M T K12 + B T K12 + M T K22 )˜ z˜˙ = (L − M R−1 (B T K12 + M T K22 + B T K11 + M T K12 ))˜ z + N Cx ˜ + M rˆ (75)
while matrix (50) assumes the expression m11 m12 m21 m22
(76)
48
P. Di Giamberardino and D. Iacoviello
with m11 = A − BR−1 (B T (K11 + K12 ) + M T (K12 + K22 )) m12 = −BR−1 (B T (K11 + K12 ) + M T (K12 + K22 )) m21 = L + N C − A − (M − B)R−1 (B T (K11 + K12 ) + M T (K12 + K22 )) m22 = L − (M − B)R−1 (B T (K11 + K12 ) + M T (K12 + K22 )) It can be verified that, setting M = B and L = A − N C, matrix (76) reduces to A − BR−1 B T (K11 + 2K12 + K22 ) −BR−1 B T (K11 + 2K12 + K22 ) (77) 0 A − NC At the same time, if theRiccati equation (46) is multiplied on the left by I I I and on the right by , using the relations defined above for M and L, easy I computations give the expression (K11 + 2K12 + K22 ) BR−1 B T (K11 + 2K12 + K22 ) − (K11 + 2K12 + K22 ) A − AT (K11 + 2K12 + K22 ) − Q = 0
(78)
Defining, in this case, Ks = K11 + 2K12 + K22
(79)
matrix (77) becomes the same as (68), which results asymptotically stable since Ks is the solution of the Riccati equation (78) and N is chosen so that A − N C is asymptotically stable, provided that (A, C) is observable. Both the hypothesis adopted in Subsect. 4.1 and in the present Subsect. 4.2 bring to the same procedure for the control system design. It consists in solving the LQR problem defined by dynamics (4) under cost function (6), solving a Riccati equation of the form (7), with K = Ks , and using the state feedback (12) using the estimate state z˜ provided by the linear observer (11) with G = N computed to have A − N C asymptotically stable. The behaviour and the performances of this approach are compared with the same of the approach described in Sect. 3 making use of the numerical example introduced in next Section.
5 The Case Study
As anticipated in the Introduction, the case study of an epidemic spread is considered for comparing the proposed approaches. The mathematical model, briefly recalled in Subsect. 5.1, is the one introduced in [3] and [4], for which a LQR optimal control has been proposed in [5]. The classical state feedback solution there obtained has been enhanced introducing a state observer in [6], whose design is here shortly recalled in Subsect. 5.2, necessary since (at least) one of the state variables is not measurable. Finally, in [7], an improvement in the observer design is introduced to compensate some of the effects of the nonlinearity of the dynamics, as recalled in Subsect. 5.4. The solution here proposed, described in Sect. 4, is applied in Subsect. 5.5.
5.1 A Short Recall of the Mathematical Model
In this paper, the model of the HIV/AIDS diffusion presented in [3,4] is adopted and is here briefly recalled. The state variables introduced in the model denote the healthy people, S1, not aware of dangerous behaviours and who can therefore be infected, and S2, the ones that, suitably informed, pay great attention to protection, and the three levels of infectious subjects: I, the infected but unaware of their status, P, the HIV positive patients, and A, the AIDS diagnosed ones. The control actions are the information campaign, u1, and the test campaign to discover the infection as soon as possible, u2. A third action, u3, the therapy which aims at reducing the transition from HIV to AIDS, is also considered, since mortality among A is higher than in P. Then, the mathematical model is

Ṡ1 = Z - d S1 - β S1 I/Nc + γ S2 - S1 u1
Ṡ2 = -(γ + d) S2 + S1 u1
İ = β S1 I/Nc - (d + δ) I - ψ (I/Nc) u2
Ṗ = εδ I - (α + d) P + φψ (I/Nc) u2 + P u3
Ȧ = (1 - ε)δ I + α P - (μ + d) A + (1 - φ)ψ (I/Nc) u2 - P u3    (80)
where Nc = S1 + S2 + I. In (80), d denotes the rate of natural death; Z denotes the flux of new subjects into the class S1; β is related to the dangerous interactions between the S1 and I categories; γ is the rate of wise subjects that could incidentally change their status, increasing S1(t); ψ is related to the control action aiming at helping the individuals in class I to discover their infectious condition, and therefore to flow to the P or the A class; φ is the percentage of test positive subjects with HIV ((1 - φ) the percentage with AIDS); δ is the rate of transition from I to P (percentage ε) or A (percentage (1 - ε)) without any external action; α is the rate of the natural transition from P to A; μ is the rate of death in class A caused by the infection. The compact form of (80) can be expressed as

Ẋ = f(X) + g(X)U = F(X, U)    (81)

once X = (S1 S2 I P A)^T, U = (u1 u2 u3)^T and

f(X) = \begin{pmatrix} Z - d S1 - β S1 I/Nc + γ S2 \\ -(γ + d) S2 \\ β S1 I/Nc - (d + δ) I \\ εδ I - (α + d) P \\ (1 - ε)δ I + α P - (μ + d) A \end{pmatrix}    (82)
g(X) = (g1 g2 g3) = \begin{pmatrix} -S1 & 0 & 0 \\ S1 & 0 & 0 \\ 0 & -ψ I/Nc & 0 \\ 0 & φψ I/Nc & P \\ 0 & (1 - φ)ψ I/Nc & -P \end{pmatrix}    (83)

are defined. For the choice of possible output functions, it must be observed that the subjects with a positive diagnosis, P and A, can be easily counted, since they are reported by medical operators. Then, it seems reasonable to assume the total number of diagnosed individuals, P(t) + A(t), as the meaningful output, so giving

y(t) = C X(t),    C = (0 0 0 1 1)    (84)
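For illustration only, a minimal numerical sketch of the open-loop dynamics (80)-(83) is reported below; the parameter values and the initial condition are placeholders chosen for readability (the values actually used in the paper are listed in Sect. 6), so the sketch shows the structure of the model rather than the paper's simulations.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameter values (placeholders; see Sect. 6 for the paper's values).
d, Z, beta, gamma = 0.02, 1000.0, 1.5, 0.2
psi, phi, delta, eps, alpha, mu = 1.0, 0.95, 0.4, 0.6, 0.5, 1.0

def hiv_dynamics(t, X, u1, u2, u3):
    """Right-hand side of (80): X = (S1, S2, I, P, A)."""
    S1, S2, I, P, A = X
    Nc = S1 + S2 + I
    dS1 = Z - d*S1 - beta*S1*I/Nc + gamma*S2 - S1*u1
    dS2 = -(gamma + d)*S2 + S1*u1
    dI  = beta*S1*I/Nc - (d + delta)*I - psi*(I/Nc)*u2
    dP  = eps*delta*I - (alpha + d)*P + phi*psi*(I/Nc)*u2 + P*u3
    dA  = (1 - eps)*delta*I + alpha*P - (mu + d)*A + (1 - phi)*psi*(I/Nc)*u2 - P*u3
    return [dS1, dS2, dI, dP, dA]

X0 = [9.0e3, 0.0, 2.0e3, 1.0e3, 0.9e3]               # arbitrary initial state
sol = solve_ivp(hiv_dynamics, (0.0, 50.0), X0,
                args=(0.0, 0.0, 0.0), max_step=0.1)   # uncontrolled evolution
print(sol.y[:, -1])                                   # state after 50 time units
```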
5.2 Control Problem Definition and LQR Problem Formulation
An optimal control problem for the HIV/AIDS dynamics (80) has been formulated in [5]. It aims at minimising the most dangerous class of individuals, the infected I, keeping the control amplitude as low as possible. With these choices, the distinction between P and A, as well as the presence of the control u3, becomes irrelevant. Then, the two-dimensional control vector Û = (u1 u2)^T is introduced, neglecting, consequently, the vector field g3(·) in (83) and introducing the matrix ĝ(X) = (g1 g2). Under these assumptions, the cost function

J(X, Û) = (1/2) ∫_{t0}^{∞} (q I² + r1 u1² + r2 u2²) dt = (1/2) ∫_{t0}^{∞} (X^T Q X + Û^T R Û) dt    (85)

is defined, with Q the five-dimensional square matrix with all zero entries except Q(3,3) = q, and R = \begin{pmatrix} r1 & 0 \\ 0 & r2 \end{pmatrix}, r1, r2 > 0.
The quadratic structure of (85) and the preference for a state feedback implementation of the control law drove the solution of such a problem in [5] to an LQR form, designed on the linearised approximation of (80) in the neighbourhood of one equilibrium point. A study of the existence of equilibrium points and of their stability properties has been performed in [3,4], yielding the two possible solutions

X_1^e = \begin{pmatrix} 1/d \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} Z,    X_2^e = \begin{pmatrix} 1/H \\ 0 \\ (H-d)/(H(d+δ)) \\ εδ(H-d)/(H(α+d)(d+δ)) \\ δ(H-d)[(1-ε)d+α]/(H(α+d)(d+δ)(μ+d)) \end{pmatrix} Z    (86)
where H = β - δ. The non-negativeness of the elements of the state vector X_2^e implies the condition H ≥ d > 0; therefore the equilibrium point X_2^e is a feasible
one if and only if H ≥ d, with X_1^e = X_2^e if H = d. The presence of a bifurcation in the stability analysis is discussed in [4]. Making use of the same values for the model parameters as in [5], the condition H > d holds, so that both equilibrium points exist. Computing the linearised dynamics of (81) in the neighbourhood of the two equilibrium points, one gets

Ẋ̃ = A_i X̃ + B̂_i Û
ỹ = C X̃    (87)
with

A_i = ∂f/∂X |_{X=X_i^e},    B̂_i = ĝ(X_i^e),    X̃ = X - X_i^e,    ỹ = y - C X_i^e    (88)
and i = 1, 2 depending on the choice. Although the procedure can be adopted with reference to both equilibrium points, easy computations show that the linear dynamics which approximates the nonlinear one in the neighbourhood of X_1^e is neither detectable nor controllable. So, in view of a control synthesis in the local linear domain, the linearisation in a neighbourhood of X_2^e is chosen. Then, the linear dynamics is (87) with i = 2.

5.3 The Linearised Optimal Control Problem Solution
The use of the new coordinates X̃, consequence of (88), implies a change of variables also in the cost function (85), where X̃ must appear instead of X. The new expression is

J(X, Û) = J(X̃ + X_2^e, Û) = Ĵ(X̃, Û)
        = (1/2) ∫_{t0}^{∞} [ (X̃ - r̄)^T Q (X̃ - r̄) + Û^T R Û ] dt
        = (1/2) ∫_{t0}^{∞} [ q ( Ĩ + (H-d)/(H(d+δ)) )² + r1 u1² + r2 u2² ] dt    (89)

where r̄ = (∗ ∗ r̄_Ĩ ∗ ∗)^T denotes the LQR tracking term, with r̄_Ĩ = -(H-d)/(H(d+δ)). For the dynamics (87), the optimal control problem with cost function (89) corresponds to a classical Linear Quadratic Regulator (LQR) problem with a constant tracking term. The result, computed and discussed in [5], is a state feedback control law with a reference term. Once the Algebraic Riccati Equation
0 = K_R B̂_2 R^{-1} B̂_2^T K_R - K_R A_2 - A_2^T K_R - Q    (90)

is solved w.r.t. K_R, the state feedback optimal control law is given by [1]

Û = -R^{-1} B̂_2^T K_R X̃ + R^{-1} B̂_2^T g_r̄ = F X̃ + r    (91)
where g_r̄ = (K_R B̂_2 R^{-1} B̂_2^T - A_2^T)^{-1} Q r̄, with Q r̄ = (0 0 q r̄_Ĩ 0 0)^T; K = -R^{-1} B̂_2^T K_R is the gain matrix as in (9), while r = R^{-1} B̂_2^T g_r̄ is the constant tracking term. Stability for the linear controlled system is proven in [5]. In the same paper, the problem of the unavailability of a measure of all the state variables has been solved by computing a linear state observer, under the hypothesis that, since only local solutions are available due to the requirement of a state feedback control, a linear approach can be sufficient for the observer as well, its approximation being well compensated by its simplicity of design and implementation. Then, once the detectability property of (A_2, C) in (87) (i = 2) has been verified, the state estimation z̃(t) of the state X̃, verifying the asymptotic condition lim_{t→+∞} (X̃(t) - z̃(t)) = 0, can be obtained as the state evolution of the Luenberger-like linear observer

ż̃(t) = (A_2 - GC) z̃(t) + B̂_2 Û(t) + G ỹ(t)    (92)
with matrix G chosen in order to have all the eigenvalues of the dynamic matrix (A_2 - GC) with negative real part. The whole control system is a dynamical output feedback controller with state observer and feedback from the state estimation.
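As a hedged sketch of how (90)-(91) can be evaluated numerically (the matrices A2, B̂2, the weights and the tracking term r̄ are assumed to be given; the data used below is random placeholder data, not the paper's):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def lqr_with_tracking(A2, B2, Q, R, rbar):
    """Solve the ARE (90) and build the control law (91):
       U = F @ x_tilde + r, with F = -R^{-1} B2^T K_R and r = R^{-1} B2^T g_rbar."""
    KR = solve_continuous_are(A2, B2, Q, R)                 # K_R in (90)
    Rinv = np.linalg.inv(R)
    F = -Rinv @ B2.T @ KR                                   # feedback gain
    # g_rbar = (K_R B2 R^{-1} B2^T - A2^T)^{-1} (Q rbar)
    g_rbar = np.linalg.solve(KR @ B2 @ Rinv @ B2.T - A2.T, Q @ rbar)
    r = Rinv @ B2.T @ g_rbar                                # constant tracking term
    return F, r

# Illustrative call with placeholder matrices of the right dimensions only.
rng = np.random.default_rng(0)
A2 = rng.standard_normal((5, 5)) - 2.0 * np.eye(5)
B2 = rng.standard_normal((5, 2))
Q = np.zeros((5, 5)); Q[2, 2] = 1e-4
R = np.diag([1.0, 1000.0])
F, r = lqr_with_tracking(A2, B2, Q, R, rbar=np.array([0, 0, -1.0, 0, 0]))
```

Note that (90) multiplied by -1 is the standard continuous-time algebraic Riccati equation, which is why the standard solver can be used directly.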
5.4 The Solution with the Improved Observer Design
The approach proposed in [7] is here recalled. Following what is illustrated in Sect. 3, consider the control law of the form (91) computed by solving the LQR control problem on the basis of the linear approximation of the dynamics in a neighbourhood of X_2^e. Under the action of this state feedback, the controlled system assumes the form

Ẋ = f(X) + ĝ(X)(F X̃ + r) = Φ(X, r)    (93)
with its linear approximation asymptotically stable. Its equilibrium point can be denoted by X_e^c: Φ(X_e^c, r) = 0, and the linear approximation of (93) in the neighbourhood of X_e^c can be computed; it is given by

Ẋ̄ = A_c X̄    (94)

where A_c = ∂Φ(X, r)/∂X |_{X=X_e^c} and X̄ = X - X_e^c.
A local linear observer for the linear approximating dynamics can be designed in the usual form

ż̄ = (A_c - GC) z̄ + G ȳ    (95)

when dealing directly with the linearised dynamics, or

ż = (A_c - GC) z + G y - A_c X_e^c    (96)
expressed in the original state variables. According to the general discussion in Sect. 3, once the observer has been designed, the control law (91) can be implemented using the state estimate z̄ instead of the real but not measurable state X̄ (or z instead of X). Then, the control law (91) can be rewritten as

u = F X̃ + r = F(X - X_2^e) + r = F(X̄ + X_e^c - X_2^e) + r = F X̄ + r + F(X_e^c - X_2^e)    (97)

so that the dynamics (93), using the observer (95), assumes the expression

Ẋ = f(X) + ĝ(X)(F z̄ + r) + ĝ(X) F(X_e^c - X_2^e)    (98)
to be considered along with the observer dynamics (95). Some manipulations allow writing the dynamics (98) as

Ẋ = f(X) + ĝ(X)(F z̄ + r) + ĝ(X) F(X_e^c - X_2^e)
  = Φ(X, r) + ĝ(X) F(z̄ - X̃ + X_e^c - X_2^e) = Φ(X, r) + ĝ(X) F(z - X)    (99)

where z - X can be replaced by z̄ - X̄ or z̃ - X̃ according to convenience. It is easy to verify by substitution that the whole dynamics (95)-(99) has the equilibrium point X = X_e^c, z = X_e^c (X̄ = z̄ = 0). This means that asymptotically the state of the observer and the one of the original system are equal. The fact that the asymptotic error is equal to zero can be proved also by computing the error dynamics

ė = (A_c - GC) z̄ + GC x̄ - Φ(X, r) - ĝ(X) K(z - x)    (100)

and evaluating it in a neighbourhood of X = X_e^c, yielding

ė = (A_c - GC) z̄ + GC x̄ - A_c x̄ - B̂_c K(z̄ - x̄) = (A_c - GC - B̂_c K) e    (101)
Then, with σ(A_c - GC - B̂_c K) ∈ C⁻, the error goes asymptotically to zero. This confirms what was stated in the previous Section: the observer (95) works properly, without producing undesired changes in the system dynamics and converging asymptotically to the system state, once G is designed to have σ(A_c - GC - B̂_c K) ∈ C⁻, provided that (A_c - B̂_c K, C) is a detectable pair.
5.5 The Proposed Combined Approach
Following the description reported in Sect. 4, the solution of the problem consists of two steps. The first one concerns the design of the optimal state feedback control, computing the matrix K_s as the solution of the Riccati equation (67) which, for the present example, assumes the expression (90) with K_R = K_s, G = N, and where L + NC = A_2. The solution (91) then corresponds to the expression (74). With respect to the initial formulation of the problem in Sect. 4, K_s can represent -K_{11} + K_{22} if the choice described in Subsect. 4.1 is adopted, or K_{11} + 2K_{12} + K_{22} for the choice of Subsect. 4.2. The second step is the observer design, for which N and L = A_2 - NC must be computed. Under the observability condition on (A_2, C), N can be computed so that n eigenvalues with negative real part can be assigned to the matrix L.
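A possible numerical sketch of this second step (computing N so that L = A_2 - NC has a chosen set of stable eigenvalues) is given below; the matrices are placeholders with a guaranteed observable structure, not the case-study matrices, and pole placement on the dual pair is only one of several admissible methods:

```python
import numpy as np
from scipy.signal import place_poles

def observer_gain(A2, C, desired_poles):
    """Compute N such that L = A2 - N C has the desired (stable) eigenvalues.
       Pole placement on the dual pair (A2^T, C^T) yields N^T; requires (A2, C) observable."""
    return place_poles(A2.T, C.T, desired_poles).gain_matrix.T

# Illustrative observable placeholder (5 states in companion form, first state measured).
A2 = np.array([[0, 1, 0, 0, 0],
               [0, 0, 1, 0, 0],
               [0, 0, 0, 1, 0],
               [0, 0, 0, 0, 1],
               [-1, -2, -3, -4, -5]], dtype=float)
C = np.array([[1.0, 0.0, 0.0, 0.0, 0.0]])
N = observer_gain(A2, C, desired_poles=[-1.0, -1.1, -1.2, -1.3, -1.4])
assert np.all(np.linalg.eigvals(A2 - N @ C).real < 0)
```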
6 Numerical Results and Discussion
In this Section, a numerical analysis is performed to compare the solution proposed in [7] and the one obtained following the approach here proposed in Sect. 4. The values of the parameters in the dynamics (80) adopted for the numerical computations have been taken from [3,5,10,11]:

d = 0.02,  γ = 0.2,  β = 1.5,  ψ = 10⁵,  δ = 0.4,  ε = 0.6,  α = 0.5,  μ = 1,  φ = 0.95,  Z = 1000

Then, H = β - δ = 1.1 > d > 0, so that the equilibrium point X_2^e exists and is locally asymptotically stable. Numerically, X_2^e = (0.91  0  2.34  1.08  0.9)^T · 10³. The linear approximation in the neighbourhood of this equilibrium point is described by C as in (84) and by the following numerical matrices:

A_2 = \begin{pmatrix} -0.80 & 0.20 & -0.12 & 0 & 0 \\ 0 & -0.22 & 0 & 0 & 0 \\ 0.78 & 0 & 0.30 & 0 & 0 \\ 0 & 0 & 0.24 & -0.52 & 0 \\ 0 & 0 & 0.16 & 0.5 & -1.02 \end{pmatrix}    (102)

B̂_2 = \begin{pmatrix} -0.91 & 0 \\ 0.91 & 0 \\ 0 & -72 \\ 0 & 68.40 \\ 0 & 3.60 \end{pmatrix} · 10³    (103)
6.1 Case of the Improved Observer Design
As far as the controller is concerned, the control law is computed as the solution of the LQR problem with offset (tracking) term defined in Subsect. 5.3, thanks to the controllability property of the pair (A_2, B̂_2) in (102) and (103). Performing the computations, the LQR reference term r̄_Ĩ to be used in (91) assumes the value r̄_Ĩ = -2.34 · 10³. As in [5], the cost function weights q = 10⁻⁴, r1 = 1, r2 = 1000 are chosen. The solution K_R of the Algebraic Riccati Equation (90) gives

K_R = \begin{pmatrix} 0.07 & -0.01 & 0.14 & 0 & 0 \\ -0.01 & 0.02 & -0.05 & 0 & 0 \\ 0.14 & -0.05 & 4.33 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} · 10⁻⁶    (104)

and then g_r̄ = (0.05  0  -1.03  0  0)^T · 10⁻². The optimal control (91) so obtained, which should drive the state variable Ĩ of the linearised system to the reference value r̄_Ĩ, is of the form (91) with

K = \begin{pmatrix} -0.71 & 0.25 & -1.78 & 0 & 0 \\ -0.10 & 0.04 & -3.12 & 0 & 0 \end{pmatrix} · 10⁻⁴    (105)
and

r = \begin{pmatrix} 0.41 \\ 0.74 \end{pmatrix}    (106)
As far as the observer design is concerned, following the procedure described in Sect. 3, the observer to be designed has the form (29), rewritten as in (30), and here reported for the present case:

ż = (A_c - GC) z + GC x - A_c X_e^c    (107)
where G has to be computed, according to (33), after having verified the detectability property of the pair (A_c - B_c K, C), in order to have σ(A_c - B_c K - GC) ∈ C⁻. The numerical value of the equilibrium point X_e^c for the controlled system (93) is

X_e^c = (1.0176  3.9822  0  0  0)^T · 10⁴    (108)

For the computation of the matrix A_c + B_c K, A_c has to be computed as the Jacobian of the controlled system evaluated in (108), B_c = g(X_e^c), while K_R is the output of the LQR optimal control problem previously solved. One has

A_c = \begin{pmatrix} 1.09 & 0.21 & 1.04 & 0 & 0 \\ -1.11 & -0.23 & -1.34 & 0 & 0 \\ 0 & 0 & -0.34 & 0 & 0 \\ 0 & 0 & 0.24 & -0.52 & 0 \\ 0 & 0 & 0.16 & 0.50 & -1.02 \end{pmatrix}    (109)

and, then,

A_c + B_c K = \begin{pmatrix} 1.04 & 0.21 & 0.92 & 0 & 0 \\ -1.06 & -0.23 & -1.22 & 0 & 0 \\ -0.76 & -0.01 & -23.40 & 0 & 0 \\ 0.72 & 0.01 & 22.15 & -0.52 & 0 \\ 0.04 & 0 & 1.31 & 0.50 & -1.02 \end{pmatrix}    (110)
Observability of the pair (A_c - B_c K, C) can easily be checked, and then it is possible to compute G so as to verify the convergence condition. Discussions about the characteristics of the transient in the observer dynamics are reported in [5] and lead to the choice of the set of eigenvalues Λ = {-1.0, -1.1, -1.2, -1.3, -1.4} to be assigned to the matrix A_c - B_c K - GC. One has

G = (-0.22  1.20  0.03  -0.02  0)^T · 10³    (111)
6.2 Case of the Proposed Solution
The control law design for this case follows the same steps as for the previous case. Firstly, the solution of the Riccati equation (90), with A_2 as in (102), B̂_2 as in (103), and where K_R = K_s, has to be computed. With the same choice of the
cost function weights q = 10⁻⁴, r1 = 1, r2 = 1000, the solution K_s of (90) is the same as K_R in (104). As a consequence, the same holds for g_r̄. The solution here adopted is based on the use of a linear state observer of the form (39), with M = B̂_2, L = A_2 - NC and N computed in order to have σ(L) ∈ C⁻. The eigenvalues can be chosen equal to the ones of the previous case. Then, the gain matrix N obtained is

N = (4.18  15.31  14.87  2.19  1.56)^T    (112)
Fig. 1. Time history of individuals in S1(t).
Fig. 2. Time history of individuals in S2(t).
Fig. 3. Time history of individuals in I(t).
Fig. 4. Time history of individuals in P(t).
Fig. 5. Time history of individuals in A(t).
6.3 Simulation Results
Three cases have been simulated to compare their behaviours. The first is the direct use of the state feedback optimal control without an observer. It is introduced as a benchmark for the other two approaches. This case is denoted in the legend of Figs. 1, 2, 3, 4 and 5 by State feedback. Moreover, the two solutions described in this paper are reported in Figs. 1, 2, 3, 4 and 5; the approach described in Sect. 3 and numerically given in Subsect. 6.1 is referred to as Improved observer, while the approach proposed in Sect. 4, with numerical values in Subsect. 6.2, is addressed as Combined approach. The pure state feedback control scheme, whose behaviour is reported in the Figures with a solid line, shows the effectiveness of the control strategy based on a local LQR problem solution: the number of infected I is very quickly reduced to a very small value, Fig. 3, and consequently a decrease of the number of the diagnosed patients P, Fig. 4, and A, Fig. 5, is obtained. At the same time, the number of healthy individuals S1 is maintained sufficiently high, Fig. 1, while, due to a reduction of the infection probability, the passage from S1 to S2 is no longer necessary for the spread containment and the individuals in S2, reported in Fig. 2, tend to zero by natural death. The effects of the two proposed approaches are also reported in the Figures: the dashed line, denoted in the legend as Improved observer, depicts the behaviours of the state variables when the solution proposed in Sect. 3 for the observer design, with numerical values for the present case study reported in Subsect. 6.1, is applied; the dash-dot line, marked as Combined approach, depicts the time histories of the state variables in the case of application of the procedure described in Sect. 4 and particularised to the present case study in Subsect. 6.2. The most evident effect of the use of a state observer, independently of which approach is followed in its design, is an expected different behaviour during the transient, since the control law bases its effect on a poorly estimated state. When, asymptotically, the state estimation error goes to zero, the differences with respect to the state feedback are strongly reduced. A comparative analysis of the performances of the two proposed observer design procedures puts in evidence their differences at steady state, since the one which takes into account the actual working point shows a behaviour which tends to be similar to that of the state feedback. The other approach pays for its design simplicity with an offset difference, due to the fact that in this case the observer is designed to better approximate the state in a region different from the actual one, according to the difference between the equilibrium point of the uncontrolled system and that of the controlled dynamics.
7 Conclusions
This paper is an extension of the contribution presented at ICINCO2019. It discusses the problem of the implementation of a state feedback control, obtained
solving a LQR problem for a linearised nonlinear dynamics, using a local asymptotic state observer. The approach proposed at ICINCO2019 is based on the construction of the local state observer which better approximates the dynamics in the neighbourhood of the final working point, supposed equal to the equilibrium of the state feedback controlled system. This result is here enriched starting from a different problem definition, in which the linear state observer design is included in the LQR control problem and the local approximations are necessarily performed in a neighbourhood of an equilibrium point of the initial uncontrolled system. The effectiveness of the two solutions obtained, as well as a comparison between them, is evidenced by the results of their implementation on a case study represented by a virus spread dynamics.
References

1. Anderson, B.D.O., Moore, J.B.: Optimal Control (1989)
2. Andrieu, V., Praly, L.: On the existence of a Kazantzis-Kravaris/Luenberger observer. SIAM J. Control Optim. 45(2), 432–456 (2006)
3. Di Giamberardino, P., Compagnucci, L., De Giorgi, C., Iacoviello, D.: A new model of the HIV/AIDS infection diffusion and analysis of the intervention effects. In: 25th IEEE Mediterranean Conference on Control and Automation (2017)
4. Di Giamberardino, P., Compagnucci, L., De Giorgi, C., Iacoviello, D.: Modeling the effects of prevention and early diagnosis on HIV/AIDS infection diffusion. IEEE Trans. Syst. Man Cybern. Syst. 4(10), 2119–2130 (2019)
5. Di Giamberardino, P., Iacoviello, D.: LQ control design for the containment of the HIV/AIDS diffusion. Control Eng. Pract. 77, 162–173 (2018)
6. Di Giamberardino, P., Iacoviello, D.: An output feedback control with state estimation for the containment of the HIV/AIDS diffusion. In: Proceedings of the 26th Mediterranean Conference on Control and Automation (MED) (2018)
7. Di Giamberardino, P., Iacoviello, D.: An improvement in a local observer design for optimal state feedback control: the case study of HIV/AIDS diffusion. In: Proceedings of the 16th International Conference on Informatics in Control, Automation and Robotics (ICINCO 2019), vol. 1 (2019)
8. Kazantzis, N., Kravaris, C.: Nonlinear observer design using Lyapunov's auxiliary theorem. In: Proceedings of the 36th Conference on Decision & Control (1997)
9. Luenberger, D.G.: Observing the state of a linear system. IEEE Trans. Mil. Electron. MIL-8(2), 74–80 (1964)
10. Massad, E.: A homogeneously mixing population model for the AIDS epidemic. Math. Comput. Model. 12(1), 89–96 (1989)
11. Naresh, R., Tripathi, A., Sharma, D.: Modeling and analysis of the spread of AIDS epidemic with immigration of HIV infectives. Math. Comput. Model. 49, 880–892 (2009)
12. Respondek, W., Pogromsky, A., Nijmeijer, H.: Time scaling for observer design with linearizable error dynamics. Automatica 40, 277–285 (2004)
13. Sassano, M., Astolfi, A.: A local separation principle via dynamic approximate feedback and observer linearization for a class of nonlinear systems. IEEE Trans. Autom. Control 64(1), 111–126 (2019)
14. Sundarapandian, V.: Local observer design for nonlinear systems. Math. Comput. Model. 35, 25–36 (2002)
15. Sundarapandian, V.: Reduced order observer design for nonlinear systems. Appl. Math. Lett. 19, 936–941 (2006)
16. Zeitz, M.: The extended Luenberger observer for nonlinear systems. Syst. Control Lett. 9, 149–156 (1987)
6-DoF Extrinsic Calibration Between a Manipulator and a 2D Laser Scanner

Jan Alberts, Sebastian P. Kleinschmidt, and Bernardo Wagner

Real-Time Systems Group, Institute for Systems Engineering, Leibniz Universität Hannover, Appelstraße 9A, 30167 Hannover, Germany
{alberts,kleinschmidt,wagner}@rts.uni-hannover.de

Abstract. A precise geometric calibration between the subsystems of a robotic system is crucial for its usage and applicability. As mobile robots are already being used in industry and households, the interest in mobile manipulators rises, and calibration procedures between manipulators and the robot's exteroceptive sensors become necessary. Nevertheless, state-of-the-art calibration procedures between a manipulator and a 2D laser scanner often require complex calibration targets or additional sensors for calibration. As a consequence, the practical usage in an industrial context is limited. For this reason, we propose a simple but effective approach for determining the transformation between a manipulator and a 2D laser scanner. The transformation is solved for six degrees of freedom using a 1D calibration target. In this paper, the modeling of the original approach presented in [1] is extended to decrease the number of restrictions. Therefore, the objective function was improved, leading to better optimization results, and the methodology of the proposed approach is presented in more detail. Moreover, additional experiments are carried out to evaluate the accuracy of the extended calibration procedure. The effect of different calibration trajectories and calibration target geometries is evaluated additionally.

Keywords: Mobile robotics · Flexible robotic machining · Industry 4.0 · Recalibration · Extrinsic calibration · Mobile manipulation · LiDAR · Parameter optimization
1 Introduction

In this chapter, the proposed calibration procedure is presented. The motivation is stated and the calibration procedure is placed in the context of related works.

1.1 Motivation

The scientific attention regarding autonomous mobile manipulators has grown steadily over the past years. The combination of manipulation and perception attached to a mobile platform promises a multitude of new applications and a huge potential in industrial robotics and logistics, but also in search and rescue or disaster scenarios [10]. A key factor in the context of Industry 4.0 is the modularization and flexibilization of manufacturing systems [11]. Mobile manipulators could move within the manufacturing
system and transport materials or products between the working stations. They could also be used for so-called flexible robotized machining, meaning a mobile manipulator approaches a working station and uses its manipulator to work on a work piece [8]. In the context of disaster scenarios, a manually controlled mobile manipulator has already been deployed, fulfilling exploration tasks in Fukushima [15]. In the EU-Project Smokebot, another automated mobile platform was used to support firefighters by building maps of potentially hazardous areas [7]. However, in order to combine perception with manipulation, it is essential to relate the associated subsystems in the same coordinate system. To put several subsystems into a geometric context, they need to be extrinsically and intrinsically calibrated [10]. Depending on the subsystems and the task, there are different requirements on the calibration procedure regarding the required time, the environmental constraints, the simplicity of use and the accuracy [1]. Furthermore, we distinguish between three types of calibration:

– Initial Calibration.
– Recalibration.
– Flexible Calibration.

The Initial Calibration needs to be applied when a robotic system has been set up or has been modified. The Recalibration has to be performed whenever the geometric context of the robotic subsystems has changed - for example, because of environmental disturbances such as vibration or harsh shocks. Flexible Calibration is the process when a flexible robot registers to a new working station and needs to be localized relative to this station. Initial Calibration and Recalibration can be performed automatically or manually, while the Flexible Calibration is intended solely for automated calibration. The 2D laser scanner is of particular relevance for mapping and localization [5], emphasizing its importance for mobile manipulators. As it is also widely used for object detection in working cells [8] or for monitoring of large areas in manufacturing environments [17], this paper focuses on the calibration of a 2D laser scanner and a manipulator. A calibration procedure that we presented before in [1] has been extended, improved, and evaluated with more experiments in a real environment. In consequence, the proposed procedure does not rely on complex calibration targets or additional sensors, and is therefore easily applicable for non-professionals and automated calibration.

1.2 Related Works
Calibration procedures have been studied extensively in the context of robotic systems [4]. Regarding flexible robotic machining or mobile manipulators, there exist procedures concerning intrinsic calibration of the manipulator, as in [19], as well as procedures for extrinsic calibration of the manipulator and exteroceptive sensors [10]. Concerning the extrinsic calibration between a 2D laser scanner and a manipulator, the known movement of the manipulator needs to be set in relation to the corresponding measurements of the laser scanner. Two principal approaches have been followed by the research community so far:
– Use of additional sensors.
– Use of a specific calibration target.

In the following, both principles are compared with each other and associated publications are presented. Finally, our approach is put into the context of the related works. To calibrate the manipulator with a 2D laser scanner by using additional sensors, an exteroceptive sensor is attached to the end effector of the manipulator. This attached sensor is calibrated to the 2D laser scanner, meaning the transformation between the additional exteroceptive sensor and the laser scanner is determined. As the configuration of the manipulator is known, the transformation between the manipulator base and the 2D laser scanner can be identified. The calibration between a laser scanner and a camera is well examined in the research community. Several publications such as [6, 13, 14], and [20] evaluated this approach. The common usage of those procedures is an advantage of this approach. The major drawback is that an additional sensor is needed. Moreover, a further calibration between the end effector and the camera is necessary. This problem is known as the hand-eye calibration problem and has been investigated widely, for example in [9] and [18]. This additional calibration may also reduce the accuracy of the calibration. To calibrate the manipulator to a 2D laser scanner by using a calibration target, an object with known geometry is attached to the end effector. The configuration of the manipulator is chosen such that a cross-section of the calibration target within the laser plane is measured by the laser scanner. This cross-section depends on the configuration of the manipulator, the geometry of the calibration target and the transformation between manipulator base and laser scanner. As a consequence, the measurements of the laser scanner can be used for calibration. A 3D calibration target is used in [12] to determine the transformation between a manipulator and a 3D laser scanner. A sphere is attached to the end effector, and the translation and orientation are determined in two iterative steps. A different approach requiring a 3D measurement is used in [8]. A 2D laser scanner is attached to a straight rail and can measure 3D points by moving along the rail. In at least five measurements, the pose of a simple 2D plate is estimated to determine the transformation between a manipulator, the rail, and the laser scanner. In [3], a 3D calibration target is designed in such a way that 2D cross sections lead to unique estimates of the laser scanner's pose. This enables detecting the object's pose with a single measurement. An object attached to the end effector could be used to determine the transformation between the end effector and laser scanner, leading to the transformation between laser scanner and manipulator base as the configuration is known. Only one measurement is required, but the calibration target is complex, and as stated in [2] the target should span an area of at least 1 m² for a reliable calibration. In [16], several calibration procedures for calibrating a robotic system with two manipulators have been presented. An approach has been proposed for calibrating a tilting 2D laser scanner with a manipulator. A checkerboard was attached to the end effector and intensity measurements of the tilting laser scanner have been used to determine the checkerboard's pose in the laser scan.
A tilting laser scanner is required for this procedure and the transformation between the end effector and the checkerboard has to be known initially or has to be determined by a further calibration. In [2], an approach is presented where a fixed 2D laser scanner determines the pose of a complex 2D calibration target without using intensities. The geometry of the calibration target is well known, and features of the
cross section are used in multiple measurements to determine its pose in the laser scan. The procedure is evaluated in simulation, and while the translational error of the calibration is quite low, the rotational error is relatively large (±0.1 cm and 4.22°). In our original publication [1], we proposed a calibration procedure using a 1D calibration target. As mentioned earlier in this section, calibration procedures using a calibration target do not require additional sensors. Moreover, the use of a calibration target that is available for the robot during operation makes it applicable for recalibration. Using our approach, there is no need for 3D measurements like in [8, 16], and [12], or for measuring intensities like in [16]. The approach has been evaluated in simulation as well as in real experiments, and the remaining errors after calibration were less significant than in [2]. In this paper, our proposed calibration procedure is extended by the use of a twisted 1D calibration target. Thus, the procedure is even simpler to apply, requiring fewer restrictions. The evaluation in [1] did not consider the effect of the calibration trajectory on the quality of the calibration. Furthermore, it did not examine the effect of the calibration target's geometry. For this reason, additional experiments regarding the effect of the trajectory and the calibration target's geometry are evaluated and presented in this publication. Besides, the methodology and underlying assumptions are presented in more detail. Finally, the extended procedure is compared to the original approach.
2 Methodology

In this chapter, our approach for calibrating a serial manipulator with a 2D laser scanner is formulated and the mathematical background is given. We start by formalizing the kinematic chain of the serial manipulator, the calibration target, and the laser scanner. Finally, the closed transformation chain is modelled, leading to the definition of the optimization problem, which is the core of our proposed method. The proposed formulation of the optimization problem is an extension of the formulation published in [1] and facilitates the calibration with fewer restrictions.

2.1 Mathematical Background
In 3D space, a rigid body transformation from coordinate frame F_a to F_b is given by T^{b:a}_τ: R³ → R³. The vector τ describes the transformation and consists of a translational and a rotational component. The translational part is represented by three parameters describing the translations along the three main axes. A minimal representation of the rotation is given by a set of three angles. The so-called Euler angles represent sets of angles composing three successive rotations leading to unique orientations. Thus, the vector τ is a 6D vector defined by τ = (ψ, ϑ, ϕ, tx, ty, tz), where tx, ty, tz define the translation along the dedicated axes and ψ, ϑ, ϕ ∈ [0, 2π[ define the sequential rotations in the ZYX Euler convention. The homogeneous transformation H_a^b ∈ R^{4×4}
is a non-minimal representation of T^{b:a}_τ, which consists of a rotation matrix R_a^b ∈ R^{3×3}, a translation vector P_a^b ∈ R³ from a to b, and a stretching factor ε ∈ R:

H_a^b = \begin{pmatrix} R_a^b & P_a^b \\ 0\ 0\ 0 & ε \end{pmatrix}

A rigid body transformation in 3D space for ZYX Euler angles is represented as a homogeneous transformation by

H(τ) = \begin{pmatrix} c_ϕ c_ϑ c_ψ - s_ϕ s_ψ & -c_ϕ c_ϑ s_ψ - s_ϕ c_ψ & c_ϕ s_ϑ & t_x \\ s_ϕ c_ϑ c_ψ + c_ϕ s_ψ & -s_ϕ c_ϑ s_ψ + c_ϕ c_ψ & s_ϕ s_ϑ & t_y \\ -s_ϑ c_ψ & s_ϑ s_ψ & c_ϑ & t_z \\ 0 & 0 & 0 & 1 \end{pmatrix}

By convention, c_i and s_i substitute cos(i) and sin(i). The multiplication of homogeneous transformations models the sequential transformations of serial kinematic chains: H_a^c = H_a^b · H_b^c. 3D points are modelled as 4D vectors, making them applicable to homogeneous coordinate transformations. Thus, a 3D point in coordinate frame F_a is denoted as p^a = (p^a_x, p^a_y, p^a_z, 1)^T.

2.2 Problem Formulation

Figure 1 illustrates the experimental setup. A serial manipulator with its base frame F_B is shifted and rotated towards a 2D laser scanner with its frame F_L. The purpose of the calibration procedure is to determine the static transformation T^{B:L}_Ψ from F_L to F_B, where Ψ = (ψ, ϑ, ϕ, tx, ty, tz) is the estimated 6D vector describing the translational and rotational transformation using the ZYX Euler convention. This transformation is expressed as the homogeneous transformation matrix H_L^B(Ψ).

2.3 Modeling of the Manipulator's Kinematic Chain

The kinematics of a serial manipulator is represented as a transformation chain connecting the coordinate frames of the corresponding links F_i, where i ∈ F and F is the set of existing coordinate frames. The manipulator's configuration is defined as c ∈ C, where C describes the set of all configurations during a calibration procedure. Thus, the transformations from the manipulator base F_B to the end effector F_E along the several links can be combined into one homogeneous transformation as a function of c, leading to H_B^E(c). Figure 2 illustrates the kinematic chain of a serial manipulator. The transformation from the end effector link to the gripper's Tool Center Point (TCP) is defined as H_E^{G_0}. By convention, the TCP's Z-axis points towards the grasping direction.
Fig. 1. Setting and frame diagram of manipulator, laser scanner and end effector [1].
Fig. 2. Illustration of a serial manipulator, the coordinate frames and the kinematic chain from manipulator base to end effector.
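As an illustrative sketch (not the authors' implementation), the homogeneous transformation H(τ) introduced in Sect. 2.1 and the composition of a kinematic chain can be written as follows; the exact Euler sequence used here is an assumption and should be checked against the H(τ) expression above.

```python
import numpy as np

def homogeneous_from_tau(tau):
    """Build H(tau) from tau = (psi, theta, phi, tx, ty, tz) using a
    Z-Y-X sequence of elementary rotations (sketch; conventions may differ)."""
    psi, theta, phi, tx, ty, tz = tau
    def rot(axis, a):
        c, s = np.cos(a), np.sin(a)
        if axis == 'x': return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])
        if axis == 'y': return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
        return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
    H = np.eye(4)
    H[:3, :3] = rot('z', phi) @ rot('y', theta) @ rot('x', psi)
    H[:3, 3] = [tx, ty, tz]
    return H

# Sequential transformations of a serial chain: H_a^c = H_a^b @ H_b^c.
H_ab = homogeneous_from_tau([0.1, 0.2, 0.3, 0.5, 0.0, 0.2])
H_bc = homogeneous_from_tau([0.0, 0.0, 0.4, 0.1, 0.1, 0.0])
H_ac = H_ab @ H_bc
point_b = np.array([0.1, 0.2, 0.3, 1.0])   # 3D point in homogeneous coordinates
point_a = H_ab @ point_b                   # same point expressed in frame a
```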
2.4 Modeling of the Calibration Target
The calibration target used in our approach is a cylinder with height h and diameter d. Figure 3a shows the calibration target attached to the gripper. As h ≫ d, the calibration target is modeled as a 1D line in R³. The calibration target is attached to the TCP of the gripper. It is assumed that the calibration target is rotated with respect to the TCP about the Y-axis by the angle λ_ϑ. This angle is variable during attachment but fixed during the calibration procedure. The angles around the X- and Z-axes are assumed to be 0. Thus, the
(a) Photo of cylindric calibration target attached to the gripper
(b) Photo of twisted calibration target attached to the gripper
Fig. 3. Attached calibration target.
calibration target can be described by the following homogeneous transformation

H_G^{Ge}(λ_ϑ, λ_z) = \begin{pmatrix} cos(λ_ϑ) & 0 & sin(λ_ϑ) & 0 \\ 0 & 1 & 0 & 0 \\ -sin(λ_ϑ) & 0 & cos(λ_ϑ) & λ_z \\ 0 & 0 & 0 & 1 \end{pmatrix}

where λ_z ∈ [0, h] and λ_ϑ ∈ [0, 2π[. Figure 3b shows a photo of a twisted calibration target attached to the gripper with angle λ_ϑ. It clearly shows that the twist of the calibration target caused by the attachment to a regular gripper can be described by a single rotation.

2.5 Modeling of the Laser Scanner

The proposed approach uses a LiDAR (light detection and ranging) laser scanner. A rotating laser beam is reflected by surrounding objects. The resulting time of flight of the laser beam enables conclusions about the distance to obstacles along the corresponding path of the laser beam. The total of all measurements of a 360-degree scan delivers a model of the environment's geometric conditions for a specific time. Thus, the 2D laser scanner spans a 2D plane. A single measurement can be modeled in Cartesian space as ({}^t x^L(i), {}^t y^L(i))^T, where i ∈ N, i ≤ N is the measurement index, t represents the time, N is the number of measurements contained in a full 360°
laser scan, and L denotes the coordinate frame of the laser scanner. In the following, the laser scan is modeled as a 4D vector {}^t X^L(i) = ({}^t x^L(i), {}^t y^L(i), 0, 1)^T, making it applicable with homogeneous transformations. The measurement of a full 360° laser scan at time t is defined as the merged set

L(t) := ⋃_{i=1}^{N} {}^t X^L(i, t).
2.6 Modeling of the Closed Transformation Chain
For the proposed calibration procedure, the calibration target is attached to the manipulator's end effector (see Fig. 1). A closed transformation chain between laser scanner, calibration target, and manipulator occurs as the calibration target intersects the 2D laser plane. The resulting intersection is illustrated in Fig. 4.

Fig. 4. Illustration of setting, frame diagram, and transformation chain between laser scanner, end effector, calibration target, and laser measurement [1].

The transformation between the laser scanner and the manipulator, H_L^B(Ψ), is assumed to be static. The transformation between the manipulator base and the end effector is a function of the configuration: H_B^E(c). As each configuration c leads to a unique intersection point in the laser plane, the intersection may be modelled independently of time and measurement index. This leads to X^L(c), which is the subset of L describing the intersection point between the calibration target and the laser plane. The intersection along the Z-axis of the calibration target also depends on the configuration, leading to H_{G_0}^{G_i}(γ_ϑ(c), γ_z(c)), with γ_z(c) ∈ R, 0 ≤ γ_z(c) ≤ h and γ_ϑ ∈ [0, 2π[. The transformation chain is modeled as

X^L(c) = H_L^B(Ψ) · H_B^E(c) · H_E^{G_0} · H_{G_0}^{G_i}(γ_ϑ(c), γ_z(c)) · (0, 0, 0, 1)^T    (1)

where the product H_B^E(c) · H_E^{G_0} · H_{G_0}^{G_i}(γ_ϑ(c), γ_z(c)) · (0, 0, 0, 1)^T is denoted X^B(c, γ_ϑ, γ_z),
and the chain can be summarized as

X^L(c) = H_L^B(Ψ) · X^B(c, γ_ϑ, γ_z).    (2)
X^B(c, γ_ϑ, γ_z) can be interpreted as the intersection point referring to the manipulator base frame F_B.

2.7 Modeling of the Optimization Problem

Laser measurements of the intersection points between the calibration target and the laser plane, combined with the corresponding configurations of the manipulator, lead to a unique transformation H_L^B(Ψ). By the formulation of an optimization problem, this missing transformation can be determined optimally in terms of a defined mean error. Thus, the principle of the calibration procedure is based on an optimization problem minimizing the deviation of the resulting closed transformation chain expressed by Eq. 2. As Eq. 2 relates the missing parameters to the configuration and the laser measurements, the translational error vector E = (e_x, e_y, e_z, 0)^T is defined by

E(c, Ψ, γ_ϑ, γ_z) = H_L^B(Ψ) · X^B(c, γ_ϑ, γ_z) - X^L(c).    (3)

γ_ϑ(c) and γ_z(c) determine the intersection point referring to the calibration target. Thus, γ_z(c) varies for each configuration. The solution of the formulated optimization problem will also yield optimal values of these variables, which are not of interest. The scalar error is given by

E(C, Ψ, γ_ϑ, γ_z) = (1/N) Σ_{c∈C} √( E^T(c) · E(c) ),    (4)

where N = |C|. Note that the formulated objective function corresponds to the mean translational deviation between the given laser measurements and the calibration target intersections. Thus, the objective function leads to a geometrically interpretable result of the optimization. Furthermore, using the square root function leads to better optimization results, as the nonlinear function penalizes errors closer to zero. Finally, the free optimization problem is formulated as

(Ψ̂, γ̂_ϑ, γ̂_z) = arg min_{Ψ, γ_ϑ(c), γ_z(c)} E(C, Ψ, γ_ϑ, γ_z)    (5)

Note that the solution of this optimization problem provides the parameters of the missing transformation between laser scanner and manipulator base, Ψ, as well as γ_z(c) and γ_ϑ(c), which describe the intersection points along the calibration target.
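A condensed sketch of the optimisation in (3)-(5) is given below; it is not the authors' MATLAB/fmincon implementation. The end-effector poses H_B^E(c), the fixed TCP transform and the measured intersection points X^L(c) are assumed to be available, and the helper names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

def pose_to_matrix(p):
    """Homogeneous matrix from p = (psi, theta, phi, tx, ty, tz); ZYX sequence assumed."""
    H = np.eye(4)
    H[:3, :3] = Rotation.from_euler("ZYX", [p[2], p[1], p[0]]).as_matrix()
    H[:3, 3] = p[3:6]
    return H

def target_transform(g_theta, g_z):
    """Transform from TCP to the intersection point: rotation about Y by g_theta,
    offset g_z along Z (cf. the calibration-target model of Sect. 2.4)."""
    c, s = np.cos(g_theta), np.sin(g_theta)
    return np.array([[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, g_z], [0, 0, 0, 1]])

def mean_error(params, H_BE_list, H_EG0, X_L_list):
    """Objective (4): mean residual norm of the closed chain (3).
    params = [Psi (6), gamma_theta (1), gamma_z for each configuration (N)]."""
    H_LB = pose_to_matrix(params[:6])
    g_theta, g_zs = params[6], params[7:]
    origin = np.array([0.0, 0.0, 0.0, 1.0])
    residuals = []
    for H_BE, g_z, X_L in zip(H_BE_list, g_zs, X_L_list):
        X_B = H_BE @ H_EG0 @ target_transform(g_theta, g_z) @ origin
        E = H_LB @ X_B - X_L                       # Eq. (3)
        residuals.append(np.sqrt(E[:3] @ E[:3]))
    return float(np.mean(residuals))

# Typical call (data collection not shown); trust-constr plays the role of an
# interior-point solver with bounds:
# res = minimize(mean_error, x0=np.zeros(7 + len(X_L_list)), method="trust-constr",
#                args=(H_BE_list, H_EG0, X_L_list), bounds=bounds)
```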
2.8 Preprocessing of Laser Measurements

As the real calibration target corresponds to a 3D geometry with the shape of a cylinder but was simplified by a 1D representation, the corresponding measurements of the cross-section need to be preprocessed.
Fig. 5. Illustration of the cross section between the 2D laser plane and the real calibration target.
A cylinder that intersects a 2D plane vertically creates a circle. As the entrance angle decreases, this circle is distorted, and an elliptical shape with the so-called main and secondary axis is formed. This is illustrated in Fig. 5. The length of the secondary axis of the resulting ellipse always corresponds to the diameter of the calibration target d_c. The length of the main axis d_he results in

d_he = d_c / sin(δ) ≥ d_c,    (6)

where δ defines the entrance angle. Imposing the restriction that δ ≥ 60°, the resulting distortion can be limited to

d_he / d_c = 1 / sin(60°) ≈ 1.155.

Under the assumption that the laser measurements result in a perfect circle, the approach is chosen to determine the arithmetic mean of the laser measurements and to add a fixed correction value of π · d_c / 8. Thus, the center of the circle is determined, which can be used for the optimization. The correction is illustrated in Fig. 6. This assumption is a simplification that neglects the distortion of the cross-section.
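A small sketch of the described preprocessing (arithmetic mean of the cross-section points plus the fixed correction π·d_c/8); applying the correction along the viewing direction away from the scanner is an assumption made for illustration:

```python
import numpy as np

def cross_section_center(points_xy, d_c):
    """Estimate the cylinder centre from the 2D points of one cross-section:
    arithmetic mean of the measured points plus a fixed offset of pi*d_c/8,
    applied along the viewing direction (scanner assumed at the origin)."""
    mean = points_xy.mean(axis=0)
    direction = mean / np.linalg.norm(mean)      # unit vector scanner -> target
    return mean + (np.pi * d_c / 8.0) * direction

# Example: synthetic points on the half of a 2.5 cm cylinder facing the scanner,
# cylinder centre at (1.0, 0.0) m.
d_c = 0.025
angles = np.linspace(np.pi / 2, 3 * np.pi / 2, 30)
pts = np.c_[1.0 + (d_c / 2) * np.cos(angles), (d_c / 2) * np.sin(angles)]
print(cross_section_center(pts, d_c))            # close to the true centre (1.0, 0.0)
```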
Fig. 6. 2D cross section of the cylindric calibration target. Arithmetic mean (grey) and the corrected arithmetic mean (black) [1].

3 Experiments

To validate the extended calibration procedure, real experiments are performed, which will be described subsequently. The following chapter presents the metrics for evaluation, the used hardware and ground truth, the experiments regarding different trajectories, and the experiments concerning various geometries of calibration targets.
3.1 Methodology and Metrics

Data provided by real measurements is applied to the formulated optimization problem given in Eq. 5 and leads to an optimal solution Ψ̂. For solving the optimization problem, the MATLAB implementation (fmincon) of the interior point algorithm is used. Therefore, the following bounds are defined:

-2 ≤ r_{x,y,z} ≤ 2
-2π ≤ ϕ, ϑ, ψ ≤ 2π
-π ≤ γ_ϑ ≤ π
-0.3 ≤ γ_z ≤ 0.3
The optimizer is initialised using zero vectors:

Ψ_0 = (0, 0, 0, 0, 0, 0)^T,    γ_{z,0} = (0, …, 0)^T,    γ_{ϑ,0} = 0

As the optimization problem requires laser measurements for several configurations c, a sufficient calibration trajectory is necessary. To determine the accuracy of the calibration method, the resulting homogeneous transformation of the estimate Ĥ_L^B = H(Ψ̂) is compared to the ground truth H̃_L^B = H(Ψ̃) in several metrics. The transformation of the resulting error is calculated as

H_E = H̃_L^B · (Ĥ_L^B)^{-1} = \begin{pmatrix} R_E & P_E \\ 0\ 0\ 0 & 1 \end{pmatrix}    (7)

The rotational error is determined by evaluating the ZYX Euler angles of R_E such that a qualitative and quantitative inspection of the rotational error along the several axes of the laser scanner can be carried out. The translational error is given as P_E. Moreover, the Euclidean norms of the errors are determined, leading to:
e_{x,y,z} = (e_x, e_y, e_z)^T,    e_{|xyz|} = |e_{x,y,z}|_2
e_{ϕ,ϑ,ψ} = (e_ϕ, e_ϑ, e_ψ)^T,    e_{|ϕϑψ|} = |e_{ϕ,ϑ,ψ}|_2
The optimizer's result for the objective function is also determined for the comparison of the several experiments:

Ê = E(C, Ψ̂, γ̂_ϑ, γ̂_z)
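These metrics can be computed, for instance, as follows (a sketch; the ZYX Euler-angle extraction is an assumption consistent with the convention used above):

```python
import numpy as np

def calibration_errors(H_gt, H_est):
    """Error transformation (7) and derived metrics: translational error P_E,
    ZYX Euler angles of R_E, and the Euclidean norms of both."""
    H_err = H_gt @ np.linalg.inv(H_est)
    R_err, P_err = H_err[:3, :3], H_err[:3, 3]
    theta = -np.arcsin(np.clip(R_err[2, 0], -1.0, 1.0))   # rotation about Y
    phi = np.arctan2(R_err[1, 0], R_err[0, 0])            # rotation about Z
    psi = np.arctan2(R_err[2, 1], R_err[2, 2])            # rotation about X
    e_rot = np.array([phi, theta, psi])
    return P_err, e_rot, np.linalg.norm(P_err), np.linalg.norm(e_rot)

# Example: ground truth vs. an estimate shifted by 5 mm and rotated by ~0.5 deg about Z.
H_gt = np.eye(4)
a = np.deg2rad(0.5)
H_est = np.array([[np.cos(a), -np.sin(a), 0, 0.005],
                  [np.sin(a),  np.cos(a), 0, 0.000],
                  [0,          0,         1, 0.000],
                  [0,          0,         0, 1]])
print(calibration_errors(H_gt, H_est))
```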
3.2 Hardware and Ground Truth
The experimental setup consists of three parts:

– Manipulator
– 2D Laser Scanner
– Calibration Target

A UR5 industrial robot by Universal Robots has been used for the experiments. It has 6 degrees of freedom in joint space, a range of 850 mm, and its repeatability is specified by the manufacturer as ±0.1 mm. The used 2D laser scanner is a Hokuyo UST-10LX-H01. It has a range of 0.06 m to 10 m, a detection angle of 270°, and an angular resolution of 0.125°. The accuracy of the range measurements is specified by the manufacturer as ±40 mm. Two different calibration targets were used: a 3D printed calibration target with a diameter of 2.5 cm (see Fig. 3) and a customary pen with a diameter of 0.8 cm (see Fig. 9). The ground truth used for the evaluation of the estimated transformation is the transformation between the laser scanner and the manipulator, determined by generating a closed transformation chain between the manipulator's frame and the laser scanner's frame. The laser scanner and the last link of the manipulator are assembled on a self-developed, 3D printed calibration mount. Figure 7 shows the used calibration setup. The transformation from the manipulator base to the end effector is given by the manipulator with a repeatability of 0.1 mm. The laser scanner's technical dimensions and the center of its frame are specified by the manufacturer. The angular tolerance with respect to the structure is given by ±1° about the vertical plane and ±0.3° about the horizontal plane. The tolerance of the laser scanner's laser beam is 1°. The manufacturing tolerance of the used 3D printer is ±0.2 mm.
Experiments - Trajectories and Procedure Extension
In the first experiments, the effect of the calibration trajectory and the extension of the procedure is evaluated. The main focus of the evaluation concerns the distance of
6-DoF Extrinsic Calibration Between a Manipulator and a 2D Laser Scanner
(a) Disassembled calibration system
73
(b) Assembled calibration system
Fig. 7. Calibration system used to determine the ground truth transformation [1]
the calibration target to the laser scanner. The laser scanner's accuracy in Cartesian space depends on the distance to the laser scanner, directly affecting the quality of the calibration. As already stated, the extension of the procedure concerns the twist of the calibration target. Thus, setting the twist angle to a value of 0 results in the algorithm used in [1], while defining the twist angle as a parameter to be determined by optimization represents the extended algorithm, as proposed in this paper. Two trajectories have been evaluated, each using the original and the extended algorithm. Both calibration trajectories consist of 6 end-effector poses, and all the poses differ in the six degrees of freedom. The first trajectory covers an area of about 245° and a distance of 0.2 to 0.35 m with respect to the laser scanner. The second trajectory covers an area of about 220° and a distance of 0.65 to 0.95 m. All the experiments have been evaluated with the calibration target shown in Fig. 3. Furthermore, the pose of the end effector was defined such that the entry angle of the calibration target is limited to 60°, as stated by the restrictions defined in Sect. 2.8. The laser scans of both trajectories are illustrated in Fig. 8 a) and b). The duration of the short distance trajectory was 37 s and the duration of the long distance trajectory was 44 s. The number of measuring points was 200 for each trajectory. Table 1 presents the results of both calibration trajectories. The twist angle γ̂_ϑ is given according to the introduced evaluation metrics. The highest error occurs in the Long Distance Trajectory using the extended algorithm. The Euclidean norm of the translational error is 20.74 mm, wherein the proportion along the Z-axis is 18.80 mm. The error of the rotational norm is 2.10°. However, relating this error to the laser scanner's technical specifications reveals that it is close to the measurement tolerances. As the tolerance of the laser beam's thickness is specified as ±1° and the tolerance of the laser scanner's horizontal plane is specified as ±1°, the
(a) Short Distance Trajectory   (b) Long Distance Trajectory
Fig. 8. Illustration of preprocessed calibration trajectories measured by the laser scanner.

Table 1. Results of experiments concerning different distance trajectories. Evaluation of error metrics and calibration target twist angle.

                 | Short distance trajectory      | Long distance trajectory
                 | Original Alg. | Extended Alg.  | Original Alg. | Extended Alg.
e_x /mm          | 4.84          | 4.17           | 8.39          | 7.81
e_y /mm          | 1.38          | 1.43           | -3.72         | -3.95
e_z /mm          | -9.73         | -14.80         | -14.71        | -18.80
e_|xyz| /mm      | 10.95         | 15.44          | 17.33         | 20.74
e_ϕ /°           | 0.06          | -0.11          | 1.57          | 1.40
e_ϑ /°           | 0.49          | 0.42           | 1.11          | 1.56
e_ψ /°           | -0.59         | -0.87          | 0.05          | 0.05
e_|ϕϑψ| /°       | 0.77          | 0.97           | 1.92          | 2.10
γ̂_ϑ /°           | 0.00          | 1.90           | 0.00          | -4.74
Ê /mm            | 0.084         | 0.079          | 0.312         | 0.306
total rotational tolerance about the X- and Y-axis is ±2°. Furthermore, a rotational tolerance of 2° leads to a vertical tolerance of 29.67 mm over the manipulator's range of 850 mm, which puts the error of 18.80 mm along the Z-axis into perspective. The comparison between the Short Distance Trajectory and the Long Distance Trajectory reveals an additional insight. It can be seen that the translational error is lower for the Short Distance Trajectory. The lowest Euclidean translational error of the Short Distance Trajectory is 10.95 mm and the lowest translational error of the Long Distance Trajectory is 17.33 mm. The same applies to the rotational error.
The lowest Euclidean rotational error is 0.77° for the Short Distance Trajectory and 1.92° for the Long Distance Trajectory. However, the error of the rotation around the Z-axis of the Long Distance Trajectory is smaller by a factor of 12 than for the Short Distance Trajectory. This means that the accuracy of the laser measurements is not the only aspect affecting the error of the calibration. The trajectory and the selection of correct poses is a crucial factor for the calibration. The lower error of the rotation around the Z-axis for the Long Distance Trajectory can be explained by the fact that the rotation of a plane can be fitted more accurately by using data points widely spread across the measurement space at a long distance from the laser scanner. As the 2D laser scanner only measures in the horizontal plane, this fact cannot be generalized for angles around the Y- and X-axis but applies for the Z-axis. Furthermore, it can be seen that there is a correlation between the translational error along one axis and the rotational error around the perpendicular axis. The translational error is decreased but the rotational error increased. This effect does not occur for Long Distance Trajectories. The comparison between the original and the extended approach reveals a new aspect. For the Short Distance Trajectory, the original algorithm results in lower translational and rotational errors, and the difference between the results is up to 35%. For the Long Distance Trajectory, the original algorithm also results in smaller translational and rotational errors, but the difference between the results is less significant. Given that the measurement error at near distance is smaller than at far distance, while the original algorithm with its additional restriction (γ̂_ϑ = 0°) still results in less error, we can conclude that the optimization using fewer restrictions seems to compensate model errors instead of fitting parameters. While the objective function shows a minimal improvement of 0.005 mm, the error of the calibration is higher (30% in the translational norm and 25% in the rotational norm). The model errors seem to have a stronger impact for near measurements. Furthermore, all the measurements reveal that the translational error along the Z-axis of the laser scanner is the highest. It ranges from -9.73 to -18.80 mm. It is assumed that the restriction of the poses to a limited entry angle of 60° caused this effect, as a high entry angle in the trajectory leads to a high change in the laser measurements, which should improve the fitting of the laser plane along the vertical axis.

3.4 Experiments - Calibration Target

In the second experiment, the effect of the calibration target's geometry has been evaluated. For this reason, another calibration target has been evaluated and compared to the calibration target used in the first experiment. The calibration target has been evaluated for the Short and the Long Distance Trajectory as illustrated in Fig. 8 a) and b). The used calibration target is a customary pen with a diameter of 8 mm, illustrated in Fig. 9. As in the first experiment, the durations of the trajectories are 37 and 44 s and 200 data points were used for optimization. For this experiment, the extended algorithm is used. Table 2 shows the results of the experiment. The results of the Short and the Long Distance Trajectory indicate that the same qualitative results can be achieved as in the first experiment. The error of the Short
Fig. 9. A customary pen used as Calibration Target.

Table 2. Results of experiments using a customary pen as calibration target.

                 Short distance trajectory   Long distance trajectory
  ex /mm         5.17                        10.05
  ey /mm         4.74                        2.36
  ez /mm         −1.87                       −15.02
  e|x,y,z| /mm   7.26                        18.22
  eϕ /°          0.53                        1.83
  eϑ /°          0.66                        2.33
  eψ /°          −0.64                       −0.18
  e|ϕϑψ| /°      1.06                        2.96
  γ̂ϑ /°          1.39                        4.01
  Ê /mm          0.025                       0.034
The error of the Short Distance Trajectory is less than the error of the Long Distance Trajectory, except for the rotation around the Z-axis. Comparing the pen and the calibration target shown in Fig. 3, it could be demonstrated that the results are nearly the same. The translational error of the pen could be reduced, giving an improvement of about 3 mm in the Euclidean norm. The rotational error of the pen is slightly increased, giving a degradation of about 0.3° in the Euclidean norm. There is one significant difference in the error along the Z-axis of the laser scanner when using the pen as calibration target: the error along the Z-axis using the pen is below 2 mm.
Even though the pen's diameter is a third of the diameter of the calibration target shown in Fig. 3, the error of the calibration does not decrease. That result was contradictory at first, as the thinner calibration target leads to fewer measuring points in the laser plane, which is thought to lead to a worse estimate of the calibration target's center. That the result is still better reinforces the impression that the unmodelled distortion of the cross section is the major problem of the procedure. A thinner calibration target leads to a smaller absolute error caused by distortion, as shown by Eq. 6:

d_{he} = \frac{d_c}{\sin(60^\circ)} = \frac{8\,\mathrm{mm}}{0.87} = 9.2\,\mathrm{mm} < \frac{25\,\mathrm{mm}}{0.87} = 28.74\,\mathrm{mm}
Furthermore, it could be shown that the translational error along the Z-axis is more than a factor of 5.2 smaller than the errors of the previous calibrations. Moreover, the experiment shows that a customary pen is sufficient for calibration, which is of practical relevance, as no specially manufactured calibration target is needed.
4 Conclusions
In this paper, the calibration approach proposed in [1] has been extended. By formulating a new optimization goal, the number of restrictions could be reduced. The modeling and preprocessing of the laser scan have been described in detail. The extension has been compared with the original approach, and the effects of different trajectories and of the calibration target's geometry have been evaluated. It could be shown that the algorithm is well suited for calibration, as even the results with the highest error still lead to a calibration accuracy within the laser scanner's tolerance. Overall, a trajectory at a low distance (20 to 35 cm) to the laser scanner leads to better results than at a long distance (65 to 95 cm), as the translational error is about 10% lower and the rotational error is about 50% lower. As the rotational error around the Z-axis is reduced by about a factor of 10 at long distance, the relation between calibration trajectory and calibration error is not only affected by distance, but also by pose. Moreover, the experiments show that disregarding the distorted cross-section of the calibration target impairs the calibration results, even though this effect was reduced by the restriction of the entry angle. Finally, it could be shown that a customary pen can be used for calibration with a translational error of 7.26 mm and a rotational error of 1.06°. In consequence, a customary pen is sufficient for calibration, which is of practical relevance: neither an additional sensor nor a specially manufactured calibration target is necessary for calibration using the proposed procedure. It was shown that the poses chosen for calibration do have an effect on the obtained calibration error; this relationship cannot be reduced to the distance to the laser scanner. Furthermore, it was shown that the distortion of the cross-section, which depends on the entry angle of the calibration target, seems to increase the calibration error. The gained insights motivate research on the following two aspects in more detail. Future research will focus on the modeling of uncertainty to reveal optimal poses or trajectories for calibration. Moreover, the research will focus on improving the algorithm by considering the distortion of the calibration target's cross-section.
References
1. Alberts, J., Kleinschmidt, S.P., Wagner, B.: Robust calibration procedure of a manipulator and a 2D laser scanner using a 1D calibration target. In: Proceedings of the 16th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, pp. 112–119 (2019)
2. Andersen, T.T., Andersen, N.A., Ravn, O.: Calibration between a laser range scanner and an industrial robot manipulator. In: 2014 IEEE Symposium on Computational Intelligence in Control and Automation (CICA), pp. 1–8 (2014)
3. Antone, M.E., Friedman, Y.: Fully automated laser range calibration. In: British Machine Vision Conference (BMVC) (2007)
4. Elatta, A.Y., Gen, L.P., Zhi, F.L., Daoyuan, Y., Fei, L.: An overview of robot calibration. Inf. Technol. J. 3(1), 74–78 (2004)
5. Engineering, I.: Mapping and localization in 3D environments using a 2D laser scanner and a stereo camera. J. Inf. Sci. Eng. 144, 131–144 (2012)
6. Fan, Y., Huang, Y., Shan, J., Zhang, S., Zhu, F.: Extrinsic calibration between a camera and a 2D laser rangefinder using a photogrammetric. Sensors 19(9), 2030 (2019)
7. Fritsche, P., Zeise, B., Hemme, P., Wagner, B.: Fusion of radar, LiDAR and thermal information for hazard detection in low visibility environments. In: 2017 IEEE International Symposium on Safety, Security and Rescue Robotics (SSRR), pp. 96–101. IEEE (2017)
8. Heikkilä, T., Ahola, J.M., Koskinen, J., Seppälä, T.: Calibration procedures for object locating sensors in flexible robotized machining (2014)
9. Horaud, R., Dornaika, F.: Hand-eye calibration. Int. J. Robot. Res. 14(3), 195–210 (1995)
10. Kriegel, S., Kaßecker, M., Suppa, M., Brucker, M., Bodenmüller, T.: Toward fully autonomous mobile manipulation for industrial environments. Int. J. Adv. Robot. Syst. 1–19, August 2017
11. Lasi, H., Fettke, P., Kemper, H.G., Feld, T., Hoffmann, M.: Industry 4.0. Bus. Inf. Syst. Eng. 6(4), 239–242 (2014)
12. Li, J.: Calibration of a portable laser 3-D scanner used by a robot and its use in measurement. Opt. Eng. 47, 1–8 (2008)
13. McIvor, A.M.: Calibration of a laser stripe profiler. In: Second International Conference on 3-D Digital Imaging and Modeling (Cat. No. PR00062), pp. 92–98 (1999)
14. Mei, C., Rives, P.: Calibration between a central catadioptric camera and a laser range finder for robotic applications. In: Proceedings 2006 IEEE International Conference on Robotics and Automation, ICRA 2006, pp. 532–537 (2006)
15. Nagatani, K., Kiribayashi, S., Okada, Y., Otake, K., Yoshida, K., Tadokoro, S., Nishimura, T., Yoshida, T., Koyanagi, E.: Emergency response to the nuclear accident at the Fukushima Daiichi nuclear power plants using mobile rescue robots. J. Field Robot. 30(1), 44–63 (2013)
16. Pradeep, V., Konolige, K., Berger, E.: Calibrating a multi-arm multi-sensor robot: a bundle adjustment approach. In: International Symposium on Experimental Robotics (ISER), New Delhi, India (2010)
17. Röwekämper, J., Ruhnke, M., Steder, B., Burgard, W., Tipaldi, G.D.: Automatic extrinsic calibration of multiple laser range sensors with little overlap. In: 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 2072–2077 (2015)
18. Tsai, R.Y., Lenz, R.K.: A new technique for fully autonomous and efficient 3D robotics hand/eye calibration. IEEE Trans. Robot. Autom. 5(3), 345–358 (1989)
19. Wieghardt, C.S., Wagner, B.: Self-calibration of a mobile manipulator using structured light. In: 2017 18th International Conference on Advanced Robotics (ICAR), pp. 197–203 (2017)
20. Zhang, Q., Pless, R.: Extrinsic calibration of a camera and laser range finder (improves camera calibration). In: 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), vol. 3, pp. 2301–2306 (2004)
Simultaneous Control of Two Robotics Arms Sharing Workspace via Visual Predictive Control
E. Le Flécher1,2, A. Durand-Petiteville3(B), V. Cadenat1,2, and T. Sentenac1
1 CNRS, LAAS, 7 Avenue du Colonel Roche, 31400 Toulouse, France
{eleflech,cadenat,sentenac}@laas.fr
2 Univ de Toulouse, UPS, 31400 Toulouse, France
3 Department of Mechanical Engineering, Federal University of Pernambuco UFPE, Recife, Brazil
[email protected]
Abstract. This work addresses the problem of simultaneously controlling two robotic arms to automatically pick up fruits in an orchard. When considering such a scenario, the design of the controller has to face several challenges: (i) the arms share the same workspace, (ii) the presence of obstacles such as branches and other fruits, and (iii) the system evolves in a dynamic environment, i.e., the positions of the fruits and branches change over time. In this paper, it is proposed to control the two robotic arms relying on a Visual Predictive Control (VPC) scheme. It allows the system to be controlled in a reactive fashion while taking into account a large number of constraints. The global and local models of the VPC scheme as well as constraints dealing with collisions or joint boundaries are presented. Finally, the efficiency of the proposed approach is highlighted with numerous simulation results using the PR2 arms model.
Keywords: Agricultural robotics · Multi-arms · Non-linear model predictive control · Image based visual servoing
1 Introduction
To meet the increasing food demands from a growing world population, agriculture will need to double its production by 2050 [11, 14]. This cannot be achieved by simply doubling the inputs (land, water, seeds, labor, etc.) because of constrained resources and environmental concerns. Among the solutions presented in [11, 14] to increase the global agricultural total factor productivity, robotics is one with the potential to affect agriculture in a broad, systemic way, and contribute significantly to meeting our future needs. To date, the three most significant impacts robotics and automation have had on agriculture are: (i) precision agriculture, i.e., the use of sensors to precisely control when and where to apply fertilizers or water; (ii) auto-guidance on field crop machinery, which today can drive down a field with an accuracy unattainable by human drivers; (iii) machines that harvest fruits and vegetables for processing (e.g., tomato
paste and orange juice) [3]. Research is now focused on the next wave of sensing, mobility, and manipulation technologies that promise to increase agricultural output and productivity. Over the last years, research has focused on robotic systems performing weeding, spraying, trimming, or harvesting in orchards. In order to achieve any of these tasks, the following capacities are expected from the robotic system: (i) autonomously navigate in an orchard; (ii) detect and localize a set of natural landmarks of interest; (iii) approach the landmarks with some robotic manipulators; (iv) manipulate/interact with the landmarks of interest without damaging them. After addressing the autonomous navigation problem in [8, 20], the focus is now on the landmark approach problem when considering numerous robotic manipulators embedded on a mobile platform. This problem has already been addressed to design harvesting systems for cucumbers [25] or tomatoes [29]. However, despite the promising results, the obtained harvesting throughput does not match the farmers' needs and must be increased [28]. To achieve this aim, one of the most adopted approaches consists of embedding several robotic arms on the mobile platform, as is done in [28] for example. However, the throughput increase is not proportional to the number of manipulators; it also depends on their distribution on the machine and their workspaces. Indeed, it is pointed out in [26] that fruits are not uniformly distributed on commercial orchard trees. To maximize the fruit accessibility, growers use specific trees with a high fruit density, thus facilitating manual harvesting. As a consequence, when designing a harvesting machine embedding multiple arms, it is also important to consider the workspace of each arm and the possible overlapping. Thus, even if the arms do not collaborate to manipulate one single object, they share a common workspace. It is then possible to use several arms to pick fruits in the same high-density area, which increases the overall throughput.
In this work, it is proposed to design a control scheme for two robotic arms aiming at positioning their end-effectors to achieve an agricultural task in an orchard. When considering performing a task in an orchard, it seems relevant to focus on reactive controllers. Indeed, a tree is a highly dynamic environment because of the flexibility of the branches and the weight of the fruits: each time a fruit is harvested, the local distribution of fruits and branches is modified. Among the available control schemes, it is proposed to use an image based visual servoing (IBVS) scheme [6]. It requires the use of a camera, and nowadays it is simple to obtain a camera small enough to be installed close to the end-effector without creating any trouble. Moreover, it seems to be the appropriate sensor to track the fruit of interest. IBVS consists of computing the error between the current coordinates of points of interest and a set of desired values. This error is then minimized via a controller based on the image Jacobian matrix, allowing to control the displacements of the end-effector. Computing the control law directly in the image space makes this approach robust to modeling errors and reactive objective modifications; however, it does not provide any guarantee regarding the trajectory of the camera in the workspace. This represents a major issue when considering arms sharing a common workspace.
Numerous works have been developed in order to modify the camera trajectory in the workspace or to handle issues such as the loss of the visual features, collisions with obstacles, etc. For example, the work presented in [12] proposes to use advanced visual features to obtain more appropriate camera trajectories. However, they do not
guarantee the non-collision with obstacles. Another approach consists of coupling IBVS with other elementary controllers, and switching from one controller to the other based on the current situation (obstacles, occultations, joint boundaries, etc.) [4, 7, 9]. This approach requires designing numerous ad-hoc controllers, and defining the switching rules can prove tedious. Other works use the robot redundancy to modify the trajectory in order to satisfy the given constraints [5, 22]. However, this method can only be used if the main task does not use all the degrees of freedom [21], which might be the case when using systems with a large number of degrees of freedom, such as humanoid robots. Finally, it has been proposed to couple IBVS with path planning [24], [18], in order to generate a path free of occultations and collisions. However, to do so, it is mandatory to have a model of the environment prior to the task execution, which might be challenging in a dynamic environment.
In this work, it is proposed to rely on a widely adopted solution allowing to take into account constraints on position, velocity, or workspace: the model predictive control scheme [15]. This approach relies on the minimization of a cost function made of the predicted states of the system and defining the task to achieve. Moreover, when minimizing the cost function with a numerical solver, the latter can handle numerous constraints [15]. Thus, to obtain a reactive controller dealing with constraints, it is proposed to design an IBVS-based NMPC (Nonlinear Model Predictive Control) controller, also named Visual Predictive Control (VPC). For this particular case, the cost function is defined as the error between the image coordinates of the visual features representing the landmark of interest at the current pose and the desired one. To develop a VPC scheme, it is then mandatory to (i) obtain the prediction models of the system state, (ii) design the cost function defining the task to achieve, (iii) design the constraints required to achieve the task, and (iv) select the numerical solver allowing to minimize the cost function under the given constraints. Thus, this paper addresses the aforementioned points for the specific case of two simultaneously controlled robotic arms. Moreover, it is an extended version of the work presented in [10], which has been complemented with an analysis of the related works, a discussion regarding the terminal constraint, and a more comprehensive set of simulation results.
The paper is organized as follows. First, Sect. 2 is dedicated to an overview of the existing VPC schemes. Next, in Sect. 3, one presents the different prediction models used, before addressing in Sect. 4 the design of the cost function and of the different constraints. In Sect. 5, a set of simulations is presented to support the proposed approach and highlight its efficiency. Finally, in Sect. 6, one summarizes and analyses the obtained results before outlining the expectations for the future.
2 Related Works
As has been explained, NMPC schemes require the use of a model to predict the future states of the system. In the case of VPC, this consists of predicting the coordinates of the visual features knowing a sequence of control inputs. Several works have addressed the design of VPC schemes, and according to [1], they can be classified into two categories: the local and global models. The first ones rely on the interaction matrix (or image Jacobian) [6], whereas the second ones use the camera pose. For both cases, the models
are built considering a flying camera, i.e., the system embedding the camera is not taken into account in the model. Finally, none of the models proposes an analytic form for the prediction of the visual features. Thus, the predicted values are obtained using a linear approximation, or with advanced numerical integration schemes such as Runge-Kutta. Most of the existing works belong to the first category, i.e., they rely on local models. For example, in [2] the authors use a first-order approximation to control a camera embedded on a robotic arm. Similar approaches are presented in [17] for a flying camera and in [19] for a mobile robot. In [27] and [16], the visual servoing scheme is represented by a linear parameter varying (LPV) system. The objective of these works is to demonstrate the feasibility of VPC subject to constraints on the camera field of view or on the command vector. The results show the relevance of the approach, but are obtained for small displacements between the initial and desired poses, minimizing the impact of the predictions. In contrast, the work presented in [23] proposes to control a UAV while avoiding other aircraft. In this case, the distance between the vehicles is large enough to use an approximate value of the depth in place of the real one. In this work, it is proposed to first perform an evaluation of the accuracy of the prediction models using the real or approximated depth values. This evaluation is done in the context of a camera embedded on a robotic arm. Following this evaluation, we then present the design of our VPC scheme to simultaneously control two robotic arms.
3 Modeling
In this section, one first focuses on the modeling of the system, i.e., the robotic arms, the cameras and the targets [10]. Then, two models to predict the visual feature coordinates of a point are presented [1, 10]. One ends with a discussion regarding the viability of these models for the visual predictive control problem.
3.1 System Modeling
The robotic system considered in this work is composed of two identical arms, each of them containing mq revolute joints. A camera is attached to the end effector of each arm. To model the system, let us first define a global frame Fb = (Ob, xb, yb, zb) attached to the system base. Moreover, two frames Fci = (Oci, xci, yci, zci), with i ∈ [1, 2], are used to represent the pose of the camera attached to the i-th arm. Let us define qij and q̇ij, with j ∈ [1, ..., mq], respectively the angular position and velocity of the j-th joint of the i-th arm. Thus, for the i-th arm, one obtains the configuration vector Qi = [qi1, qi2, ..., qimq]T and the control input vector Q̇i = [q̇i1, q̇i2, ..., q̇imq]T. Finally, for the whole system one obtains Q = [Q1T, Q2T]T and Q̇ = [Q̇1T, Q̇2T]T. The cameras mounted on the end effectors are modeled using the pinhole model. Thus, the projection matrix HIi/ci mapping the 3D coordinates (x, y, z) of a point in the camera frame Fci to its 2D projection (X, Y) on the image plane FIi is defined as follows:
\begin{bmatrix} X \\ Y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} f/z & 0 & 0 & 0 \\ 0 & f/z & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \qquad (1)
In this work, the arms are controlled to make each camera reach a pose defined by the coordinates of point visual features in the image space. To do so, one uses two landmarks made of four points. When considering the i-th camera/target couple, the coordinates of the projection of each point on the image are denoted Sil = [Xil, Yil]T, with i ∈ [1, 2] and l ∈ [1, 2, 3, 4]. Thus, the visual features vector for each camera is defined as Si = [Si1, Si2, Si3, Si4]T, whereas the system one is given by S = [S1T, ..., SnaT]T. In the same way, a vector of the desired visual feature coordinates of each camera is defined as Si* = [Si1*, Si2*, Si3*, Si4*]T.
3.2 Prediction of the Visual Features
A visual predictive control scheme requires being able to predict the values of the visual features for a given sequence of control inputs. In [1], the authors present two ways to perform the prediction step for a flying camera: the local and global approaches. This section is devoted to the presentation of both methods for the case of a 6-degrees-of-freedom camera embedded on a robotic arm.
The Global Model. Global models rely on the pose of the camera. The prediction can be decomposed as follows: from the image coordinates, the location of the visual features in the current camera frame is calculated (2D to 3D); the visual feature coordinates are then computed in the predicted camera frame (3D to 3D); and finally the visual features are projected on the predicted image plane (3D to 2D). To do so, the camera pose is generally predicted by integrating the camera kinematic screw [1]. In this work, one integrates the robotic arm model, which allows the camera pose to be predicted based on the robot command vector, minimizing the approximation. To do so, let us consider a discrete time system with a sampling period Ts, where tk+1 = tk + Ts, and the notation .(k) = .(tk). The configuration Qi(k + 1) of the i-th arm, with i ∈ [1, 2], at the next iteration, obtained after applying the control inputs Q̇i(k), is given by:

Q_i(k+1) = Q_i(k) + \dot{Q}_i(k) T_s \qquad (2)
Knowing the arm configuration at instants tk and tk+1, it is then possible to compute the homogeneous transformation matrices between the camera frame Fci and the base one Fb:

\bar{o}_{F_b}(.) = H_{b/c_i}(.)\,\bar{o}_{F_{c_i}}(.) \qquad (3)

where \bar{o} denotes the homogeneous coordinates of a point feature expressed in the base or camera frame. Finally, coupling (3) with (1), one obtains a prediction of the coordinates of a point feature in the image space:

S_{il}(k+1) = H_{I_i/c_i}(k+1)\, H_{b/c_i}^{-1}(k+1)\, H_{b/c_i}(k)\, H_{I_i/c_i}^{-1}(k)\, S_{il}(k) \qquad (4)
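To make the prediction chain above concrete, the following sketch implements one step of the global model (Eqs. (2)–(4)) for a single point feature. It assumes a forward-kinematics routine fk returning the 4×4 homogeneous transform H_{b/ci}(Q) of the camera frame in the base frame, a known (or estimated) feature depth z, and a focal length f; these names are illustrative and not part of the original implementation.

```python
import numpy as np

def predict_global(S, z, Q, Qdot, Ts, fk, f):
    """One-step global-model prediction of a point feature (sketch of Eqs. (2)-(4))."""
    # Eq. (2): forward-Euler integration of the joint configuration
    Q_next = Q + Qdot * Ts

    # 2D -> 3D: back-project the feature into the current camera frame (inverting Eq. (1))
    X, Y = S
    p_cam = np.array([X * z / f, Y * z / f, z, 1.0])

    # 3D -> 3D: express the point in the predicted camera frame (Eq. (3))
    H_k = fk(Q)        # H_{b/ci}(k): camera frame in the base frame at time k
    H_k1 = fk(Q_next)  # H_{b/ci}(k+1): predicted camera pose
    p_cam_next = np.linalg.inv(H_k1) @ (H_k @ p_cam)

    # 3D -> 2D: pinhole projection of the transformed point (Eq. (1))
    x, y, z_next, _ = p_cam_next
    return np.array([f * x / z_next, f * y / z_next]), z_next
```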
The Local Model. Local models rely on the differential equation mapping the derivative of the visual feature coordinates Ṡil to the robot control input vector Q̇i. To obtain this equation, one uses the robot Jacobian Ji, relating the camera kinematic screw Tci and Q̇i (5), and the interaction matrix Lil, mapping Ṡil to Tci (6).
T_{c_i} = J_i \dot{Q}_i \qquad (5)

\dot{S}_{il} = L_{il} T_{c_i} \qquad (6)

where

L_{il} = \begin{bmatrix} -\frac{1}{z_l} & 0 & \frac{X_l}{z_l} & X_l Y_l & -(1+X_l^2) & Y_l \\ 0 & -\frac{1}{z_l} & \frac{Y_l}{z_l} & 1+Y_l^2 & -X_l Y_l & -X_l \end{bmatrix} \qquad (7)
By combining Eqs. (5) and (6), one obtains the following local model for each arm/camera couple:

\dot{S}_{il} = L_{il} J_i \dot{Q}_i \qquad (8)

Usability of the Models. When considering using one of the models on an experimental system, several aspects have to be taken into account. First, both models require a measure/estimate of the visual feature depth z (see Eqs. (1) and (7)). However, when using a monocular camera, this value can only be retrieved by estimation. Another widely adopted option consists of using a constant value, such as the depth value at the initial or desired pose [6]. The second issue is related to the integration of the local model to obtain the prediction equation. A first option consists of using an advanced numeric scheme to accurately compute the integration, whereas a second one relies on a linear approximation of the model, such as:

S_{il}(k+1) = S_{il}(k) + L_{il} J_i \dot{Q}_i(k) T_s \qquad (9)
Thus, depending on the user's choices regarding these issues, the predicted feature coordinates will be more or less accurate. In Sect. 5, different configurations will be evaluated for our specific application in order to select the most appropriate and realistic solution.
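As an illustration, the following sketch evaluates the interaction matrix (7) for the four point features of one camera and performs the linear (Euler) prediction (9). The robot Jacobian J mapping joint velocities to the camera kinematic screw is assumed to be provided by the manipulator model, and the depth values z may be the real ones or the constant approximation z*.

```python
import numpy as np

def interaction_matrix(X, Y, z):
    # Interaction matrix L_il of a single point feature, Eq. (7)
    return np.array([
        [-1.0 / z, 0.0, X / z, X * Y, -(1.0 + X**2), Y],
        [0.0, -1.0 / z, Y / z, 1.0 + Y**2, -X * Y, -X],
    ])

def predict_local_euler(S, z, J, Qdot, Ts):
    """One-step Euler prediction of the stacked feature vector, Eq. (9) (sketch)."""
    # Stack the 2x6 interaction matrices of the four point features into an 8x6 matrix
    L = np.vstack([interaction_matrix(S[2 * l], S[2 * l + 1], z[l]) for l in range(4)])
    return S + L @ J @ Qdot * Ts
```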
4 Controller Design
In this section, the control of the arms' motion to reach their respective targets and the corresponding necessary constraints are presented. To achieve these goals, an NMPC controller has been implemented to find the optimal input vector which minimizes a cost function subject to constraints. To do so, we first present the cost function to be minimized in order to reach the desired targets. Then, the expression of each constraint will be defined. A similar approach was presented in [10] for a multi-arm robotic system.
4.1 Visual Predictive Control
The proposed VPC scheme couples NMPC (Nonlinear Model Predictive Control) with IBVS (Image-Based Visual Servoing). Thus, similarly to NMPC, it consists of computing an optimal control sequence Q̄̇(.) that minimizes a cost function J_{Np} over a prediction horizon of Np steps while taking into account a set of user-defined constraints C(Q̄̇(.)). Moreover, similarly to IBVS, the task to achieve is defined as an error in the image space. Thus, the cost function to minimize is the sum of the quadratic errors between the visual feature coordinates Ŝ(.) predicted over the horizon Np and the desired ones S*. It is then possible to write the VPC problem as:

\bar{\dot{Q}}(.) = \min_{\dot{Q}(.)} J_{N_p}(S(k), \dot{Q}(.)) \qquad (10)

with

J_{N_p}(S(k), \dot{Q}(.)) = \sum_{p=k+1}^{k+N_p-1} [\hat{S}(p) - S^*]^T [\hat{S}(p) - S^*] \qquad (11)

subject to

\hat{S}(.) \text{ given by (4) or (8)} \qquad (12a)
\hat{S}(k) = S(k) \qquad (12b)
C(\bar{\dot{Q}}(.)) \le 0 \qquad (12c)
As mentioned in the previous section, the visual feature coordinates can be predicted using either the global or the local model (Eq. (12a)). To compute the prediction, one uses the last measured visual features (Eq. (12b)). Finally, Eq. (12c) defines the set of constraints to be taken into account when minimizing the cost function (11). Thus, solving Eq. (10) leads to the optimal sequence of control inputs Q̄̇(.). In this work, and as is usually done, only the first element Q̄̇(1) of the sequence is applied to the system. At the next iteration, the minimization problem is restarted, and a new sequence of optimal control inputs is computed. This loop is repeated until the task is achieved, i.e., until the cost function (11) vanishes.
4.2 Constraints
As seen in the previous section, it is possible to add a set of constraints to the minimization problem. In this section, one presents the constraints related to the geometry of the robot, the velocity of the joints, and the field of view of the camera. One finishes with defining a set of constraints that guarantee the non-collision of the arms despite sharing a common workspace.
System Constraints. The joint velocity lower and upper bounds are respectively denoted Q̇min and Q̇max. They are two NQ̇-long vectors, with NQ̇ = 2mqNp. They allow the lower and upper limits of each joint of each arm to be defined over the
prediction horizon. Consequently, the constraint on Q̇(.) is described by the domain κ = [Q̇min, Q̇max], and the velocity constraint is then written as:

\begin{bmatrix} \dot{Q} - \dot{Q}_{max} \\ \dot{Q}_{min} - \dot{Q} \end{bmatrix} \le 0 \qquad (13)

Mechanical limits follow the same principle. The lower and upper bounds of the angular values are denoted Qmin and Qmax. They are two NQ-long vectors, with NQ = 2mqNp. These constraints are then given by:

\begin{bmatrix} Q - Q_{max} \\ Q_{min} - Q \end{bmatrix} \le 0 \qquad (14)

Finally, the visual constraints are defined by the image dimensions. One denotes the bounds by Smin and Smax. They are two NS-long vectors, with NS = 8naNp. Thus, all the predicted visual feature coordinates need to be kept within this space, and the constraints can then be written as:

\begin{bmatrix} S - S_{max} \\ S_{min} - S \end{bmatrix} \le 0 \qquad (15)

Overlapping Workspace Constraint. Due to the overlapping workspace, it is mandatory to guarantee the non-collision between the arms. To do so, one computes the shortest distance d_{i,j|i',j'} between the j-th joint of the i-th arm and the j'-th joint of the i'-th arm. The closest distance is evaluated as the norm of the segment perpendicular to both assessed links, as illustrated in Fig. 1. Finally, in order to avoid collisions, one defines the following constraint:

d_{min} - d \le 0 \qquad (16)

where d = [d_{1,1|2,1}, ..., d_{1,m_q|2,m_q}] and d_{min} is a pre-defined safe distance.
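For illustration, a standard way to evaluate the link-to-link distances d_{i,j|i',j'} used in (16) is the closest-distance computation between two 3D line segments, sketched below with NumPy. The segment endpoints would come from the forward kinematics of each arm, which is assumed to be available; this is not the paper's own code.

```python
import numpy as np

def closest_dist_segments(p1, q1, p2, q2, eps=1e-9):
    """Shortest distance between segments [p1, q1] and [p2, q2] (3D points)."""
    d1, d2, r = q1 - p1, q2 - p2, p1 - p2
    a, e, f = np.dot(d1, d1), np.dot(d2, d2), np.dot(d2, r)
    if a <= eps and e <= eps:          # both segments degenerate to points
        return np.linalg.norm(p1 - p2)
    if a <= eps:                        # first segment is a point
        s, t = 0.0, np.clip(f / e, 0.0, 1.0)
    else:
        c = np.dot(d1, r)
        if e <= eps:                    # second segment is a point
            t, s = 0.0, np.clip(-c / a, 0.0, 1.0)
        else:
            b = np.dot(d1, d2)
            denom = a * e - b * b
            # closest point parameter on segment 1 (0 if segments are parallel)
            s = np.clip((b * f - c * e) / denom, 0.0, 1.0) if denom > eps else 0.0
            t = (b * s + f) / e
            if t < 0.0:                 # clamp t and recompute s
                t, s = 0.0, np.clip(-c / a, 0.0, 1.0)
            elif t > 1.0:
                t, s = 1.0, np.clip((b - c) / a, 0.0, 1.0)
    return np.linalg.norm((p1 + s * d1) - (p2 + t * d2))
```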
Fig. 1. Computation of the shortest distance d_{i,j|i',j'} between links L_{ij} and L_{i'j'}. Source [10].
Terminal Constraint. When using an NMPC scheme, it is necessary to guarantee that the set of control inputs makes the camera converge towards the desired pose. Due to the fact that an NMPC scheme minimizes the distance between a predicted trajectory and a desired one, there is no guarantee that the ultimate predicted state has converged towards the desired one. Moreover, the prediction horizon might be too short or the constraints on the control inputs might be too restrictive to reach the goal [15]. Thus, to check the feasibility, one adds a final constraint to the optimization problem. It is defined as the error between the prediction of the visual feature coordinates Ŝ(k + Np − 1) obtained at the end of the prediction horizon and the desired ones S*. If the solver cannot compute a sequence of control inputs Q̄̇ that respects this constraint, then the problem is not feasible, and there is no guarantee regarding the system convergence.

|\hat{S}(k+N_p-1) - S^*| - \delta_{tc} \le 0 \qquad (17)

where δtc is a user-defined threshold. Two remarks can be made concerning the use of the terminal constraint. First, none of the works studied during the literature review proposes to include a terminal constraint, despite its ease of implementation and utility. Moreover, authors usually weight the last predicted value based on the distance to the desired one. This method helps the solver converge towards an optimal solution, but it does not guarantee the convergence as the terminal constraint does. Second, the terminal constraint guarantees the convergence of the predicted states towards the desired ones. In case the predictions are strongly erroneous, the real system cannot converge towards the desired state. Thus, using a terminal constraint does not provide an absolute guarantee of the task realization. All the constraints are now defined and can be integrated into the non-linear function C(Q̄̇(.)):

C(\bar{\dot{Q}}) \le 0 \;\Rightarrow\; \begin{bmatrix} \dot{Q} - \dot{Q}_{max} \\ \dot{Q}_{min} - \dot{Q} \\ Q - Q_{max} \\ Q_{min} - Q \\ S - S_{max} \\ S_{min} - S \\ D_{min} - \hat{D} \\ |\hat{S}(k+N_p-1) - S^*| - \delta_{tc} \end{bmatrix} \le 0 \qquad (18)

Thus, the optimal input must belong to the domain defined by C(Q̄̇(.)).
5 Simulation
In this section, one presents the results obtained by simulating the control of the two arms with the MATLAB software. A 7-DOF model has been selected, based on the arms embedded on the PR2 robot. In these simulations, the task consists of positioning each camera with respect to a given target. Each final pose is defined in the image space by Si* = [x1*, y1*, x2*, y2*, x3*, y3*, x4*, y4*]T, which are the coordinates of the four points representing the corresponding target. Moreover, one sets the prediction horizon to Np = 5. Regarding the constraints, the angular velocity limits for the first
prediction step are given by κ = [−0.2, 0.2] for each joint, whereas it is extended to κ = [−5, 5] for the rest of the prediction steps. This allows the computation time to be decreased while guaranteeing a feasible control input for the first step. The size of the image is given by Smin = [−4, −4] and Smax = [4, 4]. The angular constraints on each joint are set up as described in the PR2 documentation [13]. Moreover, the minimum collision avoidance distance is fixed to Dmin = 0.1. The initial coordinates of the visual features are computed based on the initial configuration of the robot, given by qinit = [0, π/2, 0, 0, 0, 0, 0, 0, π/2, 0, 0, 0, 0, 0]T for every simulation. Finally, one fixes Ts = 50 ms.
5.1 Comparison Between the Different Prediction Models
The first set of simulations assesses the accuracy of the different prediction models presented in Sect. 3.2. To do so, the two robotic arms are controlled via VPC from qinit to q* = [−π/8, π/2, 0, 0, 0, 0, 0, 0, π/2, 0, 0, 0, 0, 0]T, using the global model and the real depth values. When simulating a VPC scheme, this configuration can be considered as the ground truth, as there is no approximation in the calculations. Next, at each iteration, the optimal sequence of control inputs calculated by the VPC scheme is used to compare the different prediction models. To do so, the predicted values at the end of the prediction horizon using a particular model are compared with the reference ones. In Fig. 2, one can see the evolution of the quadratic error between the predicted coordinates of one visual feature and the ground truth. Six models are presented, obtained by combining the global model (solid lines), the local Euler approximation (dashed lines) and the local Runge-Kutta approximation (dotted lines) with the real depth values z (black) and the desired ones z* (blue). First, it can be seen that the error decreases over the servoing. Indeed, when the camera is close to the desired pose, the sequence of optimal control inputs is quite small, as there is not much distance left to cover. Having small control inputs then reduces the error in the predicted values. Moreover, at the end of the simulation, z(t) and z* are very similar, which explains the convergence of the blue and black curves for each model. Next, it can be noticed that the global model and the local Runge-Kutta one are significantly more accurate than the local Euler one. Indeed, using this latter prediction scheme introduces a large error, and might not allow the system to converge towards the desired configuration. In Fig. 3, the errors are summed over the simulation to offer a better view of the model performances. When considering the use of the real depth, the global model is the most accurate. However, this result is obtained in simulation, and its use in a real context would require a depth estimator. Thus, one now considers a more realistic scenario where the depth is approximated with the constant value z*. In this case, the local Runge-Kutta model is the most accurate. Thus, for the rest of the simulations, the local Runge-Kutta model with z* is used to evaluate the performances.
5.2 Collision-Free Scenario
In this second simulation, the system has to reach the desired configuration q* = [−π/16, 3π/8, 0, π/4, 0, π/8, 0, π/16, π/2, π/4, π/8, 0, 0, 0]T from the initial one qinit while dealing with
Fig. 2. Quadratic errors of prediction over the simulation - Solid: Global model - Dashed: Local Euler - Dotted: Local Runge Kutta - Black: z - Blue: z ∗ . Source [10].
Fig. 3. Sum of the quadratic errors of prediction for global, local Euler and local Runge-Kutta methods - Black: z - Blue: z ∗ . Source [10].
the previously presented constraints. The VPC scheme relies on the local model coupled with a Runge-Kutta algorithm and the desired depth z*. Using this configuration, the VPC scheme manages to control the robotic system so that it converges towards the desired poses, as can be seen in the different results presented hereafter. First, the evolution of the visual features is displayed in Fig. 4 and highlights the achievement of the task for both cameras. Indeed, the proposed scheme was able to control both arms to make the current visual features converge towards the desired ones from their initial locations (respectively represented by the red and green crosses in Figs. 4(a) and 4(b)). To achieve the task, the arms had to reach the configurations presented in Fig. 5. Regarding the constraints, the image limits were never reached and there was no risk of collision during the servoing. Moreover, it can be seen in Figs. 6(a) and 6(b) that the joint angular values remain within the defined domain. Similarly, in Figs. 6(c) and 6(d), one can see that the joint velocities were bounded by the constraints. Indeed, the command inputs stay within the given range κ = [−0.2, 0.2] all along the simulation.
Fig. 4. Evolution of the visual features ((a) left camera, (b) right camera) using the local Runge-Kutta model and z - Blue dotted: Trajectories - Red circles: Final locations - Green circles: Initial locations. Source [10].
Fig. 5. Final poses of the arms - Red dots represent the two targets - Green crosses are the initial poses of the end effectors. Source [10].
5.3 Auto-Collision Scenario
This new simulation is set up in order to create a risk of collision between the two robotic arms. Thus, the system has to reach the desired configuration q* = [π/16, 3π/8, 0, π/4, 0, π/8, 0, −π/16, π/2, π/4, π/8, 0, 0, 0]T. Moreover, as previously done, one uses the Runge-Kutta local model coupled with the desired depth values. As can be seen in Fig. 7, one more time the desired visual features are reached from the initial poses. However, as shown in Fig. 8, the arms had to cross each other to reach the desired locations. Thanks to the shared environment constraint included in the control problem, the collision was avoided. Indeed, Fig. 9 shows that the distance between the different links was never smaller than the given value Dmin = 0.1 m. In addition to the auto-collision avoidance, the other constraints on the joint angular values and velocities were respected (see Figs. 10 and 11).
Fig. 6. Joint configurations and velocities: (a) left arm joint angular values, (b) right arm joint angular values, (c) left arm joint angular velocities, (d) right arm joint angular velocities. Source [10].
Fig. 7. Evolution of the visual features ((a) left camera, (b) right camera) using the local Runge-Kutta model and z - Blue dotted: Trajectories - Red circles: Final locations - Green circles: Initial locations. Source [10].
Fig. 8. Final poses of the arms - Red dots represent the two targets - Green crosses are the initial poses of the end effectors. Source [10].
5.4 External Collision Scenario
In this scenario, the goal is to achieve the task while dealing with an obstacle on the right arm's trajectory. First, one defines the desired pose as q* = [−π/16, 3π/8, 0, π/4, 0, π/8, 0, π/16, π/2, π/4, π/8, 0, 0, 0]T. Next, an obstacle is positioned in the scene at the coordinates ŌObs|Fb = (0.79, −0.27, −0.08) (see Fig. 12). Finally, the minimal distance is set up to 0.1 m. As can be seen in Fig. 13(a), the positioning task is achieved, and the visual features have reached the desired coordinates. Moreover, the corresponding trajectory is plotted in Fig. 12. It should be noticed that the obstacle on the right arm's trajectory has been avoided thanks to the distance constraint. Indeed, as can be seen in Fig. 13(b), the distance between any segment of the arm and the obstacle is always higher than the defined threshold. Thus, it is possible to safely achieve the positioning task despite the presence of an unknown obstacle.
5.5 Auto/External-Collision Scenario
This simulation aims at showing the ability of the system to avoid both a collision between the arms and a collision with an external obstacle. To do so, one sets qinit = [−π/8, 3π/8, 0, π/4, 0, π/8, 0, π/16, π/2, π/4, π/8, 0, 0, 0]T. Moreover, the desired configuration is given by q* = [π/20, π/2, −π/2, π/6, π, π/4, 0, −π/20, π/2, π/2, π/6, 0, 0, 0]T. The obstacle is positioned at the coordinates ŌObs|Fb = (0.79, −0.27, −0.08) (see Fig. 14) and the minimal distance for both the arms and the obstacle is set up to 0.1 m.
Fig. 9. Collision distance - Solid: Distances between links - Dotted: Minimal allowed distance. Source [10].
Fig. 10. Configuration of the right arm over the simulation ((a)–(g): joints q11 to q17) - Blue solid: Angular joint value q1j - Red dashed: Angle boundaries. Source [10].
Fig. 11. Right arm joint velocities ((a)–(g): joints q̇11 to q̇17) - Blue solid: Joint velocity value - Red dashed: Velocity boundaries. Source [10].
Fig. 12. Final configurations of the robotic arms.
Fig. 13. Evolution of the visual features and distance between the right arm links and the obstacle: (a) evolution of the visual features in the right camera - Blue dotted: Trajectories - Red circles: Final locations - Green circles: Initial locations; (b) distance between the obstacle and the different links of the right arm - Solid: Distances - Dotted: Minimal allowed distance.
Fig. 14. Final configurations of the robotic arms.
Fig. 15. Evolution of the visual features ((a) left camera, (b) right camera) - Blue dotted: Trajectories - Red circles: Final locations - Green circles: Initial locations.
Once more, the task is successfully achieved, as can be seen in Fig. 15. Indeed, the visual features manage to converge towards the desired values despite the presence of an obstacle and the possibility of auto-collision. The corresponding trajectory is plotted in Fig. 14. This trajectory is obtained by taking into account both the constraints related to the inter-arm distances and those related to the obstacle distance (see Figs. 16(a) and 16(b)).
Fig. 16. Collision distance between the right arm and the obstacle, and between the arms: (a) distance between the obstacle and the different links of the right arm - Solid: Distances - Dotted: Minimal allowed distance; (b) collision distance - Solid: Distances between links - Dotted: Minimal allowed distance.
6 Conclusion
In this paper, the problem of fruit picking using a dual-arm robot in a commercial orchard has been considered. A strategy based on VPC has been developed to make its end-effectors reach a desired pose close to the corresponding fruits, while respecting constraints due to the environment dynamics, the visual manipulation and the shared workspace. This strategy thus allows one to benefit from the advantages of a reactive controller to perform a highly complex task in a strongly evolving environment where various constraints must be taken into account. To validate the approach, a simulation campaign has been conducted. A first step was to evaluate the different prediction models to select the best one for the servoing. Then, the VPC strategy itself was tested, considering different scenarios and using a PR2 model as the considered dual-arm robot. The obtained results have shown its interest and its efficiency to safely perform various positioning tasks in a shared workspace. For future works, two main leads can be followed. First, this work only considers a fixed dual-arm system. To go further, it is necessary to mount it on a mobile base and to coordinate the motions of the complete robotic setup. To do so, the proposed control strategy will have to be coupled with our previous works on autonomous navigation in orchards. A second interesting research axis concerns the realization of experimental tests on a real robotic system. To achieve this goal, it will be required to speed up the processing time. The prediction models will have to be selected depending not only on their accuracy but also on their computational time. In this context, designing new models specifically adapted to Graphical Processing Units could offer a nice opportunity. Such aspects are now at the core of our current work in order to perform a complete fruit-picking task in a real environment.
References
1. Allibert, G., Courtial, E., Chaumette, F.: Predictive control for constrained image-based visual servoing. IEEE Trans. Rob. 26(5), 933–939 (2010). https://doi.org/10.1109/TRO.2010.2056590
2. Assa, A., Janabi-Sharifi, F.: Robust model predictive control for visual servoing. In: 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 2715–2720 (2014)
3. Bergerman, M., Billingsley, J., Reid, J., van Henten, E.: Robotics in agriculture and forestry. In: Springer Handbook of Robotics, pp. 1463–1492. Springer, Cham (2016)
4. Cadenat, V., Folio, D., Durand-Petiteville, A.: Comparison of two sequencing techniques to perform a vision-based navigation task in a cluttered environment. Adv. Robot. 26(5–6), 487–514 (2012)
5. Chaumette, F., Marchand, T.: A redundancy-based iterative approach for avoiding joint limits: application to visual servoing. IEEE Trans. Robot. Autom. 17(5), 719–730 (2001). https://doi.org/10.1109/70.964671. http://ieeexplore.ieee.org/document/964671/
6. Chaumette, F., Hutchinson, S.: Visual servo control. I. Basic approaches. IEEE Robot. Autom. Mag. 13(4), 82–90 (2006)
7. Durand-Petiteville, A., Cadenat, V., Ouadah, N.: A complete sensor-based system to navigate through a cluttered environment. In: 2015 12th International Conference on Informatics in Control, Automation and Robotics (ICINCO), vol. 2, pp. 166–173. IEEE (2015)
8. Durand-Petiteville, A., Le Flecher, E., Cadenat, V., Sentenac, T., Vougioukas, S.: Design of a sensor-based controller performing u-turn to navigate in orchards. In: Proceedings of the International Conference on Informatics in Control, Automation and Robotics, vol. 2, pp. 172–181 (2017)
9. Durand-Petiteville, A., Cadenat, V.: From the general navigation problem to its image based solutions. In: ViCoMoR 2012, p. 1 (2012)
10. Le Flécher, E., Durand-Petiteville, A., Cadenat, V., Sentenac, T.: Visual predictive control of robotic arms with overlapping workspace. In: 16th International Conference on Informatics in Control, Automation and Robotics (ICINCO 2019), pp. 130–137 (2019)
11. Foley, J.A., Ramankutty, N., Brauman, K.A., Cassidy, E.S., Gerber, J.S., Johnston, M., Mueller, N.D., O'Connell, C., Ray, D.K., West, P.C., Balzer, C., Bennett, E.M., Carpenter, S.R., Hill, J., Monfreda, C., Polasky, S., Rockström, J., Sheehan, J., Siebert, S., Tilman, D., Zaks, D.P.M.: Solutions for a cultivated planet. Nature 478, 337–342 (2011)
12. Chaumette, F.: Asservissement visuel. In: La commande des robots manipulateurs, Traité IC2, vol. 3, pp. 105–150 (2002)
13. Garage, W.: PR2 user manual (2012)
14. Grift, T., Zhang, Q., Kondo, N., Ting, K.C.: A review of automation and robotics for the bio-industry. J. Biomech. Eng. 1(1), 19 (2008)
15. Grüne, L., Pannek, J.: Nonlinear model predictive control. In: Nonlinear Model Predictive Control, pp. 45–69. Springer, Cham (2017)
16. Hajiloo, A., Keshmiri, M., Xie, W.F., Wang, T.T.: Robust on-line model predictive control for a constrained image based visual servoing. IEEE Trans. Ind. Electron. 1 (2015). https://doi.org/10.1109/TIE.2015.2510505
17. Heshmati-alamdari, S., Karavas, G.K., Eqtami, A., Drossakis, M., Kyriakopoulos, K.J.: Robustness analysis of model predictive control for constrained image-based visual servoing. In: 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 4469–4474, May 2014
18. Kazemi, M., Gupta, K., Mehrandezh, M.: Path-planning for visual servoing: a review and issues. In: Visual Servoing via Advanced Numerical Methods, pp. 189–207. Springer, Heidelberg (2010)
19. Ke, F., Li, Z., Xiao, H., Zhang, X.: Visual servoing of constrained mobile robots based on model predictive control. IEEE Trans. Syst. Man Cybern.: Syst. 47(7), 1428–1438 (2017). https://doi.org/10.1109/TSMC.2016.2616486
20. Le Flecher, E., Durand-Petiteville, A., Cadenat, V., Sentenac, T., Vougioukas, S.: Implementation on a harvesting robot of a sensor-based controller performing a u-turn. In: 2017 IEEE International Workshop of Electronics, Control, Measurement, Signals and their Application to Mechatronics (ECMSM), pp. 1–6. IEEE (2017)
21. Mansard, N., Chaumette, F.: Task sequencing for high-level sensor-based control. IEEE Trans. Rob. 23(1), 60–72 (2007). https://doi.org/10.1109/TRO.2006.889487
22. Marchand, E., Chaumette, F., Rizzo, A.: Using the task function approach to avoid robot joint limits and kinematic singularities in visual servoing. In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 1996, Osaka, Japan, vol. 3, pp. 1083–1090. IEEE (1996). https://doi.org/10.1109/IROS.1996.568954. http://ieeexplore.ieee.org/document/568954/
23. Mcfadyen, A., Mejias, L., Corke, P., Pradalier, C.: Aircraft collision avoidance using spherical visual predictive control and single point features. In: 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 50–56. IEEE (2013)
24. Mezouar, Y., Chaumette, F.: Path planning for robust image-based control. IEEE Trans. Robot. Autom. 18(4), 534–549 (2002). https://doi.org/10.1109/TRA.2002.802218. http://ieeexplore.ieee.org/document/1044366/
25. Van Henten, E., Hemming, J., Van Tuijl, B., Kornet, J., Bontsema, J.: Collision-free motion planning for a cucumber picking robot. Biosyst. Eng. 86(2), 135–144 (2003). https://doi.org/10.1016/S1537-5110(03)00133-8
26. Vougioukas, S.G., Arikapudi, R., Munic, J.: A study of fruit reachability in orchard trees by linear-only motion. IFAC-PapersOnLine 49(16), 277–280 (2016). https://doi.org/10.1016/j.ifacol.2016.10.051
27. Wang, T., Xie, W., Liu, G., Zhao, Y.: Quasi-min-max model predictive control for image-based visual servoing. In: 2012 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), pp. 98–103, July 2012. https://doi.org/10.1109/AIM.2012.6265955
28. Zhao, Y., Gong, L., Huang, Y., Liu, C.: A review of key techniques of vision-based control for harvesting robot. Comput. Electron. Agric. 127, 311–323 (2016). https://doi.org/10.1016/j.compag.2016.06.022
29. Zhao, Y., Gong, L., Liu, C., Huang, Y.: Dual-arm robot design and testing for harvesting tomato in greenhouse. IFAC-PapersOnLine 49(16), 161–165 (2016). https://doi.org/10.1016/j.ifacol.2016.10.030
Advanced Algorithm for Interpolation with Wendland Functions
Hjortur Bjornsson(B) and Sigurdur Hafstein
Science Institute, University of Iceland, Dunhagi 3, 107 Reykjavík, Iceland
{hjb6,shafstein}@hi.is
Abstract. We develop and study algorithms for computing Lyapunov functions using meshless collocation and Wendland functions. We present a software tool that generates a C/C++ library that implements Wendland functions of arbitrary order in a specified factorized form with advantageous numerical properties. Additionally, we describe the algorithm used by the tool to generate these Wendland functions. Our factorized form is more efficient and has higher numerical accuracy than previous implementations. We develop and implement optimal grid generation for the interpolation problem using the Wendland functions. Finally, we present software that calculates Lyapunov functions using these Wendland functions and the optimally generated grid. The software tool and library are available for download with examples of usage.
Keywords: Wendland function · Lyapunov functions · Radial basis functions · Code generation
1 Introduction
Interpolation and collocation using Radial Basis Functions (RBF), in particular compactly supported RBFs, have been the subject of numerous research activities in the past decades [6, 7, 9, 10, 24–27]. They are well suited as kernels of Reproducing Kernel Hilbert Spaces and their mathematical theory is mature. The authors and their collaborators have applied Wendland's compactly supported RBFs for computing Lyapunov functions for nonlinear systems, both deterministic [11, 12, 14] and stochastic [5], where Lyapunov functions are a useful tool to analyse the stability of these systems, cf. e.g. [18–20, 22, 23]. Various numerical methods have been used to find Lyapunov functions for the systems at hand [14, 16]. Meshless collocation using RBFs is one such method and many different families of RBFs have been studied [25]. In the papers [5, 12–14] and the book [11] meshless collocation is used with Wendland functions, where the Wendland function family is defined in a recursive way, and in order to determine the actual functions to use in a software implementation many calculations had to be done by hand. (This research was supported by the Icelandic Research Fund (Rannis), grant number 152429-051, Lyapunov Methods and Stochastic Stability.) In [2] an algorithm is proposed that determines the Wendland polynomials in expanded form, that is: for each pair of integers l, k ≥ 0, it finds a list of numbers a0, a1, ..., ad such that the Wendland function
\psi_{l,k}(r) = \sum_{i=0}^{d} a_i r^i. However, it was shown in [3] that the evaluation of these polynomials in this form using typical schemes, such as Horner's scheme, can lead to significant numerical errors. Having the Wendland functions in factorized form [3] is more efficient and numerically accurate, so we propose an alternate method to determine the functions in factorized form. For that purpose, we have created a software tool that generates a reusable software library in C/C++, which implements these Wendland polynomials in factorized form. A first version of this software library was presented in [4]. We have now extended it considerably and added more functionality, most notably efficient grid generation and algorithms to solve interpolation problems for generating Lyapunov functions for stochastic and deterministic dynamical systems.
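As a small illustration of the accuracy issue, the snippet below evaluates a Wendland function near r = 1 both from its expanded coefficients (via Horner's scheme) and from a factorized form. Here ψ_{3,1}(r) = (1 − r)^4 (4r + 1) is used up to normalization; this choice of constant is an assumption for the example and not necessarily the normalization produced by our library.

```python
def horner(coeffs, r):
    # Evaluate sum_i coeffs[i] * r**i with Horner's scheme (expanded form).
    acc = 0.0
    for a in reversed(coeffs):
        acc = acc * r + a
    return acc

# Expanded coefficients of (1 - r)^4 (4 r + 1) = 1 - 10 r^2 + 20 r^3 - 15 r^4 + 4 r^5
expanded = [1.0, 0.0, -10.0, 20.0, -15.0, 4.0]

def psi31_factorized(r):
    return (1.0 - r) ** 4 * (4.0 * r + 1.0)

r = 0.9999999
print(horner(expanded, r))      # expanded form: large terms cancel, accuracy is lost
print(psi31_factorized(r))      # factorized form: tiny positive value, computed accurately
```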
2 Background
Meshless collocation with RBFs is a method that can be used to calculate Lyapunov functions for either stochastic or deterministic dynamical systems. In the paper [5] meshless collocation was used to calculate Lyapunov functions for Stochastic Differential Equations (SDE); see e.g. [11, 14] for a similar approach for deterministic systems. The method revolves around solving a linear Partial Differential Equation (PDE). Let Ω ⊂ Rn be a given domain and Γ ⊂ Rn its boundary. Then we want to solve the PDE

LV(x) = h(x), \quad x \in \Omega,
V(x) = c(x), \quad x \in \Gamma,

where L is a certain differential operator, and h and c are some appropriately chosen functions. Using meshless collocation to solve the PDE above, we choose points X1 = {x1, ..., xN} ⊂ Ω and X2 = {ξ1, ..., ξM} ⊂ Γ, and solve the interpolation problem

LV(x_i) = h(x_i) \quad \text{for all } i = 1, \ldots, N,
V(\xi_i) = c(\xi_i) \quad \text{for all } i = 1, \ldots, M.

The solution is then given in terms of a radial basis function ψ,

V(x) = \sum_{k=1}^{N} \alpha_k (\delta_{x_k} \circ L)^y \psi(x - y) + \sum_{k=1}^{M} \alpha_{N+k} (\delta_{\xi_k} \circ L^0)^y \psi(x - y), \qquad (1)
where L0 is the identity operator, δyV(·) = V(y), and the superscript y denotes that the operator is applied with respect to the variable y. The constants αi are determined as the solution of the linear system

A\alpha = \gamma, \qquad (2)

where A, called the interpolation matrix, is the symmetric matrix

A = \begin{bmatrix} B & C \\ C^T & D \end{bmatrix} \qquad (3)
and the matrices B = (bjk)j,k=1,...,N, C = (cjk)j=1,...,N,k=1,...,M and D = (djk)j,k=1,...,M have elements:

b_{jk} = (\delta_{x_j} \circ L)^x (\delta_{x_k} \circ L)^y \psi(x - y)
c_{jk} = (\delta_{x_j} \circ L)^x (\delta_{\xi_k} \circ L^0)^y \psi(x - y)
d_{jk} = (\delta_{\xi_j} \circ L^0)^x (\delta_{\xi_k} \circ L^0)^y \psi(x - y).

The vector γ has components given by

\gamma_j = r(x_j), \quad 1 \le j \le N,
\gamma_{j+N} = c(\xi_j), \quad 1 \le j \le M.

There are different choices for the radial basis function ψ. We want the interpolation matrix A to be symmetric and positive definite, and choosing ψ to have compact support can make A sparse. Under a few mild conditions, the choice of ψ as a Wendland function, i.e. a compactly supported radial basis function, ensures this [26]. This method works the same way for determining Lyapunov functions for both deterministic systems and SDEs; the difference is the choice of the differential operator L. For deterministic systems it is the orbital derivative, a first order differential operator, and for stochastic systems it is a second order differential operator. To compute such a Lyapunov function a large number of evaluations of the function ψ and its derivatives is necessary, see e.g. the examples given in Eqs. (16) and (19). To verify the properties of a Lyapunov function for the function computed, even more evaluations are necessary. Therefore, it turned out to be essential that these evaluations could be carried out in an efficient and accurate way.
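For the special case L = L^0 (plain interpolation rather than collocation with a differential operator), assembling and solving the system (2)–(3) reduces to the sketch below, using the Wendland function ψ_{3,1} up to normalization; the scaling constant c and the toy data are illustrative only and are not part of our library.

```python
import numpy as np

def wendland31(r):
    # psi_{3,1}(r) = (1 - r)_+^4 (4 r + 1), up to normalization
    rp = np.maximum(1.0 - r, 0.0)
    return rp ** 4 * (4.0 * r + 1.0)

def interpolate(points, values, c=1.0):
    # Assemble the symmetric interpolation matrix A_jk = psi(c * ||x_j - x_k||)
    # and solve A alpha = gamma for the coefficients alpha.
    diffs = points[:, None, :] - points[None, :, :]
    A = wendland31(c * np.linalg.norm(diffs, axis=2))
    alpha = np.linalg.solve(A, values)
    def s(x):
        r = c * np.linalg.norm(points - x, axis=1)
        return wendland31(r) @ alpha
    return s

# toy usage on random 2D data
rng = np.random.default_rng(0)
X = rng.random((30, 2))
f = np.sin(X[:, 0]) * X[:, 1]
s = interpolate(X, f, c=2.0)
print(s(X[0]), f[0])   # the interpolant reproduces the data at the collocation points
```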
3 Wendland Functions The Wendland functions are compactly supported radial basis functions that are polynomials on their support, which makes computations with them simple. They are a family of functions depending on two parameters l, k ∈ N0 defined by the recursive relations: l
ψl,0 (r) = [(1 − r)+ ] and
ψl,k+1 (r) = Cl,k+1
1
(4)
tψl,k (t)dt,
(5)
r
where (1 − r)+ := max{1 − r, 0} and Cl,k+1 = 0 is a constant. Therefore these functions also satisfy the relation − Cl,k+1 ψl,k (r) =
d dr ψl,k+1 (r)
r
.
(6)
For interpolation using a particular Wendland function as the base function, the value of the constant Cl,k+1 = 0 is not of importance because the Wendland function appears linearly on both sides of a linear equation. Therefore, one can just fix values
102
H. Bjornsson and S. Hafstein
that are convenient for the problem at hand and we will do this in the following section. However, when solving collocation problems, we apply a differential operator, see Eqs. (1) and (3), so we get terms involving both the original Wendland function and its derivatives. The derivative of a Wendland function can be written in terms of a lower order Wendland function, using Eq. (6), and when doing this it is necessary to keep track of the constants Cl,k+1 for the derivatives. It is only the constant for the base function that can be chosen arbitrarily. After the choice has been made, we must keep track of it through all calculations. First, we choose a particular function ψl,k and by abuse of notation we denote it by ψ0 = ψl,k . Then we define ψi (r) =
d dr ψi−1 (r)
r
for i = 1, 2, . . . , k.
The function ψi is then a specific Wendland function of order l, k − i. Now the functions l Φl,0 (r) = [(1 − r)+ ] and 1 Φl,k (r) = Φl,0 (t)t(t2 − r2 )k−1 dt for k > 0
(7)
(8) (9)
r
also satisfy a relation of the form −2(k − 1)Φl,k (r) =
d dr Φl,k+1 (r)
r
,
for all integers k, l ≥ 0, i.e. a relation identical to Eq. (6) with Cl,k+1 = 2(k − 1). Just note that 1 1 d Φl,0 (t)t(t2 − r2 )k−1 dt = −2r(k − 1) Φl,0 (t)t(t2 − r2 )k−2 dt. dr r r Therefore Eq. (9) delivers an alternative way to define the Wendland functions, see [26]. Note that [26] uses a different numbering scheme of the functions than we do in this paper. The Wendland functions have several important properties, cf. e.g. [11, Prop. 3.10]: 1) ψl,k (r) is a polynomial of degree l + 2k for r ∈ [0, 1] and supp(ψl,k ) = [0, 1]. 2) The radial function Ψ (x) := ψl,k (x) is C 2k at 0. 3) ψl,k is C k+l−1 at 1. Frequently we fix the parameter l := n2 + k + 1, where n is the spacial dimension we are working in, and a constant c > 0 to fix the support. By the properties stated above, the radial function Ψ (x) := ψl,k (cx) is then a C 2k function with supp(Ψ ) = B d (0, c−1 ) ⊂ Rn , where B n (0, c−1 ) is the closed n-dimensional ball around the origin with radius c−1 .
Advanced Algorithm for Interpolation with Wendland Functions
103
4 Computing Formulas for Wendland Functions In this section we introduce a method to generate Wendland functions of arbitrary degree. As a first step we discuss polynomial representations in software. 4.1 Polynomials Representation d i as a list of coefficients We represent d-degree polynomials i=0 ai t (a0 , a1 , . . . , ad ). Our implementation uses Python with List objects. Addition and multiplication of polynomials of this form are easily implemented as: d1
ai ti +
i=0
d2
max{d1 ,d2 }
bj t j =
j=0
(ai + bi )ti ,
i=0
where ai = 0 for i > d1 and bj = 0 for j > d2 . Multiplication is given by ⎞
d ⎛ d d 1 2 1 +d2 i ⎝ j⎠ = ai t bj t ci ti , i=0
j=0
where ci =
i=0
ak bj .
k+j=i
An anti derivative of a polynomial is given by (a0 , a1 , a2 , . . . , ad ) → (0, a0 , corresponding to
d
ad a1 a2 , ,..., ), 2 3 d+1
d ai i+1 t , ai t dt = i + 1 i=0 i=0 i
and differentiation by (a0 , a1 , . . . , ad ) → (a1 , 2a2 , 3a3 , . . . , dad ). In order to maximise exact calculations up to computer limitations, we store the coefficients as tuples of Integers, numerator and denominator, avoiding the floating point approximation. Specifically, we used the Rational class provided in Python. Polynomials in two variables can be represented as a polynomial in one of the variables, where each coefficient is a polynomial in the second variable, and each of those coefficients is a rational number. This gives us then a list of lists.
104
4.2
H. Bjornsson and S. Hafstein
The Method
To calculate a polynomial representing the Wendland function ψl,k on the interval [0, 1] we start by fixing the derivative p (t) = (1 − t)l t(t2 − r2 )k−1 , see (9). This function is a polynomial in two variables, which we represent as a polynomial in t where each coefficient is a polynomial in r. Following Eq. (9), we integrate this function with respect to t, and we obtain a new polynomial p(t) in t, again with coefficients that are polynomials in r. We evaluate the polynomial p at t = 1 and at t = r, which in both cases result in a polynomial in r, and we obtain the polynomial ψ(r) = p(1) − p(r). Note that ψ(r) is a representative of a Wendland function of order l, k, that is ψ(r) = C1 ψl,k (r) for some constant C1 = 0. We factor the polynomial ψ, using long division, into the form ψ(r) = C2 (1 − r)l+k pl,k (r)
(10)
such that pl,k (r) is a polynomial with co-prime integer coefficients. This is possible since ψl,0 has a zero of order l at 1, and by using the recursive relation in Eq. (5), we see that ψl,k has a zero of order l + k at 1. The Wendland function ψl,k is only defined up to a multiplication by a non-zero constant, therefore we are free to ignore the constant C2 and use ψ(r) = (1 − r)l+k pl,k (r), a polynomial with integer coefficients, as a starting point for our recursion. Using the relation in (6) and discarding the constant Cl,k , we see that d l+k pl,k (r) dr (1 − r) (11) ψl,k−1 (r) = r 1 = (1 − r)l+k−1 ((1 − r)pl,k (r) − (l + k)pl,k (r)). r Writing the function ψl,k−1 , as ψl,k−1 (r) = (1 − r)l+k−1 pl,k−1 (r), then we see ψl,k−1 (r) (12) (1 − r)l+k−1 1 (1 − r)pl,k (r) − (l + k)pl,k (r) . = r We know is a polynomial, since it is a Wendland function of order l, k, there that ψl,k−1 d (1 − r)l+k pl,k (r) must by divisible by the monomial r. Since (1 − r)l+k−1 fore dr is not divisible by r, the right-hand side of (12) must be a polynomial in r. Therefore pl,k−1 is a well defined polynomial. By pulling out the common factor bk−1 ∈ Z of the coefficients in pl,k−1 we obtain a new polynomial pˆl,k−1 and a constant bk−1 such that pl,k−1 (r) :=
pl,k−1 = bk−1 pˆl,k−1 . Repeating this step, until we arrive at pl,0 , we get a collection of polynomials in the form ψi (r) = b1 · · · bi (1 − r)l+k−i pˆl,k−i (r), i = 1, 2, . . . , k.
(13)
Advanced Algorithm for Interpolation with Wendland Functions
105
where each of the polynomials pˆl,k−i (r) has co-prime integer coefficients and each of the constants bi is a negative integer. The above list follows the notation in [11], where ψ0 is the polynomial given in (10) and is equal to the Wendland function ψl,k , and ψ1 , . . . , ψi are the Wendland functions given by ψl,k−1 , . . . , ψl,k−i respectively, see Eq. (5). It is important to keep track of the constants b1 , . . . , bi in (13) as they are necessary for correct evaluation of formula (1). 4.3 Example We will now demonstrate how the above method determines the Wendland function ψl,k for l = 6 and k = 4. Here we start with the function p (t) = (1 − t)6 t(t2 − r2 )3 and we obtain 1 ψ(r) = (1 − t)6 t(t2 − r2 )3 dt r
1 14 32 13 1 12 64 11 r − r + r − r 280 1001 8 231 3 32 9 1 8 1 r + r − r6 + r10 − 8 105 8 56 1 4 1 2 1 r − r + + 280 1848 24024 1 10 = (1 − r) (429r4 + 450r3 + 210r2 + 50r + 1). 120120 =
We set ψ0 (r) = (1 − r)10 (429r4 + 450r3 + 210r2 + 50r + 1). For r ∈ [0, 1] we have / [0, 1]): the formulas (recall that ψl,k (r) = 0 if r ∈ ψ6,4 (r) = ψ0 (r) = (1 − r)10 (429r4 + 450r3 + 210r2 + 50r + 1); ψ6,3 (r) = ψ1 (r) =
d dr ψ0 (r)
r = −26(1 − r)9 (231r3 + 159r2 + 45r + 5);
ψ6,2 (r) = ψ2 (r) =
d dr ψ1 (r)
r = 3, 432(1 − r)8 (21r2 + 8r + 1);
ψ5,1 (r) = ψ3 (r) =
d dr ψ2 (r)
r = −102, 960(1 − r)7 (7r + 1);
ψ5,0 (r) = ψ4 (r) =
d dr ψ3 (r)
r = 5, 765, 770(1 − r)6 .
106
H. Bjornsson and S. Hafstein
Note that we have actually computed a lot more useful information than just a family of Wendland functions ψ5,i , i = 0, 1, 2, 3, 4. In our algorithm, for a fixed l, k, we have ψl,k−j = ψj (r) = =
d dr ψj−1 (r)
r
d dr ψl,k−j+1 (r)
r
, for j = 1, . . . , k,
and we have thus delivered all the radial basis functions needed for a collocation problem. This corresponds to computing a whole table as in [11, Table 3.1], but for a collocation problem with arbitrary high derivatives. The software tool, discussed in Sect. 6, also includes the constant c > 0 in the computations, which is used to fix the support of the Wendland functions.
5 Meshless Collocation Using Wendland Functions The method of meshless collocation can be used to calculate Lyapunov functions for both deterministic dynamical systems and stochastic dynamical systems. We just choose the operator L and the boundary values appropriately. The next two sections show some of the explicit formulas involved. We also talk about the optimal grid for the interpolation problem, and some aspects of solving the resulting linear systems using software. 5.1
Deterministic Systems
Consider an autonomous deterministic system, that is a dynamical system of the form x (t) = f (x) with f : Rn → Rn , for which the origin is an asymptotically stable equilibrium. We can generate a Lyapunov function, V : Rn → R, for this system by solving the interpolation problem (2) with the differential operator L being given by LV (x) = ∇V (x), f (x) , setting the boundary Γ = ∅ and choosing the function h appropriately. Setting the radial basis function ψ to be the Wendland function ψ0 (r) = ψl,k (r) for some constants l, k, and then fixing ψ1 and ψ2 according to Eq. (7), the matrix obtained in Eq. (3) is given by elements of the form (see [11]): bkl = ψ2 (xk − xl )xk − xl , f (xk ) xl − xk , f (xl ) − ψ1 (xk − xl )f (xk ), f (xl ) .
(14)
The components of the vector γ are given by γj = r(xj ).
(15)
Advanced Algorithm for Interpolation with Wendland Functions
107
Then the solution to the interpolation problem has the formula, see Eq. (1), V (x) =
N
αk ψ1 (x − xk )xk − x, f (xk ) ,
(16)
k=1
where α is the solution of Aα = γ and ψ0 and ψ1 are given by Eq. (7). 5.2 Stochastic Systems For SDEs of the form dx(t) = f (x(t))dt + g(x(t))dW (t),
(17)
f : Rd → Rd , g : Rd → Rd×Q , we consider the operator L given by the associated generator of the SDE: d 1 ∂2 V LV (x) := ∇V (x) · f (x) + [g(x)g(x) ] (x) 2 i,j=1 ∂xi ∂xj
1 ∂2 mij (x) V (x). 2 i,j ∂xi ∂xj d
= ∇V (x) · f (x) +
(18)
Q Here (mij (x))i,j=1,...,d = g(x)g(x) , that is mij (x) = q=1 giq (x)gjq (x). We choose a Wendland function ψ0 = ψl,k , for some constants l, k, and set ψ(x) = ψ0 (x). We define ψi according to Eq. (7) for i = {1, 2, 3, 4} and we get that the solution to the interpolation problem is given by, see Eq. (1), N αk − ψ1 (x − xk )x − xk , f (xk ) V (x) = k=1
d 1 + mij (xk )[ψ2 (x − xk )(x − xk )i (x − xk )j + δi,j ψ1 (x − xk )] 2 i,j +
M
αN +k ψ0 (x − ξk ).
(19)
k=1
In this formula the vector α is the solution to the linear system in Eq. (2). The formulas for the matrix elements are dkl = ψ0 (ξk − ξl ), ckl = −ψ1 (ξl − xk )ξl − xk , f (xk ) +
d 1 mij (xk )[ψ2 (ξl − xk )(ξl − xk )i (ξl − xk )j 2 i,j=1
+δij ψ1 (ξl − xk )],
(20)
108
H. Bjornsson and S. Hafstein
and, abbreviating β = x − xk , bkl = −ψ2 (β)β, f (xk ) β, f (xl ) − ψ1 (β)f (xk ), f (xl ) d 1 mij (xl ) ψ3 (β)β, f (xk ) βi βj + ψ2 ( β)fj (xk )βi + 2 i,j=1 +ψ2 (β)fi (xk )βj + δij ψ2 (β)β, f (xk ) +
+
d 1 mij (xk ) − ψ3 (β)β, f (xl ) βi βj − ψ2 (β)fj (xl )βi 2 i,j=1 −ψ2 (β)fi (xl )βj − δij ψ2 (β)β, f (xl ) d d 1 mrs (xk )mij (xl ) ψ4 (β)βi βj βr βs 4 r,s=1 i,j=1 +ψ3 (β)[δij βr βs + δir βj βs + δis βj βr +δjr βi βs + δjs βi βr + δrs βi βj ] +ψ2 (β)[δij δrs + δir δjs + δis δjr ] .
5.3
(21)
Generating the Grid
The optimal grid in Ω ⊂ Rn for the interpolation problem (2) was studied in [17]. The grid that delivers the smallest fill-distance, the parameter which determines the accuracy of the solution, is defined using the basis vectors w1 , w2 , . . . , wn ∈ Rn , where (the ei s denote the usual orthonormal basis in Rn ) wk =
k−1
j ej + (k + 1) k ek and k =
j=1
1 2k(k + 1)
.
The grid-points Gα,z = {gi : i ∈ Zn } ⊂ Rn with fill=distance parameter α > 0 and offset z ∈ Rn are then given by gi := z + α
n
ik wk , i = (i1 , i2 , . . . , in ) ∈ Zn .
k=1
Given two vectors a, b ∈ Rn such that ai < bi for i = 1, 2, . . . , n, we want to compute the coordinates of the grid-points gi ∈ Gα,z that are in the cube Ca,b := [a1 , b1 ] × [a2 , b2 ] × · · · × [an , bn ]. By observing that wn is the only basis vector with a nonzero entry in its last component, wn−1 and wn are the only basis vectors with nonzero entries in their second to
Advanced Algorithm for Interpolation with Wendland Functions
109
last component, etc., these can be computed efficiently in a recursive manner. Let us illustrate this with n = 3, the general strategy can be read from the code below. Given the offset vector z = (z1 , z2 , z3 ) ∈ R3 and a3 < b3 , we see that only those i = (i1 , i2 , i3 ) ∈ Z3 with i3 fulfilling 3+1 1 ≤ b3 , = z3 + i3 · α a3 ≤ z3 + i3 · α · (3 + 1) 2·3 2 · 3 · (3 + 1) i.e.
a3 − z3 α
2·3 3+1
≤ i3 ≤
b3 − z 3 α
2·3 , 3+1
(22)
have to be considered, because all other choices of i3 deliver an entry in the third component that is not in the interval [a3 , b3 ]. For each i3 fulfilling this inequality, let us denote it i∗3 , we can generate appropriate i2 components by observing that gi with i = (0, 0, i∗3 ) has the entry z2∗ := z2 + α · i∗3 · 2 = z2 + α · i∗3 ·
1 2 · 2 · (2 + 1)
in its second component. The appropriate i2 s for i∗3 are thus given by considering the formula for w2 and are easily seen to fulfill a2 ≤ z2∗ + α · i2 · (2 + 1) 2 = z2 + α · i∗3 · 2 + α · i2 · (2 + 1) 2 ≤ b2 , which can be written similarly to before as 2·2 2·2 a2 − z2∗ b2 − z2∗ ≤ i2 ≤ . α 2+1 α 2+1
(23)
Now, having fixed an i∗3 fulfilling (22) and subsequently an i∗2 for this i∗3 fulfilling (23), we can in a similar manner compute appropriate i1 s. The vector gi with i = (0, i∗2 , i∗3 ) has the entry z1∗ := z1 + α · (i∗2 1 + i∗3 1 ) = z1 + α · (i∗2 + i∗3 ) ·
1 2 · 1 · (1 + 1)
in its first component. Similarly to before the appropriate i1 s in (i1 , i∗2 , i∗3 ) are read from an inequality: a1 ≤ z1∗ + α · i1 · (1 + 1) 1 = z1 + α · (i∗2 + i∗3 ) · or
a1 − z1∗ α
2·1 1+1
≤ i1 ≤
1 2 · 1 · (1 + 1)
b1 − z1∗ α
+ α · i1 · (1 + 1) 1 ≤ b1
2·1 . 1+1
This recursive procedure computes all grid vectors gi ∈ Gα,z in the cube Ca,b and is implemented in C++ using the Armadillo library [21] is given in Listing 1.1.
110
H. Bjornsson and S. Hafstein Listing 1.1. Code in C++ that generates the optimal grid.
1
2 3 4
5 6 7 8 9
list HexaGridnew(arma::vec a, arma::vec b, ... double c, int N) { a = a(span(0, N - 1)); b = b(span(0, N - 1)); double tol = 1e-10; // add a small tolerance to the ... cube a -= tol*ones(N); b += tol*ones(N); list Ret; unsigned int i, k; vec e(N, fill::zeros);
10
for (k = 1; k m, where n − m is the degree of redundancy. For a redundant manipulator, J is not a square matrix, and thus not invertible, however, an inverse kinematics solution can be found using (2) θ˙ = J + x˙ + nJ where J + is the pseudoinverse of J and nJ is an arbitrary vector in the null space of the Jacobian. The singular value decomposition of J can be represented as J = U ΣV
(3)
where U is an orthonormal matrix represents the output workspace, Σ is a diagonal matrix where its diagonal elements are the singular values, σi ’s, and V is an orthonormal matrix represents the input space, i.e., the joint space. The SVD of J can be also written as r σi ui vi (4) J= i=1
where r is the rank of J , the σi ’s are the ordered singular values, i.e., σ1 ≥ σ2 ≥ · · · ≥ σm ≥ 0, the unit vectors ui represent the output singular vectors, and vi are the input singular vectors. For a robot at a rank-n singularity, there are n singular values, σi ’s, that become zero. Thus, employing a technique that minimizes singular values of J can be used to identify robot singular configurations. 1
This work is based on a previous conference publication [2]. An additional figure and pseudocode were added to Sect. (2.3) to clearly explain the behavior of the nearly-equal singular values and their associated singular directions and their effect on the direction of their corresponding gradients. In Sect. (3), the previous results for the 4-DoF robot were replaced with new results of another 7-DoF robot (Modified PA-10).
120
2.2
A. A. Almarkhi
Identifying Robot Singularities
In this section, we explain how to employ the gradient descent of a singular value of J to drive a robot of an arbitrary kinematic structure to a singularity. This technique is not limited by the rank of the singularity. The singular value σi in (4) can be expressed as σ i = u i J vi .
(5)
Differentiating (5) with respect to time results in ˙ ˙ i. σ˙i = u˙ i J vi + ui J vi + ui J v
(6)
One can note that u i uj and vi vj are zero for i = j and that the derivative of a unit vector is orthogonal to that vector. So, (6) can be further simplified to [12]
˙ σ˙i = u i J vi .
(7)
The partial derivative of σi with respect to some θk can be expressed as
where
∂J ∂σi = u vi i ∂θk ∂θk
(8)
∂j1 ∂j2 ∂J ∂jn = , ,··· , . ∂θk ∂θk ∂θk ∂θk
(9)
The partial derivative of the ith column of the Jacobian is given by [8, 10]
⎧ ⎪ (z p )z − (z z )p i i i k ⎪ k k ⎪ ,k σm−1 ≥ σm , where is a small threshold (virtually zero). To identify rank-2 singularities, one can start with a population of random joint configurations and employ (12) by moving (k) along the −∇σm−1 direction until the σm−1 < condition is satisfied. However, it is not uncommon for an undesirable behavior to occur, that results from having the two singular values σm−1 and σm become nearly equal before they reach zero, i.e., σm−1 ≈ σm > . In this case, the two singular values are not distinct, which means that their corresponding singular vectors are ill-defined. In other words, any singular vectors (u and v) in the {um−1 , um } and {vm−1 , vm } subspaces are valid for solving (7). Figure 1, shows the behavior of the algorithm when the two smaller singular values become nearly equal through the process of driving a robot into a rank-2 singularity. In this case, σ5 ≈ σ6 (at around iteration 600), which makes them indistinct and their corresponding singular vectors ill-defined. The direction of ∇σ5 can completely change direction from one iteration to another, which affects the rate of convergence. To further explore this behavior, consider how the angle between two gradients, ∇σ5 and ∇σ6 , changes as one rotates the corresponding subspaces, {u5 , u6 } and {v5 , 6m }, as shown in Fig. 2. While for some configurations the angle between the two gradients does not change much (as in the bottom part of the figure), in many other cases, the change in angles can be dramatic, ranging from less than 40◦ to 170◦ . To make the situation even worse, if one picks a rotation at random, it is likely that the angle between ∇σ5 and ∇σ6 will be very large. This can give an idea on how much the rotation of the singular subspaces can contribute to generating two competing gradients. To overcome this unwanted effect, one can start with moving the robot along the (k) −∇σm−1 direction until σ5 and σ6 become very close in value. Then, a combination between the two gradients is computed. Because the singular value decomposition is not unique in these cases, any singular vectors u and v in the subspace associated with the equal singular values are valid. One can rotate the singular subspace such that the (k) (k) angle between ∇σ5 and ∇σ6 is minimized, i.e., u5(new) = u5 cos φ + u6 sin φ u6(new) = u6 cos φ − u5 sin φ v5(new) = v5 cos φ + v6 sin φ v6(new) = v6 cos φ − v5 sin φ
(14)
122
A. A. Almarkhi
Fig. 1. In subplot (a), the evolution of σ5 and σ6 is shown as the standard gradient descent algorithm is employed. The singular value σ5 is minimized until σ5 ≈ σ6 at around iteration 600. When they become nearly equal, the angle between the gradients in successive iterations becomes large. These angles are plotted in (b), where the change in the angles reaches 180◦ . It is clear that the convergence requires a long time (about 3000 iterations) due to the large change in the gradient direction. In this case, the convergence time is approximately 40 seconds. The threshold, = 10−6 , is indicated with a red horizontal line [2].
where φ is the angle of rotation. It should be noted that the angle between the gradients of σ5 and σ6 can vary from 0 to π based on the angle of rotation φ. A suitable selection of the rotation angle for the singular subspaces is crucial in minimizing the change in (k) (k) the gradient direction from one iteration to another. Once the ∇σ5 and ∇σ6 that have the minimum angle between them are computed, a combination that minimizes σ5 can be found (k) (k) (15) ∇σ (k) = γ∇σ5 + (1 − γ)∇σ6 where ∇σ (k) is the desired gradient and γ is a positive scalar where 0 ≤ γ ≤ 1. This linear search will minimize the change in the gradient direction from one iteration to another. After ∇σ (k) is computed, the steepest descent method is applied to find an optimal value of αk in (12) that minimizes σ5 . An analogous process can be employed for identifying higher rank singularities. Algorithm 1 provides a pseudocode for implementing such procedure [1].2 To identify high-rank singularities, i.e., where three or more singular values become zero, one can employ a similar approach to that applied for identifying rank-2 singularities. For a robot in a singular configuration, J is of rank r if σi = 0 for i > r, which 2
The value of N needs to be sufficiently large to span the joint space.
An Algorithm for Identifying High-Rank Singularities
123
Fig. 2. The angles between the gradients ∇σ5 and ∇σ6 as a function of the angles of rotation of the singular subspaces {u5 , u6 } and {v5 , v6 } are shown. Each color represents one case when the two singular values σ5 and σ6 are nearly equal. One can observe that the change in the angle between two gradients can vary from very small angle to 170◦ . Algorithm 1. Determine robot singular configurations of all ranks. 1: generate N random joint-space configurations (a random configuration, θ, is an m × 1, where m is the number of robot joints)
2: for i = 6 to 1 do {for each workspace dimension} 3: for j = 1 to N do {for each random configuration} 4: start with j th joint-space configuration θ 5: compute σi {the ith singular value of the Jacobian} 6: while σi ≥ do 7: if (i = 6) then {for rank-one singularities} 8: ∇σ = ∇σ6 9: else {for high-rank singularities} 10: for all σk where σi ≈ σk do {where k < i} 11: rotate U and V subspace associated with σi and σk ’s {to minimize the angles between ∇σi , ∇σi+1 , · · · , ∇σk }
12: compute ∇σ {optimal linear combination of the gradients of the singular values} 13: end for 14: end if 15: compute α {adaptive linear search along ∇σ} 16: update θ {θ (k+1) = θ (k) − α∇σ} 17: if (σi did not decrease) then 18: go to 23 {local minimum of σi } 19: end if 20: end while 21: save θ and singularity rank 22: end for 23: end for
124
A. A. Almarkhi
Fig. 3. In (a), the singular value σ4 is minimized until iteration 106, where (σ4 ≈ σ5 ), indicated with a red circle. At iteration 135 the singular values σ4 , σ5 , and σ6 become nearly equal, indicated with a green square. Subfigure (b) shows the change in the angle between the gradients in successive iterations. The change of angle reaches 140◦ when the three singular values become nearly equal. In this case, the singular values σ4 , σ5 , and σ6 never converge to zero. The threshold, = 10−6 , is indicated with a red horizontal line [2].
also means the robot is in a rank-(m − r) singularity. To find high-rank singularities, one can move the robot by iteratively solving (12) until a desired σi reaches zero. While moving along the −∇σi direction, it is possible that σi and σi+1 become nearly equal. In this case, the procedure in the previous section can be applied. In some cases, more than two singular values become nearly equal but larger than the threshold, i.e., σi ≈ σi+1 ≈ · · · ≈ σm > . For the purpose of illustration, we will consider the case where a robot is being driven to a rank-3 singularity when the situation σ4 ≈ σ5 ≈ σ6 > occurs, as illustrated in Fig. 3. One can note that around iteration 135 in Fig. 3 all three singular values became very close in value. At this point, the (k−1) angle between the gradient of σ4 in successive iterations (the angles between ∇σ4 (k) and ∇σ4 ) became 170◦ , which resulted from having the singular values indistinct and their corresponding singular vectors ill-defined. This also contributed to an unwanted increase in σ4 because the gradients switched direction. In this case, one can rotate their corresponding singular subspaces to find a suitable rotation that minimizes the sum of the angles between the three gradients. The solution is to minimize an objective function H, where m m θi,j (16) H= (i=r+1) (j=i+1)
An Algorithm for Identifying High-Rank Singularities
125
and θi,j is the angle between ∇σi and ∇σj . The rotation of the singular subspaces can be done by iteratively employing (14). That is, because the singular subspaces are threedimensional (or higher), one can iteratively rotate one plane at a time, i.e., in this case, {u4 , u5 } and {v4 , v5 }, then {u5 , u6 } and {v5 , v6 }, and so on. This iterative rotation should be done until the sum of the angles between all gradients is minimized. After finding the gradients of the singular values, one can use (15) to compute a combination between the first two gradients,∇σ4 and ∇σ5 , that minimize σ4 . Then, using (15) again to compute a combination between the resulting gradient and ∇σ6 that minimizes σ4 . This approach guarantees achieving a minimum amount of gradient direction change and thus a shorter convergence time. This process can continue until the algorithm cannot converge to any higher rank singularities. This algorithm, along with an adaptive step size, was applied to the same robot that resulted in Fig. 3 and the results are shown in Fig. 4. It is clear that the convergence is faster and the change in the gradient angle is smaller. The average convergence time was improved from 40 seconds to less than 2 seconds when the proposed technique is employed.
Fig. 4. This figure shows the behavior of the singular values when the proposed algorithm is applied to a robot to find a rank-3 singularity. This is the same robot as shown in Fig. 3. Subfigure (a) shows the values of the three smaller singular values, σ4 , σ5 , and σ6 , while applying the algorithm with an adaptive step size αk . In subfigure (b), the angles between the gradients in successive iterations are shown. The angles average around 120◦ . The algorithm converges in 16 iterations. The convergence time in this case is less than two seconds. The threshold, = 10−6 , is indicated with a red horizontal line [2].
126
A. A. Almarkhi
If one applies the above algorithm to an initial population of random configurations then one can identify all the singular configurations of various ranks. It is then possible to analyze the resulting singularities to determine the singularity conditions for the robot. Some singularity conditions depend on the values of a few joints while other joints can take any value. One may observe that a singularity can be satisfied by an infinite number of joint configurations. In the next section, we discuss a mathematical approach to identify singular directions associated with the singularities that are physically meaningful. 2.4
Identifying Singular Directions
For a robot Jacobian J with rank r, the last m − r output singular vectors, i.e., ui ’s, span the directions of lost end-effector motion. For spatial manipulators, these singular vectors are 6-dimensional and represent a simultaneous translational and rotational velocity. For rank-1 singularities, there is only one unique singular direction, um , and it is easy to visualize. At higher rank singularities, the singular value decomposition is not unique. Thus, the singular vectors corresponding to the zero singular values (σr+1 , σr+2 , · · · , σm ) are ill defined and will likely not be well aligned with the world (or task) frame of the robot. However, one can apply Givens rotations to these vectors in order to identify an intuitive representation for the lost end effector motion. Consider Fig. 5, that shows a 7-DoF robot in a rank-3 singularity, where both subfigures correspond to the same robot in the same singular configuration. The original singular vectors, identified by employing the singular value decomposition, can be rotated to a more intuitive set that is aligned with the world coordinate frame as follows:
Fig. 5. A 7-DoF robot in the same rank-3 singularity is presented. (Fig. 4, 6, 7, 8, and 9 are produced using the Robotics Toolbox [7].) In subfigure (a), the three SVD-generated singular directions are indicated. In subfigure (b), the singular directions are plotted after they are properly rotated. The singular directions, u4 , u5 , and u6 are represented by green, red, and blue respectively. Dotted arrows represent rotational velocity and solid arrows represent translational velocity [2].
An Algorithm for Identifying High-Rank Singularities
127
⎡ ⎤ ⎤ 0 0 0 0 0 0 ⎢ 0 −0.91 ⎢−0.27 0.62 −0.61 ⎥ 0⎥ ⎢ ⎥ ⎢ ⎥ ⎢ 0 ⎢ 0.74 −0.17 −0.50 ⎥ 0 0.91 ⎥ ⎥⇒⎢ ⎥ u4 u 5 u 6 = ⎢ ⎢−1 ⎢−0.50 −0.71 −0.50 ⎥ 0 0⎥ ⎢ ⎥ ⎢ ⎥ ⎣ 0 ⎣−0.33 0.08 0.23 ⎦ 0 −0.41 ⎦ 0 −0.41 −0.12 0.28 −0.27 0 ⎡
Using the rotated subspace above (shown in Fig. 5(b)), one can easily determine that there is no joint velocity that can generate any rotational velocity around the −X direction (green). In addition, the robot cannot have a simultaneous velocity with the illustrated components of −Y translational motion with −Z rotational motion (red). Likewise, there are no joint rates that can generate a simultaneous velocity with the illustrated components of Y translational motion with −Z rotational motion (blue). In the next section, we illustrate the results of applying the proposed algorithms to robots of different kinematic structures.
3 Case Studies 3.1 Introduction Our algorithm to identify robot singularities is neither limited by the kinematic structure of the robot, nor by the rank of the singularity. We employed it on several regional and spatial robots of arbitrary structures. We present the results of two 7-DoF robots. For each robot we start with 10, 000 random configurations in the joint space. From each point, we apply the gradient-based algorithm to find all rank singularities. We show few examples of how singular vectors should look like to show represent the most intuitive representation of the lost directions of motion. 3.2 Mitsubishi PA-10 Robot The Mitsubishi PA-10 is a 7-DoF manipulator with a 6-dimensional work space. Its kinematic structure is similar to the human arm with three spherical joints in the shoulder, one joint in the elbow, and three spherical joints in the wrist. The DH parameters of the PA-10 are listed in Table 1. The singularity analysis on the PA-10 resulted in rank1, rank-2, and rank-3 singularities being identified. We were able to find singularity conditions for each singularity rank. The singular directions, ui ’s, were also identified and appropriately rotated. In all figures, loss of directional velocity is indicated with a solid arrow, while the loss of rotational velocity is indicated with a dotted arrow. Blue arrows indicate u6 , red arrows indicate u5 , and green arrows indicate u4 . Rank-1 Singularities. All rank-1 singularities are summarized in Table 2. Joint 4 is critical in that the robot will be singular if θ4 is equal to ±π or 0. One can observe that joint 4 is the only joint that can change the distance between the shoulder and the wrist. The robot singular directions that indicate the loss of the end-effector velocity are all shown in Fig. 6. Because these are rank-1 singularities, their corresponding singular directions are well defined.
128
A. A. Almarkhi Table 1. DH parameters of the PA-10 robot. Linki αi [radians] ai [m] di [m] θi [radians] 1
−π/2
0
0
θ1
2
π/2
0
0
θ2
3
−π/2
0
0.45
θ3
4
π/2
0
0
θ4
5
−π/2
0
0.50
θ5
6
π/2
0
0
θ6
7
0
0
0.45
θ7
Table 2. PA-10 robot’s rank-1 singular configurations (In all tables, “x” means the angle value does not matter.) [2]. i θ1 θ2
θ5
θ6
θ7
1 x
±π, 0 ±π/2 x
θ3
θ4
x
x
x
2 x
±π, 0 x
x
x
±π, 0 x
3 x
x
x
±π, 0 x
4 x
x
x
x
x
x
±π/2 ±π, 0 x
Fig. 6. The PA-10 robot is shown in rank-1 singular configurations. The singular direction, u6 , is also plotted for each singularity. The singularity conditions, 1, 2, 3, and 4, in Table 2 are satisfied in subfigures (a), (b), (c), and (d), respectively [2].
An Algorithm for Identifying High-Rank Singularities
129
Rank-2 Singularities. The PA-10’s rank-2 singularity conditions are shown in Table 3. The common feature between these conditions is that they do not depend on the value of θ1 or θ7 . We employed Givens rotation to make sure that the two singular vectors, u5 and u6 , are rotated to represent the most intuitive set of singular directions. Figure 7 shows the four singular conditions listed in Table 3. Table 3. PA-10 robot’s rank-2 singular configurations [2]. i θ1 θ2
θ6
θ7
1 x
±π, 0 ±π, 0 ±π, 0 x
θ3
x
x
2 x
±π, 0 x
±π, 0 x
±π, 0 x
3 x
x
±π, 0 ±π, 0 ±π, 0 x
4 x
±π, 0 ±π/2 x
x
θ4
θ5
±π/2 ±π, 0 x
Fig. 7. The PA-10 robot is shown in rank-2 singular configurations. The singular directions u5 and u6 are plotted for each singularity condition. The singularity conditions, 1, 2, 3, and 4, in Table 3 are represented in the subfigues (a), (b), (c), and (d), respectively [2].
Rank-3 Singularities. It was found that the PA-10 robot can have rank-3 singularities by aligning the axes of joints 1, 3, 5, and 7 so that their columns of the Jacobian are linearly dependent. The conditions for these rank-3 singularity configurations are listed in Table 4. As before, we have used Givens rotations to make the singular vectors, u4 ,
130
A. A. Almarkhi
u5 , u6 as intuitive as possible. Figure 8 shows the manipulator in two different rank-3 singular configurations. One can observe that the rank-3 singularities occur for the PA10 when it is completely stretched out (workspace boundary singularity) or when it is folded back on itself. Table 4. PA-10 robot’s rank-3 singular configurations [2]. i θ1 θ2 1 x
θ3
θ4
θ5
θ6
θ7
±π, 0 ±π, 0 ±π, 0 ±π, 0 ±π, 0 x
Fig. 8. The PA-10 robot is shown in rank-3 singular configurations. The singular directions, u4 , u5 , and u6 are indicted in green, red, and blue respectively. In (a), the robot is in configuration θ = [0, 0, 0, 0, 0, 0, 0] and in (b) the robot is in θ = [π, 0, π, π, π, π, π] [2].
3.3
The Modified PA-10 Robot
The Mitsubishi PA-10 is well known for its fault-intolerant 4th joint. This joint is the only joint that can change the distance between the shoulder and the wrist of the PA-10. In [4], they suggested a kinematic change to the original PA-10 robot by moving the axis of the 3rd joint to the coordinate frame of joint 4. Such a change would improve the manipulators’ fault tolerance. The DH parameters of the modified PA-10 are listed in Table 5. It was found that the modified PA-10 has only rank-1 and rank-2 singularities, which cannot be summarized in a simple set of singularity conditions. However, the rank-2 singularities do have some structure that can be visually observed as in Fig. 9.3 In this figure, the robot’s singularities are represented in 3D projections in the joint space, 3
There are no projections where rank-1 singularities appear as distinct manifolds.
An Algorithm for Identifying High-Rank Singularities
131
Table 5. DH parameters of the Modified PA-10 robot. Linki αi [radians] ai [m] di [m] θi [radians] 1
−π/2
0
0
θ1
2
π/2
0.45
0
θ2
3
−π/2
0
0
θ3
4
π/2
0.50
0
θ4
5
−π/2
0
0
θ5
6
π/2
0
0
θ6
7
0
0
0.45
θ7
i.e., the joint values (θi ’s) of each singularity are plotted in certain projection to show the singularity manifold(s) of the robot. We provided two different projections to show how complicated such singularity manifolds could be. Noting that the two subfigures are for the same singular configurations but in different projections. One should note that the values θ1 and θ7 are irrelevant to the robot’s singularities. The joint angles θ2 , θ3 , · · · , θ6 are the only joints that can make the robot singular.
Fig. 9. The rank-2 singularities of the modified PA-10 robot are shown. In (a), a 3D projection in θ3 .θ4 , and θ5 is explained. In (b), the same set of rank-2 singular configurations shown in (a) are projected in θ4 , θ5 , and θ6 .
In general, our approach for identifying robot singularities does not consider physical joint limits. One can employ our technique to find any robot singular configuration but one must exclude infeasible angles due to mechanical limits.
132
A. A. Almarkhi
4 Conclusions This work has proposed an algorithm to identify the robot singular configurations by using the gradient descent of a singular value of the robot’s Jacobian. The algorithm is designed to be independent of the kinematic structure of the robot. In simple cases, i.e., rank-1 singularities, the algorithm uses a pure gradient descent of the minimum singular value. In other cases when the aim is to identity the rank-2, or higher, singularities, the algorithm is still able to deal with the ill-defined singular vectors that result from having two or more singular values nearly equal. The algorithm requires the number of starting random configurations to sufficiently span the robot’s joint space. In high-rank singularities, it is not uncommon that the singular directions generated by the SVD do not result in intuitive directions for the lost end-effector motion. A procedure based on Given’s rotation was suggested to obtain the most intuitive representation of the singular directions in such cases. The entire approach was illustrated on the well-known 7-DoF Mitsubishi PA-10 redundant robot and on a kinematically modified PA-10.
References 1. Almarkhi, A.A., Maciejewski, A.A.: Maximizing the size of self-motion manifolds to improve robot fault tolerance. IEEE Robot. Autom. Lett. 4(3), 2653–2660 (2019) 2. Almarkhi, A.A., Maciejewski, A.A.: Singularity analysis for redundant manipulators of arbitrary kinematic structure. In: International Conference on Informatics in Control, Automation and Robotics (ICINCO 2019) pp. 42–49 (2019) 3. Baker, D.R., Wampler, C.W.: On the inverse kinematics of redundant manipulators. Int. J. Robot. Res. 7(2), 3–21 (1988) 4. Ben-Gharbia, K.M., Maciejewski, A.A., Roberts, R.G.: Modifying the kinematic structure of an anthropomorphic arm to improve fault tolerance. In: 2015 IEEE International Conference on Robotics and Automation (ICRA). pp. 1455–1460. IEEE (2015) 5. Boudreau, R., Podhorodeski, R.P.: Singularity analysis of a kinematically simple class of 7-jointed revolute manipulators. Trans. Canad. Soc. for Mech. Eng. 34(1), 105–117 (2010) 6. Burdick, J.W.: On the inverse kinematics of redundant manipulators: characterization of the self-motion manifolds. In: Waldron, K.J., (eds.) Advanced Robotics: 1989. Springer, Berlin, Heidelberg (1989) https://doi.org/10.1007/978-3-642-83957-3 3 7. Corke, P.: Robotics, vision and control: Fundamental algorithms in MATLAB. Springer, Berlin (2017) 8. Groom, K.N., Maciejewski, A.A., Balakrishnan, V.: Real-time failure-tolerant control of kinematically redundant manipulators. IEEE Robot. Autom. Lett. 15(6), 1109–1115 (1999) 9. Hunt, K.H.: Special configurations of robot-arms via screw theory. Robotica 4(3), 171–179 (1986) 10. Klein, C.A., Chu, L.C.: Comparison of extended jacobian and lagrange multiplier based methods for resolving kinematic redundancy. J. Intell. Robot. Syst. 19(1), 39–54 (1997) 11. Litvin, F., Yi, Z., Castelli, V.P., Innocenti, C.: Singularities, configurations, and displacement functions for manipulators. Int. J. Robot. Res. 5(2), 52–65 (1986) 12. Maciejewski, A.A.: The analysis and control of robotic manipulators operating at or near kinematically singular configurations. Ph.D. thesis, The Ohio State University (1988) 13. Waldron, K., Wang, S.L., Bolin, S.: A study of the Jacobian matrix of serial manipulators. J. Mech. Trans. Autom. Des. 107(2), 230–237 (1985)
Adaptive Interaction Control of a Very Flexible Parallel Robot Manipulator Fatemeh Ansarieshlaghi(B)
and Peter Eberhard
Institute of Engineering and Computational Mechanics, University of Stuttgart, Pfaffenwaldring 9, 70569 Stuttgart, Germany {fatemeh.ansari,peter.eberhard}@itm.uni-stuttgart.de http://www.itm.uni-stuttgart.de/ Abstract. Robot manipulators have many applications and their tasks become complicated when they have interaction with their environments and humans. These complicated tasks need complex controllers. Also, the controller’s complexity will be increased when the used robot manipulators are very flexible. The robot in this work is very flexible while its first mode shape frequency is 1.85 Hz and its oscillation amplitude is more than 10% of its long link length. Hence, this flexibility must be taken into account in the modeling and controlling process. Another challenge appears when this very flexible robot manipulator interacts with an unknown environment. In this work, an adaptive impedance control for the very flexible robot manipulator is designed and implemented. To investigate the controller, a numerical simulation in Matlab is utilized. The results show that the robot using the adaptive controller can interact with different surfaces with high performance. Keywords: Very flexible robot · Parallel manipulator · Position control force control · Impedance control
1 Introduction Manipulator designs often provide maximized stiffness to avoid undesired deformations and vibrations. Such a design results in high accuracy trajectory tracking in the end-effector. However, using a lot of material may result in a poor weight-to-payload ratio and high energy usage. In opposite, light-weight manipulators have low energy usage, less mass, and often high working speeds. The light-weight robot’s advantages in contrast to high-weight robots attracted a lot of research interest. The light-weight design of the robot’s bodies causes significant flexibility which yields undesired deformations and vibrations. Therefore, these light-weight manipulators shall be modeled as flexible bodies in flexible multibody systems and their flexibilities must also be taken into account in the control design. Flexible manipulators are typical examples of underactuated multibody systems since they generally have fewer control inputs than degrees of freedom for rigid body motion and deformation. To obtain a good performance in the end-effector positioning of a flexible manipulator, an accurate and efficient nonlinear control is advantageous. Therefore, to control a flexible multibody system with high c The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 O. Gusikhin et al. (Eds.): ICINCO 2019, LNEE 720, pp. 133–150, 2021. https://doi.org/10.1007/978-3-030-63193-2_8
134
F. Ansarieshlaghi and P. Eberhard
performance, using an accurate model is required. The difficulty of designing a nonlinear controller with high performance for the highly flexible manipulator is increased when the controller does not have access to direct measurement of the end-effector and all the system states. The used manipulator in this paper is a parallel robot manipulator. This robot has highly flexible links. The end of the short link is connected in the middle of the long link and described using rigid bodies. The links connection creates a λ configuration and a closed-loop kinematics constraint results that presents the parallel configuration of the robot. In previous works on the robot, its modeling and position control were investigated. The modeling includes the basic mechanical modeling [5], its experimental verification [6], and modeling improvements [1]. The position control of this system is divided into a model-based and a non-model-based feedback controller in joint space, see [6] and [18] for non-model-based, in [3] and [7] for model-based nonlinear controllers. Also, the experimental investigation of the rigid model-based control of the lambda robot in Cartesian space was done [17]. Force/position control of the lambda robot is investigated in [4] while the contact tool weight that is installed on the robot’s endeffector is assumed to be included in the weight of the end-effector. In contrast to the previous work, here, the contact tool is modeled with all of the bodies that are included and these added bodies change the system dynamics and flexibility. Hence, a new system is required to be modeled. The contact force control of light-weight robots is a very important task and there are many applications, e.g., industrial [20, 24], surgical [13, 15, 19], or soft robotic [25] applications. Also, the contact force control of a flexible robot is investigated with one flexible link in [8, 10] and for multi flexible links in [24]. For interacting with an environment, the contact force in the interacting part is required to be controlled. There are two categories to control this force i.e., direct and indirect contact force control. For the direct approaches, the controller regulates the force with a control loop using the feedback of the force measurement and the desired value of the force. In this method, an explicit model of the interaction environment is required, see [16]. In contrast, in the indirect force control methods, the force is controlled by a position controller as an impedance, see [11] and [12]. In this work, we control the interaction using an impedance controller that is utilized to regulate the force, position, and velocity of the contact part of the very flexible lambda robot manipulator. The novelty of this work is, that a nonlinear-flexible-model-based position controller of the lambda robot with new dynamics in the work-space is designed. Then, this position controller is combined with an impedance part in order to have a controller architecture that can be used for controlling the position and acting force of the contact tool during its interaction with an environment. The impedance part determines a new desired position for tracking by the position controller part. In the impedance part, the new desired position and velocity are computed based on the measured or estimated force, position, and velocity of the robot’s contact tool. For validating the flexible-model-based controller performances, the lambda robot tracks trajectories and interacts with environments in various scenarios.
Adaptive Interaction Control of a Very Flexible Parallel Robot Manipulator
135
The simulation results of the controlled robot show that the contact tool tracks the given trajectories and interacts with environments with high performance. In order to explain the details, this paper is organized as follows: Sect. 2 describes the used robot, i.e., its mechanical parts, electrical parts, and contact tool bodies. Section 3 includes the modeling of the flexible parallel lambda robot. Section 4 explains the design of the nonlinear feedforward and feedback controller. Then, the design of the impedance controller is described in Sect. 5. In Sect. 6, the proposed nonlinear controllers are tested on the simulated model and the results are discussed. Finally, in Sect. 7, conclusions are presented.
2 Very Flexible Lambda Robot The used lambda robot has a parallel configuration and highly flexible links. Two prismatic actuators are connecting the links to the ground. They are connected using passive revolute joints to the linear actuators. Another revolute joint is used to connect the short link and the middle of the long link. An additional rigid body is attached to the free end of the long link as an end-effector. The drive positions and velocities are measured with two optical encoders. Two full Wheatstone bridge strain gauges are attached on the long flexible link to measure its deformation. The lambda robot configuration as shown in Fig. 1 has been built in hardware, see [6], at the Institute of Engineering and Computational Mechanics of the University of Stuttgart. To have a convenient robot configuration for interacting with an environment, a tool as shown in Fig. 2 is designed and installed on the robot end-effector. The tool platform is made of Aluminum to have minimum weight. It includes a spring to transfer the force, a force sensor to measure the interaction force, and a part to constrain displacement to the interacting force direction. The total weight of the contact tool is 505 g. The tool in Fig. 2 is installed on the robot end-effector shown in Fig. 3.
Fig. 1. Mechanical setup of the flexible parallel lambda robot.
136
F. Ansarieshlaghi and P. Eberhard
Fig. 2. Designed contact tool for the flexible parallel lambda robot
Fig. 3. Attached contact tool on the robot end-effector.
The electrical part of the experimental setup includes power supplies for motors, strain gauge’s amplifiers, force sensor amplifiers, digital/analog input-output boards, one target Speedgoat, a host computer, etc. For controlling the robot, the online control is done with a Speedgoat performance real-time target machine running a Mathworks xPCtarget kernel, which is called Simulink Real-Time since Matlab R2014a. Also, to observe the controller progress, for the inputs, the outputs, the safety logic of the lambda robot and their communication, a graphical user interface is available.
3 Modeling of the Very Flexible Lambda Robot The modeling process is divided into three major steps. First, the flexible components of the system are modeled with the linear finite element method in the commercial finite element code ANSYS. Next, for online control, the degrees of freedom of the flexible bodies shall be decreased. Therefore, model order reduction is utilized using
Adaptive Interaction Control of a Very Flexible Parallel Robot Manipulator
137
MatMorembs [9]. Then, all the rigid and flexible parts are combined as a multibody system with a kinematic loop using the academic multibody code Neweul-M 2 [14]. The flexible beams, i.e., the thin parts of the long link shown in Fig. 4, are modeled using Timoshenko beam elements. For each beam, one hundred beam elements, that are constrained to the horizontal plane, are used. Additionally, the rigid bodies, i.e. the passive joints, the end-effector, and the contact tool, are attached to the flexible segments. The contact tool is modeled by four bodies, i.e., a contact body, a structure to keep the spring deformation only in direction of the interaction force, a force sensor, and a connection part to install the structure on the robot end-effector. The total weights of the contact tool and the end-effector are 505 g and 727 g, respectively, so that the total weight at the end of the long link is 1235 g.
Fig. 4. Flexible long link with contact tool.
The defined elastic bodies include 200 beam elements and some rigid bodies yielding in total 600 degrees of freedom. Also, the movement constraint of the link shall be taken into account. Therefore, two translations for the nodal frame at the first joint and one translation for the second joint are locked. Next, modal analysis is utilized to determine the natural frequencies and mode shapes of the long link in ANSYS. The overall rigid body motion of the long link is described by a nonlinear approach while the small elastic deformation around this guiding motion is modeled linear with 600 unreduced mode shapes. The first eigenfrequency of the long link of the lambda robot without the contact tool is 2.48 Hz as in [2–4, 7] while its first eigenfrequency with the contact tool is 1.85 Hz. In the next step, the ANSYS output is used for model order reduction in MatMorembs. The modal model order reduction method is used to reduce the order of the flexible multibody model. The reduced flexible link model is then exported to the SID-file format [26]. Finally, the flexible long link, the short link, two linear drives, and three revolute joints are assembled as a flexible multibody system in Neweul-M2 . The equation of motion with a kinematic loop constraint, using the generalized coordinates q ∈ Rn , is formulated as ˙ = g(q, q) ˙ + Bu + C T (q)λ, M (q)q¨ + k(q, q) c(q) = 0.
(1a) (1b)
138
F. Ansarieshlaghi and P. Eberhard
The symmetric positive definite mass matrix M ∈ Rn×n depends on the joint angles and the elastic coordinates. The vector k ∈ Rn contains the generalized centrifugal, Coriolis and Euler forces, g ∈ Rn includes the vector of applied forces and inner forces due to the body elasticity. The input matrix B ∈ Rn×p maps the input vector u ∈ Rp to the system coordinates. The constraint equations are defined by c ∈ Rm . The Jacobian matrix of the constraint C = ∂c/∂q ∈ Rm×n maps the reaction force λ ∈ Rm due to the kinematic loop to the generalized coordinates. Figure 5 shows the setup of the robot with the contact tool on the end-effector in Neweul-M2 .
s1 α1 contact point α2 end-effector
s2
Fig. 5. Mechanical model of the lambda robot with contact tool.
While the system has a kinematics loop as a constraint in Eq. (1b), the system coordinates derivation q¨ can be divided into dependent q¨d ∈ R2 and independent q¨i ∈ R3 coordinates and the system constraints can be written as c¨ = C q¨ + c = Ci q¨i + Cd q¨d + c ,
(2)
where qd = [α1 , α2 ]T are the dependent generalized coordinates, qi = [s1 , s2 , qe , l]T represents independent generalized coordinates, Cd and Ci are dependent and independent parts of the Jacobian matrix of the constraints, and c presents local accelerations due to the constraints. The generalized coordinate l is the contact part movement that results from the acted force f as linear function of spring length changes by l = f /k. The spring stiffness of the contact tool has a constant value of k. Based on Eq. (2), the system’s generalized acceleration can be formulated as 03×3 I3×3 q¨i (3) = q¨ + = Jc q¨i + b . q¨ = q¨d −Cd−1 Ci i −Cd−1 c
Adaptive Interaction Control of a Very Flexible Parallel Robot Manipulator
139
Here, Jc and b are Jacobian matrix and local accelerations of the generalized acceleration based on the system constraints on acceleration level. Finally, by left-side multiplication with the transposed of the Jacobian matrix Jc and replacing q¨ by Eq. (3), Eq. (1a) can be written as JcT M (Jc q¨i + b ) = JcT (f + Bu + C T (q)λ)
(4)
while the product of the Jacobian matrix and the Jacobian matrix of the constraints is JcT C T = 0. The equation of motion (1a) can be determined by ¯ ¯ q¨i = f¯ + Bu. M
(5)
¯ = J T M Jc ∈ R3×3 , the input matrix is B ¯ = In Eq. (5), the inertia matrix is M c 3×2 ∈ R , and the force vector is f¯ = JcT (M b + f ) ∈ R3 . The independent generalized acceleration can be obtained as
JcT B
¯ −1 (f¯ + Bu). ¯ q¨i = M
(6)
The control of the interaction point of the contact tool is our goal. Thus, the kinematics for the contact point is formulated as x = s(qi ),
(7)
where x ∈ R2 is the contact point position and s ∈ R2 is a nonlinear function of the independent coordinates of the system. The velocity and acceleration of the contact point in Cartesian space are calculated from derivatives of the interaction point position in Eq. (7) as x˙ = J q˙i ,
(8a)
¨ = J q¨i + J˙ q˙i , x
(8b)
where J and J˙ are the Jacobian matrix of the robot kinematics and its derivative. Finally, by replacing q¨i with Eq. (6) in joint space in Eq. (8b), the dynamics description can be found in work space with ¯ −1 (f¯ + Bu)) ¯ ¨ = J (M x + J˙ q˙i .
(9)
Equation (9) will be used to control the position and contact force of the contact tool during interaction with an environment.
4 Control of the Very Flexible Lambda Robot The lambda robot control is separated into feedforward (offline) and feedback (online) control parts. In the feedforward control part, the desired system states or contact point trajectories or/and velocities are calculated based on the robot work or joint space, the maximum velocity of the robot prismatic joints, and the closed-loop kinematics as kinematics constraints, the maximum electric current of the robot actuators and the flexible coordinates oscillation amplitude as dynamics constraints, see [21].
140
F. Ansarieshlaghi and P. Eberhard
The feedback control part, i.e., the online part, acts in the robot joint space or workspace. In joint space, the trajectory tracking task is performed and the controller computes the lambda robot input using the nonlinear dynamics of the robot, the measured states, and the desired system states, based on the feedback linearization method [3] or the Lyapunov approach in [7]. In the workspace feedback controller, the controller tasks are trajectory tracking and interaction control. For the trajectory tracking task, the controller includes two parts, i.e., a position controller and the forward kinematics. Figure 6 shows the control structure for controlling the contact point position to track a trajectory in Cartesian space.

Fig. 6. Nonlinear position control structure (feedforward controller with kinematics and dynamics constraints, position controller, forward kinematics, and the lambda robot).
Therefore, the joint space to workspace transformation of the system dynamics is required, and it is computed in Eq. (9). The position control input of the robot is designed in Cartesian space and the control law is formulated as
u = -\bar{B}^{-1} \left( \bar{f} + \bar{M} J^{-1} ( \dot{J} \dot{q}_i - K_P e - K_D \dot{e} ) \right) ,   (10)
where x_d and \dot{x}_d are the desired values of the contact point position and velocity. The error and the error dynamics are calculated as e = x - x_d and \dot{e} = \dot{x} - \dot{x}_d. The matrices K_P and K_D correspond to the weighting of the feedback error and can be designed via the LQR method or tuned by hand. They should satisfy the stability conditions for nonautonomous systems, such as uniform stability. The inverse of the input matrix \bar{B} is not straightforward to calculate since \bar{B} is not of full row rank. Therefore, the existing left-inverse is used as a pseudo-inverse. The vector u represents the control input of the robot manipulator and the output of the designed position controller based on the Lyapunov approach. For the most complicated task, i.e., the interaction control, there are two approaches to control the interaction force. In the first approach, there is a separate actuator to regulate the force; this is called the direct force control method. In contrast to the direct approach, the indirect approach is utilized when there is no separate actuator to regulate the force, and the interaction force is controlled through its interaction position. The used robot has only two actuated joints, which can move only in the horizontal plane. They can only control the robot position, and there is no other actuator to control the interaction force.
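Before turning to the force control problem, a possible numerical evaluation of the position control law in Eq. (10) is sketched below. The use of numpy's pseudo-inverse for \bar{B}, the fallback pseudo-inverse for a non-square J, and the function interface are our own assumptions, not prescriptions from the paper.

```python
import numpy as np

def position_control_input(M_bar, B_bar, f_bar, J, Jdot, qi_dot, e, e_dot, Kp, Kd):
    """Sketch of the Cartesian position control law of Eq. (10).
    The left pseudo-inverse of B_bar stands in for its (non-existing) inverse."""
    B_pinv = np.linalg.pinv(B_bar)           # left-inverse of the 3x2 input matrix
    J_inv = np.linalg.inv(J) if J.shape[0] == J.shape[1] else np.linalg.pinv(J)
    u = -B_pinv @ (f_bar + M_bar @ J_inv @ (Jdot @ qi_dot - Kp @ e - Kd @ e_dot))
    return u
```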
The contact point can only apply a normal force to the environment, since friction is low and not considered. The force acting on the environment depends on the position of the contact part and the robot configuration relative to the environment. Thus, the position and force of the contact point shall be controlled dependently. Since the contact force cannot be regulated directly, it must be controlled indirectly through its position, i.e., by using indirect force control methods. One of the approaches used for indirect force control, see [22, 23], is the impedance control method. In this work, the impedance control approach is used, which refers to the dynamic relation of the motion and contact force of the manipulator during an interaction with an object. This means that the impedance controller expresses the contact force as a function of the contact part's position, which the position controller then regulates. Accordingly, to transfer the contact force as a function of position, a spring is used, as shown in Fig. 2. Advanced feedback control in the workspace is therefore required. This controller is a combination of the impedance controller, the forward kinematics, and the position controller when the robot contact part interacts with an environment in the Cartesian space of the robot manipulator. Based on the feedback controller structure presented in Fig. 6, the new feedback controller architecture is developed to control the force and position of the robot contact point at the same time. The developed feedback controller structure is shown in Fig. 7.

Fig. 7. Force/position control structure for interaction with an environment (feedforward controller, impedance controller, position controller, forward kinematics, and the lambda robot with force feedback).
In this research, the interaction environment is unknown. This means that there is no information about the position and shape of the environment, and there is no desired position x_d and velocity \dot{x}_d available for tracking by the position controller. Therefore, the impedance controller must produce the desired position (x_d) and velocity (\dot{x}_d) of the contact point for tracking by the position controller part. Also, for interacting with an environment, only the magnitude of the tracking velocity v_d and the magnitude of the acting force f_d are fed to the feedback controller. Based on the measured force of the contact part, the impedance part can distinguish whether the top of the force element is in contact or not. In case of contact, it decides to move along the surface and regulate the force; otherwise, it moves to establish contact.
The desired values are computed based on the values y_d = [v_d, f_d]^T fed by the feedforward controller, the measured states q_i and \dot{q}_i, the measured contact force f, and the position and velocity of the contact part, x and \dot{x}, computed by the forward kinematics. In Fig. 7, the position controller acts as in the last part of Fig. 6, based on its input signals and the dynamics model of the robot, but with differently designed weighting matrices. In Sect. 5, more details about the design idea and the mathematical formulation of the impedance controller are presented.
5 Impedance Control

Impedance control is a method that is used for indirect force control. It refers to the dynamic relation of the motion and the contact force of the robot during interaction with an object. Here, the force element used as the impedance is a spring with stiffness k and no damping. For this reason, the impedance controller can also be called a stiffness controller, see [22, 23]. Also, it is assumed that the interacting surfaces are smooth and made of non-deformable materials. Figure 8 shows some of the interacting configurations of the contact part with an environment.

Fig. 8. Different configurations of the contact tool of the robot during interaction with a surface (situations a0 to a4, with tool lengths l0 to l3 and contact forces f1, f2, f3 compared to their desired values f1d, f2d, f3d).
When the interaction is started, in the first case, called the a0 situation, the robot moves in the negative direction of the x-axis, without moving in the y-direction, until f > 0. Then, surface tracking and force regulation are started. In the a1 situation, the tool is in contact, i.e., f > 0, and is compressed, l1 < l0. This means that the force regulation and the motion on the surface are active. In the a1 situation, the desired force is smaller than the measured one, f1 > f1d > 0. Therefore, the tool shall be compressed less and the robot shall move in the positive direction of the x-axis, and the impedance controller shall compute the desired trajectory such that x_{d_x}(t + Δt) > x_x(t). In the third configuration, called the a2 situation, the amplitude of the acting force is smaller than the desired value, f2d > f2 > 0, see Fig. 8. The robot shall move in the negative direction of the x-axis, i.e., x_{d_x}(t + Δt) < x_x(t). By this movement, the contact tool will be compressed more, as l(t + Δt) > l(t). In the last contact configuration, the a3 situation, the measured and desired forces are equal, f3 = f3d > 0. In this status, the desired contact part position is computed only based on the obtained x. In the last case, the regulation and tracking tasks
on the surface are ended, and the robot finishes its task by keeping its distance from the surface to avoid further interaction with it, shown in Fig. 8 as the a4 situation. For interaction with an unknown environment, the desired position and velocity of the contact point for tracking are calculated as
x_d = g_1(f, f_d, q_i, x) ,   (11a)
\dot{x}_d = g_2(q_i, x, \dot{x}, v_d) ,   (11b)
where g_1 and g_2 are the designed nonlinear functions to obtain the desired signals. The functions designed for computing x_d and \dot{x}_d play a very important role since, based on these functions, the contact tool can find the surface, move along it, and apply force to it. The length of the contact tool changes based on the magnitude of the contact force and the stiffness of the spring used in the tool, and this change is described by l. Some parameters that are utilized in g_1 and g_2 are defined next. These parameters are computed based on the inputs of the functions in Eq. (11).
Fig. 9. Schematic view of the lambda robot in contact with an environment.
Figure 9 shows the schematic view of the lambda robot with its designed tool in interaction with an environment. The length of the designed contact tool changes based on the acting force, as shown in Fig. 9. The angles β_1 and β_2 are the angles between the y-direction of the global coordinate system and the robot's contact point in two configurations, see Fig. 9. This angle is calculated as
β = g_3(q_i) = \frac{\pi}{2} - (\alpha_1 - r q_e) ,   (12)
where q_e is the flexible generalized coordinate of the system and r is the constant gain that describes the elastic rotation and displacement of the long flexible link. This angle is utilized to compute the desired position and velocity of the contact point for the upcoming sample time. It is used to obtain the robot's configuration during interaction, since there is no sensor to measure the robot position and configuration in real time.
The angle θ, see Fig. 9, is determined based on the position of the robot contact point at the current time t and at the previous time t − Δt. The time step of the control loop is Δt = 250 µs. This angle is computed as
θ = \arctan\!\left( \frac{x_{x_t}}{x_{y_t}} \right) - \arctan\!\left( \frac{x_{x_{t-\Delta t}}}{x_{y_{t-\Delta t}}} \right) .   (13)
In Eq. (13), x_{x_t} is the position of the contact point at the current time in the x-direction and x_{y_t} in the y-direction. The angle θ is applied to correct the velocity direction on the surface. Also, this angle is used to estimate the environment shape. There are different contact situations and configurations, shown in Fig. 8, and for each interaction situation the impedance controller needs an appropriate solution. Therefore, for the new desired position, the function g_1 is designed based on Eqs. (12), (13), and (11) for the next sample time t + Δt as
x_d = g_1 = x + \frac{P_j}{k} \begin{bmatrix} f_e \sin(\beta) \\ f_e \cos(\beta) \end{bmatrix} .   (14)
In Eq. (14), f_e is the force regulation error, f_e = f_d − f, and the matrix P_j represents the design gain. The subscript j is related to the different situations shown in Fig. 8 and is selected by the controller based on the amplitude of the force regulation error f_e and its sign. Different gain matrices P_j are used for the different conditions, in a gain scheduling manner. This is utilized to improve the position and force controller performance and to adapt the robot to the new interaction situation. Also, to increase the interaction performance and to move on the interacted surface, the function g_2 for the next sample time is formulated as
\dot{x}_d = g_2 = \underbrace{\begin{bmatrix} v_d \sin(\theta + \beta) \\ v_d \cos(\theta + \beta) \end{bmatrix}}_{m} + \underbrace{\begin{bmatrix} \cos(\theta) & \sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} A_n e_v}_{z} ,   (15)
where v_d is the magnitude of the velocity on the surface and can be selected as a constant or as a function of time and the contact point position. Next, e_v is defined as the velocity error and is computed using \dot{x} = [v_x, v_y]^T as
e_v = \begin{bmatrix} v_d \sin(\theta) - v_x \\ v_d \cos(\theta) - v_y \end{bmatrix} .   (16)
The matrix A_n is the design gain, and the subscript n depends on the amplitude of the velocity error e_v; this gain matrix is selected based on the magnitude of the velocity error. The vector m depends on the angles β and θ, which are required to compute the velocity projection on the surface. The vector z compensates the velocity tracking error for different configurations and increases the interaction performance. After designing the controllers, they shall be tested on the simulated model of the lambda robot to investigate their performance in various scenarios.
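To make the update rules in Eqs. (12)-(16) concrete, a simplified sketch of the impedance (stiffness) controller is given below. The gain tables P_GAINS and A_GAINS, the scheduling thresholds, the velocity estimate, and the function interface are illustrative assumptions, not values or design choices from the paper.

```python
import numpy as np

def impedance_update(x, x_prev, f, f_d, v_d, beta, k, dt, P_GAINS, A_GAINS):
    """Simplified impedance controller following Eqs. (13)-(16).
    P_GAINS and A_GAINS are sequences of 2x2 numpy gain matrices.
    Returns the desired position x_d and velocity xdot_d for time t + dt."""
    # Eq. (13): motion direction on the surface from two successive positions
    theta = np.arctan2(x[0], x[1]) - np.arctan2(x_prev[0], x_prev[1])

    # Eq. (14): desired position from the force regulation error
    f_e = f_d - f
    P_j = P_GAINS[0] if abs(f_e) < 0.1 else P_GAINS[1]   # crude gain scheduling on |f_e|
    x_d = x + (P_j / k) @ np.array([f_e * np.sin(beta), f_e * np.cos(beta)])

    # Eqs. (15)-(16): desired velocity = tangential term m + correction term z
    m = v_d * np.array([np.sin(theta + beta), np.cos(theta + beta)])
    v = (x - x_prev) / dt                                 # simple velocity estimate
    e_v = np.array([v_d * np.sin(theta) - v[0], v_d * np.cos(theta) - v[1]])
    A_n = A_GAINS[0] if np.linalg.norm(e_v) < 0.01 else A_GAINS[1]
    R = np.array([[np.cos(theta), np.sin(theta)], [np.sin(theta), np.cos(theta)]])
    xdot_d = m + R @ A_n @ e_v
    return x_d, xdot_d
```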
6 Simulation Results

To validate the designed nonlinear controllers, two tasks in Cartesian space are defined, i.e., trajectory tracking and interaction with an unknown environment. For each task, two scenarios are defined. The first task is trajectory tracking. This task is defined to test the position controller and to make sure that this controller is suitable for interaction. In the first scenario, the contact part tracks a line trajectory. For the second scenario, to increase the complexity of the task, the trajectory is defined as a nonlinear path, here a sine shape. For the second task, the interaction with a surface, scenario 3 is described as interaction with a flat smooth surface. In scenario 4, the contact tool interacts with a curved surface having a sine shape. To investigate these tasks, the nonlinear controllers are designed based on the flexible multibody model of the lambda robot with the contact tool and implemented on the simulated model. To obtain highly accurate simulations, the flexible long link of the lambda robot is modeled with the first 6 mode shapes, see Fig. 4. The short link is modeled based on the model presented in [1], using only the first mode shape.

6.1 Trajectory Tracking
The performance of the flexible model-based nonlinear position controller is investigated by tracking two trajectories. For this task, the position and velocity tracking of the contact point is the control goal, and the defined trajectories are time dependent. For scenario 1, the robot tracks a line that starts at the point x_start = [−0.6, −0.5]^T and ends at the point x_end = [−0.6, −0.1]^T in the robot's workspace, and this trajectory should be followed for 8 s. The results for scenario 1 are shown in Fig. 10.
Fig. 10. Tracking the line trajectory in scenario 1. The left side shows the trajectory tracking of the contact part (contact-p) in comparison with the desired trajectory in the x- and y-direction. The right side shows the absolute velocity (vabs) of the contact part in comparison with its desired tracking velocity.
For scenario 2, to increase the difficulty of the tracking task, a nonlinear trajectory is defined in such a way that the contact point in Fig. 5 tracks a sine shape. The nonlinear
trajectory for the y-direction has the same start and end point while the x-direction is described by x = −0.05 sin(0.3φ) − 0.6 m and φ is defined as φ ∈ [0, 10.5] rad. Figure 11 shows the simulation results for tracking this trajectory.
Fig. 11. Tracking a sine shape trajectory in scenario 2 in comparison with its desired curve.
The tracking results for both scenarios show that the designed nonlinear control law based on the Lyapunov function in Eq. (10) is able to limit the error and shows uniform stability. Also, the selected design gain matrices can minimize the tracking error. Based on the simulation results of these two scenarios, the flexible model-based position controller can track the desired trajectories with high accuracy, as shown in Fig. 12. The trajectory tracking error in Fig. 12, left, shows that the robot contact part can track the trajectories with a maximum tracking error of less than 8 mm. Also, the results show that the robot under the designed controller oscillates with less than 2 mm amplitude.
Fig. 12. Absolute tracking position ep and velocity ev error for trajectory tracking for scenarios 1 and 2.
For scenario 1 in Fig. 10, the root-mean-square error is 2.394 mm. This shows that the contact part tracks this trajectory with more than 99.4% accuracy. In scenario 2, the root-mean-square error is 2.523 mm and the tracking accuracy for the nonlinear trajectory is more than 99%. The results of both scenarios show that the controlled robot tracks trajectories with high accuracy.
After assuring the performance of the designed position controller in Cartesian space, the feedback controller can be extended via the impedance controller designed in Sect. 5 to regulate the acting force and the position of the contact point at the same time.

6.2 Interaction with an Environment

Finally, the most complicated task, interaction with an unknown environment, is investigated. For this task, two surfaces are defined, a flat one (scenario 3) and a curved one (scenario 4). The contact part interacts with these environments in the workspace of the lambda robot. To allow a better analysis of the controlled robot's behavior, the desired operations have two phases. In the first phase, the desired velocity signal is time-dependent, while in the second phase the desired velocity signal is constant. For scenario 3, the robot interacts with a flat surface that is located at x = −0.6 m with a width of y ∈ [−0.6, −0.3] m in the robot workspace. The interaction results are shown in Fig. 13. Scenario 4 is defined to show the interaction with a curved surface. In this scenario, the contact part manipulates a surface that is described by a nonlinear function, here a sine shape. This surface is located at x = −0.6 m, has the same width, but has a curved part in the x-direction described by x = −0.05 sin(0.3φ) − 0.6 m. The interaction results on the curved surface in scenario 4 are shown in Fig. 14. The force regulation error and the velocity error of the contact point on the surface are depicted in Fig. 15. The simulation results of this task show that the robot, by using the controller, can successfully interact with an unknown environment. The root-mean-square error of the force regulation during the interaction with the flat surface is 0.1 N, while this value for the curved environment is 0.15 N. The robot controlled by the designed impedance controller can regulate the interaction force with an accuracy of 95% for the flat surface and 93% for the curved one. A similar analysis for moving on the surfaces shows that the contact part moves with a root-mean-square velocity tracking error of 0.3 cm/s on the flat surface and 0.6 cm/s on the curved surface. The investigation results illustrate that the robot can move with the desired velocity with an accuracy of 94% in scenario 3 and 84% in scenario 4.
Fig. 13. Interacting with a flat surface in scenario 3, comparing results of the acting force (f-act) and magnitude of velocity (vm ) with their desired values.
Fig. 14. Comparison results in scenario 4 of the acting force and velocity on a curved surface with their desired values.
Fig. 15. Absolute force regulation and moving velocity error during interaction with unknown surfaces.
Also, the velocity tracking results of the contact point on the right sides of Figs. 13 and 14 show that the contact point first moves forward to get into contact with the surfaces, as this is the priority of the task. This means that first establishing the interaction and then moving and regulating the force on the surfaces are distinct phases of the impedance controller in an unknown interaction. The highly accurate interaction achieved by the adaptive impedance controller verifies the mathematical formulation presented in Sect. 5. Also, the performance of the force/position controller in each interaction scenario shows that the flexible model-based controller reaches its goals successfully.
7 Conclusions

In this paper, a high-performance contact position controller and its combination with an adaptive impedance controller were presented for a very flexible robot manipulator with its contact tool. The designed position controller, based on the independent system coordinates, computed the robot's input using the measurable states of the system, the obtained and desired position, and the velocity of the contact point. The nonlinear feedback controller was extended with an impedance part in order to also regulate the contact force. The simulation results on the lambda robot model with the contact part illustrated that the nonlinear controllers based on the flexible model have high
performance and that the robot is able to interact with an unknown environment. Also, the adaptive interaction controller performed all complicated tasks with high accuracy and successfully overcame all challenges. For future work, the designed controllers, i.e., the position and the adaptive impedance controller, will be tested on the robot hardware with the contact tool and their performance will be investigated. Also, to improve the interaction performance, the friction force between the contact tool and the environment during interaction shall be modeled and compensated.

Acknowledgements. This research was performed within the Cluster of Excellence in Simulation Technology SimTech at the University of Stuttgart and is partially funded by the Landesgraduiertenkolleg Baden-Württemberg.
References

1. Ansarieshlaghi, F., Eberhard, P.: Design of a nonlinear observer for a very flexible parallel robot. In: Proceedings of the 7th GACM Colloquium on Computational Mechanics for Young Scientists from Academia and Industry, Stuttgart, Germany (2017)
2. Ansarieshlaghi, F., Eberhard, P.: Experimental study on a nonlinear observer application for a very flexible parallel robot. Int. J. Dyn. Control 7(3), 1046–1055 (2018)
3. Ansarieshlaghi, F., Eberhard, P.: Trajectory tracking control of a very flexible robot using a feedback linearization controller and a nonlinear observer. In: Proceedings of 22nd CISM IFToMM Symposium on Robot Design, Dynamics and Control, Rennes, France (2018)
4. Ansarieshlaghi, F., Eberhard, P.: Hybrid force/position control of a very flexible parallel robot manipulator in contact with an environment. In: Proceedings of the 16th International Conference on Informatics in Control, Automation and Robotics, Volume 2: ICINCO, Prague, Czech Republic, pp. 59–67. INSTICC, SciTePress (2019)
5. Burkhardt, M., Holzwarth, P., Seifried, R.: Inversion based trajectory tracking control for a parallel kinematic manipulator with flexible links. In: Proceedings of the 11th International Conference on Vibration Problems, Lisbon, Portugal (2013)
6. Burkhardt, M., Seifried, R., Eberhard, P.: Experimental studies of control concepts for a parallel manipulator with flexible links. In: Proceedings of the 3rd Joint International Conference on Multibody System Dynamics and the 7th Asian Conference on Multibody Dynamics, Busan, Korea (2014)
7. Eberhard, P., Ansarieshlaghi, F.: Nonlinear position control of a very flexible parallel robot manipulator. In: Proceedings ECCOMAS Thematic Conference on Multibody Dynamics, Duisburg, Germany (2019)
8. Endo, T., Sasaki, M., Matsuno, F., Jia, Y.: Contact-force control of a flexible Timoshenko arm in rigid/soft environment. IEEE Trans. Autom. Control 62(5), 2546–2553 (2017)
9. Fehr, J., Grunert, D., Holzwarth, P., Fröhlich, B., Walker, N., Eberhard, P.: Morembs - a model order reduction package for elastic multibody systems and beyond. In: Reduced-Order Modeling (ROM) for Simulation and Optimization, pp. 141–166 (2018)
10. Feliu-Talegon, D., Feliu-Batlle, V., Tejado, I., Vinagre, B.M., HosseinNia, S.H.: Stable force control and contact transition of a single link flexible robot using a fractional-order controller. ISA Trans. 89, 139–157 (2019)
11. Hogan, N.: Impedance control: an approach to manipulation: Part II - Implementation. J. Dyn. Syst. Meas. Contr. 107(1), 8–16 (1985)
12. Jung, S., Hsia, T.C., Bonitz, R.G.: Force tracking impedance control of robot manipulators under unknown environment. IEEE Trans. Control Syst. Technol. 12(3), 474–483 (2004)
13. Kamikawa, Y., Enayati, N., Okamura, A.M.: Magnified force sensory substitution for telemanipulation via force-controlled skin deformation. In: IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, pp. 1–9 (2018)
14. Kurz, T., Burkhardt, M., Eberhard, P.: Systems with constraint equations in the symbolic multibody simulation software Neweul-M². In: Proceedings of the ECCOMAS Thematic Conference on Multibody Dynamics, Brussels, Belgium (2011)
15. Li, Y., Ganesh, G., Jarrassé, N., Haddadin, S., Albu-Schaeffer, A., Burdet, E.: Force, impedance, and trajectory learning for contact tooling and haptic identification. IEEE Trans. Rob. 34(5), 1170–1182 (2018)
16. Luh, J., Fisher, W., Paul, R.: Joint torque control by a direct feedback for industrial robots. IEEE Trans. Autom. Control 28(2), 153–161 (1983)
17. Morlock, M., Burkhardt, M., Schröck, C., Seifried, R.: Nonlinear state estimation for trajectory tracking of a flexible parallel manipulator. IFAC-PapersOnLine 50(1), 3449–3454 (2017)
18. Morlock, M., Burkhardt, M., Seifried, R.: Control of vibrations for a parallel manipulator with flexible links - concepts and experimental results. In: Proceedings of the MOVIC & RASD International Conference, Southampton, England (2016)
19. Sandoval, J., Su, H., Vieyres, P., Poisson, G., Ferrigno, G., Momi, E.D.: Collaborative framework for robot-assisted minimally invasive surgery using a 7-DoF anthropomorphic robot. Robot. Auton. Syst. 106, 95–106 (2018)
20. Schindlbeck, C., Haddadin, S.: Unified passivity-based cartesian force/impedance control for rigid and flexible joint robots via task-energy tanks. In: IEEE International Conference on Robotics and Automation (ICRA), Seattle, Washington, pp. 440–447 (2015)
21. Seifried, R., Burkhardt, M., Held, A.: Trajectory control of flexible manipulators using model inversion. In: Proceedings of the ECCOMAS Thematic Conference on Multibody Dynamics, Brussels, Belgium (2011)
22. Siciliano, B., Khatib, O. (eds.): Springer Handbook of Robotics. Springer, Heidelberg (2016)
23. Siciliano, B., Villani, L.: Robot Force Control. Springer, Heidelberg (1999)
24. Suarez, A., Giordano, A.M., Kondak, K., Heredia, G., Ollero, A.: Flexible link long reach manipulator with lightweight dual arm: soft-collision detection, reaction, and obstacle localization. In: IEEE International Conference on Soft Robotics (RoboSoft), Livorno, Italy, pp. 406–411 (2018)
25. Vogel, J., Haddadin, S., Jarosiewicz, B., Simeral, J.D., Bacher, D., Hochberg, L.R., Donoghue, J.P., van der Smagt, P.: An assistive decision-and-control architecture for force-sensitive hand-arm systems driven by human-machine interfaces. Int. J. Robot. Res. 34(6), 763–780 (2015)
26. Wallrapp, O.: Standardization of flexible body modeling in multibody system codes, part I: definition of standard input data. Mech. Struct. Mach. 22(3), 283–304 (1994)
Bayesian Transfer Learning Between Uniformly Modelled Bayesian Filters

Ladislav Jirsa¹, Lenka Kuklišová Pavelková¹(B), and Anthony Quinn¹,²

¹ The Czech Academy of Sciences, Institute of Information Theory and Automation, Prague, Czech Republic
{jirsa,pavelkov,aquinn}@utia.cas.cz
² Trinity College Dublin, the University of Dublin, Dublin, Ireland
http://www.utia.cas.cz

Abstract. We investigate sensor network nodes that sequentially infer states with bounded values, and affected by noise that is also bounded. The transfer of knowledge between such nodes is the principal focus of this chapter. A fully Bayesian framework is adopted, in which the source knowledge is represented by a bounded data predictor, the specification of a formal conditioning mechanism between the filtering nodes is avoided, and the optimal knowledge-conditioned target state predictor is designed via optimal Bayesian decision-making (fully probabilistic design). We call this framework Bayesian transfer learning, and derive a sequential algorithm for pairs of interacting, bounded filters. To achieve a tractable, finite-dimensional flow, the outputs of the time step, transfer step and data step are locally projected onto parallelotopic supports. An informal variant of the transfer algorithm demonstrates both strongly positive transfer of high-quality (low variance) source knowledge—improving on a former orthotopically supported variant—as well as rejection of low-quality (high variance) source knowledge, which we call robust transfer.

Keywords: Bayesian transfer learning · Fully probabilistic design · Bayesian filtering · Uniform noise · Parallelotopic bounds · Robust transfer

This research has been supported by GAČR grant 18-15970S.
1 Introduction

Transfer learning relates to contexts where a computer-based learning task undertaken by a processing node—called the target node—is enhanced in performance and/or efficiency if the results of a related task—undertaken by a distinct source task—are made available to it. Within this generic formulation, there is significant current interest in the definition and implementation of transfer learning schemes for a range of machine intelligence [34] and big data problems [42]. One such important focus is in distributed knowledge representation and processing in sensor networks, addressing IoT and Industry 4.0 priorities [6]. Local inferences, driven by data in the individual sensors, are merged or fused in order to design a global inference [21]. Applications range from health monitoring [38] and environmental sensing [43], to cooperative systems for
indoor tracking [5] and urban vehicular localization [26]. A multi-disciplinary literature on multimodal and multi-sensor data fusion has evolved, with overviews available in [23] and [13], for example. Many of those methods and applications can be well characterized as transfer learning for data-driven inference tasks.

The technical approaches adopted for problems of data-driven knowledge transfer and fusion are as diverse as the applications themselves. Wiener-type moment estimation for knowledge representation is adopted in the measurement vector fusion method of [39], as well as in sequential Kalman filter variants for fusion of measurements [7] and states [41]. A Dempster-Shafer evidence-based approach to data fusion has recently been outlined [40]. Traditional machine learning and artificial intelligence tools have, of course, also been deployed, such as support vector machines and neural networks [35], clustering methods [38], functional regression [32], and expert system approaches [25].

Our focus in this chapter is on the Bayesian framework for data-driven knowledge representation and decision-making for transfer learning in sensor networks. In this context, sensors convert local data harvests into probability distributions of quantities of interest, and the computed distributions are transferred from source to target tasks (i.e. sensor nodes), to improve inference in the latter. This objective is related to distributional pooling [1], insofar as local knowledge representations are probabilistic, although pooling criteria are not generally deducible from formal Bayesian-theoretic principles. More recently, fully Bayesian approaches to distributional pooling have been derived, for instance in the context of centralized pooling [2]. A Bayesian transfer learning strategy has been reported for static [10] and dynamic [28] transfer between Kalman filters. In those works, the transfer step is induced between the time- and data-steps of isolated Kalman filtering. Our aim in this chapter is to make progress with the extension of these methods—which we call fully probabilistic design (FPD) [30]—to important classes of filtering tasks beyond the linear-Gaussian case.

In particular, there is considerable motivation in sensor network contexts in investigating bounded innovations and/or observation processes [11, 14, 36]. Under Gaussian distributional assumptions, one approach has been to project state estimates onto the constraint surface [9], or to truncate the distribution [33]. Using sequential Monte Carlo sampling methods, constraints are imposed via the accept/reject steps of the algorithm [24]. Meanwhile, non-probabilistic techniques for handling "unknown-but-bounded errors" seek to design an approximate set containing the state estimates [3, 4]. The current authors have adopted a strictly Bayesian framework for linear state filtering with uniform stochastic drivers (LSU filtering). Specifically, in [29], the filtering distribution is derived via local distributional approximations which sequentially project the output of the data step and time step into the class of uniform distributions on orthotopic supports (UOS). We will recall this setting in Sect. 2. In recent work, we have developed a Bayesian knowledge transfer mechanism between pairs of these UOS filters, adopting the axiomatically optimal approach of fully probabilistic design (FPD) [30], as already mentioned above.
An important characteristic of this approach is that it does not require specification of a complete stochastic dependence model between the source and target tasks. While complete modelling is the conventional approach, allowing knowledge transfer via standard Bayesian conditioning [18], it can be difficult to elicit a complete (hierarchical) model of inter-task dependence, leading to model
robustness issues. Instead, in our work, the source-knowledge-conditional model is designed in an optimal Bayesian decision-theoretic sense. We refer to this FPD-optimal approach to conditioning for knowledge transfer as Bayesian transfer learning [27]. The knowledge transferred from the source to target filters is represented via the one-step-ahead (data) predictor [31]. This was derived for the UOS filter in [15]. The derivation of the approximate sequential statistics, and the performance of the FPD-optimal transfer algorithm, between UOS filters has recently been reported [17].

In this chapter, we report the following progress with FPD-optimal Bayesian transfer learning between LSU filtering nodes:
(i) The local projections into the UOS class (after each data and time step) have no guarantees bounding the sequential model approximation error. In particular, the projection of the filtering distribution and the state predictor into the orthotopic support is loose. We have recently derived an algorithm for closure of the isolated LSU distributions (state filtering and prediction) within the class of uniform distributions on (more flexible) parallelotopic supports (UPS), which we will also review in Sect. 2. In this chapter, we will extend the UOS-closed Bayesian transfer learning algorithm of [17] to UPS-closure. We anticipate tighter modelling approximations and correspondingly improved filtering performance.
(ii) For the purpose of (i), we must derive the one-step-ahead data predictor via which the UPS-closed (source) filter transfers its knowledge sequentially to the target filter. We therefore adapt the (scalar) data predictor derived for the UOS-closed filter [15] to this more flexible class.
(iii) In [17], our simulations indicated only very modest improvements in filtering (tracking) accuracy of the UOS-closed target filter under FPD-optimal transfer from the UOS source filter. By examining the geometric nature of this data-predictive transfer, we were able to propose an informal transfer method involving intersections between the interval supports of the (scalar) source and target data predictors [17]. In this chapter, we will again develop this alternative—now in the context of the UPS-closed filters—as a variant for comparison with the formal (FPD-optimal) transfer method. In Sect. 5, we will present simulation evidence of the comparative performance of these transfer learning algorithms.

The paper is organised as follows. In Sect. 2, the UOS and UPS classes are introduced. Then, the optimal local distributional projections (i.e. approximations) within both the UOS and UPS class of data and time updates are presented, as well as the relevant predictors. Section 3 defines the problem of transfer learning in a pair of Bayesian filters and proposes a tractable (i.e. finite-dimensional and recursive) framework for FPD-optimal knowledge transfer of the source data predictor. This framework is then applied to (i) a pair of UOS filters and (ii) a pair of UPS filters. In Sect. 3.2, the informal knowledge transfer method between a pair of UOS filters—inspired by a support-intersection property of FPD [17]—is recalled, and deployed here, too, to transfer between a pair of UPS filters. Section 5 reports the comparative performances of the proposed transfer learning algorithms in the context of UOS and UPS filter pairs.

The following notation is used: z_t is the value of a column vector, z, at a discrete time instant, t ∈ t* ≡ {1, 2, . . . , \bar{t}}; z_{t;i} is the i-th entry of z_t; z(t) ≡ {z_t, z_{t-1}, . . . , z_1, z_0}; ≡ denotes equality by definition; ∝ denotes equality up to a
constant factor. Matrices are capitalized (e.g. A), vectors and scalars are lowercase (e.g. b). A_{ij} is the element of matrix A in the i-th row and j-th column, and A_i denotes the i-th row of A. \ell_z denotes the length of a (column) vector z, and Z denotes the defined set of z. I is the identity matrix. χ_z(Z) is the set indicator, which equals 1 if z ∈ Z and 0 otherwise. The p-norm is \|z\|_p = \left( \sum_{i=1}^{\ell_z} |z_i|^p \right)^{1/p}; in particular, \|z\|_2 is the Euclidean norm and \|z\|_\infty = \max_i(|z_i|). No formal distinction is made between a random variable and its realisation.
2 Bayesian Filtering and Prediction with Uniformly Distributed Noise Processes

In this section, Bayesian filtering and prediction are introduced and applied to a state-space model with uniformly distributed noises (which we call the LSU model). Two cases are distinguished: projection of the states into (i) the class of uniform distributions on an orthotopic support (i.e. the UOS class); and into (ii) the class of uniform distributions on a parallelotopic support (i.e. the UPS class), respectively.

2.1 UOS and UPS Class Definition
We consider a finite-dimensional vector random variable z with realisations in a bounded subset of R^{\ell_z}. We now define appropriate support sets in an \ell_z-dimensional space. A polytope is a bounded set defined by a finite number of flat facets. In this paper, we specialise this to the following types of convex polytope. A zonotope Z_Z is a convex polytope formed by the intersection of k strips, k ≥ \ell_z. It can be expressed as
Z_Z = \{ z : a \le V z \le b \} ,   (1)
where a and b are vectors of length k, of lower and upper bounds, respectively, which are meant entry-wise; V is a matrix of size k × \ell_z with rank \ell_z. The i-th strip is therefore given by the inequality
Z_{S_i} = \{ z : a_i \le V_i z \le b_i \} .   (2)
A parallelotope Z_P is a special case of a zonotope (1) with k = \ell_z, so that V is a square invertible matrix. An orthotope Z_O is a special case of the parallelotope with adjacent facets orthogonal. It then holds that V = I in (1). Furthermore, \underline{z} ≡ a and \overline{z} ≡ b are the lower and upper bounds of z, respectively. Then, the orthotope is specified by
Z_O = \{ z : \underline{z} \le z \le \overline{z} \} .   (3)
Table 1 presents volumes—equivalent to Lebesgue measure—and degrees of freedom (dof) for the above defined support sets. The dof corresponds to the minimal number of geometric parameters that unambiguously defines the mentioned set.
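As a small illustration of the set descriptions in Eqs. (1)-(3), a membership test could be sketched as follows; the helper name and the numerical tolerance are our own additions.

```python
import numpy as np

def in_zonotope(z, a, V, b, tol=1e-12):
    """Check whether z lies in the set {z : a <= V z <= b} of Eq. (1).
    A parallelotope (square invertible V) and an orthotope (V = I)
    are special cases, cf. Eqs. (2)-(3)."""
    w = V @ z
    return bool(np.all(w >= a - tol) and np.all(w <= b + tol))
```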
Table 1. The degrees of freedom (dof) and Lebesgue measure, i.e. volume, V, of various convex polytope specialisations.

                 dof                              volume V
Orthotope        2\ell_z                          \prod_{i=1}^{\ell_z} (\overline{z}_i − \underline{z}_i)
Parallelotope    \ell_z(\ell_z + 2)               |det V|^{−1} \prod_{i=1}^{\ell_z} (b_i − a_i), see [29]
Zonotope         k(k + 2), k > \ell_z             the sum of the Vs of its generating parallelotopes, see [12]
Besides the standard form (1), we introduce another two equivalent descriptions of a parallelotope Z_P that will be used further. The [−1, 1]-form of a parallelotope, equivalent to the standard form (1), is defined as
Z_P = \{ z : -1_{(\ell_z)} \le W z - c \le 1_{(\ell_z)} \} ,   (4)
where 1_{(\ell_z)} is a unit vector of length \ell_z and W_{ij} = 2V_{ij}/(b_i − a_i), c_i = (b_i + a_i)/(b_i − a_i), i, j = 1, . . . , \ell_z. We can transform it back to the standard form (1) using a = c − 1_{(\ell_z)}, b = c + 1_{(\ell_z)}, V = W. An expression for the parallelotope (4) in terms of its centroid \hat{z} is [37]
Z_P = \{ z : z = \hat{z} + T ξ \} ,   (5)
where T = W^{−1}, \hat{z} = T c, and ξ is any vector such that \|ξ\|_\infty ≤ 1.

UPS Class — A uniform distribution of z on a parallelotopic support (1), i.e. the UPS distribution, is defined as
U_z(a, b, V) ≡ V_P^{−1} χ_z(a \le V z \le b) ,   (6)
where V_P is given in the second row of Table 1. The first moment (mean value) of the UPS distribution (6) is
E[z | a, b, V] ≡ \hat{z} = V^{−1} \frac{b + a}{2} .   (7)
The second central moment (covariance) of the UPS is
cov[z | a, b, V] = \frac{1}{3} V^{−1} G G^{\top} V^{−\top} ,   (8)
where G_{ii} = \frac{b_i − a_i}{2} and G_{ij} = 0, i ≠ j, i, j = 1, . . . , \ell_z.
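A compact numerical sketch of the UPS quantities in Eqs. (6)-(8) might look as follows; the class name and interface are illustrative only and not part of the original formulation.

```python
import numpy as np

class UPS:
    """Uniform distribution on the parallelotopic support {z : a <= V z <= b}."""
    def __init__(self, a, b, V):
        self.a = np.asarray(a, float)
        self.b = np.asarray(b, float)
        self.V = np.asarray(V, float)

    def mean(self):
        # Eq. (7): zhat = V^{-1} (b + a) / 2
        return np.linalg.solve(self.V, (self.b + self.a) / 2.0)

    def cov(self):
        # Eq. (8): (1/3) V^{-1} G G^T V^{-T} with G = diag((b - a)/2)
        G = np.diag((self.b - self.a) / 2.0)
        Vinv = np.linalg.inv(self.V)
        return Vinv @ G @ G.T @ Vinv.T / 3.0

    def volume(self):
        # normalizing constant V_P from the second row of Table 1
        return np.prod(self.b - self.a) / abs(np.linalg.det(self.V))
```

The UOS case is recovered by setting V to the identity matrix, in which case the mean and covariance reduce to Eqs. (10)-(11).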
UOS Class — A uniform distribution of z on an orthotopic support (3), i.e. the UOS distribution, is defined as
U_z(\underline{z}, \overline{z}) ≡ U_z(a, b, I) = V_O^{−1} χ_z(\underline{z} \le z \le \overline{z}) ,   (9)
where \underline{z} = a and \overline{z} = b are the lower and upper bound on z, respectively, and V_O is given in the first row of Table 1. The first moment (mean value) of the UOS distribution (9) is
E[z | a, b, I] ≡ \hat{z} = \frac{\underline{z} + \overline{z}}{2} ≡ \frac{b + a}{2} .   (10)
The second central moment (covariance) of the UPS is cov[z|a, b, I] = where Gii = 2.2
bi −ai 2
1 GG , 3
(11)
and Gij = 0, i = j, i, j = 1, . . . , z .
Bayesian Filtering and the LSU Model
In the Bayesian framework [10, 19], the state-space system is expressed stochastically, via the following pdfs:
prior pdf: f(x_1) ,
observation model: f(y_t | x_t) ,
time evolution model: f(x_{t+1} | x_t, u_t) .   (12)
Here, y_t is a scalar observable output, u_t is a known system input (optional, for generality), and x_t is an \ell_x-dimensional unobservable system state, t ∈ t*. We assume that (i) the hidden process x_t satisfies the Markov property, (ii) no direct relationship between input and output exists in the observation model, and (iii) the inputs consist of a known sequence u_1, . . . , u_{\bar{t}}. Bayesian filtering, i.e., state estimation, consists of the evolution of the posterior pdf f(x_t | d(t)), where d(t) is a sequence of observed data records d_t = (y_t, u_t), t ∈ t*. The evolution of f(x_t | d(t)) is described by a two-step recursion that starts from the prior pdf f(x_1) and ends with the data update at the final time t = \bar{t}:
– Data update (Bayes rule)
f(x_t | d(t)) = \frac{ f(y_t | x_t) f(x_t | d(t-1)) }{ \int_{x_t^*} f(y_t | x_t) f(x_t | d(t-1)) \, dx_t } ,   (13)
– Time update
f(x_{t+1} | d(t)) = \int_{x_t^*} f(x_{t+1} | u_t, x_t) f(x_t | d(t)) \, dx_t .   (14)
We introduce a linear state-space model with uniform noise (LSU model) in the form
f(y_t | x_t) = U_y(\tilde{y}_t − r, \tilde{y}_t + r) ,
f(x_{t+1} | x_t, u_t) = U_x(\tilde{x}_{t+1} − ρ, \tilde{x}_{t+1} + ρ) ,   (15)
where \tilde{y}_t = C x_t, \tilde{x}_{t+1} = A x_t + B u_t, and A, B, C are the known model matrices/vectors of appropriate dimensions; ν_t ∈ (−ρ, ρ) is the uniform state noise with known parameter ρ, and n_t ∈ (−r, r) is the uniform observation noise with known parameter r. Exact state estimation for the LSU model (15)—following (13) and (14)—leads to an intractable form of the posterior pdf, with the dimension of the sufficient statistics unbounded in the number, t, of processed data. In [15, 29], approximate Bayesian state estimation (filtering) within the UOS class is proposed. Recently, [16] has relaxed the orthotopic geometry of the distributional supports, instead proposing an approximate Bayesian filtering flow via sequential local projections of the state predictive and filtering distributions into parallelotopic supports. In both cases, the algorithms are based on projections via minimization of the Kullback-Leibler divergence (KLD) [22], achieving closure of the filtering distribution, f(x_t | d(t)), within the respective class, at each step of filtering.

2.3 Bayesian Filtering and Prediction Within the UOS Class

This section summarizes the results of the filtering and prediction within the UOS class as presented in [15, 29].

Data Update. The data update (13) processes f(x_t | d(t−1)) together with f(y_t | x_t) (15) according to the Bayes rule. It starts at t = 1 with f(x_1) = U_x(\underline{x}_1^+, \overline{x}_1^+). The exact pdf is uniformly distributed on a zonotopic support that results from the intersection of an orthotope (9) obtained during the previous time update—or of f(x_1) in the first step—and the strips given by the new data,
f(x_t | d(t)) ∝ χ(\underline{m}_t − ρ \le x_t \le \overline{m}_t + ρ) \, χ(C x_t − r \le y_t \le C x_t + r)
= χ\!\left( \begin{bmatrix} \underline{m}_t − ρ \\ y_t − r \end{bmatrix} \le \begin{bmatrix} I \\ C \end{bmatrix} x_t \le \begin{bmatrix} \overline{m}_t + ρ \\ y_t + r \end{bmatrix} \right) .   (16)
In [15], the following approximation is proposed:
f(x_t | d(t)) ≈ U_{x_t}(a_t, b_t, V_t) .   (17)
The resulting pdf belongs to the UPS class. Nevertheless, the subsequent time update requires a pdf from the UOS class. Therefore, the support of (17) is circumscribed by an orthotope, so that
f(x_t | d(t)) ≈ U_{x_t}(\underline{x}_t, \overline{x}_t) .   (18)

Time Update. The time update (14) processes f(x_t | d(t)) (18) together with f(x_{t+1} | x_t, u_t) (15). The exact pdf f(x_{t+1} | d(t)) is non-uniformly distributed on a zonotopic support. In [15], the following approximation is proposed:
f(x_t | d(t-1)) ≈ \prod_{i=1}^{\ell_x} \frac{ χ(\underline{m}_{t;i} − ρ_i \le x_{t;i} \le \overline{m}_{t;i} + ρ_i) }{ \overline{m}_{t;i} − \underline{m}_{t;i} + 2ρ_i } = \prod_{i=1}^{\ell_x} U_{x_{t;i}}(\underline{m}_{t;i} − ρ_i, \overline{m}_{t;i} + ρ_i) = U_{x_t}(\underline{m}_t − ρ, \overline{m}_t + ρ) ,   (19)
where \underline{m}_t = [\underline{m}_{t;1}, . . . , \underline{m}_{t;\ell_x}]^{\top}, \overline{m}_t = [\overline{m}_{t;1}, . . . , \overline{m}_{t;\ell_x}]^{\top}, and
\underline{m}_{t;i} = \sum_{j=1}^{\ell_x} \min(A_{ij} \underline{x}_{t-1;j} + B_i u_{t-1}, \, A_{ij} \overline{x}_{t-1;j} + B_i u_{t-1}) ,
\overline{m}_{t;i} = \sum_{j=1}^{\ell_x} \max(A_{ij} \underline{x}_{t-1;j} + B_i u_{t-1}, \, A_{ij} \overline{x}_{t-1;j} + B_i u_{t-1}) .   (20)
Predictive Pdf for the LSU Model. The data predictor of a linear state-space model is the denominator of (13), where f(y_t | x_t) is given by (15) and f(x_t | d(t−1)) is the result of the approximate time update (19). The approximate uniform predictor, as proposed in [15], is
f(y_t | d(t-1)) ≈ \frac{ χ(\underline{y}_t \le y_t \le \overline{y}_t) }{ \overline{y}_t − \underline{y}_t } ,   (21)
where
\underline{y}_t = C \hat{x}_t^+ − \left( r + |C| \left( \frac{\overline{m}_t − \underline{m}_t}{2} + ρ \right) \right) ,   (22)
\overline{y}_t = C \hat{x}_t^+ + \left( r + |C| \left( \frac{\overline{m}_t − \underline{m}_t}{2} + ρ \right) \right) ,   (23)
where \hat{x}_t^+ = A \hat{x}_{t-1} + B u_{t-1} ≡ E[x_t | d(t−1)] and |C| is the entry-wise absolute value. This predictive pdf is conditioned by \underline{m}_t and \overline{m}_t, considered as statistics, provided the parameters A and B are known. The mean value \hat{y}_t ≡ E[y_t | d(t−1)] of (21) is
E[y_t | d(t-1)] = \frac{\underline{y}_t + \overline{y}_t}{2} = C \underbrace{\frac{\underline{m}_t + \overline{m}_t}{2}}_{E[x_t | d(t−1)]} .   (24)
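A small numerical sketch of the UOS one-step-ahead predictor bounds in Eqs. (22)-(24) follows; the grouping of the ρ term reflects our reading of the garble-prone formula and should be checked against [15], and the function interface is hypothetical.

```python
import numpy as np

def uos_predictor_bounds(C, A, B, x_hat_prev, u_prev, m_low, m_up, rho, r):
    """Support and mean of the approximate uniform data predictor, Eqs. (21)-(24)."""
    x_hat_plus = A @ x_hat_prev + B * u_prev           # E[x_t | d(t-1)]
    half_width = r + np.abs(C) @ ((m_up - m_low) / 2.0 + rho)
    y_low = C @ x_hat_plus - half_width                # Eq. (22)
    y_up = C @ x_hat_plus + half_width                 # Eq. (23)
    y_mean = C @ (m_low + m_up) / 2.0                  # Eq. (24)
    return y_low, y_up, y_mean
```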
2.4 Bayesian Filtering and Prediction Within the UPS Class
This section summarizes the results of the filtering within the UPS class, as presented in [16], and deduces the relevant data predictor.

Data Update. The data update (13) processes f(x_t | d(t−1)) together with f(y_t | x_t) (15) according to the Bayes rule. It starts at t = 1 with f(x_1) = U_x(a_1^+, b_1^+, V_1^+). The exact pdf is uniformly distributed on a zonotopic support that results from the intersection of a parallelotope (27) obtained during the previous time update—or of f(x_1) in the first step—and strips (9) given by the new data,
f(x_t | d(t)) ∝ U_x(a_t^+, b_t^+, V_t^+) \, U_{y_t}(C x_t − r, C x_t + r) ∝ χ\!\left( \begin{bmatrix} a_t^+ \\ y_t − r \end{bmatrix} \le \begin{bmatrix} V_t^+ \\ C \end{bmatrix} x_t \le \begin{bmatrix} b_t^+ \\ y_t + r \end{bmatrix} \right) .   (25)
We approximate (25) by a pdf uniformly distributed on a parallelotopic support [16]. Then, the approximate pdf takes the form
f(x_t | d(t)) ≈ U_x(a_t, b_t, V_t) .   (26)
Note that the data update step for the UPS class is identical to the data update step for the UOS class except for the last step, i.e., the circumscribing by an orthotope.

Time Update. The time update (14) processes f(x_t | d(t)) (26) together with f(x_{t+1} | x_t, u_t) (15). The exact pdf f(x_{t+1} | d(t)) is non-uniformly distributed on a zonotopic support. It has a piecewise linear shape with shaping parameters ρ, a_t and b_t. It is approximated by a uniform pdf, see [16]. The support of f(x_{t+1} | d(t)) is computed in two steps. First, the support X_t of f(x_t | d(t)) (26) is transformed according to the deterministic part, i.e. \tilde{x}_{t+1}, of (15). For this, the parallelotope X_t of form (1) is converted into the form (5) and then the linear transformation \tilde{x}_{t+1} = A x_t + B u_t is performed as presented in [16]. The resulting support, \tilde{X}_{t+1}, is then transformed back to the form (1). Secondly, the parallelotope \tilde{X}_{t+1} is expanded by the set [−ρ, ρ], which corresponds to the stochastic part of (15). The resulting support corresponds to the Minkowski sum of \tilde{X}_{t+1} and the set [−ρ, ρ], which is a zonotope (1). The projection of the above mentioned uniform pdf into the UPS class is explained in the Appendix (Sect. 5.3), giving
f(x_{t+1} | d(t)) ≈ U_x(a_{t+1}^+, b_{t+1}^+, V_{t+1}^+) .   (27)
Predictor. The observation predictor of a linear state-space model is the denominator of (13), where f(y_t | x_t) is given by (15) and f(x_t | d(t−1)) is the result of the approximate time update (27). Then, equivalently,
f(y_t | d(t-1)) ∝ \int_{x_t^*} \underbrace{χ(y_t − r \le C x_t \le y_t + r)}_{\text{a strip in the } x_t\text{-space}} \; \underbrace{χ(a_t^+ \le V_t^+ x_t \le b_t^+)}_{\text{a parallelotope in the } x_t\text{-space}} \, dx_t .   (28)
The integral (28) is positive if the intersection of the strip and the parallelotope is nonempty, and zero otherwise. Its value, i.e. the volume of the intersection, is influenced by y_t. Changing y_t will shift the strip in (28) across the parallelotope; therefore f(y_t | d(t−1)) is trapezoidal. Its functional form can be approximated optimally in the Bayesian sense by a uniform pdf on the support of the trapezoidal f(y_t | d(t−1)), as shown in [29]. Therefore, we need the values \underline{y}_t and \overline{y}_t for which the strip touches the parallelotope from either side. These values will be the bounds of the uniform predictive pdf. Let the parallelotope χ(a_t^+ \le V_t^+ x_t \le b_t^+) be expressed in the direct form (5). According to [37], T_t^+ must be adjusted so that C T_{t;i}^+ ≥ 0, i = 1, . . . , \ell_x, by multiplication of the corresponding columns by −1. Such an adjustment does not change the set
described by T_t^+. Then, using the analysis in [37],
\underline{y}_t = C \hat{x}_t^+ − \left( r + \sum_{i=1}^{\ell_x} C T_{t;i}^+ \right) ,   (29)
\overline{y}_t = C \hat{x}_t^+ + \left( r + \sum_{i=1}^{\ell_x} C T_{t;i}^+ \right) ,   (30)
(31)
Note that if Vt+ is an identity matrix, i.e. the support of f (xt |d(t−1)) is an orthotope then the formulae (29) and (30) correspond to (22) and (23).
3 FPD-optimal Transfer Learning Between Bayesian Filters We consider a pair of Bayesian filters, and distinguish between the primary filter— which we call the target filter or node [34]—and the secondary filter, which we call the source filter or node. In the latter, all sequences and distributions are subscripted by “s”. Each filter processes a local observation sequence, yt and ys,t , t ∈ t , respectively, informative of their local, hidden (state) process, xt and xs,t , respectively. If transfer learning from the source to the target filter is to be effective (the case known as positive transfer [34]), then we must assume that ys (t) is informative of x(t). The core technical problems to be addressed are (i) that there is no complete model relating x(t) and xs (t), and, therefore, (ii) the standard Bayesian conditioning does not prescribe the required target distribution conditioned on the transferred knowledge, since it yields this prescription only in the case of complete modelling [18]. However, an insistence on specification of a complete model of inter-filter dependence can be highly restrictive, and incur model sensitivity in applications. Instead, we acknowledge that the required conditional target state predictor, f˘(xt |d(t − 1), fs ), after transfer of the source data predictor, fs (ys ,t |ds (t − 1)), is nonunique. Therefore, we solve the incurred decision problem optimally via the fully probabilistic design (FPD) principle [30], choosing between all cases of f˘(xt |d(t − 1), fs ) consistent with the knowledge constraint introduced by the transfer of fs (ys ,t |ds (t − 1)) [31]. FPD axiomatically prescribes an optimal choice, f o , as a minimizer of the Kullback-Leibler divergence (KLD) from f˘ to an ideal distribution, f I , chosen by the designer [20]: (32) f o (xt |d(t − 1), fs ) ≡ arg min D(f˘||fI ). f˘∈F
Here, the KLD is defined as f˘ ˘ D(f ||fI ) = Ef˘ ln , fI
(33)
Transfer Learning Between Uniform Bayesian Filters
161
and F in (32) denotes the set of f˘ constrained by fs [30],[31]. The ideal distribution is consistently defined as fI (xt |d(t − 1)) ≡ f (xt |d(t − 1)), being the state predictor of the isolated target filter (27 or (19)) (i.e. in the absence of any transfer from a source filter). In [10], the following mean-field operator was shown to satisfy (32), and so to constitute the FPD-optimal target state predictor after the (static) knowledge transfer specified above: f o (xt |d(t − 1), fs ) ∝ f (xt |d(t − 1)) ⎤ ⎡ ⎥ ⎢ × exp ⎣ fs (yt |ds (t − 1)) ln f (yt |xt ) dyt ⎦ .
(34)
yt∗
Note that the optional input, ut , is known and constant (being a conditioning quantity in d(t − 1)), and so plays no role in the FPD optimization (32). 3.1 FPD-optimal Knowledge Transfer Within the UOS and UPS Classes In [17], the KLD minimiser (34) is presented with fs (yt |ds (t − 1)) (21), and f (yt |xt ) (15) in the form f o (xt |d(t − 1), fs ) ∝ f (xt |d(t − 1)) ⎤ ⎡ ⎥ ⎢ × exp ⎣ χ y s ≤ yt ≤ y s ,t ln χ (Cxt − r ≤ yt ≤ Cxt + r) dyt ⎦ . ,t
(35)
yt∗
The first term in the integral indicates the integration limits are y s and y s ,t . The char,t acteristic function in the logarithm must equal one, which sets the exponent to zero, inducing an additional geometric constraint for xt represented by the strip (2) in the state space1 : (36) y s ,t − r ≤ Cxt ≤ y s + r. ,t
The first term f (xt |d(t − 1)) in (35) has the form (19) for the UOS class and (27) for the UPS class. The subsequent time update of the target filter yields a domain of the particular class in the state space. The intersection of this domain and (36) constitutes the transfer learning between a pair of the filters. However, the subsequent data update requires the prior pdf to be on an support of the same particular class, too. Therefore, the constrained set is now circumscribed by another orthotope (UOS) or parallelotope (UPS). The designs for the UOS and UPS classes are structurally the same, differing only in the detail of the statistics in the data predictors and in state predictive supports. This all is true because of the scalar assumption for yt . 1
Note that a typo occurred in [17], in which the lower and upper bounds, y s ,t and y s , were ,t interchanged. The error is corrected in (36).
162
L. Jirsa et al.
Note that the similarity of (36) with the observation model (15), and the similarity of their processing flows, point to an interesting correspondence between the constraint step and the data update step. The difference is that the knowledge transfer (36) accepts the interval, i.e. the uniform distribution, Uy (y s , y s ,t ), whereas the data update ,t is driven by the realized observation, yt . The latter can be expressed via sifting with the Dirac distribution, δy (y − yt ). In this distributional sense, (34)—and its instantiation in UOS/UPS transfer learning in (35)—is a generalized form of Bayesian conditioning in the incompletely modelled context. 3.2
Informal Knowledge Transfer
The mechanism (35) for knowledge transfer consists of the state predictor (the first term), modulated by the exponential term, the effect of which is to constrain the support of the FPD-optimal state predictor. This geometric insight—that the effect of the transferred knowledge is to impose a support restriction on the conditional state predictor, f o (xt |d(t−1), fs ) (35)—led us to propose an informal transfer learning mechanism: the target state predictor is updated under transfer by projecting it onto the intersection of the source and target state predictor support sets. We reported in [17] that this informal variant achieved more positive transfer than the (disappointing) improvement achieved under the FPD-optimal mechanism (35). Though not interpretable as an optimal transfer learning mechanism, the promising performance, and intuitive and computationally efficient nature of this criterion, have together encouraged us to assess its performance in transfer learning between UPS filters, as reported in Sect. 5. + for the source fil≤ Vs + Practically, having fs (xt |d(t − 1)) ∝ χ as + t t xt ≤ bs t + ter and f (xt |d(t − 1)) ∝ χ at ≤ Vt+ xt ≤ b+ for the target filter, both from the t UPS class, their intersection corresponds to a zonotope (1) given by the inequalities +
+ +
bt Vt at ≤ xt ≤ . Using [37], the zonotope is circumscribed by a paralas + Vs + bs + t t t ⊕ ⊕ lelotope with shaping parameters a⊕ t , bt and Vt that form a support for the prior pdf after the knowledge transfer.
4 Algorithmic Summary The source and target filters run their state estimation tasks in parallel. The source filter is run in isolation. The latter sequentially transfers its one-step-ahead data predictor, fs (ys ,t |ds (t − 1)) (Sect. 3), to the target filter. Note that the stochastic mechanism which generates this data-predictive sequence at the source does not need to be specified, in line with the incomplete modelling which characterizes this Bayesian transfer learning framework. The processing of the source knowledge sequence occurs between the time update and data update steps in the target filter. Initialisation: – Choose final time t > 1, set initial time t = 1. – Set prior values x1 , x1 , noises ρ, r, rs .
On-line:
1. Get new data u_t, y_t, y_{s,t}.
2. Data update: add the data strip (15) and approximate the obtained support by a parallelotope (for details see Appendix A.2 in [29]) to obtain the resulting form (26) for each filter.
3. Compute a_t, b_t, V_t (26) for each filter.
4. Compute the point estimate x̂_t (7) for the target filter.
5. If t = t̄, go to 10.
6. Time update: compute a_{t+1}^+, b_{t+1}^+, V_{t+1}^+ in (27) for each filter.
7. Knowledge transfer (informal): circumscribe the intersection of the parallelotopes (27) for the source and target state predictors by a parallelotope described by a_{t+1}^⊕, b_{t+1}^⊕, V_{t+1}^⊕ and use it as a prior pdf support for the data update step.
8. Set t = t + 1.
9. Go to 1.
10. End.
5 Simulation of Bayesian Transfer Learning for a Constrained Position-Velocity System

We consider the case where the target process is an LSU filter simulating a position-velocity system [8] with uniform drivers, which we also investigated in [17]. We are interested in inferring whether the Bayesian knowledge transfer mechanism prescribed by the formal FPD-optimal scheme (35), as well as via the informal variant in Sect. 3.2, enhances the performance of the target filter, when compared to the isolated filter. We also want to consider the influence of the UOS-closed versus UPS-closed variants.

5.1 Simulation Details

We simulate x_t ∈ R² and y_t ∈ R, so that the transferred data predictor in both the UOS and UPS cases is uniform on a line segment, differing only in the bounds, (22) and (23) versus (29) and (30), respectively. The known model parameters are A = [1 1; 0 1], C = [1 0], ρ = 1(2) × 10⁻⁵, where 1(2) is the unit vector of length 2, and r = 10⁻³. Also, B = [0; 0] and u_t = 0, i.e. a system without input/control. The estimation was run for t̄ = 50 time steps. We investigate the influence of the observation noise, r_s, of the source filter (15) on the state estimate precision of the target filter, quantified by the TNSE (i.e. total norm squared-error), defined as TNSE = Σ_{t=1}^{t̄} ‖x̂_t − x_t‖²_2. For each combination of the parameters, the computation was run 500 times and the TNSEs were averaged. The simulated state processes, x_t (and so the state noise variance, ρ), are equal in the target and source filters, while the observation processes are simulated conditionally independently of x_t, with different noise variances, r and r_s, respectively.
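For clarity, the TNSE criterion and its averaging over Monte Carlo runs can be sketched as follows; the array shapes and function names are illustrative assumptions.

```python
import numpy as np

def tnse(x_hat: np.ndarray, x_true: np.ndarray) -> float:
    """Total norm squared-error over one run: sum_t ||x_hat_t - x_true_t||_2^2,
    with states stored row-wise (shape: time steps x state dimension)."""
    return float(np.sum(np.linalg.norm(x_hat - x_true, axis=1) ** 2))

def average_tnse(runs: list) -> float:
    """`runs` is a list of (x_hat, x_true) pairs, one per independent simulation run
    (500 runs in the experiments reported here)."""
    return float(np.mean([tnse(x_hat, x_true) for x_hat, x_true in runs]))

# toy check with two runs of 50 steps and a 2-dimensional state
rng = np.random.default_rng(0)
runs = [(rng.normal(size=(50, 2)), rng.normal(size=(50, 2))) for _ in range(2)]
print(average_tnse(runs))
```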
5.2 Comparative Performance
Holding the target observation variance constant at r = 10⁻³, as stated above, we investigated the TNSE as a function of the source observation variance, r_s. In [17], we reported that positive transfer—i.e. the improved precision of filtering under transfer of high-quality (r_s < r) source knowledge—was achieved in the UOS-closed case, for both the FPD-optimal (i.e. formal) transfer (35) and the informal variant (Sect. 3.2). However, the improvement was very limited—particularly for the formal scheme—because of the coarseness of the orthotopic approximation of the support of f^o(x_t|d(t−1), f_s) (35). We speculated that a parallelotopic approximation would yield an improved positive transfer, a hypothesis we now wish to validate experimentally. For this purpose, we investigated the performance of the informal Bayesian transfer learning variant, comparing the UOS-closed and UPS-closed variants with the isolated target filter performance (Fig. 1). The UOS-closed performance has already been reported in [17], corroborating the main findings: that (i) the informal variant achieves robust transfer when r_s > r (i.e. rejection of poor-quality source knowledge); and (ii) the positive transfer in the r_s < r regime is significant, reducing the TNSE by about an order of magnitude when r_s ≪ r, compared to the isolated target filter. We supplement these findings with the performance of the UPS-closed informal transfer scheme. As anticipated in [17], this significantly outperforms the UOS-closed informal scheme in the positive transfer regime, reducing the TNSE by about a further 1.5 orders of magnitude in the r_s ≪ r regime.
[Fig. 1 plot: "UPS and UOS transfer, r = 0.001"; y-axis: log10 TNSE; x-axis: log10 r_s; curves: knowledge transfer UPS, isolated filter, knowledge transfer UOS.]
Fig. 1. TNSE of the target LSU filter as a function of the observation noise variance rs of the source filter. UPS-closed informal knowledge transfer compared to UOS-closed informal knowledge transfer, and to the isolated target filter (no transfer).
5.3 Discussion and Closing Remarks

The impact on the state predictor of the geometric constraint (36) in the transfer step (35) of the FPD-optimal (i.e. formal) scheme requires further study. Cases may exist where higher-variance source predictors may anomalously tighten the target state predictor support, or, indeed, where the strip (36) may not overlap with the support of the pre-transfer target state predictor, f(x_t|d(t−1)), at all. In contrast, the informal variant of the transfer learning scheme (Sect. 3.2) avoids these anomalies, and—as demonstrated in the simulations above—shows promise in achieving strongly positive transfer for high-quality source knowledge, and rejection of low-quality knowledge (robust transfer). It is inspired by the behaviour of the formal transfer in this bounded support setting (i.e. support intersection), but a formal justification is required. Another issue is that the informal variant transfers the source state predictor, rather than the source data predictor, implying that this sequential knowledge is made available by a source LSU filter (as in the simulation setting above), an assumption that is not required in the formal scheme.

The isolated LSU target filter performance (TNSE) acts as a reference/datum for assessing all the (informal) transfer algorithms (Fig. 1). It confirms that transfer of a high-quality source predictor reduces the sequential state estimation error in the target LSU filter (i.e. positive transfer). Conversely, low-quality source predictors are rejected by the schemes (i.e. robust transfer), and the target filter performance reverts to that of the isolated filter in this regime. Furthermore, the findings support the hypothesis that the projection of the output of the transfer step (35) into the UPS class is a closer local approximation than projection into the UOS class, owing to the greater number of degrees of freedom in the parallelotopic versus orthotopic support. A similar finding was reported in [16], in the context of isolated UPS-closed LSU filtering, where improved filtering accuracy was reported in comparison to the UOS-closed variant.

Perhaps the most significant finding is the robustness of the informal transfer scheme in the context of LSU filtering with bounded knowledge transfer. Our earlier experience with FPD-optimal conditioning on source data predictors for incompletely modelled pairs of Kalman filters indicated that high-variance (Gaussian) source predictors were not rejected, and that performance can deteriorate relative to the isolated Kalman filter in this regime (i.e. non-robust transfer). This is caused by the loss of the second moment of the Gaussian source predictor under the mean-field operator in (34). The problem was overcome via an informal adaptation of the prescribed algorithm in [10, 28], and via a formal scheme involving a hierarchical model relaxation in [27]. While further work is needed to understand the formal scheme in the current LSU setting, there is clear evidence that robustness is intrinsic in this setting, and that no such moment loss occurs.

Future work will focus on formal understanding of the conditions under which the FPD-optimal Bayesian transfer learning scheme achieves robust transfer, both in interacting LSU filtering pairs, and for more generally specified sensor nodes. As already implied above, the informal transfer scheme may be formally justifiable as an FPD-optimal transfer learning scheme for adapted settings, but this has yet to be demonstrated.
Finally, we plan further investigation of the influence of the geometric supports of the local distributional approximations on the quality of positive transfer in the LSU filtering case, seeking (i) to reduce the number of local approximations per step of the algorithm, and (ii) to investigate more flexible geometries, particularly low-order zonotopes.
Appendix
Stochastic Expansion of a Parallelotope

After the deterministic transformation of the support X_t, according to (15), within the time update, its expansion corresponding to the stochastic effect must be performed. For the i-th facet of the parallelotope to be expanded, we define these vectors:

– ω_i, the i-th row of W̃_{t+1}, orthogonal to the facet. Define s_i = ω_i / ‖ω_i‖_2, a unit vector orthogonal to the i-th facet.
– τ_i, the i-th column of the matrix T̃_{t+1}. It points from the centre of the parallelotope to the centre of the i-th facet. Define v_i = τ_i / ‖τ_i‖_2, a unit vector parallel to τ_i.
– ε_i = [α_{1i} ρ_1, α_{2i} ρ_2, …, α_{ℓi} ρ_ℓ], where ρ_1, …, ρ_ℓ are the components of the state noise parameter ρ and α_{1i}, …, α_{ℓi} = ±1, so that the j-th element of the i-th vector satisfies α_{ji} ≡ sign(ε_{ji}) = sign(s_{ji}). This arrangement guarantees that s_i and ε_i point into the same half-space given by the i-th facet.

Assume the orthotopic set (box) [−ρ, ρ] with its centre on the i-th facet. Then, d_i = s_i^⊤ ε_i is the distance between the most distant vertex of the box and the i-th facet of the parallelotope. Let Δτ_i denote the length increment of the vector τ_i (i.e. Δτ_i v_i is the increment vector) pointing to the centre of the i-th facet shifted by the Minkowski sum. Then, (Δτ_i v_i)^⊤ s_i = d_i, i.e.

Δτ_i = d_i / (v_i^⊤ s_i) = (s_i^⊤ ε_i) / (s_i^⊤ v_i) = (ω_i^⊤ ε_i) / (ω_i^⊤ v_i).   (37)

The i-th column τ_i^+ of the new matrix T_{t+1}^+ is

τ_i^+ = τ_i (‖τ_i‖_2 + Δτ_i) / ‖τ_i‖_2.   (38)

The expanded circumscribing parallelotope is then described as x_{t+1} = x̂_{t+1}^+ + T_{t+1}^+ ξ, where x̂_{t+1}^+ = x̂̃_{t+1} (its centre is preserved). Transforming the shaping parameters T_{t+1}^+ and x̂_{t+1}^+ to the form (1), we get the approximate pdf

f(x_{t+1}|d(t)) ≈ U_x(a_{t+1}^+, b_{t+1}^+, V_{t+1}^+).   (39)
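As a numerical illustration of (37)–(38), the following sketch implements the expansion; identifying W̃_{t+1} with T⁻¹ and the example matrices are assumptions made only for this illustration.

```python
import numpy as np

def expand_parallelotope(T: np.ndarray, W: np.ndarray, rho: np.ndarray) -> np.ndarray:
    """Expand the shaping matrix T (columns tau_i) of a parallelotope by the
    state-noise box [-rho, rho], following (37)-(38).  W has rows omega_i
    orthogonal to the corresponding facets."""
    T_new = T.copy()
    for i in range(T.shape[1]):
        omega = W[i, :]
        tau = T[:, i]
        s = omega / np.linalg.norm(omega)   # unit normal of the i-th facet
        v = tau / np.linalg.norm(tau)       # unit vector towards the facet centre
        eps = np.sign(s) * rho              # box vertex on the same side as s
        d = s @ eps                         # distance of the farthest box vertex, (d_i)
        delta = d / (s @ v)                 # length increment, eq. (37)
        T_new[:, i] = tau * (np.linalg.norm(tau) + delta) / np.linalg.norm(tau)  # eq. (38)
    return T_new

# illustrative 2-D example; rows of inv(T) are orthogonal to the facets spanned
# by the remaining columns of T, which is the assumption made for W here
T = np.array([[1.0, 0.5],
              [0.0, 1.0]])
W = np.linalg.inv(T)
rho = np.array([0.01, 0.01])
print(expand_parallelotope(T, W, rho))
```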
References

1. Abbas, A.E.: A Kullback-Leibler view of linear and log-linear pools. Decis. Anal. 6(1), 25–37 (2009)
2. Azizi, S., Quinn, A.: Hierarchical fully probabilistic design for deliberator-based merging in multiple participant systems. IEEE Trans. Syst. Man Cybern. Syst. 48(4), 565–573 (2018)
3. Becis-Aubry, Y., Boutayeb, M., Darouach, M.: State estimation in the presence of bounded disturbances. Automatica 44, 1867–1873 (2008)
4. Chisci, L., Garulli, A., Zappa, G.: Recursive state bounding by parallelotopes. Automatica 32(7), 1049–1055 (1996)
5. Dardari, D., Closas, P., Djuric, P.M.: Indoor tracking: theory, methods, and technologies. IEEE Trans. Veh. Technol. 64(4), 1263–1278 (2015)
6. Diez-Olivan, A., Del Ser, J., Galar, D., Sierra, B.: Data fusion and machine learning for industrial prognosis: trends and perspectives towards industry 4.0. Inf. Fusion 50, 92–111 (2019)
7. Dou, Y., Ran, C., Gao, Y.: Weighted measurement fusion Kalman estimator for multisensor descriptor system. Int. J. Syst. Sci. 47(11), 2722–2732 (2016)
8. Faragher, R.: Understanding the basis of the Kalman filter via a simple and intuitive derivation. IEEE Signal Process. Mag. 29(5), 128–132 (2012)
9. Fletcher, R.: Practical Methods of Optimization. John Wiley & Sons, Hoboken (2000)
10. Foley, C., Quinn, A.: Fully probabilistic design for knowledge transfer in a pair of Kalman filters. IEEE Signal Process. Lett. 25, 487–490 (2018)
11. Goudjil, A., Pouliquen, M., Pigeon, E., Gehan, O., Targui, B.: Recursive output error identification algorithm for switched linear systems with bounded noise. IFAC-PapersOnLine 50(1), 14112–14117 (2017)
12. Gover, E., Krikorian, N.: Determinants and the volumes of parallelotopes and zonotopes. Linear Algebra Appl. 433(1), 28–40 (2010)
13. Gravina, R., Alinia, P., Ghasemzadeh, H., Fortino, G.: Multi-sensor fusion in body sensor networks: state-of-the-art and research challenges. Inf. Fusion 35, 68–80 (2017)
14. He, J., Duan, X., Cheng, P., Shi, L., Cai, L.: Accurate clock synchronization in wireless sensor networks with bounded noise. Automatica 81, 350–358 (2017)
15. Jirsa, L., Kuklišová Pavelková, L., Quinn, A.: Approximate Bayesian prediction using state space model with uniform noise. In: Gusikhin, O., Madani, K. (eds.) Informatics in Control Automation and Robotics. Lecture Notes in Electrical Engineering, vol. 613, pp. 552–568. Springer International Publishing, Cham (2020)
16. Jirsa, L., Kuklišová Pavelková, L., Quinn, A.: Bayesian filtering for states uniformly distributed on a parallelotopic support. In: 2019 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), Ajman, United Arab Emirates (2019)
17. Jirsa, L., Kuklišová Pavelková, L., Quinn, A.: Knowledge transfer in a pair of uniformly modelled Bayesian filters. In: Proceedings of the 16th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, pp. 499–506. INSTICC, SciTePress (2019)
18. Karbalayghareh, A., Qian, X., Dougherty, E.R.: Optimal Bayesian transfer learning. IEEE Trans. Signal Process. 66(14), 3724–3739 (2018)
19. Kárný, M., Böhm, J., Guy, T.V., Jirsa, L., Nagy, I., Nedoma, P., Tesař, L.: Optimized Bayesian Dynamic Advising: Theory and Algorithms. Springer, London (2005)
20. Kárný, M., Kroupa, T.: Axiomatisation of fully probabilistic design. Inf. Sci. 186(1), 105–113 (2012)
21. Khaleghi, B., Khamis, A., Karray, F.O., Razavi, S.N.: Multisensor data fusion: a review of the state-of-the-art. Inf. Fusion 14(1), 28–44 (2013)
22. Kullback, S., Leibler, R.: On information and sufficiency. Ann. Math. Stat. 22, 79–87 (1951)
23. Lahat, D., Adali, T., Jutten, C.: Multimodal data fusion: an overview of methods, challenges, and prospects. Proc. IEEE 103(9), 1449–1477 (2015)
24. Lang, L., Chen, W., Bakshi, B.R., Goel, P.K., Ungarala, S.: Bayesian estimation via sequential Monte Carlo sampling – constrained dynamic systems. Automatica 43(9), 1615–1622 (2007)
25. Majumder, S., Pratihar, D.K.: Multi-sensors data fusion through fuzzy clustering and predictive tools. Expert Syst. Appl. 107, 165–172 (2018)
26. Nassreddine, G., Abdallah, F., Denoux, T.: State estimation using interval analysis and belief-function theory: application to dynamic vehicle localization. IEEE Trans. Syst. Man Cybern. Part B (Cybernetics) 40(5), 1205–1218 (2010)
27. Papež, M., Quinn, A.: Robust Bayesian transfer learning between Kalman filters. In: 2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6. Pittsburgh (2019)
28. Papež, M., Quinn, A.: Dynamic Bayesian knowledge transfer between a pair of Kalman filters. In: 2018 28th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6. IEEE, Aalborg, Denmark (2018)
29. Pavelková, L., Jirsa, L.: Approximate recursive Bayesian estimation of state space model with uniform noise. In: Proceedings of the 15th International Conference on Informatics in Control, Automation and Robotics (ICINCO), pp. 388–394. Porto, Portugal (2018)
30. Quinn, A., Kárný, M., Guy, T.: Fully probabilistic design of hierarchical Bayesian models. Inf. Sci. 369(1), 532–547 (2016)
31. Quinn, A., Kárný, M., Guy, T.V.: Optimal design of priors constrained by external predictors. Int. J. Approximate Reasoning 84, 150–158 (2017)
32. Shamshirband, S., Petkovic, D., Javidnia, H., Gani, A.: Sensor data fusion by support vector regression methodology – a comparative study. IEEE Sens. J. 15(2), 850–854 (2015)
33. Simon, D., Simon, D.L.: Constrained Kalman filtering via density function truncation for turbofan engine health estimation. Int. J. Syst. Sci. 41, 159–171 (2010)
34. Torrey, L., Shavlik, V.: Transfer learning. In: Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods, and Techniques, pp. 242–264. IGI Global, United States (2010)
35. Vapnik, V., Izmailov, R.: Knowledge transfer in SVM and neural networks. Ann. Math. Artif. Intell. 81(1–2), 3–19 (2017)
36. Vargas-Melendez, L., Boada, B.L., Boada, M.J.L., Gauchia, A., Diaz, V.: Sensor fusion based on an integrated neural network and probability density function (PDF) dual Kalman filter for on-line estimation of vehicle parameters and states. Sensors 17(5), 987 (2017)
37. Vicino, A., Zappa, G.: Sequential approximation of feasible parameter sets for identification with set membership uncertainty. IEEE Trans. Autom. Control 41(6), 774–785 (1996)
38. Vitola, J., Pozo, F., Tibaduiza, D.A., Anaya, M.: A sensor data fusion system based on k-nearest neighbor pattern classification for structural health monitoring applications. Sensors 17(2), 417 (2017)
39. Willner, D., Chang, C., Dunn, K.: Kalman filter algorithms for a multi-sensor system. In: 1976 IEEE Conference on Decision and Control including the 15th Symposium on Adaptive Processes, pp. 570–574 (1976)
40. Xiao, F.: Multi-sensor data fusion based on the belief divergence measure of evidences and the belief entropy. Inf. Fusion 46, 23–32 (2019)
41. Yang, C., Yang, Z., Deng, Z.: Robust weighted state fusion Kalman estimators for networked systems with mixed uncertainties. Inf. Fusion 45, 246–265 (2019)
42. Zang, Y., Hu, X.: Heterogeneous knowledge transfer via domain regularization for improving cross-domain collaborative filtering. In: 2017 IEEE International Conference on Big Data, pp. 3968–3974 (2017)
43. Zou, T., Wang, Y., Wang, M., Lin, S.: A real-time smooth weighted data fusion algorithm for greenhouse sensing based on wireless sensor networks. Sensors 17(11), 2555 (2017)
Multi-robot Cooperation for Assembly: Automated Planning and Optimization

Ludwig Nägele(B), Andreas Schierl, Alwin Hoffmann, and Wolfgang Reif

Institute for Software and Systems Engineering, University of Augsburg, 86159 Augsburg, Germany
{naegele,schierl,hoffmann,reif}@isse.de

Abstract. In the field of industry 4.0 and smart factories of the future, dynamic teams of robots will play an important role in the manufacturing of custom-tailored products with small lot sizes. Especially the planning for such multi-functional robot cells forms a key challenge, finding appropriate tasks for cooperating robots to perform efficient production. In this paper, we present two approaches for automated planning of assembly programs for multiple robots that separate between domain and automation level. While one uses a decomposition approach with plan-space planning and explicit decomposition rules, the other considers heuristic searching in state space. Both approaches are evaluated with different planning problems in the blocks world domain (i.e., LEGO® DUPLO®). The evaluation shows that both approaches outperform the results achieved with classical planning based on state-space search such as A∗. Furthermore, a concept for post-processing programs to optimize execution time by dependency-based interleaving of robot tasks is introduced that fully integrates into both planning approaches.

Keywords: Multi-robot · Robotic assembly · Process planning
1 Introduction

In the past and now, industrial robots are commonly used in automated production. Their high mechanical flexibility, speed and precision make them play an important role in efficient mass production with relatively low variability, e.g., in the automotive industry. As a progressing trend, however, the consumer market expects products to be increasingly individual, requiring adaptable production for small lot sizes. The production of such customized products with high variability requires production cells with drastically increased flexibility and performance. To achieve this amount of flexibility, the proposed approach is based on multi-functional robot cells with dynamic cooperation of robots [3]. Additionally, to solve the software aspect and minimize manual work required for each individual product, an automatic planning approach is used that synthesizes control software for the corresponding robots.

This work is partly funded by the German Research Foundation (DFG) under the TeamBotS grant.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. O. Gusikhin et al. (Eds.): ICINCO 2019, LNEE 720, pp. 169–192, 2021. https://doi.org/10.1007/978-3-030-63193-2_10
This article is an extended version of previous work by the authors [12], which introduced a modular and domain-guided planning approach to manufacture products. There, the products are analyzed by domain-specific modules and broken down into basic components and their relationship, for which possible sequences of domain-specific tasks are found by the planning algorithm. In a further step, these domain tasks are used to find suitable sequences of automation tasks based on the available robots and automation devices as well as their skills, reducing the search space and avoiding potential planning problems such as deadlocks. While the previous work is focused on one specific planning approach that works through search in state space, this article introduces and compares a competing approach based on task decomposition and plan-space planning as an additional contribution. Furthermore, it gives an in-depth description of the orthogonal method that allows plan execution time to be reduced through parallelization without having to explicitly handle the timing and concurrency aspects within the planning approach. The paper is structured as follows: After an exploration of related work in Sect. 2, Sect. 3 describes the domain and two case studies that are used to illustrate and evaluate the approach. Section 4 introduces the main concepts of the planning approaches, followed by Sect. 5 and Sect. 6 focusing on planning concepts on different layers. The integration of multi-robot aspects into the planning approach is stated in Sect. 7. The approach is evaluated in Sect. 8, before a conclusion is given in Sect. 9.
2 Related Work

Early in computer science, planning became a topic of high attention and great interest. After numerous approaches for describing planning problems emerged, the STRIPS formalism was developed at SRI International [1], which was picked up by many other languages that describe planning problems in the field of state-space search. The formalism basically consists of states describing the environment, as well as actions with preconditions and results that transition from one state to another. However, planning through state-space search requires a lot of computing time. The amount of possible actions in each state causes a large branching factor and thus exponential complexity. To cope with this difficulty, different approaches have been proposed in order to optimize planning performance. One of them, the A∗ algorithm [4], uses heuristics to guide the state search. However, it requires a fitness function adequate for the given domain. Another option, using automatically derived levels of abstraction in order to minimize state search, is taken by ABSTRIPS [2, 6], for example. When planning with multiple robots, the number of possible actions in each step vastly increases and the state space grows exponentially. Here, approaches using constraints and resources in order to guide parallel tasks have been presented [18]. Later, the planning domain definition language (PDDL) was developed [9] to simplify the transfer of planning problems between different planners. Nowadays, many planners accept PDDL as a description language for planning problems, while using various of the aforementioned optimization techniques. Nevertheless, more complex planning problems are still not solvable by these approaches as they do not rely on abstractions that naturally exist in the planning
Fig. 1. Virtual model of the robot cell. Specialized gripper clamps enable process-reliable grasping of duplo bricks [12].
domain which could help solve the problem in a more targeted manner. As an alternative to search-based planning, task decomposition is often used, which substitutes complex tasks by an equivalent sequence of simpler tasks. A common and widespread representative of task decomposition is the formalism of Hierarchical Task Networks (HTNs) [13]. In contrast to state-space planning, HTNs require detailed knowledge about the respective domain. However, this makes them perform better for larger planning problems by using abstraction levels present in the domain. For planning in the field of manipulation or assembly, various approaches have been investigated, for example the finding of adequate task sequences in order to build predefined structures [17] or the handover of work pieces between two robots [7]. Besides stationary robots, as especially used in industrial applications, these topics have also been applied to mobile robots, where spatial considerations play an important role for task allocation [5, 15, 16]. When looking at current planning systems for multi-robot environments, approaches use task decomposition, often relying on HTNs or even manual definitions, and solve task allocation in a search-based manner [20]. Similarly, in combined task and motion planning, HTNs for task planning are mixed with search-based approaches for collision-free motion planning [19]. This article extends the authors' previous work [8, 10–12] in the field of automatic assembly planning by using concepts from the above papers and making the previous solutions applicable to more complex assembly problems.
3 Case Study

A domain which fits well in the field of automatic planning and is both intuitive and commonly known is LEGO® DUPLO®. Here, well-defined bricks can be assembled into structures which form intuitive problem definitions and clearly illustrate specific planning challenges. Because challenges in different domains of assembly are often similar, concepts to solve planning problems for duplo can be transferred to more industrial domains. For example, having to apply support to facilitate an assembly step – as a form of implicit cooperation – appears in both duplo assembly and industrial production.
Fig. 2. A duplo house as an example for assembly planning (a). A missing corner brick as cause for deadlocks (b) [12].
To perform automatic planning steps like task allocation based on reachability and collision-free movement planning, spatial information about robots and products in the robot cell is needed. A virtual model representing the real robot cell contributes such information. For the duplo case study, the initial robot cell setup consists of a large duplo ground plate, a smaller plate which is used for supplying duplo bricks, and two lightweight robot arms from KUKA, an LBR iiwa and an LWR 4 (see Fig. 1). Both robots are equipped with a Schunk WSG 50 parallel gripper. The grippers are extended with purpose-built clamps which have been developed to reliably handle duplo bricks (see Fig. 1). They are 3D-printed with a negative brick profile and fitting notches to deterministically adjust the grasped brick, ensuring that the brick stays in place even when forces are applied. Two structures in the duplo domain have been set up to show different difficulties for the planning of assembly tasks. First, Sect. 3.1 illustrates the challenge of efficiently finding adequate plans when building a duplo house. In a second scenario (Sect. 3.2), necessary robot team cooperation needs to be identified and applied in order to build a duplo stairway.

3.1 Duplo House
Figure 2a shows a duplo house built on a base plate, consisting of 26 bricks, having a lower base part with one door and three window sides, and a pyramid-like roof top. Two different kinds of bricks are used: bricks with raster 2 × 2 and 4 × 2. To assemble the house, many sequences of brick placements are possible. All such existing sequences span a graph of alternatives to assemble the house. Figure 2b shows a particular node in this graph where the house base is fully built and the first layer of the roof is almost finished, with just one corner brick still missing. Assuming the placement of that brick as next step in the given scenario, the clamps of the gripper would collide with either of the nearby bricks, making this placement impossible. From the current node in the graph, other steps to place other bricks of following roof layers are still possible, but all ongoing paths of the graph will inevitably fail because of this corner brick. The house scenario denotes a general planning problem about deadlock causes that are not obvious and detectable at the time they originally occur, but instead still have
Fig. 3. A duplo stairway (a) as an example for the need of cooperation between robots with different skills (b) [12].
many permissible subsequent steps. Such late-appearing deadlock causes lead to unnecessarily high planning times of exponential magnitude. Even the rather simple house example might not be solvable in adequate time when having to examine all these possibilities. One could blame an inadequate domain description which should detect such harmful brick constellations early on. However, in the house example an additional gripper with another grasp strategy could resolve the deadlock, for example by grasping the brick from above using vacuum. Thus, the corner-brick deadlock problem is not inherent to the domain, but also related to the availability of automation resources. Such planning challenges must therefore be handled by a proper planning strategy.

3.2 Duplo Stairway

The second scenario aims at an assembly problem that is not easy to solve using one single robot. When assembling a stairway as shown in Fig. 3a, the intrinsic stability of the overhanging structure decreases with its height. In addition, the forces arising when bricks are being placed on top must be taken into account. For a reliable and non-collapsing assembly, both cases require additional help. One possible solution is achieved by cooperating robots, one positioned below the structure and supporting it while the other robot places the brick on top. Figure 3b shows two robots performing this kind of cooperation. For placing further bricks, the structure needs to be supported at higher positions. The support may nevertheless not be relieved, otherwise the stairway might collapse due to its own weight. Instead, the free robot which has just placed the last brick must take over the support, so that the other robot can relieve its support and place the next brick. In the given scenario, this leads to an alternating change of roles of the robots during assembly.

The stairway example depicts different challenges of assembly planning in general. Invalid situations like unstable structures must be identified and avoided during planning. Instead, suitable helper tasks need to be scheduled at adequate times to circumvent such invalid states, guaranteeing a reliable assembly process. When evaluating next possible actions, the correct use of helper tasks is not always obvious to the planner as they do not immediately contribute any progress to the product. Instead, progress
is only reached through a further step that becomes possible through the helper task. Besides, helper tasks drastically increase the space of available steps and thus of possible sequences. Only very few of them, however, express a meaningful cooperation of robots with respect to the final structure being built. Without specific domain knowledge, it forms a major challenge for the planner to find implicit cooperations to successfully build the structure, while coping with the huge space of possibilities. Furthermore, it has to be considered that robots performing support tasks are temporarily occupied and are not capable of performing other tasks during that time period.
4 Core Concepts

To automatically plan robotic assembly tasks, the presented approach relies on different concepts in order to minimize manual domain-specific effort and to achieve optimal automation and efficient planning. While domain, automation and process experts usually specify process flow and robot programs by hand, the approach aims at experts transferring their individual know-how into an ecosystem of knowledge modules instead. These modules keep this knowledge structured and introduce further information about either domain or automation into the planning process. This separation between both areas, domain and automation, and their respective role in automated planning is introduced in Sect. 4.1. For planning, not only the construction draft but also the different assembly steps must be adequately expressed. Section 4.2 introduces the data structure for expressing these assembly steps. A concept for further expert contributions independent from specific planning steps is shown in Sect. 4.3.

4.1 Domain and Automation
Assembly planning – especially for robots – can mainly be seen from two different perspectives: The more general view is all about what needs to be done in order to build the final product. On the other hand, it is important to determine the automation of each assembly step, given the production resources at hand and their capabilities. For the first view, we assume that some kind of construction draft (e.g., CAD) exists which describes the final artifact and the single parts (products) it consists of, along with some kind of relational descriptions among them. From that description, the "what to do" can be answered by finding abstract actions which are to be applied to the products, and by defining a (partial) order of these actions. Imagining a human instead of an automated system building the duplo house, after a look at the cover picture showing the final house (construction draft) he would start creating a mental strategy in which order to place the needed bricks. In industrial applications, the knowledge about special characteristics and restrictions of a domain is held by domain experts who are responsible for such construction decompositions and process definitions. For the automated planning approach, Fig. 4 shows which parts are related to the field of Domain. A Domain Process expresses an abstract type of action, for example a "stick together two bricks" process. When a domain process is performed specifically on one or more products, it is called a Domain Task, resulting in a "stick together brick1 and brick2" task. Domain tasks are specific actions which can already be part of a mental assembly plan for the final artifact.
[Fig. 4 shows a class diagram relating Domain Process, Domain Task and Product on the domain side to Automation Process, Automation Task, Skill and Resource on the automation side.]
Fig. 4. Overview of the different conceptual parts in both domain and automation [12].
Fig. 5. Duplo diamond structure with four attributes [11].
So far, the assembly plan of domain tasks does not yet involve any assumption about robotic devices. All tasks are abstract and do not have a representation for real execution with robots. For the second perspective – how actions can actually be performed – each domain task is to be transferred into low-level actions which can be executed by robots. Revisiting the human building the duplo house: for each brick, he grasps it with his fingers, moves it towards the underlying target brick and carefully sticks both together. In industrial applications, the robots performing these tasks are usually programmed by process and automation experts. In analogy to the domain field described in Fig. 4, abstract robot actions in the automation field such as "grasp" or "move" are named Automation Process, while their specific application to products is called Automation Task. In contrast to domain tasks, an automation task may be independent from specific products, for example, if it describes a pure reconfiguration of the robot cell like a robot opening its gripper or moving to a new position. Instead, the automation field always relies on actuators like robots (called Resources) which physically execute the task. The capability of a resource or a team of resources to execute a specific automation process is called a Skill. A robot, for example, being capable of performing an automation process "grasp", is assigned the automation skill "grasp by robot". Applied to a specific brick, the skill can be used to perform the automation task "grasp brick by robot". The nomenclature used for the automation field in Fig. 4 is widely based on the definition of PPRS [14], adapted with a separation between domain and automation.

4.2 Attributes and Situations

Instead of a proprietary construction draft, planners need an (equivalent) description of the assembled artifact in a standardized form. Especially in automated planning, special requirements apply to this description: Executable solutions can only be generated if additional semantic information about products and processes is available. In previous work, the authors introduced a concept of how a spatial model of robots and products along with semantic annotations can be described by Attributes [11]. Furthermore, a mechanism to derive this form of description from a proprietary construction draft was
presented. In a nutshell, an attribute is a piece of either spatial or semantic (or even both) information which is related to one or multiple products. One attribute may specify the paint to be applied on a product, for example. Another defines that two products are stacked on top of each other geometrically. The latter could additionally express the type of stacking and thus imply process information. An attribute describing two duplo bricks being stacked defines their position relative to each other, but also gives an idea about the process needed in order to establish the attribute: an orthogonal movement of the upper brick towards the lower brick, followed by light pressure. Figure 5 illustrates a duplo diamond consisting of four bricks, and its structure described by four attributes. The set of all attributes present at a time defines a certain assembly state, which is called a Situation. To plan an assembly, multiple situations are required: Besides a goal situation, which represents the final construction draft, an initial situation represents the state before assembly begins, and during planning many further situations arise that express intermediate steps of assembly. To convert a construction draft to a corresponding goal situation, the authors introduced the concept of Analyzer Modules [11]. After the construction draft is analyzed, its contained products are identified and spatial investigation is used, for instance, to derive their related attributes. The overall concept presented in this work relies on analyzer modules and assumes the availability of construction descriptions as situations in attribute format.
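To make the attribute and situation notions concrete, the following is a minimal sketch assuming that an attribute is simply a named relation over product identifiers; the class names and the encoded diamond are illustrative, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

@dataclass(frozen=True)
class Attribute:
    """A piece of spatial and/or semantic information relating one or more products,
    e.g. a 'stacked_on' relation between two bricks."""
    kind: str
    products: Tuple[str, ...]

@dataclass
class Situation:
    """An assembly state: the set of all attributes present at a given time."""
    attributes: Set[Attribute] = field(default_factory=set)

    def establish(self, attribute: Attribute) -> "Situation":
        """Situation reached after an assembly step establishes `attribute`."""
        return Situation(self.attributes | {attribute})

# one possible encoding of a four-brick, four-attribute diamond (cf. Fig. 5);
# the concrete attributes of the figure may differ
goal = Situation({
    Attribute("stacked_on", ("brick2", "brick1")),
    Attribute("stacked_on", ("brick3", "brick1")),
    Attribute("stacked_on", ("brick4", "brick2")),
    Attribute("stacked_on", ("brick4", "brick3")),
})
print(len(goal.attributes))   # 4
```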
4.3 Verifying Situations
During planning of overall assembly processes, often a huge number of different situations is observed and investigated. When selecting the next step from all available automation tasks possible in a specific situation, usually several checks are performed regarding resources and processed products in order to avoid decisions leading to invalid programs. For example, after each automation task placing a further brick on top of a duplo stairway, the overall stability of the construction has to be checked. At some point, the execution of an automation task leads to a collapse of the stairway construction due to its own overhanging weight. Although every automation task is valid in general from the automation point of view, its effect on domain-specific properties may lead to defective assembly programs due to circumstances in the given situation. An automation task for placing a brick may fail in the situation described above, while at the same time the very same task might be successful in another situation where an additional robot supports the construction at an appropriate position. Still, the resulting computation of stability is just a matter of the domain in this case, since it is irrelevant for a supported brick by which robot it is supported. Such checks regarding situational and especially domain-specific properties are usually not inherently considered by each automation task itself. Instead, the resulting situations of tasks need to be checked by an orthogonal mechanism. For this purpose, Situation Verifier Modules are introduced which enable such checks in each step of planning, and can lead to the rejection of tasks. Domain and process experts are allowed to contribute several such modules, each covering a particular scope of checks. Nevertheless, also the check of automation properties may be encapsulated by situation verifier modules, to avoid duplicate reasoning about one consideration explicitly in every single skill.
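A sketch of how such orthogonal checks could be plugged into each planning step, assuming the Situation type sketched above; the interface is an illustrative assumption, not the authors' API.

```python
from typing import Iterable, Protocol

class SituationVerifier(Protocol):
    """One orthogonal check contributed by a domain, process or automation expert."""
    def verify(self, situation: "Situation") -> bool:
        ...

def is_valid(situation: "Situation", verifiers: Iterable[SituationVerifier]) -> bool:
    """A candidate task is rejected as soon as any verifier rejects its resulting situation."""
    return all(v.verify(situation) for v in verifiers)
```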
Fig. 6. Brick exclusion (a) and colliding objects (b) as invalid task results.
For the case study in the field of duplo, three kinds of situation verifier modules have been designed. The first performs a rather domain-only check on the processing order of assembly. Assuming a diamond-like duplo construction as depicted in Fig. 6a, the last brick to be assembled cannot be inserted in the free space in the middle layer. Due to the properties of duplo, the brick is excluded in the given situation. More generally, an exclusion exists if the inserted brick would close a geometric attribute cycle in the construction between the brick's top and bottom. The same applies to the insertion of an already assembled part consisting of several bricks instead of a single brick. The situation verifier performs cycle analysis to identify brick exclusions and stops further planning if necessary.

The second situation verifier module aims at detecting collisions between objects in the world, including robotic entities. Figure 6b outlines the placement of a duplo brick close to other bricks. This task is valid from the domain's point of view, but the chosen solution in automation space is not executable because the gripper fingers would collide with bricks close by. This check might be performed by the place skill itself, directly rejecting the tasks with collisions. However, the chosen design with situation verifier modules allows the responsibilities of different experts to be clearly divided, in this case between an expert in maths and collision computation and one with knowledge about robots or duplo. During assembly planning, a huge number of collision checks is performed on resulting situations for every potential task. To efficiently perform collision checks on situations, only objects that changed compared to the previous situation are checked, assuming that all other objects have already been collision-free before.

As a third type of situation verifier module for the duplo domain, statics analysis is performed on duplo structures in order to detect fragile constructions. While full-fledged physics simulation could be used to identify situations representing fragile and unstable constructions, checks performed on every potential situation would vastly extend planning time. Instead, a rather lightweight approximation for statics analysis is used which suits the presented duplo case studies. In a nutshell, the approximated statics analysis investigates all bricks of a situation from a top view and determines if their underlying support is sufficient. For the approach, single pins of bricks are marked as stable if they are capable of carrying another brick on top, for example by transferring their forces to stable pins of lower bricks. If every brick of a construction has at least one stable pin, the entire structure is assumed to be stable. Figure 7 explains the determination of stable pins by a basic illustrative example with duplo bricks from the
Fig. 7. Computing of stable pins by fixed bricks (a) and convex hulls of underlying stable pins (b, c). Stable pins are indicated by red dots.
top perspective. Both bricks of the first layer are fixed as they are tightly mounted on the floor, thus all of their pins are stable (Fig. 7a). When a new brick is mounted on top (Fig. 7b), the convex hull of all of its immediately underlying stable pins determines its own stable pins (Fig. 7c). Figure 7d shows an example of a brick mounted on level 3 which is determined to be unstable because it does not have any underlying stable pin. A more detailed description of the algorithm is given in [12].
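The following is a deliberately simplified sketch of the stable-pin propagation: it marks a pin stable only if it lies directly above a stable pin, omitting the convex-hull step of the full algorithm in [12], and the raster encoding is an illustrative assumption.

```python
from typing import List, Set, Tuple

Pin = Tuple[int, int]          # pin coordinates in the ground-plate raster
Brick = List[Pin]              # a brick occupies a set of raster pins on its level

def stable_pins_per_level(levels: List[List[Brick]]) -> List[Set[Pin]]:
    """Bottom-up propagation of stable pins. Level-0 bricks are fixed to the
    ground plate, so all of their pins are stable; on higher levels a pin is
    (conservatively) marked stable only if it lies directly above a stable pin."""
    stable: List[Set[Pin]] = []
    for height, bricks in enumerate(levels):
        level_stable: Set[Pin] = set()
        for brick in bricks:
            for pin in brick:
                if height == 0 or pin in stable[height - 1]:
                    level_stable.add(pin)
        stable.append(level_stable)
    return stable

def structure_is_stable(levels: List[List[Brick]]) -> bool:
    """The structure is assumed stable if every brick has at least one stable pin."""
    stable = stable_pins_per_level(levels)
    return all(any(pin in stable[h] for pin in brick)
               for h, bricks in enumerate(levels) for brick in bricks)

# overhanging stairway: the level-1 brick sticks out by one pin but still has one
# stable pin, the level-2 brick has none and therefore makes the structure unstable
levels = [[[(0, 0), (1, 0)]],          # level 0: 2x1 brick on the plate
          [[(1, 0), (2, 0)]],          # level 1: shifted by one pin
          [[(2, 0), (3, 0)]]]          # level 2: no pin above a stable pin
print(structure_is_stable(levels))     # False
```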
5 Planning with Domain Tasks

The main purpose of a planner is to find a plan – consisting of executable automation tasks using skills available in a robot cell – that is appropriate to build the final product specified as a construction draft. Planners using just one single concept to solve the problem, for example state-space search, face exponential complexity with respect to the size of the planning problem. To cope with larger problems as well, the presented approaches separate planning into two logically separated layers: On the one hand, valid sequences of domain tasks are identified – appropriate to build the construction and completely decoupled from any robotic device –, while on the other hand executable sequences of automation tasks are searched for each domain task, guaranteed to be valid and collision-free in the respective context. Both layers have to be closely coupled, because the chosen order of previous domain tasks affects the possible automation tasks for the next step, due to different robot positions or collisions, for example. Conversely, the choice of specific automation tasks as a solution for a domain task can also influence the order of subsequent domain tasks.

The result of the first step, planning with domain tasks, is a set of alternative sequences of domain tasks, giving an abstract description of the assembly while excluding any automation aspect. For this planning in the domain field, two different approaches can be applied. Section 5.1 uses plan decomposition to derive dependency graphs describing applicable sequences of domain tasks, while Sect. 5.2 introduces a state-based analysis approach to find adequate domain tasks incrementally.

5.1 Dependency Graph Determination
The first way to find domain tasks is in plan space. Here, planning is performed by refining an abstract domain task “build structure” that knows the initial and final situation.
From this task, a Task Decomposition Module computes a dependency graph expressed as a partial order of domain tasks that have to be performed to transition from the initial into the final situation. For the case study with duplo, this decomposition module creates a "partial order" task containing a domain task "process brick" for each brick that has to be processed, and dependencies stating that all bricks directly below a brick have to be processed before it. This "partial order" task, in turn, can be decomposed into all possible two-element sequences consisting of one task for which the partial order does not give any dependencies, and a partial order of the remaining tasks, removing the chosen task from all dependencies. This way, the "build structure" task can be incrementally decomposed into all possible sequences of tasks, i.e., allowed brick sequences. A BNF notation of these decomposition rules is shown in Algorithm 1. The allowed brick sequences found, however, have to be further decomposed in order to become executable while detecting impossible states caused on the automation level, as described in Sect. 6.
Algorithm 1. Decomposition rules on the domain level.

build structure(init, goal) |= partial order(tasks, deps)

partial order(tasks, deps) |= ∀t ∈ tasks where t has no predecessors in deps:
    t  partial order(tasks \ t, deps \ t)
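A direct Python rendering of the partial-order rule of Algorithm 1, under the assumption that tasks are plain strings and dependencies a mapping from a task to its set of prerequisite tasks; the names are illustrative.

```python
from typing import Dict, Iterator, List, Set, Tuple

def decompose_partial_order(tasks: Set[str],
                            deps: Dict[str, Set[str]]
                            ) -> Iterator[Tuple[str, Set[str], Dict[str, Set[str]]]]:
    """Yield every decomposition option of a 'partial order' task: a task t with
    no unprocessed predecessor, plus the remaining partial order with t removed
    from the task set and from all dependency sets."""
    for t in tasks:
        if not (deps.get(t, set()) & tasks):           # no pending predecessor
            remaining = tasks - {t}
            reduced = {k: v - {t} for k, v in deps.items() if k in remaining}
            yield t, remaining, reduced

def all_sequences(tasks: Set[str], deps: Dict[str, Set[str]]) -> Iterator[List[str]]:
    """Enumerate all task sequences allowed by the partial order (Algorithm 1)."""
    if not tasks:
        yield []
        return
    for t, remaining, reduced in decompose_partial_order(tasks, deps):
        for tail in all_sequences(remaining, reduced):
            yield [t] + tail

# toy example: brick 'c' must be placed after bricks 'a' and 'b'
print(list(all_sequences({"a", "b", "c"}, {"c": {"a", "b"}})))
```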
When used with a planning algorithm, the order in which the "partial order" decomposer returns its options may play an important role: Imagine a task that does not have any obvious dependencies, but still only succeeds when executed at a certain time. When the partial order always returns this task in the last place, the planning algorithm has to run through all possible task combinations before it realizes that they will not work. Instead, if the decomposer provides this task (or, in general, tasks that often fail) first, the planning algorithm searches for solutions where this problematic task is solved. There is even a stricter form of optimization if the following condition holds: "If a task allowed by the partial order cannot be executed right now, it will not become possible (again) after executing some more tasks." In the field of assembly, where structures only become bigger over time, further assembly steps do not change any of the circumstances that led to the failure of one domain task, so the condition holds. Thus, this optimization is applicable to the duplo domain. In this case, the partial order decomposer may limit itself to returning only tasks that have failed at least once, unless none of the allowed tasks has ever failed. This optimization vastly improves planning time in problems that are prone to eventual deadlock situations, as seen in the house example in Sect. 3.1.

5.2 Incremental Analysis

In contrast to task decomposition, where refinements of the plan as smallest domain tasks are searched within plan space, the incremental analysis approach rather tries to
find valid paths of domain tasks towards the goal situation by walking through state space. While not requiring explicit decomposition rules, incremental analysis instead relies on Domain-Task Identifier Modules, which calculate possible domain tasks as steps applicable in a specific situation. Each domain task specifies the products used for that assembly step, as well as a set of attributes that are to be established. Domain-task identifier modules thus identify alternative suitable domain tasks needed for the assembly and can furthermore respect constraints such as temporal dependencies among them. These alternatives play an important role especially when planning on the automation level and no automation solution is found for a domain task due to the available skills of robots. The planner can then fall back on alternatives on the domain level, leading to an interwoven planning of domain and automation. However, the overall attribute change of each sequence of identified domain tasks must be a valid transfer from the initial to the goal situation.

For a manual implementation of domain-task identifier modules, domain experts could implement hard-coded logic. This way, experts can force specific domain tasks to be applied in corresponding situations, guiding the planning behavior explicitly on the domain level. However, planning on this level can also be derived from known information about the domain. For more automated planning, a generic domain-task identifier module is recommended to be used instead that automatically computes all automation tasks possible in the given situation, based on the available products. In the case of assembly, all products need to be placed or mounted at correct positions. One of these products can be chosen to be assembled in the initial situation, leading to a new intermediate situation. This product selection represents an abstract domain task for that product, ideally including everything that needs to be done in order to assemble it. Continuing this procedure and selecting so-far absent products as domain tasks in ongoing steps, every path will finally lead to the goal situation having all products assembled. However, not all paths in the search graph might be valid in the given domain, assuming the exclusion of a duplo brick as illustrated in Fig. 6a. To detect and prevent faulty paths, domain-specific situation verifier modules are applied to the resulting situation after each step, ensuring the construction's integrity with respect to the domain.

Whether to use the automated or a manual implementation of the domain-task identifier module depends on the domain and the specific planning problem. While the hard-coded variant requires high initial setup effort, the automated approach in return needs increased planning time due to its unguided search through the domain space. For the case study scenario with duplo, the automated type of domain-task identifier module is used. Here, again, the domain task "process brick" is specified, including everything that has to be done in order to bring a brick to its target position. This may include sub-tasks such as collecting the brick from a magazine, pick up, move and place. Starting with the initial situation, the domain-task identifier module derives possible brick-specific domain tasks, which when solved result in new situations the module can be applied to, until the final goal situation is reached.
To avoid invalid sequences of tasks that violate constraints of the domain, the situation verifier module for identifying brick exclusions is applied after each planning step. To improve planning with domain tasks, search on that level is performed with a fail-fast strategy, preferring domain tasks that often lead to unsolvable situations over tasks
that usually succeed. This makes it possible to quickly close potentially large sub-trees of the search graph that have no chance of success. In assembly, when no automation solution can be found for a domain task, it cannot be performed in the current situation with the available skills, and also will not become possible again later. Thus, search on the domain level needs to backtrack, continuing with other alternatives. Here, previously failed domain tasks are preferred as the next step if they are valid in that state. If the domain task is still not solvable, the search steps back again, repeating the aforementioned procedure. In general, statistics are maintained during search that disclose the domain tasks that failed most often. These are preferably chosen as next steps whenever possible.
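A minimal sketch of this statistics-driven, fail-fast ordering; the class and task names are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, List

class FailFastSelector:
    """Order candidate domain tasks so that tasks with the worst failure record
    are tried first (the fail-fast heuristic described above)."""
    def __init__(self) -> None:
        self.failures: Counter = Counter()

    def record_failure(self, task: str) -> None:
        self.failures[task] += 1

    def order(self, candidates: Iterable[str]) -> List[str]:
        # most frequently failed tasks first; so-far untried tasks last
        return sorted(candidates, key=lambda t: -self.failures[t])

# usage: try tasks in the suggested order; on failure, record it and backtrack
selector = FailFastSelector()
selector.record_failure("place corner brick")
print(selector.order(["place roof brick", "place corner brick"]))
```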
6 Planning with Automation Tasks

Having chosen an abstract domain task in the previous step, a concretely executable solution consisting of automation tasks has to be found, considering the skills and restrictions of the robot cell. As robotic devices have specific end-effector positions, arm attitudes or gripper opening widths between the execution of different automation tasks, the so-far purely domain-specific product situation has to be extended with Actuator States and attributes of robots to form an Automation Situation, containing information about the state of the robot cell besides the product state. When planning an appropriate automation solution for a sequence of domain tasks, the intermediate situations are refined by more detailed automation situations. This can reveal that simple domain solutions are unexpectedly complex or actually unsolvable on the automation layer when using the given skills and the defined robot cell layout. Finding a valid and optimal automation solution for a given domain task, and especially for a sequence of domain tasks with degrees of freedom, forms a major challenge for the planning mechanism. There are different ways to transfer domain tasks into representative sequences of automation tasks. Section 6.1 outlines an approach using explicit rules for structural decomposition. In contrast, a rather generic search-based approach is shown in Sect. 6.2.

6.1 Explicit Decomposition

When the problem and solution mechanism is well-known, it is possible to describe the transition from domain to automation tasks through a set of explicit rules. As started in Sect. 5.1, decomposition rules can directly define how a complex (domain) task can be solved by a sequence of (simpler) tasks. This decomposition does not have to directly map a domain task to a complete sequence of automation tasks, but can instead introduce multiple levels of abstraction in the form of intermediate domain or automation tasks that must further be decomposed until executable automation tasks are reached. In the domain of duplo, a task "process brick" can obviously be split into the simpler domain tasks "supply brick" and "position brick". The first task "supply brick" can be transferred to the automation level into a "supply brick by supplier" task – given that the supplier is capable of supplying the desired brick, and the supplier is empty in the defined initial situation of the task. This rule shows that this decomposition cannot be
Algorithm 2. Decomposition rules on the automation level.

process(b) |= supply(b) position(b)

position(b) |=
    where placing b needs support: support for(b) pick and place(b)
  | pick and place(b)

support for(b) |=
    ∀s ∈ bricks where supporting s allows to place b: support(s)

pick and place(b) |=
    ∀s ∈ bricks where s is supported by a robot: unsupport(s) pick and place(b)
  | ∀r ∈ robots, p ∈ positions where picking b with r at p and placing at goal is collision free and result is stable: pick and place(b, r, p)

support(b) |=
    ∀s ∈ bricks where s is supported by a robot: unsupport(s) support(b)
  | ∀r ∈ robots, p ∈ positions where supporting b with r at p is collision free: support(b, r, p)

unsupport(b) |=
    ∀r ∈ robots where r supports b, motion is collision free and result is stable: unsupport(b, r)

supply(b) |=
    ∀s ∈ suppliers where s is free and can provide b: supply(b, s)
Likewise, the automation task “supply brick by supplier” has to describe the environment situation reached after executing the task, to allow further planning. In contrast to the previous decompositions, the task “position brick” requires more work: here, a case distinction is needed depending on whether the brick can be placed directly or requires support. If no support is required, the “position brick” task is simply a “pick and place brick” task; otherwise a sequence of “support for brick” and “pick and place brick” is required. Here, “support for brick” describes a task that applies some support in the given structure so that the given brick can be placed. A naive implementation decomposes “support for brick” into a “support brick” task for any (other) brick that is below the desired brick. More complex constructions, where a single brick connects two parts of the structure, may however require the support of two underlying bricks before that brick can be placed.
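As an illustration only, one possible way to encode such a case-distinction rule is sketched below; the Task and Situation types and the needsSupport predicate are hypothetical stand-ins, not the system's actual API.

```java
import java.util.List;

// Illustrative sketch of a decomposition rule in the spirit of Algorithm 2:
// "position(b)" becomes either "support_for(b) pick_and_place(b)" (if placing
// the brick needs support) or just "pick_and_place(b)".
public class PositionBrickRule {

    record Task(String name, String brick) { }

    interface Situation {
        // Hypothetical predicate: would placing this brick require support?
        boolean needsSupport(String brick);
    }

    // Returns the alternative decompositions of "position(b)" in the given situation.
    static List<List<Task>> decompose(Task positionTask, Situation situation) {
        String b = positionTask.brick();
        if (situation.needsSupport(b)) {
            return List.of(List.of(new Task("support_for", b),
                                   new Task("pick_and_place", b)));
        }
        return List.of(List.of(new Task("pick_and_place", b)));
    }

    public static void main(String[] args) {
        Situation s = brick -> brick.equals("roof");   // toy situation
        System.out.println(decompose(new Task("position", "roof"), s));
        System.out.println(decompose(new Task("position", "wall"), s));
    }
}
```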
On the other hand, not limiting the number of supports for a brick vastly extends the search space if (as in the current definition) supporting a brick may, as a side effect, unsupport another brick, which can lead to unbounded support-unsupport sequences. Thus, for the general case, an intelligently chosen number of supports has to be allowed, while for the examples given in this paper the simple decomposition into the support of one brick is sufficient.

One further design decision in this decomposition rule system is when to release the support of a brick. As an explicit definition (e.g., right after placing the brick) is not easily possible, because the built structure might collapse when the support is released immediately after placing, a more implicit solution was chosen. For the given rules, “unsupport brick” tasks may be introduced before any new task a robot should execute (i.e., “pick and place brick” or “support brick”) if the corresponding robot is currently supporting a brick. To complete the decomposition rule set, the remaining tasks “support brick”, “unsupport brick” and “pick and place brick” can be decomposed into executable automation tasks for the corresponding robots. Here again, preconditions have to be checked (e.g., “unsupport brick” is only allowed when the resulting situation without support is still stable, and for all tasks the robot positions have to be collision-free). Additionally, degrees of freedom for positions are introduced at this stage, allowing bricks to be picked or supported in different ways. Thus, a single “pick and place brick” task may be decomposed into dozens of different automation tasks using different robots. A complete definition of these decomposition rules is given in Algorithm 2.

Using these Task Decomposition Modules (and the domain-level planning from Sect. 5.1), a task to build a duplo structure can be decomposed into a sequence of executable automation tasks. For this paper, the decomposition is performed using a depth-first search in plan space. Starting with an initial situation and a high-level “build structure” task, the algorithm applies the possible decomposition rules to find sequences of lower-level tasks. In these sequences, the first task is further decomposed until an executable automation task is reached. This automation task then provides the planning algorithm with the situation that results after its execution, which allows the decomposition of the next task in the sequence to continue. Alternatively, a task may have no valid decomposition (e.g., if it cannot be executed without collisions). In these cases, the planning algorithm has to backtrack and choose another decomposition for a higher-level task.

To better cope with deadlock situations, the planning algorithm again uses a fail-fast approach: it keeps track of statistics about how often a specific task has been successfully decomposed, and how often the decomposition has failed in one way or another. These statistics allow the “partial order” decomposer to preferably return those tasks allowed by the given dependencies that have a track record of failing often, allowing the algorithm to quickly determine whether they will also fail in this case. While this optimization vastly improves planning time in deadlock-prone situations (such as the given duplo house example), another optimization allows finding plans with shorter execution time: whereas the naive approach often finds plans where a single robot performs all the work, because the decomposer that provides the decompositions for this robot is always asked first, choosing a more balanced work distribution can vastly reduce execution times.
To deterministically achieve this, the planner does not always use the resulting decompositions of one task in the given order. Instead, once a decomposition consisting of only a single automation task is found, it computes the execution time for all decompositions on this level and chooses the fastest. Through the task parallelization mechanism
described in Sect. 7.2, this method prefers motions of robots that are currently idle, so that in the best case the total execution time does not increase even though the plan is extended by an additional motion. Making use of expert knowledge captured in the decomposition rules, this algorithm allows assembly tasks to be planned quickly. Still, it has its limitations: it only allows task sequences anticipated by the experts and may fail in other cases. Using the given rule set, for example, it cannot plan for structures where placing a single brick requires the support of two underlying bricks (as described above), or where one robot has to support the structure in one place so that another robot can switch its support position before the first robot can continue the assembly. In these situations, the search-based approach described in the following section shows its advantages.
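The depth-first decomposition in plan space used in this section might be sketched roughly as follows; the Task interface and the handling of situations are simplified placeholders, and the skeleton deliberately omits the fail-fast statistics and the execution-time-based selection described above.

```java
import java.util.*;

// Simplified sketch of depth-first decomposition in plan space: the first task
// of the sequence is decomposed until an executable automation task is reached;
// if no decomposition is valid, the search backtracks to an alternative.
public class DepthFirstDecomposer {

    interface Task {
        boolean isExecutableAutomationTask();
        // All alternative decompositions of this task in the given situation
        // (empty list means the task cannot be decomposed here -> backtrack).
        List<List<Task>> decompositions(Object situation);
        // Situation reached after executing this automation task.
        Object resultingSituation(Object situation);
    }

    // Returns a fully executable plan for the given task sequence, or null.
    static List<Task> plan(List<Task> sequence, Object situation) {
        if (sequence.isEmpty()) {
            return new ArrayList<>();
        }
        Task first = sequence.get(0);
        List<Task> rest = sequence.subList(1, sequence.size());
        if (first.isExecutableAutomationTask()) {
            List<Task> tail = plan(rest, first.resultingSituation(situation));
            if (tail == null) {
                return null;                       // backtrack
            }
            List<Task> result = new ArrayList<>();
            result.add(first);
            result.addAll(tail);
            return result;
        }
        for (List<Task> alternative : first.decompositions(situation)) {
            List<Task> expanded = new ArrayList<>(alternative);
            expanded.addAll(rest);
            List<Task> result = plan(expanded, situation);
            if (result != null) {
                return result;                     // first valid plan found
            }
        }
        return null;                               // no alternative worked
    }
}
```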
6.2 Dynamic Search
Starting with possible domain task sequences for the assembly of a given construction, the dynamic search approach tries to find executable automation solutions relying only on the information provided by the skills of the robot cell. For a domain task applied to a current situation, its impact, i.e., all attributes being established or removed, can be used to derive an ideal resulting situation (goal) from the domain perspective. Analogously, any automation task can be applied to the same situation to derive the detailed automation situation seen after its execution. The resulting situation can then be tested to determine whether it approaches the domain task’s goal. If so, the respective automation task is a good choice as the first step of the domain task’s solution. Situation verifier modules are applied to the resulting situation in order to validate the step, e.g., that the structure built is statically stable. Continuing planning this way, a sequence of automation tasks may result in an automation situation that subsumes the goal of the domain task, i.e., contains the same domain-specific attributes, while the robot information and attributes can be chosen freely.

Depending on the number of skills available and the size of the planning problem, many degrees of freedom exist when planning a concrete automation solution for a domain task, exponentially increasing planning complexity. Furthermore, the automation situations resulting from solved domain tasks influence the planning of subsequent domain tasks, as robot allocations or arm attitudes can affect the existence of optimal or even valid solutions. The two-layered planning of domain and automation tasks, however, is not strictly separated into two independent steps of first finding all possible domain task sequences and then performing automation planning. Instead, an interwoven approach has been chosen where automation planning on the second layer is triggered immediately after a domain task candidate is selected on the first layer. This enables solutions to be found faster, even before the first layer is completely explored. Nevertheless, the huge number of selectable next steps on both levels leads to an immense search space. To cope with this complexity, concepts are introduced that try to reduce the branching factor and thus make the two-layered planning more efficient.

As a first improvement, automation planning is not performed in the full state space; instead, only actions relevant to the chosen domain task are allowed in a corridor-like planning. In the duplo scenario, for a specific domain task “process brick1”, picking up and placing any brick might be investigated on the automation level, although the domain
task is only related to one specific brick. In particular, exploring combinations of such unrelated tasks makes the search quite inefficient. For this purpose, domain and automation tasks are assigned sets of affected resources and products. A task “pick and place brick1”, for example, affects the grasping robot, the grasped brick and the brick or bricks it is assembled upon. Automation tasks that do not affect at least one of the domain task’s related resources and products are then rejected from the search. The definition of affected resources and products is clear for product-focused tasks such as providing, picking or placing a brick; for support tasks, however, the affected products are less obvious, as the product they support is not equal to the product that is to be assembled by the domain task. Thus, support tasks in the duplo example are specified to affect any brick above. Additionally, automation tasks that relate to attributes of domain tasks other than the one currently being planned are also forbidden, in order to avoid assembling other parts of the structure. This corridor of allowed tasks leads to solutions in which domain tasks are self-contained sequences of automation tasks that do not interfere with each other in the overall global solution. This effectively reduces the search space, simplifies the detection of impossible domain tasks and enables a later planning stage to handle the parallel execution of independent domain tasks for optimized execution.

As a further improvement, the state space is reduced by grouping similar automation tasks. If multiple automation tasks result in the very same automation situation, for example a pick and place of a product with different grasp positions, there is no value in considering all of them, as the choice does not influence further planning. For each such group, only the tasks with the best fitness (e.g., shortest execution time or best quality) that are valid in the current planning step are considered, while all others are discarded.

For the duplo example, the same four kinds of automation tasks as in Sect. 6.1 are used, each with a definition of its effect on the automation situation and its affected resources and products: an automation task “supply brick” is responsible for making a brick available within the robot cell, for example through a magazine. Once available, bricks can be moved by a task “pick and place brick”, comprising the grasp, the movement and the placement of the brick. For statically unstable constructions, the tasks “support brick” and “unsupport brick” are used to temporarily bring additional stability to the construction for further tasks.
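The following sketch illustrates the two reductions described above (corridor filtering by affected resources and products, and grouping by resulting situation); the AutomationTask interface and its methods are hypothetical simplifications, not the actual implementation.

```java
import java.util.*;
import java.util.stream.Collectors;

// Illustrative sketch of two search-space reductions from the dynamic search:
// (1) corridor filtering: keep only automation tasks that share at least one
//     affected resource or product with the current domain task, and
// (2) grouping: among tasks leading to the same resulting situation, keep only
//     the one with the best fitness (e.g., shortest execution time).
public class SearchSpaceReduction {

    interface AutomationTask {
        Set<String> affected();              // affected resources and products
        String resultingSituationKey();      // canonical key of the resulting situation
        double fitness();                    // lower is better (e.g., execution time)
    }

    static List<AutomationTask> reduce(List<AutomationTask> candidates,
                                       Set<String> domainTaskAffected) {
        return candidates.stream()
                // (1) corridor: task must touch the domain task's resources/products
                .filter(t -> !Collections.disjoint(t.affected(), domainTaskAffected))
                // (2) per resulting situation, keep only the fittest task
                .collect(Collectors.toMap(
                        AutomationTask::resultingSituationKey,
                        t -> t,
                        (a, b) -> a.fitness() <= b.fitness() ? a : b))
                .values().stream()
                .collect(Collectors.toList());
    }
}
```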
7 Multi-robot Planning

When planning assembly processes with multiple robots, the overall planning complexity increases with the number of available robots and their skills, whereas the execution of the assembly – once planned – can profit from the advantages of combining multiple robots. Different types of robot cooperation exist, with varying roles in the planning approach; they are discussed in Sect. 7.1. Independently, the use of multiple robots allows tasks to be executed simultaneously, saving production time and enabling higher product throughput. Section 7.2 introduces a concept for parallelization as a post-processing step for planned solutions.
7.1 Robot Cooperation
Using multiple robots allows new manufacturing strategies but also requires considerations concerning safe collaboration. Altogether, three different types of robot cooperation are identified that play a role in the planning approach.

(1) Two or more robots can explicitly cooperate within one single automation task that may include exact synchronization and real-time-critical execution. This form of cooperation is handled within one skill defined by an automation expert. While the skill is conventionally considered during search, the planner has to ensure that none of the affected robots is busy in other parallel tasks.

(2) As a more implicit type of cooperation, a robot may require a specific setting that can be reached when another robot executes or has executed a specific task. For the duplo stairway, the placement of an upper brick may fail because of forces causing the structure to collapse. Here, cooperation might be needed in the form of one robot supporting the existing structure to make it withstand the pressure exerted by the placement, and another robot placing the brick. This type of cooperation becomes visible through a meaningful sequencing of independent automation tasks that has to be found by the planner.

(3) Robots working independently and in parallel operate in a special, less apparent form of cooperation. While the planner aims at maximizing the degree of parallelization to reduce the overall execution time, all co-working robots need to ensure collision-free movements among each other at all times. The implementation used in this paper applies dependency-based parallelization as described in Sect. 7.2, making sure that collisions do not occur before and after the execution of each task, whereas potential collisions between asynchronous robot movements are not yet analyzed in the implementation used for evaluation.

When multiple robots are used, planning becomes accordingly more complex due to the larger number of options. While the explicit type of cooperation (1) can easily be handled by the planner, implicit cooperation (2) is hard to identify in dynamic search, since it is a previously unknown combination of a skill establishing a specific setting, another skill that in turn requires this setting, and perhaps situation verifier modules that additionally enforce such setting requirements (e.g., stability). Parallel execution (3), however, gives the planner a way to optimize the result, leading to a broader range of possibilities but also a wider scope of rules that need to be complied with. Especially the fact that robot tasks can interleave obscures the detection of deadlocks and the identification of required combinations of tasks that span multiple (maybe unrelated or unnecessary) steps, as is the case with implicit cooperation. The planners described in this work are designed to overcome these challenges, either by giving explicit rules for the otherwise implicit cooperation or by using the concepts for improved planning described in Sect. 6.
7.2 Task Parallelization
The planning result of the described approach is a sequence of automation tasks that can be executed in the given order, each task waiting for its predecessor to finish.
Fig. 8. Automation task parallelization by resource and product dependencies: (a) duplo structure to be built; (b) identified sequence of automation tasks and dependencies; (c) derived execution plan.
Figure 8(b) gives an example of a solution in the duplo domain that builds the simple duplo structure depicted in Fig. 8(a) using two robots. Each task includes the product (brick) instance it affects and the robot that is used for execution. Supply tasks are used to ensure that bricks exist at a defined position and thus need no actuator in this example. When planning with multiple actuators, however, the strict sequential order of automation tasks is far too restrictive for execution, assigning each task a unique time slot and making even unrelated subsequent tasks wait for it. As an improvement, a post-processing optimization can break the strict sequencing and establish a new partial order that enables tasks to be performed in parallel while keeping the important dependencies from the original sequence. These dependencies result from the products manipulated by each task and the actuating elements, such as robots, performing it. Figure 8(b) summarizes these affected resources and products of each task and defines the important dependencies among the tasks: each resource or product of a task enforces a dependency on the very last task in the original sequence affecting it. With these newly identified dependencies, an optimized execution plan can be derived that maximizes parallelism. For the duplo example, Fig. 8(c) shows the resulting execution plan. Assuming all tasks have the same execution time, the example reduces the total execution time by a factor of two. While collision freedom for a task sequence is a matter of finding appropriate tasks in a static and exactly known spatial environment – robots that are unrelated to the current task do not move asynchronously – parallelization brings the further challenge of guaranteeing collision freedom also in a dynamic environment of parallel tasks. To overcome this issue, the following steps are proposed:
1. Automation tasks provide a description of the parts of their movement for which the specific trajectory does not matter, called Replannable Movements.
2. In the derived execution plan with parallel tasks, all combinations of concurrent task pairs are identified and analyzed for potential collisions. This is not trivial because there is no definition of when two parallel tasks start relative to each other; depending on that offset, two tasks may be collision-free except for one very specific start time offset. The authors are currently working on investigating potential collisions through the comparison of convex hulls of swept volumes.
3. If a potential collision between two tasks is identified, their replannable movements are adapted to avoid the collision. If this still fails, an extra dependency between the two tasks is inserted to ensure strictly sequential execution of both tasks, bringing the solution back towards its original sequential form.

While task parallelization is used in the evaluation, the concepts for detecting collisions are not yet implemented in the current state of the evaluation prototype. Thus, the planning runs considered in the evaluation may contain solutions with robot collisions during asynchronous movements.
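A minimal sketch of the dependency computation underlying this parallelization (each resource or product creates a dependency on the last preceding task affecting it) is given below; the Task type and the example plan are illustrative assumptions, not the authors' implementation.

```java
import java.util.*;

// Sketch of the dependency-based parallelization: every resource or product of
// a task introduces a dependency on the last preceding task in the original
// sequence that affected the same resource or product.
public class DependencyParallelizer {

    record Task(String name, Set<String> affected) { }

    // Maps each task index to the indices of the tasks it must wait for.
    static Map<Integer, Set<Integer>> dependencies(List<Task> sequence) {
        Map<String, Integer> lastAffecting = new HashMap<>();
        Map<Integer, Set<Integer>> deps = new HashMap<>();
        for (int i = 0; i < sequence.size(); i++) {
            Set<Integer> predecessors = new TreeSet<>();
            for (String item : sequence.get(i).affected()) {
                Integer last = lastAffecting.put(item, i);
                if (last != null) {
                    predecessors.add(last);
                }
            }
            deps.put(i, predecessors);
        }
        return deps;
    }

    public static void main(String[] args) {
        List<Task> plan = List.of(
                new Task("supply brick1", Set.of("brick1")),
                new Task("supply brick2", Set.of("brick2")),
                new Task("pick'n'place brick1", Set.of("robot1", "brick1")),
                new Task("pick'n'place brick2", Set.of("robot2", "brick2", "brick1")));
        System.out.println(dependencies(plan));
        // -> {0=[], 1=[], 2=[0], 3=[1, 2]}
    }
}
```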
8 Experimental Results

To evaluate the presented approach for assembly planning, the introduced concepts are applied to different planning problems in the field of LEGO® DUPLO®. All concepts are implemented in a multi-robot planning algorithm (MRPA) that exists in two different variants, one using the decomposition approach and the other planning by dynamic search. For each planning problem, multiple evaluation runs are performed by both MRPAs and, furthermore, by a solver using a rather classic strategy based on A∗. All results are statistically investigated for overall planning performance and solution quality.

The A∗ implementation uses automation situations as states and automation tasks as transitions. In contrast to the MRPA with dynamic search, the domain planning level is left out in A∗, applying automation tasks as step options in only one layer. For rejecting invalid situations, all planners use the same situation verifier modules as described in Sect. 4.3. All three planners furthermore use task costs as the criterion for optimization. The cost of an automation task is derived from its quality, for example the stability of the chosen grasp positions, and from the overall execution time. A∗, however, requires an estimator that guesses the future costs of the tasks needed to reach the goal from a given state. Along with the costs of all previous tasks, the estimate is integrated into a fitness value that is used by the heuristics of A∗. For duplo, the estimator has been implemented in a very domain- and robot-specific way, using slightly underestimated average costs of the task types that still need to be performed on not-yet completed bricks.

In the case of equivalent task costs, all three planners end up with random decisions for task selection. This leads to indeterminism in the solution as well as in the overall planning time. For this reason, each planning problem is evaluated 250 times with each planner in order to derive statistically meaningful results. Planning is stopped at the very first valid solution; thus A∗ does not necessarily terminate with a global optimum, allowing a standard deviation other than zero for the solution’s execution time.
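A toy version of such a cost estimator is sketched below; the task types and cost values are invented for illustration and do not reflect the actual estimator.

```java
import java.util.*;

// Toy cost estimator in the spirit of the A* heuristic described above:
// for every brick that is not yet completed, add slightly underestimated
// average costs of the task types that still have to be performed for it.
// All names and cost values are illustrative assumptions.
public class RemainingCostEstimator {

    // Underestimated average cost per task type (e.g., seconds).
    private static final Map<String, Double> AVG_COST = Map.of(
            "supply", 1.5,
            "pick_and_place", 3.0);

    static double estimate(Collection<String> remainingBricks,
                           Map<String, Set<String>> remainingTaskTypesPerBrick) {
        double estimate = 0.0;
        for (String brick : remainingBricks) {
            for (String taskType : remainingTaskTypesPerBrick
                    .getOrDefault(brick, Set.of())) {
                estimate += AVG_COST.getOrDefault(taskType, 0.0);
            }
        }
        return estimate;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> remaining = Map.of(
                "brick1", Set.of("supply", "pick_and_place"),
                "brick2", Set.of("pick_and_place"));
        System.out.println(estimate(remaining.keySet(), remaining)); // 7.5
    }
}
```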
Fig. 9. Four duplo structures as differently complex evaluation cases: (a) house; (b) stairway; (c) house & stairway; (d) house and house.
For the evaluation, four problem definitions have been specified. The duplo house (Sect. 3.1) and the duplo stairway (Sect. 3.2) are taken as individual evaluation cases. To also examine larger planning problems, two further evaluation cases are created as combinations of them. Figure 9 shows the combination of house and stairway (c) and of two houses (d). The planning approach is fully implemented in Java. All evaluation runs have been executed in a JVM running on a Linux machine equipped with 16 GB RAM and an Intel(R) Xeon(R) CPU E31230 (3.20 GHz) with 4 cores and hyper-threading enabled. The maximum planning time is set to 300 s; longer runs are canceled and not considered in the statistics. Table 1 summarizes the results of all evaluation runs.

The main purpose of the house case study is to investigate the performance of planners coping with deadlocks that cannot be identified at the moment they arise. All planners successfully find solutions in all of their 250 runs within 300 s. While the solution length (number of automation tasks) is the same for all runs with a standard deviation of zero, i.e., “supply” and “pick and place” for each of the 26 bricks of the house, the planning time varies on a much larger scale. Both MRPA approaches terminate with similar results in less than half a second. A∗, however, needs an average time of 6 s to find equivalent results. Its extremely high standard deviation of more than 9 s indicates a kind of “luckiness” of runs: unlucky runs seem to end up in the corner-brick deadlock, wasting time by planning on wrong premises. Both MRPAs get over this deadlock challenge successfully, showing only small standard deviations. This clearly shows the benefit of the concepts for detecting and preventing deadlocks.

In the field of multi-robot planning, both case studies explicitly address different types of robot cooperation. The duplo house, with its numerous stacked and close-by bricks, aims at provoking collisions of robots and bricks in a close workspace, while the stairway use case requires the planner to discover implicit cooperation for bricks that need support. Here, the planner is challenged to wisely select support tasks that do not yield any immediate profit in fitness; the profit only appears in a later state, e.g., when a placement becomes possible. This temporarily turns planning into a blind search in state space for all planners. While A∗ needs a bit more time, the single stairway is planned properly by all three planners. This is because team planning with only two available robots and exactly one brick placement order does not leave many wrong options. Combining the stairway with a house, though, significantly increases the planning time needed by the planners, with A∗ running into a timeout in 65% of the runs.
Table 1. Evaluation runs for the four duplo problem definitions, each solved with A∗ and both multi-robot planning algorithm types (mean ± standard deviation).

Planning problem           Metric            A∗                  MRPA decomposition   MRPA dynamic search
Duplo house                Successful runs   250 / 250           250 / 250            250 / 250
                           Planning time     6.18 s ± 9.71 s     0.47 s ± 0.16 s      0.40 s ± 0.07 s
                           Automation tasks  52.0 ± 0.0          52.0 ± 0.0           52.0 ± 0.0
                           Execution time    113.51 s ± 1.86 s   113.36 s ± 1.90 s    113.36 s ± 1.88 s
Duplo stairway             Successful runs   250 / 250           250 / 250            250 / 250
                           Planning time     0.47 s ± 0.03 s     0.23 s ± 0.03 s      0.20 s ± 0.02 s
                           Automation tasks  37.0 ± 0.0          37.0 ± 0.0           35.0 ± 0.0
                           Execution time    103.59 s ± 0.17 s   103.74 s ± 0.00 s    97.93 s ± 0.29 s
Duplo house and stairway   Successful runs   88 / 250            176 / 250            246 / 250
                           Planning time     139.05 s ± 89.85 s  24.39 s ± 50.29 s    11.56 s ± 15.84 s
                           Automation tasks  87.9 ± 1.0          88.0 ± 1.0           87.7 ± 1.0
                           Execution time    232.93 s ± 13.80 s  232.36 s ± 20.68 s   229.77 s ± 20.83 s
Duplo house and house      Successful runs   72 / 250            205 / 250            250 / 250
                           Planning time     62.09 s ± 64.65 s   6.40 s ± 23.83 s     2.21 s ± 0.50 s
                           Automation tasks  104.0 ± 0.0         104.0 ± 0.0          104.0 ± 0.0
                           Execution time    221.42 s ± 2.02 s   222.03 s ± 2.03 s    221.77 s ± 1.97 s
Looking at two duplo houses, A∗ is only able to solve 72 runs; MRPA decomposition, however, finishes 205 runs with a solution, and MRPA dynamic search completes all 250 runs with an excellent standard deviation.

Both MRPAs, decomposition and dynamic search, score similarly well in the evaluation results when compared to A∗ and show their strength in handling complex assembly planning for multiple robots. The advantage of the MRPA with dynamic search is the comparatively small planning scope for each domain task: with corridor-like planning on the domain task level, the modified depth-first search is very fast on these duplo problems. Although MRPA decomposition reaches slightly worse evaluation results than the dynamic search approach, it is expected to perform better in larger planning problems with higher numbers of skills, where automation planning is far more exhaustive. This benefit, however, has to be earned through the additional effort of designing explicit decomposition rules. Thus, for planning with few skills, MRPA dynamic search is preferred. With many skills, and thus more complex automation planning, or when domain and automation experts prefer to introduce their process knowledge explicitly, the MRPA decomposition approach is advisable.
9 Conclusion

In this work, two approaches for the automated planning of robot assembly tasks are described that separate concerns between domain and automation. One approach using state-space search is referenced from previous work, while the second planning method, using plan-space planning and decomposition by explicit decomposition rules, is newly introduced. Both planning concepts are evaluated by applying them to different planning problems in the domain of LEGO® DUPLO®, with challenges like deadlock handling as well as coping with implicit cooperation of two robots having independent skills. The outcomes are compared to the results of a classic planning implementation using A∗. Furthermore, a dependency-based mechanism for parallelizing and interleaving robotic tasks is presented, which can be used orthogonally by the planners to obtain interleaved programs. This allows the resulting execution plans to be optimized for better capacity utilization of the robots and thus improved overall execution performance. Further work focuses on the integration of collision avoidance into the planning approach, not only for static situations during planning but also considering concurrent robot movements. Furthermore, the application of both approaches will be tested on planning problems in domains other than duplo, such as the automated production of carbon-fibre-reinforced polymers (CFRP).
Author Index
A
Alberts, Jan, 61
Almarkhi, Ahmad A., 118
Ansarieshlaghi, Fatemeh, 133
B
Bjornsson, Hjortur, 99
C
Cacace, Simone, 16
Cadenat, V., 79
Chinesta, Francisco, 1
D
Di Giamberardino, Paolo, 35
Durand-Petiteville, A., 79
Duval, Jean Louis, 1
E
Eberhard, Peter, 133
F
Falcó, Antonio, 1
H
Hafstein, Sigurdur, 99
Hilario, Lucia, 1
Hoffmann, Alwin, 169
I
Iacoviello, Daniela, 35
J
Jirsa, Ladislav, 151
K
Kleinschmidt, Sebastian P., 61
Kuklišová Pavelková, Lenka, 151
L
Lai, Anna Chiara, 16
Le Flécher, E., 79
Loreti, Paola, 16
M
Montés, Nicolás, 1
Mora, Marta C., 1
N
Nadal, Enrique, 1
Nägele, Ludwig, 169
Q
Quinn, Anthony, 151
R
Reif, Wolfgang, 169
S
Schierl, Andreas, 169
Sentenac, T., 79
W
Wagner, Bernardo, 61
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
O. Gusikhin et al. (Eds.): ICINCO 2019, LNEE 720, p. 193, 2021. https://doi.org/10.1007/978-3-030-63193-2