Springer Tracts in Advanced Robotics
Volume 41
Editors: Bruno Siciliano · Oussama Khatib · Frans Groen
Michael John Milford
Robot Navigation from Nature
Simultaneous Localisation, Mapping, and Path Planning Based on Hippocampal Models
Professor Bruno Siciliano, Dipartimento di Informatica e Sistemistica, Università di Napoli Federico II, Via Claudio 21, 80125 Napoli, Italy, E-mail: [email protected]
Professor Oussama Khatib, Robotics Laboratory, Department of Computer Science, Stanford University, Stanford, CA 94305-9010, USA, E-mail: [email protected]
Professor Frans Groen, Department of Computer Science, Universiteit van Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands, E-mail: [email protected]
Author: Dr. Michael John Milford, The University of Queensland, School of Information Technology and Electrical Engineering (ITEE), Axon Building, St Lucia, Queensland 4072, Australia, E-mail: [email protected]
ISBN 978-3-540-77519-5
e-ISBN 978-3-540-77520-1
Springer Tracts in Advanced Robotics
ISSN 1610-7438
Library of Congress Control Number: 2007942163

© 2008 Springer-Verlag Berlin Heidelberg
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable for prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting by the authors and Scientific Publishing Services Pvt. Ltd.

Printed on acid-free paper

springer.com
Editorial Advisory Board
Herman Bruyninckx, KU Leuven, Belgium
Raja Chatila, LAAS, France
Henrik Christensen, Georgia Institute of Technology, USA
Peter Corke, CSIRO, Australia
Paolo Dario, Scuola Superiore Sant’Anna Pisa, Italy
Rüdiger Dillmann, Universität Karlsruhe, Germany
Ken Goldberg, UC Berkeley, USA
John Hollerbach, University of Utah, USA
Makoto Kaneko, Osaka University, Japan
Lydia Kavraki, Rice University, USA
Sukhan Lee, Sungkyunkwan University, Korea
Tim Salcudean, University of British Columbia, Canada
Sebastian Thrun, Stanford University, USA
Yangsheng Xu, Chinese University of Hong Kong, PRC
Shin’ichi Yuta, Tsukuba University, Japan
STAR (Springer Tracts in Advanced Robotics) has been promoted under the auspices of EURON (European Robotics Research Network).
Foreword
At the dawn of the new millennium, robotics is undergoing a major transformation in scope and dimension. From a largely dominant industrial focus, robotics is rapidly expanding into the challenges of unstructured environments. Interacting with, assisting, serving, and exploring with humans, the emerging robots will increasingly touch people and their lives.

The goal of the new series of Springer Tracts in Advanced Robotics (STAR) is to bring, in a timely fashion, the latest advances and developments in robotics on the basis of their significance and quality. It is our hope that the wider dissemination of research developments will stimulate more exchanges and collaborations among the research community and contribute to further advancement of this rapidly growing field.

The monograph written by Michael Milford is yet another volume in the series devoted to one of the hottest research topics of the last few years, namely Simultaneous Localization and Map Building (SLAM). The contents expand the author’s doctoral dissertation and describe the development of a robot mapping and navigation system inspired by models of the neural mechanisms underlying spatial navigation in the rodent hippocampus.

One unique merit of the book lies in its truly interdisciplinary flavour, addressing a link between biology and artificial robotic systems that has been open for many years. To the best of my knowledge, this is the most thorough attempt to marry biological navigation in rodents and similar animals with robotic navigation and SLAM. A very fine addition to our STAR series!
Naples, Italy September 2007
Bruno Siciliano STAR Editor
Preface
This book describes the development of a robot mapping and navigation system inspired by models of the neural mechanisms underlying spatial navigation in the rodent hippocampus. Computational models of animal navigation systems have traditionally had limited performance when implemented on robots. The aim of the work was to determine the extent to which hippocampal models can be used to provide a robot with functional mapping and navigation capabilities in real world environments. The focus of the research was on achieving practical robot performance, rather than maintaining biological plausibility.

The mapping and navigation problem is a broad one and to completely solve it a robot must possess several key abilities. These include the ability to explore an unknown environment, perform Simultaneous Localisation And Mapping (SLAM), plan and execute routes to goals, and adapt to environment changes. The most successful conventional solutions are based on core probabilistic algorithms and use a variety of map representations ranging from metric to topological. These probabilistic methods have good SLAM performance, but few have been successfully integrated into a complete solution to the entire mapping and navigation problem. In contrast, biological systems are generally poorly understood and computational models emulating them have had very limited application to robot platforms in challenging environments. However, many animals solve the entire mapping and navigation problem without the precise sensors and high resolution maps that are typical of robotic methods. Models of the mapping and navigation process in relatively well understood animals such as rodents offer much potential for practical improvement.

The book describes a series of studies which implemented computational models inspired by the mapping and navigation processes in the rodent hippocampus. The initial study was based on conventional state of the art models of place and head direction cells. However, as the research moved further towards the creation of a functional robot navigation system, these biological models were extended and adapted to provide better robot navigation performance. The aim was to create an integrated solution to not just the mapping problem, but also the related challenges of exploration, goal navigation, and adaptation. The end result of this process was a system now known as the RatSLAM system, and an algorithm known as experience mapping.

At each stage of the research the system was tested experimentally, with the environments becoming larger and more challenging as its capability improved. The combined system displayed successful, although sometimes suboptimal, performance in a range of real robot tests lasting up to two hours. In the final experiments a RatSLAM-equipped robot,
acting in a completely autonomous manner, explored an unknown environment while performing SLAM, then navigated to goal locations while adapting to simple lasting environment changes.

Robot Navigation from Nature is intended to demonstrate the validity of an interdisciplinary approach combining biological navigation in rodents and other animals with robot navigation and SLAM. The majority of the work was performed during my doctorate studies, with this book being an adapted version of my PhD thesis. Through writing and publishing this book, I hope to spur further research in this direction. I believe it may offer a path to robots in the future that move around their environments in a truly intelligent manner.

As with any research, I am indebted to the many researchers before me on whose work I have based much of the work presented in this book. The references throughout the book indicate the main groups and people from which I have drawn inspiration. I also wish to acknowledge the funding from the Australian Research Council and the Australian government, without which I would never have been able to perform this research, my PhD supervisor Dr. Gordon Wyeth, who was the driving force behind the establishment of this research project, and my colleague David Prasser, who worked with me on the project and created the vision systems.

With regard to the development and production of this book, I thank Dr. Thomas Ditzinger, Engineering Editor at Springer Verlag, for helping me through the book production process, Dr. Peter Corke from the CSIRO, Prof. Siciliano from the University of Naples, and the anonymous reviewer who provided much useful feedback.
Queensland Brain Institute & School of ITEE The University of Queensland August 2007
Michael John Milford
Table of Contents

List of Figures
List of Abbreviations

1. Introduction
   1.1 Mobile Robots
   1.2 Simultaneous Localisation and Mapping
   1.3 Exploration, Goal Navigation and Adapting to Change
   1.4 Practical Performance from a Biological Model
   1.5 Book Outline

2. Mapping and Navigation
   2.1 The Mapping and Navigation Problem
       2.1.1 Localisation and Mapping
       2.1.2 SLAM: The Chicken and the Egg Problem
       2.1.3 Dealing with Uncertainty
       2.1.4 Exploring Unknown Environments
       2.1.5 Navigating to Goals
       2.1.6 Learning and Coping with Change

3. Robotic Mapping Methods
   3.1 Probabilistic Mapping Algorithms
       3.1.1 Kalman Filter Methods
       3.1.2 Expectation Maximisation Methods
       3.1.3 Particle Filter Methods
   3.2 Topological Mapping Methods
   3.3 Exploration, Navigation, and Dealing with Change
       3.3.1 Exploration
       3.3.2 Navigating to Goals
       3.3.3 Dealing with Dynamic Environments
   3.4 Discussion

4. Biological Navigation Systems
   4.1 Rodents and the Cognitive Map
       4.1.1 Head Direction and Place Cells
       4.1.2 Exploration, Navigation, and Dealing with Change
   4.2 Other Animals and Insects
       4.2.1 Bees
       4.2.2 Ants
       4.2.3 Primates
       4.2.4 Humans
   4.3 Discussion

5. Emulating Nature: Models of Hippocampus
   5.1 Head Direction and Place Cells – State of the Art
       5.1.1 Attractor Networks
       5.1.2 Path Integration
       5.1.3 Head Direction Correction Using Allothetic Information
       5.1.4 Place Cells – State of the Art
       5.1.5 Place Cells Through Allothetic Cues
       5.1.6 Place Cells Through Ideothetic Information
       5.1.7 Navigation
   5.2 Discussion

6. Robotic or Bio-inspired: A Comparison
   6.1 Robustness Versus Accuracy
   6.2 Map Friendliness Versus Map Usability
   6.3 Sensory Differences
   6.4 Capability in Real World Environments
   6.5 One Solution

7. Pilot Study of a Hippocampal Model
   7.1 Robot and Environment
   7.2 Complete Model Structure
   7.3 A Model of Spatial Orientation
       7.3.1 Representing Orientation
       7.3.2 Learning Allothetic Cues
       7.3.3 Re-localisation Using Allothetic Cues
       7.3.4 Internal Dynamics
       7.3.5 Path Integration Using Ideothetic Information
   7.4 Model Performance
       7.4.1 Experiment 1: Path Integration Calibration
       7.4.2 Experiment 2: Localisation and Mapping in 1D
   7.5 A Model of Spatial Location
       7.5.1 Representing Location
       7.5.2 Learning Allothetic Cues
       7.5.3 Re-localisation Using Allothetic Cues
       7.5.4 Internal Dynamics
       7.5.5 Path Integration Using Ideothetic Information
   7.6 Model Performance
       7.6.1 Experiment 3: Localisation and Mapping in 2D
   7.7 Discussion and Summary
       7.7.1 Comparison to Biological Systems
       7.7.2 Comparison to Other Models
       7.7.3 Conclusion

8. RatSLAM: An Extended Hippocampal Model
   8.1 A Model of Spatial Pose
       8.1.1 Complete Model Structure
       8.1.2 Biological Evidence for Pose Cells
       8.1.3 Representing Pose
       8.1.4 Internal Dynamics
       8.1.5 Learning Visual Scenes
       8.1.6 Re-localising Using Familiar Visual Scenes
       8.1.7 Intuitive Path Integration
   8.2 Generation of Local View
       8.2.1 Sum of Absolute Differences Module
       8.2.2 Image Histograms
   8.3 Visualising SLAM in a Hippocampal Model
   8.4 SLAM in Indoor and Outdoor Environments
       8.4.1 Experiment 4: SLAM with Artificial Landmarks
       8.4.2 Experiment 5: SLAM in a Loop Environment
       8.4.3 Experiment 6: SLAM in an Office Building
       8.4.4 Experiment 7: SLAM in Outdoor Environments
       8.4.5 Path Integration Only Performance
       8.4.6 SLAM Results
   8.5 Summary and Discussion
       8.5.1 RatSLAM Requirements
       8.5.2 The Nature of RatSLAM Representations

9. Goal Memory: A Pilot Study
   9.1 Enabling Goal Recall Using RatSLAM
   9.2 Learning
   9.3 Recall
       9.3.1 Experiment 8: Small Environment Goal Recall
       9.3.2 Goal Recall Results
       9.3.3 Experiment 9: Large Environment Goal Recall
       9.3.4 Goal Recall Results
   9.4 Summary and Discussion
       9.4.1 Creating Maps Suited to Goal Recall

10. Extending RatSLAM: The Experience Mapping Algorithm
    10.1 A Map Made of Experiences
    10.2 Linking Experiences: Spatially, Temporally, Behaviourally
    10.3 Map Correction
    10.4 Map Adaptation and Long Term Maintenance
    10.5 Indoor Experience Mapping Results
        10.5.1 Experiment 10: Large Pose Cell Representation
        10.5.2 Experiment 11: Small Pose Cell Representation
    10.6 Experiment 12: Outdoor Experience Mapping
    10.7 Summary and Discussion

11. Exploration, Goal Recall, and Adapting to Change
    11.1 Exploring Efficiently
        11.1.1 Experimental Evaluation
        11.1.2 Discussion
    11.2 Recalling Routes Using a Temporal Map
        11.2.1 Temporal Map Creation
        11.2.2 Route Planning
        11.2.3 Behaviour Arbitration
        11.2.4 Route Loss Recovery
    11.3 SLAM and Navigation in a Static Environment
        11.3.1 Experiment 13: Goal Recall with Minor Pose Collisions
        11.3.2 Experiment 14: Goal Recall with Major Pose Collisions
        11.3.3 Discussion
    11.4 Adapting to Environment Change
        11.4.1 Experiment 15: Indoor Map Adaptation
        11.4.2 Results
    11.5 Discussion
        11.5.1 Conclusion

12. Discussion
    12.1 Book Summary
        12.1.1 Mapping and Navigation
        12.1.2 Pilot Study of a Hippocampal Model
        12.1.3 RatSLAM: An Extended Hippocampal Model
        12.1.4 Goal Memory: A Pilot Study
        12.1.5 Extending RatSLAM: Experience Mapping
        12.1.6 Exploring, Goal Recall, and Adapting to Change
    12.2 Contributions
        12.2.1 Comparative Review of Robotic and Biological Systems
        12.2.2 Performance Evaluation of Hippocampal Models
        12.2.3 Implementation of an Extended Hippocampal Model
        12.2.4 An Experience Mapping Algorithm
        12.2.5 An Integrated Approach to Mapping and Navigation
    12.3 Future Mapping and Navigation Research
    12.4 Grid Cells
    12.5 Conclusion

Appendix A – The Movement Behaviours
    Search of Local Space
    Pathway Identification
    Velocity Commands

References
List of Reproduced Figures
Index
List of Figures

3.1  Maps produced with and without the EM algorithm
3.2  The effect of the motion model on the robot’s pose belief
3.3  Candidate selection using the NBV algorithm
3.4  Exploration method using the Spatial Semantic Hierarchy
4.1  The rodent hippocampus
4.2  Firing rate of cells as a function of head direction
4.3  Anatomy of a worker bee
5.1  Attractor network model of the rodent head direction system
5.2  A coupled attractor model of rodent head direction
5.3  Bearing and place cell models
5.4  Model relating cues, head direction cells and place cells
5.5  A model of view cells
5.6  Allothetic and ideothetic information influences place cells
5.7  Place cell firing rates during a robot experiment
5.8  Two-dimensional continuous attractor model of place cells
5.9  Minimalist LV-PI model as implemented by Browning
5.10 Generating a theta rhythm
5.11 Learning to navigate to a goal
6.1  High resolution occupancy maps
6.2  Illustrative example of place cell firing and occupancy maps
7.1  Equipment used during indoor experiments
7.2  The Pioneer robot in the two by two metre experimental arena
7.3  Overall structure of the hippocampal model
7.4  Head direction network
7.5  Activity profile in the head direction network
7.6  Active view cells are associated with active head direction cells
7.7  Connectivity of local view cells with head direction cells
7.8  Head direction cells excite themselves and other cells
7.9  Path integration cells in action for counter-clockwise rotation
7.10 Odometry calibration
7.11 Environment for 1D localisation and mapping experiments
7.12 Results of orientation calibration
7.13 Re-localisation under visual input
7.14 Visual reinforcement of the current orientation hypothesis
7.15 View cell connections to the place cells
7.16 Sample activity profiles
7.17 View cell-place cell weight map
7.18 Path integration in the place cells
7.19 Localisation using path integration
7.20 Positional tracking error
7.21 Learning under visual ambiguity
7.22 Re-localisation failure under visual ambiguity
7.23 Activity packets after further robot movement
8.1  The RatSLAM system structure
8.2  The eight-arm maze used in rodent experiments
8.3  The three-dimensional pose cell model
8.4  Snapshot of pose cell activity during an experiment
8.5  The pose cell and local view cell structures
8.6  High resolution and down sampled greyscale camera images
8.7  Local view cell activity during a 77 minute experiment
8.8  Region of the panoramic image used to generate histograms
8.9  Sample 32 × 32 normalised red-green histograms
8.10 Hybrid topological-metric map
8.11 Floor plan of testing arena
8.12 Testing arena and Pioneer robot with artificial landmark
8.13 Results without visual re-localisation
8.14 SLAM results with unique landmarks
8.15 Results with ambiguous landmarks
8.16 Floor plan of loop environment
8.17 Robot camera captures from loop environment
8.18 Robot localisation without visual re-localisation
8.19 RatSLAM trajectory for loop environment
8.20 Results of a robot kidnap
8.21 Complete floor environment
8.22 Robot camera captures from the complete floor environment
8.23 Inconsistent pose cell trajectory without visual re-localisation
8.24 RatSLAM trajectory for the office environment (η = 0.20)
8.25 RatSLAM trajectory for the office environment (η = 0.25)
8.26 CSIRO robot tractor
8.27 Route for outdoor experiment
8.28 Robot localisation using only path integration
8.29 Robot localisation with visual re-localisation
9.1  Active goal cells are linked forwards and backwards in time
9.2  Floor plan and robot trajectory for Experiment 8
9.3  RatSLAM trajectory just before goal navigation commenced
9.4  The temporal map cells after recall of the first goal
9.5  The path the robot followed to reach the first goal
9.6  The temporal map cells after recall of the second goal
9.7  The path the robot travelled to reach the second goal
9.8  Floor plan of large indoor environment
9.9  Dominant packet path for a 40 × 20 × 36 pose cell matrix
9.10 Temporal map for the large indoor environment
10.1  Experience map co-ordinate space
10.2  A transition between two experiences
10.3  An example of the behavioural and temporal information
10.4  Experience map growth over time
10.5  Floor plan for the experience mapping experiments
10.6  Path of dominant activity packet (100 × 40 × 36 cell matrix)
10.7  Snapshots of the experience map at various times
10.8  Experience map produced using a 100 × 40 × 36 cell matrix
10.9  Path of dominant activity packet (40 × 20 × 36 cell matrix)
10.10 Experience map produced using a 40 × 20 × 36 cell matrix
10.11 Experience map for outdoor environment
11.1  Robot movement at an intersection over 80 minutes
11.2  An example of a route calculated using a temporal map
11.3  Behaviour arbitration during goal navigation
11.4  Floor plan showing the robot’s path and goal locations
11.5  Temporal maps, planned and actual routes for experiment 13
11.6  Temporal maps, planned and actual routes for experiment 14
11.7  Floor plan for adaptation experiment
11.8  Trial 1 goal navigation results
11.9  Trial 2 goal navigation results
11.10 Trial 3 goal navigation results
A.1  Search tree node
A.2  Search tree structure
A.3  Path extraction process
A.4  Velocity dependence on angle
List of Abbreviations

APN       Adaptive Place Network
ASR       Absolute Space Representation
ATN       Anterior Thalamic Nuclei
CA        Cornu Ammonis
CAL       Calibration cell
CAN       Competitive Attractor Network
CF        Centreline Following
CSIRO     Commonwealth Scientific and Industrial Research Organisation
DARPA     Defence Advanced Research Projects Agency
DG        Dentate Gyrus
DP        Distinctive Place
(M)EC     (Medial) Entorhinal Cortex
EKF       Extended Kalman Filter
ELDEN     Exploration and Learning in Dynamic ENvironments
EM        Expectation-Maximisation
GM        Goal Memory
GPS       Global Positioning System
HD        Head Direction
KF        Kalman Filter
LCS       Local Control Strategies
LPM       Local Perceptual Map
LTP       Long Term Potentiation
LV        Local View
LWF       Left Wall Following
MCL       Monte Carlo Localisation
MFIS      Memory For the Immediate Surroundings
nbhd      Neighbourhood
NBV       Next-Best-View
PC        Place Cell(s), also Pose Cell(s)
PET       Positron Emission Tomography
PI        Path Integration
POMDP     Partially Observable Markov Decision Process
PoS, poSC Postsubiculum
PTZ       Pan-Tilt-Zoom (camera)
RGB       Red-green-blue
RWF       Right Wall Following
SAD       Sum of Absolute Differences
SEIF      Sparse Extended Information Filter
SIFT      Scale Invariant Feature Transforms
SLAM      Simultaneous Localisation And Mapping
sLEC      Superficial Lateral Entorhinal Cortex
SSD       Sum of Squared Differences
VIS       Visual bearing cell
1 Introduction
This book describes the design and implementation of a vision-based Simultaneous Localisation And Mapping (SLAM) system using an extended model of the rodent hippocampus, and a mapping algorithm that integrates with this system to provide goal recall and environment adaptation capabilities. Computational models of biological systems have traditionally had limited mapping and navigation capabilities when implemented on real robots. Conventional probabilistic techniques can solve specific components of the problem such as SLAM, but do not provide an integrated solution to the entire problem of exploring, mapping, navigating to goals, and adapting to change. The aim of this research was to demonstrate that by analysing and extending models of the rodent hippocampus within a pragmatic robotics context, it is possible to create a biologically inspired model that can solve many of the key mapping and navigation problems.

This chapter describes the mapping and navigation challenge for mobile robots, the rationale for the approach taken in this body of work and the research contributions. Sections 1.2 and 1.3 discuss four major requirements of a mobile robot: the ability to perform SLAM, explore an environment, navigate to goals, and adapt to environment change. Section 1.4 describes the rationale behind developing a biologically inspired mapping and navigation solution, and the methodology employed. The chapter concludes in Section 1.5 with an outline of the book structure.
1.1 Mobile Robots

Mobile robots are being employed in ever-increasing numbers in both domestic and industrial applications. The rapid uptake of mobile robots especially in consumer applications has ensured that the industrial robots used in manufacturing lines since the 1960s are no longer the most common form of robot. Over a short period of only a few years mobile robots have come to hold this distinction (UNECE 2005).

The key to this explosion in mobile robot usage has been a transition in their mainstream image from novelty luxury items to practical, labour-saving machines. A good example of these new labour-saving robots is the vacuum cleaner robots produced by companies such as iRobot and Electrolux. By the end of 2004, there were an estimated 1.2 million personal domestic robots in service around the world according to
the United Nations Economic Commission for Europe survey (UNECE 2005), including more than one million vacuum cleaner robots. Of these robots about 550,000 were installed during 2004, indicating the rapid growth in this area. Although not as prolific as vacuum cleaner robots, lawn mowing robots are also in widespread use with an estimated 46,000 units worldwide as of the end of 2004 (UNECE 2005). The UNECE predicts that approximately seven million personal service robots will be sold during the period 2005 – 2008. This figure includes 4.5 million domestic robots with an estimated value of three billion US dollars, and 2.5 million entertainment and leisure robots with an estimated value of 4.4 billion US dollars (UNECE 2005). Longer term predictions include one of 39 million personal domestic robots by the year 2010, as given by the semiconductor analyst company Future Horizons (Horizons 2004).

While mobile robots are becoming a normal part of a growing number of domestic homes, they are still quite limited in their capability. Many vacuum cleaner robots are, at their core, wheeled vehicles that combine a set of reactive movement behaviours with a range of pre-programmed path behaviours, using some simple back-off and turn behaviours to extract themselves from obstacles or dead ends. Although reactive techniques can be used to solve a large number of navigational problems, there are limits to what can be achieved without more advanced navigation methods. For example, there are many environment geometries such as box canyons that present a significant challenge to reactive techniques. The introduction of hysteresis mechanisms (Browning 2000) or local spatial memories (Balch and Arkin 1993) can increase the potential of reactive systems but only to a certain extent. Without state information and a map these robots have no higher level means of reasoning about their environment.

Some of the characteristics of domestic robots are a product of needing to use cheap, mass produced sensors and computing power in order to achieve an affordable market price. Personal service robots need to be affordable because of the price sensitivity of consumers in that area (Schofield 1999). To produce a robot retailing for $100, it is quite likely that the factory will need to produce the system from components costing a small fraction of that amount. This financial restriction significantly limits the sensor equipment that can be used, ruling out popular robot sensors such as the SICK laser scanner (Schofield 1999). According to Colin Angle, “It is easy to build a robot that is prohibitively expensive”, but to build one that is affordable to mass consumers is much more challenging (Kanellos 2004).

The more fundamental problem limiting the sophistication of mobile robots is the algorithms required to equip a mobile robot with advanced navigation abilities. A robot using only reactive behaviours has intrinsically limited navigation capabilities. To function intelligently in real world environments under anything but the most basic of assumptions about the environment, a mobile robot requires some way of mapping the environment and using that map to navigate and perform tasks (Thrun 1998). The development of mapping and navigation techniques that can be robustly applied on real robots will facilitate progress towards the next generation of mobile robots for industrial and domestic settings.
This generation will include robots capable of exploring and mapping their surrounds, performing global tasks such as navigating to goals, and adapting to changes in the environment. The remainder of this chapter provides an introduction to the major mapping and navigation problems that must be solved before this next generation of mobile robots
becomes a reality, the current methodologies for solving these problems, and the motivation of the approach described in this book.
1.2 Simultaneous Localisation and Mapping

One of the core problems facing autonomous mobile robots is the Simultaneous Localisation And Mapping (SLAM) problem. Although it is not the sole problem it is one of the most significant (Dissanayake, Durrant-Whyte et al. 2000; Newman and Leonard 2003) and has been the subject of much research over the past two decades. The core SLAM problem is the requirement that a robot, starting in an unknown environment, explore in order to learn the environment (map), while simultaneously using that map to keep track of the robot’s position (localise) within the environment. SLAM is, strictly speaking, a problem, although throughout the literature it is also used as a description of a process that a robot performs in solving the problem (Thrun 2002).

The SLAM problem is difficult for many reasons. Mobile robots are essentially moving sensor platforms – sensors vary in type and capability but all have limitations and all are noisy to some degree. Consider the case of a laser range finder, widely regarded as one of the most accurate and reliable sensor types in robotics. A laser range finder measures the distance to obstacles with a high degree of accuracy, but is neither exact nor infallible. Measurements are only accurate within a margin of error. In addition, measurements may be incorrect in real world environments – the laser may ‘see’ through glass walls, and beams may reflect off multiple surfaces before returning to the sensor. In outdoor settings bright sunlight may affect the sensor and introduce faulty readings.

The problem becomes worse when examining the performance of sonar range sensors. Unlike the relatively focused light beam from a laser, sound waves from a sonar sensor project in a cone-like manner from the point of origin. Returned ranges are obtained for the closest reflecting object within this cone. The wide beam results in significant uncertainty regarding the exact position of a detected obstacle (Moravec and Elfes 1985). Surfaces with certain textures or surfaces that do not transect the sonar cone at ninety degree angles may not reflect sufficient sound energy back to the sensor to register a reading. As with a laser range finder, readings may be caused by reflections from multiple surfaces and not represent the range to the closest object.

The sensor problem is not limited to range finding sensors – vision-based sensors are subject to their own set of limitations. Cameras are notoriously poor at representing colours in images. If the image domain is reduced to greyscale images, there is still the problem of dealing with light intensity variation, especially in outdoor environments. There are a handful of techniques for dealing with lighting variation but none are completely satisfactory (Buluswar and Draper 2002; Tews, Robert et al. 2005). Camera images can also be distorted, especially the images obtained from panoramic cameras or using low cost wide angle lenses.

Internal robot sensing is just as limited. Rotational encoders on robot wheels may accurately measure the movement of the wheel, but factors such as wheel slip ensure that this measurement does not correspond exactly to the movement of the robot along the ground. It is impossible to calibrate a system perfectly, meaning that even with a
minimum of wheel slippage, over any significant time period these slight measurement errors will accumulate to unmanageable proportions. This phenomenon is well known within the robotics community and is often referred to as ‘odometric drift’. The problem is made more severe in certain situations, such as when the robot’s environment contains a variety of ground surface types.

The significance of many of these external and internal sensing problems can be reduced through engineering. Using more expensive cameras can reduce image distortion. Sensor fusion between multiple types of range sensors can help negate the effects of occasional incorrect measurements from a particular sensor type. Engineers can design robots and their control systems to minimise wheel slippage. The use of a Global Positioning System (GPS) can help avoid the need to integrate wheel encoder information in order to update a robot’s position over time. These approaches can solve or reduce the severity of some of these problems but are often not practical to implement or introduce new problems. For instance, standard GPS does not work indoors, is not completely reliable outdoors, and is not available in all locations. The first problem can be solved by buying an indoor GPS, but this may only be suitable for certain types of indoor environments, making it a specific rather than general solution. In outdoor environments, when the GPS ‘drops out’, the downtime can last long enough to severely test any alternative system. Cost concerns rule out many of the possible solutions, especially in areas such as domestic robots, where expensive sensors cannot be used.

Underlying the entire sensory issue is the inescapable fact that humans and animals manage to navigate and learn environments with a set of inexact, imperfect sensors. Using expensive sensors to increase accuracy and reliability is a partial engineering solution to the problem, but it is obvious that nature has managed to solve the problem in an entirely different manner. From a commercial and practical perspective it is desirable that robot systems, especially those for personal use, be able to function with simple, inexpensive sensors in order to be commercially attractive.

Research into ways of solving the SLAM problem can be divided into two broad categories. One category consists of a number of mathematical probabilistic techniques that have been developed over the last twenty years. These methods use Kalman Filters, particle filters, and Expectation Maximisation algorithms as the basis for robotic SLAM techniques. This approach has been extensively researched, resulting in a wide variety of world representation (map) types and sensor domains. The research in this field has primarily focused on producing functional navigation and mapping systems on mobile robots in real world environments, or in simulation with the goal of eventual deployment on a robot. The field has produced a number of SLAM methods that can function in large environments under varying assumptions. Typical assumptions are that the robot’s environment is static and that the robot has an accurate range sensor. Most of the methods with the best practical performance require costly sensing equipment and significant computational resources (Montemerlo, Thrun et al. 2003; Grisetti, Stachniss et al. 2005).
In recent work range information has been extracted from stereo vision sensors and used successfully to perform SLAM, but the process has required fairly complex management of Scale Invariant Feature Transforms (SIFT) (Sim, Elinas et al. 2005).

The other area of research has focused on emulating the biological systems thought to be responsible for mapping and navigation in animals. The most studied animal has
been the rodent, with a focus on the rodent hippocampus as the probable region involved with spatial navigation tasks. Research in this area has developed a consensus on general characteristics of the hippocampus with respect to navigation in rats, although many aspects of the issue remain the subject of debate. The research trend in this area has been towards validating and improving models of brain function, rather than producing practical robot navigation systems (Franz and Mallot 2000). Few models have been implemented on actual robots, and those that have, only in small artificial environments (Arleo 2000; Krichmar, Nitz et al. 2005). This research has, however, yielded some preliminary ideas on how a biologically inspired system might solve the SLAM problem, without the use of expensive range sensors or complex probabilistic algorithms.

In summary, the SLAM problem is one of the major challenges that must be solved by higher level autonomous mobile robots. SLAM is a difficult problem because of the inherent uncertainty in the sensors a robot uses to gain knowledge about its environment. Robotics research has produced a number of solutions to the SLAM problem that perform quite well in large environments but which often rely on environment assumptions, expensive range sensors, and computationally intensive probabilistic algorithms. Existing computational models of biological systems are much less capable of solving the SLAM problem but offer the potential for an alternative solution incorporating more affordable sensory equipment.
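To make two of the ideas in this section concrete – the accumulation of odometric drift and its correction by a probabilistic filter – the following short Python sketch simulates a robot estimating its position along a one-dimensional corridor. It is purely illustrative and not taken from this book or any particular SLAM system; the wall position, noise levels and correction interval are arbitrary assumptions. Integrating noisy wheel-encoder steps alone produces an estimate that drifts, while a simple one-dimensional Kalman filter update from an occasional noisy range measurement to a wall at a known position keeps the error bounded.

    import random

    WALL = 10.0       # known wall position in metres (illustrative assumption)
    STEP = 0.1        # commanded forward step per cycle (metres)
    ODOM_STD = 0.02   # wheel-slip noise per step (metres)
    RANGE_STD = 0.05  # range-sensor noise (metres)

    true_x = 0.0
    dead_reckon = 0.0            # estimate from odometry alone, never corrected
    est_x, est_var = 0.0, 0.0    # filtered estimate and its variance

    for t in range(1, 101):
        # The robot moves; odometry measures the step with some error.
        true_x += STEP
        measured_step = STEP + random.gauss(0.0, ODOM_STD)
        dead_reckon += measured_step

        # Prediction: the filtered estimate also integrates the noisy step,
        # and its uncertainty grows with every step integrated.
        est_x += measured_step
        est_var += ODOM_STD ** 2

        # Every 20 steps, observe the range to the wall and correct the estimate.
        if t % 20 == 0:
            observed_range = (WALL - true_x) + random.gauss(0.0, RANGE_STD)
            predicted_range = WALL - est_x
            innovation = observed_range - predicted_range
            gain = est_var / (est_var + RANGE_STD ** 2)
            est_x -= gain * innovation   # a longer-than-expected range means x was overestimated
            est_var *= (1.0 - gain)

    print("true position     :", round(true_x, 3))
    print("odometry only     :", round(dead_reckon, 3))
    print("filtered estimate :", round(est_x, 3))

Run a few times, the dead-reckoning error tends to grow with distance travelled while the filtered estimate stays close to the true position – the essence of why the probabilistic methods reviewed in Chapter 3 maintain and update an explicit measure of uncertainty rather than trusting any single sensor reading.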
1.3 Exploration, Goal Navigation and Adapting to Change

Solving the SLAM problem is only one of many capabilities that a fully autonomous mobile robot must possess if it is to function in real world environments. A robot must also be able to explore an unknown environment, navigate to goals, and adapt to changes in the environment. However, these tasks cannot be performed efficiently without the core SLAM problem being solved. Mapping and navigation research on real robots has been biased towards solving the SLAM problem, since it is difficult to properly address these other challenges without a solution to the SLAM problem. Exploration and goal navigation techniques have been extensively researched in simulation, often with the assumption that the SLAM problem has been solved. There has been relatively little research on actually implementing these techniques on robots. There has also been only limited investigation into methods of adapting to changing environments.

Because so much of the mapping and navigation research on actual robots over the past two decades has focused on the core SLAM problem, the subsequent solutions have been developed in isolation, without due consideration of other challenges such as goal navigation and adaptation to environment change. The focus on the SLAM problem has led to piecemeal attempts to implement goal recall and adaptation abilities after solving the SLAM problem, rather than the consideration of the entire mapping and navigation problem from the start of each investigation. Consequently, while there are individual techniques for solving each aspect of the problem, there are no systems that successfully solve the entire problem on robots in real world environments.
1.4 Practical Performance from a Biological Model

The work described in this book has been motivated by the desire to fill the gap between the relatively poor practical performance of biological navigation models and the better, but more specific, performance of current robotic navigation systems. Biological models are based on animals with competent all round mapping, navigation, and adaptation abilities, whereas robotic methods tend to solve each of the problems in isolation. While there has been a significant amount of research into creating models of biological systems, only a small fraction has focused on the practical usefulness of such models in robotics (Arleo 2000; Browning 2000). Most of the research has centred on testing models of biological mechanisms, rather than on finding an optimal solution to the robot mapping and navigation problem (Franz and Mallot 2000).

Models of biological systems have been formed almost entirely from evidence obtained from rat experiments in small, highly controlled environments. Typical rodent experiments involve a rat moving around a small cylindrical, rectangular, or simple maze arena with a number of well-defined artificial visual cues (McNaughton, Barnes et al. 1983; Knierim, Kudrimoti et al. 1995; Skaggs and McNaughton 1998; Hampson, Simeral et al. 1999; Kobayashi, Tran et al. 2003). Testing of computational models developed using this experimental evidence has occurred in similar types of environments. For example, in Arleo’s doctoral work, the two arenas were a 600 × 600 mm arena with bar-coded walls and an 800 × 800 mm arena with an external visual cue but no walls (Arleo 2000). Many of the major mapping and navigation problems such as closing the loop and dealing with significant ambiguity are not apparent in such small, constrained environments, so the practical usefulness of biological models has remained untested.

As well as being a relatively unexplored research topic, creating computational models of biological systems for use on robots is attractive for two reasons. The first reason is the mapping and navigation capabilities of the animals upon which the models are based. Animals such as rodents function in complicated environments without the use of high precision range finding equipment. Biological models offer the potential for autonomous mobile robots without the expensive laser range finder sensors used in so many SLAM systems. The second reason is the nature of how animals solve the entire mapping and navigation problem. Animals appear to have an integrated solution that allows them to explore, map, navigate to goals, and adapt to environment change. Modelling biological systems involves attempting to represent these combined mechanisms as one coherent whole, rather than as a number of isolated algorithms and techniques.
1.5 Book Outline

The book chapters proceed as follows:

To create a biologically inspired mapping and navigation system with practical performance requires an examination of the robotic and biological literature in this field. Chapter 2 defines the mapping and navigation problem and discusses the individual problems that it constitutes. Chapters 3 to 5 review the three major types of system that address this problem: conventional probabilistic systems, biological
systems, and computational models of biological systems. Chapter 6 provides a comparative analysis of these systems, resulting in the identification of key issues and a proposal for the research path taken in this work.

The first stage of the research consisted of a pilot study implementation of a classical computational model of the rodent hippocampus, with the purpose of evaluating its mapping and navigation capabilities in a robotics context. Chapter 7 describes the model implementation and its mathematical foundations. The model was tested in a number of spatial localisation and mapping experiments. These experiments revealed the model had some fundamental mapping limitations. The chapter concludes with a review of the experimental research on rats to determine whether these limitations are a product of the model or the process being modelled. The outcomes of this research led to the extension of the hippocampal model which is described in Chapter 8.

Developing a SLAM system based on models of the hippocampus is an iterative process. Chapter 8 describes the implementation of an extended model of hippocampal mapping and navigation called RatSLAM, which is based on the orthodox model implemented in Chapter 7. This is followed by a presentation of the RatSLAM model’s performance in a range of indoor and outdoor SLAM experiments.

A map is only useful if it can be used by a robot to perform some sort of purposeful behaviour. Chapter 9 describes the implementation of a goal recall system that uses the RatSLAM maps directly to perform navigation to goals. The method uses a new cell structure known as the goal memory to create temporal maps for navigation. The system was tested in a sequence of progressively larger environments under increasingly challenging conditions. This increase in complexity introduced a number of phenomena into the RatSLAM maps, including discontinuities and collisions, that make them unsuitable for direct use.

To create a more usable representation, an algorithm known as experience mapping was developed. Chapter 10 presents the experience mapping algorithm, which uses the RatSLAM maps to create spatio-temporal-behavioural maps that are suitable for exploration, goal navigation, and adaptation strategies. The algorithm is described in detail, including the core map correction process by which it produces experience maps that are free of the discontinuities and collisions of the original RatSLAM representations. Results are presented from experiments in both indoor and outdoor environments.

With the establishment of a suitable mapping algorithm, methods for exploration, goal recall, and adaptation were implemented. Chapter 11 describes the development and implementation of these methods. Results are presented to evaluate each of these methods in a number of indoor robot experiments. The final experiments tested the ability of the combined system to autonomously explore, map, navigate to goals, and adapt to environment change.

The final stage of the research consisted of a review of the work performed, an analysis of the research outcomes and their significance within the research field, and a discussion of the remaining research challenges. Chapter 12 starts with a summary of the research presented in each chapter and highlights the key findings that led to the next stage of the work. This is followed by an assessment of the major research contributions, which are related back to the literature on mapping and navigation.
Section 12.3 discusses future challenges for the mapping and navigation research field and ways in which they may be addressed. The book then concludes with a summary of the work performed and its contribution to the field of robot mapping and navigation.
2 Mapping and Navigation
Mobile robots and animals alike require the ability to move around in their environments. For a large number of animals such as insects, this ability is provided by reactive movement schemes. With an appropriate set of reactive behaviours an insect can function quite well in complex environments. However, there is a limit to what can be achieved through pure reactivity. A whole class of navigation problems become solvable if an animal or robot is able to form some sort of map of its environment.

Research on the topic of mapping and navigation can be split into two areas. One area has focused on how animals move around and function in their habitats, whether the animal is an ant scurrying around on desert sand dunes (Muller and Wehner 1988), or an English taxi operator driving passengers around London (Maguire, Frackowiak et al. 1997). Popular study subjects in this area of research include rodents, primates, humans, and bees. By examining the capabilities of these biological agents both in nature and controlled environments, researchers have formed a number of models of the actual physical and neurological mechanisms driving their navigation systems. The extent of the research has varied, from purely theoretical analysis (Abbott 1994; Eichenbaum, Dudchenko et al. 1999; Hahnloser, Xie et al. 2001), to validation through testing of software models or testing in simulated environments (Borenstein 1994; Touretzky, Wan et al. 1994; Balakrishnan, Bhatt et al. 1998; Stringer, Rolls et al. 2002; Chavarriaga, Sauser et al. 2003; Koene, Gorchetchnikov et al. 2003), to deployment of models on actual robotic platforms (Arleo 2000; Krichmar, Nitz et al. 2005).

Roboticists have generally had a different motivation driving their mapping and navigation research – the desire to produce functioning robots. The inherent uncertainty of the sensors used on robots has led to a convergence of the most successful robotic mapping and navigation systems – they are all probabilistic to some degree (Thrun 2000). Most methods have as their basis one or more of three different probabilistic algorithms: Kalman Filter, Expectation Maximisation, or particle filter algorithms. However, there is a large range of methods within this banner of probabilistic systems (Thrun 2002). The robot platform, sensor types, and environment all affect the way in which a mapping and navigation system is developed.
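As a companion to the algorithm families just mentioned, the following minimal Python sketch illustrates the particle filter idea in one dimension: a cloud of candidate positions is moved with noisy odometry, weighted by how well each candidate explains a noisy range measurement to a landmark, and then resampled. It is an illustrative toy, not drawn from this book or from any specific system in the literature, and the landmark position and noise parameters are arbitrary assumptions.

    import math
    import random

    LANDMARK = 5.0    # known landmark position on a 1-D track (illustrative assumption)
    MOVE = 1.0        # commanded move per step
    MOTION_STD = 0.1  # motion noise
    SENSE_STD = 0.3   # range-sensor noise

    def likelihood(expected, observed, std):
        # Gaussian likelihood of an observed range given the expected range.
        return math.exp(-0.5 * ((observed - expected) / std) ** 2)

    particles = [random.uniform(0.0, 10.0) for _ in range(500)]  # initial belief: anywhere
    true_x = 2.0

    for _ in range(3):
        # Motion update: move the true robot and every particle, with noise.
        true_x += MOVE
        particles = [p + MOVE + random.gauss(0.0, MOTION_STD) for p in particles]

        # Measurement update: weight each particle by how well it explains the reading.
        z = abs(LANDMARK - true_x) + random.gauss(0.0, SENSE_STD)
        weights = [likelihood(abs(LANDMARK - p), z, SENSE_STD) for p in particles]

        # Resampling: particles that explain the data well survive and multiply.
        particles = random.choices(particles, weights=weights, k=len(particles))

    estimate = sum(particles) / len(particles)
    print("true position:", round(true_x, 2), " estimated:", round(estimate, 2))

The same predict–weight–resample loop underlies Monte Carlo Localisation and the particle filter SLAM methods reviewed in Chapter 3, where each particle typically also carries its own map hypothesis.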
This and the following 4 chapters examine the research that has been conducted in both the robotic and biological mapping and navigation fields. Their purpose is as follows:

1. Introduce the mapping and navigation problem and discuss the various components of the problem.
2. Review the core probabilistic algorithms and the robotic mapping methods that use them.
3. Investigate the capabilities of biological systems.
4. Review the mapping and navigation performance of state of the art models of the rodent hippocampus – the most thoroughly understood biological system in this field.
5. Identify key issues in the field of mapping and navigation and use them to compare and contrast robotic and biological systems.
6. Highlight the middle ground between the probabilistic and biologically-based research areas, and discuss the advantages of developing a method inspired by biological systems but with a primary focus on practical performance.
2.1 The Mapping and Navigation Problem

Mobile robots by definition move around in their environments. Low level robots may function quite adequately in their environment using simple reactive behaviours and random exploration, but more advanced capabilities require some type of mapping and navigation system (Thrun 1998). The purpose of this section is to clarify exactly what is implied by the words ‘mapping’ and ‘navigation’ and to discuss the various capabilities a robot must possess to achieve these tasks.

2.1.1 Localisation and Mapping

When a robot localises, it determines its location within some representation of the environment. The act of localisation can vary greatly in complexity. Take the example of a robot that has just detected it is within a certain range of an infrared beacon. Armed with the prior knowledge that there is a single infrared beacon somewhere in its environment (and hence equipped with a map), the robot now knows that it is within a certain distance of that beacon, effectively localising itself within its map. In another environment an aerial robot may have measured its distance from four artificial landmarks placed on the ground. Armed with the prior knowledge of the landmarks’ positions within the environment (a map), the robot can localise by calculating its position from the distance measurements. It is important to note that the localisation process is entirely dependent on the map the system possesses – without a map the process has no meaning.

Mapping involves creating a representation of the environment. The representation or map can vary; one map might store a set of coordinate locations of trees in a park; another might keep a millimetre resolution grid showing which parts of an office floor have objects on them; yet another may represent the environment as a sequence of images and motor actions. Mapping is inherently linked with localisation – just about all mapping processes require that the robot is able to localise itself at least some of
the time. It is this inter-dependence of the two processes that leads to one of the most significant and challenging problems in mobile robotics.

2.1.2 SLAM: The Chicken and the Egg Problem

To create a map of the environment based on its sensory information, a robot must know where it is. Conversely, a robot can only calculate where it is located if it has some sort of map or world representation. When a robot is placed into an unknown environment, it starts with neither a map nor any idea where it is. In this situation a robot must create a map while simultaneously localising itself within this map. This is known as the Simultaneous Localisation And Mapping problem, or SLAM for short (Dissanayake, Newman et al. 2001). The problem is also referred to in the literature as the Concurrent Mapping and Localisation (CML) problem (Leonard, Newman et al. 2001). It is generally considered to be one of the most challenging problems in mobile robotics and one of the most significant that must be solved before truly autonomous moving robots become a reality (Dissanayake, Durrant-Whyte et al. 2000; Leonard, Newman et al. 2001; Newman and Leonard 2003).

Unlike mapping and localisation systems that require a priori knowledge of the environment or modification of the environment, SLAM systems can function in situations where map information or robot position is not initially known and environment modification is not possible. This ability makes these systems significantly more autonomous and ideal for a large range of applications including mobile mining robots, aerial vehicles, submersible robots, and planetary rovers (Dissanayake, Newman et al. 2001). Even in environments where it is possible to hand craft a map beforehand or place artificial markers, the effort involved is a significant motivator for using an autonomous map building system.

2.1.3 Dealing with Uncertainty

From the perspective of a robot, the world is an uncertain place. This uncertainty comes from two main sources (Thrun 2000):

• Sensors are noisy, producing measurements with a certain expected error range, and only work properly under a limited set of conditions.
• The world is an inherently uncertain, ambiguous, and unpredictable place.

To learn about an environment a robot must use some type of sensing device; because sensors are always noisy to some degree this introduces uncertainty into the robot’s perception of the world. For instance, a laser range finder may measure distances to a precision of 10 millimetres – it has high precision, but is not exact. Other less sophisticated range sensors may have less precise specifications. Vision sensors may exhibit undesirable characteristics in varying light conditions (Tews, Robert et al. 2005). Global sensors such as GPS can have ‘jitter’ – a robot’s reported position may rapidly move around in jumps of several metres.

Environments are another source of uncertainty. Ambiguity is one main cause – separate locations in an environment can appear the same through a robot’s sensory devices. Environments can also be unpredictable – unforeseen changes to the environment can occur, such as the removal or repositioning of furniture. A robot may have
to deal with the uncertain movement of dynamic objects such as people in an office building (Biswas, Limketkai et al. 2002).

2.1.4 Exploring Unknown Environments

Robots placed in unknown environments have no initial knowledge of their surroundings. To acquire this knowledge they must move around their environment, which is known as the process of exploration. Exploration is a crucial capability for any robot that must function in new environments it has never experienced before. Some researchers consider it the fundamental problem for mobile robots, because it involves localisation, navigation, and planning (Oriolo, Vendittelli et al. 2004). A good exploration process allows a robot to efficiently gain knowledge about the environment. It also removes the time-intensive task of building a priori maps by hand (Wullschleger, Arras et al. 1999), which is one of the keys to truly autonomous robots.

Exploration is often a critical part of the localisation and mapping process. Some methods may require an exploration scheme that regularly returns the robot to familiar locations, limiting the continuous time spent in new parts of the environment (Arleo 2000). The exploration technique can also be linked closely to the nature of the world representations that the robot builds (Kuipers and Byun 1991). Even after the environment has been learned, the robot can discover changes in the environment by re-exploring. A robot may also use routes it has traversed during exploration to navigate to goals. As well as enabling a robot to efficiently learn the environment, a good exploration method may facilitate higher level functions such as goal navigation and adaptation to environment change.

2.1.5 Navigating to Goals

"How do I get there" is one of the essential navigation functions (Maguire, Burgess et al. 1998). Mobile robots need to be able to get from place A to place B in an efficient manner. The problem is a complex one; to solve it in all but the simplest of situations involves much more than just following a reactive collision avoidance procedure. The robot must act despite the inherent uncertainty in its sensors and the robot's perception of its location in the environment. At certain times the lack of information or certainty may require the robot to blend goal navigation and information acquisition activities in order to improve its state information and hence its planning ability (Latombe 1991). If a robot needs to plan and execute paths starting with only an incomplete map or even no map at all, it must explore and build an incremental solution to the goal navigation problem (Yahja, Singh et al. 1998).

The type of robot and environment can dictate the desirable characteristics of the goal navigation process. The challenge is often to navigate to the goal as quickly as possible or via the shortest route. However, certain robots may have different constraints; for instance an interplanetary rover robot may need to navigate to goals using the least power or via the safest route (Guo, Parker et al. 2003). Military or domestic guard robots may need to navigate via the stealthiest route in order to minimise the chances of detection (Marzouqi and Jarvis 2004). In some situations, practical limitations of robot actuators, safe movement restrictions, and fragile payloads dictate the goal navigation requirements, favouring routes with smooth movement trajectories and restricted acceleration (Lepetic, Klancar et al. 2003; Wei and Zefran 2005).
Common to all these situations is the need for a method of navigating to locations with some notion of optimality.
2.1.6 Learning and Coping with Change

Being able to autonomously explore, SLAM, and navigate to goals are challenging tasks for any robot, but are only part of the problem faced by robots moving around in real world environments. Robots must also be able to cope with change in their environment (Fox, Burgard et al. 1999; Biswas, Limketkai et al. 2002). This is a particularly challenging problem because of the vast range of timescales involved. Change in an environment can be short-term and transient, such as a person walking past a robot, or long-term and lasting, such as a rearrangement of the furniture in an office (Yamauchi and Beer 1996). A localising and mapping process must be robust to transient change and be able to incorporate lasting change into its maps.

Changes can also be defined as either perceptual or topological, or a combination of both (Yamauchi and Beer 1996). Perceptual change is a change in the sensory appearance of the environment to the robot. The type of sensors the robot is using determines what constitutes a perceptual change. For instance, a change in wall colour and texture may not be noticed by a mapping system that uses a laser range finder, but would be a significant change for a vision-based SLAM system. Conversely, the movement of an object against a similarly coloured background might not be noticed by an appearance-based vision system, but would constitute a significant change to a range-based mapping system. In either situation, when the change is noticed, a previously familiar part of the environment suddenly appears different to the robot's perception system. The robot must be able to consolidate these new changes with its previous experience in a rational manner.

The problem of perceptual changes is compounded by physical changes to the structure of the environment. This is also referred to as topological change (Yamauchi and Beer 1996), and often includes perceptual change as well. In an indoor environment doors may be opened or closed, corridors may be blocked by temporary obstacles, and entire spaces can have their structural layout reorganised. In an outdoor setting vehicles may move around, roads or pathways may be blocked and the fundamental shape of the landscape may change, such as at a mine site. Paths that the robot is accustomed to taking may become blocked, or change in shape. New paths through previously inaccessible areas may be introduced. In order to function under these conditions a robot must be able to recognise and learn these structural environment changes and alter its movement through the environment accordingly.

Each aspect of the problem is by itself a challenging task. They combine to provide a problem which is unsolved and will probably remain unsolved for years to come. The difficulty of the problem is not surprising given that humans often struggle in the same situations. For instance a human may struggle to adapt to topological changes in their workplace due to renovations or reorganisations. When a worker shifts offices, their co-workers may travel to their previous office many times before finally learning to go to the new office location. In outdoor environments visual changes can cause navigation problems, such as those caused by the transition from day-time to night-time driving, or from a summer environment to a snow-covered winter environment. Given the difficulty of navigating in dynamic environments, it is not surprising that robotics research to date has only focused on solving limited subsets of the problem.
3 Robotic Mapping Methods
Sensor and environment uncertainty has caused most robotic mapping and navigation methods to converge to probabilistic techniques. The key strength of probabilistic techniques is their ability to deal with uncertainty and ambiguity in a robot’s sensors and environment. Any technique that has some means of handling the uncertainties faced by a mobile robot has an immense advantage over techniques that do not. Probabilistic techniques allow a robot to appropriately use sensory measurements based on their modelled uncertainties. Ambiguous features or landmarks in the environment become useful (albeit less useful than unique features) rather than becoming failure points for the mapping algorithm.
3.1 Probabilistic Mapping Algorithms

While most of the successful mapping and navigation techniques are probabilistic (Thrun 2000; Dissanayake, Newman et al. 2001; Montemerlo, Thrun et al. 2003), there is a wide range of ways in which a probabilistic approach can be implemented. There are several types of probabilistic algorithms that have been the focus of much research, with no clear 'best' one. Techniques vary in their complexity, assumptions about the environment, sensor usage, computational requirements, flexibility across robot platforms, scalability, and many other characteristics. Some are explicitly probabilistic down to the core mathematical foundations, while others are more implicit in their probabilistic characteristics, having been developed in a more heuristic manner. Some can be combined with others to form hybrid methods. This section will focus on the three most successful probabilistic algorithms, and their use in mapping and navigation methods on robots in the real world. The three types are:

• Kalman Filter (KF) algorithms,
• Expectation Maximisation (EM) algorithms, and
• particle filter algorithms.

3.1.1 Kalman Filter Methods

Kalman Filter-based methods track the locations of landmarks or features as the robot moves around the environment. Computationally they are one of the few techniques that
can maintain the full notion of uncertainty in the map, but are limited by a Gaussian assumption of uncertainty. This limitation has been partially addressed by the introduction of Extended Kalman Filters (EKFs). The complexity of the solution is proportional to $O(N^2)$, with $N$ being the number of landmarks or features in the map. Dissanayake et al. (2001) introduced the contemporary Kalman Filter technique for solving the SLAM problem and proved that solutions would converge.

To implement a vanilla Kalman Filter method, a few assumptions and a movement model are required. The usual assumption is that of a robot starting at an unknown location in an unknown environment. To model the motion of the robot through the environment, a kinematic model of the robot is required. Typically this model also incorporates some type of allowance for noise. Many methods use a linear discrete-time model to model the movement of the robot through the environment, e.g. (Fabrizi, Oriolo et al. 1998). Although the actual robot movement is usually non-linear, when segmented into short enough time periods the movement can be considered linear with minimal inaccuracy. The updated state of the robot, $x_v(k+1)$, can be calculated as follows:
$$x_v(k+1) = F_v(k)\,x_v(k) + u_v(k+1) + v_v(k+1) \qquad (3.1)$$

where the $v$ subscript denotes that the variable is specific to the robot (v)ehicle, $k$ is the step number, $F_v(k)$ is the state transition matrix, $u_v(k+1)$ is a vector of control inputs, $x_v(k)$ is the state of the robot, and $v_v(k+1)$ is a vector describing the errors in the movement process. The movement noise is usually modelled as having a mean of zero.

Robots require a sensor that can measure the relative locations of landmarks with respect to the robot. Observations, like movement, are characterised by uncertainty. The output of the sensor for the $i$th landmark, $z_i(k)$, is given by:

$$z_i(k) = H_i\,x_a(k) + w_i(k) = H_{p_i}\,p_i - H_v\,x_v(k) + w_i(k) \qquad (3.2)$$

where $x_a(k)$ is a vector containing the vehicle and landmark states, $w_i(k)$ is a vector storing the observation errors, usually with a mean of zero as with the motion process model, $H_i$ is the observation model that maps the sensor output $z_i(k)$ to the robot state vector, and $p_i$ is the absolute location of the $i$th landmark.
Vanilla Kalman Filter techniques require linear process models with Gaussian noise. To extend their applicability to non-linear processes, the Extended Kalman Filter algorithm was developed (Smith, Self et al. 1990). Extended Kalman Filters use the partial derivatives of the process and measurement functions to continually linearise around the current estimate. This technique allows the original Kalman Filter techniques to be extended to simple systems with non-linear characteristics. The method, however, is not practical when applied to more complex non-linear systems.
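To make the structure of Equations 3.1 and 3.2 more concrete, the following sketch steps a linear Kalman Filter through one predict–update cycle for a deliberately tiny one-dimensional world containing a single landmark. It is an illustration only, not code from any of the systems cited above; the matrices F, H, Q, and R and the numerical values are arbitrary assumptions.

```python
import numpy as np

# Illustrative 1-D SLAM state: [robot position, landmark position].
# Equation 3.1: x(k+1) = F x(k) + u(k+1) + v(k+1)   (v ~ N(0, Q))
# Equation 3.2: z(k)   = H x(k) + w(k)              (w ~ N(0, R))
F = np.eye(2)                      # state transition (state is static apart from control input)
H = np.array([[-1.0, 1.0]])        # observation: range = landmark position - robot position
Q = np.diag([0.1, 0.0])            # motion noise affects only the robot
R = np.array([[0.05]])             # observation noise

x = np.array([0.0, 5.0])           # initial state estimate
P = np.diag([0.0, 100.0])          # robot pose known, landmark very uncertain

def predict(x, P, u):
    """Propagate the state and covariance through the motion model (Eq. 3.1)."""
    x = F @ x + u
    P = F @ P @ F.T + Q
    return x, P

def update(x, P, z):
    """Correct the state with a range observation of the landmark (Eq. 3.2)."""
    y = z - H @ x                            # innovation
    S = H @ P @ H.T + R                      # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
    x = x + (K @ y).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

# One cycle: move the robot 1.0 unit, then observe a range of about 4.1 units.
x, P = predict(x, P, u=np.array([1.0, 0.0]))
x, P = update(x, P, z=np.array([4.1]))
print(x, np.diag(P))   # landmark estimate moves towards 5.1 and its variance shrinks
```

Even in this toy setting the quadratic cost noted above is visible: the covariance matrix P grows with the square of the number of landmarks, which is exactly the scaling problem that SEIF-style methods try to alleviate.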
Standard Kalman Filter methods cannot handle the detection of two indistinguishable landmarks at separate locations in the environment. The detection of two such landmarks introduces a multimodal distribution representing that landmark's location. This does not fit within the unimodal Gaussian noise assumption of a vanilla Kalman Filter technique. Most practical implementations sidestep this issue by ignoring non-unique environment features, with the obvious disadvantage of discarding a large amount of potentially useful information about the environment. Extensions such as the Lu/Milios algorithm described by Lu et al. (1997) use maximum likelihood data association to recover from incorrect correspondences. The method works well when errors in the robot pose remain small, but fails when the error in pose is large.

The complexity of Kalman Filter methods scales in a quadratic manner with the number of landmarks or features in the environment. Computation can rapidly become quite significant as environment sizes or experiment durations increase. In response to this problem the Sparse Extended Information Filter (SEIF) method was developed (Thrun, Koller et al. 2002; Eustice, Walter et al. 2005). The SEIF method is based around pruning of the information or inverse covariance matrix used in the Kalman Filter update equations, and exploitation of the sparse nature of the resulting matrix. The complexity of the SEIF methods generally scales with $O(N)$, making them more computationally tractable. There is, however, generally some degradation of the resulting map's correspondence to the environment. For instance, Eustice, Walter et al. (2005) report that there is significant global map inconsistency using SEIF methods. They propose a modified procedure that solves the global inconsistency problem but that is no longer a constant-time algorithm.

3.1.2 Expectation Maximisation Methods

The Expectation Maximisation algorithm has somewhat complementary attributes to Kalman Filter approaches (Dempster, Laird et al. 1977). Its most significant advantage is the ability to solve the correspondence problem – in ambiguous, cyclic environments robots using EM techniques are still able to generate correct maps. The EM algorithm involves two steps:

• an expectation step that calculates the posterior over the robot poses for a single map, and
• the maximisation step that calculates the most likely map given the expected robot poses.

This two step process is iterated to provide increasingly accurate maps of the environment. Each point in the estimated path of the robot is calculated using all temporal data, including both past and future data relative to the time associated with the pose point being calculated. The maximisation step finds the map that maximises the log likelihood of both the sensor measurements and the estimated robot's path. EM methods solve this problem for discrete (x, y) locations since there is no continuous solution.

EM techniques solve the correspondence problem by generating multiple hypotheses about the robot's path during the expectation step. Each path represents different correspondences between the sensory data and features in the environment. During the maximisation step these correspondences create the features in the resulting map.
In the next expectation step, features that correctly represent the environment help maintain correct estimates of the robot's path. Incorrect features generate unsupported pose estimates which gradually disappear through further iterations. This process gives EM methods the ability to map large cyclic environments, as long as the motion model is representative of the potential cumulative errors in odometry obtained at the end of a long loop in the environment. Fig. 3.1 shows an experiment run by Thrun (2002), in which a robot moved around a large cyclic indoor environment containing indistinguishable landmarks. The results of applying the EM algorithm are contrasted with the raw data. The EM algorithm was able to solve the correspondence problem between ambiguous landmarks and produce the topologically correct map shown in Fig. 3.1c.
Fig. 3.1. Maps produced with and without the EM algorithm. (a) Uncorrected trajectory of robot in a large cyclic environment with indistinguishable landmarks (shown by circles) (b) Occupancy grid map built using uncorrected odometry and sonar sensors. (c) Corrected trajectory and landmark locations using EM algorithm. (d) Resultant occupancy grid map (Thrun 2002). Reprinted with permission from Elsevier.
The main weakness of the EM algorithm is that, unlike Kalman Filter approaches, it does not maintain a full notion of uncertainty. This means that the algorithm can get 'stuck' in local maxima during its iterative two step map correction process. The algorithm is also computationally expensive, relegating it to offline use. The results shown in Fig. 3.1 took several hours to generate on a low-end PC.

3.1.3 Particle Filter Methods

Particle filters represent the belief about the robot's pose using a set of samples or particles distributed according to the posterior distribution over robot poses. In the
field of robot localisation these methods are often referred to as Monte Carlo Localisation (MCL) (Thrun, Fox et al. 2000). Rather than approximating the posterior in parametric form, MCL techniques use a distribution of weighted particles that approximate the desired distribution. This removes any restrictions regarding the shape or characteristics of the posterior distribution. MCL recursively uses a Bayes filter to estimate the posterior distribution of the robot poses. The filter estimates a probability density over the state space, using the sensory data. The belief set of robot poses, $Bel(x_t)$, is given by:

$$Bel(x_t) = p(x_t \mid d_{0 \ldots t}) \qquad (3.3)$$

where $x_t$ is the robot's pose state at time $t$, and $d_{0 \ldots t}$ is the sensor data from time 0 up to time $t$. In the context of robot localisation there are usually two types of data – internal and perceptual. Internal data is produced by sensors such as wheel encoders and inertial motion units and can be used to update information about the robot's motion. Perceptual data is produced by external sensors such as laser range finders, sonar arrays, and cameras. By splitting the data into perceptual and internal types, exploiting some assumptions, and performing repeated integration, one can arrive at the recursive update equation of the Bayes filter. One critical assumption is the Markov assumption – that the future state is independent of past states as long as the current state is known. By exploiting this assumption multiple times, the update equation for the robot's possible pose states, $Bel(x_t)$, is obtained as:

$$Bel(x_t) = \eta\, p(o_t \mid x_t) \int p(x_t \mid x_{t-1}, a_{t-1})\, Bel(x_{t-1})\, dx_{t-1} \qquad (3.4)$$

where $\eta$ is a normalisation constant, $o_t$ is the set of perceptual observations at time $t$, and $a_{t-1}$ is the set of internal observations at time $t-1$ (Thrun, Fox et al. 2000). This equation forms the basis for many of the Monte Carlo Localisation algorithms. If updates are based only on internal observations, the belief in robot pose will gradually spread out due to uncertainties in the motion of the robot. Fig. 3.2 shows a progression of sample approximations to a robot's pose as it moves around a rectangular path during an experiment by Thrun, Fox et al. (2000). The robot starts well-localised but becomes less and less certain of its location with each move it makes. The distribution shapes reveal the characteristics of the sampling model – uncertainties in angular motion are more critical to the localisation process than uncertainties in translational motion.

Fig. 3.2. The effect of the motion model on the robot's pose belief. As the robot moves around a rectangular path, the samples representing the belief in robot pose spread out according to the robot's motion model (Thrun, Fox et al. 2000). Reprinted with permission from Elsevier.
The $p(o_t \mid x_t)$ component of Equation 3.4 is the key to avoiding endless smearing of the sample approximation of the robot's pose. A range of probability densities are incorporated into the calculation of $p(o_t \mid x_t)$ to allow for uncertainty. Thrun, Fox et al. (2000) describe three types of probability density for use with range sensors: a Gaussian distribution used to model uncertainty in a sensor range measurement, an exponential density to model random readings from dynamic objects in the environment such as people, and a narrow uniform probability density located at the maximum range specification of the sensor, for maximum range scans when no obstacle is detected.

Once the robot has a partial map of the environment, repeated observations using the robot sensors support the particles located in plausible locations, while unsupported particles quickly disappear. Robots can localise rapidly in large complex environments using this method, with reasonable computational requirements. Another advantage of this method is the ease with which the number of samples can be changed, to suit whatever computational resources are available.
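To illustrate the predict–weight–resample cycle implied by Equation 3.4 (and not to reproduce any particular published implementation), the following sketch runs a one-dimensional Monte Carlo Localisation loop against a single known landmark; the noise parameters and landmark position are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

LANDMARK = 10.0            # known landmark position (the 'map' in this toy example)
MOTION_NOISE = 0.2         # std. dev. of the motion model
RANGE_NOISE = 0.5          # std. dev. of the range sensor model

def mcl_step(particles, control, observation):
    """One predict-weight-resample cycle of Monte Carlo Localisation (Eq. 3.4)."""
    # Predict: sample from p(x_t | x_{t-1}, a_{t-1}).
    particles = particles + control + rng.normal(0.0, MOTION_NOISE, particles.size)
    # Weight: evaluate p(o_t | x_t) for a range measurement to the landmark.
    expected = LANDMARK - particles
    weights = np.exp(-0.5 * ((observation - expected) / RANGE_NOISE) ** 2)
    weights /= weights.sum()
    # Resample: draw particles in proportion to their weights.
    idx = rng.choice(particles.size, size=particles.size, p=weights)
    return particles[idx]

# Start globally uncertain, then move and observe a few times.
particles = rng.uniform(0.0, 10.0, 500)
true_pose = 2.0
for _ in range(5):
    true_pose += 1.0
    z = (LANDMARK - true_pose) + rng.normal(0.0, RANGE_NOISE)
    particles = mcl_step(particles, control=1.0, observation=z)
    print(f"estimate {particles.mean():.2f}  truth {true_pose:.2f}")
```

The only Gaussian here is the sensor model; the belief itself is free to take any shape, which is exactly the flexibility the text above attributes to particle-based methods.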
The main limitation of particle filters in their raw form is that they are only a localisation technique, not a simultaneous localisation and mapping technique. Extensions of vanilla MCL methods and integration with other probabilistic algorithms have, however, yielded efficient solutions to the full SLAM problem. One such algorithm is FastSLAM, developed by Montemerlo, Thrun et al. (2002). Like standard MCL methods, this algorithm uses a particle filter to estimate the posterior over robot paths. This is only one of a number of estimates, however; to estimate the positions of landmarks in the environment, each particle possesses as many Kalman Filters as there are landmarks. Each Kalman Filter is used to estimate the location of a landmark conditioned on the path estimate. The resultant map stores the location of landmarks such as rocks (Montemerlo, Thrun et al. 2002) or trees (Montemerlo, Thrun et al. 2003).

Computationally, FastSLAM can be far less demanding than Extended Kalman Filter techniques. The only Gaussian distributions used are two-dimensional ones representing the (x, y) locations of landmarks. EKF methods need to maintain a five-dimensional Gaussian because the distribution needs to represent the robot's (x, y, θ) state in addition to the landmarks' (x, y) states. This advantage alone allows the computation of the FastSLAM algorithm to scale with $O(MK)$, with $M$ the number of particles and $K$ the number of landmarks (and hence Kalman Filters per particle). Montemerlo, Thrun et al. (2002) use a binary tree to represent the set of Gaussians associated with each particle to reduce the computation time to $O(M \log K)$. A second version of the FastSLAM system has also been developed, known as FastSLAM 2.0 (Montemerlo, Thrun et al. 2003). This method has been proven to converge with a single particle, and is more efficient than the first FastSLAM implementation.

A recent approach by Grisetti, Stachniss et al. (2005) uses Rao-Blackwellized filters (Murphy and Russell 2001), with each particle representing a possible map and robot trajectory. In contrast to FastSLAM, the Rao-Blackwellized approach is used to build grid maps showing the occupancy state of the environment in a fine-grained grid. Although computationally similar to the FastSLAM algorithm, this method does not rely on predefined landmarks. The method was able to generate accurate high resolution grid maps in a number of large indoor and outdoor environments.

The three probabilistic algorithms – Kalman Filters, Expectation Maximisation, and particle filters – form the core of most of the top robotic mapping systems. They are typically used to form occupancy grid (Grisetti, Stachniss et al. 2005) or landmark-based maps (Nieto, Guivant et al. 2003). However, one other major type of world representation has been successfully used by mobile robots. The following section reviews topological mapping methods, which represent places in an environment and their connectivity.
3.2 Topological Mapping Methods

Topological approaches produce maps that represent places in the environment and how these places are connected. The concept of 'place' varies between different techniques. For some, a place is a local area of open space bounded by entrances (Kuipers, Modayil et al. 2004). Other techniques determine what constitutes a place by analysing the visual scene obtained from the robot's camera. The representation of place connectivity also varies – ranging from the binary knowledge of whether two places are connected to the detailed knowledge of the motor commands or movement behaviours required to move between the two places.

In one of the seminal pieces of work on topological mapping, Matarić (1990) describes a topological system that constructs a node-based map of an environment using landmarks detected by a robot's sonar sensors. The map is literally a graph of landmarks in the environment. Links are formed between nodes to represent topological adjacency. The lack of explicit Cartesian information in the graph map resulted in the optimality problem of choosing between the shortest topological and the shortest physical path when navigating to a goal. To address this problem a notion of time-as-distance was introduced, which allowed landmark nodes to be assigned an approximate physical size based on the robot's transition time across the landmark.

Vision-based topological approaches learn the visual scenes associated with places in the environment. Often only important places in the environment are learned – for instance in a maze environment the robot may only learn visual scenes at critical junctions
in the maze, but not during corridor traverses (Franz, Scholkopf et al. 1998). Other algorithms base the learning of places directly on the visual sensory input – unique or feature-rich snapshots of the environment trigger the learning of a place. Often it is effective to simply learn places at regularly spaced intervals. Rybski, Zacharias et al. (2003) describe a mapping system that learns places every 1.07 metres the robot travels, with each place associated with a panoramic image. Other systems learn new places only when the previously stored image is insufficient for representing the environment, using difference methods such as the Sum of Squared Differences (SSD) (Vassallo, Santos-Victor et al. 2002).

Hybrid topological methods use range sensors to generate local Cartesian occupancy maps which are used as places in the map, even though the global representation is topological. Kuipers, Modayil et al. (2004) describe local perceptual maps (LPMs), which are bounded occupancy grids generated using a laser range finder. LPMs are inter-connected by gateways, which are areas of minimum obstacle clearance around each LPM, and the travel actions required to move between the LPMs. This approach produces accurate local metrical maps while retaining the computational advantages of a global topological map. However, the recognition and use of LPMs does require the use of a high accuracy range sensor and needs to be customised to suit the type of environment. A similar approach by Jefferies and Yeap (2000) combines local Absolute Space Representations (ASRs) with a global Memory For the Immediate Surroundings (MFIS).

The computational requirements of hybrid topological methods can be significantly reduced by storing space histograms in x and y, rather than storing local occupancy grid maps (Duckett and Nehmzow 1999). Localisation is achieved by convolving the current histograms with the stored histograms to find the best matching location. This method requires a compass to remove the problem of self-orientation, but results in only minimal information loss when compared with storing the full local occupancy grids. To address the problems of perceptual aliasing and sensor noise, an iterative localisation algorithm is used to maintain multiple robot location hypotheses. Even after becoming completely lost in an ambiguous environment, a robot was able to use the algorithm to re-localise.

Transition information between places can be learned in a number of ways. Vassallo, Santos-Victor et al. (2002) describe a topological mapping system where the robot learns a motor vocabulary representing the robot's set of actions for moving around an environment. Links in the final topological map are associated with a specific motor word that describes the movement action required to go between the linked nodes. This approach integrates the motion abilities of the robot into the topological map and provides a more intuitive form of navigation, when compared to routes that are specified by (x, y, θ) co-ordinates. Other hybrid systems such as the MFIS represent transitions between places by exits, which are physical gaps at the boundary of an ASR that the robot can move through into another ASR.
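As a rough illustration of the histogram-based localisation idea attributed above to Duckett and Nehmzow (1999), the sketch below summarises a local scan as x and y histograms and matches it against stored place signatures. For simplicity it scores matches with a sum of squared differences rather than the convolution used in the original work, and all data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)

def place_signature(points, bins=16, extent=5.0):
    """Summarise a local range scan as histograms of obstacle positions in x and y."""
    hx, _ = np.histogram(points[:, 0], bins=bins, range=(-extent, extent))
    hy, _ = np.histogram(points[:, 1], bins=bins, range=(-extent, extent))
    return hx.astype(float), hy.astype(float)

def match_score(sig_a, sig_b):
    """Lower is better: sum of squared differences between the two signatures."""
    return sum(np.sum((a - b) ** 2) for a, b in zip(sig_a, sig_b))

def localise(current_scan, stored_places):
    """Return the name of the stored place whose signature best matches the scan."""
    sig = place_signature(current_scan)
    return min(stored_places, key=lambda name: match_score(sig, stored_places[name]))

# Build two synthetic 'places' from clusters of obstacle points, then localise a noisy revisit.
corridor = rng.normal([0.0, 2.0], [3.0, 0.2], size=(200, 2))
doorway = rng.normal([1.0, 0.0], [0.3, 0.3], size=(200, 2))
stored = {"corridor": place_signature(corridor), "doorway": place_signature(doorway)}
revisit = corridor + rng.normal(0.0, 0.1, corridor.shape)
print(localise(revisit, stored))   # expected: corridor
```

The compact signatures make comparisons against every stored place cheap, which is the computational saving the text above describes, at the cost of some discriminative power in highly self-similar environments.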
3.3 Exploration, Navigation, and Dealing with Change

Many methods can solve the SLAM problem using an appropriate set of assumptions, such as the assumption that the environment is static. However, SLAM in a static
environment is only one part of the complete mapping and navigation problem. This section describes the additional problems of exploring an unknown environment, navigating to goals, and learning and coping with the dynamic aspects of an environment.

3.3.1 Exploration

Exploration has been defined as "…the task of guiding a vehicle during mapping such that it covers the environment with its sensors" (Stachniss and Burgard 2003). Research into exploration methods has usually addressed the problem in isolation, rather than together with the SLAM problem. Frontier-based exploration is one of the simplest techniques and has been widely tested (Edlinger and von Puttkamer 1994; Yamauchi 1997). Using this exploration technique the robot moves towards boundaries or frontiers between known and unknown parts of the environment. The greedy implementation of this technique guides the robot towards the closest unknown frontier, which has been shown to result in reasonably constrained travel distances (Koenig, Tovey et al. 2001). Frontier-based exploration is relatively easy to integrate into most existing mobile robot architectures, although its performance in realistic long-term robot experiments with uncertainty is untested.

More sophisticated methods take into account other factors such as environment uncertainty and the knowledge gain potential of candidate locations for exploration. For instance, the Next-Best-View (NBV) algorithm evaluates new candidate locations based on the expected information gain, the motion cost required to move there, and the overlap between the current and candidate 'safe region' (Gonzalez-Banos and Latombe 2002). Unlike the greedy frontier-based technique, the robot may choose a more distant location if it considers that location to have more information gain potential than closer places. Fig. 3.3 shows the NBV candidate selection process. The overlap criterion is an example of incorporating the requirements of the localisation and mapping system into the exploration process. In this case, the robot will only move to new regions that overlap somewhat with the current observable region, so it can remain localised. In experiments, the NBV algorithm was able to perform at a comparable level to a human operator. Other metrics apart from knowledge gain can be used to evaluate potential exploration paths, such as sensing time and planning time (Moorehead, Simmons et al. 2001).

Most of the methods described so far rely on some form of Cartesian occupancy grid map. The binary 'occupied or not' representation is unsuitable for representing the uncertainty the robot may have regarding a cell's state. Coverage maps address this uncertainty by storing a posterior about the amount of a cell that is covered by an object (Stachniss and Burgard 2003). To explore the environment the robot must reduce the entropy of the posterior for a cell in the coverage map. This measure also allows the robot to determine how thoroughly a cell has been sensed, by monitoring the reduction in cell entropy on consecutive viewings.

In metric, feature-based maps exploration can be performed by navigating towards the open end of the feature closest to the robot's starting location, such as a wall segment (Wullschleger, Arras et al. 1999). Under this exploration method the starting location is the position in which the robot starts at the beginning of the experiment, not the starting position before each exploration segment. This method ensures the robot regularly traverses a well known area that is suitable for re-localisation before exploring each new area.
Fig. 3.3. Candidate selection using the NBV algorithm. After five sensing operations the robot has identified an unexplored frontier (a), which is sampled for candidate exploration points (b). Each candidate point has a potential visibility gain (c), but is also evaluated based on other criteria such as motion cost to produce an optimal candidate (d) (Gonzalez-Banos and Latombe 2002). Reprinted with permission from Sage Publications.
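Returning to the frontier-based strategy described at the start of this subsection, the following sketch shows its core step in a minimal form: find free cells that border unknown cells, then greedily head for the nearest one. The grid encoding and the small example map are illustrative assumptions, not taken from the cited systems.

```python
import numpy as np

FREE, OCCUPIED, UNKNOWN = 0, 1, -1

def frontier_cells(grid):
    """Return (row, col) of free cells that have at least one unknown 4-neighbour."""
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != FREE:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] == UNKNOWN
                   for nr, nc in neighbours):
                frontiers.append((r, c))
    return frontiers

def nearest_frontier(grid, robot):
    """Greedy frontier-based exploration: pick the closest frontier cell."""
    frontiers = frontier_cells(grid)
    if not frontiers:
        return None                       # environment fully explored
    return min(frontiers, key=lambda f: (f[0] - robot[0]) ** 2 + (f[1] - robot[1]) ** 2)

# A small partially explored map: left half known free, one obstacle, unknown right half.
grid = np.full((5, 8), UNKNOWN)
grid[:, :4] = FREE
grid[2, 3] = OCCUPIED
print(nearest_frontier(grid, robot=(2, 0)))   # a free cell bordering the unknown region
```

The NBV and coverage-map approaches discussed above can be read as replacing the simple distance criterion in nearest_frontier with richer scores such as expected information gain or entropy reduction.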
Large complex environments can cause many SLAM methods to fail due to their inability to close the loop under significant odometric error (Lu and Milios, 1997). Exploration techniques that rely purely on metric information can subsequently fail. One solution to this problem is to use exploration methods that use a hybrid map combining metric and topological information. One example is the exploration method shown in Fig. 3.4, which uses the Spatial Semantic Hierarchy (SSH) (Kuipers and Byun 1991). Distinct places in the environment are stored as nodes in the global topological map, with each node storing a local metric map. The metric map can be analysed to find the possible exploration routes from each distinct place (for instance, paths through doorways). The robot maintains an exploration agenda for each node, which stores information about where and in which direction it should explore. The method's success relies on the ability of the mapping module to distinguish between similar places. The method can fail if the uniformity of the environment is too great.
Fig. 3.4. Exploration method using the Spatial Semantic Hierarchy. DP – distinctive place, nbhd – neighbourhood, LCS – local control strategies (Kuipers and Byun 1991). Reprinted with permission from Elsevier.
3.3.2 Navigating to Goals

Planning a route to a goal and following it in a simulated world is a relatively easy task, and has been the subject of much research. However, success in simulation is by no means an indicator of whether a method will work well on a robot in a complex environment. This section's primary focus is on goal navigation techniques that have been tested on robots in real world environments.

Approaches to equipping a robot with a goal navigation capability have varied, depending on the underlying localisation and mapping systems. One of the few methods used on a real robot in a complex environment is the motion planning algorithm used by the interactive museum tour-guide robot Minerva (Thrun 2000). Minerva operated in large, open area environments that were densely populated by moving people. Maintaining localisation was very important in such an environment, so the navigation system used a coastal planner to stay close to known objects (Roy, Burgard et al. 1999). The coastal planner method used Partially Observable Markov Decision Processes (POMDPs) to minimise the expected cumulative cost of the actions required to reach a goal. In this case, cost was the robot's ability to localise along the path. The algorithm emphasised minimisation of localisation error at the expense of path length. The planning algorithm also relied on the Cartesian correctness of the map, which was generated a priori. Minerva was however able to robustly navigate around its environment during a two week period using this motion planning algorithm.

The Exploration and Learning in Dynamic ENvironments (ELDEN) system uses an adaptive place network (APN) to learn the topology and geometry of an environment (Yamauchi and Beer 1996). Place units in the network represent regions in Cartesian space and links between place units represent the spatial relationship between
the units and the robot's confidence that the link is traversable. Links are bidirectional and store both the transition from place unit A to place unit B and from B to A. When a goal is specified, the ELDEN system assigns links a cost that is inversely proportional to the robot's confidence about the traversability of the link. Dijkstra's algorithm is used to find the lowest cost path from each place unit to the goal location. Each place link is then assigned an orientation indicating the movement direction required to move closer to the goal. After this algorithm has been applied, each place unit has only one link that it can traverse to move closer to the goal. The robot uses its reactive movement behaviours to follow these links to the goal.

ELDEN was tested in a 14 × 8 metre laboratory area containing a range of static and dynamic objects, including people, boxes, chairs, desks, and carts. After twenty minutes of exploration, the system navigated successfully to several goal locations even in the presence of dynamic obstacles. However, the re-localisation procedure underpinning this system is relatively crude. Localisation success relies on the robot being able to navigate back into the vicinity of its starting location, before performing a dead reckoning manoeuvre to the centre of this region and re-localising using a hill-climbing grid matching routine. The dependency on successful homing means the system cannot re-localise from large odometric errors, or close the loop. To address this limitation, Yamauchi (1996) proposed a system of storing local evidence grids with each unit, with re-localisation achieved by matching the current evidence grid to the stored grids. The proposed approach has similarities to the Spatial Semantic Hierarchy's use of local perceptual maps and would also require that local regions be uniquely identifiable.

In conducting a review of conventional goal navigation methods for robots, it quickly becomes apparent that there is little published literature on techniques that have been integrated into SLAM systems. There has been related path planning research in simulated environments (Hu, Khang et al. 1993; Marzouqi and Jarvis 2004; Niederberger and Gross 2004), or in structured physical environments (Browning 2000; Tews 2003). Very little research has addressed the problem in scenarios where the environment is initially unknown and complex enough to effectively test a system's SLAM abilities, such as the ability to close large loops. This is somewhat surprising given the range of algorithms with good SLAM performance, such as the Spatial Semantic Hierarchy or FastSLAM. There are two possible reasons for this lack of research. It is possible that the conventional SLAM systems have fundamental characteristics that make them unsuitable for use in goal recall; this is discussed further in Chapter 6. The other reason may be that these areas have simply not yet been explored; only in the last few years has the robot mapping and navigation field reached the stage where several capable SLAM systems exist that are suitable for testing goal navigation methods. Either possibility is a good reason for the further investigation carried out in this book.
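Returning to the ELDEN-style planning step described earlier in this subsection, a minimal sketch of confidence-weighted path planning is given below: Dijkstra's algorithm is run outwards from the goal over a place network whose link costs grow as link confidence falls. The reciprocal cost function and the toy network are my own illustrative assumptions rather than details taken from Yamauchi and Beer (1996).

```python
import heapq

def plan_to_goal(links, goal):
    """Dijkstra from the goal outwards over a place network.

    links: {place: {neighbour: confidence}} with confidence in (0, 1].
    Returns {place: (cost_to_goal, next_place_towards_goal)}.
    """
    table = {goal: (0.0, None)}
    frontier = [(0.0, goal)]
    while frontier:
        cost, place = heapq.heappop(frontier)
        if cost > table[place][0]:
            continue
        for neighbour, confidence in links.get(place, {}).items():
            step = 1.0 / confidence              # low confidence -> high traversal cost
            new_cost = cost + step
            if neighbour not in table or new_cost < table[neighbour][0]:
                table[neighbour] = (new_cost, place)   # move towards 'place' to approach the goal
                heapq.heappush(frontier, (new_cost, neighbour))
    return table

# Four place units; the direct A-D link has low confidence (e.g. it was recently blocked).
links = {
    "A": {"B": 0.9, "D": 0.2},
    "B": {"A": 0.9, "C": 0.8},
    "C": {"B": 0.8, "D": 0.9},
    "D": {"C": 0.9, "A": 0.2},
}
plan = plan_to_goal(links, goal="D")
print(plan["A"])   # prefers the confident route via B and C over the unreliable direct link
```

Because each place ends up with a single preferred outgoing link, a reactive controller only needs to follow one local direction at a time, which matches the behaviour-based execution described above.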
3.3.3 Dealing with Dynamic Environments

Most real world environments are dynamic to some extent. To function fully in dynamic environments, a mapping and navigation system must address the extra issues arising from changes in the environment. The dynamic nature of an environment can vary significantly in timescale and characteristics, making the problem of mapping and navigating challenging. The difficulty of the problem has restricted the amount of research into mapping and navigation in dynamic environments – the vast majority of SLAM research assumes the environment is effectively static. This section focuses on mapping and navigation methods that specifically address at least some aspects of the dynamic problem.

In environments where moving people are common, dynamic objects may be filtered out from the sensory data so as to avoid corruption of the localisation process. This is a relatively simple process in that, although the dynamic objects are being identified, they are not actively being learned or used as part of the mapping and navigation process. Range sensor readings can be used or discarded based on the calculated probabilities of the reading representing a known or unknown dynamic object. This approach has been successfully implemented using a robot equipped with a 2D laser range finder in a museum environment with dense crowds of people (Fox, Burgard et al. 1999). Other approaches have tracked the location of people over time using a conditional particle filter and a 2D laser range finder (Montemerlo, Thrun et al. 2002). Using a pre-learned map, the algorithm was able to localise, track moving people and also detect new permanent obstacles in the environment. In this case the robot started with a reasonably current map, and only performed localisation, rather than full simultaneous localisation and mapping.

Some methods attempt to track the discrete movement, appearance, and disappearance of objects in the environment over time. These techniques focus on learning the different locations of dynamic objects rather than on tracking their actual movement. One such technique, known as ROMA, uses the Expectation Maximisation algorithm to learn models of non-stationary objects. It uses a map differencing technique that detects changes in an environment (Biswas, Limketkai et al. 2002). The technique is sufficient to identify and track the movement of unique objects using a laser range finder, and can handle variation in the total number of objects over time. This method is limited to obstacles detected in the scanning plane of the laser and requires the cross section of the object to remain constant after movement, allowing only rotations around the object's vertical axis. Objects that are placed close together also cause recognition problems. These problems are not necessarily solvable using a laser range finder, suggesting that the approach may be overly precise and that a more robust and approximate process may be more suitable.

The other main area of research into mapping and navigation in dynamic environments has focused on adapting topological representations of an environment to represent changes. The ELDEN system uses the concept of places and confidence links between them to represent the robot's certainty that the link can be traversed (Yamauchi and Beer 1996). The links also store the direction required to traverse the link. Three situations can result in the link confidence value being reduced when the robot attempts to get from A to B: the robot leaves A to get to B but through motor error or sensor noise ends up somewhere else; an unexpected obstacle causes the robot to end up in a different location than B; and the pathway between A and B is completely blocked and the robot cannot make any progress.
This type of approach differs significantly from those described earlier in that it implicitly rather than explicitly encodes changes in the environment. The robot’s representation of the environment is updated based on the robot’s success in moving
around it, rather than through explicit identification of dynamic obstacles. The implicit learning of change has some advantages over explicit approaches, such as removing the dependence on accurate recognition of objects. The robot does not need to know the exact shape, size, and appearance of an obstacle placed in its path, but only that the obstacle has caused it to take a slight detour.
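At the explicit end of this spectrum, the map differencing idea used by ROMA can be illustrated in a heavily simplified form: two occupancy grids of the same area are compared cell by cell, and cells that have become occupied or free are flagged as candidate object movements. The grid encoding and example below are illustrative assumptions only, not details from Biswas, Limketkai et al. (2002).

```python
import numpy as np

FREE, OCCUPIED = 0, 1

def map_difference(old_grid, new_grid):
    """Compare two occupancy grids of the same area and report changed cells.

    Returns (appeared, disappeared): boolean masks of cells that have become
    occupied or become free since the old map was built.
    """
    appeared = (old_grid == FREE) & (new_grid == OCCUPIED)
    disappeared = (old_grid == OCCUPIED) & (new_grid == FREE)
    return appeared, disappeared

# A box has been moved one cell to the right between the two mapping passes.
old_map = np.zeros((5, 5), dtype=int)
new_map = np.zeros((5, 5), dtype=int)
old_map[2, 1] = OCCUPIED
new_map[2, 2] = OCCUPIED

appeared, disappeared = map_difference(old_map, new_map)
print("appeared at", np.argwhere(appeared))        # [[2 2]]
print("disappeared at", np.argwhere(disappeared))  # [[2 1]]
```

The fragility noted above becomes apparent here: associating an 'appeared' region with a 'disappeared' one as the same object relies on the footprint remaining recognisable, whereas the implicit, confidence-based approach avoids that recognition step entirely.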
3.4 Discussion

In summary, there has been a great deal of research into developing robotic mapping and navigation systems. The majority of the most successful current robotic mapping methods are to some degree probabilistic in order to deal with the inherent uncertainty of sensors and the environment. Under appropriate conditions some of these systems can solve the core SLAM problem in large, real world, static environments. There has been relatively little research into the concurrent problems of real world exploration, path planning, and adaptation to environment change, which are also integral parts of the mapping and navigation problem. Each of these challenges has been extensively addressed in isolation and in simulation, but rarely in parallel with the SLAM problem on an actual robot. It remains to be seen whether the mapping algorithms used by the 'best' SLAM systems are indeed optimal when the entire mapping and navigation problem is considered.
4 Biological Navigation Systems
Many animals possess formidable navigation abilities. Conducting research into the exact capabilities and limitations of the mapping and navigation systems used in nature provides two valuable outcomes: knowledge of what animals and insects can achieve using their own sensory and computing equipment, and theories and models of the mechanisms they use in the process of doing so. The first outcome is important because it demonstrates that many navigation tasks can be achieved with a surprisingly simple set of sensors. For instance, it is not necessary to have a high fidelity laser range finder to navigate effectively in a range of environments – most animals do not possess sensors that can measure the range to obstacles from a few millimetres to ten or more metres away, to a precision of ten millimetres. Experiments have demonstrated that rats can continue to navigate effectively even in pitch blackness, with the implication that some navigation systems can cope without a constant stream of feature-rich information (Quirk, Muller et al. 1990; Mizumori and Williams 1993). Understanding the achievements of biological agents, particularly given their sensory input, provides much insight into the robot mapping and navigation problem faced by researchers.

This chapter examines the mapping and navigation capabilities of a variety of animals. The following chapter then covers attempts by researchers to create computational models of these biological systems.
4.1 Rodents and the Cognitive Map

In the field of biological mapping and navigation, rodents are one of the most studied animals. There is a well established base of research into their capabilities in a large range of experimental environments and situations. In addition, the rodent hippocampus is one of the most studied brain regions of any mammal (Fig. 4.1).

Fig. 4.1. The rodent hippocampus. CA1, CA2, CA3: cornu ammonis fields 1–3; DG: dentate gyrus; EC: entorhinal cortex; f: fornix; s: septal hippocampal pole; S: subiculum; t: temporal hippocampal pole (Amaral and Witter 1995; Cheung and Cardinal 2005). Reprinted with permission from Elsevier.

This has led to a number of well established models of mapping and navigation in rodents, although there are still many areas of debate and controversy. At the core of this research is the concept of a cognitive map, first introduced by Tolman (1948). The cognitive map concept embodies the hypothesis that an animal or insect can learn information not specifically related or useful to the current task being
performed. Tolman discussed this latent learning with respect to rats. Rodents released in an environment with no reward food source were able to use the information learned at that time to later navigate to food sources placed in the environment. In the context of rodent navigation, the cognitive map is often theorised to be a spatial reference that some animals may create and use to navigate in their environments. Early work with rats identified place cells in the rodent hippocampus, which appeared to respond to the spatial location of the animal (O'Keefe and Dostrovsky 1971; O'Keefe and Conway 1978). Later work also identified another set of neurons that responded to the animal’s orientation, which were subsequently called head direction cells (Ranck 1984). Recent research has also uncovered evidence of grid cells, which
fire at regular grid-like intervals in the environment (Fyhn, Molden et al. 2004; Hafting, Fyhn et al. 2005). However, most of the models of the rodent hippocampus to date have focused on head direction and place cells. The following sections describe theories of the mechanisms by which rats map and navigate around their environments.

4.1.1 Head Direction and Place Cells

Head direction cells are sensitive to the orientation of the rodent's head, with each cell's firing rate peaking when the rodent's head is facing in a specific direction (Ranck 1984; Taube, Muller et al. 1990). The orientation of peak firing is often referred to as the cell's preferred direction or orientation. A cell's firing rate reduces as the rodent's head rotates away from the cell's preferred orientation (Fig. 4.2). The firing rates are dependent on the allocentric orientation of the head, and do not depend on the orientation of the body (Taube, Muller et al. 1990). Firing rates appear to be mostly independent of the rat's location or current behaviour (motionless, walking, running, eating, and grooming), although some head direction cells appear to depend on factors such as the animal's translational and angular velocity. For instance, recent work by Sargolini (2006) found a positive relationship between rodent speed and head direction cell firing rate for a majority of the observed head direction cells.
Fig. 4.2. Firing rate of cells as a function of head direction. Recording took place over an eight minute period using 6° bins (Taube, Muller et al. 1990). Copyright 1990 by the Society for Neuroscience.
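Purely as an illustrative aside (this is not a model presented in this book), the bell-shaped tuning evident in Fig. 4.2 can be summarised by a simple function of the angular offset between the current head direction and a cell's preferred direction; the peak rate and tuning width below are arbitrary choices.

```python
import numpy as np

def hd_cell_rate(head_direction, preferred, peak_rate=40.0, width=25.0):
    """Toy head direction cell: firing rate falls off as the head turns away
    from the cell's preferred direction (angles in degrees, rate in Hz)."""
    error = (head_direction - preferred + 180.0) % 360.0 - 180.0   # wrap to [-180, 180)
    return peak_rate * np.exp(-0.5 * (error / width) ** 2)

for heading in (90, 120, 180, 270):
    print(heading, round(float(hd_cell_rate(heading, preferred=90.0)), 1))
```

A population of such cells with preferred directions spread around the circle gives an activity bump whose position encodes heading, which is the representation the models discussed in Chapter 5 typically build upon.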
The firing rates of head direction cells are controlled by both allothetic and ideothetic information. Vestibular and proprioceptive signals provide the rat with the ability to integrate rotation movement over time. This allows the firing profile in the head direction cells to be updated with changes in orientation of the head during movement in an environment. Experiments described by Mizumori and Williams (1993) observed the directional firing qualities of cells for rats in the dark. Directional information was retained even in the absence of any visual cues for short periods of time, and was updated by ideothetic input. Allothetic cues also influence the firing of head direction cells. Experiments have shown that rotating a dominant visual cue within an experimental arena can cause approximately the same shift in the preferred orientation of the head direction cells (Taube, Muller et al. 1990; Taube, Muller et al. 1990). Like robots, rats that are forced to rely only on self-motion information inevitably build up large position errors over
time (Etienne, Maurer et al. 1996). So while it seems that ideothetic sensory information may be the primary means by which a rat updates its perceived position (Arleo 2000), it is likely that rats also rely on external cues to maintain the accuracy of their head direction system (Knierim, Kudrimoti et al. 1995; Etienne, Maurer et al. 1996).

Place cells are the two-dimensional cousins of head direction cells, corresponding to the rat's location within the environment. Place cells fire maximally when the rat is located at a certain location in the environment, and fire to a lesser degree as the rodent moves away from this location. The area of the environment within which a particular place cell fires is known as the place field of that cell. However, firing of place cells is also affected by a number of other factors, such as visual cues, barriers, and food reward locations (Knierim, Kudrimoti et al. 1995; Gothard, Skaggs et al. 1996; Knierim 2002). Like head direction cells, place cell activity can change to reflect movement of the rodent even in the absence of visual cues (O'Keefe and Conway 1978; Muller and Kubie 1987; Quirk, Muller et al. 1990). In experiments with rats that had been blind from an early age, Save, Cressant et al. (1998) demonstrated that the place cells still exhibited normal functionality. Firing rates were, in general, lower in the blind rats than in sighted rats, with one possible reason being that place cell firing rates depend on a combination of sensory inputs, with the deprivation of visual input resulting in a lower overall firing rate.

Place cells also respond to visual cues. When distal cues are rotated, place fields have been observed to rotate a corresponding amount (O'Keefe and Conway 1978; Muller and Kubie 1987). Local visual cues appear to have less influence, except when they are proximal to a reward location of some kind (Redish 1999). Inconsistent path integration and external cue information can also compete to control place cell firing, as demonstrated in experiments by Gothard, Skaggs et al. (1996). Place fields have different shapes in geometrically different environments; for instance, Muller, Kubie et al. (1987) describe experiments in cylindrical and rectangular environments that produce differently shaped place fields. These experiments also introduced barriers of varying size into an arena. The barrier's effect depended on its size; small barriers that did not change the rat's movement significantly had little effect, but large barriers disrupted place fields that overlapped the barrier. This suggested that locomotion information could influence the formation and shape of place fields (Arleo 2000).

One important characteristic of the hippocampal electroencephalogram is the theta rhythm, an observable sinusoidal rhythm of between 7 and 12 Hz (O'Keefe and Recce 1993). This rhythm manifests itself in the place cell firing through a phase correlation. As the rat enters the place field of one specific place cell, the place cell initially fires late in the theta period. As it moves through the place field, the place cell fires at progressively earlier times in the theta period. This phase shift is known as phase precession. While the firing of a place cell merely locates the rat somewhere within that cell's place field, the phase precession gives the location of the rat within that place field. The theta rhythm is thought to play a possible role in goal navigation processes, which are discussed in the following section.
Many of these theories of mapping and localisation in the rodent hippocampus have been tested using computational models. This work is explored in detail in Chapter 5.
4.1.2 Exploration, Navigation, and Dealing with Change

Compared to the wealth of experimental evidence regarding spatial mapping in the hippocampus, there is relatively little information suggesting how a rat uses these internal representations to navigate between places. The nature of research regarding goal recall tends to be more speculative. It is relatively easy to observe the correspondence between firing rates of different cells and the location of a rat, but much harder to link these firing rates to a mechanism of purposeful navigation. The literature on this topic contains a number of interesting observations and proposed models of task-orientated behaviour.

In T-maze experiments, rodent hippocampal cells were found to fire differently while the rat was traversing the central maze stem depending on whether it turned left or right at the T-intersection (Wood, Dudchenko et al. 2000). Further tests and statistical analysis found that the firing difference was most dependent on the turn taken rather than any other factors such as speed, heading, or position. These results suggest that the hippocampus may encode specific episodic memories which might be used in performing navigation tasks. Other studies have found cells that rely on a larger number of variables: the rodent's location, significance of that location, and behaviour (Markus, Qin et al. 1995). In tests in alternating dark and light environments, some place cells appeared to fire based on the recent experience of the rat, indicating the hippocampus has access to short-term memory (Quirk, Muller et al. 1990).

When a rat repeatedly navigates between two reward locations, some hippocampal neurons gradually change their firing patterns and eventually fire maximally when near one of the reward locations (Kobayashi, Tran et al. 2003). When placed in a different environment without rewards, the firing patterns were maintained if the rat navigated purposefully towards an expected goal location, but were not observed in randomly moving rats. The experimental results supported the theory that during purposeful behaviour the rat develops hippocampal cells that encode a specific action of the rat at a specific location – in effect place-action cells.

It is quite difficult to account for the range of experimental observations of rats using only one of the spatial or non-spatial models of hippocampal function; combined models are the most successful at explaining the wide range of observations. Some research has focused on developing a plausible framework relating to how this information is distributed in the hippocampus. One model posits distinct clusters of neurons distributed longitudinally along the hippocampus (Hampson, Simeral et al. 1999). In the model, each cluster contains neurons that respond to a specific task involving spatial navigation. Spread throughout all these clusters there are also cells that encode the non-spatial aspects of the tasks. The stronger anatomical structuring of spatially-related cells suggests their use as a base mode of spatial-only navigation. The authors suggest that repeated specific activity would form neuronal firing patterns within each of these clusters in response to behaviourally significant events, through behavioural or cognitive manipulations. Some simulated models have used the hippocampal theta rhythm as the basis for goal-directed spatial navigation (Gorchetchnikov and Hasselmo 2002).
In simulation experiments, long term potentiation (LTP) of excitatory recurrent connections between two bodies of cells occurred while the simulated rat performed food foraging behaviour in a T-maze (Koene, Gorchetchnikov et al. 2003). To recall known paths, cell activity
was introduced into these two bodies of standard cells in order to trigger predictive firing of a third body of cells that coded for the goal direction. The model produced several useful insights into learning and goal recall mechanisms. For instance it suggests that the theta rhythm can cause distinct learning and retrieval phases, and that it can help maintain a buffer of cell activity which is required for episodic memory.
4.2 Other Animals and Insects

Since this book concerns a pragmatic rather than biologically faithful approach to solving the mapping and navigation problem, it is useful to examine the capabilities of other animals apart from rats. This section describes the mapping and navigation abilities of bees, ants, primates, and humans.

4.2.1 Bees

Insects are generally examined slightly differently to other animals due to their size and lack of a mammalian brain – much of the research focuses on their navigation capabilities. Theories regarding the models or mechanisms by which they navigate are mostly inferred from their actions in various experiments. Honey bee workers fly many hundreds of kilometres during their lifetime foraging for nectar and pollen. They are equipped with a range of different sensors to help them move around in their environment. Their primary sensors are two compound eyes, with each eye made up of thousands of light sensitive cells. The compound eyes can detect both quantities of light and colour. In contrast, the three simple eyes arranged in the shape of a triangle on the top of the bee’s head are thought to primarily detect the amount of light present. The antennae on the bee’s head are thought to be used for both odour detection and flight speed measurement. Fig. 4.3 shows some of the basic anatomy of a honeybee worker.

Foraging honeybees are known as central-place foragers because they have a common point to start and finish their foraging trips. Typically, the locations to which they forage are within two to three kilometres of this central location, although they have been observed to forage up to 13.5 km from their colony location (Frisch 1967; Ratnieks 2000). A large body of work has focused on what type of navigation methods bees might use to perform their impressive feats of navigation. Most of the models of bee navigation have been developed by subjecting bees to carefully devised sets of tests aimed at isolating various aspects of the navigation problem. This process of reverse engineering is not an exact science, and seemingly identical tests by different researchers have produced significantly different results. As a consequence, there is a wide range of theories of bee navigation. Researchers do agree on some components of the bee’s navigation process, such as the method of spatial communication between bees (Frisch 1967; Michelsen 1993). Upon returning to the hive, scout bees perform either a round dance or waggle dance to inform other foragers about the direction and distance of the food from the hive. The waggle dance conveys this information in two different ways. The direction of the straight dance segment, relative to the vertical surface the bee is dancing on, tells other bees the direction of the food source relative to the sun. The duration of the
Fig. 4.3. Anatomy of a worker bee
waggle part of the dance indicates how far away the food source is from the hive. The simpler round dance is used for food sources close to the hive and indicates only that food is nearby, without conveying its direction.

Models of a bee’s mapping and navigation processes vary from one extreme to another. Wehner, Bleuler et al. (1990) propose the local rule concept: that bees navigate using a number of independent memories. Under this model, bees navigate using a combination of dead-reckoning and route-specific landmark memories. Route-specific landmarks are used in the temporal order that they were learned. They are not stored in a general map context and hence cannot be used to generate novel behaviour. Instead, the local cue memories affect behaviour in specific situations by prompting a change of behaviour. The cognitive map models as proposed by Gould (1986) are at the other end of the model spectrum. Gould presents experiments that suggest that it is not only vertebrates that are able to form and use some sort of cognitive map, but also some invertebrates such as bees.

Because of the number of possible factors involved, hypothesising models of bee navigation is somewhat speculative. There has been extensive debate in the literature about many aspects of the field, such as whether bees use a cognitive map or instead rely on dead reckoning and cue-based navigation (Gould 1986; Gould 1988; Wehner, Bleuler et al. 1990). Often there is an inability to replicate the results of earlier experiments. In other cases new aspects of the problem have led to previous experiments being rendered inappropriate or inconclusive. Only after rigorous and exhaustive testing can some aspects of models be validated. One example of this is extensive work by Esch and Burns (1996) and Srinivasan, Zhang et al. (2000) on the theory of bee distance estimation. Results from a large number of experiments
strongly suggest that honeybees use image motion to determine their flight distances, rather than energy consumption as suggested by early work in the field (Heran 1956).

4.2.2 Ants

In order to be able to return to their home, desert ants perform their own form of path integration when they leave the nest. Experiments by Muller and Wehner (1988) using the species Cataglyphis fortis suggest that an ant keeps track of the average of all directions in which the ant moves, weighted by the distance moved in each direction. The reference vector used for calculating the new movement angle is the previous homeward course direction. This model matches the experimental results obtained using real ants to a very high degree and also predicts the systematic errors that an ant’s navigation system accumulates. This navigation system leads to only small errors in most situations, except when an ant performs sudden backward turns, which the experiments show happens only rarely. Since the angular errors are systematic, an equally large clockwise and anticlockwise turn results in the two generated errors cancelling out. The experiments also reveal that the ants do not usually have a direction bias during their foraging activities.

Wehner, Gallizzi et al. (2002) describe the navigation strategy that desert ants have adopted to deal with errors in their path integration process. When returning to the nest from a foraging location, the ants always use their home vector to return to the supposed location of the nest. The home vector is calculated during the outbound trip using weighted angular averaging as described previously. Usually for the longer trips there is some path integration error and the ant returns to a location near but not at the nest. It then embarks upon a systematic search to find the nest within that local area. It is interesting to note that repeated foraging trips to the same location result in only minimal correction of the home vector. However, through repeated trips the area searched during the systematic search process does become biased towards the true nest location. It seems that the desert ants rely heavily on the systematic search process even when given the opportunity to correct errors in their home vector.

Desert ants also seem to adjust their navigation behaviour in a number of ways based on landmarks in their environment. Collett, Collett et al. (2001) describe experiments where an extended landmark (rather than a point landmark) was placed between the ant’s nest and a food source. By kidnapping ants at various stages during their journey from the food source back to the nest, the researchers were able to test the effect of the visual cues on the ant’s navigation. When the local movement vector as dictated by the remembered path back to the nest had approximately the same orientation as the vector suggested by the visual cue, the ants used a weighted average of the two to determine their movement. However, when there was a significant discrepancy between the two vectors, the ant appeared to ignore the local movement vector and use the vector suggested by the visual cue.

4.2.3 Primates

Studies of insects have focused on analysing their actions during experiments and proposing plausible models of mapping and navigation that support the experimental
observations. Mammals have much larger and more accessible brains than insects, which allows researchers to examine the neuronal activity during experiments. This has resulted in a large amount of research focusing on the neuronal mechanisms by which mammals achieve mapping and navigation. Primates possess head direction cells, located in the presubiculum of the Macaca mulatta monkey. Primate head direction cells are quite similar to those found in the rodent hippocampus. The firing rates of cells are dependent on the head direction of the monkey, but not the eye direction of the monkey, as shown in experiments by Robertson, Rolls et al. (1999). Cells fire maximally for a certain head direction, with their firing rate dropping off in a Gaussian-like manner as the monkey’s head direction moves away from the cell’s preferred direction. Rolls (2005) describes the mean half-amplitude width of the cells’ tuning curves as being 76º. Head direction cells are not affected by the location of the monkey within the environment, and will give similar responses for the same monkey head direction even if the monkey has been moved and the visual scene is completely different. Primates also possess spatial view cells. These cells have responses based on where the monkey is looking. The information regarding where the monkey is looking is encoded in allocentric coordinates, storing the absolute location of the place rather than the location relative to the monkey’s current position and head orientation. This makes them quite different to the place cells observed in rats, which encode the rat’s current location within the environment. Spatial view cells exhibit useful characteristics in certain situations. Experiments by Robertson, Rolls et al. (1998) tested the response of these cells when the view details were partially or completely obscured by either curtains or darkness. Spatial view cells in the CA1 region of the hippocampus, parahippocampal gyrus, and presubiculum responded when the monkey was gazing towards a certain region even if none of the original view details could be seen. Other cells however, such as those in the CA3 region, appeared to rely on the familiar visual input, and hence did not respond when the view details were obscured. The characteristic of the first set of cells suggests the possibility of episodic memory, for instance remembering where an object was last seen. Because the memory is independent of the location of the monkey within the environment, it is potentially quite useful – the monkey can remember where something was regardless of where the monkey is located. Cells that respond to the monkey looking at a certain spatial location regardless of what the current view is suggest the possibility of recalling where an object is through memory, even though there is no current visual confirmation. O'Mara, Rolls et al. (1994) describe an experiment where a monkey was passively moved on a robot platform in a cue-controlled 2 × 2 × 2 metre test chamber. The experiments determined that there are cells in the primate hippocampus that respond to whole-body motion. Cells responded selectively to different types of motion. Some cells were sensitive only to linear motion, while others responded to axial motion. Different linear movement directions triggered responses in different cells. 
Some cells’ responses were modulated by visual input – when the visual input was removed some cells continued to respond to whole-body motion, while the responses of others were much diminished or non-existent. One neuron responded only to external rotation of the environment and cues while the monkey was stationary. The researchers pointed out the relevance of these results to situations where part of the environment moves
but with no associated movement of the observer, or extended periods of movement when there are no vestibular signals (such as on a train journey).

4.2.4 Humans

Humans generally possess good vision sensors but relatively poor self-motion sensors. Their main advantage over animals and insects is the sophistication of their brain and higher level thinking processes. Despite the lack of absolute sensory equipment such as a magnetic field detector, and despite inaccurate path integration ability, most humans are able to navigate effectively in huge, complex, and dynamic environments. The mapping and navigation processes in humans are relatively poorly understood, especially when compared with well studied animals such as rodents. The hippocampus, although strongly associated with episodic memory, is also believed to play a role in navigation, but the specifics remain unknown. One hindrance to further understanding of navigation processes in humans is the inability to conduct invasive examination in order to use the standard brain analysis techniques employed on rodents or monkeys. Instead, less invasive techniques such as Positron Emission Tomography (PET) are used. Ethical and safety considerations mean only a subset of the techniques used on animals can be applied to humans. Despite these limitations some initial findings have been published in the literature regarding proposed models of navigation in the human brain.

In experiments involving taxi drivers in London (Maguire, Frackowiak et al. 1997), PET was used while drivers recalled and drove along complex routes around the city. Only experienced taxi drivers were used, whose knowledge of the streets had been reinforced over the years through many hours of taxi work. Drivers were also required to recall famous landmarks from locations they had never visited, and to describe plot progression and individual scenes in movies. Each task involved either topographical (route recall) or non-topographical knowledge (landmark recall), and either could contain a sequential component (plot progression) or not (scene recall). The right hippocampus was active during route recall tasks that involved some use of a spatial map. Straight landmark recall in a non-spatial context did not activate the right hippocampus. However, the right hippocampus was also found to be activated during topographical unlearning.

Maguire, Burgess et al. (1998) further describe a set of experiments involving humans navigating in a virtual reality town (a level from the computer game Duke Nukem 3D). After familiarisation periods of exploration in the environment, human subjects performed a number of navigation tasks: navigation directly to an unobscured goal, indirect navigation to a goal around obstacles placed in the virtual world, and path-following navigation along a trail of visual arrows. The right hippocampus was found to be more active during navigation than trail-following. Better navigation performance resulted in higher levels of activation in the right hippocampus. The researchers proposed that this part of the brain enabled navigation to unseen goals. The left hippocampus was generally more active during successful navigation than unsuccessful navigation, whether trail-following or goal finding. This activity was interpreted as corresponding to maintenance and recall of possible routes to the goal.
The real-world environment was significantly larger and more complex than the simulated environments. For example, there were many more possible routes in the London environment than in the virtual reality town. Maguire, Burgess et al. (1998) proposed that the right hippocampus may be involved in higher level spatial tasks and decision making. The semantic memory tasks, where a driver was required to recall a film plot, activated mainly the left side of the brain, and produced different brain activity from that displayed during route recall.
4.3 Discussion

Within the body of knowledge on mapping and navigation in animals, it is clear that the rodent is both the most studied animal and the best understood, in terms of both the neural mechanisms by which it navigates and creates a map and its real-world capabilities. As a result, the rodent is the ‘animal of choice’ in attempts to create artificial models of biological systems. Although the navigation capabilities of insects are well known, little is known about the neuronal mechanisms they use to perform that navigation, which makes modelling these mechanisms a speculative process. Because of ethical considerations, relatively little is known about the neuronal mapping and navigation mechanisms in primates or humans either. Given the focus of this book on models of biological systems that can be applied in a practical robot context, it seems reasonable for further work to focus on the rodent hippocampus. As such, the following chapter describes state of the art models of mapping and navigation in the rodent hippocampus.
5 Emulating Nature: Models of Hippocampus
This chapter covers the state of the art models of the rodent hippocampus and their ability to solve each component of the mapping and navigation problem. Rodent hippocampal models can be separated into two groups. The first contains models that have been developed using theories of the neuronal mechanisms by which rats navigate and create a map. By testing these models in simulation or on a robot these theories can be validated or disproved. These models can also be used to make predictions that can be tested by further rodent experiments. Most of the models described in this chapter fall into this group. The second group of models is much smaller and contains models that have been developed for practical purposes, such as for controlling a mobile robot. These models vary in their trade-off between biological faithfulness and practical performance, with none matching the mapping performance of the top probabilistic techniques. This chapter also describes these models, with Chapter 6 discussing the implications of their limited practical performance.
5.1 Head Direction and Place Cells – State of the Art

Models of the rodent hippocampus often consist of head direction and place cell structures. Because head direction cells relate to the rodent head’s one-dimensional orientation state, the computational algorithms required to simulate them are generally less complex than those required to simulate place cells, which relate to the rat’s two-dimensional location state.

5.1.1 Attractor Networks

Models of head direction and place cells often use some form of attractor network (Redish, Elga et al. 1996; Zhang 1996; Samsonovich and McNaughton 1997; Stringer, Rolls et al. 2002; Stringer, Trappenberg et al. 2002). Typically an array of cells is used, with nearby cells linked by strong excitatory connections, and distant cells linked by inhibitory connections. The stable state of such a network consists of a single cluster of active cells with a peaked activity distribution. The orientation represented by the current activity state can be determined in several ways, such as through population vector decoding or by simply picking the most highly activated cell.
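To make the attractor dynamics concrete, the following is a minimal sketch of a one-dimensional ring of head direction cells with local excitation, broad inhibition, and population vector decoding. The cell count, weight profile, gains, and normalisation are illustrative choices, not those of any particular published model.

```python
import numpy as np

N = 72                                    # head direction cells, 5 degrees apart
prefs = np.linspace(0, 2 * np.pi, N, endpoint=False)

# Local excitation and broad inhibition: nearby cells excite, distant cells inhibit.
diff = np.angle(np.exp(1j * (prefs[:, None] - prefs[None, :])))
W = np.exp(-diff ** 2 / (2 * 0.3 ** 2)) - 0.1

def relax(activity, dt=0.1):
    """One relaxation step of the attractor dynamics (illustrative gains)."""
    drive = np.maximum(W @ activity, 0.0)
    activity = activity + dt * (drive - activity)
    return activity / activity.max()      # crude normalisation keeps activity bounded

def decode(activity):
    """Population vector decoding of the represented orientation (radians)."""
    return np.angle(np.sum(activity * np.exp(1j * prefs)))

activity = np.exp(np.cos(prefs - np.pi / 2))   # seed a bump near 90 degrees
for _ in range(50):
    activity = relax(activity)
print(np.degrees(decode(activity)))            # settles to a single localised packet
```

Picking the index of the most active cell would give a coarser, 5-degree-quantised estimate; the population vector gives a continuous reading of the same packet.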
Fig. 5.1. Attractor network model of the rodent head direction system. Activity in the outer ring of head direction cells encodes orientation. The inner two rings are vestibular cells that respond to angular velocities. Links between visual cells and head direction cells are modified using a Hebbian learning rule. Head direction cells also have strong intrinsic connections that stabilise the system to a single localised cluster of active cells (Skaggs, Knierim et al. 1995). Reprinted with permission from The MIT Press.
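The Hebbian learning mentioned in the caption simply strengthens connections between visual cells and head direction cells that are active at the same time. A rough sketch of such an outer-product update is shown below; the learning rate, weight cap, and array sizes are arbitrary illustrative values rather than details of the Skaggs model.

```python
import numpy as np

n_visual, n_hd = 20, 72
W_vis_hd = np.zeros((n_hd, n_visual))     # connections from visual cells to HD cells

def hebbian_update(W, hd_activity, visual_activity, lr=0.05):
    """Strengthen links between co-active visual and head direction cells."""
    W = W + lr * np.outer(hd_activity, visual_activity)
    return W / max(W.max(), 1.0)          # simple cap to keep weights bounded

# Example: a visual cell firing together with HD cells tuned near 90 degrees
# gradually comes to drive those HD cells whenever the cue is seen again.
hd = np.exp(np.cos(np.linspace(0, 2 * np.pi, n_hd, endpoint=False) - np.pi / 2))
vis = np.zeros(n_visual)
vis[3] = 1.0
for _ in range(10):
    W_vis_hd = hebbian_update(W_vis_hd, hd, vis)
print(W_vis_hd[:, 3].argmax())            # strongest weight sits at the 90-degree cell
```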
5.1.2 Path Integration

Activity injected into an attractor network near either side of the current activity peak will cause the peak to shift towards the injected activity through the network dynamics. This characteristic is commonly used as a means of path integration, although the technique by which new activity is injected can vary. One method by Skaggs, Knierim et al. (1995) uses two sets of ‘rotation’ cells, one set each for clockwise and counter-clockwise directions of rotation, as shown in Fig. 5.1. Cells in these sets have both a preferred orientation and a preferred angular velocity direction. When the orientation and angular velocity match the cell’s preferred state, the cell projects energy to nearby cells. Clockwise cells
project energy to head direction cells in a clockwise direction, and counter-clockwise cells do the reverse. Although no results were presented for this model, a more complex implementation has been tested on a robot and found to be successful (Arleo 2000).

The coupled attractor model of rodent head direction produces tuning curves that closely match those observed in two sections of the rodent brain (Redish, Elga et al. 1996). This model uses two populations of cells to represent the postsubiculum (PoS) and anterior thalamic nuclei (ATN) mammillary bodies in the rodent brain (Fig. 5.2). When there is no angular rotation, strong reciprocal matching connections between cells of the same preferred orientation in the PoS and ATN ensure stable synchronised Gaussian activity peaks in each cell body. Offset connections point from cells in the PoS to offset cells in the ATN, and have their strength modulated by the sign and magnitude of the angular velocity. During turning, activity in the ATN population leads that in the PoS population. This lead behaviour is supported by experimental findings that ATN cell activity is correlated with the head direction of a rodent approximately 20–40 ms in the future (Blair and Sharp 1995).
Fig. 5.2. A coupled attractor model of rodent head direction. A turn to the right results in active right-offset connections but no left-offset connections from the PoS cells to the ATN cells. Strong matching connections are always active between cells in the two bodies with the same preferred orientation (Redish, Elga et al. 1996). Reprinted with permission from Informa Healthcare.
Path integration can be achieved in other ways, such as by adding a slight asymmetry to the intrinsic head direction excitatory links (Zhang 1996). This introduction of an ‘odd’ weight component causes activity to shift at a rate dependent on the magnitude of this odd component. For Zhang’s model particular care was taken to ensure that the shift in the activity profile of the head direction cells preserved its shape and that the angular speed corresponding to the shift was proportional to the magnitude of the odd component. Zhang supported his decision to focus on creating a symmetric activity profile even during movement by experimental findings on rats (Blair and Sharp 1995). Other experimental evidence however suggests that firing curves are distorted as a function of the angular velocity (Blair, Lipscomb et al. 1997), and that firing rates are at least somewhat dependent on the magnitude of the angular velocity (Taube 1995).
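Both the rotation-cell scheme and Zhang’s odd weight component amount to adding an asymmetric drive that pushes the activity packet around the ring. The toy sketch below illustrates only that idea: a copy of the packet, offset to one side and scaled by angular speed, is injected each step. Real models rely on the attractor dynamics to keep the packet sharp; the gain, packet shape, and decoding here are illustrative assumptions.

```python
import numpy as np

N = 72
prefs = np.linspace(0, 2 * np.pi, N, endpoint=False)
packet = np.exp(np.cos(prefs - np.pi / 2))      # activity packet centred at 90 degrees

def integrate_rotation(packet, angular_velocity, gain=0.5):
    """Inject a copy of the packet offset one cell in the direction of rotation."""
    offset = 1 if angular_velocity > 0 else -1   # clockwise vs counter-clockwise drive
    packet = packet + gain * abs(angular_velocity) * np.roll(packet, offset)
    return packet / packet.max()                 # attractor dynamics would re-sharpen this

def decode(packet):
    return np.angle(np.sum(packet * np.exp(1j * prefs)))

for _ in range(20):
    packet = integrate_rotation(packet, angular_velocity=0.2)
print(np.degrees(decode(packet)))                # the packet has drifted past 90 degrees
```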
5.1.3 Head Direction Correction Using Allothetic Information

Purely ideothetic updates are vulnerable to cumulative error and require the addition of some sort of allothetic correction mechanism in order to provide long term functionality. One such allothetic correction mechanism involves associating visual cues with head direction cells corresponding to the cue’s direction in an egocentric rather than allocentric reference frame (Skaggs, Knierim et al. 1995). This process only works for distant cues, when the relative orientation to the cue is the same in all environment locations. When visual cues are not sufficiently distant their relative bearing can change depending on a robot’s location within the environment. However, if visual cues are rotationally symmetric (appear the same from all directions) and the mapping and navigation system retains a Cartesian sense of the robot’s pose, then the egocentric bearing to a visual cue can be converted into an allocentric bearing.

In Arleo’s (2000) hippocampal model, visual bearing cells are used to encode the robot’s egocentric bearing; cells fire as a function of the angle between the robot’s heading and the bearing to a light cue. During learning, active visual bearing cells and place cells are associated with calibration (CAL) cells, which are in turn associated with head direction (poSC) cells (Fig. 5.3a). Later, after learning is complete, during what is called the ‘non-learning’ stage, CAL cells fire only when the robot has a specific orientation relative to the visual cue and a specific spatial location (Fig. 5.3b). In effect the CAL cells associate the allocentric head directions of the robot with robot locations and relative bearings to a landmark. The calibration process consists of aligning the activity in the head direction cells with the activity in the CAL cells, which is achieved through the CAL-head direction connections. The learning of place-orientation contexts and the calibration of head direction are separate processes that do not occur simultaneously.
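In essence, the CAL machinery lets a stored place and cue-bearing association imply an allocentric heading that can pull the ideothetic estimate back into line. The sketch below is one loose reading of that idea, not Arleo’s actual network: the blend factor, the function names, and the simple heading-equals-allocentric-minus-egocentric-bearing geometry are all assumptions made for illustration.

```python
import math

def wrap(angle):
    """Wrap an angle to the range (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def calibrate_heading(estimated_heading, learned_allocentric_bearing,
                      measured_egocentric_bearing, blend=0.3):
    """Nudge an ideothetic heading estimate towards the visually implied heading.

    learned_allocentric_bearing -- cue bearing stored for this place (world frame)
    measured_egocentric_bearing -- current bearing to the cue relative to the robot
    If the cue lies at world bearing B and is seen at egocentric bearing e, the
    implied heading is B - e; the estimate is blended towards it rather than
    replaced outright (the blend factor is an arbitrary choice).
    """
    implied = wrap(learned_allocentric_bearing - measured_egocentric_bearing)
    error = wrap(implied - estimated_heading)
    return wrap(estimated_heading + blend * error)

# Odometry says 100 degrees, but a cue learned at 30 degrees (allocentric) is
# currently seen at -60 degrees egocentrically, implying a heading of 90 degrees.
corrected = calibrate_heading(math.radians(100), math.radians(30), math.radians(-60))
print(math.degrees(corrected))   # pulled part of the way from 100 towards 90 degrees
```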
Fig. 5.3. Bearing and place cell models. (a) Both visual bearing (VIS) and place cells (sLEC) are connected to calibration (CAL) cells, and each calibration cell projects to all head direction (poSC) cells. (b) The calibration cells learn place-orientation contexts through VIS and sLEC inputs passing through separate neuroreceptors nr1 and nr2 (Arleo 2000). Reprinted with permission of the author and Verlag-Dissertation.
5.1.4 Place Cells – State of the Art

Apart from their dimensional encoding differences, place cells and head direction cells have some similarities. By extending the one-dimensional head direction algorithms to two dimensions, it is possible to create a simple place cell model that emulates some of the observed properties of place cells. However place cells also have many unique observed properties. Consequently many of the state of the art place cell models have been developed independently of the head direction models in an attempt to emulate some of these unique properties.

5.1.5 Place Cells Through Allothetic Cues

Many of the early place cell models form place cell maps through sensory input from visual cues in the environment. For instance, one model uses the distance to cues in the environment to form the firing fields of simulated place cells during exploration (Burgess, Recce et al. 1994). A set of ‘sensory’ cells is assigned to each cue in the environment, with each cell responding maximally to the cue at a different distance. The cells use a broad tuning curve so that they still respond, to a lesser degree, at similar distances. An intermediate layer of entorhinal cells (EC) receives input from the sensory cells to produce a large firing field, as shown in Fig. 5.4. The entorhinal cells project to the place cells in the model, which are connected to subicular cells. Place cells and subicular cells have forward and backward inhibition; however, the place cell inhibition is stronger, resulting in smaller place fields than in the subicular cells. The model differs from many others in that cells have discrete firing responses measured in terms of firing spikes rather than a graded response. Cells are grouped around those with the largest inputs – the cell with the largest input in each group is active while the others are silent. A simple set of goal cells was used to associate directions to goals with the current simulated rat’s location as given by the subicular cells.

This model was tested in a simulated 1.5 × 1.5 m environment including a 150 mm border where up to 16 cues were placed. The simulated rat moved at a constant speed of 600 mm/s around the environment to simplify parts of the modelling process, rather than in a more biologically plausible stop-start manner. The simulated rat was allowed to explore the environment for 30 seconds before a goal was inserted, although the simulated rat would not always encounter the goal location during this time, meaning exploration periods were often longer than 30 seconds. After encountering the goal the simulated rat was required to navigate to the goal from eight different starting locations around the edge of the environment. This procedure, when repeated five times with different goal locations, was known as a block of trials. On average only 30 to 60 seconds of exploration in the absence of a goal enabled the simulated rat to then navigate back to the goal after encountering it once. The average time taken to reach the goals in each block of trials, or the escape latency, reduced as successive goal recall trial blocks were performed with no further exploration. The average escape latency also reduced as the number of cues in the environment increased from 4 to 16.
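A rough sketch of the broad distance tuning just described: each sensory cell responds maximally when its cue is at a preferred range, and a downstream cell that sums several such responses acquires a compact firing field where the tuned distances intersect. The cue layout, tuning width, and threshold below are invented for illustration and are not the published parameter values.

```python
import numpy as np

cues = np.array([[0.0, 0.0], [1.5, 0.0], [0.0, 1.5]])     # cue positions in metres
preferred_dists = np.array([0.8, 0.9, 1.0])                # one distance-tuned cell per cue
sigma = 0.3                                                 # broad tuning width

def sensory_responses(position):
    """Broadly tuned response of each distance cell at the given location."""
    dists = np.linalg.norm(cues - position, axis=1)
    return np.exp(-(dists - preferred_dists) ** 2 / (2 * sigma ** 2))

def place_response(position, threshold=2.0):
    """A downstream cell fires only where enough distance cells agree."""
    return max(sensory_responses(position).sum() - threshold, 0.0)

# Sampling the arena shows a single compact region of non-zero response.
xs = np.linspace(0.0, 1.5, 16)
field = np.array([[place_response(np.array([x, y])) for x in xs] for y in xs])
print((field > 0).sum(), np.unravel_index(field.argmax(), field.shape))
```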
Fig. 5.4. Model relating cues, head direction cells and place cells. Environmental cues determine the input to the network which feeds through to the paired place and subicular cell networks to produce place field firing patterns (Burgess, Recce et al. 1994). Reprinted with permission from Elsevier.
The main limitation of the model created by Burgess, Recce et al. (1994) was its dependence on visual cues and its lack of an update mechanism such as path integration in the absence of visual input. Unique point-like cues were used and were fairly evenly distributed, removing the problem of coping with ambiguity or real world features. However its rapid one-shot learning of goals, latent learning ability and short cut ability are all characteristics that are desirable on a mobile robot. Place cell firing activity can become somewhat direction dependent in narrow environments (Muller, Bostock et al. 1994; Markus, Qin et al. 1995). This has prompted the development of directional place cell or view cell models. In these models, such as the one shown in Fig. 5.5, the view cells merge the ‘where’ and ‘what’ information (Gaussier, Revel et al. 2002). Perirhinal (Pr) and parahippocampal (Ph) cells take the ‘what’ from an object recognition network as input and the ‘where’ from an object azimuth detection network. The Pr-Ph activity is calculated by multiplying the recognition level of the feature by the angular distance to the learned azimuth. Place cell activities are determined by normalising the sum of these activated Pr-Ph cells during place learning.
This model differs from experimental observations because it produces very large place fields, about two metres wide, which are much larger than the 100 to 200 mm wide fields observed in rats (Gaussier, Revel et al. 2002). The authors point out that since place cells only fire when the rat is in close proximity to their place fields, the rat can only recognise its exact location and has no concept of distances to other locations in the environment, since no cells are firing that encode those other locations. In other models additional mechanisms or representations have been required in order to perform goal navigation (Touretzky and Redish 1995; Endo and Arkin 2003; Cuperlier, Quoy et al. 2005; Hafner 2005). However, with the large place fields produced by this model, local goal navigation can be achieved by using a gradient climbing procedure through the place field of the goal place cell.
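With a field that broad, gradient climbing is straightforward: the agent probes a few candidate headings and moves in whichever direction most increases the goal cell’s firing. The sketch below uses a Gaussian stand-in for the goal cell’s place field; the field width, step size, and number of probed headings are illustrative assumptions rather than details of the published model.

```python
import numpy as np

goal = np.array([2.0, 1.0])          # centre of the goal cell's (very large) place field
field_width = 1.0                    # of the order of metres, as in the model's fields

def goal_cell_activity(position):
    """Stand-in Gaussian firing field for the place cell recruited at the goal."""
    return np.exp(-np.sum((position - goal) ** 2) / (2 * field_width ** 2))

def gradient_climb_step(position, step=0.1):
    """Probe eight candidate headings and move where goal cell activity rises most."""
    headings = np.linspace(0, 2 * np.pi, 8, endpoint=False)
    candidates = position + step * np.stack([np.cos(headings), np.sin(headings)], axis=1)
    activities = [goal_cell_activity(c) for c in candidates]
    return candidates[int(np.argmax(activities))]

position = np.array([0.0, 0.0])
for _ in range(40):
    position = gradient_climb_step(position)
print(position)                      # converges to the vicinity of the goal location
```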
Fig. 5.5. A model of view cells. The ‘what’ – object recognition – and ‘where’ – object location – inputs to the perirhinal and parahippocampal cells. The normalised sum of the Pr-Ph activity produces the place cell activity (Gaussier, Revel et al. 2002). Reprinted with kind permission from Springer Science and Business Media.
5.1.6 Place Cells Through Ideothetic Information

Place cell firing fields can also be established through the use of ideothetic information such as wheel encoder counts from a mobile robot. In this methodology the simulated rat or robot initially relies completely upon path integration to establish the firing properties of its place cells. Over time, the growing population of place cells can become associated with local views of the environment, with both associations shown by the model in Fig. 5.6 (Arleo 2000). In this model place cells are recruited incrementally as the robot explores the environment. A threshold limits the number of place cells that can be active at any one time – new cells are not recruited if the area is already sufficiently represented within the place cell network. Connections are learned between sets of cells representing both ideothetic and allothetic stimuli. During exploration the robot is given a monotonically
Fig. 5.6. Allothetic and ideothetic information influences place cells. Initially inertial stimuli drive the recruitment of place cells, but gradually visual stimuli start to become more dominant (Arleo 2000). Reprinted with permission of the author and Verlag-Dissertation.
Fig. 5.7. Place cell firing rates during a robot experiment. (a) Coverage of the arena by place fields, with each dot the centre of a place cell’s place field. The dot colour corresponds to the activation level of the place cells associated with each place field centre. (b) 600 × 600 mm testing arena, with bar-coded walls. The robot’s position corresponds to the firing activity shown in (a). (c) A sample activity profile for a robot located in the top right corner of the arena (Arleo 2000). Reprinted with permission of the author and Verlag-Dissertation.
growing sense of uncertainty, which eventually prompts it to return to its home location. The robot then recalibrates its path integrator using allothetic input learned at its home location. This approach avoids the problem of sensory aliasing, because ideothetic input can disambiguate situations of ambiguous allothetic stimuli. In addition, place cell activity can be updated in the absence of visual input. Well-conditioned place cells are produced through their dependence on two coherent allothetic and ideothetic spatial representations. An example of the place cell array firing rates is shown in Fig. 5.7.

The influence of stimuli on the place cells can also develop in the opposite order. In models where both allothetic and ideothetic stimuli drive place cell activity, cell activity can initially be driven solely by allothetic cues, with internal ideothetic connections evolving over time. One such model is shown in Fig. 5.8 (Stringer, Rolls et al. 2002). Like purely allothetic models, activity is initially driven by visual input. Each place cell is associated with a unique location in the environment when the cell is maximally activated as a result of visual input. The tuning curve of each place cell due to visual input is Gaussian shaped.
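The visually driven tuning can be pictured as a simple Gaussian of distance from the cell’s field centre, and the trace learning rule used later in this section relies on an exponentially decaying record of each cell’s recent firing. The following is a small illustrative sketch of both quantities; the width and decay constant are assumed values, not the published parameters.

```python
import numpy as np

sigma = 0.25     # width of the visually driven place tuning (assumed value)
eta = 0.8        # decay constant of the firing rate trace (assumed value)

def visual_place_rate(position, field_centre):
    """Gaussian firing rate of a place cell driven purely by visual input."""
    d2 = np.sum((np.asarray(position) - np.asarray(field_centre)) ** 2)
    return float(np.exp(-d2 / (2 * sigma ** 2)))

def update_trace(trace, rate):
    """Exponentially decaying trace of recent firing, as used by trace learning rules."""
    return eta * trace + (1.0 - eta) * rate

trace = 0.0
for position in [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.3, 0.0)]:
    rate = visual_place_rate(position, field_centre=(0.3, 0.0))
    trace = update_trace(trace, rate)
    print(round(rate, 3), round(trace, 3))   # the trace lags behind the current rate
```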
Fig. 5.8. Two-dimensional continuous attractor model of place cells. There are three types of input: visual input, head direction cells that provide orientation information, and forward velocity cells that provide translational velocity information (Stringer, Rolls et al. 2002). Reprinted with permission from Informa Healthcare.
These visually driven firing profiles are used to form appropriate recurrent links within the place cells and links from other cells that encode ideothetic information. The model learns these links using two different processes. Hebbian learning is used to update links between co-activated place cells, with the rationale that cells active at the same time probably represent close physical locations, due to the hard-coded firing profiles caused by visual stimuli. In addition, a trace rule is used to update the
recurrent weights based not only on their current activity but also their past firing activity. The second rule is used to produce broader firing profiles for the place cells than can be obtained by using the Hebbian rule, even if the initial visual stimuli produce narrow activity profiles (Stringer, Trappenberg et al. 2002). To ensure stable activity packets in the network for periods when the simulated agent is not moving, current activity in the network is reinforced. This ensures that there is no activity packet drift due to the weight density irregularities in the rectangular place cell matrix. During initial exploration, synaptic weights are also updated, based on the activity in pairs of place cells as well as the activity in the head direction and forward velocity cells (Eq. 5.5). The learning rule associates place cells that were active slightly in the past with currently active place cells, the currently active head direction cells, and the currently active forward velocity cells. The change in the synaptic weight between two place cells,
$\delta w^{FV}_{ijkl}$, is given by:

$$\delta w^{FV}_{ijkl} = k \, r_i^P \, \bar{r}_j^P \, r_k^{HD} \, r_l^{FV} \qquad (5.5)$$

where $k$ is the learning rate, $r_i^P$ is the firing rate of place cell $i$, $\bar{r}_j^P$ is the trace firing rate of place cell $j$, $r_k^{HD}$ is the firing rate of head direction cell $k$, and $r_l^{FV}$ is the firing rate of forward velocity cell $l$. Learning this association ensures that place cell $i$ fires when place cell $j$, head direction cell $k$, and forward velocity cell $l$ are all active. In this way co-firing of all three sets of cells evolves links that stimulate place cell activity to appropriately represent the movement of the agent through the environment. Simulations showed that the model can self-organise during the initial learning phase based on visual input, and produce stable place cell firing patterns. The simulated agent was also able to update its state when placed in simulated darkness (all visual cues removed) using only ideothetic cues, hence demonstrating the suitability of the learned ideothetic connections. The system’s performance was found to rely on the shape of the hard-coded activity profiles caused by visual input, and the implicit Cartesian information stored in the relative locations of these firing fields.

In work by Browning (2000), a minimalist place memory learning system was developed that could bind visual stimulus to place representations. The system achieved localisation through binding of visual information directly to spatial representations of position and heading, as well as the use of a path integration module. Fig. 5.9 shows the overall system, known as a minimalist LV-PI (Local View-Path Integration) model. During robot experiments in a number of small arenas, the system was able to localise with respect to heading but not location. The failure was primarily attributed to shortcomings in the view processing system; to address these Browning proposed a more advanced vision system inspired by fixation of interest, shifting of interest, and feature tracking.
Fig. 5.9. Minimalist LV-PI model as implemented by Browning (2000). Path integration (PI) is used to create the world representations used by the robot. Place is represented in the Place Code (PC) cells and heading is represented in the Head Direction (HD) cells. Output from the vision processing module is bound to both representations through associative learning. Reprinted with permission of B. Browning.
5.1.7 Navigation

The spatial characteristics of place cells make them suitable for use in navigation. Arleo’s doctoral work developed very detailed models of head direction and place cells for use in navigation (Arleo 2000). The models were created with a strong emphasis on recreating experimental observations of rats. Towards this end, the system used biologically plausible path integration mechanisms and allothetic visual cue responses. A temporal difference learning technique known as Q-learning was applied to the spatial representations to create navigational maps that could be used for goal recall. Place cells were used to drive locomotor action cells within a reward-based model of navigation. The models were all tested on a Khepera mobile robot in a small 600 × 600 mm arena with bar-coded walls and an 800 × 800 mm arena with an external visual cue but no walls. The work contributed a number of insights into the neurophysiological processes involved in the spatial cognition of a rat, as well as a computational robot brain that could map and navigate within its limited environments.

Other characteristics of the rodent hippocampus such as the theta rhythm have also been modelled. In the model by Burgess, Recce et al. (1994), firing fields are constructed during exploration by using cells that are tuned to respond to visual cues at various distances from the rat. Each field is seen as an approximate radial basis function, with the fields together representing the surface of the environment. To create the theta rhythm relationship, the firing phase of cells is related to the average angle between the rat’s heading and two visual cues (Fig. 5.10). This creates the desired effect of cells firing later when their place field is directly ahead of the rat, and earlier when the place field is mostly behind the rat. Intermediate firing times are obtained for place fields mainly to the side of the rat. The model also contains a body of cells corresponding to cells observed downstream of the hippocampus. These cells use the firing rate of the place cells to form a population vector representing the direction of the simulated rat with respect to a previously encountered reward source. This enables the simulated rat to navigate to goals within a small arena. The researchers noted that it is equally possible that the phase effect in rodents is created through some intrinsic property of place cells rather than the mechanism that they have proposed.
Fig. 5.10. Generating a theta rhythm. A theta rhythm can be created by relating the phase firing of a cell to the average angle, α, from the rat’s heading direction to two visual cues. An average angle of 0 degrees means the rat’s heading direction is in between the two cue locations, corresponding to a place field located ahead of the rat and hence a late firing phase (Burgess, Recce et al. 1994). Reprinted with permission from Elsevier.
Fig. 5.11. Learning to navigate to a goal. (a) Navigational map progress after 2, 20, and 21 trials. Navigational vectors start to point towards the goal as trials progress. During the final trial the platform was removed; the simulated rat rapidly made its way to the remembered platform location and stayed in that area. (b) Graph showing times taken to reach the platform over the 20 trials (Blum and Abbott 1996). Reprinted with permission from The MIT Press.
Some models use simulated place cells to encode a different type of information. Blum and Abbott (1996) propose the concept of a map suggesting future movement directions based on the knowledge of the current location. Long term potentiation (LTP) is used to store temporal and spatial information about a rat’s motion through modification of synaptic strengths between place cells. When traversing a familiar route, a delay in LTP induction allows presynaptic cells to link with postsynaptic cells further along the path. When retracing that route, current place cell activity drives activity in cells corresponding to places slightly in front of the rat. To navigate to the end of the route, the simulated rat heads towards the position encoded by this anticipatory cell activity. Testing of this model in a simulated Morris water maze experiment produced decreasing platform acquisition times over the first 10 trials followed by consistent rapid platform finding behaviour (Fig. 5.11). Later work on this model extended it to multiple goal locations using the same map (Gerstner and Abbott 1997). This was achieved by modulating place cell activity based on goal location, in addition to each cell’s dependence on the location of the rat relative to the cell’s place field. Different goal locations triggered different navigational maps. This also allowed a form of navigational map interpolation – goal locations could be specified that were never learned in exploration. The system would produce a map that was a combination of maps to learned goal locations.
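The asymmetric-LTP idea behind this model can be sketched very simply: repeated traversal of a route strengthens weights from earlier place cells to later ones, and the anticipated position is then read out as the weight-driven average of downstream field centres. Everything below (field centres, learning rate, readout) is an illustrative toy, not the published implementation.

```python
import numpy as np

centres = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0], [1.5, 0.0]])  # place field centres
W = np.zeros((len(centres), len(centres)))    # asymmetric place-to-place weights

def traverse_route(order, lr=0.5):
    """Delayed LTP: each visited cell strengthens its link to the next cell on the route."""
    for pre, post in zip(order[:-1], order[1:]):
        W[pre, post] += lr

def anticipated_position(current_cell):
    """Head towards the weight-driven average of downstream field centres."""
    w = W[current_cell]
    if w.sum() == 0.0:
        return centres[current_cell]
    return (w[:, None] * centres).sum(axis=0) / w.sum()

for _ in range(10):                           # repeated runs along the same route
    traverse_route([0, 1, 2, 3])
print(anticipated_position(1))                # points beyond the cell's own field centre
```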
5.2 Discussion

During the development of many of these computational models, the focus has been on recreating the neuronal firing patterns observed in the rat brain during experiments. Many of the observed neuronal firing phenomena such as peak deformation (Goodridge and Touretzky 2000) and theta precession (Burgess, Recce et al. 1994) have been successfully replicated in these models. However, while the biological faithfulness of these models has been thoroughly tested, far less work has investigated the practical mapping and navigation potential of these artificial models, especially on actual robots rather than in simulation. When these models have been implemented on robots, the experiments have occurred in small arenas with artificial visual cues. This is no doubt partly due to the main focus of this work on proving theories of biological navigation and mapping mechanisms – most of the experimental data regarding rodents has been gathered in small arenas (Fyhn, Molden et al. 2004; Hafting, Fyhn et al. 2005). In contrast, many mobile robots are expected to operate in large, complex environments. Little research to date has investigated the capability of biologically inspired computational models in such environments.
6 Robotic or Bio-inspired: A Comparison
The last four chapters have discussed the mapping and navigation problem and the range of system solutions that exist both in the animal world and on robot platforms. There are many forces driving the diversity that is apparent when examining these systems. In nature, creatures and their navigation systems have evolved to suit their environments and their own physical and sensory characteristics. In the natural world, the range of body types, sensory equipment, environments, and lifestyles has produced a myriad of solutions to the problems facing a creature that needs to move around effectively in its environment. Likewise in the research labs of robotics researchers and the domestic home or industrial workplace, mapping and navigation systems have developed in ways to suit the environments, the sensors available and the purpose of the robot. Given this diversity it seems a challenging task to identify one research direction that is most likely to yield a complete robot mapping and navigation system. The specific mechanisms and representations of each system differ so greatly that direct comparison is not possible. Fortunately, there are a number of fundamental mapping and navigation issues that are common to all these systems. Through consideration of these issues for all these systems it is possible to define goals for future research in this field, and in the process identify one means by which these goals may be met. The following sections compare and contrast biological systems and models of biological systems with robotic mapping algorithms in a range of areas. The comparison highlights the shortcomings of both types of system as well as their complementary characteristics. These shortcomings are particularly relevant when considering where future research into the mapping and navigation problem should focus. After a discussion of possible future research approaches, the final section describes the line of research that was pursued in the work described in this book.
6.1 Robustness Versus Accuracy One of the most significant differences between biological mapping systems and probabilistic methods is the world representations that they use. Many probabilistic systems incorporate high resolution occupancy grid maps, such as those shown in Fig. 6.1. In work by Grisetti, Stachniss et al. (2005) in the Freiburg campus environment, the M.J. Milford: Robot Navigation from Nature, STAR 41, pp. 55–60, 2008. springerlink.com © Springer-Verlag Berlin Heidelberg 2008
occupancy grid contained 6.25 million squares. By contrast, the place cells found in the rodent hippocampus appear quite coarse, and likely do not encode information to such a precise degree. Furthermore, many probabilistic mapping methods attempt to produce a Cartesian map with direct spatial correspondence to the physical environment. A 2.5 metre wide, 16.7 metre long corridor should appear as an identically sized region of free space in an occupancy grid map, with occupied cells along its edges.

Biological systems relax the requirement that there be direct Cartesian correspondence between a map and the environment. In rodents, place field size and shape can vary significantly depending on the type of environment. For instance, the addition of straight barriers to an environment resulted in the destruction of the place fields located where the barrier was added (Muller and Kubie 1987). Barriers can also cause an apparent remapping from the affected place cells into new place cells. Many animals appear to rely more on robust coping strategies than precise representations of their world. One such example is the desert ant; it navigates back to the general vicinity of its nest and then uses a systematic search procedure to find it (Wehner, Gallizzi et al. 2002).
Fig. 6.1. High resolution occupancy maps. (a) Grid map of the University of Freiburg campus, measuring 250 × 250 m, with a grid resolution of 0.1 m. (b) Grid map of the MIT Killian Court, measuring 250 × 215 m with a resolution of 0.1 m. Both maps were generated using the extended Rao-Blackwellized particle filter method (Grisetti, Stachniss et al. 2005). © 2005 IEEE.
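The 6.25 million figure follows directly from the map dimensions and resolution quoted in the caption; a quick check, with an assumed one byte per cell to indicate the rough memory footprint:

```python
# Freiburg campus map: 250 m x 250 m at 0.1 m grid resolution (Fig. 6.1a).
side_cells = int(250 / 0.1)            # 2500 cells along each side
total_cells = side_cells ** 2          # 6,250,000 occupancy cells
bytes_per_cell = 1                     # assumed storage cost, for illustration only
print(total_cells, total_cells * bytes_per_cell / 1e6, "MB")
```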
6.2 Map Friendliness Versus Map Usability

Maps that are highly accurate, precise, and Cartesian have a number of advantages and disadvantages. One immediate observation is that they are ‘human-friendly’. Looking at the map in Fig. 6.1b, a human can immediately make a number of high level observations about the environment – there are multiple rectangular loops in the environment, the environment is almost entirely made up of corridors running at discrete ninety-degree orientations, and so on. It is easy for a human to relate parts of the map back to physical places in the environment, and also to compare the map and environment on a global scale. In contrast, it is hard to present an equivalent map for any animal or insect. Place fields observed from brain readings in the rodent hippocampus convey only the
observed locations in the environment to which a single place cell responds (Fig. 6.2). There is no definitive method of combining the environmental features that a rodent perceives with the place fields of each place cell to produce a single ‘conventional’ map. An occupancy grid map could be created by using place cell activity and externally determined ranges to obstacles. It is unlikely, however, that such a map would be representative of the rodent’s actual mapping process. The extent to which rats rely on “range-to-obstacle” concepts is unknown, as is whether rats use a linear geometric scale for ordering the map.
Fig. 6.2. Illustrative example of place cell firing and occupancy maps. Darker squares indicate higher firing rates / increased occupancy periods. (a) Time spent by rat at each location in the circular environment. (b) Spike map showing firing rate of one cell. (c) The firing rate array is obtained by dividing the spike map by the time spent in each location, to reveal a single peak of activity.
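The division described in part (c) of the caption is easy to make concrete for binned data; the arrays below are invented for illustration, and unvisited bins (zero occupancy) are simply left undefined.

```python
import numpy as np

# Illustrative binned data: time spent (seconds) and spike counts per spatial bin.
occupancy = np.array([[2.0, 1.5, 0.0],
                      [3.0, 4.0, 1.0],
                      [1.0, 2.5, 0.5]])
spikes = np.array([[1, 0, 0],
                   [6, 20, 1],
                   [0, 2, 0]])

# Firing rate map = spike map / occupancy map; unvisited bins become NaN.
with np.errstate(divide="ignore", invalid="ignore"):
    rate_map = np.where(occupancy > 0, spikes / occupancy, np.nan)
print(rate_map)    # a single peak of activity emerges in the centre bin
```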
The maps shown in Fig. 6.1 are human-friendly in that a human can easily understand them. A human could use these maps to navigate efficiently to various locations. However, in doing so a number of high level classification and planning processes would be used. The long narrow rectangular sections of free space in Fig. 6.1b might immediately be classified as corridors, and the large obstacle shapes in Fig. 6.1a as buildings. A human might consciously trace along the paths shown in the map, before choosing and following the shortest one. The overall process would be very different to the subconscious navigation that occurs as a person drives to and from work, walks around an office building, or moves around in their house and garden. This distinction is an important one – the form of most maps created by probabilistic methods is governed by the underlying mathematical algorithms, or by human requirements that the map be in a form suitable for performance analysis. In contrast, biological mapping systems have evolved to produce representations that are suited to the navigation tasks animals must perform every day.
6.3 Sensory Differences

Animals and robot platforms have different sensory equipment and movement capabilities. Probabilistic methods exploit the particular characteristics of the sensors they use. Biological sensors have evolved for important activities such as finding food and mating, with biological navigation systems evolving to suit the sensors and activities.
Models of biological systems emulate the biological mapping and navigation mechanisms while using robotic sensors. Some researchers have applied compensatory algorithms to the output of the robotic sensors in order to more closely emulate the biological systems. For instance, the use of certain colour spaces minimises the effect of illumination differences (Tews, Robert et al. 2005), a task that many biological vision sensors achieve effortlessly. Other research approaches have involved plausibly modifying the models so that they take advantage of the differences in sensing equipment. For instance, instead of using a camera, a set of sonar sensors can be used to detect bearings and ranges to environment cues (Recce and Harris 1998). In the search for a biologically inspired mapping and navigation system, there are two ways to approach this problem of sensory differences. One approach is based on the fact that biological systems manage to perform quite well despite the specific limitations of some of their sensors. Although biological vision sensors can be quite advanced, there is a lack of accurate range sensing equipment (with the exception of such animals as bats). It seems reasonable that given the rapidly increasing computational power of modern computers, it should eventually be possible to create an artificial system equipped only with biologically faithful sensors that can match the navigation performance of animals. This approach dictates the meticulous emulation of the sensory equipment and theorised navigation strategies of animals even if this means purposefully handicapping certain aspects of the systems. The other approach involves trying to exploit the superior characteristics of some robotic sensors by extending the mapping and navigation models. Given the current inability to match the capabilities of many biological sensors, it also seems reasonable that the superior characteristics of some robotic sensors be exploited to make up for other shortcomings in the models. In this approach the focus is on creating functional robotic systems, rather than faithful replication of proposed biological mechanisms. Biological mapping and navigation mechanisms are modified to accommodate the different characteristics of robot sensors. This approach has received relatively little attention, unlike the fields focusing on biologically inspired mechanical robot design. There has been only a limited amount of research into developing practical biologically inspired robot mapping and navigation systems (Gaussier and Zrehen 1994).
6.4 Capability in Real World Environments

Biological navigation systems perform well enough to allow the millions of species of animals using them to function and survive every day. These systems combine advanced sensing, clever mapping and robust navigation mechanisms. There is a reasonable amount of knowledge about the capabilities of their sensors, and experiments have revealed a significant amount about the navigation capabilities of these animals. Many theories have been devised to account for their capabilities, but in some areas research is only starting to scratch the surface. However, there is no question that animals can navigate well in a wide range of complex, dynamic environments.

In most areas the state of the art in robotic mapping and navigation systems has not yet come close to matching the abilities of animals. In specific subsets of the problem and under certain conditions, these systems may outperform their biological counterparts, but this comes with the sacrifice of robustness and flexibility, and is usually
accompanied by a host of assumptions and the use of advanced sensing equipment. Nevertheless the top methods can produce impressive maps of large indoor and outdoor environments (Thrun 2000; Montemerlo, Thrun et al. 2003; Grisetti, Stachniss et al. 2005), and some robots can even navigate effectively using their maps (Yamauchi and Beer 1996; Thrun 2000). Most of the major mapping and navigation problems, such as closing the loop or coping with ambiguity, can be solved by one or another of these methods with appropriate assumptions.

In contrast to the robotic systems, computational models of the rodent hippocampus have only been used in simulation or on robots in small structured environments, and are yet to solve many of the major mapping and navigation problems. The small size and limited complexity of these environments reduces the major mapping and navigation challenges such as closing the loop, dealing with extended sensory ambiguity, and navigating to goals to trivial or non-existent problems. None of these models have been tested in or successfully applied to the large unmodified environments in which one might reasonably expect an autonomous mobile robot to function, such as an office floor or an outdoor set of pathways. Biologically inspired models are partly limited because the goal for much of the research is to validate, or otherwise, navigational theories for a particular animal, rather than to produce a fully functional robot system (Franz and Mallot 2000). The uncertainty about biological systems and subsequent speculation has produced models that may only be partially faithful to the biology, with resulting navigation performance that is inferior to that of the animal. In pursuing the development of a biologically plausible model, it is unlikely that a researcher will stumble upon a better performing model by chance – current biological systems are the product of a long sequence of evolution.
6.5 One Solution

Given the current state of robotic and biological mapping and navigation systems, several conclusions can be drawn. It is unlikely that research in the near future will create a perfect model of the higher level mapping and navigation systems, such as those of a rodent, primate, or human. Animal brains are only partially understood; researchers create theories from the readings of a few dozen electrodes, but the theories are far from being comprehensively proven. Even though models may recreate some aspects of animal navigation behaviour, there can be no real confidence that the underlying mechanisms driving the artificial system are the same as those in the real animal. Furthermore, even biological systems do not necessarily possess all the capabilities autonomous robots require to function in the challenging environments earmarked for their deployment.

Conventional robot mapping and navigation research is also facing many challenges. The most impressive recent demonstrations of mobile robotics have been largely made possible by good engineering and incremental improvements in algorithms, sensors, and computational power. The Defence Advanced Research Projects Agency (DARPA) Grand Challenge of 2005 is one prime example of this; while the onboard navigation and mapping systems were state of the art, it was only with an impressive array of expensive, high precision sensing equipment that this specific task
could be solved (Crane III, Armstrong Jr et al. 2004; Ozguner, Redmill et al. 2004). Some may argue that the continual improvement in sensors and computer power may eventually lead to navigation systems that surpass all the abilities of one of the most adept of navigators – humans. However, it is perhaps more likely that this milestone will be achieved through fundamental methodology changes, rather than steady computational and sensory improvements, although such changes will definitely facilitate the process.

So, where to look for a solution to the mapping and navigation problem? This book proposes that an eventual solution may be found using a biologically inspired yet completely pragmatic approach. This is not a novel proposal, with previous research investigating this hypothesis (Browning 2000). However, research on pragmatic models of biological systems to date has still had a heavy emphasis on biological plausibility and has had limited practical success. No research has developed or tested a biologically inspired model under the same conditions and criteria as conventional robot mapping and navigation systems. For example, the biologically inspired model developed by Arleo (2000) can perform localisation, mapping and goal recall, but only in a very small artificial arena, with continuous visual input from artificial cues or a distal cue source. Other less biologically faithful models have displayed limited re-localisation ability in robot experiments in an artificial arena with visual cues, and have been able to navigate to goals in simple T and two arm mazes in simulation (Browning 2000).

These approaches have also been fundamentally limited by the rodents on which they are based. The capabilities of rodent mapping and navigation systems do not necessarily fulfil all the desired capabilities of an autonomous mobile robot. If a biologically inspired system is ever to compete with conventional probabilistic methods, it must also contain solutions to the navigational shortcomings of animals. This work therefore sought to further investigate whether models of the rodent hippocampus can serve as a basis for a complete robot mapping and navigation system. The first major step towards this goal was to investigate the practical capabilities and limitations of existing state of the art models of the rodent hippocampus. Towards this end, Chapter 7 presents the results and findings of a pilot study that implemented a conventional model of the mapping and navigation mechanisms in the rodent hippocampus.
7 Pilot Study of a Hippocampal Model
Rodents are the most thoroughly understood animal with respect to both their mapping and navigation capabilities and the neuronal mechanisms by which they are achieved. However, of the few studies that have actually implemented a hippocampal model on a robot (Arleo 2000; Krichmar, Nitz et al. 2005), none have investigated the full extent to which the model can be used as a practical mapping and navigation system. This chapter presents the implementation and experimental testing of a classical model of the mapping and navigation processes in the rodent hippocampus. The model is presented in stages, starting with its representation of spatial orientation, followed by its representation of spatial location. Results from the model’s performance in robot experiments at each stage are presented. These results lead into a discussion of the limitations of the model, including one key finding. This is followed by an examination of the literature on rodent navigation to determine the plausibility of these findings. The chapter concludes by discussing the significance of these limitations with respect to creating a practical robot mapping and navigation system.
7.1 Robot and Environment

The model was initially developed in simulation before being deployed on a robot. For reasons of brevity, this chapter only describes the experimental setup for the robot experiments, which was mimicked in the simulation experiments. Experiments were run on a Pioneer 2DXe mobile robot from Mobile Robots Inc (formerly ActivMedia Robotics), as shown in Fig. 7.1. At this stage of the project the robot was equipped with a SICK laser range finder, a ring of eight sonar sensors, a Sony Pan-Tilt-Zoom (PTZ) camera, and wheel encoders. The robot also had a full onboard computer with a 400 MHz Athlon K6 processor and 128 MB of RAM. An 802.11b wireless ad-hoc network provided communications with a 1.1 GHz Intel Pentium 3 laptop which ran the majority of the model computation. Table 7.1 gives a summary of the equipment used in these experiments. The laser range finder and sonar sensors were not used in this pilot study.
Table 7.1. Summary of equipment used

Equipment                        | Description                                                  | Used in Pilot Study
SICK laser range finder          | Range readings in a 180° arc                                 | No
Ring of eight sonar transducers  | Array covers approximately 210° of arc; each sensor detects  | No
                                 | obstacles within a 30° cone                                  |
Sony PTZ camera                  | Field of view 48.8° horizontal by 37.6° vertical;            | Yes
                                 | 768 × 576 resolution                                         |
Motor encoders                   | 500 ticks per revolution                                     | Yes
Athlon K6 400 MHz processor      | On-board computer for local processing of data               | Yes
Intel Pentium 3 1.1 GHz laptop   | Off-board computer for running main model computation        | Yes
Fig. 7.1. Equipment used during indoor experiments. A Pioneer 2DXe robot was used with a wirelessly connected laptop.
The vision problem was simplified by using a system of artificial visual landmarks. Coloured sheets of paper were rolled into 210 mm tall cylinders with a diameter of 90 mm. Cylinder colour schemes were split between the upper and lower halves of the cylinder. Four different colours – red, green, blue, and magenta – provided a total of 16 types of unique landmark. The vision system could report the bearing to, range to, and colour of cylinders within its field of vision, along with an associated uncertainty value for each reading (Prasser and Wyeth 2003). Cylinders were consistently visible at ranges between one and three metres, with a distance uncertainty of about 10%.

The testing environment was a flat two by two metre area of a laboratory floor, with masking tape marking the boundaries (for the observer’s benefit only). The coloured cylinders were used as visual cues for the robot and were placed just outside the arena in various configurations, such as the one shown in Fig. 7.2. Because the vision system was trained to recognise rectangular areas of a solid uniform colour, it did occasionally incorrectly recognise people’s socks and posters in the background as coloured cylinders.

Fig. 7.2. The Pioneer robot in the two by two metre experimental arena. Coloured cylinders are arranged around the perimeter.
7.2 Complete Model Structure

There are a variety of theories on how rodents map and navigate in their environments, and consequently a range of models of these processes. However there is some degree of consistency across most models regarding the major inputs, components, and functions. The model described in this chapter follows the general consensus that rodents use both external and internal sensory input, and that they have neural structures for representing their spatial orientation and location, as described in Section 4.1.1. Fig. 7.3 shows the overall model structure. The robot’s pose is encoded in two modules known as the head direction (HD) and place networks. The head direction network encodes the robot’s orientation, the place network the robot’s location. Ideothetic information in the form of wheel encoder input drives the path integration process, which
updates the robot’s state information in both the head direction and place networks. Allothetic information in the form of camera images is processed to form a representation known as the local view (LV), which calibrates the robot’s state information stored in the head direction and place networks.
Fig. 7.3. Overall structure of the hippocampal model
7.3 A Model of Spatial Orientation

This section describes the implementation of a model of spatial orientation inspired by head direction cells in the rodent hippocampus. The word ‘inspired’ is used because only the major characteristics of the biological head direction cells were modelled. The biology served as a seed for the development of a simple, intuitive model of spatial orientation with the aim of providing a functional representation of the orientation of a mobile robot.

7.3.1 Representing Orientation

The core component of the spatial orientation model is an array of neural units or ‘cells’ roughly corresponding to biological head direction cells. Each cell is tuned to be maximally active when the robot’s orientation matches the cell’s preferred direction, as shown in Fig. 7.4. The cell’s activity level reduces as the robot orientation rotates away from this preferred direction. The cell arrangement reflects their associated robot orientations – nearby cells encode similar robot orientations. When the ensemble activity of the head direction cells is viewed as a bar graph, one can see a ‘packet of activity’ that resembles a Gaussian curve (Fig. 7.5). The centre of this ‘activity packet’ represents the current perceived orientation of the robot.

Fig. 7.4. Head direction network. The cell activation levels are shown by the darkness of the shading. The ensemble cell activity encodes the robot’s orientation.

Fig. 7.5. Activity profile in the head direction network. The packet is slightly skewed because the robot is rotating.

To create the desired cell firing profile, the cells are connected together in a competitive attractor network (CAN). Attractor networks, which are introduced in Chapter 5, have often been used to model head direction cells due in part to their desirable network dynamics (Redish, Elga et al. 1996; Stringer, Trappenberg et al. 2002). In this model 81 head direction cells are used to encode the full range of robot orientations. This means each cell is maximally active for a spread of about 4.4 degrees in the
robot’s range of possible orientations. One full iteration of the head direction CAN consists of the following steps (a skeletal outline of one iteration is sketched after the list):

1. Association of allothetic cues with robot orientation
2. Calibration of robot orientation using allothetic cues
3. Internal competitive attractor network dynamics
4. Update of orientation using ideothetic input
5. Global normalisation of network activity
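As a high-level illustration, the Python skeleton below strings these five steps together in order. The function arguments and step implementations are hypothetical placeholders (they are sketched in the following sections), not the original code.

```python
import numpy as np

N_HD = 81  # number of head direction cells (Section 7.3.1)

def hd_can_iteration(E, V, beta, omega, learn, inject, attractor, integrate):
    """One iteration of the head direction CAN (illustrative skeleton only).

    E     : (N_HD,) head direction cell activities
    V     : local view cell activities
    beta  : view-to-HD association weights
    omega : rotational velocity from the wheel encoders
    The four callables stand in for the steps described in Sections 7.3.2-7.3.5.
    """
    beta = learn(beta, V, E)        # 1. associate allothetic cues (Eq. 7.1)
    E = inject(E, V, beta, omega)   # 2. calibrate using allothetic cues (Eqs. 7.2-7.4)
    E = attractor(E)                # 3. internal CAN dynamics (Eqs. 7.5-7.7)
    E = integrate(E, omega)         # 4. ideothetic update (Eqs. 7.9-7.10)
    return E / E.sum(), beta        # 5. global normalisation (Eq. 7.8)
```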
Each of these steps is described in the following sections.

7.3.2 Learning Allothetic Cues

Without recalibration the robot’s perception of its orientation will gradually become incorrect. Recalibration is a two step process; the robot must first learn associations between its orientation and the input from its external sensors, and then use that information to correct its orientation state information when it later receives that same sensor input. The robot’s camera and vision processing module can see coloured cylinders and report on their colour, distance, and relative bearing from the robot, and associated uncertainties (Prasser and Wyeth 2003). A three-dimensional matrix of local view cells encodes the cylinder colour (type), distance and bearing. Activated local view cells are constantly being associated with active head direction cells through strengthening of weighted connections between them (Fig. 7.6). Although the vision system reports geometric information, the learning and re-localisation system does not explicitly use this information. It relies instead on the consistency in visual appearance of the environment at each robot pose. The link weight, $\beta_{ijkl}$, is updated by:

\[ \beta_{ijkl}^{t+1} = \beta_{ijkl}^{t} + \varsigma \left( E_l V_{ijk} - \delta_{decay} \right) \qquad (7.1) \]

Fig. 7.6. Active view cells are associated with active head direction cells
where the link weight is increased by an amount proportional to the product of activity in the head direction cell $E_l$ and view cell $V_{ijk}$, less a decay constant $\delta_{decay}$.

7.3.3 Re-localisation Using Allothetic Cues

Once the robot has learned the associations between sensory input and orientation, it can use them to correct its perceived orientation. Whenever a familiar visual scene is encountered, activity is injected into the head direction cells associated with the scene. If the robot’s perceived orientation is correct, this has the effect of reinforcing its belief. However, if the current perceived orientation is incorrect, the injected activity initiates an alternative hypothesis regarding the robot’s orientation, which then competes with the current hypothesis under the competitive attractor network dynamics. If the robot is rotating, the associated activity from the local view must be shifted somewhat according to the robot’s rotational velocity before being injected into the head direction cells. This lead stops the injected activity from affecting the path integration mechanism which is described in Section 7.3.5. The amount of shift required, e, is linearly related to the rotational velocity ω by a constant value $k_p$:

\[ e = k_p \omega \qquad (7.2) \]
The current view cell activity V is projected through the association weights β and a vision influence constant α into the head direction cells E:

\[ E_l^{t+1} = E_l^{t} + \alpha L_l^{t} \qquad (7.3) \]

\[ L_l^{t} = \sum_{i=0}^{X} \sum_{j=0}^{Y} \sum_{k=0}^{Z} \beta_{ijkl} V_{ijk}^{t} \qquad (7.4) \]
Fig. 7.7. Connectivity of local view cells with head direction cells. Local view cell activity is projected through association links to a cell $L_l$ in an array storing allothetic input. Activity in L is injected directly into the head direction cells.
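A minimal sketch of the association learning of Eq. 7.1 and the allothetic injection of Eqs. 7.3–7.4 follows. The array shapes, parameter values, and function names are illustrative assumptions rather than details taken from the original implementation.

```python
import numpy as np

def learn_associations(beta, V, E, zeta=0.1, delta_decay=0.01):
    """Eq. 7.1: strengthen links between co-active view cells and head direction cells."""
    # beta: (X, Y, Z, N_HD) weights, V: (X, Y, Z) view cells, E: (N_HD,) HD cells
    coactivity = V[..., np.newaxis] * E              # E_l * V_ijk for every pair
    return beta + zeta * (coactivity - delta_decay)  # decay applies to all links

def inject_allothetic(E, beta, V, alpha=0.2):
    """Eqs. 7.3-7.4: project view cell activity through the weights into the HD cells."""
    L = np.tensordot(V, beta, axes=([0, 1, 2], [0, 1, 2]))  # Eq. 7.4: allothetic array L
    return E + alpha * L                                    # Eq. 7.3
```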
Fig. 7.7 shows activity in the local view cells being projected into a copy of the head direction cell array, L, that stores activity caused by allothetic cues. This activity is injected directly into the actual head direction network.

7.3.4 Internal Dynamics

To achieve stable activity profiles in the head direction network, a system of localised excitation and global inhibition is used. Each cell is connected to all the cells in the network including itself via excitatory links. The self-connected weight is the largest, with weight values dropping off in a discrete Gaussian manner for connections to more distant cells. The matrix storing the excitatory weight values between all cells, δ, is given by:

\[ \delta_{ij} = \frac{1}{u} e^{-(i-j)^2 / s^2}, \qquad u = \sum_{i=1}^{n} e^{-i^2 / s^2} \qquad (7.5) \]
where s² is the variance of the Gaussian distribution used to form the weights profile. During an experiment, activity in each of the head direction cells is projected through the appropriate excitatory links back into the head direction cells. The updated cell activity level after excitation, $E_i^{t+1}$, is given by:

\[ E_i^{t+1} = E_i^{t} + \sum_{j=0}^{N} \delta_{ij} E_j^{t} \qquad (7.6) \]

where $E_j$ is the excitatory cell and $\delta_{ij}$ is the excitatory link value from cell j to cell i.
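Under the assumption of a wraparound (circular) orientation representation, the weight profile of Eq. 7.5 and the excitation step of Eq. 7.6 might be coded as below; the row-wise normalisation approximates the constant u, and the variance parameter is illustrative.

```python
import numpy as np

def excitatory_weights(n=81, s=2.0):
    """Eq. 7.5 (sketch): discrete Gaussian excitatory weights between HD cells."""
    i = np.arange(n)
    d = np.abs(i[:, None] - i[None, :])
    d = np.minimum(d, n - d)                  # circular distance (wraparound network)
    w = np.exp(-(d ** 2) / s ** 2)
    return w / w.sum(axis=1, keepdims=True)   # row normalisation approximates 1/u

def excite(E, weights):
    """Eq. 7.6: every cell receives activity from every cell through the weights."""
    return E + weights @ E
```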
Fig. 7.8. Head direction cells excite themselves and other cells. The total activity sum is used to inhibit all cells equally.
Global inhibition based on the total activity in the network is performed to ensure that without input from allothetic stimuli, smaller activity packets will gradually ‘die out’. The updated activity level of a cell after inhibition, $E_i^{t+1}$, is given by:

\[ E_i^{t+1} = \max\left( E_i^{t} + (1.0 - v),\ 0 \right), \qquad v = \sum_{i=0}^{N} E_i^{t} \qquad (7.7) \]
The internal dynamics of the head direction cells are shown in Fig. 7.8. To ensure that the total activity sum in the head direction network stays constant, a global normalisation step is performed after the allothetic association, allothetic calibration, and ideothetic update steps. The cell activity levels after normalisation, $E_i^{t+1}$, are given by:

\[ E_i^{t+1} = \frac{E_i^{t}}{\sum_{i=0}^{N} E_i^{t}} \qquad (7.8) \]
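The inhibition of Eq. 7.7 and the normalisation of Eq. 7.8 then reduce to a few lines; this is a sketch of the stated formulas rather than the original code.

```python
import numpy as np

def inhibit(E):
    """Eq. 7.7: inhibit all cells equally by the amount total activity exceeds 1.0."""
    v = E.sum()
    return np.maximum(E + (1.0 - v), 0.0)

def normalise(E):
    """Eq. 7.8: rescale so the total activity in the network stays constant."""
    return E / E.sum()
```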
7.3.5 Path Integration Using Ideothetic Information

Because external sensory input is not always available or useful, the robot must have some means of updating its state information based on internal sensory information. This is achieved by maintaining two sets of ideothetic input cells that project energy into the head direction network, as shown in Fig. 7.9. One set of cells is active for clockwise rotation, the other for counter-clockwise rotation, much like the model by Skaggs, Knierim et al. (1995). When the direction of rotation does not match the direction associated with the set of cells, all cells are forced to zero activity. Every cell in each network is initially activated according to the angular velocity of the robot. In addition, the activity profile of the head direction cells from the last iteration is also projected onto each set of ideothetic cells. The activity levels of the clockwise cells, $E_c$, and counter-clockwise cells, $E_{cc}$, are given by:

\[ E_c = k_\omega \omega_c + k_E E, \qquad E_{cc} = k_\omega \omega_{cc} + k_E E \qquad (7.9) \]

where $\omega_c$ and $\omega_{cc}$ are the detected clockwise and counter-clockwise angular velocities of the robot, $k_\omega$ is an activation constant, $k_E$ is a projection constant, and E is the current activity profile of the head direction cells. $\omega_c$ and $\omega_{cc}$ are limited to values of zero or larger. The activity from the ideothetic cells is injected into the head direction network through offset weighted connections. These weighted connections project the activity in the ideothetic cells into appropriately offset clockwise or counter-clockwise positions in the head direction network. This in effect allows the injected activity to lead
the current perceived robot orientation by a certain amount. For the implementation described here the offset size was five cells, or about 22 degrees, which was found to provide desirable path integration dynamics in the networks. After injection, the activity level of the head direction cells,
$E_i^{t+1}$, is given by:

\[ E_i^{t+1} = E_i^{t} + k \left[ (E_c)_{i+\delta} + (E_{cc})_{i-\delta} \right] \qquad (7.10) \]

where k is a constant determining the influence of ideothetic stimuli, and δ is the weight offset.
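The sketch below implements the ideothetic update of Eqs. 7.9–7.10 under the assumption that a positive rotational velocity corresponds to clockwise rotation; the constants are illustrative, and the circular shift by the offset δ is done with np.roll.

```python
import numpy as np

def hd_path_integration(E, omega, k=0.3, k_omega=1.0, k_E=1.0, offset=5):
    """Eqs. 7.9-7.10 (sketch): inject offset activity for the active rotation direction."""
    omega_c, omega_cc = max(omega, 0.0), max(-omega, 0.0)   # assume +omega = clockwise
    E_c = k_omega * omega_c + k_E * E if omega_c > 0 else np.zeros_like(E)
    E_cc = k_omega * omega_cc + k_E * E if omega_cc > 0 else np.zeros_like(E)
    # (E_c)_{i+offset} and (E_cc)_{i-offset}: shift the injected activity so that it
    # leads the current orientation estimate by 'offset' cells (about 22 degrees)
    return E + k * (np.roll(E_c, -offset) + np.roll(E_cc, offset))
```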
Fig. 7.9. Path integration cells in action for counter-clockwise rotation. Moving outwards from the centre, the first ring of cells are the head direction cells, followed by the counter-clockwise path integration cells and the clockwise path integration cells.
7.4 Model Performance

This section presents results from two simple experiments using this hippocampal model – a path integration module calibration experiment, and a SLAM experiment in one dimension. Experiments were conducted in both simulation and on the real robot, with comparable results, but due to the easy availability of ‘ground truth’ in simulation, results presented here are from the simulation experiments.

7.4.1 Experiment 1: Path Integration Calibration

Calibration of the path integration system can be achieved through two means – modification of the ideothetic stimuli influence constant k, or modification of the weight offset size δ. In this context calibration refers not to correction of odometric drift but tuning of the path integration system to produce shifts in head direction activity that correctly correspond to the actual rotation of the robot. Like many of the biological models of path integration in the rodent hippocampus, the model used in this work
does not scale to all velocities. However, for the range of velocities encountered in operation of the robot (0 to 50 º/s), it is sufficiently well behaved. In this particular implementation the ideothetic stimuli influence constant k was modified to calibrate the system. To calibrate the constant, the robot was instructed to spin at five degrees per second in 125 degree alternating turns. The velocity of the robot was obtained by filtering the velocity of the activity peak in the local view network corresponding to coloured cylinders in the environment. The difference between this visual velocity and the velocity of the peak activity packet in the head direction network was used to drive correction of the k constant. Path integration in the head direction network was initially quite poor but corrected itself significantly by twenty-five seconds into the trial and was accurate after fifty seconds, as shown in Fig. 7.10. The actual encoder velocities from the robot varied by about 300% during a supposedly constant velocity turn and as such were not recorded. The long calibration times were partly due to the scarcity of the cylinders – only two in this particular trial – and the camera’s small field of view, meaning cylinders were only in sight for about eight seconds at a time.
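The text does not give the exact correction law; a simple proportional adjustment of the ideothetic influence constant, driven by the difference between the visually observed velocity and the velocity of the head direction activity packet, is one plausible reading of the calibration described above (the gain value below is purely illustrative).

```python
def calibrate_k(k, visual_velocity, packet_velocity, gain=0.01):
    """Proportional correction of the ideothetic influence constant (illustrative only).

    visual_velocity : filtered velocity of the cylinder peak in the local view network
    packet_velocity : velocity of the dominant activity packet in the HD network
    """
    return k + gain * (visual_velocity - packet_velocity)
```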
Fig. 7.10. Odometry calibration. The calibration was mostly complete after fifty seconds. Once corrected the calibration parameters remained constant as shown by the last two cycles in the graph.
7.4.2 Experiment 2: Localisation and Mapping in 1D

In the second experiment the robot was set to rotate at five degrees per second in the centre of the arena, with each full rotation taking approximately 72 seconds. Two identical cylinders were placed just outside the arena as visual cues, as shown in Fig. 7.11. During the first four rotations, the robot was allowed to learn associations between these visual cues and its orientation.

Fig. 7.11. Environment for 1D localisation and mapping experiments

Fig. 7.12. Results of orientation calibration. Note the re-localisation at t = 324 seconds. The 200° jump corresponds to the 160° robot kidnap.

Using the pre-calibrated path integration parameters,
the activity peak in the head direction network closely tracked the actual robot orientation, as shown by Fig. 7.12. Because the learning and recalibration processes occur concurrently, the model also used the visual cue information to maintain localisation during this time. Halfway through the fifth lap the robot was kidnapped in orientation by 160 degrees, shown by the straight vertical segment of the robot orientation curve. This introduced a 160 degree error in the head direction network’s representation of robot orientation. Then at 318 seconds into the experiment, the robot saw the rightmost
coloured cylinder for the third time. Due to the ambiguity of having two identical cylinders in the environment, this vision information introduced two similarly weighted robot orientation hypotheses in the form of peaks of visually driven activity. Fig. 7.13 shows these hypotheses and how the activity in the head direction network changed over a period of about six seconds under this visual input. The size of these visually driven peaks also varied during this time. Although both visual peaks were dominant at different times, the network dynamics resulted in the leftmost hypothesis, which happened to be correct, becoming dominant.
Fig. 7.13. Re-localisation under visual input. Seeing the rightmost cylinder introduced two robot orientation hypotheses at t = 318 seconds. Although both hypotheses were dominant at different times, the network dynamics resulted in the robot re-localising to the leftmost hypothesis.
Six seconds after completing this first re-localisation, the robot saw the top cylinder for the third time, at t = 332 seconds. The vision system once again introduced two activity peaks regarding the robot’s orientation (Fig. 7.14). Between t = 332 seconds and t = 337 seconds, both visual peaks were alternately dominant. However, in contrast to the situation shown in Fig. 7.13, one visual peak overlapped and reinforced the current robot orientation hypothesis. This combination of a visual hypothesis with the current hypothesis reduced the effect of the other visual peak, which only introduced a minor peak in the head direction cell activity. The head direction network then correctly tracked the actual robot orientation for the remainder of the experiment (Fig. 7.12, t = 338 – 530 seconds).
Fig. 7.14. Visual reinforcement of the current orientation hypothesis. Seeing the top cylinder introduced two visually driven activity peaks at t = 332 seconds. The right visual peak reinforced the current, correct orientation hypothesis, while the left peak introduced only minimal activity in the head direction network.
These results illustrate that: • path integration parameters can self-calibrate based on visual input during robot movement, • ideothetic wheel encoder input can sensibly update the robot’s perceived orientation, • allothetic visual input can correct localisation errors, and • one-dimensional SLAM can be performed in a simple environment with ambiguous visual cues.
7.5 A Model of Spatial Location

This section describes the implementation of a model of spatial location inspired by place cells in the rodent hippocampus. Some of the mathematical basis for the model is a two-dimensional version of that used for the head direction model and hence is not discussed in detail, having already been covered in Section 7.3.

7.5.1 Representing Location

The place cells are modelled as a two-dimensional matrix of cells, with each cell tuned to be maximally activated when the robot is at a specific location. A coarse
representation is used, with the path integration system tuned so that each place cell represents a physical area of approximately 250 mm by 250 mm. This is quite different to most previous approaches involving place cells, which have used a much finer resolution. The entire Pioneer robot covers an area represented by only four place cells. However, the place field size to robot size ratio compares favourably with that observed in nature – the stable place fields have areas of about 0.3 m², compared to a robot area of about 0.16 m², a ratio of about two to one. Place fields in healthy rats have been recorded as having an area of about 0.056 m² (Save, Cressant et al. 1998), compared to a rat ‘area’ of about 0.015 m², a ratio of about three to one.

To provide the appropriate system dynamics the model uses a competitive attractor network arrangement similar to that used for the head direction network. The cells are arranged in a two-dimensional matrix with full excitational interconnectivity between all cells. The excitational weights are created using a two-dimensional version of the discrete Gaussian distribution in Equation 7.5. This ensures the activation of each place cell reduces as the robot moves away from the cell’s preferred location. One full iteration of the place CAN consists of the same five steps used in a head direction CAN iteration (see Section 7.3.1), with orientation replaced by location:

1. Association of allothetic cues with robot location
2. Calibration of robot location using allothetic cues
3. Internal competitive attractor network dynamics
4. Update of robot location using ideothetic input
5. Global normalisation
Each of these steps is described in the following sections.

7.5.2 Learning Allothetic Cues

The need for correction of the robot’s state using visual input is especially important when considering its location, since slight errors in orientation quickly lead to large positional errors as translation occurs. Unlike the head direction system, there is no compensation for the current velocity of the robot when injecting allothetic input into the place cells. The place cells are relatively coarse when compared with the head direction cells and as such are not as sensitive to a slight lag in allothetic input. The learning of vision-place associations is essentially a two-dimensional version of the process described in Section 7.3.2. Active local view cells become associated with active place cells by a process similar to the one shown in Eq. 7.1.

7.5.3 Re-localisation Using Allothetic Cues

Active local view cells inject activity into the place cells through association links. The place cell activity after allothetic injection, $C_{lm}^{t+1}$, is given by:

\[ C_{lm}^{t+1} = C_{lm}^{t} + \rho \sum_{i=0}^{X} \sum_{j=0}^{Y} \sum_{k=0}^{Z} \beta_{ijklm} V_{ijk} \qquad (7.11) \]
Fig. 7.15. View cell connections to the place cells. Activity in the local view cells is projected to each place cell through weighted connections.
where $V_{ijk}$ is the local view cell activation level, $\beta_{ijklm}$ is the association link between local view cell ijk and place cell lm, and ρ is the overall allothetic influence constant. Fig. 7.15 shows active local view cells projecting to the place cells.

The nature of the competitive attractor dynamics ensures that activity in the place cell network tends to converge into peaks of activity or activity packets. Fig. 7.16 shows two examples of place cell activity. Fig. 7.16a shows the activity levels in a 51 by 51 grid of place cells. There is one major peak representing the most probable location of the robot and another much smaller peak. This small peak has not been created through ideothetic input, but rather by allothetic input which is shown in Fig. 7.16b. From a purely visual perspective, the robot is equally sure it is in two different locations, represented by the two peaks of the same size. This visual input is driving the activity in the place cells. The larger place cell peak represents the current place hypothesis and is being supported by one of the visual input peaks, while the smaller second peak has been started by the other visual input peak. Because of their close proximity, without continuous visual stimulus, the attractor dynamics would tend to merge the two peaks rather than have them compete against each other. In this particular case, the small peak in Fig. 7.16a will compete with the larger peak but will not win unless future allothetic input supports the smaller peak more strongly than the larger one. The place network deals with ambiguous visual input through this allothetic injection and competitive attractor network dynamics.

View cells can project to more than one location in the place cell network, as shown by Fig. 7.17. The plot shows the strength of allothetic connections between a local view cell and a number of place cells. Although the view cell is most strongly associated with one cluster of place cells, it has weaker connections to several other areas. This type of firing profile for a visual cell can be caused by ambiguous visual information in the environment, where the robot thinks it is seeing the same visual scene at several different locations in the environment.
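Eq. 7.11 is the two-dimensional analogue of the head direction injection of Eqs. 7.3–7.4; a brief sketch, with illustrative array shapes and parameter values, is shown below.

```python
import numpy as np

def inject_place_allothetic(C, beta, V, rho=0.2):
    """Eq. 7.11 (sketch): project local view activity into the place cells."""
    # C: (L, M) place cells, V: (X, Y, Z) view cells, beta: (X, Y, Z, L, M) weights
    return C + rho * np.tensordot(V, beta, axes=([0, 1, 2], [0, 1, 2]))
```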
7.5.4 Internal Dynamics

Stable activity profiles are maintained in the place cell network through competitive attractor network excitation and inhibition. The dynamics are a two-dimensional version of those used for the head direction cells. A two-dimensional Gaussian distribution is used to create the excitatory weights for each cell. The updated place cell activity levels after excitation, $C_{ij}^{t+1}$, are given by:

\[ C_{ij}^{t+1} = C_{ij}^{t} + \sum_{k=0}^{I} \sum_{l=0}^{I} \varepsilon_{ijkl} C_{kl}^{t} \qquad (7.12) \]
Fig. 7.16. Sample activity profiles. (a) Activity in a 51 by 51 place cell grid. Total cell activity is normalised, so cell activity can be thought of as a rudimentary probability of being located at the cell’s corresponding physical location. (b) The activity caused by visual input for the same place cell grid (before scaling). From a vision perspective there is equal probability of being located in two different locations.
Fig. 7.17. View cell-place cell weight map. Lighter shading indicates stronger connections from the view cell to the place cell. When activated this local view cell injects activity strongly into one area of the place cells and weakly into several other areas.
The main difference between the place cell and head direction networks is that there is no wraparound for the place network – cells at one edge of the matrix do not project strongly to cells at the opposite edge of the matrix (they do project very weakly to each other because each cell excites every other to some degree). To minimise boundary effects a buffer layer of cells is used around the edges of the place cell matrix; these cells can be active and affect cells in the matrix proper, but do not themselves have any spatial correspondence to the physical world.

7.5.5 Path Integration Using Ideothetic Information

The current activity packet in the place network is projected to its anticipated future position based on the velocity and orientation of the robot. Fig. 7.18 shows the overall path integration process. The robot orientation is not usually known exactly – rather it is represented by a number of activated head direction cells. The ideothetic input is processed on an individual basis for each head direction cell and its associated activity level. The place cell activity after path integration, $C_{ij}^{t+1}$, is given by:

\[ C_{ij}^{t+1} = C_{ij}^{t} + \sum_{z=0}^{N} E_z^{t} \sum_{x=0}^{1} \sum_{y=0}^{1} f(v)\, C_{x+i+i_s,\ y+j+j_s}^{t}, \qquad v = f(V_{abs}, z) \qquad (7.13) \]

The activity levels in each of four cells C are multiplied by a fractional function f(v) that is dependent on velocity v, which is itself a function of head direction cell index z and the absolute velocity of the robot $V_{abs}$. The four cells are selected based on
Fig. 7.18. Path integration in the place cells. Current place cell activity is updated under path integration by orientation information stored in the head direction network and translational velocity information.
the offsets $i_s$ and $j_s$, which are functions of v. The sum of this activity is then multiplied by the activity level in each head direction cell E. This process is repeated for all head direction cells, with the sum of its output injected into the place cells as ideothetic input. The coarseness of the place cell representation reduces the value of explicitly representing the uncertainty in the robot’s translational velocity with a probability distribution, as is often done with particle filters (Thrun, Fox et al. 2000).
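A sketch of the place cell path integration of Eq. 7.13 is given below, interpreting the fractional function f(v) as bilinear interpolation between the four neighbouring cells; that interpretation, the use of wrapping shifts (the model described above instead uses a non-wrapping buffer layer), and the mapping from head direction cell index to angle are assumptions made for illustration only.

```python
import numpy as np

def place_path_integration(C, E, v_abs, cell_size=0.25):
    """Eq. 7.13 (sketch): shift place activity once per head direction cell.

    C: (L, M) place cell activities, E: (N_HD,) HD cell activities,
    v_abs: absolute translational velocity, cell_size: place cell size in metres.
    """
    n_hd = len(E)
    update = np.zeros_like(C)
    for z, e_z in enumerate(E):
        if e_z == 0.0:
            continue
        theta = 2.0 * np.pi * z / n_hd                     # orientation of HD cell z
        dx = v_abs * np.cos(theta) / cell_size             # displacement in cell units
        dy = v_abs * np.sin(theta) / cell_size
        i_s, j_s = int(np.floor(dx)), int(np.floor(dy))    # integer offsets i_s, j_s
        fx, fy = dx - i_s, dy - j_s                        # fractional parts for f(v)
        base = np.roll(np.roll(C, i_s, axis=0), j_s, axis=1)  # shift packet (wrapping simplification)
        bx = np.roll(base, 1, axis=0)
        by = np.roll(base, 1, axis=1)
        bxy = np.roll(bx, 1, axis=1)
        interp = ((1 - fx) * (1 - fy) * base + fx * (1 - fy) * bx +
                  (1 - fx) * fy * by + fx * fy * bxy)
        update += e_z * interp                             # weight by HD cell activity
    return C + update
```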
7.6 Model Performance

To test the ability of the place cell model to track a robot’s position in space over time, experiments were performed in the same arena as described in Section 7.4. Experiments were performed in both simulation and on a real robot. This section presents the results obtained from the simulation experiments, in which the robot’s actual location could be continuously tracked, enabling the demonstration of the key finding of this part of the work. A brief discussion of the real robot tests is included at the end of the section.

7.6.1 Experiment 3: Localisation and Mapping in 2D

The simulated robot was instructed to move around its two by two metre environment using random translations and rotations. Only one action was performed at a time, so
Fig. 7.19. Localisation using path integration. Over the short term the place cell model was able to track the robot’s position effectively.
the robot never followed any curved trajectories. For these experiments the path integration parameters were all hand-tuned before the experiment, due to the increased complexity of the self-calibration process in two dimensions. The robot was able to track its position quite well for short periods of time as shown in Fig. 7.19. The place network kept the robot localised over the short-term by using path integration. The system was also sometimes able to re-localise when kidnapped; the chances of successfully re-localising reduced as the period between the kidnap and the detection of a familiar visual scene increased. In longer experiments the network’s tracking ability proved to be unstable. Over the period of an hour the robot became lost and its perceived location moved well outside its two by two metre arena. The positional error of the simulated robot during the experiment is shown in Fig. 7.20. The error appeared relatively stable during the first 20 minutes of the experiment, but quickly grew in later stages of the experiment to be larger than the size of the experimental arena. The system was also implemented on the real robot with similar results, although the trial could not be run for as long. The place cell model was able to keep the robot localised for short test durations only. Because the robot was moving based on its perceived position, not its actual position, a small error in pose usually compounded quite rapidly and the tests had to be terminated before the robot collided with objects outside the arena. The following section discusses the reasons for this tracking failure and highlights two problems with the current system.
Fig. 7.20. Positional tracking error. The error remained small for about 20 minutes before growing in an unbounded manner.
7.7 Discussion and Summary

When the system re-localised its position through viewing of a familiar scene, it did not always re-orientate to the associated robot orientation as well. The opposite situation could also occur; re-orientation but no re-localisation. In both cases, incomplete re-localisation resulted from the head direction networks and place cell networks not being rigidly coupled to each other. Incomplete re-localisation had some role in the failure of the system to track the robot’s location over time. Arleo (2000) addressed this problem by imposing an extra recalibration constraint on the place and head direction networks. The head direction network not only needed a familiar scene to recalibrate orientation but also required that the robot perceive itself to be located at a position associated with that scene. This approach solves the localisation problem assuming the robot has sufficient time to sequentially re-localise position and then orientation. Arleo also proposed it would allow the robot to recover from kidnapping, although no results were presented.

The kidnapping scenario can however be analysed in a thought experiment. If the robot was kidnapped it would wander until seeing a familiar scene. At that moment the robot would not immediately re-orientate itself since its perceived location would not be the one associated with that scene. However it would re-localise its position, after which it could then re-orientate itself since it now had the perceived position associated with the scene. The robot would need to slow or stop to allow the re-orientation to occur. This problem is a relatively minor one and could be solved in a similar way for the model presented here. However the model has a much more significant limitation.

The major limitation of this hippocampal model is that it cannot maintain multiple pose hypotheses. Although it can maintain multiple location and orientation hypotheses, each place cell activity packet is associated with the same head direction activity profile – different place cell activity packets cannot have different orientations associated with them. If ambiguous visual input suggests two possible poses, these poses cannot be verified or discarded through further robot movement and visual input because path integration will move all place cell activity packets in the same net direction of all possible head directions. This renders the model useless in any environment where the robot regularly has long periods of ambiguous allothetic information. This was the major cause of the localisation failures in the experiments described in Section 7.6.1.

This problem is illustrated in Fig. 7.21. It shows a schematic of the robot at two different moments in time when it has associated its current place and head direction network activity packets with an identical view of two cylinders in the environment. Fig. 7.22 shows the robot some time later when it encounters the same visual scene again, prompting two possible robot pose hypotheses. The robot needs to move to receive further visual input so it can determine which position and orientation is correct. However there is only one head direction network containing both orientation possibilities. Each activity packet in the place cells is associated with both these orientation possibilities. Ideally, to maintain multiple pose hypotheses under further robot movement, the pose hypothesis in the left of Fig. 7.22 would move downwards as the robot
moves forward. However, because it now has a second direction associated with it (dotted upward arrow) that should only be associated with the other pose possibility, it will move in the net direction (shown by the short thick arrow), and will spread both upwards and downwards. The right pose hypothesis will also spread and move in the same manner. Fig. 7.23 shows the activity packets after further robot movement. Without appropriate binding, the separate representations of place and orientation cannot sensibly maintain multiple hypotheses of pose. The neurophysiological separation of the place and head-direction cells (O'Keefe and Conway 1978; Ranck 1984) renders this a spatial memory form of the binding problem (Thiele and Stoner 2003). While mechanisms such as synchronisation have been proposed as a way of binding the information of different neurons in other domains (Gray, Konig et al. 1989), we observed that there was a simple modification of the model that offered an elegant engineering solution to the spatial binding problem.
Fig. 7.21. Learning under visual ambiguity. An overhead view of the robot (large circle) at two moments in time where it has associated the place cell activity packet (grey shading) and orientation (long solid arrow) of the robot with an identical view of two cylinders. In Fig. 7.22 the robot re-encounters this scene.
Fig. 7.22. Re-localisation failure under visual ambiguity. The robot encounters the same scene as in Fig. 7.21 and injects activity supporting the two possible locations and two possible orientations. Further robot movement results in both activity packets smearing upwards and downwards and moving in the net direction shown by the short thick arrow.
Fig. 7.23. Activity packets after further robot movement. Both activity packets have smeared upwards and downwards and have moved in the same erroneous direction.
7.7.1 Comparison to Biological Systems

Are rats able to maintain multiple pose hypotheses? The research literature was examined to determine whether this problem might be solved using any theories developed from experimental observations.

Early work by Bostock, Muller et al. (1991) investigated the effects of substituting a new stimulus for a familiar stimulus in the environment. Rats performed food foraging activities in a cylindrical arena with a white cue card on the wall. After stable firing fields were developed, the white cue card was replaced by a geometrically identical black cue card. Place cells responded in two different ways and were classified by the researchers as “rotational” or “complex”. The firing fields of the rotational cells could be rotated to superimpose on the original firing fields in the presence of the white cue card. However, the firing fields of the complex cells either disappeared or changed so that they could not be superimposed on the original fields even through rotation. The researchers concluded that in a new environment the hippocampal spatial representation initially generalised through a rotation of the previous map before discriminating with a complex remapping of the environment.

Changes in environment shape can also lead to gradual remapping. In experiments where rats foraged in both square and circular walled arenas, place cells took several days or weeks to differentiate between the two environments (Lever, Wills et al. 2002). Faster remapping occurred when the environments were further differentiated in colour and texture, with the majority of observed cells differentiating between the two environments within a day (Wills, Lever et al. 2005).
Experiments in which rats are pre-trained to search at the midpoint of two markers but are then given only one in the test show the rat searching in two possible locations relative to the single marker, as if treating that marker as first being one and then the other of the original two (Redish 1999). This behaviour suggests that rats may switch internal maps, although it could be explained by the rat maintaining two pose hypotheses when confronted by ambiguous visual input.

When exposed to environments with sections that are identical in both appearance and physical orientation, rats appear to construct two distinct hippocampal maps that overlap yet are significantly different (Skaggs and McNaughton 1998). When rats were kidnapped from one identical region to another, their place fields initially fired as if the rat had not been kidnapped, but then in some of the rats the place cell firing changed to represent the rat’s true physical location – partial remapping occurred when faced with an ambiguous situation. When faced with a conflict of distal and local cues the entire spatial representation in the hippocampus can split into opposing representations, as shown by experiments performed by Knierim (2002). The results of these experiments show that several predictions made by current competitive attractor models do not occur, although Knierim points out that there is a wealth of specific evidence and theory that supports these predictions as well.

The author has not yet come across any place cell activity recordings from a real rat that show multiple peaks being maintained for any length of time. Although it appears that rats can switch between multiple maps maintained by the same set of place cells, no evidence of them concurrently updating multiple maps has been found (Bostock, Muller et al. 1991). Rats seem to keep only one map in ‘active’ memory at any one time, switching maps when prompted by significantly inconsistent visual input. The experimental evidence to date suggests that the hippocampus may deal with ambiguity and perceptual inconsistency by learning and switching between multiple maps rather than by maintaining multiple pose hypotheses within the one map.

7.7.2 Comparison to Other Models

This pilot study has revealed a fundamental limitation in the hippocampal model described in this chapter that makes it unsuitable for use as a robot mapping and navigation system. Before attempting to address the model’s shortcomings, it is beneficial to compare and contrast its structure and performance with other models of the rodent hippocampus. This section focuses on two issues – the balance between self-calibration and visual cue reliance, and the ability to perform SLAM in large and ambiguous environments.

Almost all robot-based research into hippocampal models has been performed in small, highly controlled environments. Artificial visual cues are often used, and in many cases every visual cue is unique, making the problem of localisation and mapping significantly less difficult. Some models also rely on distal cues that are effectively invariant in absolute bearing from all locations inside the small experimental arena (Arleo 2000). The nature of these environments is at least partially the reason why many models rely on continuous recognition of well-defined cues, and define
their path integration characteristics through initial visual exploration of the environment. This approach removes the need for a hard-coded path integration module, but is not necessarily ideal for implementation on robots in real world environments. The compromise between the level of reliance on visual cues and the ability of a model to self-calibrate its path integration system exists throughout the computational models of the rodent hippocampus. The model described in this chapter has a reduced dependence on visual cues at the expense of requiring a hard-coded path integration system.

For robots, this approach is sensible because many robotic platforms have quite well-defined movement characteristics. It is generally much easier to create a motion model for a wheeled robot than for a quadruped animal. The movement of a wheeled robot can be tracked accurately in the short-term by using these motion models. Even on non-wheeled robots, path integration can be performed using other sensors such as inertial movement units and compasses. In addition, these motion models can model the uncertainty in movement due to wheel slippage and other causes, resulting in awareness of both the robot’s new position and the uncertainty in that position. What many robots do not have is the ability to continually track the position of features in the environment. Without accurate laser range finders, a robot must deduce ranges from visual stimuli. Range-finding from stereo vision is possible but requires that the robot reliably track visual features as it moves, which may require assumptions about the environment. Extracting reliable range information using only a single vision sensor is an even more difficult task (Davison, Reid et al. 2007). In addition, a robot may not have the time or necessary conditions to perform its path integration calibration in a real world environment. Vision may only be intermittently available due to the environment being obscured by people or dynamic objects. A robot may need to start functioning as soon as it is placed in a new environment, with no time for tuning. It seems reasonable then that any mapping and navigation system for a real robot should exploit the motion models that exist for robot platforms of various types.

The second major issue with computational models of the rodent hippocampus is that none have been successfully used in a large, unmodified, and visually ambiguous real world environment. For example, no model has been demonstrated to recover from the large localisation errors that result from long movement periods without definitive vision information. This ability to ‘close the loop’ is a fundamental requirement of any autonomous mobile robot operating in unmodified real world environments. Most importantly, few if any models have the ability to robustly handle significant sensory ambiguity in the environment. As discussed in Section 7.7, any model based on a separate head direction-place network structure and without appropriate binding between the two is unable to sensibly maintain multiple hypotheses of pose over time. This is where the biological models perform most poorly, when compared with state of the art probabilistic mapping and navigation methods.

The next chapter describes a practical extension of the rodent hippocampal model known as the RatSLAM model. RatSLAM retains the desirable aspects of the model presented in this chapter, such as the competitive attractor dynamics.
However, it also extends the concept of place and head direction cells to solve the problems of maintaining multiple hypotheses of pose and re-localising simultaneously in orientation and location.
7.7.3 Conclusion

This chapter has presented the implementation of a hippocampal model in a simple environment with artificial visual cues. The model was able to perform SLAM in one dimension but could not maintain localisation when the problem was extended to the full robot state. The inability to perform full pose SLAM was due to a fundamental limitation of the model – it cannot correctly maintain multiple hypotheses of pose. This is a key requirement if a robot is to function in ambiguous environments. A review of the literature revealed that rats may use multiple overlapping maps in ambiguous environments or switch maps when prompted by visual evidence, but there was no strong evidence supporting the hypothesis that rats maintain multiple pose hypotheses over time. The analysis of this model and other hippocampal models suggested that in their current form, they are poorly equipped to deal with large, ambiguous environments.
8 RatSLAM: An Extended Hippocampal Model
Separate representations of robot orientation and spatial location are inherently unsuitable for mapping and navigation in large, ambiguous environments, as demonstrated by the work presented in Chapter 7. This chapter describes the implementation of an extended hippocampal model known as RatSLAM, which combines the concept of head direction and place cells to form a new type of cell known as a pose cell. The chapter starts with a description of the extended pose cell model. RatSLAM can use a number of vision processing systems, which are described in Section 8.2. Section 8.3 discusses a way of visualising the system’s non-metric performance. RatSLAM was tested in a range of indoor and outdoor experiments, which are described in Sections 8.4 and 8.4.4. The chapter then concludes in Section 8.5.
8.1 A Model of Spatial Pose

This section describes the overall structure and individual components of the RatSLAM system.

8.1.1 Complete Model Structure

The RatSLAM system contains several major components, as shown in Fig. 8.1. The robot’s pose is encoded in a single CAN module known as the pose cell network. Both ideothetic and allothetic stimuli influence the activity in the pose cells through their respective processing modules.

8.1.2 Biological Evidence for Pose Cells

There is some limited biological evidence for cells firing in both a directionally and spatially selective manner in the rodent brain. Directionally selective place cell discharges have been reported in rodent experiments in an eight-arm maze, such as shown in Fig. 8.2 (McNaughton, Barnes et al. 1983). One theory put forward by this work is that place cells may become directional in narrow geometric environments such as corridor sections. In such an environment, the rat is unlikely to turn around except at either end, and hence only experiences stimuli in one of two directions as it moves up and down the corridor.
Fig. 8.1. The RatSLAM system structure. The pose cell network replaces the separate head direction and place cell networks.
Conversely, in open environments such as a cylindrical arena, place cells are mostly non-directional (Muller, Kubie et al. 1987). Taube (1990) reported cells whose firing rate was modulated approximately equally by both head direction and location, although the overall correspondence was weak. A computational model has been developed that predicts many of the experimental observations of directional place cells (Sharp 1991). In simulation experiments this model developed directional place cells in an eight-arm maze but non-directional cells in an open cylinder arena. Recently, ‘grid cells’ have been found in the entorhinal cortex (Fyhn, Molden et al. 2004; Hafting, Fyhn et al. 2005; Sargolini, Fyhn et al. 2006); these cells fire at regular intervals in the environment. Some grid cells are also conjunctive, responding only at specific orientations as well as spatial locations. However, their precise role in mapping is the subject of much current speculation, and further experimental work will surely yield more information about their function. Grid cells are discussed further in Section 12.4.
Fig. 8.2. The eight-arm maze used in rodent experiments (Carlson 1999)
Excepting grid cells, the experimental evidence does not support the concept of inherently directionally selective place cells, and the pose cell model presented here cannot be described as biologically plausible. Practical performance rather than biological plausibility was the aim of this research. While these experimental results helped retain some perspective on the concept of pose cells, the model described here is the result of a pragmatic approach to developing a model of cells that encode pose.

8.1.3 Representing Pose
To form a population of pose cells, the competitive attractor network structures that were used to model head direction and place cells were combined to form a three-dimensional structure. The pose cells are arranged in an (x', y', θ') pattern, allowing the robot to simultaneously represent multiple pose estimates in x', y', and θ' (Fig. 8.3). Primed co-ordinate variables are used because, although the correspondence between cells and physical co-ordinates is initially Cartesian, this relationship can become discontinuous and non-linear as the system learns the environment. For example, in indoor experiments, each pose cell initially fires maximally when the robot is in a 0.25 × 0.25 m area and within a 10 degree band of orientation. However, as an experiment progresses, the pose volume each pose cell corresponds to can grow, shrink, warp, or even disappear under the influence of visual information. The network’s (x', y') plane can be arranged in any shape that tessellates, such as a square (Fig. 8.3), rectangle (Fig. 8.4), or hexagon. Due to the relative complexity of the wraparound connections, a hexagonal arrangement may be optimal because it requires the fewest wraparound connections for a given network size.
Fig. 8.3. The three-dimensional pose cell model. Each dimension corresponds to one of the three state variables of a ground-based robot.
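As an aside, a minimal sketch of the wrapping (x', y', θ') structure is a three-dimensional array indexed modulo its size in each dimension, from which the most active cell can be read out as the dominant pose estimate. The dimensions and helper names below are illustrative assumptions, not the values used in the experiments reported later.

```python
import numpy as np

NX, NY, NTH = 40, 40, 36          # illustrative pose cell dimensions
pose_cells = np.zeros((NX, NY, NTH))

def wrap_index(a, b, c):
    """Map any (x', y', theta') offset onto the torus-like pose cell matrix."""
    return a % NX, b % NY, c % NTH

def most_likely_pose():
    """Decode the ensemble: indices of the maximally active pose cell."""
    return np.unravel_index(np.argmax(pose_cells), pose_cells.shape)

# Seed a single hypothesis beyond the array bounds and read it back.
pose_cells[wrap_index(-1, 5, 38)] = 1.0   # wraps to (39, 5, 2)
print(most_likely_pose())
```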
Cells in the pose cell matrix have a continuous range of activity between 0 and 1. The activity in each cell encodes the probability of the robot having the specific pose associated with that cell. On a cell-by-cell basis this does not have much meaning, but when viewed as a whole, the ensemble activity can be decoded to provide an estimate of the robot’s most likely pose. Fig. 8.4 shows a snapshot of cell activity in the pose cell matrix during an experiment. The largest, most strongly activated cluster of cells encodes the most likely pose of the robot.
Fig. 8.4. Snapshot of pose cell activity during an experiment. Several activity packets of varying size are evolving under the influence of the competitive attractor dynamics.
Other, less dominant clusters or activity packets encode other possible hypotheses regarding the robot’s pose.

8.1.4 Internal Dynamics
As with the head direction and place networks, competitive attractor network dynamics control the activity in the pose network. There are three stages to the internal dynamics:
• an excitatory update,
• global inhibition of all cells, and
• normalisation of pose cell activity.
Due to the three-dimensional nature of the pose network structure, the algorithms in each stage are more complex than those used in the pilot study, but the principles are generally the same.

Excitatory Update
A three-dimensional discrete Gaussian distribution is used to create the excitatory weight matrix, $\varepsilon$, which each pose cell uses to project activity to all other cells in the pose cell matrix. The distribution, $\varepsilon_{abc}$, is the product of a two-dimensional Gaussian distribution corresponding to excitation in the (x', y') pose cell plane and a one-dimensional distribution corresponding to excitation in the θ' dimension, and is calculated by:

$$\varepsilon_{abc} = e^{-(a^2 + b^2)/k_{x'y'}} \; e^{-c^2/k_{\theta'}} \qquad (8.1)$$

where $k_{x'y'}$ and $k_{\theta'}$ are the variance constants for the distributions in the (x', y') plane and the θ' dimension, and $a$, $b$, and $c$ are the x', y', and θ' co-ordinates within the distribution. The change in pose cell activity caused by excitation, $\Delta P_{x'y'\theta'}$, is given by:

$$\Delta P_{x'y'\theta'} = \sum_{a=0}^{N_{x'}} \sum_{b=0}^{N_{y'}} \sum_{c=0}^{N_{\theta'}} \varepsilon_{(a-x')(b-y')(c-\theta')} \, P_{abc} \qquad (8.2)$$
where $N_{x'}$, $N_{y'}$, and $N_{\theta'}$ are the three dimensions of the pose cell matrix in (x', y', θ') space. The pose cells ‘wrap’ in all three directions; for instance, the ‘top’ pose cell layer in Fig. 8.3 excites both the layers directly below it and the layers at the ‘bottom’ of the diagram.

Global Inhibition
Because multiple pose hypotheses (represented by multiple activity packets) require time to compete and be reinforced by further visual input, inhibition is relatively gentle, meaning rival packets can co-exist for significant periods of time. The activity levels in the pose cells after inhibition, $P^{t+1}_{x'y'\theta'}$, are given by:

$$P^{t+1}_{x'y'\theta'} = \max\!\left[ P^{t}_{x'y'\theta'} + \varphi\left( P^{t}_{x'y'\theta'} - \max(P) \right),\; 0 \right] \qquad (8.3)$$
where the inhibition constant $\varphi$ controls the level of global inhibition. Activation levels are limited to non-negative values.

Normalisation
Normalisation maintains the sum of activation in the pose cells at one after visual and path integration input. The activity level in the pose cells after normalisation, $P^{t+1}_{x'y'\theta'}$, is given by:

$$P^{t+1}_{x'y'\theta'} = \frac{P^{t}_{x'y'\theta'}}{\sum_{a=0}^{N_{x'}} \sum_{b=0}^{N_{y'}} \sum_{c=0}^{N_{\theta'}} P^{t}_{abc}} \qquad (8.4)$$
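The three stages above can be condensed into a few lines of array code. The sketch below builds the separable excitatory kernel of Eq. 8.1 and applies the update, inhibition, and normalisation of Eqs. 8.2–8.4 with wraparound in all three dimensions; the kernel radius, variance constants, and inhibition constant are illustrative assumptions rather than the values used in the experiments.

```python
import numpy as np
from scipy.ndimage import convolve

def excitatory_kernel(r=3, k_xy=4.0, k_th=2.0):
    """Separable 3-D Gaussian weight matrix epsilon_abc of Eq. 8.1.
    r is an illustrative truncation radius; k_xy and k_th are the
    variance constants for the (x', y') plane and theta' dimension."""
    a, b, c = np.meshgrid(np.arange(-r, r + 1),
                          np.arange(-r, r + 1),
                          np.arange(-r, r + 1), indexing='ij')
    return np.exp(-(a**2 + b**2) / k_xy) * np.exp(-c**2 / k_th)

def attractor_update(P, kernel, phi=0.002):
    """One iteration of the pose cell dynamics: excitation (Eq. 8.2, with
    wraparound), gentle global inhibition (Eq. 8.3), then normalisation
    to unit total activity (Eq. 8.4)."""
    P = P + convolve(P, kernel, mode='wrap')          # excitatory update
    P = np.maximum(P + phi * (P - P.max()), 0.0)      # global inhibition
    return P / P.sum()                                # normalisation

# Example: a single seeded hypothesis persists under repeated updates.
P = np.zeros((40, 40, 36))
P[20, 20, 0] = 1.0
kernel = excitatory_kernel()
for _ in range(10):
    P = attractor_update(P, kernel)
print(P.max(), np.unravel_index(np.argmax(P), P.shape))
```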
8.1.5 Learning Visual Scenes
The local view module is a collection of neural units that represent the input to the robot’s external sensors in a form that is usable by the mapping and localisation system. Two simultaneous interactions occur between the local view cells and pose cells: mapping via associative learning, and localisation by the injection of activity into the pose cells. Fig. 8.5 shows the local view and pose cell structures and the links that form between them.
Fig. 8.5. The pose cell and local view cell structures. Associative links are learnt between an activated local view cell $V_i$ and an activated pose cell $P_{x'y'\theta'}$. Active local view cells inject energy into the pose cells through these links.
Visual scenes are associated with the robot’s believed position by a learning function which increases the connection strengths between co-activated local view and pose cells. The updated connection strength, $\beta^{t+1}_{i x'y'\theta'}$, is given by:

$$\beta^{t+1}_{i x'y'\theta'} = \max\left( \beta^{t}_{i x'y'\theta'},\; \lambda V_i P_{x'y'\theta'} \right) \qquad (8.5)$$

where $V_i$ is the activity level of the local view cell and $P_{x'y'\theta'}$ is the activity level of
the pose cell. These links form the pose-view map shown in Fig. 8.1.

8.1.6 Re-localising Using Familiar Visual Scenes
The pose-view map can be used to inject activity into the pose cells, which is the mechanism that RatSLAM uses to maintain or correct its believed pose. Active local view cells project their activity into the pose cells to which they are associated, by an amount proportional to the association strength. The change in pose cell activity, $\Delta P_{x'y'\theta'}$, is given by:

$$\Delta P_{x'y'\theta'} = \sum_{i} \beta_{i x'y'\theta'} V_i \qquad (8.6)$$
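One straightforward way to realise Eqs. 8.5 and 8.6 is a sparse map of link strengths keyed by (local view cell, pose cell) pairs, as sketched below. The data structure and the learning rate value are illustrative assumptions; the model itself does not prescribe them.

```python
import numpy as np
from collections import defaultdict

beta = defaultdict(float)   # sparse pose-view link strengths beta[(i, pose)]

def learn_links(view_activity, pose_cells, lam=0.1):
    """Eq. 8.5: strengthen links between co-activated local view cells V_i
    and pose cells, keeping the maximum strength seen so far."""
    active_pose = np.argwhere(pose_cells > 0)
    for i, v in enumerate(view_activity):
        if v <= 0:
            continue
        for xyz in map(tuple, active_pose):
            key = (i, xyz)
            beta[key] = max(beta[key], lam * v * pose_cells[xyz])

def inject_activity(view_activity, pose_cells):
    """Eq. 8.6: active local view cells push energy into the pose cells
    they are associated with, in proportion to the link strength."""
    delta = np.zeros_like(pose_cells)
    for (i, xyz), strength in beta.items():
        if view_activity[i] > 0:
            delta[xyz] += strength * view_activity[i]
    return pose_cells + delta

# Example: associate view cell 3 with a pose hypothesis, then re-localise.
P = np.zeros((40, 40, 36)); P[20, 20, 0] = 1.0
V = np.zeros(10); V[3] = 1.0
learn_links(V, P)
print(inject_activity(V, np.zeros_like(P))[20, 20, 0])   # > 0
```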
The learning method that RatSLAM uses to build the pose-view map cannot usefully associate raw camera data with pose cells. The data must be processed to reduce the dimensionality of the camera image while preserving distinctive information. The single layer learning mechanism between the local view and pose cells works best when the local view is a sparse feature vector, as this avoids problems with linearly inseparable inputs. To constrain the local view structure to a practical number of cells, the vision processing system should perform spatial generalisation; activity in the
local view cells should not change significantly for small changes in robot pose, as discussed in Section 8.2.1.

8.1.7 Intuitive Path Integration
The path integration process updates the pose cell activity profile by shifting activity based on wheel encoder velocities and the (x', y', θ') co-ordinates of each pose cell. The advantage of a unified pose representation is that each cell encodes its own preferred orientation. This allows multiple hypotheses of the robot’s pose to propagate in different directions, unlike in a more conventional head direction – place cell model. Many computational models of the rodent hippocampus achieve path integration by projecting existing cell activity to an anticipated future location, and allowing the competitive attractor network dynamics to shift the current activity towards this new activity. In a three-dimensional competitive attractor network it is difficult to achieve well-behaved path integration characteristics while still retaining suitable internal network dynamics. Consequently, a more pragmatic approach was taken with this extended hippocampal model. The new path integration method shifts existing pose cell activity rather than projecting a copy of the current activity forwards in time. This makes its performance independent of variable sensory update rates and robot velocity, produces more accurate robot paths, and removes the need for parameter tuning. The method has no real biological basis, unlike other methods in more biologically faithful models (Arleo, Smeraldi et al. 2001; Stringer, Rolls et al. 2002). After path integration the updated pose cell activity, $P^{t+1}_{x'y'\theta'}$, is given by:

$$P^{t+1}_{x'y'\theta'} = \sum_{a=\delta x'_o}^{\delta x'_o + 1} \sum_{b=\delta y'_o}^{\delta y'_o + 1} \sum_{c=\delta\theta'_o}^{\delta\theta'_o + 1} \alpha_{abc} \, P^{t}_{(x'+a)(y'+b)(\theta'+c)} \qquad (8.7)$$
where $\delta x'_o$, $\delta y'_o$, and $\delta\theta'_o$ are the rounded-down integer offsets in the x', y', and θ' directions. The amount of activity injected is based on the product of the activity of the sending pose cell unit, P, and a residue component, α. The residue component is spread over a 2 × 2 × 2 cube to account for the quantisation effects of the grid representation. It is based on the fractional components of the offsets, $\delta x'_f$, $\delta y'_f$, and $\delta\theta'_f$, which themselves are based on the translational velocity v, rotational velocity ω, and preferred cell orientation θ'. The integer and fractional offsets are given by Eqs. 8.8 and 8.9 respectively:

$$\begin{bmatrix} \delta x'_o \\ \delta y'_o \\ \delta\theta'_o \end{bmatrix} = \begin{bmatrix} \lfloor k_{x'} v \cos\theta' \rfloor \\ \lfloor k_{y'} v \sin\theta' \rfloor \\ \lfloor k_{\theta'} \omega \rfloor \end{bmatrix} \qquad (8.8)$$

$$\begin{bmatrix} \delta x'_f \\ \delta y'_f \\ \delta\theta'_f \end{bmatrix} = \begin{bmatrix} k_{x'} v \cos\theta' - \delta x'_o \\ k_{y'} v \sin\theta' - \delta y'_o \\ k_{\theta'} \omega - \delta\theta'_o \end{bmatrix} \qquad (8.9)$$
where $k_{x'}$, $k_{y'}$, and $k_{\theta'}$ are path integration constants. The residue matrix α is calculated in Eqs. 8.10 and 8.11:
$$\alpha_{abc} = g(\delta x'_f,\, a - \delta x'_o)\, g(\delta y'_f,\, b - \delta y'_o)\, g(\delta\theta'_f,\, c - \delta\theta'_o) \qquad (8.10)$$

$$g(u, v) = \begin{cases} 1 - u, & \text{if } v = 0; \\ u, & \text{if } v = 1 \end{cases} \qquad (8.11)$$
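Read together, Eqs. 8.7–8.11 say: shift each layer of constant θ' by an amount set by that layer’s preferred orientation, spreading any fractional remainder over the neighbouring cells. Because the residue of Eq. 8.10 is a product of per-axis terms, the spread can be applied one axis at a time; the sketch below does exactly that with a whole-cell roll plus linear residue weighting. The path integration constants are illustrative assumptions.

```python
import numpy as np

def shift_1d(layer, offset, axis):
    """Shift an array along one wrapping axis by a possibly fractional
    offset, spreading the fractional residue over the two neighbouring
    cells exactly as g(u, v) in Eq. 8.11 does."""
    whole = int(np.floor(offset))          # integer offset (Eq. 8.8)
    frac = offset - whole                  # fractional offset (Eq. 8.9)
    shifted = np.roll(layer, whole, axis=axis)
    return (1.0 - frac) * shifted + frac * np.roll(shifted, 1, axis=axis)

def path_integrate(P, v, omega, k_x=10.0, k_y=10.0, k_th=5.0):
    """Eq. 8.7: shift the existing activity in each theta' layer according
    to that layer's preferred orientation, then shift in theta' by the
    rotational velocity. Constants k_x, k_y, k_th are illustrative."""
    n_th = P.shape[2]
    out = np.empty_like(P)
    for c in range(n_th):
        theta = 2.0 * np.pi * c / n_th     # preferred orientation of layer c
        layer = shift_1d(P[:, :, c], k_x * v * np.cos(theta), axis=0)
        out[:, :, c] = shift_1d(layer, k_y * v * np.sin(theta), axis=1)
    return shift_1d(out, k_th * omega, axis=2)

# Example: activity moves through the network as the robot drives forward.
P = np.zeros((40, 40, 36)); P[20, 20, 0] = 1.0
for _ in range(5):
    P = path_integrate(P, v=0.2, omega=0.0)
print(np.unravel_index(np.argmax(P), P.shape))
```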
8.2 Generation of Local View
Three visual processing methods were used in the course of this work: a cylinder recognition system; a sum of absolute differences matcher; and a histogram matcher. The cylinder recognition system was used in the pilot study and has already been described in Section 7.1. This section gives an overview of the other two local view generation processes, which were developed by David Prasser, another researcher on the project (Prasser, Wyeth et al. 2004; Prasser, Milford et al. 2005).

8.2.1 Sum of Absolute Differences Module
In indoor environments, images from the Pioneer robot’s camera are converted to 8-bit greyscale images and down sampled to a resolution of 12 × 8 pixels, as shown in Fig. 8.6. Down sampling serves two main purposes: the coarseness of the image renders it slightly invariant to the camera location and orientation, producing limited spatial generalisation, and the small size of the image reduces the computational resources required to process it. The 96-pixel images are processed by a Sum of Absolute Differences (SAD) module that produces the local view (Prasser, Wyeth et al. 2004). The SAD module compares each image with its repository of stored template images. New images that are
Fig. 8.6. High resolution and down sampled greyscale camera images. The low resolution 12 × 8 pixel images are used by the RatSLAM system.
sufficiently similar to template images are re-recognised as such, while significantly different images are learned as new image templates and added to the repository. Using this vision module, the local view consists of a one-dimensional array of cells, with each cell corresponding to a particular image template. The activity level of a cell, $a_i$, varies between 0 and 1, and is given by:

$$a_i = \begin{cases} 0, & d_i > d_{\max} \\ 1/(d_i + \varepsilon), & d_i \le d_{\max} \end{cases}$$
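A minimal sketch of the SAD template-matching scheme just described: downsample a greyscale frame to 12 × 8 pixels, compare it against the stored templates, and either re-recognise the best match or learn the frame as a new template. The match threshold and the block-averaging downsampler are illustrative assumptions, not the values used in the experiments.

```python
import numpy as np

templates = []          # repository of learned 12 x 8 image templates
D_MAX = 0.05            # illustrative per-pixel SAD threshold for a match

def downsample(grey_image, shape=(8, 12)):
    """Downsample a greyscale image to 12 x 8 (8 rows, 12 columns) by block
    averaging. (A real system would handle sizes that do not divide evenly.)"""
    h, w = grey_image.shape
    bh, bw = h // shape[0], w // shape[1]
    return grey_image[:bh * shape[0], :bw * shape[1]] \
        .reshape(shape[0], bh, shape[1], bw).mean(axis=(1, 3))

def match_or_learn(image):
    """Return the index of the best-matching template, learning a new
    template when every stored one differs by more than D_MAX per pixel."""
    img = downsample(image) / 255.0
    if templates:
        diffs = [np.abs(img - t).mean() for t in templates]
        best = int(np.argmin(diffs))
        if diffs[best] <= D_MAX:
            return best
    templates.append(img)
    return len(templates) - 1

# Example: the first frame becomes template 0; a near-identical frame re-matches it.
frame = np.random.randint(0, 256, (240, 320)).astype(float)
print(match_or_learn(frame), match_or_learn(frame + 1.0))   # 0 0
```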