English Pages [3145] Year 2015
Willi Freeden M. Zuhair Nashed Thomas Sonar Editors
Handbook of Geomathematics Second Edition
With 785 Figures and 102 Tables
Editors
Willi Freeden Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
M. Zuhair Nashed Department of Mathematics, University of Central Florida, Orlando, FL, USA
Thomas Sonar Computational Mathematics, Technische Universität Braunschweig, Braunschweig, Germany
ISBN 978-3-642-54550-4
ISBN 978-3-642-54551-1 (eBook)
ISBN 978-3-642-54552-8 (print and electronic bundle)
DOI 10.1007/978-3-642-54551-1
Library of Congress Control Number: 2015943671
Mathematics Subject Classification: 01A55, 35R25, 65N20, 86A05, 86A20, 86A22, 86A25, 86A30
Springer Heidelberg New York Dordrecht London
© Springer-Verlag Berlin Heidelberg 2010, 2015
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper
Springer-Verlag GmbH Berlin Heidelberg is part of Springer Science+Business Media (www.springer.com)
Foreword
With the first edition of the Handbook of Geomathematics, the foundation was laid to give meaning to the notion of “Geomathematics” and to define what this new interdisciplinary field comprises. In the roughly 5 years since then, a vast number of publications that can be classified as treating topics in Geomathematics have appeared in various journals and books, and this number is still increasing. Thus, one can infer that Geomathematics as an interdisciplinary research area has grown, in depth as well as in breadth, and has (re)defined and sharpened its boundaries. There could be no better reason to collect and publish the current state of the art of geomathematical work; hence this extended second edition. Comparing the two editions, one can immediately see that the progress made in these 5 years is enormous. To make this progress and the new results more accessible, the editors of the Handbook of Geomathematics have decided to use Springer’s dynamic reference platform. In this way, the Handbook and all its chapters have come alive. Authors can now update their (still peer-reviewed) contributions at any time to include new results. The editors can invite new authors and acquire new chapters whenever they feel a new topic or publication has emerged, without having to wait years for a new edition to make sense. Every version of every chapter is associated with a date and time to track changes and the currency of the work, so that it can be cited accordingly in other scientific works. For active researchers and users, the advantages of such a reference platform comprise all that modern technology has to offer: search functions, PDF download and storage, online/offline availability, HTML formatting for screen adaptation, and many more.
Finally, we would like to thank all our collaborators and contributors who made this project possible, and we appreciate all the support from the community to continue this exciting and fruitful endeavor in promoting Geomathematics.
Kaiserslautern, Germany, May 2015
Mario Aigner
Willi Freeden
Preface
The aim of the Handbook of Geomathematics, published in 2010, was twofold. The first aim was to offer appropriate means of assimilating, assessing, and reducing to comprehensible form the rapidly increasing flow of data from geochemical, geodetic, geological, geophysical, and satellite sources. The second aim was to provide an objective basis for scientific interpretation, classification, testing of concepts, modeling, simulation, and solution of problems in the geosciences. The handbook was the first authoritative mathematical forum for geoscientifically relevant research. The editors and the publisher are delighted that the handbook was very successful and well received. Because information changes quickly these days and researchers demand the most current information possible, the stage is set for the second edition as a central reference work for academics, policymakers, and practitioners internationally. Thus, the editors and the publisher Springer have decided it is time to update and substantially expand the first edition. The handbook fills the gap of a basic reference work on mathematics concerned with methods and problems of geoscientific significance, i.e., geomathematics. It consolidates the current knowledge by providing succinct summaries of concepts and theories, definitions of terms, biographical entries, organizational profiles, a guide to sources of information, and an overview of the landscapes and contours of today’s geomathematics. Contributions are written in an easy-to-understand and informative style for a general readership, typically from areas outside the particular research field. The second edition of the Handbook of Geomathematics comprises the following scientific fields:
1. General Issues, Historical Background, and Future Perspectives
2. Observational and Measurement Key Technologies
3. Modeling of the System Earth (Geosphere, Cryosphere, Hydrosphere, Atmosphere, Biosphere, Anthroposphere)
4. Analytic, Algebraic, and Operator Theoretical Methods
5. Statistical and Stochastic Methods
6. Spherical Function Systems and Methods
7. Computational and Numerical Methods
8. Cartographic, Photogrammetric, Information Systems and Methods
We, as editors, wish to express our particular gratitude to the people who not only made this work possible but also made it an extremely satisfying project:
• The contributors to the handbook, who dedicated much time, effort, and creative energy to the project. The contributions to the handbook open new opportunities to contribute significantly to the understanding of our planet, its climate, its environment, and the expected shortage of natural resources. Even more, the editors are convinced that they offer the key parameters for the study of the Earth’s dynamics such that interactions of its solid part with ice, oceans, atmosphere, etc., become more and more accessible. During the process of reviewing and editing, authors as well as editors were exposed to exciting new information that, in some cases, strongly increased activity, creativity, and novelty. In the end, as far as we are aware, this second edition of the handbook presents the appropriate ground for dealing generically with fundamental problems and “key technologies” as well as for exploring the wide geomathematical range of consequences and interactions for humankind.
• The folks at Springer, particularly Clemens Heine, who initiated the first edition, and Mario Aigner, who was the production interface of the second edition with authors and editors. Indeed, the editors have had a number of editorial experiences over a long time, but working with Mario was extraordinarily enjoyable. Thanks also go to Annalea Manalili and Michael Hermann as the responsible persons for this work within Springer’s innovative, twenty-first-century digital-first reference publishing platform.
Thank you very much for all your exceptional efforts and support in creating and continuing a work summarizing exciting discoveries and impressive research achievements. We hope that the second edition of Handbook of Geomathematics will stimulate and inspire new research efforts and the intensive exploration of very promising directions.
Willi Freeden, Kaiserslautern, Germany
M. Zuhair Nashed, Orlando, USA
Thomas Sonar, Braunschweig, Germany
Contents
Volume 1

Part I General Issues, Historical Background, and Future Perspectives
Geomathematics: Its Role, Its Aim, and Its Potential
  Willi Freeden
Navigation on Sea: Topics in the History of Geomathematics
  Thomas Sonar
Gauss’ and Weber’s “Atlas of Geomagnetism” (1840) Was Not the First: The History of the Geomagnetic Atlases
  Karin Reich and Elena Roussanova

Part II Observational and Measurement Key Technologies
Earth Observation Satellite Missions and Data Access
  Henri Laur and Volker Liebig
Satellite-to-Satellite Tracking (Low-Low/High-Low SST)
  Wolfgang Keller
GOCE: Gravitational Gradiometry in a Satellite
  Reiner Rummel
Sources of the Geomagnetic Field and the Modern Data That Enable Their Investigation
  Nils Olsen, Gauthier Hulot, and Terence J. Sabaka

Part III Modeling of the System Earth (Geosphere, Cryosphere, Hydrosphere, Atmosphere, Biosphere, Anthroposphere)
Classical Physical Geodesy
  Helmut Moritz
Geodetic Boundary Value Problem
  F. Sansò
Time-Variable Gravity Field and Global Deformation of the Earth
  Jürgen Kusche
Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution
  Willi Freeden and Michael Schreiner
Spacetime Modeling of the Earth’s Gravity Field by Ellipsoidal Harmonics
  Erik W. Grafarend, Matthias Klapp, and Zdeněk Martinec
Multiresolution Analysis of Hydrology and Satellite Gravitational Data
  Helga Nutz and Kerstin Wolf
Time-Varying Mean Sea Level
  Luciana Fenoglio-Marc and Erwin Groten
Self-Attraction and Loading of Oceanic Masses
  Julian Kuhlmann, Maik Thomas, and Harald Schuh
Unstructured Meshes in Large-Scale Ocean Modeling
  Sergey Danilov and Jens Schröter
Numerical Methods in Support of Advanced Tsunami Early Warning
  Jörn Behrens
Gravitational Viscoelastodynamics
  Detlef Wolf
Elastic and Viscoelastic Response of the Lithosphere to Surface Loading
  Volker Klemann, Maik Thomas, and Harald Schuh
Multiscale Model Reduction with Generalized Multiscale Finite Element Methods in Geomathematics
  Yalchin Efendiev and Michael Presho
Efficient Modeling of Flow and Transport in Porous Media Using Multi-physics and Multi-scale Approaches
  Rainer Helmig, Bernd Flemisch, Markus Wolff, and Benjamin Faigle
Convection Structures of Binary Fluid Mixtures in Porous Media
  Matthias Augustin, Rudolf Umla, and Manfred Lücke
Numerical Dynamo Simulations: From Basic Concepts to Realistic Models
  Johannes Wicht, Stephan Stellmach, and Helmut Harder
Mathematical Properties Relevant to Geomagnetic Field Modeling
  Terence J. Sabaka, Gauthier Hulot, and Nils Olsen
Multiscale Modeling of the Geomagnetic Field and Ionospheric Currents
  Christian Gerhards
Toroidal-Poloidal Decompositions of Electromagnetic Green’s Functions in Geomagnetic Induction
  Jin Sun
Using B-Spline Expansions for Ionosphere Modeling
  Michael Schmidt, Denise Dettmering, and Florian Seitz
The Forward and Adjoint Methods of Global Electromagnetic Induction for CHAMP Magnetic Data
  Zdeněk Martinec
Modern Techniques for Numerical Weather Prediction: A Picture Drawn from Kyrill
  Nils Dorband, Martin Fengler, Andreas Gumann, and Stefan Laps

Volume 2

Radio Occultation via Satellites
  Christian Blick and Sarah Eberle
Asymptotic Models for Atmospheric Flows
  Rupert Klein
Stokes Problem, Layer Potentials and Regularizations, and Multiscale Applications
  Carsten Mayer and Willi Freeden
On High Reynolds Number Aerodynamics: Separated Flows
  Mario Aigner
Turbulence Theory
  Steffen Schön and Gaël Kermarrec
Forest Fire Spreading
  Sarah Eberle, Willi Freeden, and Ulrich Matthes
Phosphorus Cycles in Lakes and Rivers: Modeling, Analysis, and Simulation
  Andreas Meister and Joachim Benz
Model-Based Visualization of Instationary Geo-Data with Application to Volcano Ash Data
  Martin Baumann, Jochen Förstner, Vincent Heuveline, Jonas Kratzke, Sebastian Ritterbusch, Bernhard Vogel, and Heike Vogel
Modeling of Fluid Transport in Geothermal Research
  Jörg Renner and Holger Steeb
Fractional Diffusion and Wave Propagation
  Yuri Luchko
Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives
  Matthias Augustin, Mathias Bauer, Christian Blick, Sarah Eberle, Willi Freeden, Christian Gerhards, Maxim Ilyasov, René Kahnt, Matthias Klug, Sandra Möhringer, Thomas Neu, Helga Nutz, Isabel Michel née Ostermann, and Alessandro Punzi

Part IV Analytic, Algebraic, and Operator Theoretical Methods
Noise Models for Ill-Posed Problems
  Paul N. Eggermont, Vincent LaRiccia, and M. Zuhair Nashed
Sparsity in Inverse Geophysical Problems
  Markus Grasmair, Markus Haltmeier, and Otmar Scherzer
Multiparameter Regularization in Downward Continuation of Satellite Data
  Shuai Lu and Sergei V. Pereverzev
Evaluation of Parameter Choice Methods for Regularization of Ill-Posed Problems in Geomathematics
  Frank Bauer, Martin Gutting, and Mark A. Lukas
Quantitative Remote Sensing Inversion in Earth Science: Theory and Numerical Treatment
  Yanfei Wang
Correlation Modeling of the Gravity Field in Classical Geodesy
  Christopher Jekeli
Inverse Resistivity Problems in Computational Geoscience
  Alemdar Hasanov (Hasanoğlu) and Balgaisha Mukanova
Identification of Current Sources in 3D Electrostatics
  Aron Sommer, Andreas Helfrich-Schkarbanenko, and Vincent Heuveline
Transmission Tomography in Seismology
  Guust Nolet
Numerical Algorithms for Non-smooth Optimization Applicable to Seismic Recovery
  Ignace Loris
Strategies in Adjoint Tomography
  Yang Luo, Ryan Modrak, and Jeroen Tromp
Potential-Field Estimation Using Scalar and Vector Slepian Functions at Satellite Altitude
  Alain Plattner and Frederik J. Simons
Multidimensional Seismic Compression by Hybrid Transform with Multiscale-Based Coding
  Amir Z. Averbuch, Valery A. Zheludev, and Dan D. Kosloff
Tomography: Problems and Multiscale Solutions
  Volker Michel
RFMP: An Iterative Best Basis Algorithm for Inverse Problems in the Geosciences
  Volker Michel
Material Behavior: Texture and Anisotropy
  Ralf Hielscher, David Mainprice, and Helmut Schaeben
Rayleigh Wave Dispersive Properties of a Vector Displacement as a Tool for P- and S-Wave Velocities Near Surface Profiling
  Andrey Konkov, Andrey Lebedev, and Sergey Manakov
Simulation of Land Management Effects on Soil N2O Emissions Using a Coupled Hydrology-Biogeochemistry Model on the Landscape Scale
  Martin Wlotzka, Vincent Heuveline, Steffen Klatt, Edwin Haas, David Kraus, Klaus Butterbach-Bahl, Philipp Kraft, and Lutz Breuer

Volume 3

Part V Statistical and Stochastic Methods
An Introduction to Prediction Methods in Geostatistics
  Ralf Korn and Alexandra Kochendörfer
Statistical Analysis of Climate Series
  Helmut Pruscha
Oblique Stochastic Boundary-Value Problem
  Martin Grothaus and Thomas Raskop
Geodetic Deformation Analysis with Respect to an Extended Uncertainty Budget
  Hansjörg Kutterer
It’s All About Statistics: Global Gravity Field Modeling from GOCE and Complementary Data
  Roland Pail
Mixed Integer Estimation and Validation for Next Generation GNSS
  Peter J.G. Teunissen
Mixed Integer Linear Models
  Peiliang Xu

Part VI Special Function Systems and Methods
Special Functions in Mathematical Geosciences: An Attempt at a Categorization
  Willi Freeden and Michael Schreiner
Clifford Analysis and Harmonic Polynomials
  Klaus Gürlebeck and Wolfgang Sprößig
Splines and Wavelets on Geophysically Relevant Manifolds
  Isaac Pesenson
Scalar and Vector Slepian Functions, Spherical Signal Estimation and Spectral Analysis
  Frederik J. Simons and Alain Plattner
Dimension Reduction and Remote Sensing Using Modern Harmonic Analysis
  John J. Benedetto and Wojciech Czaja

Part VII Computational and Numerical Methods
Radial Basis Function-Generated Finite Differences: A Mesh-Free Method for Computational Geosciences
  Natasha Flyer, Grady B. Wright, and Bengt Fornberg
Numerical Integration on the Sphere
  Kerstin Hesse, Ian H. Sloan, and Robert S. Womersley
Fast Spherical/Harmonic Spline Modeling
  Martin Gutting
Multiscale Approximation
  Stephan Dahlke
Sparse Solutions of Underdetermined Linear Systems
  Inna Kozlov and Alexander Petukhov
Nonlinear Methods for Dimensionality Reduction
  Charles K. Chui and Jianzhong Wang

Part VIII Cartographic, Photogrammetric, Information Systems and Methods
Cartography
  Liqiu Meng
Theory of Map Projection: From Riemann Manifolds to Riemann Manifolds
  Erik W. Grafarend
Modeling Uncertainty of Complex Earth Systems in Metric Space
  Jef Caers, Kwangwon Park, and Céline Scheidt
Geometrical Reference Systems
  Manuela Seitz, Detlef Angermann, Michael Gerstl, Mathis Bloßfeld, Laura Sánchez, and Florian Seitz
Analysis of Data from Multi-satellite Geospace Missions
  Joachim Vogt
Geodetic World Height System Unification
  Michael Sideris
Mathematical Foundations of Photogrammetry
  Konrad Schindler
Potential Methods and Geoinformation Systems
  Hans-Jürgen Götze
Geoinformatics
  Monika Sester

Index
About the Editors
Willi Freeden
Geomathematics Group, University of Kaiserslautern, P.O. Box 3049, 67663 Kaiserslautern, Germany
Studies in Mathematics, Geography, and Philosophy at the RWTH Aachen, 1971 “Diplom” in Mathematics, 1972 “Staatsexamen” in Mathematics and Geography, 1975 Ph.D. in Mathematics with distinction, 1979 “Habilitation” in Mathematics at the RWTH Aachen, 1981/1982 Visiting Research Professor at the Ohio State University, Columbus (Department of Geodetic Science and Surveying), 1984 Professor of Mathematics at the RWTH Aachen (Institute of Pure and Applied Mathematics), 1989 Professor of Technomathematics (Industrial Mathematics), 1994 Head of the Geomathematics Group, and 2002–2006 Vice-President for Research and Technology at the University of Kaiserslautern, published over 145 papers, several book chapters, and 10 books. Editor-in-Chief of the Springer International Journal on Geomathematics (GEM) and Editor-in-Chief of the Birkhäuser book series Geosystems Mathematics and the Birkhäuser lecture notes series Geosystems Mathematics and Computing.

M. Zuhair Nashed
Department of Mathematics, University of Central Florida, Orlando, FL 32816, USA
S.B. and S.M. degrees in electrical engineering from MIT and Ph.D. in Mathematics from the University of Michigan. Served for many years as Professor at Georgia Tech and the University of Delaware. Held visiting professor positions at the University of Michigan, University of Wisconsin, AUB, and KFUPM and distinguished visiting scholar positions at various universities worldwide. Recipient of the Lester Ford Award of the Mathematical Association of America, the Sigma Xi Faculty Research Award and sustained Research Award in Science, Dr. Zakir Husain Award of the Indian Society of Industrial and Applied Mathematics, Fellow of the American Mathematical Society (Inaugural Class of 2013), and several international awards. Published over 140 papers and several expository papers and book chapters. Editor of two journals and member of editorial board of 30 journals, including four Springer journals. Editor-in-Chief of the Birkhäuser book series Geosystems Mathematics. Editor or coeditor of 11 books. Plenary lectures at meetings of 10 mathematical and engineering societies, and over 400 invited hour talks at conferences and colloquia. Organized over 30 international conferences and mini-symposia.

Thomas Sonar
Computational Mathematics, Technische Universität Braunschweig, 38106 Braunschweig, Germany
Study of Mechanical Engineering at the Fachhochschule Hannover, “Diplom” in 1980. Work as a Laboratory Engineer at the Fachhochschule Hannover from 1980 to 1981, founder of an engineering company, and consulting work from 1981 to 1984. Studies in Mathematics and Computer Science at the University of Hannover, “Diplom” with distinction in 1987. Research Scientist at the DLR in Braunschweig from 1987 to 1989 and then Ph.D. studies in Stuttgart and Oxford, 1991 Ph.D. in Mathematics with distinction, from 1991 to 1996 “Hausmathematiker” at the Institute for Theoretical Fluid Dynamics of DLR at Göttingen. 1995 “Habilitation” in Mathematics at the TH Darmstadt and 1996 Professor of Applied Mathematics at the University of Hamburg. Since 1999 Professor of Technomathematics at the TU Braunschweig. Head of the Group “Partial Differential Equations.” Member of the “Braunschweigische Wissenschaftliche Gesellschaft” and Corresponding Member of the Hamburg Academy of Sciences. Published more than 100 papers, several book chapters, and 10 books.
Contributors
Mario Aigner Institute of Fluid Mechanics and Heat Transfer, Vienna University of Technology, Vienna, Austria
Detlef Angermann Deutsches Geodätisches Forschungsinstitut, Munich, Germany
Matthias Augustin Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
Amir Z. Averbuch School of Computer Science, Tel Aviv University, Tel Aviv, Israel
Frank Bauer DZ Bank AG, Kapitalmärkte Handel, Quantitative Modelle, F/KHSQ, Frankfurt, Germany
Mathias Bauer CBM GmbH, Bexbach, Germany
Martin Baumann Engineering Mathematics and Computing Lab (EMCL), Karlsruhe Institute of Technology, Karlsruhe, Germany
Jörn Behrens Numerical Methods in Geosciences, University of Hamburg, CliSAP, Hamburg, Germany
John J. Benedetto Norbert Wiener Center, Department of Mathematics, University of Maryland, College Park, MD, USA
Joachim Benz Faculty of Organic Agricultural Sciences, Work-Group Data-Processing and Computer Facilities, University of Kassel, Witzenhausen, Germany
Christian Blick Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
Mathis Bloßfeld Deutsches Geodätisches Forschungsinstitut, Munich, Germany
Lutz Breuer Institute of Landscape Ecology and Resources Management, Justus-Liebig-University of Giessen, Giessen, Germany
Klaus Butterbach-Bahl Karlsruhe Institute of Technology (KIT), Institute of Meteorology and Climate Research, Garmisch-Partenkirchen, Germany
Jef Caers Department of Energy Resources Engineering, Stanford University, Stanford, CA, USA
Charles K. Chui Department of Statistics, Stanford University, Stanford, CA, USA
Wojciech Czaja Norbert Wiener Center, Department of Mathematics, University of Maryland, College Park, MD, USA
Stephan Dahlke FB 12 Mathematics and Computer Sciences, Philipps-University of Marburg, Marburg, Germany
Sergey Danilov Alfred Wegener Institute for Polar and Marine Research, Bremerhaven, Germany
Denise Dettmering Deutsches Geodätisches Forschungsinstitut, Munich, Germany
Nils Dorband Meteomatics GmbH, St. Gallen, Switzerland
Sarah Eberle Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
Yalchin Efendiev Department of Mathematics, Institute for Scientific Computation (ISC), Texas A&M University, College Station, TX, USA
  Numerical Porous Media SRI Center, King Abdullah University of Science and Technology (KAUST), Makkah Province, Kingdom of Saudi Arabia
Paul N. Eggermont Food and Resource Economics, University of Delaware, Newark, DE, USA
Benjamin Faigle Department of Hydromechanics and Modeling of Hydrosystems, Institute of Hydraulic Engineering, University of Stuttgart, Stuttgart, Germany
Martin Fengler Meteomatics GmbH, St. Gallen, Switzerland
Luciana Fenoglio-Marc Institute of Geodesy, Technical University Darmstadt, Darmstadt, Germany
Bernd Flemisch Department of Hydromechanics and Modeling of Hydrosystems, Institute of Hydraulic Engineering, University of Stuttgart, Stuttgart, Germany
Natasha Flyer Institute for Mathematics Applied to Geosciences, National Center for Atmospheric Research, Boulder, CO, USA
Bengt Fornberg Department of Applied Mathematics, University of Colorado, Boulder, CO, USA
Jochen Förstner German Weather Service (DWD), Offenbach, Germany
Willi Freeden Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
Christian Gerhards Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
Michael Gerstl Deutsches Geodätisches Forschungsinstitut, Munich, Germany
Hans-Jürgen Götze Institut für Geowissenschaften, Geophysik, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
Erik W. Grafarend Department of Geodesy and Geoinformatics, Stuttgart University, Stuttgart, Germany
Markus Grasmair Department of Mathematics, Norwegian University of Science and Technology, Trondheim, Norway
Erwin Groten Institute of Physical Geodesy, Technical University Darmstadt, Darmstadt, Germany
Martin Grothaus Functional Analysis Group, University of Kaiserslautern, Kaiserslautern, Germany
Andreas Gumann Zurich, Switzerland
Klaus Gürlebeck Bauhaus-Universität Weimar, Weimar, Germany
Martin Gutting Geomathematics Group, University of Siegen, Siegen, Germany
Edwin Haas Karlsruhe Institute of Technology (KIT), Institute of Meteorology and Climate Research, Garmisch-Partenkirchen, Germany
Markus Haltmeier Institute of Mathematics, University of Innsbruck, Innsbruck, Austria
Helmut Harder Institut für Geophysik, Westfälische Wilhelms-Universität, Münster, Germany
Alemdar Hasanov (Hasanoğlu) Mathematics and Computer Science, Izmir University, Izmir, Turkey
Andreas Helfrich-Schkarbanenko Institute for Applied and Numerical Mathematics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Rainer Helmig Department of Hydromechanics and Modeling of Hydrosystems, Institute of Hydraulic Engineering, University of Stuttgart, Stuttgart, Germany
Kerstin Hesse Department of Mathematics, University of Paderborn, Paderborn, Germany
Vincent Heuveline Engineering Mathematics and Computing Lab (EMCL), Karlsruhe Institute of Technology, Karlsruhe, Germany; Institute for Applied and Numerical Mathematics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany; University of Heidelberg, Interdisciplinary Center for Scientific Computing, Engineering Mathematics and Computing Lab, Heidelberg, Germany
Ralf Hielscher Applied Functional Analysis, Technical University Chemnitz, Chemnitz, Germany
Gauthier Hulot Equipe de Géomagnétisme, Institut de Physique du Globe de Paris, Sorbonne Paris Cité, Université Paris Diderot, Paris, France
Maxim Ilyasov Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
Christopher Jekeli Division of Geodetic Science, School of Earth Sciences, Ohio State University, Columbus, OH, USA
René Kahnt G.E.O.S. Ingenieurgesellschaft mbH, Freiberg, Germany
Wolfgang Keller Geodätisches Institut, Universität Stuttgart, Stuttgart, Germany
Gaël Kermarrec Institut für Erdmessung, Leibniz Universität Hannover, Hannover, Germany
Matthias Klapp Asperg, Germany
Steffen Klatt Karlsruhe Institute of Technology (KIT), Institute of Meteorology and Climate Research, Garmisch-Partenkirchen, Germany
Rupert Klein FB Mathematik und Informatik, Institut für Mathematik, Freie Universität Berlin, Berlin, Germany
Volker Klemann Geodesy and Remote Sensing, Helmholtz Centre Potsdam, GFZ German Research Centre for Geosciences, Potsdam, Germany
Matthias Klug Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
Alexandra Kochendörfer Department of Financial Mathematics, Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany
Andrey Konkov The Institute of Applied Physics of the Russian Academy of Sciences, Nizhny Novgorod, Russia
Ralf Korn Department of Mathematics, University of Kaiserslautern, Kaiserslautern, Germany; Department of Financial Mathematics, Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany
Dan D. Kosloff Department of Earth and Planetary Sciences, Tel Aviv University, Tel Aviv, Israel
Inna Kozlov Department of Computer Science, Holon Institute of Technology, Holon, Israel
Philipp Kraft Institute of Landscape Ecology and Resources Management, Justus-Liebig-University of Giessen, Giessen, Germany
Jonas Kratzke Engineering Mathematics and Computing Lab (EMCL), Karlsruhe Institute of Technology, Karlsruhe, Germany
David Kraus Karlsruhe Institute of Technology (KIT), Institute of Meteorology and Climate Research, Garmisch-Partenkirchen, Germany
Julian Kuhlmann Earth System Modelling, Helmholtz Centre Potsdam, GFZ German Research Centre for Geosciences, Potsdam, Germany; Institute of Meteorology, Freie Universität Berlin, Berlin, Germany
Jürgen Kusche Astronomical, Physical and Mathematical Geodesy Group, Bonn University, Bonn, Germany
Hansjörg Kutterer Bundesamt für Kartographie und Geodäsie, Frankfurt am Main, Germany
Vincent LaRiccia Food and Resource Economics, University of Delaware, Newark, DE, USA
Stefan Laps Bochum, Germany
Henri Laur European Space Agency (ESA), Head of Earth Observation Mission Management Division, ESRIN, Frascati, Italy
Andrey Lebedev The Institute of Applied Physics of the Russian Academy of Sciences, Nizhny Novgorod, Russia
Volker Liebig European Space Agency (ESA), Director of Earth Observation Programmes, ESRIN, Frascati, Italy
Ignace Loris Université libre de Bruxelles, Bruxelles, Belgium
Shuai Lu Institute for Computational and Applied Mathematics, Austrian Academy of Sciences, Linz, Austria
Yuri Luchko Department of Mathematics, Physics, and Chemistry, Beuth Technical University of Applied Sciences Berlin, Berlin, Germany
Manfred Lücke Institut für Theoretische Physik, Universität des Saarlandes, Saarbrücken, Germany
Mark A. Lukas Mathematics and Statistics, School of Engineering and Information Technology, Murdoch University, Murdoch, WA, Australia
Yang Luo Department of Geosciences, Princeton University, Princeton, NJ, USA
Sandra Möhringer Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
David Mainprice Géosciences UMR CNRS 5243, Université Montpellier 2, Montpellier, France
Sergey Manakov The Institute of Applied Physics of the Russian Academy of Sciences, Nizhny Novgorod, Russia
Zdeněk Martinec Department of Geophysics, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic; Dublin Institute for Advanced Studies, Dublin, Ireland
Ulrich Matthes Rhineland-Palatinate Center of Excellence for Climate Change Impacts, Trippstadt, Germany
Carsten Mayer Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
Andreas Meister Department of Mathematics, Work-Group of Analysis and Applied Mathematics, University of Kassel, Kassel, Germany
Liqiu Meng Institute for Photogrammetry and Cartography, TU Munich, Munich, Germany
Isabel Michel née Ostermann Fraunhofer ITWM, Kaiserslautern, Germany
Volker Michel Geomathematics Group, University of Siegen, Siegen, Germany
Ryan Modrak Department of Geosciences, Princeton University, Princeton, NJ, USA
Helmut Moritz Institute of Navigation, Graz University of Technology, Graz, Austria
Balgaisha Mukanova Eurasian National University, Astana, Kazakhstan
M. Zuhair Nashed Department of Mathematics, University of Central Florida, Orlando, FL, USA
Thomas Neu Tiefe Geothermie Saar GmbH, Saarbrücken, Germany
Guust Nolet Geosciences Azur, Université de Nice, Sophia Antipolis, France
Helga Nutz Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
Nils Olsen DTU Space, Technical University of Denmark, Kgs. Lyngby, Denmark
Roland Pail Institute of Astronomical and Physical Geodesy, TU Munich, Munich, Germany
Kwangwon Park Department of Energy Resources Engineering, Stanford University, Stanford, CA, USA
Sergei V. Pereverzev Institute for Computational and Applied Mathematics, Austrian Academy of Sciences, Linz, Austria
Isaac Pesenson Department of Mathematics, Temple University, Philadelphia, PA, USA
Alexander Petukhov Department of Mathematics, University of Georgia, Athens, GA, USA
Alain Plattner Department of Geosciences, Princeton University, Princeton, NJ, USA; Department of Earth and Environmental Science, California State University, Fresno, CA, USA
Michael Presho The Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin, Austin, TX, USA
Helmut Pruscha Mathematical Institute, University of Munich, Munich, Germany
Alessandro Punzi Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
Thomas Raskop Functional Analysis Group, University of Kaiserslautern, Kaiserslautern, Germany
Karin Reich Department of Mathematics, University of Hamburg, Germany
Jörg Renner Experimentelle Geophysik - Institut für Geologie, Mineralogie und Geophysik, Ruhr-Universität Bochum, Bochum, Germany
Sebastian Ritterbusch Engineering Mathematics and Computing Lab (EMCL), Karlsruhe Institute of Technology, Karlsruhe, Germany
Elena Roussanova Saxonian Academy of Sciences and Humanities in Leipzig, Leipzig, Germany
Reiner Rummel Institut für Astronomische und Physikalische Geodäsie, TU Munich, Munich, Germany
Laura Sánchez Deutsches Geodätisches Forschungsinstitut, Munich, Germany
Terence J. Sabaka Planetary Geodynamics Laboratory, Code 698, NASA Goddard Space Flight Center, Greenbelt, MD, USA
F. Sansò Politecnico di Milano – Polo Regionale di Como, Como, Italy
Otmar Scherzer Computational Science Center, University of Vienna, Vienna, Austria
Steffen Schön Institut für Erdmessung, Leibniz Universität Hannover, Hannover, Germany
Helmut Schaeben Geophysics and Geoinformatics, TU Bergakademie Freiberg, Freiberg, Germany
Céline Scheidt Department of Energy Resources Engineering, Stanford University, Stanford, CA, USA
Konrad Schindler Photogrammetry and Remote Sensing, ETH Zürich, Zürich, Switzerland
Michael Schmidt Deutsches Geodätisches Forschungsinstitut, Munich, Germany
Jens Schröter Alfred Wegener Institute for Polar and Marine Research, Bremerhaven, Germany
Michael Schreiner Institute for Computational Engineering, University of Buchs, Buchs, Switzerland
Harald Schuh Geodesy and Remote Sensing, Helmholtz Centre Potsdam, GFZ German Research Centre for Geosciences, Potsdam, Germany; TU Berlin, Berlin, Germany
Florian Seitz Deutsches Geodätisches Forschungsinstitut, Munich, Germany
Manuela Seitz Deutsches Geodätisches Forschungsinstitut, Munich, Germany
Monika Sester Institute of Cartography and Geoinformatics, Leibniz University Hannover, Hannover, Germany
Michael Sideris Department of Geomatics Engineering, University of Calgary, Calgary, AB, Canada
Frederik J. Simons Department of Geosciences, Princeton University, Princeton, NJ, USA; Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
Ian H. Sloan School of Mathematics and Statistics, University of New South Wales, Sydney, Australia
Aron Sommer Institut für Informationsverarbeitung (TNT), Leibniz Universität Hannover, Hannover, Germany
Thomas Sonar Computational Mathematics, Technische Universität Braunschweig, Braunschweig, Germany
Wolfgang Sprößig TU Bergakademie Freiberg, Freiberg, Germany
Holger Steeb Kontinuumsmechanik - Mechanik, Ruhr-Universität Bochum, Bochum, Germany
Stephan Stellmach Institut für Geophysik, Westfälische Wilhelms-Universität, Münster, Germany
Jin Sun Institute of Geophysics, ETH Zürich, Zürich, Switzerland
Peter J.G. Teunissen Department of Spatial Sciences, Curtin University of Technology, Perth, Australia; Department of Geoscience and Remote Sensing, Delft University of Technology, Delft, Netherlands
Maik Thomas Geodesy and Remote Sensing, Helmholtz Centre Potsdam, GFZ German Research Centre for Geosciences, Potsdam, Germany; Institute of Meteorology, Freie Universität Berlin, Berlin, Germany
Jeroen Tromp Department of Geosciences, Princeton University, Princeton, NJ, USA; Program in Applied & Computational Mathematics, Princeton University, Princeton, NJ, USA
Rudolf Umla BP Exploration Operating Company Limited, Sunbury on Thames, UK
Bernhard Vogel Institute for Meteorology and Climate Research (IMK), Karlsruhe Institute of Technology, Eggenstein-Leopoldshafen, Germany
Heike Vogel Institute for Meteorology and Climate Research (IMK), Karlsruhe Institute of Technology, Eggenstein-Leopoldshafen, Germany
Joachim Vogt School of Engineering & Science – SES, Jacobs University, Bremen, Germany
Jianzhong Wang Department of Mathematics, Sam Houston State University, Huntsville, TX, USA
Yanfei Wang Key Laboratory of Petroleum Resources Research, Institute of Geology and Geophysics, Chinese Academy of Sciences, Beijing, People’s Republic of China
Johannes Wicht Max-Planck-Institut für Sonnensystemforschung, Katlenburg-Lindau, Germany
Martin Wlotzka University of Heidelberg, Interdisciplinary Center for Scientific Computing, Engineering Mathematics and Computing Lab, Heidelberg, Germany
Detlef Wolf Department of Geodesy and Remote Sensing, German Research Center for Geosciences GFZ, Telegrafenberg, Potsdam, Germany
Kerstin Wolf Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
Markus Wolff Department of Hydromechanics and Modeling of Hydrosystems, Institute of Hydraulic Engineering, University of Stuttgart, Stuttgart, Germany
Robert S. Womersley School of Mathematics and Statistics, University of New South Wales, Sydney, Australia
Grady B. Wright Department of Mathematics, Boise State University, Boise, ID, USA
Peiliang Xu Disaster Prevention Research Institute, Kyoto University, Uji, Kyoto, Japan
Valery A. Zheludev School of Computer Science, Tel Aviv University, Tel Aviv, Israel
Part I General Issues, Historical Background, and Future Perspectives
Geomathematics: Its Role, Its Aim, and Its Potential
Willi Freeden
Contents
1 Introduction
2 Geomathematics as a Cultural Asset
3 Geomathematics as Task and Objective
4 Geomathematics as Interdisciplinary Science
5 Geomathematics as a Challenge
6 Geomathematics as Solution Potential
7 Geomathematics as Solution Method
8 Geomathematics: Three Exemplary “Circuits”
8.1 Circuit: Gravity Field from Deflections of the Vertical
8.2 Circuit: Oceanic Circulation from Ocean Topography
8.3 Circuit: Seismic Processing from Acoustic Wave Tomography
9 Final Remarks
References
Abstract
During the last decades, geosciences and geoengineering were influenced by two essential scenarios: First, the technological progress has completely changed the observational and measurement techniques. Modern high-speed computers and satellite-based techniques are more and more entering all geodisciplines. Second, there is a growing public concern about the future of our planet, its climate, and its environment and about an expected shortage of natural resources. Obviously, both aspects, viz., efficient strategies of protection against threats of a changing Earth and the exceptional situation of getting terrestrial, airborne, as well as spaceborne data of better and better quality, explain the strong need of new mathematical structures, tools, and methods, i.e., geomathematics. This paper deals with geomathematics, its role, its aim, and its potential. Moreover, the “circuit” geomathematics is exemplified by three problems involving the Earth’s structure, namely, gravity field determination from terrestrial deflections of the vertical, ocean flow modeling from satellite (altimeter measured) ocean topography, and reservoir detection from (acoustic) wave tomography.
W. Freeden, Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_1
1 Introduction
Geophysics is an important branch of physics; it differs from the other physical disciplines due to its restriction to objects of geophysical character. Why should mathematics not claim what physics has regarded as its canonical right since the times of Emil Wiechert in the late nineteenth century? More than ever before, there are significant reasons for a well-defined position of geomathematics as a branch of mathematics and simultaneously of geosciences (cf. Fig. 1). On the one hand, these reasons are intrinsically based on the self-conception of mathematics; on the other hand, they are explainable from the current situation of geosciences. In the following, I would like to explain my thoughts on geomathematics in more detail. My objective is to convince not only the geoscientists but also a broader audience: “Geomathematics is essential and important. As geophysics within physics it has an adequate forum within mathematics, and it should have a fully acknowledged position within geosciences!”
[Fig. 1 panels: “Modern high speed computers are entering more and more all geodisciplines.” “There exists a growing public concern about the future of our planet, its climate, its environment, and about an expected shortage of natural resources.” “There is a strong need for strategies of protection against threats of a changing Earth.” “There is an exceptional situation of getting data of better and better quality.”]
Fig. 1 Four significant reasons for the increasing importance of geomathematics
2 Geomathematics as a Cultural Asset
In the chapter “Navigation on Sea: Topics in the History of Geomathematics” of this Handbook of Geomathematics, T. Sonar starts his contribution with the sentence: “Geomathematics in our times is thought of being a very young science and a modern area in the realms of mathematics. Nothing is farer from the truth. Geomathematics began as man realized that he walked across a sphere–like Earth and that this observation has to be taken into account in measurements and calculations.” Consequently, we can only do justice to geomathematics if we look, at least briefly, at its historic importance. According to the oldest surviving written evidence, geomathematics was developed in Sumerian Babylon and ancient Egypt (see Fig. 2) on the basis of practical tasks concerning measuring, counting, and calculation for the purposes of agriculture and stock keeping.
Fig. 2 Papyrus scroll containing indications of algebra, geometry, and trigonometry due to Ahmose (nineteenth century BC) [Department of Ancient Egypt and Sudan, British Museum EA 10057, London, Creative Commons license CC-BY-SA 2.0] (taken from Sonar 2011)
In the ancient world, mathematics dealing with problems of geoscientific relevance flourished for the first time, for example, when Eratosthenes (276–195 BC) of Alexandria calculated the radius of the Earth. We also have evidence that the Arabs carried out an arc measurement northwest of Baghdad in the year 827 AD. Further key results of geomathematical research lead us from the Orient across the occidental Middle Ages to modern times. Copernicus (1473–1543) successfully made the transition from the Ptolemaic geocentric system to the heliocentric system. Kepler (1571–1630) determined the laws of planetary motion. Further milestones from a historical point of view are, for example, the theory of geomagnetism developed by Gilbert (1544–1603), the development of triangulation methods for the determination of meridians by Brahe (1546–1601) and Snellius (1580–1626), the laws of falling bodies by Galilei (1564–1642), and the basic theory on the propagation of seismic waves by Huygens (1629–1695). The laws of gravitation formulated by the English scientist Newton (1643–1727) have taught us that gravitation decreases with an increasing distance from the Earth. In the seventeenth and eighteenth centuries, France took over an essential role through the foundation of the Academy in Paris (1666). Successful discoveries were the theory of the isostatic balance of mass distribution in the Earth’s crust by Bouguer (1698–1758), the calculation of the Earth’s shape and especially of the pole flattening by Maupertuis (1698–1759) and Clairaut (1713–1765), and the development of the calculus of spherical harmonics by Legendre (1752–1833) and Laplace (1749–1827). The nineteenth century was essentially characterized by Gauß (1777–1855). Especially important were the calculation of the lower Fourier coefficients of the Earth’s magnetic field, the hypothesis of electric currents in the ionosphere, as well as the definition of the level set of the geoid (however, the term “geoid” was defined by Listing (1808–1882), a disciple of Gauß). Riemann (1826–1866) made lasting contributions to differential geometry, some of them enabling the later development of general relativity. Helmert (1843–1917) laid the mathematical foundation of modern geodesy. Toward the middle of the twentieth century, the basic idea of the dynamo theory in geomagnetics was developed by Elsasser (1904–1991), Bullard (1907–1980), etc. This very incomplete list (which does not even include essential facets of the last century) already shows that geomathematics is one of the large achievements of mankind from a historic point of view.
3 Geomathematics as Task and Objective
Modern geomathematics deals with the qualitative and quantitative properties of the current or possible structures of the system Earth. It provides concepts for scientific research on the system Earth, and it is simultaneously the driving force behind it. The system Earth (see Fig. 3) consists of a number of elements which represent individual systems themselves. The complexity of the entire system Earth is determined by interacting physical, biological, and chemical processes transforming and transporting energy, material, and information (cf. Emmermann and Raiser 1997). It is characterized by natural, social, and economic processes influencing one another. In most instances, a simple theory of cause and effect is therefore completely inappropriate if we want to understand the system. We have to think in dynamical structures and to account for multiple, unforeseen, and of course sometimes even undesired effects in the case of interventions. Inherent networks must be recognized and made use of, and self-regulation must be accounted for.
Fig. 3 Geosystems Mathematics as key technology penetrating the complex system Earth (modified illustration following Emmermann and Raiser 1997)
All these aspects require a type of mathematics which must be more than a mere collection of theories and numerical methods. Mathematics dedicated to geosciences, i.e., geomathematics, deals with nothing more than the organization of the complexity of the system Earth. Descriptive thinking is required in order to clarify abstract complex situations. We also need a correct simplification of complicated interactions, an appropriate system of mathematical concepts for their description, and exact thinking and formulations. Geomathematics has thus become the key science of the complex system Earth. Wherever there are data and observations to be processed, e.g., the diverse scalar, vectorial, and tensorial clusters of satellite data, we need mathematics. For example, statistics serves for noise reduction, constructive approximation serves for compression and evaluation, and the theory of special function systems yields georelevant graphical and numerical representations; there are mathematical algorithms everywhere. The specific task of geomathematics is to build a bridge between mathematical theory and geophysical as well as geotechnical applications. The special attraction of this branch of mathematics is therefore based on the vivid communication between applied mathematicians, who are more interested in model development, theoretical foundation, and the approximate as well as computational solution of problems, and geoengineers and physicists, who are more familiar with measuring technology, methods of data analysis, implementation of routines, and software applications.
There is a very wide range of modern geosciences on which geomathematics is focused (see Fig. 4), not least because of the continuously increasing observation diversity. Simultaneously, the mathematical “toolbox” is becoming larger. A special feature is that geomathematics primarily deals with those regions of the Earth which are only insufficiently or not at all accessible for direct measurements (even by remote sensing methods (as discussed, e.g., in chapters “Quantitative Remote Sensing Inversion in Earth Science: Theory and Numerical Treatment”, “Inverse Resistivity Problems in Computational Geoscience”, “Analysis of Data from Multi-satellite Geospace Missions”, and “Potential Methods and Geoinformation Systems” of this work)). Inverse methods (see, for instance, chapters “Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution”, “Multiresolution Analysis of Hydrology and Satellite Gravitational Data”, “Elastic and Viscoelastic Response of the Lithosphere to Surface Loading”, “Multiscale Modeling of the Geomagnetic Field and Ionospheric Currents”, “The Forward and Adjoint Methods of Global Electromagnetic Induction for CHAMP Magnetic Data”, “Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives”, “Noise Models for Ill-Posed Problems”, “Sparsity in Inverse Geophysical Problems”, “Multiparameter Regularization in Downward Continuation of Satellite Data”, “Evaluation of Parameter Choice Methods for Regularization of Ill-Posed Problems in Geomathematics”, “Quantitative Remote Sensing Inversion in Earth Science: Theory and Numerical Treatment”, “Inverse Resistivity Problems in Computational Geoscience”, “Identification of Current Sources in 3D Electrostatics”, “Transmission Tomography in Seismology”, “Numerical Algorithms for Non-smooth Optimization Applicable to Seismic Recovery”, “Strategies in Adjoint Tomography”, “Potential-Field Estimation Using Scalar and Vector Slepian Functions at Satellite Altitude”, “Multidimensional Seismic Compression by Hybrid Transform with Multiscale-Based Coding”, “Tomography: Problems and Multiscale Solutions”, “RFMP: An Iterative Best Basis Algorithm for Inverse Problems in the Geosciences”, “Material Behavior: Texture and Anisotropy”, and “Rayleigh Wave Dispersive Properties of a Vector Displacement as a Tool for P- and S-Wave Velocities Near Surface Profiling” of this handbook) are absolutely essential for mathematical evaluation in these cases (see, e.g., Nashed (1981) for a deeper insight). Mostly, a physical quantity is measured in the vicinity of the Earth’s surface, and it is then continued downward or upward by mathematical methods until one reaches the depths or heights of interest.
Fig. 4 Geomathematics, its range of fields, and its disciplines
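The idea of continuing a measured quantity upward or downward can be illustrated with a minimal sketch. It assumes a potential that is harmonic outside a sphere, so that its spherical harmonic part of degree n scales with the radius r like r^-(n+1); the function name and the 400 km orbit altitude are hypothetical choices for illustration and are not taken from the chapter.

```python
EARTH_RADIUS_KM = 6371.0  # mean Earth radius (value quoted in the caption of Fig. 6)


def continuation_factor(degree: int, r_from_km: float, r_to_km: float) -> float:
    """Scale factor of a degree-`degree` outer harmonic when moving from r_from to r_to.

    Assumes a potential that is harmonic outside the sphere, so that its
    degree-n part behaves like r**-(n + 1).
    """
    if degree < 0 or r_from_km <= 0 or r_to_km <= 0:
        raise ValueError("degree must be >= 0 and radii must be positive")
    return (r_from_km / r_to_km) ** (degree + 1)


if __name__ == "__main__":
    satellite_radius_km = EARTH_RADIUS_KM + 400.0  # hypothetical orbit altitude
    for degree in (2, 50, 200):
        # Upward continuation damps a degree-n signal (factor below 1).
        up = continuation_factor(degree, EARTH_RADIUS_KM, satellite_radius_km)
        # Downward continuation amplifies it, together with any noise (factor above 1).
        down = continuation_factor(degree, satellite_radius_km, EARTH_RADIUS_KM)
        print(f"degree {degree:3d}: upward x{up:.3e}, downward x{down:.3e}")
```

Under these assumptions, the damping going up and the amplification going down both grow with the degree. This is one way to see why downward continuation of noisy data usually calls for the inverse and regularization methods referred to above.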
4 Geomathematics as Interdisciplinary Science
Once more it should be mentioned that, today, computers and measurement technology have resulted in an explosive propagation of mathematics in almost every area of society. Mathematics as an interdisciplinary science can be found in almost every area of our lives. Consequently, mathematics is closely interacting with almost every other science, even medicine and parts of the arts (mathematization of sciences). The use of computers allows for the handling of complicated models for real data sets. Modeling, computation, and visualization yield reliable simulations of processes and products. Mathematics is the “raw material” for the models and the essence of each computer simulation. As the key technology, it translates the images of the real world to models of the virtual world, and vice versa (cf. Fig. 5 for an example in seismic tomography). The special importance of mathematics as an interdisciplinary science (see also Neunzert and Rosenberger 1991; Beutelspacher 2001; Pesch 2002) has been acknowledged increasingly within the last few years in technology, economy, and commerce. However, this process does not remain without effects on mathematics itself. New mathematical disciplines, such as scientific computing, financial and business mathematics, industrial mathematics, biomathematics, and also geomathematics, have complemented the traditional disciplines.
Fig. 5 Geomathematics as a key technology bridging the real and virtual world: An example from seismic tomography
Interdisciplinarity also implies the interdisciplinary character of mathematics at school. Relations and references to other disciplines (especially informatics, physics, chemistry, biology, and also economy and geography) become more important, more interesting, and more expandable. Problem areas of mathematics become explicit and observable, and they can be visualized. Of course, this undoubtedly also holds for the system Earth.
5 Geomathematics as a Challenge
From a scientific and technological point of view, the twentieth century was a period with two entirely different faces concerning research and its consequences. The first two thirds of the century were characterized by a movement toward a seemingly inexhaustible future of science and technology; they were marked by the absolute belief in technical progress which would make everything achievable in the end. Up to the 1960s, mankind believed it had become the master of the Earth (note that, in geosciences as well as other sciences, to master is also a synonym for to understand). Geoscience was able to understand plate tectonics on the basis of Wegener’s theory of continental drift, geoscientific research began to deal with the Arctic and Antarctic, and man started to conquer the universe by satellites, so that for the first time in mankind’s history, the Earth became measurable on a global scale. Then, during the last third of the past century, there was a growing skepticism as to whether scientific and technical progress had really advanced us and whether the effects of our achievements could be justified. As a consequence of the specter of a shortage in raw materials (mineral oil and natural gas reserves), predicted by the Club of Rome, geological/geophysical research with the objective of exploring new reservoirs was stimulated during the 1970s (see, e.g., Jakobs and Meyer 1992). Moreover, the last two decades of the century have sensitized us to the global problems resulting from our behavior with respect to climate and environment. Our senses have been sharpened as to the dangers caused by the forces of nature, from earthquakes and volcanic eruptions to the temperature development and the hole in the ozone layer, etc. Man has become aware of his environment. The image of the Earth as a potato drenched by rainfall (which is sometimes drawn by oceanographers) is not a false one (see Fig. 6). The humid layer on this potato, maybe only a fraction of a millimeter thick, is the ocean. The entire atmosphere hosting the weather and climate events is only a little bit thicker. Flat bumps protruding from the humid layer represent the continents. The entire human life takes place in a very narrow region of the outer peel (only a few kilometers in vertical extension). However, the basically excellent comparison of the Earth with a huge potato does not give explicit information about essential ingredients and processes of the system Earth, for example, gravitation, magnetic field, deformation, wind and heat distribution, ocean currents, internal structures, etc.
Fig. 6 (Potato) “Earth” of radius 6,371 km
In our twenty-first century, geoproblems currently seem to overwhelm the scientific programs and solution proposals. “How much more will our planet Earth be able to take?” has become an appropriate and very urgent question. Indeed, there have been a large number of far-reaching changes during the last few decades, e.g., species extinction, climate change, formation of deserts, ocean currents, structure of the atmosphere, transition of the dipole structure of the magnetic field to a quadrupole structure, etc. These changes have been accelerated dramatically. The main reasons for most of these phenomena are the unrestricted growth in the industrial societies (population and consumption, especially of resources, territory, and energy) and severe poverty in the developing and newly industrialized countries.
The dangerous aspect is that extreme changes have taken place within a very short time; there has been no comparable development in the dynamics of the system Earth in the past. Changes brought about by man are much faster than changes due to natural fluctuations. Besides, the current financial crisis shows that our western model of affluence (which holds for approximately 1 billion people) cannot be transferred globally to 5–8 billion people. Massive effects on mankind are inevitable. The appalling résumé is that the geoscientific problems collected over the decades must now all be solved simultaneously. Interdisci-
12
W. Freeden
plinary solutions including intensive mathematical research are urgently required as answers to an increasingly complex world. Geomathematics is absolutely essential for a sustainable development in the future. However, the scientific challenge does not only consist of increasing the leading role of mathematics within the current “scientific consortium Earth”. The significance of the subject “Earth” must also be acknowledged (again) within mathematics itself, so that mathematicians will become more enthusiastic about it. Up to now, it has become usual and good practice in application-oriented mathematical departments and research institutions to present applications in technology, economy, finances, and even medicine as being very career enhancing for young scientists. Geomathematics can be integrated smoothly into this phalanx with current subjects like exploration, geothermal research, navigation, and so on. Mathematics should be the leading science for the solution of these complex and economically very interesting problems, instead of fulfilling mere service functions. Of course, basic research is indispensable. We should not hide behind the other geosciences! Neither should we wait for the next horrible natural disaster! Now is the time to turn expressly toward georelevant applications. The Earth as a complex, however limited system (with its global problems concerning climate, environment, resources, and population) needs new political strategies. Consequently, these will step-by-step also demand changes in research and teaching at the universities due to our modified concept of “well-being” (e.g., concerning milieu, health, work, independence, financial situation, security, etc.). This will be a very difficult process. We dare to make the prognosis that, finally, it will also result in a modified appointment practice at the universities. 
Chairs in the field of geomathematics must increase in number and importance, in order to promote attractiveness, but also to accept a general responsibility for society. The time has come to realize that geomathematics is indispensable as a constituent discipline within a mathematical faculty (instead of “ivory tower-like” parity thinking following traditional structures). Additionally, the curricular standards and models for school lessons in mathematics (see, e.g., Sonar 2001; Bach et al. 2004) also have to change. We will not be able to afford any jealousies or objections on our way in that direction.
6 Geomathematics as Solution Potential
Current methods of applied measurement and evaluation processes vary strongly, depending on the examined measurement parameters (gravity, electric or magnetic field force, temperature and heat flow, etc.), the observed frequency domain, and the occurring basic “field characteristic” (potential field, diffusion field, or wave field, depending on the basic differential equations). In particular, the differential equation strongly influences the evaluation processes. The typical mathematical methods are therefore listed here according to the respective “field characteristic” – as it is usually done in geomathematics:
• Potential methods (potential fields, elliptic differential equations) in geomagnetics, geoelectrics, gravimetry, etc.
• Diffusion methods (diffusion fields, parabolic differential equations) in flow and heat transport, magnetotellurics, geoelectromagnetics, etc.
• Wave methods (wave fields, hyperbolic differential equations) in seismics, georadar, etc.
The diversity of mathematical methods will increase in the future due to new technological developments in computer and measurement technology. More intensively than before, we must aim for the creation of models and simulations for combinations and networks of data and observable structures. The process (i.e., the “circuit”) for the solution of practical problems usually has the following components:
• Mathematical modeling: the practical problem is translated into the language of mathematics, requiring the cooperation between application-oriented scientists and mathematicians.
• Mathematical analysis: the resulting mathematical problem is examined as to its “well-posedness” (i.e., existence, uniqueness, dependence on the input data).
• Development of a mathematical solution method: appropriate analytical, algebraic, statistical/stochastic, and/or numerical methods and processes for a specific solution must be adapted to the problem; if necessary, new methods must be developed. The solution process is carried out efficiently and economically by the decomposition into individual operations, usually on computers.
• “Back-transfer” from the language of mathematics to applications: the results are illustrated adequately in order to ensure their evaluation. The mathematical model is validated on the basis of real data and modified, if necessary. We aim for good accordance of model and reality.
Often, the circuit must be applied several times in an iterative way in order to get a sufficient insight into the system Earth.
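As a purely illustrative toy instance of the circuit described above (it is not taken from this handbook), the following Python sketch repeats the steps of modeling, solution, and back-transfer until the model agrees with the data to within an assumed noise level. The function name `run_circuit`, the polynomial model class, the synthetic data, and the tolerance are all assumptions made for the illustration. For simplicity, the model is validated against the same data it was fitted to.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def run_circuit(x, y, noise_level, max_degree=10):
    """Repeat modeling -> solution -> validation until the misfit reaches the noise level."""
    rms = float("inf")
    for degree in range(max_degree + 1):
        # Modeling and solution: least-squares polynomial of the current degree.
        coeffs = P.polyfit(x, y, degree)
        # Back-transfer: evaluate the model and compare it with the data.
        residual = y - P.polyval(x, coeffs)
        rms = float(np.sqrt(np.mean(residual**2)))
        if rms <= noise_level:
            return degree, rms          # model and data agree: stop iterating
    return max_degree, rms              # tolerance not reached within max_degree


# Synthetic "measurements": a cubic signal plus small Gaussian noise (hypothetical data).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 - 2.0 * x + 0.5 * x**3 + rng.normal(0.0, 0.01, x.size)

degree, rms = run_circuit(x, y, noise_level=0.012)
print(f"accepted polynomial degree: {degree}, rms misfit: {rms:.4f}")
```

The loop refines the model class only as far as the data justify, which mirrors the remark that the circuit is usually traversed several times before model and reality are in good accordance.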
Nonetheless, the advantage and benefit of the mathematical processes are a better, faster, cheaper, and more secure problem solution on the basis of the mentioned means of simulation, visualization, and reduction of large amounts of data. So, what is it exactly that enables mathematicians to build a bridge between the different disciplines? The mathematical world of numbers and shapes contains very efficient tokens by which we can describe the rule-like aspect of real problems. This description includes a simplification by abstraction: essential properties of a problem are separated from unimportant ones and included in a solution scheme. Their “eye for similarities” often enables mathematicians to recognize a posteriori that an adequately reduced problem may also arise from very different situations, so that the resulting solutions may be applicable to multiple cases after an appropriate adaptation or concretization. Without this second step, abstraction remains essentially useless.
The interaction between abstraction and concretization characterizes the history of mathematics and its current rapid development as a common language and independent science. A problem reduced by abstraction is considered as a new “concrete” problem to be solved within a general framework, which determines the validity of a possible solution. The more examples one knows, the more one recognizes the causality between the abstractness of mathematical concepts and their impact and cross-sectional importance. Of course, geomathematics is closely interconnected with geoinformatics, geoengineering, and geophysics. However, geomathematics basically differs from these disciplines (cf. Kümmerer 2002). Engineers and physicists need the mathematical language as a tool. In contrast to this, geomathematics also deals with the further development of the language itself. Geoinformatics concentrates on the design and architecture of processors and computers, databases and programming languages, etc., in a georeflecting environment (cf. chapters Potential Methods and Geoinformation Systems and Geoinformatics). In geomathematics, computers do not represent the objects to be studied, but instead represent technical auxiliaries for the solution of mathematical problems of georeality. Statistics (usually seen as a basic subdiscipline of mathematics) is generally devoted to the analysis and interpretation of uncertainties caused by limited sampling of a property under study. In consequence, the focus of geostatistics is the development and statistical validation of models to describe the distribution in time and space of Earth sciences phenomena. 
Geostatistical realizations (see, e.g., chapters An Introduction to Prediction Methods in Geostatistics, Statistical Analysis of Climate Series, Geodetic Deformation Analysis with Respect to an Extended Uncertainty Budget, and It’s All About Statistics: Global Gravity Field Modeling from GOCE and Complementary Data) aim at integrating physical constraints, combining heterogeneous data sources, and characterizing uncertainty. Applications include a large palette of areas, for example, groundwater hydrology, air quality, and land use change using terrestrial as well as satellite data. Because of both the statistical distribution of sample data and the spatial correlation among the sample data, a large variety of Earth science problems are effectively addressed using statistical procedures. Stochastic systems and processes play an important role in mathematical models of phenomena in geosciences.
7 Geomathematics as Solution Method
Up to now, ansatz functions for the description of geoscientifically relevant parameters have been frequently based on the almost spherical shape of the Earth. By modern satellite positioning methods, the maximum deviation of the actual Earth’s surface from the average Earth’s radius (6,371 km) can be proved as being less than 0.4 %. Although a mathematical formulation in a spherical context may be a restricted simplification, it is at least acceptable for a large number of problems (see chapter Special Functions in Mathematical Geosciences: An Attempt at a Categorization of this handbook for more details). In fact, ellipsoidal nomenclature
(see chapter Spacetime Modeling of the Earth’s Gravity Field by Ellipsoidal Harmonics) is much closer to geophysical and/or geodetic purposes, but the computational and numerical amount of work is a tremendous obstacle that could not be sufficiently cleared so far. Usually, in geosciences, we consider a separable Hilbert space such as the $L^2$ space (of finite signature energy) with a (known) polynomial basis as reference space for ansatz functions. However, there is a striking difference between the $L^2$ space and the Earth’s body and surface. Continuous “surface functions” can be described to arbitrary accuracy, for example, with respect to the $C$- and $L^2$-topology by restrictions of harmonic functions (such as spherical harmonics), whereas “volume functions” contain anharmonic ingredients (for more details see, e.g., Freeden and Gerhards 2013; Michel 2013). This fact has serious consequences for the reconstruction of signatures.
Since the times of C.F. Gauß (1863, Werke, Bd. 5), a standard method for globally reflected surface approximation involving equidistributed data has been the Fourier series in an orthogonal basis in terms of spherical harmonics. It is characteristic for such an approach that these polynomial ansatz functions do not show any localization in space (cf. Fig. 7). In the momentum domain (throughout this work called frequency domain), each spherical function corresponds to exactly one single Fourier coefficient reflecting a certain frequency. We call this ideal frequency localization. Due to the ideal frequency localization and the simultaneous dispensation with localization in space, local data modifications influence all the Fourier coefficients (that have to be determined by global integration). Consequently, this also leads to global modifications of the data representations in case of local changes. Fourier expansions provide approximation by oscillation, i.e., the oscillations grow in number, while the amplitudes become smaller and smaller. Nevertheless, we may state that ideal frequency localization has proved to be extraordinarily advantageous due to the important physical interpretability (as multipole moments) of the model and due to the simple comparability of the Fourier coefficients with observables in geophysical and/or geodetic interrelations (see, e.g., the Meissl diagrams discussed by Meissl (1971), Rummel and van Gelderen (1995), Grafarend (2001), and Nutz (2002)). From a mathematical and physical point of view, however, certain kinds of ansatz functions would be desirable which show ideal frequency localization as well as localization in space. Such an ideal system of ansatz functions would allow for models of highest resolution in space; simultaneously, individual frequencies would remain interpretable. However, the principle of uncertainty, which connects frequency localization and space localization qualitatively and quantitatively, teaches us that both properties are mutually exclusive (except for the trivial case).
Fig. 7 Uncertainty principle and its consequences for space–frequency localization
Fig. 8 Spherical harmonics of low degrees leading to a zonal function, i.e., a sum of Legendre functions
Fig. 9 Weighted summation of spherical harmonics leading to the generation of space-localized zonal kernels
Extreme ansatz functions in the sense of such an uncertainty principle are, on the one hand, spherical polynomials (see Fig. 8), i.e., spherical harmonics (no space localization, ideal frequency localization), and, on the other hand, the Dirac (kernel) function(als) (ideal space localization, no frequency localization). In consequence (see also chapter Special Functions in Mathematical Geosciences: An Attempt at a Categorization of this handbook, and for further details Freeden 1998, 2011; Freeden et al. 1998; Freeden and Maier 2002; Freeden and Schreiner 2009; Freeden and Gutting 2013), (spherical harmonic) Fourier methods are surely well suited to resolve low- and medium-frequency phenomena, while their application is critical when high-resolution models are to be obtained. This difficulty is also well known in theoretical physics, e.g., when describing monochromatic electromagnetic waves or considering the quantum-mechanical treatment of free particles. In this case, plane waves with fixed frequencies (ideal frequency localization, no space localization) are the solutions of the corresponding differential equations, but certainly do not reflect the physical reality. As a remedy, plane waves of different frequencies are superposed to form the so-called wave packets, which gain a certain amount of space localization while losing their ideal frequency (spectral) localization.
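The trade-off described above for wave packets can be made tangible with a small one-dimensional numerical experiment. The Python sketch below superposes plane waves cos(kx) with Gaussian weights around a central wavenumber and measures the spatial spread of the result. The grid, the central wavenumber `K0`, the Gaussian weighting, and the function names are assumptions chosen for illustration only; they are not prescribed by the text.

```python
import numpy as np

x = np.linspace(-40.0, 40.0, 4001)   # spatial grid (arbitrary units)
K0 = 2.0                             # assumed central wavenumber


def wave_packet(sigma_k, n_waves=201):
    """Superpose plane waves cos(k x) with Gaussian weights centered at K0."""
    k = np.linspace(K0 - 4.0 * sigma_k, K0 + 4.0 * sigma_k, n_waves)
    weights = np.exp(-0.5 * ((k - K0) / sigma_k) ** 2)
    weights /= weights.sum()
    return weights @ np.cos(np.outer(k, x))


def spatial_width(f):
    """Standard deviation of position x under the density |f|^2 (mean is 0 by symmetry)."""
    density = f**2 / np.sum(f**2)
    return float(np.sqrt(np.sum(density * x**2)))


narrow_band = spatial_width(wave_packet(sigma_k=0.05))  # nearly monochromatic
broad_band = spatial_width(wave_packet(sigma_k=0.5))    # many frequencies superposed
print(f"narrow band: width {narrow_band:.2f}, broad band: width {broad_band:.2f}")
```

Widening the frequency band by a factor of ten shrinks the spatial width by roughly the same factor, which is the qualitative content of the uncertainty principle: frequency localization is traded for space localization.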
A suitable superposition of polynomial ansatz functions (see Freeden and Schreiner 2009) leads to the so-called kernel functions/kernels with a reduced frequency localization but increased space localization (cf. Fig. 9). These kernels can be constructed so as to cover various spectral bands and, hence, can show all intermediate stages of frequency and space localization. The width of the corresponding frequency and space localization is usually controlled using a so-called scale parameter. If the kernel is given by a finite superposition of polynomial ansatz functions, it is said to be bandlimited, while in the case of infinitely many ansatz functions, the kernel is called non-bandlimited. It turns out that due to their higher frequency localization (short frequency band), the bandlimited kernels show less space localization than their non-bandlimited counterparts (infinite frequency band). This leads to the following characterization of ansatz functions: Fourier methods with polynomial trial functions are the canonical point of departure for approximations of low-frequency phenomena (global to regional modeling). Because of their excellent localization properties in the space domain, bandlimited and non-bandlimited kernels with increasing space localization can be used for modeling phenomena of shorter and shorter wavelength (local modeling). Using kernels of different scales, the modeling approach can be adapted to the localization properties of the physical process under consideration. By use of sequences of scale-dependent kernels tending to the Dirac kernel, i.e., the so-called Dirac sequences, a multiscale approximation (i.e., “zooming-in” process) can be established appropriately.
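To make the distinction between bandlimited and non-bandlimited kernels concrete, the following Python sketch evaluates two zonal kernels of the form K(t) = Σ (2n+1)/(4π) φ(n) P_n(t), where t is the cosine of the spherical distance and P_n is the Legendre polynomial of degree n. The two symbols φ(n) used here, a hard cut-off (often called the Shannon kernel) and φ(n) = hⁿ (the Abel–Poisson kernel), are common textbook examples chosen as assumptions for this illustration; the section itself does not fix a particular kernel. The helper names and the parameter values are likewise hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def zonal_kernel(symbol, t):
    """Evaluate K(t) = sum_n (2n+1)/(4*pi) * symbol[n] * P_n(t)."""
    n = np.arange(len(symbol))
    coeffs = (2 * n + 1) / (4 * np.pi) * np.asarray(symbol, dtype=float)
    return legendre.legval(t, coeffs)


def abel_poisson_closed_form(h, t):
    """Closed form of the series with symbol h**n, valid for 0 <= h < 1."""
    return (1 - h**2) / (4 * np.pi * (1 + h**2 - 2 * h * t) ** 1.5)


# t = cos(spherical distance), sampled from 0 to 180 degrees.
t = np.cos(np.linspace(0.0, np.pi, 181))

# Bandlimited kernel: degrees 0..15 with symbol 1 (hard cut-off).
shannon = zonal_kernel(np.ones(16), t)

# Non-bandlimited kernel: symbol h**n, series truncated far beyond numerical relevance.
h = 0.8
abel_poisson = zonal_kernel(h ** np.arange(200), t)

print("Shannon min/max:     ", shannon.min(), shannon.max())
print("Abel-Poisson min/max:", abel_poisson.min(), abel_poisson.max())
```

In this example the bandlimited kernel oscillates and takes negative values away from t = 1, whereas the non-bandlimited kernel stays positive and decays monotonically. Letting the scale parameter h tend to 1 concentrates the Abel–Poisson kernel ever more sharply around t = 1, which is the behavior of a Dirac sequence.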
Later on in this treatise (see also chapters Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution, Mathematical Properties Relevant to Geomagnetic Field Modeling, Toroidal-Poloidal Decompositions of Electromagnetic Green’s Functions in Geomagnetic Induction, The Forward and Adjoint Methods of Global Electromagnetic Induction for CHAMP Magnetic Data, Stokes Problem, Layer Potentials and Regularizations, and Multiscale Applications, Potential-Field Estimation Using Scalar and Vector Slepian Functions at Satellite Altitude, and Rayleigh Wave Dispersive Properties of a Vector Displacement as a Tool for P- and S-wave Velocities Near Surface Profiling), we deal with simple (scalar and/or vectorial) wavelet techniques, i.e., with multiscale techniques based on special kernel functions: (spherical) scaling functions and wavelets. Typically, the generating functions of scaling functions have the characteristics of lowpass filters, i.e., the polynomial basis functions of higher frequencies are attenuated or even completely left out. The generating functions of wavelets, however, have the typical properties of band-pass filters, i.e., the polynomial basis functions of low and high frequencies are attenuated or even completely left out when constructing the wavelet. Thus, wavelet techniques usually lead to a multiresolution of the Hilbert space under consideration, i.e., a certain two-parameter splitting with respect to scale and space. To be more concrete, the Hilbert space under consideration can be decomposed into a nested sequence of approximating subspaces – the scale spaces – corresponding to the scale parameter. In each scale space, a model of the data function can usually be calculated using the respective scaling functions, thus leading to an approximation of the data at a certain resolution. For increasing scales, the approximation improves and the information obtained on coarse levels
is contained in all levels of approximation above. The difference between two successive approximations is called the detail information, and it is contained in the so-called detail spaces. The wavelets constitute the basis functions of the detail spaces, and summarizing the subject, every element of the Hilbert space can be represented as a structured linear combination of scaling functions and wavelets of different scales and at different positions (“multiscale approximation”) (cf. Freeden et al. (1998); Freeden and Michel (2004); Michel (2002); Michel (2013), and the references therein). Hence, we have found a possibility to break up complicated functions like geomagnetic field, electric currents, gravitational field, deformation field, oceanic currents, propagation speed of seismic waves, etc., into single pieces of different resolutions and to analyze these pieces separately and to decorrelate certain features. This helps to find adaptive methods (cf. Fig. 10) that take into account the specific structure of the data, i.e., in areas where the data show only a few coarse spatial structures, the resolution of the model can be chosen to be rather low; in areas of complicated data structures, the resolution can be increased accordingly. In areas where the accuracy inherent in the measurements is reached, the solution process can be stopped by some kind of thresholding. That is, using scaling functions and wavelets at different scales, the corresponding approximation techniques can be constructed as to be suitable for the particular data situation. Consequently, although most data show correlation in space as well as in frequency, the kernel functions with their simultaneous space and frequency localization allow for the efficient detection and approximation of essential features in the data structure by only using fractions of the original information (decorrelation of signatures) (Fig. 11). 
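The splitting into scale spaces and detail spaces can be sketched in a few lines for a zonal function given by its Legendre coefficients. In the Python sketch below, the scale space at level j is assumed to consist of all degrees below 2^j, and the detail space collects the degrees from 2^j up to 2^(j+1) − 1. This hard cut-off is only one possible choice of scaling function, and the random test signal as well as the helper name `band` are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import legendre


def band(coeffs, lo, hi):
    """Keep the Legendre degrees lo <= n < hi and set all others to zero."""
    out = np.zeros_like(coeffs)
    out[lo:hi] = coeffs[lo:hi]
    return out


# Hypothetical zonal signal with Legendre degrees 0..63 and decaying spectrum.
rng = np.random.default_rng(1)
coeffs = rng.normal(size=64) / (1.0 + np.arange(64))
t = np.linspace(-1.0, 1.0, 401)

J0, J = 2, 6
# Coarse approximation in the scale space of level J0 (degrees 0..3).
approximation = legendre.legval(t, band(coeffs, 0, 2**J0))
detail_norms = []
for j in range(J0, J):
    # Detail information of level j: band-pass part with degrees 2^j .. 2^(j+1)-1.
    detail = legendre.legval(t, band(coeffs, 2**j, 2 ** (j + 1)))
    detail_norms.append(float(np.max(np.abs(detail))))
    # Next scale space = current scale space + detail space.
    approximation = approximation + detail

full_signal = legendre.legval(t, coeffs)
print("max |detail| per level:", detail_norms)
```

After the last level, the accumulated approximation reproduces the full signal up to rounding. In an adaptive scheme, levels whose detail is below the measurement accuracy would simply be dropped, which corresponds to the thresholding mentioned above.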
Finally, it is worth mentioning that future spaceborne observation combined with terrestrial and airborne activities will provide huge data sets of the order of millions of data to be continued downward to the Earth’s surface (see chapters Earth Observation Satellite Missions and Data Access, Satellite-to-Satellite Tracking (Low-Low/High-Low SST), GOCE: Gravitational Gradiometry in a Satellite, Sources of the Geomagnetic Field and the Modern Data That Enable Their Investigation, Mathematical Properties Relevant to Geomagnetic Field Modeling, Using B-Spline Expansions for Ionosphere Modeling, Radio Occultation via Satellites, and Analysis of Data from Multi-satellite Geospace Missions concerning different fields of observation). Standard mathematical theory and numerical methods are not at all adequate for the solution of data systems with a structure such as this, because these methods are simply not adapted to the specific character of the spaceborne problems. They quickly reach their capacity limit even on very powerful computers. In our opinion, a reconstruction of significant geophysical quantities from future data material requires much more: for example, it requires a careful analysis, fast solution techniques, and a proper stabilization of the solution, usually including procedures of regularization (see Freeden 1999; Freeden and Mayer 2003 and the references therein). In order to achieve these objectives, various strategies and structures must be introduced reflecting different aspects (cf. Fig. 12). As already pointed out, while global long-wavelength modeling can be adequately done by the use of polynomial expansions, it becomes more and more obvious
Fig. 10 Scaling functions (upper row) and wavelet functions (lower row) in mutual relation (“tree structure”) within a multiscale reconstruction
Fig. 11 Global multiscale reconstruction of the Earth’s Gravitational Model EGM96 (based on data taken from F. G. Lemoine et al. 1998)
Fig. 12 Structural principles and methods
that splines and/or wavelets are the most likely candidates for medium- and short-wavelength approximation. But the multiscale concept of wavelets has a nature of its own which – in most cases – cannot be developed from the well-known theory in Euclidean spaces. In consequence, the stage is also set to present the essential ideas and results involving a multiscale framework to the geoscientific community (see chapters Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution, Multiresolution Analysis of Hydrology and Satellite Gravitational Data, Multiscale Model Reduction with Generalized Multiscale Finite Element Methods in Geomathematics, Efficient Modeling of Flow and Transport in Porous Media Using Multi-physics and Multi-scale Approaches, Multiscale Modeling of the Geomagnetic Field and Ionospheric Currents, Stokes Problem, Layer Potentials and Regularizations, and Multiscale Applications, Multiparameter Regularization in Downward Continuation of Satellite Data, Multidimensional Seismic Compression by Hybrid Transform with Multiscale-Based Coding, Tomography: Problems and Multiscale Solutions, Splines and Wavelets on Geophysically Relevant Manifolds, and Multiscale Approximation for different realizations).
8 Geomathematics: Three Exemplary “Circuits”
In the sequel, the “circuit” geomathematics as solution potential will be demonstrated with respect to contents, origin, and intention on three completely different geoscientifically relevant examples, namely, the determination of the gravity field from terrestrial deflections of the vertical, the computation of (geostrophic) oceanic
circulation from altimeter satellite data (i.e., RADAR data), and seismic processing from acoustic wave propagation. Their solution can be performed within the same mathematical context; it can be actually described within the same apparatus of concepts and formulas based on the use of fundamental solutions of associated partial differential equations and their appropriate regularizations. Nevertheless, the numerical methods must be specifically adapted to the concrete solution of any of the three problems; in this respect they are basically dissimilar. In the first example (i.e., the gravity field determination), we are confronted with a process of integrating the vectorial surface gradient equation; in the second example, (i.e., modeling of the oceanic circulation), we have to execute a process of differentiation following the vectorial surface curl gradient equation – in both cases to discretely given data which are assumed to be available on spheres, for simplicity. The first example uses vectorial data on the Earth’s surface, and the second example utilizes scalar satellite altimetry data. The essential goal of the third example concerned with seismic (post)processing is to transfer the existing signal, which resulted from acoustic wave propagation from seismograms, under the expectation that designated properties of the target bedrock such as a migration result or a velocity field model can be interpreted from the transformed information. In more detail, the acoustic waves are reflected at the places of impedance contrast (rapid changes of the medium density), propagated back, and then recorded on the surface and/or in available boreholes by receivers of seismic energy. The recorded seismograms are carefully processed to detect fractures along with their location, orientation, and aperture, which are needed for interpretation and definition of the target reservoir. 
In all our three examples, our work is based on a simple regularization procedure of fundamental solutions to associated partial differential equations and the realization of a multiscale approach leading to locally supported wavelets.
8.1 Circuit: Gravity Field from Deflections of the Vertical
The modeling of the gravity field and its equipotential surfaces, especially the geoid of the Earth (see chapters Earth Observation Satellite Missions and Data Access, Satellite-to-Satellite Tracking (Low-Low/High-Low SST), GOCE: Gravitational Gradiometry in a Satellite, Classical Physical Geodesy, Geodetic Boundary Value Problem, Time-Variable Gravity Field and Global Deformation of the Earth, Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution, Spacetime Modeling of the Earth’s Gravity Field by Ellipsoidal Harmonics, Multiresolution Analysis of Hydrology and Satellite Gravitational Data, Time-Varying Mean Sea Level, Gravitational Viscoelastodynamics, Oblique Stochastic Boundary-Value Problem, It’s All About Statistics: Global Gravity Field Modeling from GOCE and Complementary Data, Analysis of Data from Multi-satellite Geospace Missions, and Geodetic World Height System Unification), is essential for many applications (see Fig. 13 for a graphical illustration), from
Fig. 13 Gravity-involved processes (From “German Priority Research Program: Mass transport and mass distribution in the system Earth, DFG–SPP 1257”)
which we only mention some significant examples (essentially following Rummel 2002):
• Earth system: there is a growing awareness of global environmental problems (e.g., the CO2 question, the rapid decrease of rain forests, global sea level changes, etc.). What is the role of the future terrestrial campaigns, airborne duties, and satellite activities in this context? They do not tell us the reasons for physical processes, but it is essential to bring the phenomena into one system (e.g., to make sea level records comparable in different parts of the world). In other words, the geoid, i.e., the equipotential surface at sea level as defined by Listing (1873), is viewed as an almost static reference for many rapidly changing processes and at the same time as a “frozen picture” of tectonic processes that evolved over geological time spans.
• Solid earth physics: the gravity anomaly field has its origin mainly in mass inhomogeneities of the continental and oceanic lithosphere. Together with height information and regional tomography, a much deeper understanding of tectonic processes should be obtainable in the future.
• Physical oceanography: the altimeter satellites in combination with a precise geoid will deliver global dynamic ocean topography. Global surface circulation can be computed resulting in a completely new dimension of ocean modeling. Circulation allows the determination of transport processes of, e.g., polluted material.
• Satellite orbits: for any positioning from space, the uncertainty in the orbit of the spacecraft is the limiting factor. The future spaceborne techniques will basically eliminate all gravitational uncertainties in satellite orbits.
• Geodesy and civil engineering: accurate heights are needed for civil constructions, mapping, etc. They are obtained by leveling, a very time-consuming and expensive procedure. Nowadays geometric heights can be obtained fast and efficiently from space positioning (e.g., GPS, GLONASS, and (future) GALILEO). The geometric heights are convertible to leveled heights by subtracting the precise geoid (see Fig. 22), which is implied by a high-resolution gravitational potential. To be more specific, in those areas where good gravity information is already available, the future data information will eliminate all medium- and long-wavelength distortions in unsurveyed areas. GPS, GLONASS, and GALILEO together with the planned explorer satellite missions for the post-2015 time frame will provide extremely high-quality height information at the global scale.
• Exploration geophysics and prospecting: airborne gravity measurements have usually been used together with aeromagnetic surveys, but the poor precision of airborne gravity measurements has hindered a wider use of this type of measurements. Strong improvements can be expected by combination with terrestrial and spaceborne observations in the future scenario. The basic interest (see, e.g., Bauer et al. (2014)) in gravitational methods in exploration is based on the small variations in the gravitational field anomalies in relation to, e.g., an ellipsoidal reference model, i.e., the so-called “normal” gravitational field.
Mathematical Modeling of the Gravity Field (Classical Approach)
Gravity as provided on the Earth’s surface by absolute and/or relative measurements (see Fig. 14) is the combined effect of the gravitational mass attraction and the centrifugal force due to the Earth’s rotation. The force of gravity provides a directional structure to the space above the Earth’s surface. It is tangential to the vertical plumb lines and perpendicular to all level surfaces. Any water surface at rest is part of a level surface. To a first approximation, as if the Earth were a homogeneous, spherical body, gravity is constant all over the Earth’s surface, with the well-known value $9.8\ \mathrm{m\,s^{-2}}$. The plumb lines are then directed toward the Earth’s center of mass, and this implies that all level surfaces are nearly spherical, too. In reality, gravity decreases from the poles to the equator by about $0.05\ \mathrm{m\,s^{-2}}$ (see Fig. 15). This is caused by the flattening of the Earth’s figure and the negative effect of the centrifugal force, which is maximal at the equator. High mountains and deep ocean trenches (cf. Fig. 16) cause the gravity to vary. Materials within the Earth’s interior are not uniformly distributed. The irregular gravity field shapes the geoid as a virtual surface. The level surfaces are ideal reference surfaces, for example, for heights. The gravity acceleration (gravity) $w$ is the resultant of gravitation $v$ and centrifugal acceleration $c$ (see chapters Classical Physical Geodesy, Geodetic Boundary Value Problem, Time-Varying Mean Sea Level, and Geodetic World Height System Unification):
$$w = v + c. \tag{1}$$
Fig. 14 The falling apple: Newton’s approach to absolute (left) and relative (right) gravity observation
Fig. 15 Illustration of the gravity intensity: constant, i.e., $9.8\ \mathrm{m\,s^{-2}}$ (left), decreasing to the poles by about $0.05\ \mathrm{m\,s^{-2}}$ (mid), and real simulation (right)
Fig. 16 Illustration of the constituents of the gravity intensity g (ESA medialab, ESA communication production SP–1314)
Fig. 17 Gravitation v; centrifugal acceleration c; gravity acceleration w (figure labels: direction of plumb line, center of mass)
The centrifugal force $c$ arises as a result of the rotation of the Earth about its axis. Here, we assume a rotation of constant angular velocity around the rotational axis $x_3$, which is further assumed to be fixed with respect to the Earth. The centrifugal acceleration acting on a unit mass is directed outward, perpendicularly to the spin axis (see Fig. 17). If the $\varepsilon^3$-axis of an Earth-fixed coordinate system coincides with the axis of rotation, then we have $c = \nabla C$, where $C$ is the so-called centrifugal potential (with $\{\varepsilon^1, \varepsilon^2, \varepsilon^3\}$ the canonical orthonormal system in Euclidean space $\mathbb{R}^3$). The direction of the gravity $w$ is known as the direction of the plumb line, and the quantity $g = |w|$ is called the gravity intensity (often just gravity). The gravity potential of the Earth can be expressed in the form
$$W = V + C. \tag{2}$$
The (vectorial) gravity acceleration $w$ (see chapter Gravitational Viscoelastodynamics for more details) is given by
$$w = \nabla W = \nabla V + \nabla C. \tag{3}$$
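As a rough numerical illustration of the decomposition of gravity into a gravitational and a centrifugal part, the following Python sketch evaluates both gradients for a strongly simplified Earth. It assumes a homogeneous, non-flattened sphere, so that the gravitational potential outside is V(x) = GM/|x|, and it uses the standard centrifugal potential C(x) = ½ω²(x₁² + x₂²) for a rotation about the third axis. The numerical constants (GM, ω, and the mean radius) are commonly quoted values inserted as assumptions; the function names are hypothetical.

```python
import numpy as np

GM = 3.986004418e14   # m^3 s^-2, geocentric gravitational constant (assumed value)
OMEGA = 7.292115e-5   # rad s^-1, angular velocity of the Earth (assumed value)
R = 6.371e6           # m, mean Earth radius


def grad_V(x):
    """Gradient of V(x) = GM/|x| (assumption: homogeneous spherical Earth)."""
    r = np.linalg.norm(x)
    return -GM * x / r**3


def grad_C(x):
    """Gradient of C(x) = 0.5 * OMEGA^2 * (x1^2 + x2^2), rotation about the third axis."""
    return OMEGA**2 * np.array([x[0], x[1], 0.0])


def gravity(x):
    """Gravity acceleration as the sum of gravitational and centrifugal parts."""
    return grad_V(x) + grad_C(x)


g_pole = float(np.linalg.norm(gravity(np.array([0.0, 0.0, R]))))
g_equator = float(np.linalg.norm(gravity(np.array([R, 0.0, 0.0]))))
print(f"g at pole:    {g_pole:.4f} m/s^2")
print(f"g at equator: {g_equator:.4f} m/s^2")
print(f"difference:   {g_pole - g_equator:.4f} m/s^2")
```

On this idealized sphere the centrifugal term alone lowers the gravity intensity at the equator by about 0.034 m s⁻². The remainder of the pole-to-equator difference of about 0.05 m s⁻² quoted in the text is attributed there to the flattening of the Earth’s figure, which this sketch deliberately ignores.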
The surfaces of constant gravity potential $W(x) = \mathrm{const}$, $x \in \mathbb{R}^3$, are designated as equipotential (level, or geopotential) surfaces of gravity. The gravity potential $W$ of the Earth is the sum of the gravitational potential $V$ and the centrifugal potential $C$, i.e., $W = V + C$. In an Earth-fixed coordinate system, the centrifugal potential $C$ is explicitly known. Hence, the determination of equipotential surfaces of the potential $W$ is strongly related to the knowledge of the gravitational potential $V$. The gravity vector $w$ given by $w = \nabla W$ is normal to the equipotential surface passing through the same point. Thus, equipotential surfaces intuitively express the notion of tangential surfaces, as they are normal to the plumb lines given by the direction of the gravity vector (cf. Fig. 18). The traditional concept in gravitational field modeling is based on the assumption that all over the Earth the position (i.e., latitude and longitude) and the scalar gravity $g$ are available. Moreover, it is common practice that the gravitational effects of the sun, moon, Earth’s atmosphere, etc., are accounted for by means of corrections. The gravitational part of the gravity potential can then be regarded as a harmonic function in the exterior of the Earth. A classical approach to gravity field modeling was conceived by Stokes (1849), Helmert (1981), and Neumann (1887). They proposed to reduce the given gravity accelerations from the Earth’s surface to the geoid. As the geoid is a level surface, its potential value is constant. The difference between the reduced gravity on the geoid and the reference gravity on the so-called normal ellipsoid is called the gravity anomaly. The disturbing potential, i.e., the difference between the actual and the reference potential, can be obtained from a (third) boundary value problem of potential theory. Its solution is representable in integral form, i.e., by the Stokes integral.
The disadvantage of the Stokes approach is that the reduction to the geoid requires the introduction of assumptions concerning the unknown mass distribution between the Earth’s surface and the geoid (for more details concerning the classical theory, the reader is referred, e.g., to chapter Gravitational Viscoelastodynamics and the references therein).
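The decomposition $W = V + C$ can be made concrete with a minimal numerical sketch. It assumes the standard expression $C(x) = \frac{1}{2}\Omega^2 (x_1^2 + x_2^2)$ for a rotation about the $x_3$-axis, which is not written out in the text above, and it replaces $V$ by a point-mass potential $GM/|x|$ purely for illustration. The constants and all function names are assumptions of this sketch.

```python
import numpy as np

OMEGA = 7.292115e-5   # assumed angular velocity of the Earth [rad/s]
GM = 3.986004418e14   # assumed geocentric gravitational constant [m^3/s^2]
A_EQ = 6378137.0      # assumed equatorial radius [m]

def centrifugal_potential(x):
    # Assumed standard form C(x) = 0.5 * Omega^2 * (x1^2 + x2^2)
    return 0.5 * OMEGA**2 * (x[0]**2 + x[1]**2)

def gravity_potential(x):
    # W = V + C with a point-mass stand-in for V (illustration only)
    return GM / np.linalg.norm(x) + centrifugal_potential(x)

def gradient(f, x, h=10.0):
    # Central finite differences in each coordinate direction
    g = np.zeros(3)
    for i in range(3):
        e = np.zeros(3)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

x_equator = np.array([A_EQ, 0.0, 0.0])
w = gradient(gravity_potential, x_equator)           # gravity vector w = grad W
grad_c = gradient(centrifugal_potential, x_equator)  # centrifugal part grad C
print("w at the equator [m/s^2]:", w)
print("centrifugal part [m/s^2]:", grad_c)
```

In this simplified setting the centrifugal part at the equator is about 0.034 m/s², i.e., a few tenths of a percent of the gravitational attraction, and it points away from the rotation axis.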
Fig. 18 Level surface and plumb line
Geomathematics: Its Role, Its Aim, and Its Potential
Fig. 19 Regularity at infinity
Next, we briefly recapitulate the classical approach to global gravity field determination by formulating the differential/integral relations between gravity disturbance, gravity anomaly, and deflections of the vertical on the one hand, and the disturbing potential and the geoidal undulations on the other hand. The representations of the disturbing potential in terms of gravity disturbances, gravity anomalies, and deflections of the vertical are written in terms of well-known integral representations over the geoid. For practical purposes the integrals are replaced by approximate formulas using certain integration weights and knots conventionally within a spherical framework (see chapter Numerical Integration on the Sphere for a survey paper on numerical integration on the sphere).

Equipotential surfaces of the gravity potential $W$ allow in general no simple representation (see Figs. 20, 21, 22). This is the reason why a reference surface (in physical geodesy usually an ellipsoid of revolution) is chosen for the (approximate) construction of the geoid (see Fig. 22). As a matter of fact, the deviations of the gravity field of the Earth from the normal field of such an ellipsoid are small. The remaining parts of the gravity field are gathered in a so-called disturbing gravity field $\nabla T$ corresponding to the disturbing potential $T$. Knowing the gravity potential, all equipotential surfaces, including the geoid, are given by an equation of the form $W(x) = \mathrm{const}$. By introducing $U$ as the normal gravity potential corresponding to the ellipsoidal field and $T$ as the disturbing potential, we are led to a decomposition of the gravity potential in the form
$$ W = U + T \tag{4} $$
such that
(C1) the center of the ellipsoid coincides with the center of gravity of the Earth,
(C2) the difference of the mass of the Earth and the mass of the ellipsoid is zero.
According to the classical Newton law of gravitation (1687), knowing the density distribution of a body, the gravitational potential can be computed everywhere in $\mathbb{R}^3$ (see chapter Time-Varying Mean Sea Level for more information in time–space-dependent relation). More explicitly, the gravitational potential $V$ of the Earth’s exterior is given by
$$ V(x) = \frac{G}{4\pi} \int_{\mathrm{Earth}} \rho(y)\, |x - y|^{-1}\, dV(y), \quad x \in \mathbb{R}^3 \setminus \mathrm{Earth}, $$
where $G$ is the gravitational constant ($G = 6.6742 \cdot 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}}$) and $\rho$ is the density
W. Freeden
Fig. 20 Geoidal surface over oceans (top) and over the whole Earth (bottom), modeled by smoothed Haar wavelets, Freeden et al. (2009), Geomathematics Group, Kaiserslautern
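function. The properties of the gravitational potential $V$ in the Earth’s exterior are easily described as follows: $V$ is harmonic in $x \in \mathbb{R}^3 \setminus \mathrm{Earth}$, i.e., $\Delta V(x) = 0$, $x \in \mathbb{R}^3 \setminus \mathrm{Earth}$. Moreover, the gravitational potential $V$ is regular at infinity, i.e.,
$$ |V(x)| = O\!\left(\frac{1}{|x|}\right), \quad |x| \to \infty, \tag{5} $$
$$ |\nabla V(x)| = O\!\left(\frac{1}{|x|^2}\right), \quad |x| \to \infty. \tag{6} $$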
Fig. 21 Geoidal surface (provided by R. Haagmans, European Space Agency, Earth Surfaces and Interior Section, ESTEC, Noordwijk, ESA ID number SEMLXEOA90E)
Fig. 22 Illustration of the gravity anomaly vector $\alpha(x) = w(x) - u(y)$ and the gravity disturbance vector $\delta(x) = w(x) - u(x)$ (figure labels: geoid $W = \mathrm{const} = W_0$, reference ellipsoid $U = \mathrm{const} = U_0$, geoidal height $N(x)$)
Note that for suitably large values $|x|$ (see Fig. 19), we have $|y| \le \frac{1}{2}|x|$, hence, $|x - y| \ge \big|\,|x| - |y|\,\big| \ge \frac{1}{2}|x|$. However, the actual problem is that in reality the density distribution $\rho$ is very irregular and only known for parts of the upper crust of the Earth. Actually, geoscientists would like to know it from measuring the gravitational field (gravimetry problem). Even if the Earth is supposed to be spherical, the determination of the gravitational potential by integrating Newton’s potential is not achievable (see chapters Classical Physical Geodesy, Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives, Sparsity in Inverse Geophysical Problems, and RFMP: An Iterative Best Basis Algorithm for Inverse Problems in the Geosciences for inversion methods of Newton’s law).

Remark 1. As already mentioned, the classical remedy avoiding any knowledge of the density inside the Earth is the formulation of boundary value problems of potential theory to determine the external gravitational potential from terrestrial data. For reasons of demonstration, however, here we do not follow the standard (Vening–Meinesz) approach in physical geodesy of obtaining the disturbing potential from deflections of the vertical as terrestrial data (as explained in chapter Classical Physical Geodesy). Our approach is based on the context as developed by Freeden and Schreiner (2009) and Freeden and Wolf (2008). Furthermore, it should be noted that the determination of the Earth’s gravitational potential under the assumptions of nonspherical geometry and terrestrial oblique derivatives in the form of a (deterministic or stochastic) boundary value problem of potential theory is discussed, e.g., in chapters Geodetic Boundary Value Problem and Oblique Stochastic Boundary-Value Problem. Further basic details concerning oblique derivative problems can be found in Freeden and Michel (2004), Gutting (2007) and in chapters Geodetic Boundary Value Problem, Oblique Stochastic Boundary-Value Problem, and Fast Spherical/Harmonic Spline Modeling.

A point $x$ of the geoid is projected onto the point $y$ of the ellipsoid by means of the ellipsoidal normal (see Fig. 22). The distance between $x$ and $y$ is called the geoidal height or geoidal undulation. The gravity anomaly vector is defined as the difference between the gravity vector $w(x)$ and the normal gravity vector $u(y)$, $u = \nabla U$, i.e., $\alpha(x) = w(x) - u(y)$ (see Fig. 22). It is also possible to distinguish the vectors $w$ and $u$ at the same point $x$ to get the gravity disturbance vector $\delta(x) = w(x) - u(x)$. Of course, several basic mathematical relations between the quantities just mentioned are known (see, e.g., chapters Classical Physical Geodesy and Geodetic World Height System Unification). In what follows, we only heuristically describe the fundamental relations. We start by observing that the gravity disturbance vector at the point $x$ can be written as
$$ \delta(x) = w(x) - u(x) = \nabla \big( W(x) - U(x) \big) = \nabla T(x). \tag{7} $$
Expanding the potential $U$ at $x$ according to Taylor’s theorem and truncating the series at the linear term, we get
$$ U(x) \doteq U(y) + \frac{\partial U}{\partial \nu'}(y)\, N(x) \tag{8} $$
($\doteq$ means approximation in the linearized sense). Here, $\nu'(y)$ is the ellipsoidal normal at $y$, i.e., $\nu'(y) = -u(y)/\gamma(y)$, $\gamma(y) = |u(y)|$, and the geoid undulation $N(x)$, as indicated in Fig. 22, is the aforementioned distance between $x$ and $y$, i.e., between the geoid and the reference ellipsoid. Using
$$ \gamma(y) = |u(y)| = -\nu'(y) \cdot u(y) = -\nu'(y) \cdot \nabla U(y) = -\frac{\partial U}{\partial \nu'}(y) \tag{9} $$
we arrive at
$$ N(x) = \frac{T(x) - \big( W(x) - U(y) \big)}{|u(y)|} = \frac{T(x) - \big( W(x) - U(y) \big)}{\gamma(y)}. $$
Letting $U(y) = W(x) = \mathrm{const} = W_0$, we obtain the so-called Bruns’ formula
$$ N(x) = \frac{T(x)}{\gamma(y)}. \tag{10} $$
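As a quick plausibility check of Bruns’ formula (10), the following sketch converts values of the disturbing potential into geoidal heights. It assumes the spherical value $\gamma = GM/R^2$ for the normal gravity with rounded constants, and it uses the extreme values quoted in the caption of Fig. 24 as sample inputs; the function names are hypothetical.

```python
GM = 3.986004418e14  # assumed geocentric gravitational constant [m^3/s^2]
R = 6371000.0        # assumed mean Earth radius [m]

def normal_gravity_spherical(radius):
    # Spherical approximation gamma = GM / r^2 (an assumption of this sketch)
    return GM / radius**2

def geoid_undulation(T, gamma):
    # Bruns' formula (10): N = T / gamma
    return T / gamma

gamma = normal_gravity_spherical(R)
N_max = geoid_undulation(833.0, gamma)     # sample T in m^2/s^2 (Fig. 24 caption)
N_min = geoid_undulation(-1038.0, gamma)   # sample T in m^2/s^2 (Fig. 24 caption)
print(f"gamma = {gamma:.4f} m/s^2, N_max = {N_max:.1f} m, N_min = {N_min:.1f} m")
```

Under these assumptions a disturbing potential of several hundred m²/s² corresponds to geoidal heights of the order of 100 m, which matches the magnitudes usually quoted for the global geoid.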
It should be noted that Bruns’ formula (10) relates the physical quantity $T$ to the geometric quantity $N$ (cf. Bruns 1878). In what follows we are interested in introducing the deflections of the vertical of the gravity disturbing potential $T$. For this purpose, let us consider the vector field $\nu(x) = -w(x)/|w(x)|$. This gives us the identity (with $g(x) = |w(x)|$ and $\gamma(x) = |u(x)|$)
$$ w(x) = \nabla W(x) = -|w(x)|\, \nu(x) = -g(x)\, \nu(x). \tag{11} $$
Furthermore, we have
$$ u(x) = \nabla U(x) = -|u(x)|\, \nu'(x) = -\gamma(x)\, \nu'(x). \tag{12} $$
The deflection of the vertical $\Theta(x)$ at the point $x$ on the geoid is defined to be the angular (i.e., tangential) difference between the directions $\nu(x)$ and $\nu'(x)$, i.e., the plumb line and the ellipsoidal normal in the same point:
$$ \Theta(x) = \nu(x) - \nu'(x) - \big( (\nu(x) - \nu'(x)) \cdot \nu(x) \big)\, \nu(x). \tag{13} $$
Clearly, because of (13), $\Theta(x)$ is orthogonal to $\nu(x)$, i.e., $\Theta(x) \cdot \nu(x) = 0$. Since the plumb lines are orthogonal to the level surfaces of the geoid and the ellipsoid, respectively, the deflections of the vertical give, briefly speaking, a measure of the gradient of the level surfaces. This aspect will be described in more detail below: from (11), we obtain, in connection with (13),
$$ w(x) = \nabla W(x) = -|w(x)| \Big( \Theta(x) + \nu'(x) + \big( (\nu(x) - \nu'(x)) \cdot \nu(x) \big)\, \nu(x) \Big). \tag{14} $$
Altogether, we get for the gravity disturbance vector
$$ w(x) - u(x) = \nabla T(x) = -|w(x)| \Big( \Theta(x) + \big( (\nu(x) - \nu'(x)) \cdot \nu(x) \big)\, \nu(x) \Big) - \big( |w(x)| - |u(x)| \big)\, \nu'(x). \tag{15} $$
The magnitude
$$ D(x) = |w(x)| - |u(x)| = g(x) - \gamma(x) \tag{16} $$
is called the gravity disturbance, while
$$ A(x) = |w(x)| - |u(y)| = g(x) - \gamma(y) \tag{17} $$
is called the gravity anomaly. Since the vector $\nu(x) - \nu'(x)$ is (almost) orthogonal to $\nu'(x)$, the term involving $(\nu(x) - \nu'(x)) \cdot \nu(x)$ can be neglected in (15). Hence, it follows that
$$ w(x) - u(x) = \nabla T(x) \doteq -|w(x)|\, \Theta(x) - \big( |w(x)| - |u(x)| \big)\, \nu'(x). \tag{18} $$
The gradient $\nabla T(x)$ can be split into a normal part (pointing in the direction of $\nu'(x)$) and an angular (tangential) part (characterized by the surface gradient $\nabla^*$). It follows that
$$ \nabla T(x) = \frac{\partial T}{\partial \nu'}(x)\, \nu'(x) + \frac{1}{|x|}\, \nabla^* T(x). \tag{19} $$
By comparison of (18) and (19), we therefore obtain
$$ D(x) = g(x) - \gamma(x) = |w(x)| - |u(x)| = -\frac{\partial T}{\partial \nu'}(x), \tag{20} $$
i.e., the gravity disturbance, besides being the difference in magnitude of the actual and the normal gravity vector, is also the normal component of the gravity disturbance vector. In addition, we are led to the angular (i.e., tangential) differential equation
$$ \frac{1}{|x|}\, \nabla^* T(x) = -|w(x)|\, \Theta(x). \tag{21} $$
Remark 2. The reference ellipsoid deviates from a sphere only by quantities of the order of the flattening. Therefore, in numerical calculations, we treat the reference ellipsoid as a sphere around the origin with (certain) mean radius $R$. This may cause a relative error of the same order (for more details the reader is referred to standard textbooks of physical geodesy). In this way, together with suitable prereduction processes of gravity, formulas are obtained that are rigorously valid for the sphere.

Remark 3. Since $|\Theta(x)|$ is a small quantity, it may be (without loss of precision) multiplied either by $|w(x)|$ or by $|u(x)|$, i.e., by $g(x)$ or by $\gamma(x)$.
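In the well-known spherical approximation, we have
$$ \gamma(y) = |u(y)| = \frac{GM}{|y|^2}, \tag{22} $$
$$ \frac{\partial \gamma}{\partial \nu'}(y) = \frac{y}{|y|} \cdot \nabla_y \gamma(y) = -2\, \frac{GM}{|y|^3}, \tag{23} $$
and
$$ \frac{1}{\gamma(y)}\, \frac{\partial \gamma}{\partial \nu'}(y) = -\frac{2}{|y|}, \tag{24} $$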
where $G$ is the gravitational constant and $M$ is the mass. If now, as explained above, the relative error between normal ellipsoid and mean sphere of radius $R$ is permissible, we are allowed to go over to the spherical nomenclature $x = R\xi$, $R = |x|$, $\xi \in \mathbb{S}^2$, with $\mathbb{S}^2$ being the unit sphere in $\mathbb{R}^3$. Replacing $|u(R\xi)|$ by its spherical approximation $GM/R^2$, we find
$$ \nabla^*_\xi T(R\xi) = -\frac{GM}{R}\, \Theta(R\xi), \quad \xi \in \mathbb{S}^2. \tag{25} $$
By virtue of Bruns’ formula (10), we finally find the relation between geoidal undulations and deflections of the vertical
$$ \frac{GM}{R^2}\, \nabla^*_\xi N(R\xi) = -\frac{GM}{R}\, \Theta(R\xi), \quad \xi \in \mathbb{S}^2, \tag{26} $$
i.e.,
$$ \nabla^*_\xi N(R\xi) = -R\, \Theta(R\xi), \quad \xi \in \mathbb{S}^2. \tag{27} $$
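In other words, the knowledge of the geoidal undulations allows the determination of the deflections of the vertical by taking the surface gradient on the unit sphere. From the identity (20), it follows that
$$ \begin{aligned} -\frac{\partial T}{\partial \nu'}(x) = D(x) &= |w(x)| - |\gamma(x)| \\ &\doteq |w(x)| - |\gamma(y)| - \frac{\partial \gamma}{\partial \nu'}(y)\, N(x) \\ &= A(x) - \frac{\partial \gamma}{\partial \nu'}(y)\, N(x), \end{aligned} \tag{28} $$
where $A$ represents the scalar gravity anomaly as defined by (17). Observing Bruns’ formula, we get
$$ A(x) = -\frac{\partial T}{\partial \nu'}(x) + \frac{1}{\gamma(y)}\, \frac{\partial \gamma}{\partial \nu'}(y)\, T(x). \tag{29} $$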
In the sense of physical geodesy, the meaning of the spherical approximation should be carefully kept in mind. It is used only for expressions relating to small quantities of the disturbing potential, the geoidal undulations, the gravity disturbances, the gravity anomalies, etc. Together with the Laplace equation in the exterior of the sphere around the origin with radius $R$ and the regularity at infinity, the equations
$$ D(x) = -\frac{x}{|x|} \cdot \nabla T(x) \tag{30} $$
and
$$ A(x) = -\frac{x}{|x|} \cdot \nabla T(x) - \frac{2}{R}\, T(x) \tag{31} $$
represent the so-called fundamental equations of physical geodesy in spherical approximation (see, e.g., Heiskanen and Moritz 1967; Groten 1979; Torge 1991). Actually, the identities (30) and (31), respectively, serve as boundary conditions of boundary value problems of potential theory, which are known as the Neumann and Stokes problems (see also chapters Classical Physical Geodesy and Geodetic Boundary Value Problem of this handbook for more details).

The study of Figs. 23–26 leads us to the following remarks: the gravity disturbances, which enable a physically oriented comparison of the real Earth with the ellipsoidal Earth model, are consequences of the imbalance of forces inside the Earth according to Newton’s law of gravitation. This suggests the presence of density anomalies. In analogy, the difference between the actual level surfaces of the gravity potential and the level surfaces of the model body forms a measure for the deviation of the Earth from a state of hydrostatic balance. In particular, the geoidal undulations (geoidal heights) represent the deflections from the equipotential surface on the mean level of the ellipsoid. The geoidal anomalies generally show no essential correlation to the distribution of the continents (see Fig. 20). It is conjecturable that the geoidal undulations mainly depend on the reciprocal distance of the density anomaly. They are influenced by lateral density variations of large vertical extension, from the core–mantle layer to the crustal layers. In fact, the direct geoidal signal, which would result from the attraction of the continental and the oceanic bottom, would be several hundreds of meters (for more details, see Rummel (2002) and the references therein). In consequence, the weights of the continental and oceanic masses in the Earth’s interior are almost perfectly balanced. This is the phenomenon of isostatic compensation that was observed by P. Bouguer as early as 1750.
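The fundamental equations (30) and (31) can be checked on a simple test potential. The sketch below assumes a disturbing potential consisting of a single zonal outer harmonic of degree $n = 2$, which is homogeneous of degree $-(n+1)$, so that on the sphere of radius $R$ one expects $D = \frac{n+1}{R} T$ and $A = \frac{n-1}{R} T$. The test potential, the finite-difference step, and the function names are assumptions made for illustration only.

```python
import numpy as np

R = 6371000.0  # assumed mean Earth radius [m]

def T_degree2(x):
    # Assumed test potential: zonal outer harmonic of degree 2, order 1 on |x| = R
    r = np.linalg.norm(x)
    return (R / r)**3 * 0.5 * (3.0 * (x[2] / r)**2 - 1.0)

def radial_derivative(f, x, h=100.0):
    # Central difference of f along the radial direction x/|x|
    xi = x / np.linalg.norm(x)
    return (f(x + h * xi) - f(x - h * xi)) / (2.0 * h)

def gravity_disturbance(f, x):
    # Eq. (30): D(x) = -(x/|x|) . grad T(x)
    return -radial_derivative(f, x)

def gravity_anomaly(f, x):
    # Eq. (31): A(x) = -(x/|x|) . grad T(x) - (2/R) T(x)
    return -radial_derivative(f, x) - 2.0 / R * f(x)

n = 2
x = R * np.array([0.6, 0.0, 0.8])   # a point on the sphere of radius R
T = T_degree2(x)
D = gravity_disturbance(T_degree2, x)
A = gravity_anomaly(T_degree2, x)
print("D / T * R =", D / T * R, " (expected n + 1 =", n + 1, ")")
print("A / T * R =", A / T * R, " (expected n - 1 =", n - 1, ")")
```

The factor $n - 1$ also shows why a degree-one part of $T$ would produce no gravity anomaly at all, which fits the role of condition (C1).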
Fig. 23 Absolute values of the deflections of the vertical and their directions computed from EGM96 (cf. Lemoine et al. 1996) from degree 2 up to degree 360 (reconstructed by use of space-limited (locally supported) scaling functions (from the PhD-thesis Wolf (2009), Geomathematics Group, University of Kaiserslautern)) (min $= 0$, max $= 3.0 \cdot 10^{-4}$)

Fig. 24 Disturbing potential in [m²/s²] computed from EGM96 from degree 2 up to degree 360 (reconstructed by use of space-limited (locally supported) scaling functions (from the PhD-thesis Wolf (2009), Geomathematics Group, University of Kaiserslautern)) (min $-1038\ \mathrm{m^2/s^2}$, max $833\ \mathrm{m^2/s^2}$)
Fig. 25 Gravity disturbances in [m/s²] computed from EGM96 from degree 2 up to degree 360 (reconstructed by the use of space-limited (locally supported) scaling functions (from the PhD-thesis Wolf (2009), Geomathematics Group, University of Kaiserslautern)) (min $= -3.6 \cdot 10^{-3}\ \mathrm{m/s^2}$, max $= 4.4 \cdot 10^{-3}\ \mathrm{m/s^2}$)
Fig. 26 Gravity anomalies in [m/s²] computed from EGM96 from degree 2 up to degree 360 (reconstructed by use of space-limited (locally supported) scaling functions (from the PhD-thesis Wolf (2009), Geomathematics Group, University of Kaiserslautern)) (min $= -3.4 \cdot 10^{-3}\ \mathrm{m/s^2}$, max $= 4.4 \cdot 10^{-3}\ \mathrm{m/s^2}$)
Mathematical Analysis

Equation (27) under consideration is of vectorial tangential type with the unit sphere $\mathbb{S}^2$ in Euclidean space $\mathbb{R}^3$ as the domain of definition (see also chapters Gauss’ and Weber’s “Atlas of Geomagnetism” (1840) Was not the first: the History of the Geomagnetic Atlases, Sources of the Geomagnetic Field and the Modern Data That Enable Their Investigation, Convection Structures of Binary Fluid Mixtures in Porous Media, Multiscale Modeling of the Geomagnetic Field and Ionospheric Currents, Toroidal-Poloidal Decompositions of Electromagnetic Green’s Functions in Geomagnetic Induction, Using B-Spline Expansions for Ionosphere Modeling, and Potential-Field Estimation Using Scalar and Vector Slepian Functions at Satellite Altitude for relations to geomagnetism). More precisely, we are confronted with a surface gradient equation $\nabla^* P = p$ with $p$ as given continuous vector field (i.e., $p(\xi) = -R\, \Theta(R\xi)$, $\xi \in \mathbb{S}^2$) and $P$ as continuously differentiable scalar field that is desired to be reconstructed. In the previously formulated abstraction, the determination of the surface potential function $P$ via the equation $\nabla^* P = p$ is certainly not uniquely solvable. We are able to add an arbitrary constant to $P$ without changing the equation. For our problem, however, this argument is not valid, since we have to observe additional integration relations resulting from the conditions (C1) and (C2), namely,
$$ \int_{\mathbb{S}^2} P(\xi)\, (\varepsilon^i \cdot \xi)^k\, dS(\xi) = 0; \quad k = 0, 1; \quad i = 1, 2, 3. \tag{32} $$
Consequently, we are able to guarantee uniqueness. The solution theory is based on the Green theorem (see Freeden et al. 1998; Freeden and Schreiner 2009; Freeden and Gerhards 2010; Gerhards 2011; Freeden and Gerhards 2013)
$$ P(\xi) = \frac{1}{4\pi} \int_{\mathbb{S}^2} P(\eta)\, dS(\eta) - \int_{\mathbb{S}^2} \nabla^*_\eta G(\xi \cdot \eta) \cdot \nabla^*_\eta P(\eta)\, dS(\eta), \tag{33} $$
where
$$ t \mapsto G(t) = \frac{1}{4\pi} \ln(1 - t) + \frac{1}{4\pi} (1 - \ln 2), \quad t \in [-1, 1), \tag{34} $$
is the Green function (i.e., the fundamental solution) with respect to the (Laplace)–Beltrami operator on the unit sphere $\mathbb{S}^2$. This leads us to the following result: suppose that $p$ is a given continuous tangential, curl-free vector field on the unit sphere $\mathbb{S}^2$. Then $P$ given by
$$ P(\xi) = \frac{1}{4\pi} \int_{\mathbb{S}^2} \frac{1}{1 - \xi \cdot \eta}\, \big( \xi - (\xi \cdot \eta)\, \eta \big) \cdot p(\eta)\, dS(\eta), \quad \xi \in \mathbb{S}^2, \tag{35} $$
is the uniquely determined solution of the surface gradient equation $\nabla^* P = p$ with
$$ \int_{\mathbb{S}^2} P(\xi)\, dS(\xi) = 0. \tag{36} $$
Consequently, the existence and uniqueness of the solution of the equation is assured. Even more, the solution admits a representation in the form of the singular integral (35).
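The integral representation (35) can be tested numerically on a field with known potential. The sketch below assumes the test potential $P(\xi) = \xi_1 \xi_3$ (which has zero mean over the sphere), forms $p = \nabla^* P$ as the tangential projection of the Euclidean gradient, and evaluates (35) with a product quadrature rule in polar coordinates centered at $\xi$ (Gauss–Legendre nodes in $t = \xi \cdot \eta$, equidistant nodes in the azimuth). The test function, the quadrature rule, and all names are choices of this sketch and are not taken from the references above.

```python
import numpy as np

def tangent_frame(xi):
    """Two orthonormal vectors spanning the tangent plane of the unit sphere at xi."""
    helper = np.array([1.0, 0.0, 0.0]) if abs(xi[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(xi, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(xi, e1)
    return e1, e2

def reconstruct_potential(p, xi, n_t=64, n_phi=128):
    """Approximate Eq. (35) by a product quadrature rule centered at xi."""
    t_nodes, t_weights = np.polynomial.legendre.leggauss(n_t)  # nodes in (-1, 1)
    phis = 2.0 * np.pi * np.arange(n_phi) / n_phi
    e1, e2 = tangent_frame(xi)
    total = 0.0
    for t, w in zip(t_nodes, t_weights):
        s = np.sqrt(1.0 - t * t)
        for phi in phis:
            # eta on the unit sphere with xi . eta = t; surface element dS = dt dphi
            eta = t * xi + s * (np.cos(phi) * e1 + np.sin(phi) * e2)
            kernel = (xi - t * eta) / (1.0 - t)
            total += w * np.dot(kernel, p(eta))
    return total * (2.0 * np.pi / n_phi) / (4.0 * np.pi)

def P_exact(xi):
    # Assumed test potential with vanishing mean value on the sphere
    return xi[0] * xi[2]

def p_field(eta):
    # Surface gradient of P: tangential part of the Euclidean gradient of x1 * x3
    g = np.array([eta[2], 0.0, eta[0]])
    return g - np.dot(g, eta) * eta

xi = np.array([1.0, 2.0, 2.0]) / 3.0
approx = reconstruct_potential(p_field, xi)
exact = P_exact(xi)
print("reconstructed:", approx, " exact:", exact)
```

Centering the quadrature at $\xi$ keeps all nodes away from the singular point $\eta = \xi$, so no special treatment of the kernel is needed for this smooth test field.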
Development of a Mathematical Solution Method

Precise terrestrial data of the deflections of the vertical are not available all over the Earth in dense distribution. They exist, e.g., on continental areas in much larger density than on oceanic ones. In order to exhaust the existent scattered data reservoir, we are not allowed to apply a Fourier technique in terms of spherical harmonics (note that, according to Weyl’s concept, the integrability is equivalently interrelated to the equidistribution of the data points). For more details see Weyl (1916) (one-dimensional theory); Freeden et al. (1998); Freeden and Wolf (2008); Freeden and Schreiner (2009); Freeden and Gerhards (2010) (spherical theory). Instead, we have to use an appropriate zooming-in procedure which starts globally from (initial) rough data width (low scale) and proceeds to finer and finer local data width (higher scales). A simple solution for a multiscale approximation consists of an appropriate “regularization” of the singular kernel, i.e., the replacement of
$$ \Phi : t \mapsto \Phi(t) = (1 - t)^{-1}, \quad -1 \le t < 1, \tag{37} $$
in (35) by the continuous kernel $\Phi^\rho : t \mapsto \Phi^\rho(t)$, $t \in [-1, 1]$, $\rho \in (0, 2)$, given by
$$ \Phi^\rho(t) = \begin{cases} \dfrac{1}{1 - t}, & 1 - t > \rho, \\[2mm] \dfrac{1}{\rho}, & 1 - t \le \rho. \end{cases} \tag{38} $$
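In fact, it is not difficult to verify (see Freeden and Schreiner (2009) and the references therein) that
$$ P^\rho(\xi) = \frac{1}{4\pi} \int_{\mathbb{S}^2} \Phi^\rho(\xi \cdot \eta)\, \big( \xi - (\xi \cdot \eta)\, \eta \big) \cdot p(\eta)\, dS(\eta), \quad \xi \in \mathbb{S}^2, \tag{39} $$
where the surface curl-free space-regularized Green vector scaling kernel is given by
$$ (\xi, \eta) \mapsto \Phi^\rho(\xi \cdot \eta)\, \big( \xi - (\xi \cdot \eta)\, \eta \big), \quad \xi, \eta \in \mathbb{S}^2, \tag{40} $$
satisfies the following limit relation:
$$ \lim_{\substack{\rho \to 0 \\ \rho > 0}}\ \sup_{\xi \in \mathbb{S}^2} |P(\xi) - P^\rho(\xi)| = 0. \tag{41} $$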
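Furthermore, in the scale discrete formulation using a strictly monotonically decreasing sequence $(\rho_j)_{j \in \mathbb{N}_0}$ converging to zero with $\rho_j \in (0, 2]$ (for instance, $\rho_j = 2^{1-j}$ or $\rho_j = 1 - \cos(2^{-j}\pi)$, $j \in \mathbb{N}_0$), we are allowed to write the surface curl-free space-regularized Green vector scaling kernel as follows:
$$ \varphi_{\rho_j}(\xi, \eta) = \Phi^{\rho_j}(\xi \cdot \eta)\, \big( \xi - (\xi \cdot \eta)\, \eta \big), \quad \xi, \eta \in \mathbb{S}^2. \tag{42} $$
The surface curl-free space-regularized Green vector wavelet kernel (cf. Fig. 27) then reads as
$$ \psi_{\rho_j}(\xi, \eta) = \Psi^{\rho_j}(\xi \cdot \eta)\, \big( \xi - (\xi \cdot \eta)\, \eta \big), \quad \xi, \eta \in \mathbb{S}^2, \tag{43} $$
where
$$ \Psi^{\rho_j} = \Phi^{\rho_{j+1}} - \Phi^{\rho_j} \tag{44} $$
is explicitly given by
$$ \Psi^{\rho_j}(t) = \begin{cases} 0, & 1 - t > \rho_j, \\[1mm] \dfrac{1}{1 - t} - \dfrac{1}{\rho_j}, & \rho_j \ge 1 - t > \rho_{j+1}, \\[2mm] \dfrac{1}{\rho_{j+1}} - \dfrac{1}{\rho_j}, & \rho_{j+1} \ge 1 - t. \end{cases} \tag{45} $$
With the abbreviation $P_j = P^{\rho_j}$, the function $W_j$ defined by
$$ W_j(\xi) = \frac{1}{4\pi} \int_{\mathbb{S}^2} \psi_{\rho_j}(\xi, \eta) \cdot p(\eta)\, dS(\eta), \quad \xi \in \mathbb{S}^2, \tag{46} $$
leads us to the recursion
$$ P_{j+1} = P_j + W_j. \tag{47} $$
Hence, we immediately get by elementary manipulations for all $m \in \mathbb{N}$ (cf. Fig. 28)
$$ P_{j+m} = P_j + \sum_{k=0}^{m-1} W_{j+k}. \tag{48} $$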
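The scalar kernel functions behind (38), (44), (45), and (48) are simple enough to be written down directly. The following sketch assumes the dyadic choice $\rho_j = 2^{1-j}$ and checks two properties on a grid in $t$: the wavelet $\Psi^{\rho_j}$ vanishes outside the cap $1 - t \le \rho_j$, and the kernels telescope in the same way as the recursion (48). Since $P_j$ and $W_j$ depend linearly on these kernels, the telescoping carries over to them. Function names are hypothetical.

```python
import numpy as np

def rho(j):
    # Assumed dyadic scale sequence rho_j = 2^(1 - j)
    return 2.0 ** (1 - j)

def phi_reg(t, r):
    # Regularized kernel, Eq. (38): 1/(1-t) for 1-t > r, and 1/r otherwise
    one_minus_t = 1.0 - np.asarray(t, dtype=float)
    return 1.0 / np.maximum(one_minus_t, r)

def psi(t, j):
    # Wavelet kernel function, Eq. (44): difference of two consecutive scales
    return phi_reg(t, rho(j + 1)) - phi_reg(t, rho(j))

t = np.linspace(-1.0, 1.0, 2001)
j, m = 1, 5

# Local support: psi vanishes wherever 1 - t > rho_j (first case of Eq. (45))
outside_cap = (1.0 - t) > rho(j)
support_ok = bool(np.all(psi(t, j)[outside_cap] == 0.0))

# Telescoping of the kernels, mirroring Eq. (48)
telescoped = phi_reg(t, rho(j)) + sum(psi(t, j + k) for k in range(m))
telescoping_ok = bool(np.allclose(telescoped, phi_reg(t, rho(j + m))))

print("local support:", support_ok, " telescoping:", telescoping_ok)
```

Writing $\Phi^\rho(t)$ as $1/\max(1 - t, \rho)$ is merely a compact way of expressing the two cases of (38).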
The functions $P_j$ represent low-pass filters of $P$. Obviously, $P_j$ is improved by the band-pass filter $W_j$ in order to obtain $P_{j+1}$, while $P_{j+1}$ is improved by the band-pass filter $W_{j+1}$ in order to obtain $P_{j+2}$, etc. Summarizing our results, we are allowed to formulate the following conclusion: three features are incorporated in our way of thinking about multiscale approximation by the use of locally supported wavelets, namely, basis property, decorrelation, and fast computation. More concretely, our vector wavelets are “building blocks” for huge discrete data sets. By virtue of the basis property, the function $P$ can be better and better approximated from $p$ with increasing scale $j$. Our wavelets have the power to decorrelate. In other words, the representation of data in terms of wavelets is somehow “more compact” than the original representation. We search for an accurate approximation by only using a small fraction of the original information of a potential. Typically, our decorrelation is achieved by vector wavelets which have a compact support (localization in space) and which show a decay toward high frequencies (see Fig. 28). The main difficulty in multiscale approximation is how to decompose the function under consideration into wavelet coefficients and how to efficiently reconstruct the function from the coefficients. There is a “tree algorithm” (cf. also Fig. 11) that makes these steps simple and fast (see Freeden et al. 1998):
$$ \begin{array}{ccccccccc} & & W_j & & & & W_{j+1} & & \\ & & \searrow & & & & \searrow & & \\ P_j & \to & \oplus & \to & P_{j+1} & \to & \oplus & \to & P_{j+2} \ \ldots \end{array} $$
The fast decorrelation power of wavelets is the key to applications such as data compression, fast data transmission, noise cancellation, signal recovering, etc. With increasing scales $j \to \infty$, the supports become smaller and smaller. This is the reason why the calculation of the integrals has to be extended over smaller and smaller caps (“zooming in”). Of course, downsizing spherical caps and increasing data widths are in strong correlation. Thus, the variable width of the caps with increasing scale parameter $j$ enables the integration of data sets of heterogeneous data width for local areas without violating Weyl’s law of equidistribution.

Fig. 27 Absolute values and the directions of the surface curl-free space-regularized Green vector scaling kernel $\varphi_{\rho_j}(\xi, \eta)$, $\xi, \eta \in \mathbb{S}^2$, as defined by Eq. (40) with $\xi$ fixed and located at 0° N, 0° W and $\rho_j = 2^{1-j}$ for scales $j = 1, 2, 3$, respectively

Fig. 28 “Zooming-in” strategy choosing a hotspot (here, Galapagos (0° N, 91° W)) as target area. The colored areas illustrate the local support of the wavelets for increasing scale levels (from the PhD-thesis Wolf (2009), Geomathematics Group, University of Kaiserslautern)
“Back-Transfer” to Application

The multiscale techniques as presented here will be used to investigate the anomalous gravity field particularly for areas in which mantle plumes and hotspots occur. In this respect it should be noted that mantle plume is a geoscientific term which denotes an upwelling of abnormally hot rocks within the Earth’s mantle (cf. Fig. 13). Plumes (cf. Ritter and Christensen 2007) are envisioned to be vertical conduits in which the hot mantle material rises buoyantly from the lower mantle to the lithosphere at velocities as large as $1\ \mathrm{m\,yr^{-1}}$, and these quasicylindrical regions have a typical diameter of about 100–200 km. In mantle convection theory, mantle plumes are related to hotspots which describe centers of surface volcanism that are not directly caused by plate tectonic processes. A hotspot is a long-term source of volcanism which is fixed relative to the plate overriding it. A classical example is Hawaii. Due to the local nature of plumes and hotspots such as Hawaii, we have to use high-resolution gravity models. Because of the lack of terrestrial-only data, the GFZ-combined gravity model EIGEN-GL04C is used, which consists of satellite data, gravimetry, and altimetry surface data. The “zooming-in” property of our analysis is of great advantage. Especially the locally compact wavelet turns out to be an essential tool of the (vectorial) multiscale decomposition of the deflections of the vertical and correspondingly the (scalar) multiscale approximation of the disturbing potential.

In the multiscale analysis (cf. Figs. 29 and 30), several interesting observations can be detected. By comparing the different positions and different scales, we can see that the maximum of the “energy” contained in the signal of the disturbing potential, measured in the so-called wavelet variances (see Freeden and Michel 2004 for more details), starts in the northwest of the Hawaiian islands for scale $j = 2$ and travels in southeastern direction with increasing scale. It ends up, for scale $j = 12$, in a position at the geologically youngest island under which the mantle plume is assumed to exist. Moreover, in the multiscale resolution with increasing scale, more and more local details of the disturbing potential appear. In particular, the structure of the Hawaiian island chain is clearly reflected in the scale and space decomposition. Obviously, the “energy peak” observed at the youngest island of Hawaii is highly above the “energy intensity level” of the rest of the island chain. This seems to strongly corroborate the belief of a stationary mantle plume, which is located beneath the Hawaiian islands and which is responsible for the creation of the Hawaii–Emperor seamount chain, while the oceanic lithosphere is continuously passing over it. An interesting area is the southeastern part of the chain, situated on the Hawaiian swell, a 1,200 km broad anomalously shallow region of the ocean floor, extending from the island of Hawaii to the Midway atoll. Here a distinct geoidal anomaly occurs that has its maximum around the youngest island that coincides with the maximum topography, and both decrease in northwestern direction.

Fig. 29 Approximation of the vector-valued vertical deflections $\Theta$ in $[\mathrm{m\,s^{-2}}]$ of the Hawaiian region with smoothed Haar wavelets (a rough low-pass filtering at scale 6 is improved by several band-pass filters of scale $j = 6, \ldots, 11$, where the last picture shows the multiscale approximation at scale $j = 12$) (from the PhD-thesis Fehlinger (2009), Geomathematics Group, University of Kaiserslautern)
8.2 Circuit: Oceanic Circulation from Ocean Topography
Ocean flow (see chapters Time-Varying Mean Sea Level, Self-Attraction and Loading of Oceanic Masses, Unstructured Meshes in Large-Scale Ocean Modeling, and Asymptotic Models for Atmospheric Flows) has a great influence on mass transport and heat exchange. By modeling oceanic currents, we therefore gain, for instance, a better understanding of weather and climate. In what follows we devote our attention to the geostrophic oceanic circulation on bounded regions on the sphere (and in a first approximation, the oceanic surfaces under consideration may be assumed to be parts of the boundary of a spherical Earth model), i.e., to oceanic flow under the simplifying assumptions of stationarity, spherically reflected horizontal velocity, and strict neglect of inner frictions. This leads us to inner-oceanic long-scale currents, which still give meaningful results, as, for example, for the phenomenon of El Niño.
Fig. 30 Multiscale reconstruction of the disturbing potential $T$ in $[\mathrm{m^2\,s^{-2}}]$ from vertical deflections for the Hawaiian (plume) area using regularized vector Green functions (a rough low-pass filtering at scale $j = 6$ is improved by several band-pass filters of scale $j = 6, \ldots, 11$, where the last illustration shows the approximation of the disturbing potential $T$ at scale $j = 12$) (from the PhD-thesis Fehlinger (2009), Geomathematics Group, University of Kaiserslautern)
Mathematical Modeling of Ocean Flow

The numerical simulation of ocean currents is based on the Navier–Stokes equation. Its formulation (see, e.g., Ansorge and Sonar 2009) is well known: let us consider a fluid occupying an arbitrary (open and bounded) subdomain $G_0 \subset \mathbb{R}^3$ at time $t = 0$. The vector function $v : [0, t_{\mathrm{end}}] \times G_0 \to G_t \subset \mathbb{R}^3$ describes the motion of the particle positions $\xi \in G_0$ with time, so that at times $t \ge 0$ the fluid occupies the domain $G_t = \{ v(t; \xi) \mid \xi \in G_0 \}$. Hence, $G_t$ is a closed system in the sense that no fluid particle flows across its boundaries. The path of a particle $\xi \in G_0$ is given by the graph of the function $t \mapsto v(t; \xi)$, and the velocity of the fluid at a fixed location $x = v(t; \xi) \in G_t$ by the derivative $u(t; x) = \frac{\partial}{\partial t} v(t; \xi)$. The derivation of the governing equations relies on the conservation of mass and momentum. The essential tool is the transport theorem, which shows how the time derivative of an integral over a domain changing with time may be computed. The mass of a fluid occupying a domain is determined by the integral over the density $\rho$ of the fluid. Since the same amount of fluid occupying the domain at time $0$ later occupies the domain at time $t > 0$, we must have that $\int_{G_0} \rho(0; x)\, dV(x)$ coincides with $\int_{G_t} \rho(t; x)\, dV(x)$ for all $t \in (0, t_{\mathrm{end}}]$. Therefore, the derivative of mass with respect to time must vanish, upon which the transport theorem yields for all $t$ and $G_t$
$$ \int_{G_t} \left( \frac{\partial}{\partial t} \rho(t; x) + \mathrm{div}\,(\rho u)(t; x) \right) dV(x) = 0. \tag{49} $$
Since this is valid for arbitrary regions (in particular, for arbitrarily small ones), this implies that the integrand itself vanishes, which yields the continuity equation for compressible fluids
$$ \frac{\partial \rho}{\partial t} + \mathrm{div}\,(\rho u) = 0. \tag{50} $$
The momentum of a solid body is the product of its mass with its velocity; correspondingly, the momentum of the fluid occupying $G_t$ is
$$ \int_{G_t} \rho(t; x)\, u(t; x)\, dV(x). \tag{51} $$
According to Newton’s second law, the rate of change of (linear) momentum is equal to the sum of the forces acting on the fluid. We distinguish two types of forces, viz., body forces $k$ (e.g., gravity, Coriolis force), which can be expressed as $\int_{G_t} \rho(t; x)\, k(t; x)\, dV(x)$ with a given force density $k$ per unit volume, and surface forces (e.g., pressure, internal friction) representable as $\int_{\partial G_t} \sigma(t; x)\, \nu(x)\, dS(x)$, which includes the stress tensor $\sigma(t; x)$. Thus, Newton’s law reads
$$ \frac{d}{dt} \int_{G_t} \rho(t; x)\, u(t; x)\, dV(x) = \int_{G_t} \rho(t; x)\, k(t; x)\, dV(x) + \int_{\partial G_t} \sigma(t; x)\, \nu(x)\, dS(x). \tag{52} $$
If we now apply the product rule and the transport theorem componentwise to the term on the left and apply the divergence theorem to the second term on the right, we obtain the momentum equation
$$\frac{\partial}{\partial t}(\rho u) + (u \cdot \nabla)(\rho u) + (\rho u)\, \nabla \cdot u = \rho k + \nabla \cdot \sigma. \tag{53}$$
The nature of the oceanic flow equation depends heavily on the model used for the stress tensor. In the special case of incompressible fluids (here, ocean water), which are characterized by a density $\rho(t;x) = \rho_0 = \mathrm{const}$ dependent neither on space nor on time, we find $\nabla \cdot u = 0$, i.e., $u$ is divergence-free (for a discussion of (53) on the unit sphere, the reader is also referred to Fengler and Freeden (2005) and the references therein). When modeling an inviscid fluid, internal friction is neglected and the stress tensor is determined solely by the pressure, $\sigma(t;x) = -P(t;x)\, \mathbf{i}$ ($\mathbf{i}$ is the unit matrix). In the absence of inner friction (in consequence of, e.g., effects of wind and surface influences), we are able to ignore the derivative $\frac{\partial u}{\partial t}$ and, hence, the dependence on time. As relevant volume forces $k$, the gravity field $w$ and the Coriolis force $c = 2u \wedge \omega$ remain and have to be taken into account. Finally, for large-scale currents of the ocean, the nonlinear part does not play any role, i.e., the term $(u \cdot \nabla)u$ is negligible. Under all these very restrictive assumptions, the equation of motion (53) reduces to the following identity:
$$2\omega \wedge u = -\frac{\nabla P}{\rho_0} + w. \tag{54}$$
Furthermore, we suppose a velocity field of a spherical layer model. For each layer, i.e., for each sphere around the origin $0$ with radius $r\ (\leq R)$, the velocity field $u$ can be decomposed into a normal field $u_{\mathrm{nor}}$ and a tangential field $u_{\mathrm{tan}}$. The normal part is negligibly small in comparison to the tangential part (see the considerations in Pedlovsky (1979)). Therefore, with $\omega = \Omega\, \varepsilon^3$ and the expression
$$C(\xi) = 2\Omega\, (\varepsilon^3 \cdot \xi), \tag{55}$$
which is called the Coriolis parameter, we obtain the following separation of Eq. (54) (observe that $\xi \cdot u_{\mathrm{tan}}(r\xi) = 0$, $\xi \in S^2$):
$$C(\xi)\, \xi \wedge u_{\mathrm{tan}}(r\xi) = -\frac{1}{\rho_0\, r} \nabla^*_\xi P(r\xi) \tag{56}$$
and
$$\left( 2\omega \wedge u_{\mathrm{tan}}(r\xi) \right) \cdot \xi = -\frac{1}{\rho_0} \frac{\partial}{\partial r} P(r\xi) - g_r. \tag{57}$$
Equation (56) essentially tells us that the tangential surface gradient of the pressure is balanced by the Coriolis force. For simplicity, in our approach, we regard the gravity acceleration as a normal field: $w(r\xi) = -g_r\, \xi$, $\xi \in S^2$ (with $g_r$ as mean gravity intensity). Moreover, the vertical Coriolis acceleration in comparison to the tangential motion is very small, that is, we are allowed to assume that $\left( 2\omega \wedge u_{\mathrm{tan}}(R\xi) \right) \cdot \xi = 0$ with $C$ given by (55). On the surface of the Earth (here, $r = R$), we then obtain from (57) a direct relation of the product of the mean density and the mean gravity acceleration to the normal pressure gradient (hydrostatic approximation):
$$\frac{\partial}{\partial r} P(R\xi) = -\rho_0\, g_r. \tag{58}$$
This is the reason why we obtain the pressure by integration (see also chapters Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution and Toroidal-Poloidal Decompositions of Electromagnetic Green’s Functions in Geomagnetic Induction of this handbook) as follows:
$$P(R\xi) = \rho_0\, g_R\, \Xi(R\xi) + P_{\mathrm{Atm}}, \tag{59}$$
where $P_{\mathrm{Atm}}$ denotes the mean atmospheric pressure. The quantity $\Xi(R\xi)$ (cf. Fig. 31) is the difference between the heights of the ocean surface and the geoid at the point $\xi \in S^2$. The scalar function $\xi \mapsto \Xi(R\xi)$, $\xi \in S^2$, is called ocean topography. By use of altimeter satellites, we are able to measure the difference $H$ between the known satellite height $H_{\mathrm{Sat}}$ and the (unknown) height of the ocean surface $H_{\mathrm{Ocean}}$: $H = H_{\mathrm{Sat}} - H_{\mathrm{Ocean}}$. After calculation of $H_{\mathrm{Ocean}}$, we then get the ocean topography $\Xi = H_{\mathrm{Ocean}} - H_{\mathrm{Geoid}}$ from the known geoidal height $H_{\mathrm{Geoid}}$. In connection with (56), this finally leads us to the equation
$$2\Omega\, (\varepsilon^3 \cdot \xi)\, \xi \wedge u_{\mathrm{tan}}(R\xi) = -\frac{g_R}{R} \nabla^*_\xi\, \Xi(R\xi). \tag{60}$$
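Remembering the surface curl gradient $L^*_\xi = \xi \wedge \nabla^*_\xi$ and the identity $\xi \wedge (\xi \wedge u_{\mathrm{tan}}) = -u_{\mathrm{tan}}$, we are able to conclude
$$\xi \wedge \left( \omega \wedge u_{\mathrm{tan}}(R\xi) \right) = -\frac{g_R}{2R} L^*_\xi\, \Xi(R\xi), \tag{61}$$
i.e.,
$$C(\xi)\, u_{\mathrm{tan}}(R\xi) = \frac{g_R}{R} L^*_\xi\, \Xi(R\xi). \tag{62}$$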
This is the equation of the geostrophic oceanic flow.
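As an illustration of (62), the following minimal sketch evaluates the geostrophic velocity on a regular latitude-longitude grid. It is not the multiscale method developed in this section: the derivatives are replaced by plain finite differences (`numpy.gradient`), and the function name `geostrophic_flow`, the numerical constants, and the width of the masked equatorial band are assumptions made only for this example. In local east/north coordinates, $L^*_\xi \Xi = \xi \wedge \nabla^*_\xi \Xi$ has the east component $-\partial \Xi / \partial \varphi$ and the north component $(\cos \varphi)^{-1}\, \partial \Xi / \partial \lambda$, where $\varphi$ is the latitude and $\lambda$ the longitude.

```python
import numpy as np

OMEGA = 7.292115e-5  # Earth's angular velocity [rad/s] (assumed value)
G_R = 9.81           # mean gravity g_R [m/s^2] (assumed value)
R_EARTH = 6.371e6    # mean Earth radius R [m] (assumed value)


def geostrophic_flow(topo, lat_deg, lon_deg, min_abs_lat_deg=5.0):
    """East/north components of u_tan from Eq. (62) on a regular lat-lon grid.

    topo has shape (len(lat_deg), len(lon_deg)) and is given in metres.
    """
    topo = np.asarray(topo, dtype=float)
    lat = np.deg2rad(np.asarray(lat_deg, dtype=float))
    lon = np.deg2rad(np.asarray(lon_deg, dtype=float))
    # Finite-difference derivatives with respect to latitude and longitude.
    dtopo_dlat, dtopo_dlon = np.gradient(topo, lat, lon)
    lat_col = lat[:, None]
    coriolis = 2.0 * OMEGA * np.sin(lat_col)  # C(xi) = 2 Omega (eps^3 . xi)
    with np.errstate(divide="ignore", invalid="ignore"):
        # L* Xi = xi ^ grad* Xi, written in local east/north coordinates.
        u_east = -G_R / (R_EARTH * coriolis) * dtopo_dlat
        u_north = G_R / (R_EARTH * coriolis * np.cos(lat_col)) * dtopo_dlon
    # The Coriolis parameter vanishes on the equator: mask a band around it.
    near_equator = np.abs(lat_col) < np.deg2rad(min_abs_lat_deg)
    u_east = np.where(near_equator, np.nan, u_east)
    u_north = np.where(near_equator, np.nan, u_north)
    return u_east, u_north
```

For a purely zonal topography $\Xi = a \sin \varphi$ the sketch returns a vanishing north component and an east component close to $-a\, g_R \cos \varphi / (2 \Omega R \sin \varphi)$, i.e., the flow runs along the lines of constant topography.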
Mathematical Analysis

Again, we have an equation of vectorial tangential type given on the unit sphere $S^2$. This time, however, we have to deal with an equation of the surface curl gradient, $L^*_\xi S = s$ (with $s(\xi) = \frac{2 \Omega R}{g_R} (\varepsilon^3 \cdot \xi)\, u_{\mathrm{tan}}(R\xi)$ and $S = \Xi$). The solution theory providing the surface stream function $S$ from the surface divergence-free vector field $s$ would be in accordance with our considerations above (with $\nabla^*$ replaced by $L^*$). The computation of the geostrophic oceanic flow simply is a problem of differentiation, namely, the computation of the derivative $L^*_\xi S(\xi) = \xi \wedge \nabla^*_\xi S(\xi)$, $\xi \in S^2$.

Fig. 31 Ocean topography (confer “German Priority Research Program: Mass transport and Mass distribution in the system Earth, DFG-SPP 1257”)
Development of a Mathematical Solution Method

The point of departure for our intention to determine the geostrophic oceanic flow (as derived above from the basic hydrodynamic equation) is the ocean topography, which is obtainable via satellite altimetry (see chapter Classical Physical Geodesy for observational details). As a scalar field on the spherical Earth, the ocean topography consists of two ingredients. First, on an Earth at rest, the water masses would align along the geoid related to a (standard) reference ellipsoid. Second, satellite measurements provide altimetric data of the actual ocean surface height, which is also used in relation to the (standard) reference ellipsoid. The difference between these quantities is understood to be the actual ocean topography. In other words, the ocean topography is defined as the deviation of the ocean surface from the geoidal surface, which is here assumed to be due to the geostrophic component of the ocean currents. The data used for our demonstration are extracted from the French CLS01 model (in combination with the EGM96 model).

The calculation of the derivative $L^*_\xi S$ is not realizable, at least not directly. Also in this case, we are confronted with discrete data material that, in addition, is only available for oceanic areas. In the geodetic literature a spherical harmonic approach is usually used in the form of a Fourier series. The vectorial isotropic operator $L^*$ (cf. Freeden and Schreiner 2009 for further details) is then applied to the resulting Fourier series, unfortunately at the loss of its vectorial isotropy when decomposed in terms of scalar components. The results are scalar components of the geostrophic flow. The serious difficulty with global polynomial structures such as spherical harmonics (with $\Xi = 0$ on continents!) is the occurrence of the Gibbs phenomenon close to the coast lines (see, e.g., Nerem and Koblinski 1994; Albertella et al. 2008). In this respect it should be mentioned additionally that the equation of the geostrophic oceanic flow cannot be regarded as adequate for coastal areas; hence, our modeling fails for these areas. An alternative approach avoiding numerically generated oscillations in coastal areas is the application of kernels with local support, as, e.g., smoothed Haar kernels (see Freeden et al. (1998) and the references therein):
$$\Phi^{(k)}_\tau(t) = \begin{cases} 0, & \tau < 1 - t \leq 2, \\[4pt] \dfrac{k+1}{2\pi}\, \dfrac{\left( t - (1 - \tau) \right)^k}{\tau^{k+1}}, & 0 \leq 1 - t \leq \tau. \end{cases} \tag{63}$$
For $\tau \in (0, 2]$, $k \in \mathbb{N}$, the function $\Phi^{(k)}_\tau$ as introduced by (63) is $(k-1)$ times continuously differentiable on the interval $[0, 2]$. $(\Phi^{(k)}_{\tau_j})_{j \in \mathbb{N}_0}$ is a sequence tending to the Dirac function(al), i.e., a Dirac sequence (cf. Figs. 32 and 33). For a strictly monotonically decreasing sequence $(\tau_j)_{j \in \mathbb{N}_0}$ satisfying $\tau_j \to 0$ for $j \to \infty$, we obtain for the convolution integrals (low-pass filters)
$$H^{(k)}_{\tau_j}(\xi) = \int_{S^2} \Phi^{(k)}_{\tau_j}(\xi \cdot \eta)\, H(\eta)\, dS(\eta), \quad \xi \in S^2, \tag{64}$$
the limit relation
$$\lim_{j \to \infty}\, \sup_{\xi \in S^2} \left| H(\xi) - H^{(k)}_{\tau_j}(\xi) \right| = 0. \tag{65}$$
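The low-pass filter (64) and the limit relation (65) can be tried out numerically. The sketch below is only an illustration under stated assumptions: the convolution is approximated by a midpoint rule on a latitude-longitude grid (a quadrature chosen here for simplicity, not the one used for the figures), and the names `smoothed_haar` and `low_pass` as well as the grid sizes are hypothetical. For the test signal $H(\eta) = \eta \cdot \varepsilon^3$, a spherical harmonic of degree one, the Funk–Hecke formula gives the exact filtered value $\left( 1 - \tau/(k+2) \right) H(\xi)$, which tends to $H(\xi)$ as $\tau \to 0$.

```python
import numpy as np


def smoothed_haar(t, tau, k):
    """Smoothed Haar kernel (63) evaluated at t = xi . eta in [-1, 1]."""
    t = np.asarray(t, dtype=float)
    shifted = np.clip(t - (1.0 - tau), 0.0, None)  # zero outside the cap
    inside = (1.0 - t) <= tau
    value = (k + 1) / (2.0 * np.pi) * shifted**k / tau ** (k + 1)
    return np.where(inside, value, 0.0)


def low_pass(signal, xi, tau, k, n_theta=400, n_phi=800):
    """Approximate the convolution (64) at xi by a midpoint rule (assumed quadrature)."""
    theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta   # colatitude
    phi = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi   # longitude
    th, ph = np.meshgrid(theta, phi, indexing="ij")
    eta = np.stack(
        [np.sin(th) * np.cos(ph), np.sin(th) * np.sin(ph), np.cos(th)], axis=-1
    )
    weights = np.sin(th) * (np.pi / n_theta) * (2.0 * np.pi / n_phi)  # dS(eta)
    kernel = smoothed_haar(np.clip(eta @ xi, -1.0, 1.0), tau, k)
    return float(np.sum(kernel * signal(eta) * weights))


if __name__ == "__main__":
    xi = np.array([1.0, 2.0, 2.0]) / 3.0
    for tau in (1.0, 0.5):
        print(tau, low_pass(lambda eta: eta[..., 2], xi, tau, k=3))
```

Decreasing $\tau$ moves the filtered value toward $H(\xi) = 2/3$, in line with (65).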
An easy calculation yields
$$L^*_\xi\, \Phi^{(k)}_{\tau_j}(\xi \cdot \eta) = \begin{cases} 0, & \tau_j < 1 - \xi \cdot \eta \leq 2, \\[4pt] \dfrac{k(k+1)}{2\pi}\, \dfrac{\left( (\xi \cdot \eta) - (1 - \tau_j) \right)^{k-1}}{\tau_j^{k+1}}\; \xi \wedge \eta, & 0 \leq 1 - \xi \cdot \eta \leq \tau_j. \end{cases} \tag{66}$$
It is not hard to show (cf. Freeden and Schreiner 2009) that
$$\lim_{j \to \infty}\, \sup_{\xi \in S^2} \left| L^*_\xi H(\xi) - L^*_\xi H^{(k)}_{\tau_j}(\xi) \right| = 0. \tag{67}$$

Fig. 32 Sectional illustration of the smoothed Haar wavelets $\Phi^{(k)}_{\tau_j}(\cos(\theta))$ with $\theta \in [-\pi, \pi]$, $\tau_j = 2^{-j}$ (left panel: scale $j = 3$, $k = 0, 2, 3, 5$; right panel: scales $j = 0, 1, 2, 3$, $k = 5$)

Fig. 33 Illustration of the first members of the wavelet sequence for the smoothed Haar scaling function on the sphere ($\tau_j = 2^{-j}$, $j = 2, 3, 4$, $k = 5$)
The multiscale approach by smoothed Haar wavelets can be formulated in a standard way. For example, the Haar wavelets can be understood as differences of two successive scaling functions. In doing so, an economical and efficient algorithm in a tree structure (Fast Wavelet Transform (FWT)) can be implemented appropriately.
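The statement that wavelets are differences of two successive scaling functions can be made concrete with a few lines of code. The following sketch uses the dyadic choice $\tau_j = 2^{-j}$ from Figs. 32 and 33; the function names are hypothetical, and the integral over the sphere of a zonal function is approximated by a simple trapezoidal rule in $t = \xi \cdot \eta$. It does not implement the tree algorithm (FWT) itself, only the kernel-level relations on which such an algorithm rests.

```python
import numpy as np


def smoothed_haar(t, tau, k):
    """Smoothed Haar scaling function (63), t = xi . eta in [-1, 1]."""
    t = np.asarray(t, dtype=float)
    shifted = np.clip(t - (1.0 - tau), 0.0, None)
    inside = (1.0 - t) <= tau
    return np.where(
        inside, (k + 1) / (2.0 * np.pi) * shifted**k / tau ** (k + 1), 0.0
    )


def haar_wavelet(t, j, k):
    """Difference of two successive scaling functions with tau_j = 2**(-j)."""
    return smoothed_haar(t, 2.0 ** -(j + 1), k) - smoothed_haar(t, 2.0 ** -j, k)


def zonal_integral(values, t):
    """Integral over the sphere of a zonal function: 2*pi times a trapezoidal sum."""
    return 2.0 * np.pi * float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(t)))


if __name__ == "__main__":
    t = np.linspace(-1.0, 1.0, 400001)
    print(zonal_integral(smoothed_haar(t, 0.5, 5), t))  # close to 1 (Dirac sequence)
    print(zonal_integral(haar_wavelet(t, 1, 5), t))     # close to 0 (band-pass)
```

Summing the wavelets of scales $j = 0, \ldots, J-1$ and adding the scaling function of scale $0$ reproduces the scaling function of scale $J$, which is the telescoping structure exploited by the multiscale reconstruction.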
“Back-Transfer” to Application

Ocean currents are subject to different influence factors, such as wind field, warming of the atmosphere, salinity of the water, etc., which are not accounted for in our modeling. Our approximation (see Figs. 34 and 35) must be understood in the sense of a geostrophic balance. An analysis shows that it may be considered valid on spatial scales of a little more than about 30 km and on time scales longer than approximately 1 week. Indeed, the geostrophic velocity field is perpendicular to the tangential gradient of the ocean topography (i.e., perpendicular to the tangential pressure gradient). This is a remarkable property. The water flows along the curves of constant ocean topography, i.e., along isobars (see Fig. 34 for a multiscale modeling). Despite the essentially restricting assumptions necessary for the modeling, we obtain instructive circulation models for the internal ocean surface current for the northern or southern hemisphere, respectively (however, difficulties for the computation of the flow arise from the fact that the Coriolis parameter vanishes on the equator). An especially positive result is that the modeling of the ocean topography has made an essential contribution to the research in exceptional phenomena of internal ocean currents, such as El Niño. El Niño is an anomaly of the ocean–atmosphere system. It causes the occurrence of modified currents in the equatorial Pacific, i.e., the surface water usually flowing in western direction suddenly flows to the east. Geographically speaking, the cold Humboldt Current is weakened and finally ceases. Within only a few months, the water layer moves from Southeast Asia to South America. Water circulation has reversed. As a consequence, the Eastern Pacific is warmed up, whereas the water temperature decreases off the shores of Australia and Indonesia. This phenomenon has worldwide consequences on the weather, in the form of extreme droughts and thunderstorms. Our computations not only help to visualize these modifications graphically, but they also offer the basis for future predictions of El Niño characteristics and effects (Fig. 36).

Fig. 34 Multiscale approximation of the ocean topography [cm] with smoothed Haar kernels ($k = 3$): a rough low-pass filtering at scale $j = 1$ is improved with several band-pass filters of scale $j = 1, \ldots, 6$, where the last picture shows the multiscale approximation (low-pass filtering) at scale $j = 7$ (numerical and graphical illustration in cooperation with D. Michel and V. Michel 2006)

Fig. 35 Ocean topography [cm] (top) and geostrophic oceanic flow [cm/s] (bottom) of the gulf stream computed by the use of smoothed Haar wavelets ($j = 8$, $k = 5$) (numerical and graphical illustration in cooperation with D. Michel and V. Michel 2006)

Fig. 36 Ocean topography during the El Niño period (monthly panels, May 1997–April 1998) from data of the satellite CLS01 model (numerical realization and graphical illustration in cooperation with V. Michel and S. Maßmann 2006)
8.3 Circuit: Seismic Processing from Acoustic Wave Tomography
The essential goal of seismic processing (see chapters Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives, Transmission Tomography in Seismology, Numerical Algorithms for Non-smooth Optimization Applicable to Seismic Recovery, Strategies in Adjoint Tomography, Multidimensional Seismic Compression by Hybrid Transform with Multiscale-Based Coding, and Tomography: Problems and Multiscale Solutions) is to transfer the existing signal that resulted from the integration of the wave equation under the expectation that designated properties of the target bedrock like the velocity field can be interpreted from the transformed signal. In this work, based on the regularization of Green’s functions (fundamental solutions), new wavelet techniques for a detailed bandpass filtering of acoustic seismic phenomena are formulated in order to get a local understanding and interpretability of scattered wave field potentials in deep geothermal research. More material can be found in Augustin (2014), Augustin et al. (2012), Bauer et al. (2014), Freeden and Nutz (2014), and Ostermann (2011).
Mathematical Modeling in Reservoir Detection

In order to determine the structure, depth, and thickness of the target reservoir (see, e.g., Dahlen and Tromp 1998; Nolet 2008; Tarantola 1984; Yilmaz 1987), the standard seismic methods are applied to two- and/or three-dimensional seismic sections. The methods in use can be divided into time- and depth-migration strategies and into applications to post- and pre-stack data sets (see chapter Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives). The time-migration strategy is used to resolve conflicts in dipping events with different velocities. The depth-migration strategy handles strong lateral velocity variations associated with complex overburden structures. The numerical techniques used to solve the migration problem can generally be separated into three broad categories: (i) integral discretization methods such as Kirchhoff migration based on the solution of the eikonal equation; (ii) methods based on finite-difference schemes, e.g., depth continuation methods and reverse-time migration; and (iii) transform methods based on frequency–wave number implementations, e.g., frequency–space and frequency–wave number migration. All these migration methods usually rely on a certain approximation of the scalar acoustic or vectorial elastic wave equation (for more details, the reader is referred, e.g., to Nolet 2008; Yilmaz 1987 and the references therein). According to the geophysical requirements, highly accurate approximations and efficient numerical techniques must be realized in order to handle steep dipping events and complex velocity models with strong lateral and vertical variations, as well as to construct the subsurface image in a locally defined region with high resolution on available computational resources. In consequence, migration algorithms require an accurate velocity model.
The adaptation of the interval velocity by an inversion that compares the measured travel times with simulated travel times is called reflection tomography. There are many versions of reflection tomography, but they all use ray-tracing techniques and are usually formulated as a mathematical optimization problem. The most popular and efficient methods are ray-based travel time tomography, waveform and full-wave inversion tomography (FWI), and Gaussian beam tomography (GBT). The “true” velocity estimation is often obtained by an iterative process called migration velocity analysis (MVA), which uses the kinematic information gained by the migration and consists of the following steps: (initial step) perform a reflection tomography of the coarse velocity structure using a priori knowledge about the subsurface; (iterative step) migrate the seismic data sets and apply the imaging condition; and update the velocity function by tomography inversion. The results of these approaches are often of rather poor quality as far as the interpretability of the migration result is concerned. Since the interest of, e.g., geothermal projects is not only focused on structure heights and traps as in oil field practice but also on fault zones and karst structures under recent stress conditions, the interpretation for geothermal needs is significantly complicated (Fig. 37).

Fig. 37 Principle of seismic reflection, showing a vibrator truck, a recording truck, and P-waves reflected at sandstone and limestone layers (from the PhD-thesis Ilyasov (2011), Geomathematics Group, University of Kaiserslautern)
Mathematical Analysis of Acoustic Wave Propagation

In the context of seismic processing and imaging (see also chapters Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives, Identification of Current Sources in 3D Electrostatics, Transmission Tomography in Seismology, Numerical Algorithms for Non-smooth Optimization Applicable to Seismic Recovery, Strategies in Adjoint Tomography, and Multidimensional Seismic Compression by Hybrid Transform with Multiscale-Based Coding), it is usually assumed that shear stresses generated by the wave impulse and other kinds of damping can be neglected. As a consequence, wave propagation is treated as an acoustic phenomenon: pressure changes $P(x,t)$ imply volume changes $dV$ that generate a displacement $u(x,t)$, which yields further pressure changes in the neighborhood of the volume. The relation between pressure changes and volume changes is assumed to be governed by Hooke’s law (see, e.g., Skudrzyk 1972; Achenbach 1973)
$$P = -K\, \frac{dV}{V} \tag{68}$$
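as the basis of a linear elastic relation. Here, $K$ is the bulk modulus of the material. In order to connect pressure changes to the displacement (see also Freeden and Nutz 2014), we observe that a small volume $V$ may be written as $V = dx_1\, dx_2\, dx_3$. It is transformed to $V' = dx_1'\, dx_2'\, dx_3'$. As the displacement $\delta u$ is defined via
$$dx_i' = dx_i + \delta u_i, \quad i \in \{1, 2, 3\}, \tag{69}$$
we formally get
$$\begin{aligned} -\frac{dV}{V} = \frac{V - V'}{V} &= \frac{dx_1\, dx_2\, dx_3 - (dx_1 + \delta u_1)(dx_2 + \delta u_2)(dx_3 + \delta u_3)}{dx_1\, dx_2\, dx_3} \\ &= -\frac{dx_1\, dx_3\, \delta u_2 + dx_2\, dx_3\, \delta u_1 + dx_1\, dx_2\, \delta u_3}{dx_1\, dx_2\, dx_3} \\ &\quad - \frac{dx_1\, \delta u_2\, \delta u_3 + dx_2\, \delta u_1\, \delta u_3 + dx_3\, \delta u_1\, \delta u_2}{dx_1\, dx_2\, dx_3} - \frac{\delta u_1\, \delta u_2\, \delta u_3}{dx_1\, dx_2\, dx_3} \\ &= -\left( \frac{\delta u_1}{dx_1} + \frac{\delta u_2}{dx_2} + \frac{\delta u_3}{dx_3} \right) + R \\ &= -\nabla \cdot u + R, \end{aligned} \tag{70}$$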
with $R$ summarizing all terms of higher order in $\delta u_i$, which are negligible as $\delta u_i$ is assumed to be small. We obtain
$$P = -K\, \nabla \cdot u + S \tag{71}$$
with the source term $S$ as a constitutive equation. Moreover, we assume the balance of linear momentum, or equivalently Newton’s second law, which in our case reads
$$P(x + \delta x_i\, \varepsilon^i, t) - P(x, t) = -\rho(x)\, \delta x_i\, a_i, \quad i \in \{1, 2, 3\}, \tag{72}$$
with the acceleration $a$ and the canonical orthonormal basis $\varepsilon^1, \varepsilon^2, \varepsilon^3$ of Euclidean space $\mathbb{R}^3$. Under the further assumption that the acceleration is given by the second-order time derivative of the displacement $u$, we get
$$\frac{P(x + \delta x_i\, \varepsilon^i, t) - P(x, t)}{\delta x_i} = -\rho(x)\, \frac{\partial^2}{\partial t^2} u_i(x, t) \tag{73}$$
and finally, by considering the limit $\delta x_i \to 0$,
$$\frac{\partial}{\partial x_i} P(x, t) = -\rho(x)\, \frac{\partial^2}{\partial t^2} u_i(x, t). \tag{74}$$
These component equations can be summarized in vectorial form as
$$\nabla_x P(x, t) = -\rho(x)\, \frac{\partial^2}{\partial t^2} u(x, t). \tag{75}$$
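Returning briefly to the linearization (70): the claim that the remainder $R$ is of higher order can be checked with a small numerical example. The helper names and the sample values below are assumptions chosen only for illustration; the code compares the exact relative volume change $(V' - V)/V$ with the sum of the strains $\delta u_i / dx_i$, which is the discrete counterpart of $\nabla \cdot u$.

```python
def relative_volume_change(dx, du):
    """Exact relative change (V' - V) / V of a box with edges dx displaced by du."""
    v_old = dx[0] * dx[1] * dx[2]
    v_new = (dx[0] + du[0]) * (dx[1] + du[1]) * (dx[2] + du[2])
    return (v_new - v_old) / v_old


def linearized_change(dx, du):
    """First-order part: sum of the strains du_i / dx_i (discrete divergence)."""
    return sum(d / h for d, h in zip(du, dx))


if __name__ == "__main__":
    dx = (1.0, 2.0, 0.5)           # edge lengths (assumed sample values)
    du = (1.0e-4, -2.0e-4, 5.0e-5)  # small displacements (assumed sample values)
    exact = relative_volume_change(dx, du)
    linear = linearized_change(dx, du)
    print(exact, linear, exact - linear)
```

With strains of size $10^{-4}$ the two values differ only at the level of $10^{-8}$, i.e., by a term quadratic in the strains, as expected from the remainder $R$.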
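Assuming that $P$ and $u$ are sufficiently often differentiable, (71) and (75) can be combined by applying the second-order time derivative to (71) to gain
$$\frac{\partial^2}{\partial t^2} P(x, t) = -K(x)\, \nabla_x \cdot \frac{\partial^2}{\partial t^2} u(x, t) + \frac{\partial^2}{\partial t^2} S(x, t). \tag{76}$$
Using (75) in (76), we obtain
$$\frac{\partial^2}{\partial t^2} P(x, t) = K(x)\, \nabla_x \cdot \left( \frac{1}{\rho(x)} \nabla_x P(x, t) \right) + \frac{\partial^2}{\partial t^2} S(x, t). \tag{77}$$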
Applying the product rule yields the identity
$$\nabla_x \cdot \left( \frac{1}{\rho(x)} \nabla_x P(x, t) \right) = \frac{1}{\rho(x)} \underbrace{\nabla_x \cdot \nabla_x}_{= \Delta_x} P(x, t) - \frac{1}{\rho^2(x)} \left( \nabla_x \rho(x) \right) \cdot \left( \nabla_x P(x, t) \right), \tag{78}$$
provided that $\rho$ is smooth enough. If the gradient of $\rho$ is negligibly small, we arrive at the non-divergence form of the acoustic wave equation
$$\frac{\partial^2}{\partial t^2} P(x, t) = c^2(x)\, \Delta_x P(x, t) + \frac{\partial^2}{\partial t^2} S(x, t), \tag{79}$$
where the quantity
$$c(x) = \sqrt{\frac{K(x)}{\rho(x)}} \tag{80}$$
is called the propagation speed of a wave, which results in the purely divergence form of the acoustic wave equation
$$\left( \Delta_x - \frac{1}{c^2(x)} \frac{\partial^2}{\partial t^2} \right) P(x, t) = -\frac{1}{c^2(x)} \frac{\partial^2}{\partial t^2} S(x, t). \tag{81}$$
The identity (81) ends the standard approach to the acoustic wave equation, thereby assuming a compressible, non-viscous (i.e., no attenuation) medium with no shear strength and no internal forces (i.e., in equilibrium).

Remark 4. There is a large literature dealing with existence and uniqueness of the forward formulation of the acoustic wave equation (see, e.g., Evans (2002) and the references therein).
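To make the forward problem behind (79) tangible, the following minimal sketch integrates the source-free wave equation in one space dimension with a standard explicit second-order finite-difference (leapfrog) scheme. Everything specific here is an assumption for illustration: constant propagation speed $c$, vanishing source term $S$, homogeneous Dirichlet boundary values, a single sine mode as initial pressure with zero initial velocity, and the function name `simulate_pressure`. It is not one of the migration schemes cited in the text.

```python
import numpy as np


def simulate_pressure(c, length, t_end, nx=201, cfl=0.5):
    """Leapfrog scheme for P_tt = c^2 P_xx on [0, length] with P = 0 at both ends."""
    x = np.linspace(0.0, length, nx)
    dx = x[1] - x[0]
    nt = int(np.ceil(t_end / (cfl * dx / c)))  # number of time steps
    dt = t_end / nt                            # keeps c * dt / dx <= cfl
    r2 = (c * dt / dx) ** 2
    p_prev = np.sin(np.pi * x / length)        # initial pressure, zero initial velocity
    p = p_prev.copy()
    # Second-order accurate first step for zero initial velocity.
    p[1:-1] = p_prev[1:-1] + 0.5 * r2 * (p_prev[2:] - 2.0 * p_prev[1:-1] + p_prev[:-2])
    for _ in range(nt - 1):
        p_next = np.zeros_like(p)
        p_next[1:-1] = (
            2.0 * p[1:-1] - p_prev[1:-1] + r2 * (p[2:] - 2.0 * p[1:-1] + p[:-2])
        )
        p_prev, p = p, p_next
    return x, p


if __name__ == "__main__":
    x, p = simulate_pressure(c=1500.0, length=1000.0, t_end=0.5)
    exact = np.sin(np.pi * x / 1000.0) * np.cos(np.pi * 1500.0 * 0.5 / 1000.0)
    print(np.max(np.abs(p - exact)))
```

For this initial state the exact solution is the standing wave $\sin(\pi x / \ell) \cos(\pi c t / \ell)$ with $\ell$ the interval length, so the printed maximum deviation measures the discretization error of the sketch.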
Development of a Mathematical Solution Method

As is well known, a standard approach to the acoustic wave equation (81) is a Fourier transform with respect to time,
$$U(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} P(x, t)\, \exp(-i \omega t)\, dt, \tag{82}$$
leading to the reduced wave equation
$$\left( \Delta_x + \frac{\omega^2}{c^2(x)} \right) U(x) = W(x), \tag{83}$$
usually called the Helmholtz equation. Obviously, $W$ is given by
$$W(x) = -\frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \frac{1}{c^2(x)}\, \frac{\partial^2}{\partial t^2} S(x, t)\, \exp(-i \omega t)\, dt \tag{84}$$
for all $x$. We are interested in two solution procedures of the Helmholtz equation (83), namely:

(1) Postprocessing: the decorrelation of $U$ and $c$ from a preprocessed (sufficiently suitable) solution $U$,
(2) Inverse modeling: the determination of $c$ under the a priori knowledge of a low-pass filtered “trend solution.”

More precisely, we do not make the attempt here to solve the inverse problem of determining $c$ as a whole. Instead, we base our investigations on already successful prework, for example, (i) integral discretization methods such as Kirchhoff migration based on the solution of the eikonal equation (see, e.g., Yilmaz 1987 and the references therein) and Gaussian beam procedures (see, e.g., Popov et al. 2006, 2008); (ii) methods based on finite-difference schemes, e.g., depth continuation methods (cf. Claerbout 2009) and reverse-time migration (see, e.g., Baysal et al. 1984; Yilmaz 1987; Popov et al. 2008; Nolet 2008); and (iii) transform methods based on frequency–wave number implementations, e.g., frequency–space and frequency–wave number migration (see, e.g., Yilmaz 1987; Claerbout 2009). In other words, in order to get information about the structure, depth, and thickness of a target reservoir, we start from today’s realistic assumption that standard seismic tomography results are available and meaningful, at least to some extent. We focus our attention on the interpretability of a “true” migration result obtained from elsewhere and/or the (local) improvement of a trend result (see Figs. 38 and 39). An essential tool is a new class of locally supported wavelets (see Freeden and Blick 2013; Freeden and Nutz 2014) derived from regularizations of Green functions for the Helmholtz operator:

Fig. 38 Scheme of seismic processing

Fig. 39 Results of seismic processing
(1) Postprocessing: The Helmholtz equation (83) leads to the definition of the wave number $k(x)$ and the refraction index $N(x)$ as
$$N(x) = \frac{c_0}{c(x)}, \qquad k(x) = \frac{\omega}{c(x)} = \frac{\omega}{c_0}\, \frac{c_0}{c(x)} = k_0\, N(x), \tag{85}$$
with $c_0$ being a suitable constant reference velocity (see, e.g., Engl et al. 1996; Snieder 2002; Biondi 2006, and the references therein). Accordingly, the Helmholtz equation (83) can be rewritten as
$$\left( \Delta_x + k_0^2\, N^2(x) \right) U(x) = 0. \tag{86}$$
The region where $N(x) \neq 1$ represents the scattering object such that $N(x) - 1$ may be supposed to have compact support. Another standard assumption is that the difference between $c(x)$ and $c_0$ should be sufficiently small. As a consequence, $N^2(x)$ may be developed into a Taylor series up to order one with a center $x_0$ such that $c(x_0) = c_0$. This yields
$$N^2(x) \simeq 1 + \varepsilon(x) \tag{87}$$
with a small perturbation parameter $\varepsilon$. Consequently, we have
$$k^2(x) = k_0^2\, N^2(x) = k_0^2 \left( 1 + \varepsilon(x) \right). \tag{88}$$
With the same argument as explained before, the unknown function $\varepsilon(x)$ may be supposed to have compact support. The wave operator $\Delta_x + \frac{\omega^2}{c^2(x)}$ may be separated in the following way:
$$A_x = \Delta_x + \frac{\omega^2}{c^2(x)} = \Delta_x + k_0^2\, N^2(x) = \Delta_x + k_0^2 \left( 1 + \varepsilon(x) \right) = \Delta_x + k_0^2 + k_0^2\, \varepsilon(x) = A^{(0)} + A^{(1)}, \tag{89}$$
where we have used the abbreviations
$$A^{(0)} = \Delta_x + k_0^2 \tag{90}$$
and
$$A^{(1)} = k_0^2\, \varepsilon(x). \tag{91}$$
Hence, the wave field $U(x)$ may be split into an incident wave field $U_I$, corresponding to the wave propagating in the absence of the scatterer, and the scattered wave field $U_S$ such that
$$U = U_I + U_S. \tag{92}$$
This splitting leads us to
$$A^{(0)} U_I = \left( \Delta_x + k_0^2 \right) U_I = 0, \tag{93}$$
$$A^{(0)} U_S = \left( \Delta_x + k_0^2 \right) U_S = -k_0^2\, \varepsilon\, (U_I + U_S) = -k_0^2\, \varepsilon\, U = -A^{(1)} U. \tag{94}$$
It should be mentioned that Eq. (93) formalizes that $U_I$ corresponds to the wave propagating in the absence of the scatterer. As the fundamental solution to the Helmholtz operator $\Delta + k_0^2$ is known to be
$$G(\Delta + k_0^2; |x - y|) = -\frac{1}{4\pi}\, \frac{e^{i k_0 |x - y|}}{|x - y|}, \quad x \neq y, \tag{95}$$
the functions $U_I$ and $U_S$, respectively, can be represented as volume potentials
$$U_I(x) = \int_{\mathbb{R}^3} G(\Delta + k_0^2; |x - y|)\, W(y)\, dV(y), \tag{96}$$
$$U_S(x) = \int_{B} G(\Delta + k_0^2; |x - y|)\, \underbrace{\left( -k_0^2\, \varepsilon(y)\, U(y) \right)}_{= F(y)}\, dV(y) \tag{97}$$
with the volume element $dV$ and $B = \operatorname{supp}(\varepsilon)$ being the support of $\varepsilon$, where $W$ is given by (84).

Remark 5. In seismic reflection modeling, point sources with certain spectra are usually chosen as unperturbed wave (for more details, see, e.g., Nolet 2008). Only if $U_I$ and $U_S$ are available in the compact support $B = \operatorname{supp}(\varepsilon)$ is a direct computation of $\varepsilon$ by applying the Helmholtz operator $\Delta + k_0^2$ to (97) possible. However, it should be mentioned that $U_S$ can usually be measured only away from the support $B$ of $\varepsilon$. Nevertheless, in exceptional cases, local information is available inside boreholes, which is of particular interest.

In the sense of the perturbation theory and with the conventional setting $U^{(0)} = U_I$, $U$ can be formally written as a series
$$U = \sum_{k=0}^{\infty} \varepsilon^k\, U^{(k)}, \tag{98}$$
which yields
$$\sum_{k=0}^{\infty} \varepsilon^k \left( A^{(0)} + A^{(1)} \right) U^{(k)} = W. \tag{99}$$
By collecting terms which are of the same order in $\varepsilon$, we therefore get
$$A^{(0)} U^{(0)} = W, \tag{100}$$
$$A^{(0)} U^{(k)} = -A^{(1)} U^{(k-1)}, \quad k \in \mathbb{N}. \tag{101}$$
The scattered wave field is then given by
$$U_S = U - U_I = \sum_{k=1}^{\infty} \varepsilon^k\, U^{(k)}. \tag{102}$$
This procedure is known as Born approximation (see, e.g., Snieder (2002) for more details). Considering only the first-order approximation, we obtain
$$A^{(0)} U_I = W, \tag{103}$$
$$A^{(0)} U_S = -A^{(1)} U_I. \tag{104}$$
The difference between (94) and (104) is crucial. On the right-hand side of (97), we find the sum of $U_I$ and $U_S$, which makes this a nonlinear equation. On the right-hand side of (104), only $U_I$ appears, which is determined by (103), making the relation between scattered wave and perturbation of the medium linear.

Remark 6. $U_I$ and the first-order, i.e., Born approximation $U_S^I$ of $U_S$ can be represented by the potentials
$$U_I(x) = \int_{\mathbb{R}^3} G(\Delta + k_0^2; |x - y|)\, W(y)\, dV(y), \tag{105}$$
$$U_S^I(x) = \int_{B} G(\Delta + k_0^2; |x - y|)\, \underbrace{\left( -k_0^2\, \varepsilon(y)\, U_I(y) \right)}_{= F_I(y)}\, dV(y). \tag{106}$$
The basic properties of volume integrals of type (97) can be summarized as follows (see, e.g., Müller 1969): the volume potential $U_S$ is a metaharmonic function in $\mathbb{R}^3 \setminus B$ under the assumption of boundedness of $F$ (i.e., $(\Delta + k_0^2) U_S = 0$ in $\mathbb{R}^3 \setminus B$). For a continuous $F$ in $B$, the potential (97) is of class $C^{(1)}(\mathbb{R}^3)$, and under the assumption of Hölder continuity of $F$, we have
$$\left( \Delta + k_0^2 \right) U_S^I(x) = F_I(x) \tag{107}$$
for all $x \in B$. This equation indicates the direct relation between $U_S^I$ and $F_I$ in $B$. It is actually the key point of Born modeling as discussed in the literature (see, e.g., Marks 2006).
Our postprocessing procedure starts from the nonlinear integral relation (97). The essential idea is to use a sequence of regularizations $\{ G_j(\Delta + k_0^2; \cdot) \}$, $j \in \mathbb{N}$, for the kernels (95) given by
$$G_j(\Delta + k_0^2; |x - y|) = \begin{cases} -\dfrac{e^{i k_0 |x - y|}}{4\pi |x - y|}, & |x - y| > \tau_j, \\[8pt] -\dfrac{e^{i k_0 |x - y|}}{8\pi \tau_j} \left( 3 - \dfrac{|x - y|^2}{\tau_j^2} \right), & |x - y| \leq \tau_j, \end{cases} \tag{108}$$
where $\{\tau_j\}_{j \in \mathbb{N}}$ denotes a positive monotonically decreasing sequence converging to $0$ (e.g., a dyadic sequence given by $\tau_j = 2^{-j}$, $j \in \mathbb{N}$). The regularization (108) is constructed in such a way that each kernel $G_j(\Delta + k_0^2; \cdot)$ is continuously differentiable and only dependent on the distance between two points $x$ and $y$. Furthermore, under the assumption that $F$ is bounded in $B$, we obtain for the “regularized version of the potential” (97)
$$(U_S)_j(x) = \int_{B} G_j(\Delta + k_0^2; |x - y|)\, F(y)\, dV(y) \tag{109}$$
the limit relation $U_S(x) = (U_S)_j(x) + O(\tau_j^2)$, $\tau_j \to 0$, for all $x \in \mathbb{R}^3$. It should be noted that the aforementioned approach of replacing the fundamental solution $G(\Delta + k_0^2; |x - y|)$ by its regularized versions (108) initiates a multiscale method in a canonical way (cf. Freeden and Blick 2013) based on
$$\lim_{j \to \infty}\, \sup_{x \in B} \left| U_S(x) - \int_{B} G_j(\Delta + k_0^2; |x - y|)\, F(y)\, dV(y) \right| = 0. \tag{110}$$
Each scaling function $G_j(\Delta + k_0^2; \cdot)$ provides low-pass filtering of the signature $U_S$. In order to obtain multiscale components, we simply calculate the difference of two consecutive scaling functions and get the wavelet functions with respect to the scale parameter $j$. Of course, other types of regularizations can be chosen. In our approach, however, we restrict ourselves to Haar-related kernel functions (see Haar 1910 for one-dimensional theory). As a matter of fact, by using the regularizations of the fundamental solution of the Helmholtz operator $\Delta + k_0^2$, we are immediately led to locally supported wavelets via the scale discrete scaling equation
$$\Psi_j(\Delta + k_0^2; |x - y|) = G_{j+1}(\Delta + k_0^2; |x - y|) - G_j(\Delta + k_0^2; |x - y|), \quad j \in \mathbb{N}_0. \tag{111}$$
Fig. 40 Wavelet function $\Psi_j$ in sectional illustration for $k_0 = 5$ and $\tau_j = 4 \cdot 2^{-j}$, $j = 0, \ldots, 3$

Explicitly, we have (see Fig. 40 for graphical illustration)
$$\Psi_j(\Delta + k_0^2; |x - y|) = \begin{cases} -\dfrac{e^{i k_0 |x - y|}}{8\pi \tau_{j+1}} \left( 3 - \dfrac{|x - y|^2}{\tau_{j+1}^2} \right) + \dfrac{e^{i k_0 |x - y|}}{8\pi \tau_j} \left( 3 - \dfrac{|x - y|^2}{\tau_j^2} \right), & |x - y| \leq \tau_{j+1}, \\[10pt] -\dfrac{e^{i k_0 |x - y|}}{4\pi |x - y|} + \dfrac{e^{i k_0 |x - y|}}{8\pi \tau_j} \left( 3 - \dfrac{|x - y|^2}{\tau_j^2} \right), & \tau_{j+1} < |x - y| \leq \tau_j, \\[10pt] 0, & \tau_j < |x - y|. \end{cases} \tag{112}$$
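The kernels (95), (108) and the wavelets (111) are simple enough to be evaluated directly. The sketch below does so as functions of the distance $r = |x - y|$; the function names, the sign convention for the fundamental solution (chosen here such that $(\Delta + k_0^2) G$ is the Dirac functional), and the dyadic scale sequence $\tau_j = \tau_0\, 2^{-j}$ with $\tau_0 = 4$ (as in Fig. 40) are assumptions of this illustration.

```python
import numpy as np


def green(r, k0):
    """Fundamental solution (95) as a function of r = |x - y| > 0 (assumed sign)."""
    r = np.asarray(r, dtype=float)
    return -np.exp(1j * k0 * r) / (4.0 * np.pi * r)


def green_reg(r, k0, tau):
    """Regularized kernel (108): unchanged for r > tau, polynomial factor for r <= tau."""
    r = np.asarray(r, dtype=float)
    safe_r = np.where(r > tau, r, tau)  # avoids division by zero inside the ball
    outer = -np.exp(1j * k0 * r) / (4.0 * np.pi * safe_r)
    inner = -np.exp(1j * k0 * r) / (8.0 * np.pi * tau) * (3.0 - r**2 / tau**2)
    return np.where(r > tau, outer, inner)


def wavelet(r, k0, j, tau0=4.0):
    """Wavelet (111) for the dyadic choice tau_j = tau0 * 2**(-j)."""
    tau_j = tau0 * 2.0 ** -j
    tau_j1 = tau0 * 2.0 ** -(j + 1)
    return green_reg(r, k0, tau_j1) - green_reg(r, k0, tau_j)


if __name__ == "__main__":
    r = np.linspace(0.0, 5.0, 11)
    print(np.real(wavelet(r, 5.0, 0)))  # vanishes for r > tau_0 = 4
```

The regularized kernel stays finite at $r = 0$, matches the fundamental solution at $r = \tau_j$ in value and first derivative, and the wavelet vanishes for $r > \tau_j$, which is the local support property used in the band-pass filtering.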
68
W. Freeden
The convolution integral (113) indicates the difference of two sequential lowpass filters, i.e., it represents a band-pass filtering at the position x with respect to the scale parameter j : Z Wj .x/ D
B
‰j . C k02 I jx yj/ F .y/ dV .y/:
(113)
$W_{\tau_j}$ includes all detail information contained in $(U_S)_{\tau_{j+1}}$ but not in $(U_S)_{\tau_j}$. In accordance with our construction, we therefore obtain for every $L \in \mathbb{N}$
$$(U_S)_{\tau_{j+L}} = (U_S)_{\tau_j} + \sum_{n=j}^{j+L-1} W_{\tau_n}. \tag{114}$$
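Identity (114) follows from a telescoping sum at the level of the kernels, since the integral is linear in the kernel. The sketch below checks this kernel-level identity numerically. It is an illustration only: the regularized kernel, its sign convention, and the dyadic sequence $\tau_j = 2^{-j}$ are assumptions, and all function names are hypothetical.

```python
import cmath
import math


def tau(j):
    """Dyadic scale sequence (assumption: tau_j = 2**(-j))."""
    return 2.0 ** (-j)


def g_reg(r, k0, t):
    """Regularized Helmholtz kernel (assumed form; equals the singular kernel for r > t)."""
    if r <= t:
        return -cmath.exp(1j * k0 * r) / (8.0 * math.pi * t) * (3.0 - r * r / t ** 2)
    return -cmath.exp(1j * k0 * r) / (4.0 * math.pi * r)


def psi(r, k0, n):
    """Wavelet kernel at scale n: difference of consecutive scaling kernels."""
    return g_reg(r, k0, tau(n + 1)) - g_reg(r, k0, tau(n))


def telescoping_gap(r, k0, j, L):
    """|sum_{n=j}^{j+L-1} psi_n(r) - (g_{j+L}(r) - g_j(r))|, zero up to rounding."""
    band_sum = sum(psi(r, k0, n) for n in range(j, j + L))
    lowpass_difference = g_reg(r, k0, tau(j + L)) - g_reg(r, k0, tau(j))
    return abs(band_sum - lowpass_difference)


if __name__ == "__main__":
    for r in (0.0, 0.05, 0.3, 0.9, 2.0):
        print(r, telescoping_gap(r, k0=5.0, j=0, L=6))
```

Integrating both sides of the kernel identity against $F$ over $B$ gives (114) directly.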
The formula (114) shows the progress in the “zooming-in process,” which proceeds from scale $j$ to scale $j + L$. Hence, the identity (114) describes the amount of improvement in the accuracy from level $j$ to level $j + L$. Indeed, it follows uniformly for each position $x$ and for each scale value $\tau_j$ that
$$U_S(x) = (U_S)_{\tau_j}(x) + \sum_{n=j}^{\infty} W_{\tau_n}(x), \tag{115}$$
i.e., the signal $U_S$ consists of a (coarse) low-pass filtering and an infinite number of successive band-pass convolutions. Of course, in practice, only a finite number of band-pass filters has to be calculated to satisfy a certain error tolerance (for more multiscale aspects in constructive approximation, see Freeden and Michel 2004; Freeden and Gerhards 2013). Until now, we have only reconstructed the quantity $U_S$ by virtue of a multiscale technique using building blocks. For practical purposes, the decorrelation of $F$, and hence of the refraction index, is of interest as well. An elementary calculation using (108) yields the Helmholtz derivative
$$K_{\tau_j}(\Delta + k_0^2; |x - y|) = (\Delta_x + k_0^2)\, G_{\tau_j}(\Delta + k_0^2; |x - y|).$$
$\{K_{\tau_j}\}$ is a “Haar-type sequence,” and it approximately reduces to the Haar sequence $\{H_{\tau_j}\}$,
$$H_{\tau_j}(|x - y|) = \begin{cases} \dfrac{3}{4\pi\tau_j^3}, & |x - y| \le \tau_j, \\[1.5ex] 0, & |x - y| > \tau_j, \end{cases}$$
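for sufficiently large $j$, i.e., for each $k_0$,
$$K_{\tau_j}(\Delta + k_0^2; |x - y|) = H_{\tau_j}(|x - y|) + O(\tau_j), \quad j \to \infty.$$
If $F$ is bounded, then it is clear that
$$(\Delta_x + k_0^2) \int_B G_{\tau_j}(\Delta + k_0^2; |x - y|)\, F(y)\, dV(y) = \int_B (\Delta_x + k_0^2)\, G_{\tau_j}(\Delta + k_0^2; |x - y|)\, F(y)\, dV(y), \tag{116}$$
such that
$$F(x) = \lim_{j \to \infty} \int_B K_{\tau_j}(\Delta + k_0^2; |x - y|)\, F(y)\, dV(y) = \lim_{j \to \infty} \int_B H_{\tau_j}(|x - y|)\, F(y)\, dV(y) \tag{117}$$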
for all $x \in B$. Moreover, we have under the assumption of Hölder continuity of $F$ (see, e.g., Müller 1969)
$$F(x) = (\Delta_x + k_0^2)\, U_S(x) = (\Delta_x + k_0^2) \int_B G(\Delta + k_0^2; |x - y|)\, F(y)\, dV(y) \tag{118}$$
for all $x \in B$. In other words, the “Helmholtz derivative” $\Delta + k_0^2$ of the regularization of the fundamental solution leads back to a “Haar-type” singular integral (117) for detecting
$$F(x) = \varepsilon k_0^2\, (U_I + U_S)(x) = \varepsilon k_0^2\, U(x) \tag{119}$$
for all $x \in B$. Vice versa, our approach offers the possibility to introduce alternative regularizations of the fundamental solution such that their “Helmholtz derivatives” represent singular-type kernels constituting Dirac-type sequences in $B$. All in all, the multiscale technique by regularized fundamental solutions enables us to decompose simultaneously the signal information of the wave field and the refraction index $N$ based on the interrelation (116), however, under the “postprocessing assumption” that $U_S$ is discretely known (to some extent) inside $B$. Once more, our understanding of the multiscale technique here does not primarily aim at the reconstruction of the signals, but instead at working out characteristic detail information that emerges from the difference of two consecutive scale-space representations. This practice (see also Freeden and Blick 2013) comes across as the decorrelation of signatures, such that postprocessing of density structures is the key element.
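The Dirac-type behavior of the Haar sequence in (117) can be illustrated numerically: integrating $H_{\tau}(|x - y|) F(y)$ over the ball $|x - y| \le \tau$ is a local average of $F$ that tends to $F(x)$ as $\tau \to 0$. The sketch below is a hedged illustration, not the handbook's implementation. It uses a simple midpoint rule on a cube around $x$, a hypothetical smooth test function $F(y) = |y|^2$, and the dyadic scales $\tau_j = 2^{-j}$, and it assumes that the ball lies entirely inside $B$.

```python
import math


def haar_mean(F, x, tau, n=16):
    """Midpoint-rule approximation of the integral of H_tau(|x - y|) F(y) over |x - y| <= tau."""
    h = 2.0 * tau / n                              # cell width of the n x n x n grid
    height = 3.0 / (4.0 * math.pi * tau ** 3)      # value of the Haar kernel inside the ball
    total = 0.0
    for a in range(n):
        for b in range(n):
            for c in range(n):
                d = (-tau + (a + 0.5) * h, -tau + (b + 0.5) * h, -tau + (c + 0.5) * h)
                if d[0] ** 2 + d[1] ** 2 + d[2] ** 2 <= tau ** 2:
                    y = (x[0] + d[0], x[1] + d[1], x[2] + d[2])
                    total += height * F(y) * h ** 3
    return total


def F_test(y):
    """Hypothetical smooth test function F(y) = |y|^2."""
    return y[0] ** 2 + y[1] ** 2 + y[2] ** 2


x0 = (0.3, 0.2, 0.1)
errors = [abs(haar_mean(F_test, x0, 2.0 ** (-j)) - F_test(x0)) for j in range(4)]

if __name__ == "__main__":
    for j, err in enumerate(errors):
        print(j, err)
```

For this test function the exact ball average is $|x|^2 + \tfrac{3}{5}\tau^2$, so the error is expected to shrink roughly by a factor of four per dyadic scale, up to the quadrature error of the coarse grid.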
(2) Inverse Modeling: In the following (cf. Freeden and Gerhards 2013), we attempt to apply the Haar philosophy to an approximate determination of $F$ from $U_S$ inside $B$. The regularization procedure of the volume potential as proposed for postprocessing is the essential tool. Let $\{\tau_j\}_{j \in \mathbb{N}_0}$ be a monotonically decreasing sequence of positive values $\tau_j$ such that $\lim_{j \to \infty} \tau_j = 0$ (e.g., $\tau_j = 2^{-j}$). Then, in accordance with (117), we are able to specify a sufficiently large integer $J$ such that, for all $x \in B$,
$$F(x) \simeq F_{\tau_J}(x) = \int_B K_{\tau_J}(\Delta + k_0^2; |x - y|)\, F(y)\, dV(y) \tag{120}$$
as well as
$$(U_S)(x) \simeq (U_S)_{\tau_J}(x) = \int_B G_{\tau_J}(\Delta + k_0^2; |x - y|)\, F(y)\, dV(y) \tag{121}$$
(“$\simeq$” means that the error is negligible). If $F$ is bounded, then we already know that
$$(\Delta_x + k_0^2)\, (U_S)_{\tau_J}(x) = F_{\tau_J}(x) \tag{122}$$
for all $x \in B$. In order to realize a fully discrete approximation of $F$, we have to apply approximate integration formulas over $B$ leading to
$$(U_S)(x) \simeq \sum_{i=1}^{N_J} G_{\tau_J}(\Delta + k_0^2; |x - y_i^{N_J}|)\, w_i^{N_J} F(y_i^{N_J}), \tag{123}$$
where $w_i^{N_J}$, $y_i^{N_J}$, $i = 1, \ldots, N_J$, are the known weights and knots, respectively. Using an appropriate integration formula, we are therefore led to a linear system to be solved in order to obtain approximate information about the sought quantity. As already explained, for numerical realization, we may assume that $(U_S)_{\tau_{J-1}}$ is available from elsewhere, at least discretely in points $x_k^{M_J} \in B$, $k = 1, \ldots, M_J$, which are needed in the solution process of the linear system. Obviously, we have to calculate all unknown coefficients
$$a_i^{N_J} = w_i^{N_J} F(y_i^{N_J}), \quad i = 1, \ldots, N_J, \tag{124}$$
from the discrete values $U_S(x_k^{M_J}) - (U_S)_{\tau_{J-1}}(x_k^{M_J})$, $k = 1, \ldots, M_J$. Then we have to solve a linear system
$$U_S(x_k^{M_J}) - (U_S)_{\tau_{J-1}}(x_k^{M_J}) \simeq \sum_{i=1}^{N_J} \Psi_{\tau_{J-1}}(\Delta + k_0^2; |x_k^{M_J} - y_i^{N_J}|)\, a_i^{N_J}, \quad k = 1, \ldots, M_J, \tag{125}$$
in order to determine the required coefficients $a_i^{N_J}$, $i = 1, \ldots, N_J$.
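To make the structure of the linear system (125) concrete, the following sketch assembles its matrix for a small, purely hypothetical configuration: a uniform $5 \times 5 \times 5$ grid in the unit cube serves both as knots $y_i$ and as evaluation points $x_k$, with $J = 3$ and $\tau_j = 2^{-j}$. The regularized kernel and its sign convention are assumptions, and all names are illustrative. The point of the example is the local support of $\Psi_{\tau_{J-1}}$: every entry with $|x_k - y_i| > \tau_{J-1}$ vanishes.

```python
import cmath
import math


def g_reg(r, k0, t):
    """Regularized Helmholtz kernel (assumed form; equals the singular kernel for r > t)."""
    if r <= t:
        return -cmath.exp(1j * k0 * r) / (8.0 * math.pi * t) * (3.0 - r * r / t ** 2)
    return -cmath.exp(1j * k0 * r) / (4.0 * math.pi * r)


def psi(r, k0, tau_fine, tau_coarse):
    """Wavelet kernel: fine-scale minus coarse-scale regularized kernel."""
    return g_reg(r, k0, tau_fine) - g_reg(r, k0, tau_coarse)


def assemble(points, knots, k0, tau_fine, tau_coarse):
    """Matrix with entries psi(|x_k - y_i|), rows indexed by points, columns by knots."""
    return [[psi(math.dist(x, y), k0, tau_fine, tau_coarse) for y in knots] for x in points]


# Hypothetical setup: uniform grid in the unit cube used for both knots and points.
n = 5
grid = [(a / (n - 1), b / (n - 1), c / (n - 1))
        for a in range(n) for b in range(n) for c in range(n)]
J = 3
tau_fine, tau_coarse = 2.0 ** (-J), 2.0 ** (-(J - 1))
A = assemble(grid, grid, 5.0, tau_fine, tau_coarse)

# Index pairs whose distance exceeds the support radius of the wavelet.
outside = [(k, i) for k in range(len(grid)) for i in range(len(grid))
           if math.dist(grid[k], grid[i]) > tau_coarse]

if __name__ == "__main__":
    print("entries outside the support:", len(outside), "of", len(grid) ** 2)
```

In this toy configuration the vast majority of entries lie outside the support, so a sparse storage format would be the natural choice in a realistic implementation.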
Remark 7. The linear system (125) is the bottleneck in inverse modeling, although $\Psi_{\tau_{J-1}}$ as constructed here possesses a local support. Formally, (125) can be written in a general matrix notation of the form $Ad = u$. In fact, there is a large literature on solving such a system by minimizing a penalty functional of the generic form $P(d) = \tfrac{1}{2} \|Ad - u\| + R(d)$, where $R(d)$ is a measure of the size and/or complexity of $d$ (see, e.g., chapters Quantitative Remote Sensing Inversion in Earth Science: Theory and Numerical Treatment, Transmission Tomography in Seismology, Numerical Algorithms for Non-smooth Optimization Applicable to Seismic Recovery, and Strategies in Adjoint Tomography for related problems). Even more, sparsity comes into play (see, e.g., chapters Sparsity in Inverse Geophysical Problems, Sparse Solutions of Underdetermined Linear Systems and the references therein). Once all coefficients $a_i^{N_J}$, $i = 1, \ldots, N_J$, are available (note that the integration weights $w_i^{N_J}$, $i = 1, \ldots, N_J$, are known), the function $F \simeq F_{\tau_{J-1}}$ can be obtained in an obvious way. From the knowledge of $U_S$ and $(U_S)_{\tau_{J-1}}$, we are therefore able via (119) to model the wanted refraction index $N$.
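One common instance of the penalty approach mentioned in Remark 7 is Tikhonov regularization. The sketch below is an assumption-laden illustration, not the method prescribed by the text: it uses the squared residual and the quadratic penalty $R(d) = \tfrac{\lambda}{2}\|d\|^2$, restricts itself to a small real-valued system, and solves the normal equations $(A^T A + \lambda I)\, d = A^T u$ by Gaussian elimination. The matrix, data, and function names are hypothetical.

```python
def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def tikhonov(A, u, lam):
    """Minimizer of 0.5 * ||A d - u||^2 + 0.5 * lam * ||d||^2 via the normal equations."""
    m, n = len(A), len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    Atu = [sum(A[k][i] * u[k] for k in range(m)) for i in range(n)]
    return solve_linear(AtA, Atu)


def norm(v):
    return sum(c * c for c in v) ** 0.5


# Hypothetical, nearly rank-deficient system with exact data for d_true = (1, 2).
A_demo = [[1.0, 1.0], [1.0, 1.001], [1.0, 0.999]]
d_true = [1.0, 2.0]
u_demo = [sum(a * d for a, d in zip(row, d_true)) for row in A_demo]

d_plain = tikhonov(A_demo, u_demo, 0.0)    # unregularized least squares
d_reg = tikhonov(A_demo, u_demo, 1e-2)     # regularized solution

if __name__ == "__main__":
    print(d_plain, d_reg)
```

With exact data the unregularized solve recovers $d$, while the penalty pulls the solution toward smaller norm; with noisy data on an ill-conditioned system, that damping is what stabilizes the result. Sparsity-promoting penalties, as referenced in the remark, would require a different solver.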
“Back-Transfer” to Application

A seismic prototype for test investigations in geoexploration is the 2D Marmousi model (see Martin et al. 2002). A velocity model is shown in Fig. 39 (taken from the PhD thesis of Ilyasov (2011)). The source wave field involving the Marmousi model in direct time (i.e., snapshots of the wave propagation after 0.6, 1.1, 1.6, 2.1, and 2.6 s) is illustrated in Figs. 41–45. A result of a reverse-time migration (RT) in the context of a finite-difference scheme (FDS) applied to the Marmousi data set using the velocity model (Fig. 39) is illustrated in Fig. 46.
Fig. 41 Wave propagation in the Marmousi velocity model after 0.6 s
Fig. 42 Wave propagation in the Marmousi velocity model after 1.1 s
Fig. 43 Wave propagation in the Marmousi velocity model after 1.6 s
Fig. 44 Wave propagation in the Marmousi velocity model after 2.1 s
Fig. 45 Wave propagation in the Marmousi velocity model after 2.6 s (all pictures taken from the PhD-thesis Ilyasov (2011), Geomathematics Group, University of Kaiserslautern)
Fig. 46 A migration result of the Marmousi model by FDS (taken from the PhD-thesis Ilyasov (2011), Geomathematics Group, University of Kaiserslautern)
Fig. 47 Interpretation of the Marmousi model due to Martin et al. (2002); both axes in [m]
Fig. 48 Wavelet approximation of the velocity field in [m/s] by Helmholtz derivatives (following Freeden and Blick 2013). Panels: low-pass filtering (scale $j = 4$) and band-pass filtering (scales $j = 4, \ldots, 8$); both axes in [m]
Fig. 49 Wavelet decorrelation (band-pass filtering) of the Marmousi migration model in [m] for scales $j = 2, \ldots, 6$ (following Freeden and Blick 2013)
For the decorrelation behavior of the Fourier-transformed 3D Helmholtz wavelets, we limit our research to wave numbers $k_0$ in the interval from $k_0 = 0.010$ to $k_0 = 0.120$. Our calculations show two illustrations, namely, for the wave numbers $k_0 = 0.047$ and $k_0 = 0.099$. Figure 49 shows the details for the scale parameters corresponding to a dyadic sequence. Relevant structural differences during the scale-dependent wavelet convolution become obvious for the scales $j = 3, \ldots, 6$. Indeed, we are able to show that our decorrelation method highlights specific rock formations. By wavelet filtering of the migration result, we are not only able to specify the salt formation but also to dampen other rock formations as well as undesired noise phenomena caused by erroneous migration (see Fig. 47). Finally, in accordance with our construction (119), the Helmholtz derivative simultaneously leads back to a multiscale approximation of the velocity field using Haar-type trial functions. The wave number chosen for the illustrations in Fig. 48 is $k_0 = 0.099$.
9 Final Remarks
The Earth is a dynamic planet in permanent change, due to large-scale internal convective material and energy rearrangement processes, as well as manifold external effects. We can, therefore, only understand the Earth as our living environment if we consider it as a complex system of all its interacting components. The processes running on the Earth are coupled with one another, forming ramified chains of cause and effect which are additionally influenced by man, who intervenes in the natural balances and circuits. However, knowledge of these chains of cause and effect is still incomplete to a large extent. Within a reasonable time, substantial improvements can only be reached by the exploitation of new measurement and observation methods, e.g., by satellite missions, and by innovative mathematical concepts of modeling and simulation, all in all by geomathematics. As far as future data evaluation is concerned, traditional mathematical methods will not be able to master the new amounts of data either theoretically or numerically, especially considering the important aspect of a more intensively localized treatment with respect to space and time, embedded into a global concept. Instead, geoscientifically relevant parameters must be integrated into constituting modules; the integration must be characterized by three essential characteristics: good approximation property, appropriate decorrelation ability, and fast algorithms. These characteristics are the key to a variety of abilities and new research directions.

Acknowledgements This introductory chapter is based on the German note “W. Freeden (2009): Geomathematik, was ist das überhaupt?, Jahresbericht der Deutschen Mathematiker Vereinigung (DMV), JB.111, Heft 3, 125–152.” I am obliged to the publisher Vieweg+Teubner for giving the permission for an English translation of essential parts of the original version. Particular thanks go to Dr. Helga Nutz for reading an earlier version and eliminating some inconsistencies. Furthermore, I would like to thank my Geomathematics Group, Kaiserslautern, for the assistance in numerical calculation as well as graphical illustration concerning the three exemplary circuits.
References Achenbach JD (1973) Wave propagation in elastic solids. North Holland, New York Albertella A, Savcenko R, Bosch W, Rummel R (2008) Dynamic ocean topography – the geodetic approach. IAPG/FESG Mitteilungen, 27, TU München Ansorge R, Sonar T (2009) Mathematical models of fluid dynamics, 2nd updated edn. Wiley-VCH, Weinheim Augustin M (2014) A method of fundamental solutions in poroelasticity to model the stress field in geothermal reservoirs. PhD-thesis, Geomathematics Group, University of Kaiserslautern Augustin M, Freeden W, Gerhards C, Möhringer S, Ostermann I (2012) Mathematische Methoden in der Geothermie. Math Semesterber 59:1–28 Bach V, Fraunholz W, Freeden W, Hein F, Müller J, Müller V, Stoll H, von Weizsäcker H, Fischer H (2004) Curriculare Standards des Fachs Mathematik in Rheinland-Pfalz (Vorsitz: W. Freeden). Studie: Reform der Lehrerinnen- und Lehrerausbildung, MWWFK Rheinland-Pfalz Bauer M, Freeden W, Jacobi H, Neu T (eds) (2014) Handbuch Tiefe Geothermie. Springer, Heidelberg Baysal E, Kosloff DD, Sherwood JWC (1984) A two-way nonreflecting wave equation. Geophysics 49(2):132–141 Beutelspacher S (2001) In Mathe war ich immer schlecht. Vieweg, Wiesbaden Biondi BL (2006) Three-dimensional seismic imaging. Society of Exploration Geophysicists, Tulsa Bruns EH (1878) Die Figur der Erde. Publikation Königl Preussisch Geodätisches Institut. P Stankiewicz, Berlin Claerbout J (2009) Basic earth imaging. Stanford University Press, Stanford Dahlen FA, Tromp J (1998) Theoretical global seismology. Princeton University Press, Princeton Emmermann R, Raiser B (1997) Das System Erde – Forschungsgegenstand des GFZ. Vorwort des GFZ-Jahresberichts 1996/1997, GeoForschungsZentrum, Potsdam Engl HW, Hanke M, Neubauer A (1996) Regularization of inverse problems. Kluwer Academic, Dordrecht/Boston Evans LD (2002) Partial differential equation, 3rd printing. 
American Mathematical Society, Providence Fehlinger T (2009) Multiscale formulations for the disturbing potential and the deflections of the vertical in locally reflected physical geodesy. PhD-thesis, Geomathematics Group, University of Kaiserslautern, Dr. Hut, München Fengler MJ, Freeden W (2005) A non-linear Galerkin scheme involving vector and tensor spherical harmonics for solving the incompressible Navier–Stokes equation on the sphere. SIAM J Sci Comput 27:967–994 Freeden W (1998) The uncertainty principle and its role in physical geodesy. In: Progress in geodetic science at GW 98, pp 225–236, Shaker Verlag, Aachen Freeden W (1999) Multiscale modelling of spaceborne geodata. B.G. Teubner, Stuttgart/Leipzig Freeden W (2009) Geomathematik, was ist das überhaupt? Jahresbericht der Deutschen Mathematiker Vereinigung (DMV), Vieweg+Teubner, JB. 111, Heft, vol 3, pp 125–152 Freeden W (2011) Metaharmonic lattice point theory. CRC/Taylor & Francis, Boca Raton Freeden W, Blick C (2013) Signal decorrelation by means of multiscale methods. World Min 65(5):1–15 Freeden W, Gerhards C (2010) Poloidal and toroidal fields in terms of locally supported vector wavelets. Math Geosci 42:817–838
Freeden W, Gerhards C (2013) Geomathematically oriented potential theory. CRC/Taylor & Francis, Boca Raton Freeden W, Gutting M (2013) Special functions of mathematical (geo-)sciences. Birkhäuser, Basel Freeden W, Maier T (2002) Multiscale denoising of spherical functions: basic theory and numerical aspects. Electron Trans Numer Anal 14:40–62 Freeden W, Mayer T (2003) Wavelets generated by layer potentials. Appl Comput Harm Anal (ACHA) 14:195–237 Freeden W, Michel V (2004) Multiscale potential theory (with applications to geoscience). Birkhäuser, Boston/Basel/Berlin Freeden W, Nutz H (2014) Mathematische Methoden. In: Bauer M, Freeden W, Jacobi H, Neu T, Herausgeber, Handbuch Tiefe Geothermie. Springer, Heidelberg Freeden W, Schreiner M (2009) Spherical functions of mathematical geosciences – a scalar, vectorial, and tensorial setup. Springer, Berlin/Heidelberg Freeden W, Wolf K (2008) Klassische Erdschwerefeldbestimmung aus der Sicht moderner Geomathematik. Math Semesterber 56:53–77 Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere (with applications to geomathematics). Oxford/Clarendon, Oxford Freeden W, Michel D, Michel V (2005) Local multiscale approximations of geostrophic ocean flow: theoretical background and aspects of scientific computing. Mar Geod 28:313–329 Freeden W, Fehlinger T, Klug M, Mathar D, Wolf K (2009) Classical globally reflected gravity field determination in modern locally oriented multiscale framework. J Geod 83:1171–1191 Gauss, C.F. (1863) Werke, Band 5, Dietrich Göttingen Gerhards C (2011) Spherical multiscale methods in terms of locally supported wavelets: theory and application to geomagnetic modelling. PhD-thesis, Geomathematics Group, University of Kaiserslautern, Dr. Hut, München Grafarend EW (2001) The spherical horizontal and spherical vertical boundary value problem – vertical deflections and geoidal undulations – the completed Meissl diagram. 
J Geod 75:363–390 Groten E (1979) Geodesy and the Earth’s gravity field I+II. Dümmler, Bonn Gutting M (2007) Fast multipole methods for oblique derivative problems. PhD-thesis, Geomathematics Group, University of Kaiserslautern, Shaker, Aachen Heiskanen WA, Moritz H (1967) Physical geodesy. Freeman and Company, San Francisco Haar A (1910) Zur Theorie der orthogonalen Funktionssysteme. Math Ann 69:331–371 Helmert FR (1881) Die mathematischen und physikalischen Theorien der Höheren Geodäsie 1+2, B.G. Teubner, Leipzig Ilyasov M (2011) A tree algorithm for Helmholtz potential wavelets on non-smooth surfaces: theoretical background and application to seismic data processing. PhD-thesis, Geomathematics Group, University of Kaiserslautern Jakobs F, Meyer H (1992) Geophysik – Signale aus der Erde. Teubner, Leipzig Kümmerer B (2002) Mathematik. Campus, Spektrum der Wissenschaftsverlagsgesellschaft, pp 1–15 Lemoine FG, Kenyon SC, Factor JK, Trimmer RG, Pavlis NK, Shinn DS, Cox CM, Klosko SM, Luthcke SB, Torrence MH, Wang YM, Williamson RG, Pavlis EC, Rapp RH, Olson TR (1998) The development of the joint NASA GSFC and NIMA geopotential model EGM96. NASA/TP1998-206861, NASA Goddard Space Flight Center, Greenbelt Listing JB (1873) Über unsere jetzige Kenntnis der Gestalt und Größe der Erde. Dietrich, Göttingen Marks DL (2013) A family of approximations spanning the Born and Rytov scattering series. Opt Exp 14:8837–8848 Martin GS, Marfurt KJ, Larsen S (2002) Marmousi-2: an updated model for the investigation of AVO in structurally complex areas. In: Proceedings, SEG annual meeting, Salt Lake City Meissl P (1971) On the linearisation of the geodetic boundary value problem. Report No. 152, Department of Geodetic Science, The Ohio State University, Columbo, OH
Michel V (2002) A multiscale approximation for operator equations in separable Hilbert spaces – case study: reconstruction and description of the Earth’s interior. Habilitation-thesis, Geomathematics Group, University of Kaiserslautern, Shaker, Aachen Michel V (2013) Lectures on constructive approximation – Fourier, spline, and wavelet methods on the real line, the sphere, and the ball. Birkhäuser, Boston Müller C (1969) Foundations of the mathematical theory of electromagnetic waves. Springer, Berlin/Heidelberg/New York Nashed MZ (1981) Operator-theoretic and computational approaches to ill-posed problems with application to antenna theory. IEEE Trans Antennas Propag 29:220–231 Nerem RS, Koblinski CJ (1994) The geoid and ocean circulation. In: Vanicek P, Christon NT (eds) Geoid and its geophysical interpretations. CRC, Boca Raton, pp 321–338 Neumann F (1887) Vorlesungen über die Theorie des Potentials und der Kugelfunktionen. Teubner, Leipzig, pp 135–154 Neunzert H, Rosenberger B (1991) Schlüssel zur Mathematik. Econ, Düsseldorf Nolet G (2008) Seismic tomography: imaging the interior of the Earth and Sun. Cambridge University Press, Cambridge Nutz H (2002) A unified setup of gravitational observables. PhD-thesis, Geomathematics Group, University of Kaiserslautern, Shaker, Aachen Ostermann I (2011) Modeling heat transport in deep geothermal systems by radial basis functions. PhD-thesis, Geomathematics Group, University of Kaiserslautern, Dr. Hut, München Pedlovsky J (1979) Geophysical fluid dynamics. Springer, New York/Heidelberg/Berlin Pesch HJ (2002) Schlüsseltechnologie Mathematik. Teubner, Stuttgart/Leipzig/Wiesbaden Popov MM, Semtchenok NM, Popov, Verdel AR (2006) Gaussian beam migration of multi-valued zero-offset data. In: Proceedings, international conference, days on diffraction, St. Petersburg, pp 225–234 Popov MM, Semtchenok NM, Popov PM, Verdel AR (2008) Reverse time migration with Gaussian beams and velocity analysis applications. 
In: Extended abstracts, 70th EAGE conference & exhibitions, Rome, F048 Ritter JRR, Christensen UR (eds) (2007) Mantle plumes – a multidisciplinary approach. Springer, Heidelberg Rummel R (2002) Dynamik aus der Schwere – Globales Gravitationsfeld. An den Fronten der Forschung (Kosmos, Erde, Leben), Hrsg. R. Emmermann u.a., Verhandlungen der Gesellschaft Deutscher Naturforscher und Ärzte, 122. Versammlung, Halle Rummel R, van Gelderen M (1995) Meissl scheme – spectral characteristics of physical geodesy. Manuscr Geod 20:379–385 Skudrzyk E (1972) The foundations of acoustics. Springer, Heidelberg Snieder R (2002) The Perturbation method in elastic wave scattering and inverse scattering in pure and applied science, general theory of elastic wave. Academic, San Diego, pp 528–542 Sonar T (2001) Angewandte Mathematik, Modellbildung und Informatik: Eine Einführung für Lehramtsstudenten, Lehrer und Schüler. Vieweg, Braunschweig, Wiesbaden Sonar T (2011) 3000 Jahre Analysis. Springer, Heidelberg/Dordrecht/London/New York Stokes GG (1849) On the variation of gravity at the surface of the earth. Trans Camb Philos Soc 8:672–712; Mathematical and physical papers by George Gabriel Stokes, vol II. Johanson Reprint Corporation, New York, pp 131–171 Tarantola A (1984) Inversion of seismic relation data in the acoustic approximation. Geophysics 49:1259–1266 Torge W (1991) Geodesy. Walter de Gruyter, Berlin Weyl H (1916) Über die Gleichverteilung von Zahlen mod Eins. Math Ann 77:313–352 Wolf K (2009) Multiscale modeling of classical boundary value problems in physical geodesy by locally supported wavelets. PhD-thesis, Geomathematics Group, University of Kaiserslautern, Dr. Hut, München Yilmaz O (1987) Seismic data analysis: processing, inversion and interpretation of seismic data. Society of Exploration Geophysicists, Tulsa
Navigation on Sea: Topics in the History of Geomathematics
Thomas Sonar
Contents
1 General Remarks on the History of Geomathematics
2 Introduction
3 The History of the Magnet
4 Early Modern England
5 The Gresham Circle
6 William Gilbert’s Dip Theory
7 The Briggsian Tables
8 The Computation of the Dip Table
9 Conclusion
References
Abstract
In this chapter, we review the development of the magnet as a means for navigational purposes. Around 1600, knowledge of the properties and behavior of magnetic needles began to grow in England, mainly through the publication of William Gilbert’s influential book De Magnete. Inspired by the rapid advancement of knowledge on one side and of the English fleet on the other, scientists associated with Gresham College began thinking of using magnetic instruments to measure the degree of latitude without being dependent on a clear sky, a quiet sea, or complicated navigational tables. The construction and actual use of these magnetic instruments, called dip rings, is a tragic episode in the history of seafaring, since the latitude does not depend on the magnetic field of the Earth, but the construction of a table enabling seafarers to take the degree of latitude is certainly a highlight in the history of geomathematics.

T. Sonar, Computational Mathematics, Technische Universität Braunschweig, Braunschweig, Germany
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_2
1 General Remarks on the History of Geomathematics
Geomathematics is nowadays often thought of as a very young science and a modern area in the realm of mathematics. Nothing could be further from the truth. Geomathematics began when man realized that he walked across a sphere-like Earth and that this had to be taken into account in measurements and computations. Hence, Eratosthenes can be seen as an early geomathematician when he tried to determine the circumference of the Earth by measurements of the sun’s position from the bottom of a well and the length of shadows farther away at midday. Other important topics in the history of geomathematics are the struggles for an understanding of the true shape of the Earth, which led to the development of potential theory and much of multidimensional calculus (see Greenberg 1995), the mathematical developments around the research of the Earth’s magnetic field, and the history of navigation.
2 Introduction
The history of navigation is one of the most exciting stories in the history of mankind and one of the most important topics in the history of geomathematics. The notion of navigation thereby spans the whole range from the ethnomathematics of Polynesian stick charts via the compass to modern mathematical developments in understanding the Earth’s magnetic field and satellite navigation via GPS. We shall concentrate here on the use of the magnetic needle for navigational purposes and in particular on developments that took place in early modern England. However, we begin our investigations with a short overview of the history of magnetism following Balmer (1956).
3 The History of the Magnet
The earliest sources on the use of magnets for the purpose of navigation stem from China. During the Han epoch (202 BC–220 AD), we find the description of carriages equipped with compass-like devices so that the early Chinese emperors were able to navigate on their journeys through their enormous empire. These carriages were called tschinan-tsche, meaning “carriages that show noon.” The compass-like devices consisted of little humanlike figures which floated on water in a bowl, the finger of the stretched arm always pointing straight to the south. We do not know nowadays why the ancient Chinese preferred the southward direction instead of a northbound one. In a book on historical memoirs written by Sse-ma-tsien (or Schumatsian), we find a report, dating back to the first half of the second century, about a present that Emperor Tsching-wang gave in 1100 BC to the ambassadors of the cities of Tonking and Cochinchina. The ambassadors received five “magnetic carriages” in order to guide them safely back to their cities even through sand storms in the desert. Since the ancient Chinese knew about the attracting forces of a magnet, they called them “loving stones.” In a work on natural sciences written by Tschin-tsang-ki from the year 727, we read:

The loving stone attracts the iron like an affectionate mother attracts her children around her; this is where the name comes from.
It was also known very early that a magnet could transfer its attracting properties to iron when it was swept over the piece of iron. In a dictionary of the year 121, the magnet is called a stone “with which a needle can be given a direction,” and hence, it is not surprising that a magnetic needle mounted on a piece of cork and floating in a bowl of water belonged to the standard equipment of larger Chinese ships as early as the fourth century. Such simple devices were called “bussola” by the Italians and are still known under that name. A magnetic needle does not point precisely to the geographic poles but to the magnetic ones. The locus of the magnetic poles moves in time so that a deviation always has to be taken into account. Around the year 1115, the problem of deviation was known in China. The word “magnet” comes from the Greek word magnes describing a sebaceous rock which, according to the Greek philosopher and natural scientist Theophrastos (ca. 371–287 BC), was a forgeable and silver white rock. The philosopher Plato (428/27–348/47 BC) called magnetic rock the “stone of Heracles,” and the poet Lucretius (ca. 99–55 BC) used the word “magnes” in the sense of attracting stone. He attributed the name to a place named “Magnesia” where this rock could be found. Other classical Greek anecdotes attribute the name to a shepherd named “Magnes.” It is said that he wore shoes with iron nails and, while accompanying his sheep, suddenly could no longer move because he stood on magnetic rock. Homer wrote about the force of the magnet as early as 800 BC. It seems typical for the ancient Greek culture that one sought an explanation of this force fairly early on. Plato thought of this force as being simply “divine.” The philosopher Epicurus (341–270 BC) had the hypothesis that magnets radiate tiny particles, namely atoms.
Eventually Lucretius exploited this hypothesis and explained the attracting force of a magnet by the property of the radiated atoms to clear the space between the magnet and the iron. Into the free space then iron atoms could penetrate, and since iron atoms try hard to stay together (says Lucretius), the iron piece would follow them. The pressure of the air also played some minor role in this theory. Lucretius knew that he had to answer the question why iron would follow but other materials would not. He simply declared that gold would be too heavy and timber would show too large porosities so that the atoms of the magnet would simply go through. The first news on the magnet in Western Europe came from Paris around the year 1200. A magnetic needle was used to determine the orientation. We do not know how the magnet came to Western Europe and how it was received but it is almost certainly true that the crusades and the associated contact with the peoples in the Mediterranean played a crucial role. Before William Gilbert around 1600
Fig. 1 The magnetic perpetuum mobile of Peregrinus
Fig. 2 Elizabeth I (Armada portrait)
came up with a “magnetick philosophy,” it was the crusader, astronomer, chemist, and physician Peter Peregrinus De Maricourt who developed a theory of the magnet in a famous “letter on the magnet” dating back to 1269. He describes experiments with magnetic stones which are valid even nowadays. Peregrinus grinds a magnetic stone in the form of a sphere, places it in a wooden plate, and puts this plate in a bowl with water. Then he observes that the sphere moves according to the poles. He develops ideas of magnetic clocks and describes the meaning of the magnet with respect to the compass. He also develops a magnetic perpetuum mobile according to Fig. 1. A magnet is mounted at the tip of a hand which is periodically moving (says Peregrinus) because of iron nails on the circumference.
Navigation on Sea: Topics in the History of Geomathematics
Fig. 3 De Magnete
Peregrinus’ work was so influential in Western Europe that even 300 years after his death, he was still accepted as the authority on the magnet.
4 Early Modern England
In the sixteenth century, Spain and Portugal developed into the leading sea powers. Streams of gold, spices, and gemstones poured from South America into the home countries. England had missed the connection. When Henry VIII died in 1547, only a handful of decaying ships were lying in the English sea harbors. His successor, his son Edward VI, ruled for only 6 years before he died young. Henry’s daughter Mary, a devout Catholic, tried hard to re-Catholicize the country her father had steered into Protestantism and married Philip II, King of Spain. Mary was fairly brutal in the means of the re-Catholicization, and many of the Protestant intelligentsia left the country in fear of their lives. “Bloody Mary” died in 1558 at the age of 42, and the way opened to her half-sister Elizabeth. Within a single generation, Elizabeth I transformed rural England into the leading sea power on Earth. She
Fig. 4 Norman’s discovery of the magnetic dip (panels: unmagnetized needle, magnetized needle)
Fig. 5 Dip rings: (a) the dip ring after Gilbert in De Magnete. (b) A dip ring used in the seventeenth century
was advised very well by Sir Walter Raleigh who clearly saw the future of England on the seas. New ships were built for the navy, and in 1588 the small English fleet was able to defeat the famous Spanish Armada – by chance and with good luck; but this incident served to boost not only the feeling of self-worth of a whole nation but also the realization of the need for a navy and for efficient navigational tools. English mariners realized on longer voyages that the magnetic needle inside a compass lost its magnetic power. If that was detected, the needle had to be magnetized afresh – it had to be “loaded” afresh. However, this is not the reason why the magnet is called loadstone in the English language; that name is only a mistranslation. The correct word should be lodstone – “leading stone” – but that word was actually never used (Pumfrey 2002).
Fig. 6 Measuring the dip on the terrella in De Magnete
5 The Gresham Circle
In 1592, Henry Briggs (1561–1639), chief mathematician in his country, was elected examiner and lecturer in mathematics at St John’s College, Cambridge, which nowadays corresponds to a professorship. In the same year, he was elected Reader of the Physics Lecture founded by Dr. Linacre in London. One hundred years before the birth of Briggs, Thomas Linacre was horrified by the pseudomedical treatment of sick people by hairdressers and vicars who did not shrink back from chirurgical operations without a trace of medical instruction. He founded the Royal College of Physicians of London and Briggs was now asked to deliver lectures with medical contents. The Royal College of Physicians was the first important domain for Briggs to make contact with men outside the spheres of the two great universities, and, indeed most important, he met William Gilbert (1544–1603) who was working on the wonders of the magnetical forces and who revolutionized modern science only a few years later.
While England was on its way to becoming the world’s leading sea power, the two old English universities, Oxford and Cambridge, were in an alarming state of sleepiness (Hill 1997, p. 16ff). Instead of working and teaching at the forefront of modern research in important topics like navigation, geometry, and astronomy, they kept curricula rooted directly in the ancient Greek tradition. Mathematics included reading of the first four or five books of Euclid, medicine was read after Galen, and Ptolemy ruled in astronomy. When the founder of the English stock exchange (Royal Exchange) in London, Thomas Gresham, died, he left in his last will money and buildings in order to found a new form of university, Gresham College, which is still in operation. He ordered the employment of seven lecturers to give public lectures in theology, astronomy, geometry, music, law, medicine, and rhetoric, mostly in the English language. The salary of the Gresham professors was set at £50 a year, which was an enormous sum compared to the salary of the Regius professors in Oxford and Cambridge (Hill 1997, p. 34). The only conditions on the candidates for the Gresham professorships were brilliance in their field and an unmarried style of life. Briggs must already have been well known as a mathematician of the first rank since he was chosen to be the first Gresham professor of Geometry in 1596. Modern mathematics was badly needed in the art of navigation, and public lectures on mathematics were in fact already given in 1588 on behalf of the East India Company, the Muscovy Company, and the Virginia Company. Even before 1588, there were attempts by Richard Hakluyt to establish public lectures, and none less than Francis Drake had promised £20 (Hill 1997, p. 34), but it needed the national shock of the attack of the Armada in 1588 to make such lectures come true. During his time at Gresham College, Briggs became the center of what we can doubtlessly call the Briggsian circle.
Hill writes (1997, p. 37): He [i.e. BRIGGS] was a man of the first importance in the intellectual history of his age, . . . . Under him Gresham at once became a centre of scientific studies. He introduced there the modern method of teaching long division, and popularized the use of decimals.
The Briggsian circle consisted of true Copernicans: men like William Gilbert who wrote De Magnete; the able applied mathematician Edward Wright who is famous for his book on the errors in navigation; William Barlow, a fine instrument maker and man of experiments; and the great popularizer of scientific knowledge, Thomas Blundeville. Gilbert and Blundeville were protégés of the Earl of Leicester, and we know about connections with the circle of Raleigh in which the brilliant mathematician Thomas Harriot worked. Blundeville held contacts with John Dee who introduced modern continental mathematics and the Mercator maps in England (Hill 1997, p. 42). Hence, we can think of a scientific sub-net in England in which important work could be done which was impossible to do in the great universities. It was during this time at Gresham College that Briggs and his circle were most productive in the calculation of tables of astronomical and navigational importance. At the center of their activities was Gilbert’s “magnetick philosophy.”
6 William Gilbert’s Dip Theory
The role of William Gilbert in shaping modern natural sciences cannot be overestimated, and a recent biography of Gilbert (Pumfrey 2002) emphasizes his importance in England and abroad. Gilbert, a physician and member of the Royal College of Physicians in London, became interested in navigational matters and the properties of the magnetic needle in particular through his contacts with seamen and famous navigators of his time alike. As a result of years of experiments, thought, and discussions with his Gresham friends, the book De Magnete, magneticisque corporibus, et de magno magnete tellure; Physiologia nova, plurimis & argumentis, & experimentis demonstrata was published in 1600. (I refer to the English translation Gilbert (1958) by P. Fleury Mottelay which is a reprint of the original of 1893. There is a better translation by Sylvanus P. Thompson from 1900, but while the latter is rare, the former is still in print.) It contained many magnetic experiments with what Gilbert called his terrella – the little Earth – which was a magnetical sphere. In the spirit of the true Copernican, Gilbert deduced the rotation of the Earth from the assumption of it being a magnetic sphere. Concerning navigation, Briggs, and the Gresham circle, the most interesting chapter in De Magnete is Book V: On the dip of the magnetic needle. Already in 1581, the instrument maker Robert Norman had discovered the magnetic dip in his attempts to straighten magnetic needles in a fitting on a table. He had observed that an unmagnetized needle could be fitted in a parallel position with regard to the surface of a table but when the same needle was magnetized and fitted again, it made an angle with the table. Norman published his results in the same year (R. Norman, The New Attractive, London, 1581). Even before Norman, the dip was reported by the German astronomer and instrument maker Georg Hartmann from Nuremberg in a letter to Duke Albrecht of Prussia from the 4th of March, 1544 (see Balmer 1956, pp. 290–292), but nobody read the letter. In modern terminology, the phenomenon of the dip is called inclination in contrast to the declination or variation of the needle. A word of warning is appropriate here: in Gilbert’s time, many authors used the word declination for the inclination. Anyway, Norman was the first to build a dip ring in order to measure the inclination. This ring is nothing else but a vertical compass. Already Norman had discovered that the dip varied with time! However, Gilbert believed that he had found the secrets of magnetic navigation. He explained the variation of the needle by land masses acting on the compass, which fitted nicely with the measurements of seamen but is wrong, as we now know. Concerning the dip, let me give a summary of Gilbert’s work in modern terms. Gilbert must have measured the dip on his terrella many, many times before he was led to his: First hypothesis: There is an invertible mapping between the lines of latitude and the lines of constant dip. Hence, Gilbert believed he had found a possibility of determining the latitude on Earth from the degree of the dip. Let $\beta$ be the latitude and $\alpha$ the dip. He then formulated his:
Fig. 7 Third hypothesis
Second hypothesis: At the equator, the needle is parallel to the horizon, i.e., $\alpha = 0^\circ$. At the north pole, the needle is perpendicular to the surface of the Earth, i.e., $\alpha = 90^\circ$. He then draws a conclusion, but in our modern eyes, this is nothing but another: Third hypothesis: If $\beta = 45^\circ$, then the needle points exactly to the second equatorial point. What he meant by this is best described in Fig. 7. Gilbert himself writes . . . points to the equator F as the mean of the two poles. (Gilbert 1958, p. 293)
Note that in Fig. 7, the equator is given by the line AF and the poles are B (north) and C so that our implicit (modern) assumption that the north pole is always shown on top of a figure is not satisfied. From his three hypotheses, Gilbert concludes correctly: First conclusion: The rotation of the needle has to be faster on its way from A to L than from L to B. Or, in Gilbert’s words, . . . the movement of this rotation is quick in the first degrees from the equator, from A to L, but slower in the subsequent degrees, from L to B, that is, with reference to the equatorial point F, toward C. (Gilbert 1958, p. 293)
And on the same page, we read: . . . it dips; yet, not in ratio to the number of degrees or the arc of the latitude does the magnetic needle dip so many degrees or over a like arc; but over a very different one, for this movement is in truth not a dipping movement, but really a revolution movement, and it describes an arc of revolution proportioned to the arc of latitude.
Fig. 8 Gilbert’s geometrical construction of the mapping in De Magnete
This is simply the lengthy description of the following: Second conclusion: The mapping between latitude and dip cannot be linear. Now that Gilbert had made up his mind concerning the behavior of the mapping at $\beta = 0^\circ$, $45^\circ$, and $90^\circ$, a construction of the general mapping was sought. It is exactly here where De Magnete shows strange weaknesses; in fact, we witness a qualitative jump from a geometric construction to a dip instrument. Gilbert’s geometrical description can be seen in Fig. 8. I do not intend to comment on this construction because this was done in detail elsewhere (Sonar 2001), but it is not possible to understand the construction from Gilbert’s writings in De Magnete. Even more surprising, while all figures in De Magnete are rough woodcuts in the quality shown before, suddenly there is a fine technical drawing of the resulting construction as shown in Fig. 9. The difference between this drawing and all other figures in De Magnete and the weakness in the description of the construction of the mapping between latitudes and dip angles suggest that at least this part was not written by Gilbert alone but by some of his friends in the Gresham circle. Pumfrey speaks of the dark secret of De Magnete (Pumfrey 2002, p. 173ff) and gives evidence that Edward Wright, whose On Certain Errors in Navigation had appeared a year before De Magnete, had his hands in some parts of Gilbert’s book. In Parsons and Morris (1939, pp. 61–67), we find the following remarks: Wright, and his circle of friends, which included Dr. W. Gilbert, Thomas Blundeville, William Barlow, Henry Briggs, as well as Hakluyt and Davis, formed the centre of scientific thought at the turn of the century. Between these men there existed an excellent spirit of co-operation, each sharing his own discoveries with the others. In 1600 Wright assisted Gilbert in the compilation of De Magnete.
He wrote a long preface to the work, in which he proclaimed his belief in the rotation of the earth, a theory which Gilbert was explaining, and also contributed chapter 12 of Book IV, which dealt with the method of finding the amount of the variation of the compass. Gilbert devoted his final chapters to practical problems of navigation, in which he knew many of his friends were interested.
Fig. 9 The fine drawing in De Magnete
There is no written evidence that Briggs was involved too, but it seems very unlikely that the chief mathematician of the Gresham circle should not have taken part in so important a development as the dip theory. We shall see later on that the involvement of Briggs is highly likely when we study his contributions to dip theory in books of other authors.
7 The Briggsian Tables
If we trust Ward (1740, pp. 120–129), the first published table of Henry Briggs is the table which represents Gilbert’s mapping between latitude and dip angles in Thomas Blundeville’s book The Theoriques of the seuen Planets, shewing all their diuerse motions, and all other Accidents, called Passions, thereunto belonging. Whereunto is added by the said Master Blundeuile, a breefe Extract by him made, of Magnus his Theoriques, for the better vnderstanding of the Prutenicall Tables, to calculate thereby the diuerse motions of the seuen Planets. There is also hereto added, The making, description, and vse, of the two most ingenious and necessarie Instruments for Sea-men, to find out therebye the latitude of any place vpon the Sea or Land, in the darkest night that is, without the helpe of Sunne, Moone, or Starre.
First inuented by M. Doctor Gilbert, a most excellent Philosopher, and one of the ordinarie Physicians to her Maiestie: and now here plainely set down in our mother tongue by Master Blundeuile. London Printed by Adam Islip. 1602.
Fig. 10 Title page of The Theoriques by Blundeville
Blundeville is an important figure in his own right; see Taylor (1954, p. 173) and Waters (1958, pp. 212–214). He was one of the first and most influential popularizers of scientific knowledge. He did not write for the expert, but for the layman, i.e., the young gentlemen interested in such diverse questions as science, writing of history, mapmaking, logic, seamanship, or horse riding. We do not know much about his life (Campling 1921–1922), but his role in the Gresham circle is apparent through his writings. In The Theoriques, Gilbert’s dip theory is explained in detail, and a step-by-step description of the construction of the dip instrument is given. I have followed Blundeville’s instructions and constructed the dip instrument again elsewhere; see Sonar (2002). See also Sonar (2001). The final result is shown in Blundeville’s book as in Fig. 11. In order to understand the geometrical details, it is necessary to give a condensed description of the actual construction in Fig. 8 which is given in detail in The Theoriques. We start with a circle ACDL representing planet Earth as in Fig. 12. Note that A is an equatorial point while C is a pole. The navigator (and hence the dip instrument) is assumed to be in point N which corresponds to the latitude $\beta = 45^\circ$. In the first step of the construction, a horizon line is sought, i.e., the line from the navigator in N to the horizon. Now a circle is drawn around A with radius AM (the Earth’s radius). This marks the point F on a line through A parallel to CL. A circle around M through F now gives the arc $\widehat{FM}$. The point H is constructed by drawing a circle around C with radius AM. The point of intersection of this circle with the outer circle through F is H. If the dip instrument is at A, the navigator’s horizon point is F. If it is in C, the navigator’s horizon will be in H. Correspondingly, drawing a circle around N with radius AM gives the point S; hence, S is the point at the horizon seen from N.
Hence, to every position N of the needle, there is a quadrant of dip which is the arc from M to a corresponding point on the outer circle through F. If N is at $\beta = 45^\circ$ latitude as in our example, we know from Gilbert’s third hypothesis that the needle points to D. The angle between S and the intersection point of the quadrant of dip with the direction of the needle is the dip angle. The remaining missing information is the point to which the needle points for a general latitude $\beta$. This is accomplished by quadrants of rotation which implement Gilbert’s idea of the needle rotating on its way from A to C. The construction of these quadrants is shown in Fig. 13. We need a second outer circle which is constructed by drawing a circle around A through L. The intersection point of this circle with the line AF is B and the second outer circle is then the circle through B around M. Drawing a circle around C through L defines the point G on the second outer circle. These arcs, $\widehat{GL}$ and $\widehat{BL}$, are the quadrants of
Fig. 11 The dip instrument in The Theoriques
Fig. 12 The quadrant of dip
Fig. 13 The quadrant of rotation
rotation corresponding to the positions C and A of the needle, respectively. Assume again that the dip instrument is in N at $\beta = 45^\circ$. Then the corresponding quadrant of rotation is constructed by a circle around N through L and is the arc $\widehat{OL}$. This arc is now divided in 90 parts, starting from the second outer circle ($0^\circ$) and ending at L ($90^\circ$). Obviously, in our example, according to Gilbert, the $45^\circ$ mark is exactly at D.
Fig. 14 The final steps
Now we are ready for the final step. Putting together all our quadrants and lines, we arrive at Fig. 14. The needle at N points to the mark $45^\circ$ on the arc of rotation $\widehat{OL}$ and hence intersects the quadrant of dip (arc $\widehat{SM}$) in the point R. The angle of the arc $\widehat{SR}$ is the dip angle $\delta$. We can now proceed in this manner for all latitudes from $\beta = 0^\circ$ to $\beta = 90^\circ$ in steps of $5^\circ$. Each latitude gives a new quadrant of dip, a new quadrant of rotation, and a new intersection point R. The final construction is shown in Fig. 15. However, in Fig. 15, the construction is shown in the lower right quadrant instead of in the upper left and already uses the notation of Blundeville instead of that of William Gilbert. The main goal of the construction, however, is a spiral line which appears after the removal of all the construction lines, as in Fig. 16, and can already be seen in the upper left picture in Blundeville’s drawing in Fig. 11. The spiral line consists of all intersection points R. Together with a quadrant which can rotate around the point C of the mater, the instrument is ready to use. In order to illustrate its use, we give an example. Consider a seaman who has used a dip ring and measured a dip angle of $60^\circ$. Then he would rotate the quadrant until the spiral line intersects the quadrant at the point $60^\circ$ on the inner side of the quadrant. Then the line AB on the quadrant intersects the scale on the mater at the degree of latitude; in our case $36^\circ$, see Fig. 18. However, accurate reading of the scales becomes nearly impossible for angles of dip larger than $60^\circ$, and the reading depends heavily on the accuracy of the construction of the spiral line. Therefore, Henry Briggs was asked to compute a table in order to replace the dip instrument by a simple table look-up. At the very end of Blundeville’s The Theoriques, we find the following appendix; see Fig. 19:
Fig. 15 The construction of Gilbert’s mapping
Fig. 16 The mater of the dip instrument in The Theoriques
A short appendix annexed to the former Treatise by Edward Wright, at the motion of the right Worshipful M. Doctor Gilbert
Because of the making and using of the foresaid Instrument, for finding the latitude by the declination of the Magneticall Needle, will bee too troublesome for the most part of Seamen, being notwithstanding a thing most worthie to be put in daily practise, especially by such as undertake long voyages: it was thought meet by my worshipfull friend M. Doctor Gilbert, that (according to M. Blundeuile’s earnest request) this Table following should be hereunto adioned; which M. Henry Briggs (professor of Geometrie in Gresham Colledge at London) calculated and made out of the doctrine and tables of Triangles, according to the Geometricall grounds and reason of this Instrument, appearing in the 7 and 8 Chapter of M. Doctor Gilberts fift[h] booke of the Loadstone. By helpe of which Table, the Magneticall declination being giuen, the height of the Pole may most easily be found, after this manner. With the Instrument of Declination before described, find out what the Magneticall declination is at the place where you are: Then look that Magneticall declination in the second Collum[n]e of this Table, and in the same line immediatly towards the left hand, you shall find the height of the Pole at the same place, unleße there be some variation of the declination, which must be found out by particular obseruation in euery place.
Fig. 17 The quadrant
Fig. 18 Determining the latitude for $60^\circ$ dip
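The look-up procedure described in the appendix can be sketched in a few lines of Python. This is only an illustration: the five rows are copied from the retyped table given further on (latitudes 58° to 62°), while the name `latitude_from_dip` and the nearest-row rule are assumptions made here, since the appendix only tells the seaman to find the measured value in the second column and read the latitude to its left.

```python
# Excerpt of the Briggs table: height of the pole (degrees) -> dip (degrees, minutes).
TABLE_EXCERPT = {58: (77, 37), 59: (78, 15), 60: (78, 53), 61: (79, 29), 62: (80, 4)}


def latitude_from_dip(deg, minutes):
    """Return the tabulated latitude whose dip is closest to the measured dip.

    Hypothetical helper: the nearest-row rule is an assumption, not part of the appendix.
    """
    measured = 60 * deg + minutes  # work in minutes of arc

    def distance(lat):
        d, m = TABLE_EXCERPT[lat]
        return abs(60 * d + m - measured)

    return min(TABLE_EXCERPT, key=distance)


print(latitude_from_dip(78, 53))  # 60
print(latitude_from_dip(80, 0))   # 62
```

A full version would simply carry all 90 rows; the appendix's caveat about local variation of the dip applies unchanged.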
Fig. 19 The appendix as found in Edward Wrights On errors in Navigation
The next page (which is the final page of The Theoriques) indeed shows the table. Fig. 19 shows the appendix. In order to make the numbers in the table more visible, I have retyped the table. We shall not discuss this table in detail, but it is again worthwhile to review the relations between Gilbert, Briggs, Blundeville, and Wright (Hill 1997, p. 36): Briggs was at the center of Gilbert’s group. At Gilbert’s request he calculated a table of magnetic dip and variation. Their mutual friend Edward Wright recorded and tabulated much of the information which Gilbert used and helped in the production of De Magnete. Thomas Blundeville, another member of Briggs’s group, and, like Gilbert, a former protégé of the Earl of Leicester, popularized Gilbert’s discoveries in The Theoriques of the Seven Planets (1602), a book in which Briggs and Wright again collaborated.
First column: Heights of the pole (Degrees). Second column: Magnetical declination (Deg. Min.).

Pole   Decl.        Pole   Decl.        Pole   Decl.
Deg.   Deg. Min.    Deg.   Deg. Min.    Deg.   Deg. Min.
  1      2   11      31     52   27      61     79   29
  2      4   20      32     53   41      62     80    4
  3      6   27      33     54   53      63     80   38
  4      8   31      34     56    4      64     81   11
  5     10   34      35     57   13      65     81   43
  6     12   34      36     58   21      66     82   13
  7     14   32      37     59   28      67     82   43
  8     16   28      38     60   33      68     83   12
  9     18   22      39     61   37      69     83   40
 10     20   14      40     62   39      70     84    7
 11     22    4      41     63   40      71     84   32
 12     23   52      42     64   39      72     84   57
 13     25   38      43     65   38      73     85   21
 14     27   22      44     66   35      74     85   44
 15     29    4      45     67   30      75     86    7
 16     30   45      46     68   24      76     86   28
 17     32   24      47     69   17      77     86   48
 18     34    0      48     70    9      78     87    8
 19     35   36      49     70   59      79     87   26
 20     37    9      50     71   48      80     87   44
 21     38   41      51     72   36      81     88    1
 22     40   11      52     73   23      82     88   17
 23     41   39      53     74    8      83     88   33
 24     43    6      54     74   52      84     88   47
 25     44   30      55     75   35      85     89    1
 26     45   54      56     76   17      86     89   14
 27     47   15      57     76   57      87     89   27
 28     48   36      58     77   37      88     89   39
 29     49   54      59     78   15      89     89   50
 30     51   11      60     78   53      90     90    0

It took Blundeville’s The Theoriques to describe the construction of the dip instrument accurately which nebulously appeared in Gilbert’s De Magnete. However, even Blundeville does not say a word concerning the computation of the table. Another friend in the Gresham circle, famous Edward Wright, included all of the necessary details in the second edition of his On Errors in Navigation (Wright 1610), the first edition of which appeared in 1599. Much has been said about the importance of Edward Wright (see, for instance, Parsons and Morris 1939), and he was certainly one of the first – if not the first – who was fully aware of the mathematical background of Mercator’s mapping; see Sonar (2001, p. 131ff). It is in Wright’s On Errors in Navigation where Wright and Briggs explain the details of the computation of the dip table which was actually computed by Briggs showing superb mastership of trigonometry. We shall now turn to this computation.
Fig. 20 Figure A: CHAP. XIIII to find the inclination or dipping of the magneticall needle under the Horizon
8 The Computation of the Dip Table
In the second edition of Wright’s book On Errors in Navigation, we find in chapter XIIII: Let OBR be a meridian of the Earth, wherein let O be the pole, B the æquinoctal, and R the latitude (suppose 60 degrees), and let BD be perpendicular to AB in B and equal to the subtense OB; and drawing the line AD, describe therwith the arch DSV. Then draw the subtense OR, wherewith (taking R for the center) draw the lines RS equal to RO and parts AS equal to AD. Also because BR is assumed to be 60 deg., therefore let ST be 60/90 of the arch STO, and draw the line RT, for the angle ART shall be the cõplement of the magnetical needles inclinatiõ under the horizon, which may be found by the solution of the two triangles OAR and RAS after this manner:
Although the notation here again differs from that in Blundeville’s book and in De Magnete, we can easily recognize the situation as described by Gilbert. Now the actual computation starts: First the triangle OAR is given because of the arch OBR, measuring the same 150 degr. and consequently the angle at R 15 degr. being equall to the equall legged angle at O; both which together are 30 degr. because they are the complement of the angle OAR (150 degr.) to a semicircle of 180 degr.
Fig. 21 Figure B: The first step
The first step in the computation hence concerns the triangle OAR in Fig. 21. Since point R lies at $60^\circ$ (measured from B), the arc OBR corresponds to an angle of $90^\circ + 60^\circ = 180^\circ - 30^\circ = 150^\circ$. Hence, the angle at A in the triangle OAR is just $90^\circ + (90^\circ - 30^\circ) = 150^\circ$. Since OAR is isosceles, the angles at O and R are identical, and each is $15^\circ$. Let us go on with Wright: Secondly, in the triangle ARS all the sides are given AR the Radius or semidiameter 10,000,000: RS equal to RO the subtense of 150 deg. 19,318,516: and AS equall to AD triple in power to AB, because it is equal in power to AB and BD, that is BO, which is double in power to AB.
The triangle ARS in Fig. 21 is looked at where S lies on the circle around A with radius AD and on the circle around R with radius OR. The segment AR is the radius of the Earth or the “whole sine.” Wright takes this value to be $10^7$. We have to clarify what is meant by subtense and where the number 19,318,516 comes from. Employing the law of sines in triangle OAR, we get
$$\frac{OR}{AR} = \frac{\sin 150^\circ}{\sin 15^\circ},$$
and therefore, it follows that
$$OR = RO = AR \cdot \frac{OR}{AR} = AR \cdot \frac{\sin 150^\circ}{\sin 15^\circ} = 19{,}318{,}516(.5257\ldots).$$
Since O lies on the circle around R with radius OR as S does, we also have $RO = RS$. Furthermore, $AS = AD$ since D, like S, lies on the circle around A with radius AD. Per constructionem, we have $BD = OB$, and using the theorem of Pythagoras, we conclude
$$OB^2 = BD^2 = 2AB^2$$
as well as
$$AD^2 = AB^2 + BD^2 = AB^2 + 2AB^2 = 3AB^2.$$
This reveals the meaning of the phrase triple in power to AB: “the square is three times as big as that of AB.” Hence, it follows for AS:
$$AS = AD = \sqrt{3}\,AB = 17{,}320{,}508(.0757\ldots).$$
It is somewhat interesting that Wright does not compute the square root but gives an alternative mode of computation as follows: Or else thus: The arch OB being 90 degrees, the subtense therof OB, that is, the tangent BD is 14,142,136, which sought in the table of Tangents, shall giue you the angle BAD 54 degr. 44 min. 8 sec. the secant whereof is the line AD that is AS 17,320,508.
In the triangle ABD, we know the lengths of the segments AB and $BD = OB = \sqrt{2}\,AB = 14{,}142{,}135(.6237\ldots)$. Hence, for the angle at A, we get
$$\tan \angle A = \frac{BD}{AB} = \frac{\sqrt{2}\,AB}{AB} = \sqrt{2},$$
which results in $\angle A = 54.7356\ldots^\circ = 54^\circ 44' 8''$. Using this value, it follows from
$$\sin \angle A = \frac{BD}{AD} = \frac{OB}{AD}$$
that
$$AD = AS = \frac{OB}{\sin \angle A} = \frac{\sqrt{2}\,AB}{\sin 54^\circ 44' 8''} = 17{,}320{,}508(.0757\ldots).$$
Fig. 22 Figure C: Second step
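As a quick numerical check of the two subtenses, the following Python sketch recomputes OR from the law of sines and AD by both routes (the square root, and Wright's tangent and secant). The variable names are chosen here for illustration only, and floating-point arithmetic stands in for the tables of sines and tangents that Wright and Briggs would have used.

```python
import math

AB = 10_000_000  # Wright's "whole sine": the radius AR = AB = 10^7

# Subtense OR of the 150-degree arc, from the law of sines in triangle OAR.
OR = AB * math.sin(math.radians(150)) / math.sin(math.radians(15))

# AD from Pythagoras: AD^2 = AB^2 + BD^2 = 3 * AB^2.
AD_root = math.sqrt(3) * AB

# Wright's alternative route: tan(BAD) = BD / AB = sqrt(2), then AD = AB * sec(BAD).
BD = math.sqrt(2) * AB
angle_BAD = math.degrees(math.atan(BD / AB))
AD_secant = AB / math.cos(math.radians(angle_BAD))

print(int(OR), int(AD_root), int(AD_secant))  # 19318516 17320508 17320508
print(round(angle_BAD, 4))                    # 54.7356
```

The integer parts agree with the values 19,318,516 and 17,320,508 quoted from Wright; his tangent 14,142,136 is the rounded, not truncated, value of BD.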
Wright goes on: Now then by 4 Axiom of the 2 booke of Pitisc.1 as the base or greatest side SR 19,318,516 is to ye summe of the two other sides SA and AR 27,320,508; so is the difference of them SX 7,320,508 to the segment of the greatest side SY 10,352,762; which being taken out of SR 19,318,516, there remaineth YR 8,965,754, the halfe whereof RZ 4,482,877, is the Sine of the angle RAZ 26 degr. 38 min. 2 sec. the complement whereof 63 degr. 21 min. 58 sec. is the angle ARZ, which added to the angle ARO 15 degr. maketh the whole angle ORS, 78 degr. 21 min. 58 sec. wherof 60/90 make 52 degr. 14 min. 38 sec. which taken out of ARZ 63 degr. 21 min. 58 sec. there remaineth the angle TRA 11 deg. 7 min. 20 sec. the cõplement whereof is the inclination sought for 78 degrees, 52 minutes, 40 seconds.
The “Axiom 4” mentioned is nothing but the Theorem of chords: If two chords in a circle intersect then the product of the segments of the first chord equals the product of the segments of the other.
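Looking at Fig. 23, the theorem of chords (applied here to the two secants through the point S, which lies outside the circle) is
$$MS \cdot SX = SR \cdot SY,$$
and since $MS = AS + AB$, it follows that
$$(AS + AB) \cdot SX = SR \cdot SY,$$
resulting in
$$\frac{SR}{AS + AB} = \frac{SX}{SY}.$$
Now the computations should be fully intelligible. Given are $AS = 17{,}320{,}508$, $AB = 10^7$, $SR = OR = 19{,}318{,}516$, and $SX = AS - AX = 7{,}320{,}508$. Hence,
$$SY = \frac{SX \cdot (AS + AB)}{SR} = 10{,}352{,}762.$$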
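The proportion can be replayed with Wright's seven-digit integers. The short Python sketch below is an illustration only; rounding the quotient to the nearest integer is an assumption made here, chosen because it reproduces the printed value of SY.

```python
AS = 17_320_508   # sqrt(3) * AB, to Wright's precision
AB = 10_000_000   # the whole sine
SR = 19_318_516   # subtense of 150 degrees

SX = AS - AB              # difference of the sides SA and AR
SM = AS + AB              # sum of the sides SA and AR
SY = round(SX * SM / SR)  # from SM * SX = SR * SY
YR = SR - SY
RZ = YR // 2              # Z is the midpoint of YR

print(SX, SM, SY, YR, RZ)  # 7320508 27320508 10352762 8965754 4482877
```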
1 The Silesian Bartholomäus Pitiscus (1561–1613) authored the first useful textbook on trigonometry: Trigonometriae sive de dimensione triangulorum libri quinque, Frankfurt 1595, which was published as an appendix to a book on astronomy by Abraham Scultetus. First independent editions were published in Frankfurt 1599, 1608, 1612 and in Augsburg 1600. The first English translation appeared in 1630.
Fig. 23 Figure D: The final step
Fig. 24 Figure E
The segment YR has length $YR = SR - SY = 8{,}965{,}754$. Per constructionem, the point Z is the midpoint of YR. Half of YR is $RZ = 4{,}482{,}877$. From $\sin \angle RAZ = RZ/AR = 4{,}482{,}877/10^7 = 0.4482877$, we get $\angle RAZ = 26.6339^\circ = 26^\circ 38' 2''$. In the right-angled triangle ARZ, we see from Fig. 24 that $\angle ARZ = 90^\circ - 26^\circ 38' 2'' = 63^\circ 21' 58''$. At a degree of latitude of $60^\circ$, the angle ARO at R is $15^\circ$ since the obtuse angle in the isosceles triangle ORA is $90^\circ + 60^\circ = 150^\circ$. Therefore, $\angle ORS = \angle ARO + \angle ARZ = 78^\circ 21' 58''$. The part TRS of this angle is 60/90 of it; hence, $\angle TRS = 52^\circ 14' 38''$. We arrive at
$$\angle TRA = \angle ARZ - \angle TRS = 63^\circ 21' 58'' - 52^\circ 14' 38'' = 11^\circ 7' 20''.$$
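The chain of angles for latitude 60° can be followed step by step in Python. This is a sketch under the assumption that ordinary floating-point trigonometry replaces the printed tables; the helper `to_dms` is introduced here only for display. Differences of about one second of arc against the printed values are to be expected, because the printed intermediate angles are given to whole seconds.

```python
import math


def to_dms(angle_deg):
    """Convert decimal degrees to (degrees, minutes, seconds), rounded to whole seconds."""
    total_seconds = round(angle_deg * 3600)
    degrees, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return degrees, minutes, seconds


RZ, AR = 4_482_877, 10_000_000
latitude = 60

RAZ = math.degrees(math.asin(RZ / AR))  # sine of RAZ is RZ / AR
ARZ = 90 - RAZ                          # right-angled triangle ARZ
ARO = (90 - latitude) / 2               # base angle of the isosceles triangle OAR
ORS = ARO + ARZ
TRS = latitude / 90 * ORS               # the part 60/90 of the whole angle ORS
TRA = ARZ - TRS
dip = 90 - TRA                          # the dip is the complement of TRA

for name, value in [("RAZ", RAZ), ("ARZ", ARZ), ("ORS", ORS),
                    ("TRS", TRS), ("TRA", TRA), ("dip", dip)]:
    print(name, to_dms(value))
```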
Fig. 25 Figure F
Fig. 26 Figure G
The dip angle $\delta$ is the complement of the angle TRA, $\delta = 90^\circ - \angle TRA = 78^\circ 52' 40''$. Although the task of computing the dip for a given degree of latitude is now accomplished, we find a final remark on saving of labor:
The Summe and difference of the sides SA and AR being alwaies the same, viz. 27,320,508 and 7,320,508, the product of them shall likewise be alwaies the same, viz. 199,999,997,378,064 to be diuided by ye side SR, that is RO the subtense of RBO. Therefore there may be some labour saued in making the table of magneticall inclination, if in stead of the said product you take continually but ye halfe thereof, that is 99,999,998,689,032, and
so diuide it by halfe the subtense RO, that is, by the sine of halfe the arch OBR. Or rather thus: As halfe the base RS (that is, as the sine of halfe the arch OBR) is to halfe the summe of the other two sides SA & AR 13,660,254, so is half the difference of them 3,660,254 to halfe of the segment SY, which taken out of half the base, there remaineth RZ ye sine of RAZ, whose cõplement to a quadrãt is ye angle sought for ARZ. According to this Diagramme and demonstration was calculated the table here following; the first columne whereof conteineth the height of the pole for euery whole degree; the second columne sheweth the inclination or dipping of the magnetical needle answerable thereto in degr. and minutes.
Although we have taken these computations from Edward Wright’s book, there is no doubt that the author was Henry Briggs as is also clear from the foreword of Wright.
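The chain of steps above can be retraced numerically. The following Python sketch is an illustration added here and is not part of Wright's or Briggs' text. It repeats the computation with the radius 10^7 and the side AS = 17,320,508 quoted above. The names dip_angle and to_dms are hypothetical, and the way the latitude enters (the chord SR subtending an arc of 90° plus the latitude, the angle ARO as half of 90° minus the latitude, and the factor latitude/90) is an assumption generalized from the single worked case of 60°.

```python
import math

RADIUS = 10**7        # AB = AR, the radius used in the text
SIDE_AS = 17_320_508  # side AS as quoted in the text

# Sum and difference of the sides SA and AR; they do not depend on latitude.
SUM_SIDES = SIDE_AS + RADIUS    # 27,320,508
DIFF_SIDES = SIDE_AS - RADIUS   # 7,320,508, the segment SX


def dip_angle(latitude_deg):
    """Return the dip in degrees for a latitude in degrees (0..90).

    Assumption: the latitude enters exactly as in the worked case of 60 degrees.
    """
    # Chord SR = OR subtending the arc of (90 + latitude) degrees.
    sr = 2 * RADIUS * math.sin(math.radians((90 + latitude_deg) / 2))
    # Proportion SR : (SA + AR) = SX : SY.
    sy = SUM_SIDES * DIFF_SIDES / sr
    # Z is the midpoint of YR, where YR = SR - SY.
    rz = (sr - sy) / 2
    angle_raz = math.degrees(math.asin(rz / RADIUS))
    angle_arz = 90 - angle_raz
    # Base angle of the isosceles triangle ORA.
    angle_aro = (90 - latitude_deg) / 2
    angle_ors = angle_aro + angle_arz
    angle_trs = latitude_deg / 90 * angle_ors
    angle_tra = angle_arz - angle_trs
    return 90 - angle_tra


def to_dms(angle_deg):
    """Split an angle in degrees into (degrees, minutes, seconds), rounded to seconds."""
    total_seconds = round(angle_deg * 3600)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return degrees, minutes, seconds


if __name__ == "__main__":
    print(to_dms(dip_angle(60)))
```

For a latitude of 60° the sketch should agree with Wright's 78° 52′ 40″ to within about a second of arc; small differences are to be expected because the sketch does not round intermediate values to seven digits. The product SUM_SIDES * DIFF_SIDES is the constant 199,999,997,378,064 of the remark on saving labor, so only the division by SR changes from one latitude to the next.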
9 Conclusion
The story of the use of magnetic needles for navigation is fascinating and gives deep insight into the nature of scientific inventions. Gilbert’s dip theory and the unfortunate idea of linking latitude to dip are a paradigm of what can go wrong in mathematical modeling. The computation of the dip table is, however, a brilliant piece of mathematics and clearly shows the mastery of Henry Briggs.
References
Balmer H (1956) Beiträge zur Geschichte der Erkenntnis des Erdmagnetismus. Verlag H.R. Sauerländer, Aarau
Campling A (1922) Thomas Blundeville of Newton Flotman, co. Norfolk (1522–1606). Norfolk Archaeol 21:336–360
Gilbert W (1958) De Magnete. Dover, New York
Greenberg JL (1995) The problem of the Earth’s shape from Newton to Clairaut. Cambridge University Press, Cambridge
Hill Ch (1997) Intellectual origins of the English revolution revisited. Clarendon Press, Oxford
Parsons EJS, Morris WF (1939) Edward Wright and his work. Imago Mundi 3:61–71
Pumfrey S (2002) Latitude and the magnetic Earth. Icon Books, Cambridge
Sonar Th (2001) Der fromme Tafelmacher. Logos Verlag, Berlin
Sonar Th (2002) William Gilberts Neigungsinstrument I: Geschichte und Theorie der magnetischen Neigung. Mitteilungen der Math. Gesellschaft in Hamburg, Band XXI/2, 45–68
Taylor EGR (1954) The mathematical practitioners of Tudor & Stuart England. Cambridge University Press, Cambridge
Ward J (1740) The lives of the professors of Gresham College. Johnson Reprint Corporation, London
Waters DJ (1958) The art of navigation. Yale University Press, New Haven
Wright E (1610) Certaine errors in navigation detected and corrected with many additions that were not in the former edition as appeareth in the next pages. Printed by Felix Knights, London
Gauss’ and Weber’s “Atlas of Geomagnetism” (1840) Was not the first: the History of the Geomagnetic Atlases
Karin Reich and Elena Roussanova
Contents
1 Introduction
2 Samuel Dunn (1723–1794)
2.1 Biographical Notes
2.2 Dunn’s Geomagnetic Maps
3 John Churchman (1753–1805)
3.1 Biographical Notes
3.2 The Magnetic Atlas (Philadelphia 1790)
3.3 The Magnetic Atlas (2. Edition, London 1794)
3.4 The Magnetic Atlas (3. Edition, New York, 1800; 4. Edition, London, 1804)
4 Christopher Hansteen (1784–1873)
4.1 Biographical Notes
4.2 The Magnetic Atlas (Christiania 1819)
4.3 Expedition to Russia (1828–1830)
4.4 Hansteen’s Maps
5 Carl Friedrich Gauss (1777–1855) and Wilhelm Weber (1804–1891)
5.1 The Beginning
5.2 The “Magnetic Association” in Göttingen (1834–1843)
5.3 Gauss’ “General Theory of Geomagnetism” (1839)
5.4 Gauss’ and Weber’s “Atlas of Geomagnetism” (Leipzig 1840)
5.5 The End of the “Magnetic Association” in Göttingen
6 Excursus: Berghaus’ “Physical Atlas”
7 The Term “Atlas”
References
K. Reich, Department of Mathematics, University of Hamburg, Germany
E. Roussanova, Saxonian Academy of Sciences and Humanities in Leipzig, Leipzig, Germany
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_94
Abstract
In the beginning there were geomagnetic charts, which were of interest mainly to seafaring nations. The first geomagnetic atlas was printed in London in 1776; its author was the mathematician, cartographer, and astronomer Samuel Dunn, whose aim was to improve navigation, especially in support of England’s trade with the East Indies. The American John Churchman, by contrast, was mainly a surveyor; his magnetic atlas was published in four editions, in 1790, 1794, 1800, and 1804. Churchman was in contact with George Washington and with Thomas Jefferson concerning his geomagnetic charts; he also became a member of the Academy of Sciences in St. Petersburg. Churchman was convinced that the magnetic pole in the north could be found in northern Canada. The Norwegian astronomer and physicist Christopher Hansteen was convinced that there were two magnetic poles in the north and two in the south; his atlas was published in 1819. One of the magnetic poles in the north was supposed to lie in Siberia. Hansteen won the support of the king of Sweden and Norway and was thus able to undertake an expedition to Siberia (1828–1830). Carl Friedrich Gauss and Wilhelm Weber began to study geomagnetism in 1831. They believed that there were only two magnetic poles, one in the north and one in the south, and were able to calculate their positions by means of Gauss’ new theory of geomagnetism (1839); as sailors found out, the calculated coordinates turned out to be nearly correct. Gauss’ and Weber’s atlas is without doubt the most famous; it was published in Leipzig in 1840 and included 18 geomagnetic charts. On two of these charts, equipotential lines were presented for the first time in history.
1 Introduction
It was Gerhard Mercator (1512–1594) who brought the word “atlas” into common use. His posthumously published work “Atlas sive Cosmographicae Meditationes de Fabrica Mundi et Fabricati Figura” (Mercator 1595) paved the way for it to become a part of our everyday language. His atlas contained maps of both the earth and the sky. Since then, the word “atlas” has been understood as a collection of terrestrial and celestial maps. Declination (deviation of the compass needle) has always been particularly important for seafarers. One of the earliest maps to show declination lines, the famous “Tabula nautica,” was drawn by Edmond Halley (1656–1742) and published in 1701. This map was based on a special form of projection known as Mercator projection, first introduced by Mercator in 1569 in his map of the world “Nova et aucta orbis terrae descriptio ad usum navigantium emendate accommodata.” As the title of Mercator’s world map indicated, this kind of projection was extremely important to seafarers, as it was conformal. It also quickly became clear that magnetic declination is not constant but subject to continual change. As early as the eighteenth century, numerous maps were published which were updated versions of the “Tabula nautica,” for example, by
James Dodson (ca. 1705–1757), William Mountaine (ca. 1700–1779), Johann Gustaf Zegollström (1724–1787), etc. Moreover, both world maps and regional maps were published in which declination lines were only marked in certain areas. Inclination also played a role in geomagnetic research. In the eighteenth century, maps were therefore also published with the inclination marked in lines. Intensity maps came into being in the nineteenth century; one of the first was published by Alexander von Humboldt (1769–1859) in 1804 (Hellmann 1895). The very existence of maps with geomagnetic lines suggested the idea of publishing an atlas containing maps of the world and special maps with declination lines, inclination lines, etc. The “Atlas of Geomagnetism” presented by Carl Friedrich Gauss and Wilhelm Weber in 1840 is without a doubt the best known; the maps published in it were exceptionally important. However, mention should also be made of its predecessors, which could hardly have been more different: the atlas published in 1776 by the Englishman Samuel Dunn, which contained 9 maps; the atlas by the American geodesist John Churchman, which contained one or two maps and of which four editions had appeared by 1804; and finally the historical atlas published by the Norwegian physicist and astronomer Christopher Hansteen in 1819, which contained 15 maps. The aims of the authors varied just as much as the atlases themselves. These will be explained later.
2 Samuel Dunn (1723–1794)
2.1 Biographical Notes
For a long time, there was apparently only one biography of Samuel Dunn in existence, which, however, is based on later descriptions, i.e., those published in the “Dictionary of National Biography” (Godwin 1888). Recently, a revised and enlarged biography was published on the Internet (Heard 2013). According to these sources, Dunn was born in 1723 in Crediton in Devonshire, where his father died in 1744. Dunn initially earned his living by directing a school, where he also taught “writing, accounts, navigation, and other mathematical science.” He then moved to the school in Bowdown Hill, where he taught until Christmas 1751. Finally Dunn moved to London, where he taught at various schools and also gave private tuition. In 1757, he invented the “universal planisphere, or terrestrial and celestial globes in plano.” In 1758, Dunn became “master of an academy, for boarding and qualifying young gentlemen in arts, sciences, and languages, and for business at Chelsea.” Dunn was able to use the Ormond Observatory for astronomical observations and saw a comet there in 1760.1 Dunn informed the Royal Society of his astronomical observations. In 1763, he stopped working as a schoolteacher in Chelsea and moved to Brompton Park near Kensington, where he gave private
1 Comet 1759 III [sic], Great Comet, visible from January 7, 1760, to February 11, 1760.
tuition. In 1764, he went on a journey to France. In 1769, the Astronomer Royal Nevil Maskelyne (1732–1811) invited him to observe the transit of Venus. When Dunn published his “New Atlas of the Mundane System; or of Geography and Cosmography” in 1774, he lived at 6 Clement’s Inn, close to Temple Bar. His scientific reputation was by now so great that he was appointed “mathematical examiner of the candidates for the East India Company Service.” This position enabled him to publish more of his work; in 1777 he lived at Covent Garden and in 1780 at 1 Boar’s Head Court, Fleet Street. He died in January 1794; his will was dated January 5, 1794; he was buried on January 23 at St. Dunstan-in-the-West. Dunn published nine treatises in the “Philosophical Transactions of the Royal Society” in London along with numerous astronomical monographs, particularly in the field of practical astronomy; other publications included works and maps (atlases) for seafaring and navigation.
2.2 Dunn’s Geomagnetic Maps
In 1774, Dunn first published in London his work “A New Atlas of the Mundane System, or of Geography and Cosmography: Describing the Heavens and the Earth, the Distances, Motions, and Magnitudes, of the Celestial Bodies; The Various Empires, Kingdoms, States, and Republics, Throughout the Known World: With a Particular Description of the Latest Discoveries; The Whole Elegantly Engraved On Sixty-Two Copper Plates; To These Is Prefixed a General Introduction to Geography and Cosmography, in Which the Elements of These Sciences Are Compendiously Deduced from Original Principles, and Traced from Their Invention to the Latest Improvements. With a general introduction.” This atlas was a great success; the sixth edition was published in 1810, by which time the number of copper plates had grown to 64 (Dunn 1810). Although the atlas was primarily geographical, geomagnetism also played a role in it. The work contains a chapter titled “Magnetic Needle,” which presents Edmond Halley’s “Tabula nautica” from the year 1701 right at the beginning. Dunn then distinguished between three different types of declination lines:
§ 206 Lines of Variation, which are usually delineated on the chart, may be considered as one or the other of these three kinds, namely
1st Lines of Equal Variation, which run nearly Eastward and Westward on the chart.
2dly Crooked or Worm Lines, which run nearly northward and southward on the chart.
3dly Parabolic Lines, which do somewhat resemble the path which a body describes when it is thrown otherwise than perpendicular to the horizon.
Regarding their usefulness, Dunn noted:
§ 207 “The first kind of lines [...] can [...] be but of little use to mariners.”
§ 208 “The second kind [...] are of excellent use.”
§ 209 “The third kind: parabolic lines, are useful in the same manner as the worm lines before described” (Dunn 1810, p. 13).
Fig. 1 Dunn, Samuel: A new Atlas of the Mundane system. 6 Ed. with additions and considerable improvements (London 1810): cosmography epitomised, in six copperplate delineations. Chart after page 22, right column. State Library Berlin, shelf mark 2° Kart. B 902
Dunn’s “A New Atlas” contains a plate “Cosmography Epitomised” with pictures of a compass, the worm lines, and the parabolic lines on the variation chart (right column) (Fig. 1). His work “The Navigator’s Guide to the Oriental or Indian Seas: Or, the Description and Use of a Variation Chart of the Magnetic Needle, Designed for Shewing the Longitude, Throughout the Principal Parts of the Atlantic, Ethiopic, and Southern Oceans, Within a Degree, or Sixty Miles. With an Introductory Discourse, Concerning the Discovery of the Magnetic Variation, the finding the Longitude Thereby, and Several Useful Tables” was published in London the following year (Dunn 1775). It did not contain any maps; however, two folios prepared by Dunn, i.e., variation maps of the North and South Atlantic oceans, respectively, appeared that same year (Hellmann 1895, p. 22). These maps were also included in Dunn’s groundbreaking book of geomagnetic maps, “A New Atlas of Variations of the Magnetic Needle for the Atlantic, Ethiopic, Southern and Indian Oceans, Drawn from a Theory of the Magnetic System, Discovered and Applied to Navigation” (London 1776, Fig. 2). Even in the title, Dunn emphasized that his maps were based on a new “theory of the magnetic system” he had evolved himself. He also utilized the astronomical and
Fig. 2 Title page of Samuel Dunn’s “A New Atlas” (London 1776). The royal library of Copenhagen, shelf mark KBK 2–852, x-2013/28 (Photography by Henrik Dupont)
magnetic observations made by the captains of the ships in the service of the East India Company. The atlas was intended to make it easier for seafarers to navigate to the East Indies, as the declination lines enabled them to establish their longitude on these oceans to the nearest degree or 60 miles. This work was not published by a publishing company; instead, the author had it printed and distributed via “Maiden Lane, Covent Garden.” This is probably why only a few copies found their way into libraries. This atlas begins with a letter “To the Honourable the Court of Directors of the United Company of Merchants of England trading to the EAST INDIES” which dates from November 6, 1776. Dunn further reports:
Under whose Predecessors, near two Centuries since, the British Mathematician, Edward Wright of London, published his Invention of the true Sea Chart, commonly called Mercator’s, changing the Angles made by the Meridians and Rhumb-lines2 into Rectilinear ones and thereby reducing the whole Process of Navigation, or the Art of Sailing on the Oceans, to the Doctrine of Plane Triangles; AND WHOSE PATRONAGE HATH ENCOURAGED THE PUBLICATION OF THIS BOOK. (Dunn 1776, p. IV)
2 Rhumb line, i.e., loxodrome.
The Edward Wright (1561–1615) already quoted here was a mathematician and geodesist; he was one of the authors to whom Dunn felt he owed a particular debt. In 1589, Wright took part in an expedition to the Azores; he published his findings in 1599 in his work “Certaine Errors of Navigation,” a second edition of which was published in 1610 and a third in 1657. In this work, Wright gave a detailed explanation of what he understood by “Mercator projection” and described the five most important errors which were critical for establishing a ship’s exact position at sea. As Dunn also used this type of projection, he provided a detailed description in his “Introduction”:
The first good Effect arising from this Invention of the true Sea Chart is, that in it the Rhumb-line Bearings are straight Lines, and consequently, by Help of a straight Rule and a Pair of Compasses, those Bearings are easily shewn by the Chart. [...] The second good Effect is, that all the Cases of Sailing are solved by Proportions [...] Another Advantage which this Chart has: the Course may be accurately set off on it, as also the Distance sailed, by observing a proper Method. (Dunn 1776, pp. V–VI)
Mercator projection is indeed conformal, which made it particularly important for navigation at sea. The first map reproduced in Dunn’s atlas is therefore nothing other than Wright’s map; like Wright’s, it contains no magnetic declination lines. Dunn’s “Introduction” consists of 18 chapters. Dunn finally gets to the point in the ninth chapter:
The Variation of the Magnetic Needle, or its Horizontal Deflection from the true Meridian of the Place of Observation, hath been considered as of the greatest Importance in Navigation, by every Man of either Learning or Ingenuity in Mathematics, Philosophy, or Nautical Affairs, who hath given his Opinion concerning it. Hypotheses have been formed, but none of them have agreed with Observations, nor hath it been possible to draw the Variation-lines by them.
Dunn based his maps on a new theory which he described as follows: By the Word Theory, I mean what indicates the Cause of the Variation at different Places and Times, and plainly demonstrates it by Mathematical Principles and Philosophical Laws. The drawing of accurate Variation Charts of the Magnetic Needle by a Theory must sound very strangely to Philosophers, as nothing of this Kind hath hitherto been thought possible.
Dunn mentioned another advantage of his method (chapter 14): Another Advantage which ariseth from the Discovery of a Theory of the Variation is, that the Variation Charts may not only be drawn from a few Observations, but they may be drawn for Years past and to come; from which ariseth, an easy Method of making them applicable for the intermediate Years, with Errors which are very inconsiderable.
Ultimately Dunn came to the conclusion (chapter 17): Charts properly constructed to lesser and greater Scales, after Wright’s Manner, will be of Use to Navigators, in indicating the course to be sailed from one Place to another, whether Distances be great or small. Variation Charts of the Magnetic Needle, accurately and properly drawn, will be of Use in allowing for the Variation. Both of these being applied at Sea, will enable the Navigator to pursue his Voyage in cloudy Weather, or when no Astronomical Observations can be made, with that Certainty which otherwise he cannot expect. It is the Plan and Design of this Work to institute such, but to complete it will require some Time and Judgement. (Dunn 1776, p. VI)
Fig. 3 Chart 2: Variation chart of the Atlantic Ethiopic and Indian Oceans for the year 1770 delineated according to Mercator’s or Wright’s projections agreeable with the latest and best observations by S. Dunn (November 6, 1776). The royal library of Copenhagen, shelf mark KBK 2–852, x-2013/28. N.B. This is the first variation chart of those seas that has ever been drawn by a theory and found to agree nearly with observations (Photography by Henrik Dupont)
The introduction is followed by 9 maps measuring about 50 × 60 cm each:
1. A Wright’s Chart of the Atlantic Ethiopic and Indian Oceans (November 6, 1776).
2. Variation Chart of the Atlantic Ethiopic and Indian Oceans for the year 1770, delineated according to Mercator’s or Wright’s projections agreeable with the latest and best observations by S. Dunn (November 6, 1776) (Fig. 3).
3. A Variation Chart of the Atlantic Ethiopic and Indian Oceans for the year 1800 (November 6, 1776).
4. A Variation Chart of the Atlantic Ocean for the year 1776 (November 6, 1776).
5. A Variation Chart of the Atlantic Ethiopic and Indian Oceans for the year 1776 (November 6, 1776).
6. Continuation of plate (5). This chart is designed for determining the longitude in those seas within a degree or 60 miles (November 10, 1775).
7. A Variation Chart of the Indian Ocean for South of the Line for 1776 (November 6, 1776) (Fig. 4).
Fig. 4 Chart 7: A variation chart of the Indian Ocean for South of the line for 1776 (November 6, 1776). The royal library of Copenhagen, shelf mark KBK 2–852, x-2013/28 (Photography by Henrik Dupont)
8. Continuation of plate (7). The lines in the west part of this chart are designed for determining the longitude (November 6, 1776).
9. India with the magnetic variations for 1776 (November 6, 1776).
Hellmann’s comment on this work “A New Atlas” was: “This rare atlas contains seven [sic] large-scale declination maps of superb technical execution; magnetic maps on a larger scale have possibly never been published before” (Hellmann 1895, p. 22).3 Similar maps on a much smaller scale are found in Dunn’s work “A new epitome of practical navigation; or guide to the Indian Seas,” published in London in 1777 (Dunn 1777):
Plate 12: A miniature variation chart of the Atlantic Ocean for the years from 1770 to 1820
Plate 13: A miniature variation chart of the Ethiopic Ocean for the years 1770 to 1820
Plate 14: A miniature variation chart of the Indian Ocean for the years from 1770 to 1820
3 In the original German: “Dieser seltene Atlas enthält sieben Deklinationskarten grossen Maassstabes in vorzüglicher technischer Ausführung; vielleicht sind magnetische Karten in grösserem Maassstabe niemals publicirt worden” (Hellmann 1895, p. 22).
3 John Churchman (1753–1805)
3.1 Biographical Notes
Unless otherwise mentioned, the following description is based on Silvio Bedini’s two-part article (Bedini 2000). Churchman came from a Quaker family which settled in Nottingham, Maryland, in 1704. Three generations of this family were surveyors: born in 1753, John was John III of his family and the most famous of its surveyors. John III, from now on referred to merely as John, learnt surveying from his father George (1730–1814). John published numerous maps, including a map of the peninsula between Delaware and Chesapeake; these in particular helped to make him famous. In 1779, John presented a “memorial” to the “American Philosophical Society” (founded only in 1743) with a request to have his map published. This petition was granted, and two editions of the map were published in the years 1786 and 1787 with the following inscription: “To the American Philosophical Society this Map of the Peninsula between Delaware and Chesopeak Bays with said Bays and Shores adjacent drawn from the most Accurate Survey is humbly inscribed by John Churchman.” This map was reprinted several times by the US Geological Survey with the title “Delaware at the Time of the Ratification of the Constitution.” Besides working as a surveyor, John also ran numerous businesses, particularly in land: he bought land on a grand scale and saw to it that it was resold. John Churchman occupied himself with geomagnetism from the late 1770s on. He developed a theory of the magnetic needle which aimed to improve the accuracy with which longitude could be established at sea. This was mainly to be facilitated by new maps marked with declination lines. In 1777, an early exposition of his geomagnetic theory appeared in the Philadelphia Press. Here he assumed that there were two satellites, moons, orbiting the earth, one around the North Pole and the other around the South Pole. In 1785, he made a perpetuum mobile which was driven by magnetic forces.
In 1787, he presented a new theory of magnetic needle declination to the “American Philosophical Society.” However, this theory was not supported by the respective scholars; David Rittenhouse (1732–1796) was one of those who rejected it.4
4 David Rittenhouse became a member of the “American Philosophical Society” in 1768; he was its president from 1791 to 1796. He was also an astronomer at the University of the State of Pennsylvania, from where the College of Philadelphia was founded in 1791.
However, Churchman was not discouraged; instead, he sent letters aiming to obtain the support of prominent personages and scientists. This strategy was successful. He published some of the replies he received himself (Churchman (1790), Appendix, pp. 1–5 as well as Churchman (1794), pp. 65–76); others were reproduced in exchanges of letters which have only recently been published. April 10, 1787, is the date of Churchman’s first letter to Joseph Banks (1743–1820), who became President of the Royal Society in 1778 and held this office until 1820. We know of seven letters exchanged by Churchman and Banks (Banks 2007):
– Churchman to Banks, April 10, 1787, Philadelphia (letter no. 723)
– Banks to Churchman, September 1, 1787, Soho Square (letter no. 768)
– Churchman to Banks, including a “Memorial To the Honourable Commissioners of Longitude for the Nation of England,” September 29, 1787, Philadelphia (letter no. 776)
– Banks to Churchman, without date, Soho Square (letter no. 777)
– Churchman to Banks, May 8, 1788, Philadelphia (letter no. 840)
– Churchman to Banks, January 7, 1792, Philadelphia (letter no. 1086)
– Churchman to Banks, March 8, 1804, Boston (letter no. 1767)
Banks’ response was very positive; he invited Churchman to visit England. Churchman accepted this invitation and stayed in London from 1792 to 1796. In 1787, Churchman also contacted his compatriot Thomas Jefferson (1743–1826), whom he initially addressed as “Dear Friend,” “Esteemed Friend,” and later as “My Honourable Friend.” Jefferson was stationed in Paris as a diplomat from 1785 to 1789; in 1801, he became the third President of the USA, an office which he held until 1809. Ten letters are known to have passed between Churchman and Jefferson between 1787 and 1802 (Jefferson 1950–2013):
– Churchman to Jefferson, June 6, 1787, with an enclosure from April 10, 1787, Philadelphia (vol. 11, pp. 397–399)
– Jefferson to Churchman, August 8, 1787, Paris (vol. 12, pp. 5–6)
– Churchman to Jefferson, November 22, 1787, Philadelphia (vol. 11, pp. 374–375)
– Churchman to Jefferson, May 15, 1789, Philadelphia (vol. 15, pp. 129–130)
– Jefferson to Churchman, September 18, 1789, Paris (vol. 15, pp. 439–440)
– Jefferson to Churchman, [November 24, 1790] (vol. 18, p. 68)
– Churchman to Jefferson, January 13, 1791, South 2nd Street No.183 (vol. 18, pp. 492–493)
– Churchman to Jefferson, April 2, 1792, No.183 South 2nd Street (vol. 23, pp. 363–364)
– Jefferson to Churchman, April 4, 1792, Philadelphia (vol. 23, pp. 369–370)
– Churchman to Jefferson, May 7, 1802, Boston (vol. 37, pp. 424–426)
3.2 The Magnetic Atlas (Philadelphia 1790)
Churchman first published his work “An Explanation of the Magnetic Atlas, or Variation Chart, Hereunto Annexed Projected on a Plan Entirely New, by Which the Magnetic Variation on any Part of the Globe May Be Precisely Determined” in Philadelphia in 1790. This work was accompanied by a magnificent large map with the inscription “George Washington, President of the United States this magnetic Atlas or variation Chart Is humbly inscribed by John Churchman” (Fig. 5). This map, which showed only the northern hemisphere, had a diameter of 60.5 cm and consisted of 12 strips which were probably meant to be affixed to a globe; the diameter of the globe would probably have been about 39 cm. According to the great magnetic map expert Gustav Hellmann, this was the first map with declination
Fig. 5 Dedication: “George Washington, President of the United States of America. This magnetic Atlas or variation chart is humbly inscribed by John Churchman” (Philadelphia 1790). State library Berlin, shelf mark W 780
lines ever published in America (Hellmann 1895, p. 22). Churchman had already contacted Washington one year previously, in 1789; George Washington (1732–1799) was the first President of the newly founded USA from 1789 to 1797. In all, 8 letters from the period between 1789 and 1792 have survived; however, rather than answering himself, Washington left his secretary Tobias Lear (1762–1816) to write his replies (Washington 1983–2011):
– Churchman to Washington, May 7, 1789, New York, Water Street No.66 (vol. 2, pp. 225–227)
– Churchman to Washington, August 9, 1790, Philadelphia (vol. 6, p. 222)
– Lear to Churchman, August 28, 1790 (vol. 6, p. 222)
– Churchman to Lear, September 8, 1791, Philadelphia (vol. 8, pp. 512–514)
– Lear to Churchman, September 10, 1791, Philadelphia (vol. 8, pp. 512–514)
– Churchman to Washington, December 29, 1791 (vol. 9, pp. 342–344)
– Churchman to Washington, July 14, 1792, Bank Street Baltimore (vol. 10, p. 540)
– Churchman to Washington, September 5, 1792, Baltimore (vol. 11, pp. 71–74)
At that time, a map with declination lines, i.e., with geomagnetic data, was so important that it was presented to the President of the USA as a gift, which was accepted with all due consequence. On August 9, 1790, Churchman wrote to Washington from Philadelphia, enclosing his recent publication as a token of his best respects for the president:
Being convinced that no name would be likely to stamp so great a value on the work as that of the personage to whom it was dedicated, he hopes to be pardoned for the Liberty which he has taken in this respect.
Secretary Lear responded on Washington’s behalf:
The President of the United States has received a Copy of the Magnetic Atlas or Variation Chart, together with the book of explanation which you have been so polite as to send him and requests your acceptance of his thanks for the same. I am, moreover, ordered by the President to inform you, that being ever desirous of encouraging such publications as tend to promote useful knowledge, he requests you will consider him as a subscriber of your work. (Washington 1983–2011, vol. 6, p. 222; also in Churchman 1794, pp. 71–72)
On this map (Fig. 5), the declination lines are drawn quite densely on the oceans but only at large intervals on the continents; none of the declination lines are specially marked, nor is the zero line. There is one and only one magnetic point close to Baffin Bay, at a latitude of about 76° and a longitude of about 78° west of Greenwich. In his letter to George Washington dated December 29, 1791, Churchman outlined the following plan: he wanted to organize an expedition to Baffin Bay to find the magnetic North Pole which was presumably located there (Washington 1983–2011, vol. 9, pp. 342–344). However, Washington did not respond to this proposal.
In his document “Explanation of the Magnetic Atlas,” Churchman quoted his predecessors, including Dunn, but only his “A New Atlas of the Mundane System” in the edition of 1788. Moreover, Churchman gave a particularly detailed explanation of the geomagnetic theory published by Leonhard Euler (1707–1783); Euler was the first scientist to produce a map with declination lines projected stereographically (Reich and Roussanova 2012, pp. 147–148).
Churchman apparently had also sent his essay “Explanation of the Magnetic Atlas” to the Academy of Sciences in St. Petersburg. At that time, Johann Albrecht Euler (1734–1800), oldest son of the mathematician Leonhard Euler, held the post of Permanent Secretary. Like his father, Johann Albrecht was particularly interested in geomagnetism. On January 31, 1791, the minutes of the meetings held at the Academy in St. Petersburg recorded the following:
Monsieur le Conseiller de Collèges Roumovsky⁵ remit un extrait en langue russe que Monsieur l’Adjoint Konoff⁶ a fait d’un imprimé anglois de Monsieur Churchman: An Explanation of the Magnetic Atlas or variation chart etc. by John Churchman Philadelphia 1790; que Son Altesse Madame la Princesse de Daschkow⁷ avoit reçu pendant les dernières vacances, et envoyé à Messieurs les Académiciens, pour qu’ils l’examinent et en disent leur sentiment. Monsieur Churchman donne dans cet imprimé une méthode aisée de déterminer par le moyen de quelques tables et d’une sphère la déclinaison de l’aiguille magnétique pour un lieu proposé quelconque et à chaque temps donné: il s’agissoit donc d’examiner si cet imprimé répond pleinement à la question physico-mathématique que Madame la Princesse avoit choisie pour le prix académique de 1793⁸ et s’il n’en rend pas la publication superflue.
Messieurs les Académiciens chargés de cet examen rapportèrent donc en conformité de cet ordre, qu’ayant trouvé que les temps périodiques des deux points magnétiques de la Terre, sur lesquels Monsieur Churchman a dressé ses tables et sa carte magnétique, ne sont fondés que sur des observations faites en deux temps, années 1657 et 1790, et qu’ils en ont été même déterminés par un calcul très sujet à caution, ils sont tous d’avis que l’imprimé de Monsieur Churchman ne doit pas empêcher l’Académie de proposer la question telle qu’elle a été donnée par Monsieur l’Academicien Krafft.⁹ Le Secrétaire¹⁰ fut en conséquence chargé d’en dresser le programme et de le soumettre à l’approbation de la Conférence pour être imprimé et publié. (Procès-verbaux 1911, pp. 251–252)
5 Stepan Jakovlevich Rumovskij (1734–1812), astronomer; in 1753 he became assistant at the Academy of Sciences in St. Petersburg; during the years 1754 to 1756 he was guest scholar of Leonhard Euler in Berlin; in 1756 he succeeded Michail Lomonosov as director of the Geographical Department at the Academy of Sciences in St. Petersburg; in 1763 he became extraordinary and in 1767 ordinary professor at the Academy; during the years 1800–1803, he acted as its vice-president.
6 His correct name is Kononov and not Konoff. Aleksej Kononovich Kononov (1766–1795), physicist; from 1789 he was assistant and from 1795 extraordinary professor at the Academy of Sciences in St. Petersburg.
7 The princess Yekaterina Romanovna Dashkova (1743–1810) was directress of the Academy of Sciences in St. Petersburg from 1783 to 1796.
8 The Academy of Sciences in St. Petersburg had posed the following prize question for the year 1793: to present a magnetic chart of the world for the beginning of the nineteenth century, where the magnetic poles were indicated. This chart should be similar to the “Tabula nautica,” published by Edmond Halley in 1701 (Procès-verbaux 1911, pp. 256–258). The original text was published in Latin and in Russian; a German translation in Reich and Roussanova (2012), p. 142.
9 Wolfgang Ludwig Krafft (1743–1814), physicist at the Academy of Sciences in St. Petersburg.
10 Johann Albrecht Euler.
Churchman was informed about this meeting; Yekaterina Dashkova notified him on February 27/March 11, 1791:
The Contents of your letter, which we received with the enclosed Magnetic Atlas, and its explanation, in due time, were the more interesting and agreeable to the Imperial Academy of Sciences, as the same matter is the subject of a Premium even now proposed by our Academy, as you will see by the printed advertisement I send you herewith. The progress you have already made gives me a pleasant hope, this important matter will derive no small increase from your ingenious works; and I make no doubt but your labours will greatly contribute to the final solution of this question. By the communication of your further enquiries and discoveries, especially relating to the southern hemisphere, the calculation of an universal set of tables, and the ascertaining of the exact revolutions of the two magnetic points round the poles of the earth, by a greater number of observations, you will very much oblige your humble servant, princess of Daschkaw. (Churchman 1794, p. 74)
However, Churchman did not receive the prize awarded in 1793; instead, it went to the Copenhagen-based physicist Christian Gottlieb Kratzenstein (1723–1795).
The most important scientific institution in Philadelphia was the one later known as the “American Philosophical Society,” cofounded by Benjamin Franklin (1706–1790) in 1743. It should be mentioned that Yekaterina Dashkova and Benjamin Franklin met in Paris on February 3, 1781. In consequence, Dashkova became the first female and first Russian member of the “American Philosophical Society,” while Franklin was the first American to become a member of the Academy of Sciences in St. Petersburg. Franklin and Dashkova maintained a constant correspondence, which was one of the highlights presented at the exhibition “The Princess and the Patriot: Ekaterina Dashkova, Benjamin Franklin, and the Age of Enlightenment” in Philadelphia in 2006 (Prince 2006).
After his magnetic atlas, Churchman wanted to publish a book about gravity. He sent a prospectus to St. Petersburg:
Son Altesse Madame la Princesse de Daschkow envoya pour être présenté de la part de Monsieur Churchman, auteur de l’Atlas magnétique, une feuille imprimée en anglois contenant Proposals for publishing a Dissertation on Gravitation containing conjectures concerning the case of the several kinds of attraction. Le Secrétaire fera circuler ce prospectus parmi Messieurs les Académiciens et Adjoints. (Procès-verbaux 1911, p. 286, January 16, 1792)
However, this work was never published.
3.3
The Magnetic Atlas (2nd Edition, London 1794)
While Churchman was staying in England between 1792 and 1796, a second edition of his “Magnetic Atlas” appeared in London in 1794. This second edition contained two geomagnetic maps, one of the northern hemisphere (Fig. 6) and the other of the southern hemisphere (Fig. 7).
Fig. 6 John Churchman: The magnetic Atlas, or variation charts, 2nd edition (London 1794): Northern Hemisphere. State Library Berlin, shelf mark 4° My 3506
Both maps are dated July 1, 1794, measure approx. 59.5 cm in diameter, and – like the map published in 1790 – consisted of 12 strips which would have covered a complete globe. It must be noted that the map portraying the northern hemisphere (Fig. 6) is not identical with the map drawn for George Washington in 1790 (Fig. 5). The map of the northern hemisphere indicates a “magnetic pole” located in Northern Canada; a circle has been drawn around it on which the magnetic point and the magnetic nadir have been marked. The migration of the magnetic North Pole is also marked, i.e., its position in 1600, 1650, 1700, 1750, 1800, 1850, and 1900. The “first magnetic meridian for 1700” and the “first magnetic meridian for 1794” both pass through West Africa. A circle called the “magnetic orbit” was again drawn around the magnetic North Pole, and the magnetic point and magnetic nadir were marked on it.
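The strip construction described for both editions allows a rough consistency check of the globe sizes. The following is only a minimal sketch under an assumption that is not stated in the source: the radius of each hemisphere map is taken to equal the arc from pole to equator, i.e., a quarter of the globe's circumference, so that the map diameter is π/2 times the globe diameter. The function name `globe_diameter_cm` is hypothetical.

```python
import math


def globe_diameter_cm(map_diameter_cm: float) -> float:
    """Globe diameter implied by a hemisphere map assembled from strips.

    Assumption (ours, not the source's): the map radius equals the
    pole-to-equator arc, a quarter of the globe's circumference, so
    map_diameter = pi * globe_diameter / 2.
    """
    return 2.0 * map_diameter_cm / math.pi


if __name__ == "__main__":
    # Map diameters quoted in the text for the 1790 and 1794 editions.
    for label, diameter in (("1790 edition", 60.5), ("1794 edition", 59.5)):
        print(f"{label}: map {diameter} cm -> globe about "
              f"{globe_diameter_cm(diameter):.1f} cm")
```

Under this assumption the 60.5 cm map of 1790 corresponds to a globe of roughly 38.5 cm, which agrees with the figure of about 39 cm quoted for the first edition; the 59.5 cm maps of 1794 would imply a slightly smaller globe of roughly 38 cm.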
Fig. 7 John Churchman: The magnetic Atlas, or variation charts, 2nd edition (London 1794): Southern Hemisphere. State Library Berlin, shelf mark 4° My 3506
With regard to the southern hemisphere, the magnetic South Pole is located south of West Australia between 100° and 110° east of Greenwich; as in the map of the northern hemisphere, a magnetic orbit is drawn and a magnetic point and magnetic nadir marked. Close to the magnetic point, there is a text stating “This Magnetic Point is fixed by more authentic observations on the Chart than in page 40 & 41.”
Churchman sent copies of the second edition of his “Magnetic Atlas” to President Washington and his friend Jefferson. Jefferson gave his volume to the “American Philosophical Society” as a gift (Washington 1983–2011, vol. 2, p. 222 and Jefferson 1950–2013, vol. 37, p. 425). Yet another copy of the second edition went to the Academy of Sciences in St. Petersburg, whereupon he was elected an honorary member of the Academy in 1795 by seven votes to four. It seems he was the second American after Benjamin Franklin to be honored in this way. The minutes record:
Monsieur le Directeur en fonction et gentil-homme de la Chambre de S. M. J. Paul de Bacounin,¹¹ fit communiquer une lettre angloise adressée à Madame la Princesse de Daschkow par Monsieur Churchman et datée de Londre le 21 octobre, laquelle Son Altesse lui a envoyée de Moscou pour la faire lire à la Conférence, et lui proposer ensuite ce physicien américain, dont elle a déjà reçu diverses choses, pour être reçu au nombre de ces associés externes. Monsieur Churchman rapporte avoir envoyé à l’Académie un atlas magnétique, relativement auquel il fait quelques remarques sur la situation des pôles magnétiques du globe terrestre; il propose ensuite d’envoyer dans la partie occidentale de l’Amérique septentrionale un observateur pour y determiner la déclinaison et l’inclinaison de l’aiguille aimantée et il s’offre de s’y rendre lui-même si l’Académie trouve bon de lui bonifier les frais du voyage; il raconte à cette occasion qu’un jeune négociant anglois a déjà fait par terre le trajet remarquable de l’Amérique septentrionale. Enfin Madame la Princesse de Daschkow lui ayant fait présent d’un exemplaire complet des Actes académiques, il indique un comptoir marchand à St-Pétersbourg, auquel l’Académie est priée de le remettre. La Conférence n’ayant aucune résolution à prendre ni réponse à faire aux propositions de Monsieur Churchman, elle procéda à son élection par voie ordinaire du scrutin et Monsieur John Churchman fut reçu au nombre des Académiciens étrangers par sept voix contre quatre. (Procès-verbaux 1911, p. 409, January 8, 1795)
Like Benjamin Franklin, Churchman too owed his nomination as a member of the Academy of Sciences in St. Petersburg to Yekaterina Dashkova (Prince 2006, pp. 15–16). This second edition of Churchman’s “The Magnetic Atlas” was sent to a list of subscribers, including the scientists Joseph Banks and William Herschel (1738–1822) in England and the Academies in Berlin, Lisbon, Copenhagen, etc.
3.4
The Magnetic Atlas (3rd Edition, New York, 1800; 4th Edition, London, 1804)
After Churchman’s return to the USA, a third edition of his “Magnetic Atlas or Variation Charts” was published in 1800 in New York with some additions; the title page, for example, referred to the author as “Fellow of the Russian Imperial Academy.” According to a letter to Banks dated February 4, 1801, Churchman sent a copy of this edition to the Royal Society (Banks 2007, vol. 5, p. 343). He also sent a copy to his friend Jefferson, who was now the President of the USA (Jefferson 1950–2013, vol. 37, p. 425). Still another copy is located in Carl Friedrich Gauss’ private library in Göttingen (Fig. 8).
In 1802, Churchman went to Europe again; his itinerary included both Copenhagen and St. Petersburg. Nikolaus Fuss (1755–1826), who was married to one of Leonhard Euler’s granddaughters, had been Permanent Secretary in St. Petersburg since 1800. The minutes of the Academy of Sciences in St. Petersburg record:
Le Secrétaire présente de la part de Mr. John Churchman, membre externe de l’Académie, arrivé de Philadelphie: A variation chart, by John Churchman, Imperial Russian Academician. Le Secrétaire notifia en même tems qu’il s’est fait délivrer, avec la permission de Son Excellence Mr. le Président, six exemplaires de la carte magnétique de feu Mr. Kratzenstein, publiée par l’Académie, afin de les remettre à ce physicien américain qui travaille depuis dix ans au perfectionnement d’une carte pareille. (Procès-verbaux 1911, p. 1011, August 18, 1802)
Fig. 8 Title page of John Churchman: The magnetic Atlas or variation charts of the whole terraqueous globe: comprising a system of the variation and dip of the needle by which the observations being truly made, the longitude may be ascertained. New York 1800. State and University Library Göttingen, Gauss library no. 130
11 Pavel Petrovich Bakunin (1776–1805), from 1794 vice-director of the Academy of Sciences in St. Petersburg, from 1796 to 1798 director.
Churchman was accordingly given Kratzenstein’s map, which had been published in St. Petersburg in 1795, as a gift.¹² Churchman probably spent the winter in St. Petersburg, as the “National Cyclopedia” reports:
He spent the winter in that high latitude perfecting his observations and corresponding with several European philosophers, the main object with all being the discovery of the law governing the constant variation, dip and declination of the magnetic needle in different parts of the earth. (Anonymus 1907)
12
Christian Gottlieb Kratzenstein: “Mappa exhibens declinationes acus magneticae ad initium saeculi decimi noni pro obtinendo praemio ab Academia Scientiarum imperiali Petropolitana ad annum 1793 proposito.” Kept at the State Library in Berlin, shelf mark 2° Kart. W 750. This map was accompanied by Kratzenstein’s essay “Tentamen, resolvendi problema geographico-magneticum a perillustri Academia imperiali Petropolitana in annum 1793 propositum” (St. Petersburg 1798).
Another journey to England was the occasion for the publication of the fourth and final edition of Churchman’s “Magnetic Atlas” in London in 1804; it included yet more additions. Churchman died on July 17, 1805, on board the ship taking him back to the USA; he was accordingly buried at sea.
4
Christopher Hansteen (1784–1873)
4.1
Biographical Notes
Born on September 26, 1784, in Christiania (now Oslo), Hansteen studied at Copenhagen University, where his most important teachers were the astronomer Thomas Bugge (1740–1815) and the physicist Hans Christian Oersted (1777–1851). In 1806, Hansteen started teaching at the Latin School in Frederiksborg in the north of the island of Zealand in Denmark. Even then, he was already making his first observations and compiling his first thoughts on geomagnetism. A map with declination lines, drawn by Hansteen by hand and dated April 2, 1810, still exists: “Mappa exhibens declinationis magneticas pro anno 1730; ad Meridianum Londini secundum tabulas Mountainiis et Dodsonis et observationes Middletonii constructa”¹³ (Enebakk and Johansen 2011, between the pages 16 and 17).
In 1811, the Danish Academy of Sciences in Copenhagen posed the following prize question: “Are we obliged to assume that magnetic phenomena can only be explained by the existence of several magnetic axes in the earth, or is one enough?” Hansteen was awarded the prize in 1812. In 1814, Hansteen became a lecturer at the new university in Christiania, which had only been founded in 1811. In 1816, he was appointed Professor of Astronomy and Applied Mathematics there and also became the director of the observatory. In 1819, he had the opportunity to take a long journey and visited London and Paris (Enebakk and Johansen 2011).
Hansteen’s radical, monumental work “Investigations into Terrestrial Magnetism” (“Untersuchungen über den Magnetismus der Erde”) (Hansteen 1819, Fig. 9) was a detailed answer to the prize question posed by the Danish Academy in 1811. In the preface, Hansteen explained what had inspired his interest in geomagnetism: this was a globe at the school in Frederiksborg, on which an area close to the South Pole was marked with an ellipse; its focal points were the earth’s two magnetic South Poles, one stronger (regio fortior), one weaker (regio debilior) (Hansteen 1819, pp. VII–XII).
Hansteen himself remained a lifelong advocate of Halley’s theory that the earth had two magnetic axes, i.e., that it had four magnetic poles, two in the north and two in the south. Hansteen’s vast work deals with the magnetic phenomena known at that time; one lengthy chapter is devoted to Leonhard Euler’s geomagnetic theory. The “Investigations” has an appendix in which Hansteen recorded all the magnetic observation data obtained at sea and on land which was available to him. This appendix, i.e., the observation data, is the basis on which the maps published in his atlas were drawn. The observation data included data recorded by ship’s captains at sea and data obtained by scientists. It is worth noting that Hansteen also mentioned the observation data obtained by Edward Wright on his voyage to the Azores in 1589 (Hansteen 1819, Appendix, p. 41).
Hansteen’s work triggered many discussions and reactions, for example, from David Brewster (1781–1868) in Edinburgh, Johann Tobias Mayer (1752–1830) in Göttingen, Ludwig Wilhelm Gilbert (1769–1824) in Leipzig, Caspar Horner (1774–1834) in Zurich, and Edward Sabine (1788–1883) in Britain (Reich and Roussanova 2015, Chap. 2.6). The “Investigations into Terrestrial Magnetism” was the first detailed work devoted solely to geomagnetism; it therefore played a significant role which can by no means be overestimated. It was largely responsible for many physicists and astronomers beginning to investigate the topic of geomagnetism during the ensuing period. One of these was Gauss.
Fig. 9 Title page of Hansteen’s “Investigations into Terrestrial Magnetism.” State and University Library Göttingen, Gauss library no. 856
13 William Mountaine (ca. 1700–1779); James Dodson (ca. 1705–1757); Christopher Middleton (died in 1770).
4.2
The Magnetic Atlas (Christiania 1819)
Hansteen’s “Investigations into Terrestrial Magnetism” was accompanied by a “Magnetic Atlas” (Magnetischer Atlas gehörig zum Magnetismus der Erde) which appeared the same year (1819). Neither Dunn’s “A new atlas of variations of the magnetic needle” nor Churchman’s atlas was particularly important for Hansteen; however, he is sure to have been familiar with Dunn’s work, which was available in Copenhagen. Hansteen also quoted Samuel Dunn in a letter to Ludwig Wilhelm Gilbert dated April 15, 1823 (Hansteen 1823, p. 159).
Unlike its predecessors, Hansteen’s “Magnetic Atlas” is purely historical, i.e., it only contains geomagnetic maps depicting situations in previous years. This “Magnetic Atlas” consists solely of plates with no text at all. The 7 plates presented here encompass 15 maps including both declination and inclination maps. Hansteen’s atlas was therefore the first to include inclination maps. Another remarkable feature is that Hansteen’s maps show geomagnetic lines not only on the oceans but also on the continents. Like his predecessors, Hansteen used Mercator projection; however, some of his maps were drawn using stereographic projection.
Plate I: three variation maps for the years 1600 (no. I), 1700 (no. II), and 1756 (no. III)
Plate II: four variation maps for the years 1770 (no. IV), 1710 (no. V), 1720 (no. VI), and 1730 (no. VII); two inclination maps for the years 1700 (no. VIII) and 1800 (no. IX)
Plate III: two variation maps for the years 1800 (no. X) and 1744 (no. XI)
Plate IV: polar projection of a segment of the northern and southern hemispheres to clarify the location and movement of the magnetic poles between the years 1600 and 1800 (Fig. 10)
Plate V: map of both hemispheres with complete drawings of the magnetic equator and the variation lines for both magnetic axes as per Euler’s first theory¹⁴
Plate VI: Mappa Hydrographica sistens Declinationes Magneticas anni 1787 (Fig. 11)
Plate VII: Mappa Hydrographica sistens Inclinationes Magneticas anni 1780 (Fig. 12)
The printing plates for all these maps still exist (Enebakk and Johansen 2011, pp. 66–67). All of Hansteen’s maps are based on observation data and not on some kind of theory, as was the case with Dunn and Churchman.
14
A hand-drawn map still exists; see Enebakk and Johansen (2011, between the pages 32 and 33).
Fig. 10 Hansteen’s “Atlas,” plate IV. State and University Library Göttingen, shelf mark 2 PHYS III, 8480
Fig. 11 Hansteen’s “Atlas,” plate VI. State and University Library Göttingen, shelf mark 2 PHYS III, 8480
Fig. 12 Hansteen’s “Atlas,” plate VII. State and University Library Göttingen, shelf mark 2 PHYS III, 8480
4.3
Expedition to Russia (1828–1830)
Churchman conceived a plan of searching for the magnetic North Pole close to Baffin Bay; however, he was unable to realize it (see Sect. 3.2). Hansteen believed the magnetic North Pole was located in Siberia, which is why he considered organizing an expedition to Russia. This expedition started in 1828 and lasted until 1830; it was devoted entirely to geomagnetic research. Alexander von Humboldt’s trip to Russia took place in 1829 while Hansteen was still there. Humboldt was also researching geomagnetism, but his expedition covered other topics as well. Hansteen returned to Norway in 1830.
4.4
Hansteen’s Maps
Hansteen’s cartographic work mainly centered around the years from 1819 to 1833. His “Magnetic Atlas” was created at the beginning of this period, but it is important to note that Hansteen did not publish any other atlas after 1819. However, several of the maps in his “Magnetic Atlas” were later reproduced on several occasions (Reich and Roussanova 2015, Appendix 1). Hansteen initially only published declination and inclination maps, but from 1824/5 onwards, he started publishing maps with intensity lines. These mainly comprised the wonderful intensity and isodynamic maps for which he became famous. It was Hansteen who coined the term “isodynamic lines” for intensity lines. The terms “isogonic lines” for declination lines and “isoclinic lines” for lines with the same angle of inclination also came from Hansteen (Hellmann 1895, pp. 14–15, 25).
5
Carl Friedrich Gauss (1777–1855) and Wilhelm Weber (1804–1891)
5.1
The Beginning
Born on April 30, 1777, in Brunswick, Carl Friedrich Gauss went down in history as a mathematician, astronomer, geodesist, and physicist. In 1807, he was appointed Professor of Astronomy at Göttingen University, where he was also the director of the observatory. Gauss had already developed an interest in geomagnetism in his younger years; his private library included the third edition of Churchman’s “Magnetic Atlas,” published in New York in 1800, as well as Hansteen’s “Investigations into Terrestrial Magnetism” (Hansteen 1819).¹⁵ Whether Gauss also owned Hansteen’s “Atlas” is unclear, because the Gauss library is no longer complete. However, Gauss only adopted geomagnetism as his area of research when Wilhelm Weber became his colleague and friend in 1831.
Wilhelm Weber was born on October 24, 1804, in Wittenberg; in 1822, he began studying physics at the university in Halle, where he obtained his doctorate in 1827. The area of research with which he mainly occupied his time in Halle was acoustics; he published numerous works on this subject. Gauss and Weber met for the first time at the seventh symposium of the “Association of German Nature Researchers and Physicians” (Gesellschaft Deutscher Naturforscher und Ärzte) held in Berlin in September 1828; this event was held under the aegis of Alexander von Humboldt, in whose private establishment Gauss was accommodated during the symposium. At the symposium, Weber presented his latest findings, all in the field of acoustics; although acoustics was not his topic, Gauss was deeply impressed and seems to have quickly developed a great affection for the young man. When the post of Professor of Physics at Göttingen University fell vacant in 1830, Weber was appointed to the post, not without Gauss’ intervention. Wilhelm Weber moved to Göttingen in September 1831. This signaled the beginning of a new era in the field of geomagnetic research.
Gauss and Weber complemented each other perfectly – Gauss the great theoretician and Weber the great experimenter. The field of electrodynamics and geomagnetism was new ground for both scientists in terms of physics research. Breathtaking results were quickly obtained both theoretically and experimentally. Gauss was able to present his “Intensitas vis magneticae terrestris ad mensuram absolutam revocata” to the Royal Society of Sciences in Göttingen as early as December 15, 1832 (Gauss 1832); the Latin original did not appear until 1841 (Gauss 1841). However, the physicist Johann Christian Poggendorff (1796–1877) succeeded in publishing a German translation in 1833 in his famous journal “Annalen der Physik” (Gauss 1833). The first electromagnetic telegraph was installed in Göttingen that same year (1833); an aerial magnetic double line ran across Göttingen from Weber’s physics cabinet to Gauss’ observatory. The first telegram was sent at Easter 1833. A plan was soon conceived to set up a magnetic observatory on the site of the astronomical observatory. This was finished by the end of 1833; the new institution was equipped with a new magnetometer, a device which had only recently been developed. Observations began in January 1834.
15 State and University Library Göttingen, Gauss library no. 130 (Churchman) and no. 856 (Hansteen).
5.2
The “Magnetic Association” in Göttingen (1834–1843)
The “Magnetic Association” (Magnetischer Verein) in Göttingen was founded at the time the new magnetic observatory was first put to use. The principles developed by Alexander von Humboldt in particular were applied and developed further in Göttingen. Humboldt already had a small network in 1829; all these stations recorded the intensity on specified dates at prescribed intervals with the aid of a Gambey instrument. The method consisted of making corresponding observations and graphically depicting the data thus obtained (Reich and Roussanova 2012/2013, Part I). Gauss and Weber succeeded in quickly expanding the network, which came to involve many observation sites all working in cooperation. Humboldt’s method of making corresponding observations was expanded and perfected (Reich and Roussanova 2012/2013, Part II).
The success quickly achieved by Gauss and Weber attracted numerous renowned physicists and astronomers to Göttingen and led to the construction of many new magnetic observatories throughout the world, not only in Europe but also in Russia, the USA, India, and various places in the British Empire. While the first publications by Gauss and Weber appeared in the “Annalen der Physik” and the “Astronomische Nachrichten,” a new journal, the “Results of the Observations by the Magnetic Association” (Resultate aus den Beobachtungen des Magnetischen Vereins), was published from 1836 onward; volumes 1 and 2 were published in Göttingen and volumes 3, 4, 5, and 6 in Leipzig. Scientific essays by various authors were published, new instruments presented, new magnetic observatories described, etc. However, the core of the journal was the presentation of the data received and collected in Göttingen, which was commented on in detail in the “Explanatory notes on the scheduled drawings and observation data” (Erläuterungen zu den Terminszeichnungen und den Beobachtungszahlen). Gauss was the author for the first two volumes, Weber for the others.
Each volume was accompanied by numerous plates, 50 in all, 25 of which illustrated the corresponding observations.
5.3
Gauss’ “General Theory of Geomagnetism” (1839)
The publication of Gauss’ treatise “General Theory of Geomagnetism” (Allgemeine Theorie des Erdmagnetismus) in the “Results” in 1839 was a milestone in the development of geomagnetic science (Gauss 1839). Here Gauss introduced the term “potential,” which revolutionized the further history of geomagnetic research and was also used in other areas of physics (Reich 2011, pp. 48–50). It became possible to calculate data and draw maps using the potential equation V/R. These maps were completely new; nothing like them had ever been seen before, based as they were on a mathematical theory. The quality of the theory was established by comparing these maps with those based on observation data; Gauss was very satisfied with their conformity. Gauss’ “General Theory” was accompanied by six calculated maps of the world; these were the first maps he published:
– Maps I and II: “map for the values of V/R” using both Mercator and stereographic projection
– Maps III and IV: “map for the calculated values of declination” using both Mercator and stereographic projection
– Maps V and VI: “map for the calculated values of the whole intensity” using both Mercator and stereographic projection
Maps I and II are the most remarkable; they are completely new inasmuch as they show equipotential lines, which Gauss called “balance lines” (Gleichgewichtslinien). Such a thing had never existed before. From then on, maps were to include declination, inclination, intensity, and equipotential lines. The “Leipziger Allgemeine Zeitung” of August 6, 1839, included an extensive discussion titled “On the general theory of geomagnetism discovered by Gauss.”¹⁶
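How the quantities shown on calculated maps of this kind follow from a potential can be sketched with a small example. The sketch below is not Gauss’ own computation: it uses the modern convention (x toward geographic north, y toward east, z vertically downward), whereas the Atlas lists a western intensity Y, so the signs of Y and of the declination may differ. The axial dipole potential, the coefficient value `g10`, and both function names are illustrative assumptions only.

```python
import math


def field_elements(x: float, y: float, z: float):
    """Return (declination_deg, inclination_deg, total_intensity).

    Assumed convention (modern, not Gauss' own): x points to geographic
    north, y to the east, z vertically downward.
    """
    h = math.hypot(x, y)                           # horizontal intensity
    declination = math.degrees(math.atan2(y, x))   # angle between north and h
    inclination = math.degrees(math.atan2(z, h))   # dip below the horizontal
    total = math.sqrt(x * x + y * y + z * z)       # whole intensity
    return declination, inclination, total


def axial_dipole_components(lat_deg: float, g10: float = -30000.0):
    """Surface components for the potential V = a * g10 * (a/r)**2 * cos(colat).

    g10 is an illustrative coefficient in nT, not a value from the source.
    With field = -grad V: x = (1/r) dV/dcolat and z = dV/dr, taken at r = a.
    """
    colat = math.radians(90.0 - lat_deg)
    x = -g10 * math.sin(colat)        # northward component
    y = 0.0                           # no dependence on longitude
    z = -2.0 * g10 * math.cos(colat)  # downward component
    return x, y, z


if __name__ == "__main__":
    for lat in (0.0, 45.0, 90.0):
        d, i, f = field_elements(*axial_dipole_components(lat))
        print(f"latitude {lat:5.1f}: D = {d:6.2f} deg, "
              f"I = {i:6.2f} deg, F = {f:9.1f} nT")
```

For this idealized dipole the inclination satisfies tan I = 2 tan(latitude), so the dip at 45° latitude is about 63.4°. A full treatment in the spirit of the “General Theory” would sum many spherical harmonic terms instead of the single term used here.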
5.4
Gauss’ and Weber’s “Atlas of Geomagnetism” (Leipzig 1840)
The publication of several maps naturally gave rise to the idea of publishing an atlas. This appeared just 1 year later with the title “Atlas of Geomagnetism, traced according to the elements of the theory. Supplement from the Observations of the Magnetic Association, with the collaboration of C. W. B. Goldschmidt edited by Carl Friedrich Gauss and Wilhelm Weber” (Atlas des Erdmagnetismus nach den Elementen der Theorie entworfen. Supplement aus den Beobachtungen des Magnetischen Vereins unter Mitwirkung von C. W. B. Goldschmidt, herausgegeben von Carl Friedrich Gauß und Wilhelm Weber) (Gauss and Weber 1840; Fig. 13).
16
Anonymus: Über die von Gauß entdeckte allgemeine Theorie des Erdmagnetismus [Review]. In: Leipziger Allgemeine Zeitung, August 6, 1839, no. 218, supplement, p. 2566.
K. Reich and E. Roussanova
Fig. 13 Title page of the “Atlas of Geomagnetism” (Leipzig 1840). State Library Berlin, shelf mark Kart LS HM 4° My 3677
In the Preface, we read: My honoured friend Professor Weber, who balks at no sacrifice when it comes to rendering a service to science, undertook to realise the data with great completeness in a number of maps which depict all magnetic conditions everywhere on earth as determined by this theory. [...] This explanation exhausts everything required to understand the maps and evaluate their usefulness so completely that nothing remains for me to do but to express the wish that this laborious and meritorious work will be duly recognised by friends of the natural sciences.17 (Gauss Werke 12, pp. 377–378)
17 In the original German: “Mein verehrter Freund, Herr Professor Weber, der keine Aufopferung scheuet, wo es gilt der Wissenschaft einen Dienst zu leisten, unternahm es, eine solche Versinnlichung durch eine Anzahl von Karten zu veranstalten, die in grösster Vollständigkeit alle magnetischen Verhältnisse für die ganze Erdoberfläche, so wie jene Theorie sie ergiebt, graphisch darzustellen. [...] Diese Erklärung erschöpft alles, was zum Verständnis der Karten und zur Beurtheilung des Nutzens, welchen sie leisten können, nöthig ist, so vollständig, dass mir nichts hinzuzusetzen übrig bleibt als der Wunsch, dass diese mühsame und verdienstliche Arbeit bei den Freunden der Naturwissenschaften gerechte Anerkennung finden möge” (Gauss Werke vol. 12, pp. 377–378).
The observation-based maps used for comparison were:
– In the case of declination, Peter Barlow’s map (Barlow 1833)
– In the case of inclination, Caspar Horner’s maps (Horner 1836/1842)
– In the case of intensity, Edward Sabine’s map (Sabine 1838)
The “Atlas” published by Gauss and Weber in 1840 begins with a very comprehensive text and is accompanied by 18 maps. This text explains each of the maps in detail. The maps were based on the geomagnetic parameters calculated for 1,262 sites around the terrestrial globe; 10,096 values were accordingly calculated and recorded in tables (Gauss Werke 12, p. 386, including the last four tables). The scientists who contributed to the maps in the atlas were Wilhelm Weber, the observer Benjamin Goldschmidt (1807–1851) who worked in Göttingen, the Russian astronomer Aleksandr Nikolaevich Drashusov (1816–1890) who was staying in Göttingen at that time, and the mathematician Heinrich Eduard Heine (1821–1881), who was studying with Gauss at that time and later became famous in his own right. This “Atlas” was also completely new inasmuch as it was not a historical atlas, but one which depicted the present situation. It contained 18 maps, 9 with Mercator projection and 9 with stereographic projection; maps XIII, XIV, XVII, and XVIII conform with the corresponding maps already published in 1839: maps I and II “map for the values of V/R” (see Fig. 14), maps VII and VIII “map for the calculated values of western intensity Y.” In detail:
– Maps I (Fig. 14) and II: “map for the values of V/R” (Karte für die Werthe von V/R)
– Maps III and IV: “ideal distribution of magnetism on the earth’s surface” (Ideale Vertheilung des Magnetismus auf der Erdoberfläche)
– Maps V and VI (Fig. 15): “map for the calculated values of the northern intensity X” (Karte für die berechneten Werthe der nördlichen Intensität X)
– Maps VII and VIII: “map for the calculated values of the western intensity Y” (Karte für die berechneten Werthe der westlichen Intensität Y)
– Maps IX and X: “map for the calculated values of the vertical intensity Z” (Karte für die berechneten Werthe der verticalen Intensität Z)
– Maps XI and XII: “map for the calculated values of the horizontal intensity” (Karte für die berechneten Werthe der horizontalen Intensität)
Fig. 14 Map I: Map for the values of V/R – equipotential lines (Gauss and Weber 1840). State Library Berlin, shelf mark Kart LS HM 4° My 3677
Fig. 15 Map VI: Map for the calculated values of the northern intensity X, in stereographic projection (Gauss and Weber 1840). State Library Berlin, shelf mark Kart LS HM 4° My 3677
– Maps XIII (Fig. 16) and XIV: “map for the calculated values of the declination” (Karte für die berechneten Werthe der Declination)
– Maps XV (Fig. 17) and XVI: “map for the calculated values of the inclination” (Karte für die berechneten Werthe der Inclination)
– Maps XVII and XVIII: “map for the calculated values of the whole intensity” (Karte für die berechneten Werthe der ganzen Intensität)
Fig. 16 Map XIII: Map for the calculated values of the declination in Mercator projection (Gauss 1839; Gauss and Weber 1840). State Library Berlin, shelf mark Kart LS HM 4° My 3677
Fig. 17 Map XV: Map for the calculated values of the inclination in Mercator projection. State Library Berlin, shelf mark Kart LS HM 4° My 3677
Map I, the map with the equipotential lines (Fig. 14), is the pièce de résistance; in the original atlas, it was printed on special paper. The version of map I published in 1839 contained considerably fewer lines; the version published in 1840 was a significant improvement. The inclination lines were new to the atlas, as none of the maps in Gauss’ “General Theory” included them. All four geomagnetic parameters, i.e., declination, inclination, intensity, and potential, were now determined theoretically for the first time, “like the orbits of planets and comets through their elements.” The following summary states: “This geomagnetic atlas thus opens the series of atlases which will appear at suitable intervals to clearly present the fundamental data of the history of geomagnetism. No excursus into the history of past times can be made here”18 (Gauss Werke 12, pp. 404–405).
5.5
The End of the “Magnetic Association” in Göttingen
The exceptionally fruitful collaboration between Gauss and Weber was shaken to its core when Wilhelm Weber lost his professorial post in Göttingen in December 1837. He was one of the so-called Göttingen Seven, the professors in Göttingen who cosigned the protest against the breach of the constitution by the new king of Hanover, Ernst August I (1771–1851, king from 1837). Weber initially remained in Göttingen even though he no longer had a post; at Easter 1843, however, he went to Leipzig University, where he became Professor of Physics. Weber’s departure signaled the end of the “Magnetic Association”; the “Results of the Observations by the Magnetic Association” was discontinued. Both Gauss in Göttingen and Weber in Leipzig turned to other areas of research; geomagnetism ceased to be a topic of interest to either. When Weber returned to Göttingen in 1849 and was restored to his former professorial post, it was too late to build on past successes. It was politics that put an abrupt end to the first large-scale global research network.
6
Excursus: Berghaus’ “Physical Atlas”
Heinrich Karl Berghaus (1797–1884) was a geographer and cartographer; he studied in Berlin and was a professor at the “Building Academy” (Bauakademie) in Berlin from 1824 to 1855. Berghaus and Humboldt were friends for more than 40 years and maintained a lively correspondence. In 1845, Humboldt published the first volume of his acclaimed “Cosmos” (Humboldt 1845–1862). Heinrich
18 In the original German: “[...] (wie Planeten- und Cometenbahnen durch ihre Elemente) [...]. Der gegenwärtige Atlas des Erdmagnetismus eröffnet also die Reihe von Atlassen, welche in angemessenen Zwischenzeiten erscheinen sollen, um von nun an die Grunddata der Geschichte des Erdmagnetismus vollständig und übersichtlich vor Augen zu legen. Auf die Geschichte der vergangenen Zeit kann hier nicht eingegangen werden” (Gauss Werke vol. 12, pp. 404–405).
Berghaus was the author and originator of the atlas which accompanied it. Berghaus’ “Physical Atlas or collection of maps depicting the main organic and inorganic natural phenomena according to their geographical dispersion and distribution” (Physikalischer Atlas oder Sammlung von Karten, auf denen die hauptsächlichsten Erscheinungen der anorganischen und organischen Natur nach ihrer geographischen Verbreitung und Vertheilung bildlich dargestellt sind) was published between 1837 and 1843 (Berghaus 1837–1843). At that time, this “Atlas” was the first of its kind; Berghaus’ “Atlas” included maps of the following eight areas:
1. Meteorology and climatology
2. Hydrology and hydrographics
3. Geology
4. Magnetism of the earth
5. Geography of plants
6. Geography of animals
7. Anthropography
8. Ethnography
Part 4 of the “Physical Atlas” was accordingly devoted to geomagnetism; it contained five maps by different cartographers: three declination maps with both Mercator and stereographic projection and two intensity maps, also with Mercator and stereographic projection. Geomagnetism consequently only accounts for a comparatively small part of the “Physical Atlas,” which comprises altogether 90 pages. A second edition of Berghaus’ “Physical Atlas” also appeared in Gotha between 1849 and 1863. The third edition published in Gotha between 1886 and 1892 by Hermann Berghaus (1828–1890), a nephew of Heinrich Berghaus, is particularly interesting (Berghaus 1886–1892). The author of the section “Geomagnetism” was Georg Neumayer (1826–1909). This section included maps previously published by Hansteen, Gauss, and Weber and a map with equipotential lines calculated for 1885 (Fig. 18); it was the second of its kind, following the Atlas published by Gauss and Weber in 1840 (see Reich and Roussanova 2015, chapter 3.11).
7
The Term “Atlas”
Tobias Mayer’s “Mathematical Atlas” (Mathematischer Atlas), published in Augsburg in 1745, is particularly important. Mayer’s aim was to “extract the most necessary and useful information and place it in the hands of lovers of these marvellous [mathematical] sciences briefly yet in an easy, clear manner”19 (Mayer
19
In the original German: das “nötighste und nützlichste auszulesen, und auf eine kurze, jedoch leichte und deutliche Art denen Geneigten Liebhabern dieser herrlichen Wissenschaften (sc. der mathematischen) in die Hände zu liefern” (Mayer 1745, Vorwort).
Fig. 18 Map with declination and equipotential lines, calculated for 1885, print 1891. Berghaus’ “Physical Atlas,” section “Geomagnetism,” 3rd edition (Gotha 1892). State Library Berlin, shelf mark 2° W 183
1745, preface). This was followed by all mathematical areas customary at that time, twelve in all, presented with the help of 60 + 8 plates. Mayer’s “Mathematical Atlas” facilitated the use of the word “atlas” in other areas concerned with collections of maps and tables. The term “atlas” was again used just a short time later for geomagnetic maps; geomagnetic atlases presented terrestrial maps with geomagnetic details, mostly lines, initially only declination lines. Dunn’s atlas of 1776 stands at the beginning of this development. The subsequent history of geomagnetic atlases reflects the development of geomagnetic science; ultimately all four types of geomagnetic lines were included in the magnetic atlases. During the first half of the nineteenth century, the term “atlas” came into use in yet more areas, namely in both physics and medicine. While the term “Physical Atlas” was still used in physics with reference to the earth and illustrations of the earth’s surface, this was no longer the case in medicine. Here the word “atlas” was used to mean merely a collection of plates, e.g., “Obstetrical Atlas” (Geburtshülflicher Atlas) (Kilian) or “Atlas of Pathological Anatomy for Practising Physicians” (Atlas
der pathologischen Anatomie für praktische Ärzte) (Albers 1832–1862). Thereafter, the word “atlas” was no longer restricted to a collection of any kind of images of the earth but was also used to describe any pictorial collection.
Acknowledgements
The authors wish to thank the following persons and institutions: Henrik Dupont, the Royal Library of Copenhagen; Wolfgang Crom, Steffi Mittenzwei, and Holger Scheerschmidt, the State Library in Berlin, department of maps; and Bärbel Mund and Helmuth Rohlfing, the State and University Library Göttingen, department of manuscripts and rare books.
References
Albers FH (1832–1862) Atlas der pathologischen Anatomie für praktische Aerzte. Bonn 1832–1862
Anonymus (1907) Churchman, John, scientist. Natl Cyclopedia Am Biogr 9:287. New York
Banks J (2007) The Scientific Correspondence of Sir Joseph Banks, 1765–1820. Ed. by Neil Chambers, vols 1–6. London
Barlow P (1833) On the present situation of the magnetic lines of equal variation, and their changes on the terrestrial surface. Philos Trans R Soc Lond Teil 2:667–673. London
Bedini S (2000) John Churchman and his magnetic Atlas, Part I and II. In: History corner, professional surveyor magazine, vol 20. Issue November and December 2000. Online-Ressource, Part 1: http://www.profsurv.com/magazine/article.aspx?i=669; Part 2: http://www.profsurv.com/magazine/article.aspx?i=681
Berghaus H (1837–1843) Physikalischer Atlas oder Sammlung von Karten, auf denen die hauptsächlichsten Erscheinungen der anorganischen und organischen Natur nach ihrer geographischen Verbreitung und Vertheilung bildlich dargestellt sind. Gotha 1837–1843
Berghaus H (1886–1892) Physikalischer Atlas. Dritte Ausgabe. 75 Karten in sieben Abteilungen. Vollständig neu bearbeitet. Gotha 1886–1892
Churchman J (1790) An explanation of the magnetic atlas, or variation chart, hereunto annexed projected on a plan entirely new; by which the magnetic variation on any part of the globe may be precisely determined. Philadelphia 1790
Churchman J (1794) The magnetic atlas, or Variation charts of the whole terraqueous globe: comprising a system of the variation & dip of the needle by which the observations being truly made, the longitude may be ascertained. Second edition of (Churchman 1790): London 1794 (VII, 76 S., 3 charts). 3rd edn, New York 1800 (with additions, p 82). 4th edn, London 1804 (with considerable additions, p 86)
Dunn S (1775) The Navigator’s guide to the oriental or Indian seas: or, the description and use of a variation chart of the magnetic needle, designed for shewing the longitude, throughout the principal parts of the Atlantic, Ethiopic, and Southern Oceans, within a degree, or sixty miles. With an introductory discourse, concerning the discovery of the magnetic variation, the finding the longitude thereby, and several useful tables. London 1775
Dunn S (1776) A new atlas of variations of the magnetic needle for the Atlantic, Ethiopic, Southern and Indian Ocean; drawn from a theory of the magnetic system, discovered and applied to navigation. London 1776. (p VI + 9 Charts)
Dunn S (1777) A new epitome of practical navigation: or guide to the Indian Seas. Containing, I. The elements of mathematical learning used and applied in the theory and practice of nautical affairs. II. The theory of navigation, deduced from original principles. III. The method of correcting and determining the longitude at sea; by the variation of the magnetic needle. IV. The practice of navigation, in all kinds of sailing. The whole illustrated with a variety of copperplates. London 1777
Dunn S (1810) A new Atlas of the Mundane system; or, of geography and cosmography: describing the heavens and the earth, the distances, motions, and magnitudes, of the celestial bodies. (First edition London 1774). The Sixth Edition With Additions and Considerable Improvements. London 1810
Enebakk V, Johansen NV (2011) Christopher Hansteens annus mirabilis. Reisen til London og Paris 1819. Oslo 2011
Gauss CF (Werke) 1 edition = vol 1, 2 Göttingen 1863; vol 3 Göttingen 1866; vol 4 Göttingen 1873; vol 5 Göttingen 1867; vol 6 Göttingen 1874; vol 7 Gotha 1871; 2nd edition and new edition = 12 volumes, 1870–1933. Reproductions Hildesheim, New York (1973 and 1981)
Gauss CF (1832) Anzeige der “Intensitas vis magneticae terrestris ad mensuram absolutam revocata.” Göttingische Gelehrte Anzeigen 1832, pp 2041–2048, 2049–2058 (24. December, Nr. 205). In: Gauss Werke 5, pp 293–304. Corrected version in: Astronomische Nachrichten 10, 1833, Nr. 238, col. 349–360
Gauss CF (1833) Die Intensität der erdmagnetischen Kraft zurückgeführt auf absolutes Maaß. Annalen der Physik und Chemie 28 (= 104), 1833, pp 241–272, 591–615
Gauss CF (1839) Allgemeine Theorie des Erdmagnetismus. Resultate aus den Beobachtungen des magnetischen Vereins im Jahre 1838. Leipzig 1839, pp 1–57. In: Gauss Werke 5, pp 119–175
Gauss CF (1841) Intensitas vis magneticae terrestris ad mensuram absolutam revocata. Commentationes societatis regiae scientiarum Gottingensis recentiores 8, (1832–1837) 1841, commentationes classis mathematicae pp 3–44. In: Gauss Werke 5, pp 79–118
Gauss CF, Weber W (1840) Atlas des Erdmagnetismus nach den Elementen der Theorie entworfen. Supplement zu den Resultaten aus den Beobachtungen des magnetischen Vereins unter Mitwirkung von C. W. B. Goldschmidt. Leipzig 1840. In: Gauss Werke 12, pp 335–408
Godwin G (1888) Dunn, Samuel (d. 1794). Dictionary of National Biography 16, 1888, pp 210–212
Hansteen Chr (1819) Untersuchungen über den Magnetismus der Erde. Translation by P. Treschow Hanson. Christiania 1819. Accompanied by: Magnetischer Atlas gehörig zum Magnetismus der Erde. Christiania 1819
Hansteen Chr (1823) Zur Geschichte und zur Vertheidigung seiner Untersuchungen über den Magnetismus der Erde, und kritische Bemerkungen über die hierher gehörigen Arbeiten der Herren Biot und Morlet. (= Schreiben an Gilbert vom 13.4.1823). Annalen der Physik und der physikalischen Chemie 15 (= 75), 1823, pp 145–196
Heard N (2013) Samuel Dunn 1723–1794. Online Resource: http://www.heardfamilyhistory.org.uk/SamuelDunn.html
Hellmann G (1895) E. Halley, W. Whiston, J. C. Wilcke, A. von Humboldt, C. Hansteen. Die ältesten Karten der Isogonen, Isoklinen, Isodynamen 1701, 1721, 1768, 1804, 1825, 1826. Berlin 1895. Reprint Nendeln; Liechtenstein 1969
Horner JC (1836/1842) [Kapitel] XVII. Magnetismus der Erde. Physikalisches Wörterbuch, herausgegeben von Johann Samuel Traugott Gehler, neu bearbeitet von Brandes, Gmelin, Horner, Muncke, Pfaff, vol. 6, second part, Leipzig 1836, pp 1023–1147. Charts in: Kupfer-Atlas zu Johann Samuel Traugott Gehler’s Physikalischem Wörterbuch, Leipzig 1842
Jefferson Th (1950–2013) The papers of Thomas Jefferson, vols. 1–40. Princeton University Press, Princeton
Kilian HF (ca. 1835–1838) Geburtshülflicher Atlas. Düsseldorf [sine anno]
Mayer T (1745) Mathematischer Atlas, in welchem auf 60 Tabellen alle Theile der Mathematik vorgestellet; und nicht allein überhaupt zu bequemer Wiederholung, sondern auch den Anfängern besonders zur Aufmunterung durch deutliche Beschreibung u. Figuren entworfen werden. Augsburg 1745. In: Tobias Mayer: Schriften zur Astronomie, Kartographie, Mathematik und Farbenlehre, vol 4 (Historia scientiarum), ed. by Erhard Anthes and others, Hildesheim 2009
Mercator G (1595) Atlas sive Cosmographicae Meditationes de Fabrica Mundi et Fabricati Figura. Duisburg
Prince SA (2006) The princess & the patriot. Ekaterina Dashkova, Benjamin Franklin, and the age of enlightenment. American Philosophical Society, Philadelphia
Procès-verbaux (1911) Procès verbaux des séances de l’académie Impériale des sciences depuis sa fondation jusqu’à 1803. Tome 4,1: 1786–1803. St. Petersburg 1911
Reich K (2011) Alexander von Humboldt und Carl Friedrich Gauß als Wegbereiter der neuen Disziplin Erdmagnetismus. HiN – Alexander von Humboldt im Netz 22, 2011, pp 35–55. Online-Ressource: http://www.uni-potsdam.de/u/romanistik/humboldt/hin/hin22/reich.htm
Reich K, Roussanova E (2012) Meilensteine in der Darstellung von erdmagnetischen Beobachtungen in der Zeit von 1701 bis 1849 unter besonderer Berücksichtigung des Beitrages von Russland. In: Beschreibung, Vermessung und Visualisierung der Welt, ed. by Ingrid Kästner und Jürgen Kiefer. Aachen 2012, pp 137–160
Reich K, Roussanova E (2012/2013) Visualising geomagnetic data by means of corresponding observations. Alexander von Humboldt, Carl Friedrich Gauß and Adolph Theodor Kupffer [part 1]. GEM – International Journal on Geomathematics 3, 2012, 1, pp 1–16. Online-Ressource: http://www.springerlink.com/content/x807664661171577. [part 2]. GEM – International Journal on Geomathematics 4, 2013, 1, pp 1–25. Online-Ressource: http://link.springer.com/content/pdf/10.1007/s13137-012-0043-4
Reich K, Roussanova E (2015) Die Erforschung des Erdmagnetismus durch Christopher Hansteen und Carl Friedrich Gauß. Der Briefwechsel beider Gelehrten im historischen Kontext. Manuscript in print
Sabine E (1838) Report on the variations of the magnetic intensity observed at different Points of the Earth’s surface. Report of the seventh meeting of the British Association for the advancement of science held at Liverpool in September 1837, vol VI, London 1838, pp 1–35
von Humboldt A (1845–1862) Kosmos. Entwurf einer physischen Weltbeschreibung. Stuttgart und Tübingen, vol 1, 1845; vol 2, 1847; vol 3, 1850; vol 4, 1858; vol 5, 1862. Further edition by Magnus Enzensberger, mit einem Nachwort versehen von Ottmar Ette und Oliver Lubrich, Frankfurt am Main 2004
Washington G (1983–2011) The papers of George Washington, Presidential Series, vols 1–58, Charlottesville and others, 1983–2011
Part II Observational and Measurement Key Technologies
Earth Observation Satellite Missions and Data Access Henri Laur and Volker Liebig
Contents
1 Introduction
2 End-to-End Earth Observation Satellite Systems
2.1 Space Segment
2.2 Ground Segment
3 Overview on ESA Earth Observation Programs
3.1 Background
3.2 ERS-1 and ERS-2 Missions
3.3 Envisat Mission
3.4 Proba-1
3.5 Earth Explorers
3.6 GMES and Sentinels
3.7 Meteorological Programs
3.8 ESA Third Party Missions
4 Some Major Results of ESA Earth Observation Missions
4.1 Land
4.2 Ocean
4.3 Cryosphere
4.4 Atmosphere
4.5 ESA Programs for Data Exploitation
5 User Access to ESA Data
5.1 How to Access the EO Data at ESA
6 Conclusion
References
H. Laur
European Space Agency (ESA), Head of Earth Observation Mission Management Division, ESRIN, Frascati, Italy
V. Liebig
European Space Agency (ESA), Director of Earth Observation Programmes, ESRIN, Frascati, Italy
© Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_3
H. Laur and V. Liebig
Abstract
This article provides an overview of Earth Observation satellites, describing the end-to-end elements of an Earth Observation mission and then focusing on the European Earth Observation programs. Some significant results obtained using data from European missions (ERS, Envisat) are presented. Finally, free-of-charge access to Earth Observation data through the European Space Agency is described.
1
Introduction
Early pictures of the Earth seen from space became icons of the Space Age and encouraged an increased awareness of the precious nature of our common home. Today, images of our planet from orbit are acquired continuously and have become powerful scientific tools to enable better understanding and improved management of the Earth. Satellites provide clear and global views of the various components of the Earth system – its land, ice, oceans, and atmosphere – and of how these components interact and influence each other. Space-derived information about the Earth provides a whole new dimension of knowledge and services which can benefit our lives on a day-to-day basis. Earth Observation satellites supply a consistent set of continuously updated global data which can support policies related to environmental security by providing accurate information on various environmental issues, including global change. Meteorological satellites have radically improved the accuracy of weather forecasts and have become a crucial part of our daily life. Earth Observation data are gradually being integrated into many economic activities, including exploitation of natural resources, land-use efficiency, or transport routing.
The European Space Agency (ESA) has been dedicated to observing the Earth from space ever since the launch of its first meteorological mission Meteosat in 1977. Following the success of this first mission, the subsequent series of Meteosat satellites and the ERS and Envisat missions have been providing a growing number of users with a wealth of valuable data about the Earth, its climate, and changing environment. It is critical, however, to continue learning about our planet. As our quest for scientific knowledge continues to grow, so does our demand for accurate satellite data to be used for numerous practical applications related to protecting and securing the environment.
Responding to these needs, ESA’s Earth Observation programs comprise a science and research element, which includes the Earth Explorer missions, and an applications element, which is designed to facilitate the delivery of Earth Observation data for use in operational services, including the well-established meteorological missions with the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT). In addition, the GMES (Global Monitoring for Environment and Security) Sentinel missions, which form part of the GMES Space Component, will collect robust, long-term relevant
datasets. Together with other satellites, their combined data archives are used to produce Essential Climate Variables for climate monitoring, modeling, and prediction. Mathematics is of crucial importance in Earth Observation: onboard data compression and signal processing are common features of many Earth Observation satellites; inverse problems arise in data applications (e.g., the determination of the gravitational field from ESA GOCE satellite measurements); and data assimilation is used to combine Earth Observation data with numerical models.
2
End-to-End Earth Observation Satellite Systems
In this section, the end-to-end structure of a typical satellite system is explained. This will enable users of Earth Observation (EO) data to better understand the process of gathering the information they are using.
Earth Observation satellites usually fly in a so-called low Earth orbit (LEO), which means an orbit altitude of some 250–1,000 km above the Earth’s surface and an orbit inclination close to 90°, i.e., a polar orbit. Another orbit, used mainly for weather satellites like Meteosat, is the geostationary orbit (GEO). This is the orbit in which the angular velocity of the satellite is exactly the same as that of the Earth (360° in 24 h). If the satellite is positioned exactly over the equator (inclination 0°), then it always has the same position relative to the Earth’s surface. This is why the orbit is called geostationary. From this position, the satellite’s instruments can always observe the same area on Earth. The price to pay for this position is that the orbit altitude is approx. 36,000 km, which is very far compared to a LEO.
A special LEO is the sun-synchronous orbit, which is used for all observation systems that need the same surface illumination angle of the sun. This is important for imaging systems like optical cameras. For this special orbit, typically 600–800 km in altitude, the angle between the satellite orbit plane and the line between sun and Earth is kept constant. As the Earth revolves once per year around the sun, the satellite orbit also has to rotate by 360° in 365 days. This can be achieved by letting the rotational axis of the satellite orbit precess, i.e., rotate, by 360/365 degrees per day to keep pace exactly with the Earth’s revolution around the sun. As the Earth is not a perfect sphere, the excess mass at the equator exerts a torque on the orbit, which makes it precess like a gyroscope. To obtain the required precession rate, the satellite has to fly with an inclination of approx. 98°, which is close to an orbit flying over the poles. As the Earth is rotating under this orbit, the satellite “sees” almost the Earth’s entire surface over the course of several orbits. Typical LEO satellites are ERS, Envisat, or SPOT (see Sect. 3).
All Earth Observation satellite systems consist of a space segment and a ground segment. The whole chain of technical infrastructure used to acquire, downlink, process, and distribute the EO data is called an end-to-end system.
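The orbit figures quoted in this section (approx. 36,000 km for GEO, approx. 98° for a sun-synchronous orbit) can be checked with a short calculation. The following is a minimal sketch, not part of the original text: it assumes circular orbits, commonly published values for the Earth’s gravitational parameter, equatorial radius, and oblateness coefficient J2, and the first-order J2 formula for the precession of the orbital plane. It also uses the sidereal day (about 23 h 56 min) instead of the rounded 24 h. All function and constant names are illustrative.

```python
import math

# Assumed physical constants (commonly published values, not from the text)
MU_EARTH = 3.986004418e14      # Earth's gravitational parameter GM, m^3/s^2
R_EARTH = 6378137.0            # Earth's equatorial radius, m
J2 = 1.08263e-3                # Earth's oblateness coefficient (dimensionless)
SIDEREAL_DAY_S = 86164.0905    # one Earth rotation relative to the stars, s
TROPICAL_YEAR_DAYS = 365.2422  # one revolution around the sun, days


def geostationary_altitude_km():
    """Altitude of a circular orbit whose period equals one Earth rotation."""
    n = 2.0 * math.pi / SIDEREAL_DAY_S           # required angular velocity, rad/s
    a = (MU_EARTH / n ** 2) ** (1.0 / 3.0)       # Kepler's third law: n^2 a^3 = GM
    return (a - R_EARTH) / 1000.0


def sun_synchronous_inclination_deg(altitude_km):
    """Inclination at which a circular orbit's plane precesses 360 deg per year.

    Uses the first-order J2 nodal precession rate
        dOmega/dt = -1.5 * J2 * (R/a)^2 * n * cos(i)
    and solves for i such that dOmega/dt equals the sun's apparent mean motion.
    """
    a = R_EARTH + altitude_km * 1000.0
    n = math.sqrt(MU_EARTH / a ** 3)             # mean motion of the satellite, rad/s
    required_rate = 2.0 * math.pi / (TROPICAL_YEAR_DAYS * 86400.0)  # rad/s, eastward
    cos_i = -required_rate / (1.5 * J2 * (R_EARTH / a) ** 2 * n)
    return math.degrees(math.acos(cos_i))


if __name__ == "__main__":
    print(f"GEO altitude: {geostationary_altitude_km():.0f} km")
    for h in (600.0, 800.0):
        inc = sun_synchronous_inclination_deg(h)
        print(f"Sun-synchronous inclination at {h:.0f} km: {inc:.2f} deg")
```

Under these assumptions, the geostationary altitude comes out at roughly 35,800 km, and the sun-synchronous inclination lies slightly below 98° at 600 km and slightly above 98.5° at 800 km, consistent with the approximate values given in the text. The negative cosine shows why the orbit must be slightly retrograde (inclination above 90°): only then does the equatorial bulge turn the orbital plane eastward, in step with the sun.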
Fig. 1 Envisat satellite with its platform (service module and solar array) and the payload composed of ten instruments
2.1
Space Segment
The space segment (S/S) mainly consists of the spacecraft, i.e., the EO satellite, which classically has a modular design and is subdivided into the satellite platform (or bus) and the satellite payload. Figure 1 shows the Envisat satellite as an example. The satellite bus usually offers all interfaces (mechanical, thermal, energy, data handling) that are necessary to run the payload instruments mounted on it. In addition, it usually contains all housekeeping, positioning, and attitude control functions as well as a propulsion system. It should be mentioned that small satellites often have a more integrated design. In some cases, the space segment can consist of more than one satellite, e.g., if two or more satellites fly in tandem, such as Swarm (see Sect. 3). The space segment may also include a data relay satellite (DRS), positioned in a geostationary orbit and used to relay data between the EO satellite and the Earth when the satellite is out of visibility of the ground stations, such as the ESA Artemis data relay satellite for the Envisat satellite.
2.2
Ground Segment
The ground segment (G/S) provides the means and resources to manage and control the EO satellite, to receive and process the data produced by the payload
Earth Observation Satellite Missions and Data Access
Fig. 2 Main elements of an EO ground segment including the Flight Operations Segment (FOS) and the Payload Data Segment (PDS)
instruments, and to disseminate and archive the generated products. In general, the ground segment can be split into two major elements (see Fig. 2):
– The Flight Operations Segment (FOS), which is responsible for the command and control of the satellite
– The Payload Data Segment (PDS), which is responsible for the exploitation of the instrument data
Both G/S elements have different communication paths: satellite control and telecommand uplink normally use S-band, whereas instrument data is downlinked to ground stations via X-band either directly or after onboard recording. In addition, Ka-band or laser links with very high bandwidth are used for inter-satellite links needed by data relay satellites like Artemis or the future EDRS (European Data Relay System).
Flight Operations Segment
The Flight Operations Segment (FOS) at ESA is under the responsibility of the European Space Operations Centre (ESOC) located in Darmstadt, Germany. The mandate of ESOC is to conduct mission operations for ESA satellites and to establish, operate, and maintain the related ground segment infrastructure. Most ESA EO satellites are controlled and commanded from ESOC control rooms. ESOC’s involvement in a new mission normally begins with the analysis of possible operational orbits or trajectories and the calculation of the corresponding launch windows – selected to make sure that the conditions encountered in this early
phase remain within the spacecraft capabilities. The selection of the operational orbit is a complex task with many trade-offs involving the scientific objectives of the mission, the launch vehicle, the spacecraft, and the ground stations. ESOC’s activity culminates during the Launch and Early Orbit Phase (LEOP), with the first steps after the satellite separates from the launcher’s uppermost stage, including the deployment of antennas and solar arrays as well as critical orbit and attitude control maneuvers. After the LEOP, ESOC starts the operations of the FOS, which generally include the command and control of the satellite; the uploading of satellite operations, based on the observation plans prepared by the Payload Data Segment; the satellite configuration and performance monitoring; the orbit prediction, restitution, and maintenance; and the contingency and recovery operations following satellite anomalies. ESOC also provides a valuable service for avoiding collisions with space debris, a risk that is particularly high for LEO satellites orbiting at around 800 km altitude. More information about ESOC activities can be found at http://www.esa.int/esoc.
Payload Data Segment
The EO Payload Data Segment (PDS) at ESA is under the responsibility of ESRIN, located in Frascati, Italy, and known as the ESA Centre for Earth Observation, because it also includes the ESA activities for developing the EO data exploitation. The PDS provides all services related to the exploitation of the data produced by the instruments embarked on board the ESA EO satellites and is therefore the gateway for EO data users. The activities performed within a PDS are:
– The payload data acquisition using a network of worldwide acquisition stations as well as data relay satellites such as Artemis
– The processing of products either in near real time from the above data acquisitions or on demand from the archives
– The monitoring of product quality and instrument performance, as well as the regular upgrade of data processing algorithms
– The archiving and long-term data preservation and the data reprocessing following the upgrade of processing algorithms
– The interfaces to the user communities from user order handling to product delivery, including planning of instrument observations requested by the users
– The availability of data products on Internet servers or through dedicated satellite communication (see Sect. 5 for data access)
– The development of new data products and new services in response to evolving user demand
The ESA PDS is based on a decentralized network of acquisition stations and archiving centers. The data acquisition stations located close to the poles can “see” most of the LEO satellite passes and are therefore used for acquiring the data recorded on board during the previous orbit. Acquisition stations located away from polar areas are used essentially to acquire data collected over their station mask
Fig. 3 Example of a network of acquisition stations for the data transmitted by the ESA Envisat satellite. In addition to the above ground stations, the Envisat data was also transmitted to ground through the ESA data relay satellite Artemis
(usually 4,000 km diameter). The archiving centers are generally duplicated to avoid data loss in case of fire or accidents (Fig. 3). In carrying out the PDS activities, ESA works closely with other space agencies as well as with coordination and standardization bodies. This includes a strong interagency collaboration to acquire relevant EO data following a natural disaster. More information about ESRIN activities can be found at http://www.esa.int/esrin.
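The station mask diameter of roughly 4,000 km quoted above follows from simple spherical geometry: a station can only receive data while the satellite is above a minimum elevation angle over its horizon. The sketch below is an illustration under stated assumptions (spherical Earth with a mean radius of 6,371 km, a circular orbit at 800 km, and a minimum elevation of 10°, none of which are given in the chapter); the function name is hypothetical.

```python
import math

R_EARTH_KM = 6371.0  # mean Earth radius (assumed, spherical Earth)


def station_mask_diameter_km(altitude_km: float, min_elevation_deg: float) -> float:
    """Ground diameter of the circle inside which a satellite is visible.

    The satellite is considered visible when it is at least
    `min_elevation_deg` above the local horizon of the station.
    """
    eps = math.radians(min_elevation_deg)
    # Earth central angle between the station and the sub-satellite point
    # when the satellite sits exactly at the minimum elevation.
    central_angle = math.acos(
        R_EARTH_KM * math.cos(eps) / (R_EARTH_KM + altitude_km)
    ) - eps
    return 2.0 * R_EARTH_KM * central_angle


if __name__ == "__main__":
    for elevation in (5.0, 10.0):
        d = station_mask_diameter_km(800.0, elevation)
        print(f"800 km orbit, {elevation:.0f} deg mask: {d:.0f} km diameter")
```

With a 10° minimum elevation, the diameter comes out somewhat above 4,000 km, consistent with the order of magnitude given in the text; a lower elevation limit enlarges the mask, a higher one shrinks it.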
3
Overview on ESA Earth Observation Programs
3.1
Background
The European Space Agency (ESA) is Europe’s gateway to space. Its mission is to shape the development of Europe’s space capability and ensure that investment in space continues to deliver benefits to the citizens of Europe and the world. ESA is an international organization with 20 Member States. By coordinating the financial and intellectual resources of its members, it can undertake programs and activities far beyond the scope of any single European country. ESA’s programs are designed to find out more about Earth, its immediate space environment, our Solar System, and the Universe, as well as to develop satellite-based technologies and services, and to promote European industries. ESA works closely with space organizations inside and outside Europe. ESA’s Earth Observation program, known as the ESA Living Planet program, embodies the fundamental goals of Earth Observation in developing our knowledge
of Earth, preserving our planet and its environment, and managing life on Earth in a more efficient way. It aims to satisfy the needs and constraints of two groups, namely, the scientific community and the providers of operational satellite-based services. The ESA Living Planet program is therefore composed of two main components: a science and research element in the form of Earth Explorer missions and an operational element known as Earth Watch, designed to facilitate the delivery of Earth Observation data for use in operational services. The Earth Watch element includes the future development of meteorological missions in partnership with EUMETSAT and also new missions under the European Union’s GMES initiative, where ESA is the partner responsible for developing the Space Component. The past ERS and Envisat missions contributed both to the science and to the applications elements of the ESA Living Planet program. General information on ESA Earth Observation programs can be found at http://www.esa.int/esaEO.
3.2
ERS-1 and ERS-2 Missions
The ERS-1 (European Remote Sensing) satellite, launched in 1991, was ESA’s first Earth Observation satellite in low Earth orbit; it carried a comprehensive payload including an imaging C-band Synthetic Aperture Radar (SAR), a radar altimeter, and other instruments to measure ocean surface temperature and winds at sea. ERS-2, which overlapped with ERS-1, was launched in 1995 with an additional sensor for atmospheric ozone research. At their time of launch, the two ERS satellites were the most sophisticated Earth Observation spacecraft ever developed and launched in Europe. These highly successful ESA satellites have collected a wealth of valuable data on the Earth’s land surfaces, oceans, and polar caps and have been called upon to monitor natural disasters such as severe flooding or earthquakes in remote parts of the world. ERS-1 was unique in its systematic and repetitive global coverage of the Earth’s oceans, coastal zones, and polar ice caps, monitoring wave heights and wavelengths, wind speeds and directions, sea levels and currents, sea surface temperatures, and sea ice parameters. Until ERS-1 appeared, such information was sparse over the polar regions and the southern oceans. The ERS missions were both an experimental and a preoperational system, since they had to demonstrate that the concept and the technology had matured sufficiently for successors such as Envisat and that the system could routinely deliver to end users some data products such as reliable sea ice distribution charts and wind maps within a few hours of the satellite observations. The experimental nature of the ERS missions was underlined shortly after the launch of ERS-2 in 1995 when ESA decided to link the two spacecraft in the first ever “tandem” mission, which lasted for 9 months. 
During this time the increased frequency and level of data available to scientists offered a unique opportunity to observe changes over a very short space of time, as both satellites orbited Earth only 24 h apart, and to experiment with innovative SAR measurement techniques.
The ERS-1 satellite ended its operations in 2000, far exceeding its 3-year planned lifetime. The ERS-2 satellite operated until 2011, i.e., 16 years after its launch, maximizing the benefit of the past investment. More information on the ERS missions can be found at http://earth.esa.int/ers/.
3.3
Envisat Mission
Envisat, ESA’s second-generation remote-sensing satellite, launched in 2002, not only provided continuity of many ERS observations – notably the ice and ocean elements – but added important new capabilities for understanding and monitoring our environment, particularly in the areas of atmospheric chemistry and ocean biological processes. Envisat was the largest and most complex satellite ever built in Europe. Its package of ten instruments made major contributions to the global study and monitoring of the Earth and its environment, including global warming, climate change, ozone depletion, and ocean and ice monitoring. Secondary objectives were more effective monitoring and management of the Earth’s resources and a better understanding of the solid Earth. As a total package, Envisat capabilities exceeded those of any previous or planned Earth Observation satellite. The payload included three new atmospheric sounding instruments designed primarily for atmospheric chemistry, including measurement of ozone in the stratosphere. The advanced C-band Synthetic Aperture Radar collected high-resolution images with a variable viewing geometry, with wide swath and selectable dual polarization capabilities. A new imaging spectrometer was included for ocean color and vegetation monitoring, and there were improved versions of the ERS radar altimeter, microwave radiometer, and visible/near-infrared radiometers, together with a new very precise orbit measurement system. Combined with ERS-1 and ERS-2 missions, the Envisat mission is an essential element in providing long-term continuous data sets that are crucial for addressing environmental and climatological issues. In addition, the Envisat mission further promoted the gradual transfer of applications of Earth Observation data from experimental to preoperational and operational exploitation. 
Although its nominal lifetime was 5 years, ESA was able to operate the Envisat satellite for 10 years, i.e., until 2012, when a failure in the power subsystem abruptly ended the satellite control. More information on the Envisat mission can be found at http://envisat.esa.int.
3.4
Proba-1
Launched in 2001, the small Project for On-Board Autonomy (Proba) satellite was intended as a 1-year ESA technology demonstrator. Once in orbit, however, its unique capabilities and performance made it evident that it could make big contributions to science so its operational lifetime is currently extended until 2012 to
serve as an Earth Observation mission. Its main payload is a spectrometer (CHRIS), designed to acquire hyperspectral images with up to 63 spectral bands. Also aboard is the high-resolution camera (HRC), which acquires 5 m black and white images.
3.5
Earth Explorers
The Earth Explorer missions form the science and research element of ESA’s Living Planet program and focus on the atmosphere, biosphere, hydrosphere, cryosphere, and the Earth’s interior with the overall emphasis on learning more about the interactions between these components and the impact that human activity is having on natural Earth processes. The Earth Explorer missions are designed to address key scientific challenges identified by the science community while demonstrating breakthrough technology in observing techniques. By involving the science community right from the beginning in the definition of new missions and introducing a peer-reviewed selection process, it is ensured that a resulting mission is developed efficiently and provides the exact data required by the user. The process of mission selection has given the Earth science community an efficient tool for advancing the understanding of the Earth system. The science questions addressed also form the basis for development of new applications of Earth Observation. This approach also gives Europe an excellent opportunity for international cooperation, both within the wide scientific domain and also in the technological development of new missions (ESA 2006). The family of Earth Explorer missions is a result of this strategy. Currently, there are seven Earth Explorer missions and a further three undergoing feasibility study:
GOCE (Gravity Field and Steady-State Ocean Circulation Explorer)
GOCE was dedicated to measuring the Earth’s gravity field and modeling the geoid with unprecedented accuracy and spatial resolution to advance our knowledge of ocean circulation, which plays a crucial role in energy exchanges around the globe, sea-level change, and Earth interior processes. GOCE also made significant advances in the field of geodesy and surveying. Launched in 2009 with a 2-year planned lifetime, the mission operations were extended by another 2.5 years, including a lowering of the satellite orbit in order to obtain an even better geoid spatial resolution. After 4.5 years of successful operations, the GOCE satellite ran out of propellant, re-entered, and burned up in the atmosphere in November 2013. [This book includes a chapter dedicated to Geodesy and the GOCE mission, written by R. Rummel]
SMOS (Soil Moisture and Ocean Salinity)
Launched in 2009, SMOS observes soil moisture over the Earth’s landmasses and salinity over the oceans. Soil moisture and sea-surface salinity are two variables in Earth’s water cycle that scientists need on a global scale for a variety of applications, such as oceanographic, meteorological, and hydrological forecasting, as well as research into climate change.
CryoSat
Launched in 2010, CryoSat acquires accurate measurements of the thickness of floating sea ice, so that seasonal to interannual variations can be detected, and also surveys the surface of continental ice sheets to detect small elevation changes. CryoSat’s main objective is to provide regional trends in Arctic perennial sea ice thickness and mass and to determine the contribution that the Antarctic and Greenland ice sheets are making to the mean global rise in sea level. After 2 years of operations, CryoSat’s results confirmed, for the first time, that the decline in sea ice coverage in the polar region has been accompanied by a substantial decline in ice volume.
Swarm
Launched in 2013, Swarm is a constellation of three satellites that will provide high-precision and high-resolution measurements of the strength and direction of the Earth’s magnetic field. The geomagnetic field models resulting from the Swarm mission will provide new insights into the Earth’s interior, will further our understanding of atmospheric processes related to climate and weather, and will also have practical applications in many different areas such as space weather and radiation hazards.
ADM-Aeolus (Atmospheric Dynamics Mission)
Due for launch in 2016, ADM-Aeolus will be the first space mission to measure wind profiles on a global scale. It will improve the accuracy of numerical weather forecasting and advance our understanding of atmospheric dynamics and processes relevant to climate variability and climate modeling. ADM-Aeolus is seen as a mission that will pave the way for future operational meteorological satellites dedicated to measuring the Earth’s wind fields.
EarthCARE (Earth Clouds Aerosols and Radiation Explorer)
Due for launch in 2018, EarthCARE is being implemented in cooperation with the Japan Aerospace Exploration Agency (JAXA). 
The mission addresses the need for a better understanding of the interactions between cloud, radiative, and aerosol processes that play a role in climate regulation.
Future Earth Explorers
In 2005, ESA released the latest opportunity for scientists from ESA Member States to submit proposals for ideas to be assessed for the next in the series of Earth Explorer missions. As a result, 24 proposals were evaluated and a shortlist of six missions underwent assessment study. Subsequently, three proposals were selected for the next stage of development (feasibility study). This process led to the selection of the BIOMASS mission as ESA’s seventh Earth Explorer mission. The BIOMASS mission aims to take measurements of forest biomass to assess terrestrial carbon stocks and fluxes.
The process for selecting the eighth Earth Explorer mission is ongoing. More information on Earth Explorer missions can be found at http://www.esa.int/esaLP/LPearthexp.html.
3.6
GMES and Sentinels
The Global Monitoring for Environment and Security (GMES) program, now called Copernicus, has been established to fulfill the growing need among European policy-makers to access accurate and timely information services to better manage the environment, understand and mitigate the effects of climate change, and ensure civil security. Under the leadership of the European Commission, Copernicus relies largely on data from satellites observing the Earth. Hence, ESA is developing and managing the Copernicus Space Component. The European Commission, acting on behalf of the European Union, is responsible for the overall initiative, setting requirements, and managing the Copernicus services. To ensure the operational provision of the Earth Observation data, the Space Component includes a series of five space missions called “Sentinels,” which are being developed by ESA specifically for GMES. In addition, data from satellites that are already in orbit or are planned will also be used for the initiative. These “Contributing Missions” include both existing and new satellites, whether owned and operated at European level by the EU, ESA, and EUMETSAT or on a national basis. They also include data acquired from non-European partners. The GMES Space Component forms part of the European contribution to the worldwide Global Earth Observation System of Systems (GEOSS). The acquisition of reliable information and the provision of services form the backbone of Europe’s GMES initiative. Services are based on data from a host of existing and planned EO satellites from European and national missions, as well as a wealth of measurements taken in situ from instruments carried on aircraft, floating in the oceans, or positioned on the ground. Services provided through GMES fall so far into five main categories: services for land management, services for the marine environment, services relating to the atmosphere, services to aid emergency response, and services associated with security. 
The service component of GMES is under the responsibility of the European Commission. More information on GMES can be found at http://www.esa.int/esaLP/LPgmes.html.
Sentinel Missions
The success of Copernicus will be achieved largely through a well-engineered Space Component that provides EO data and turns them into services for continuous monitoring of the environment. The GMES Space Component comprises five types of new satellites called Sentinels that are being developed by ESA specifically to meet the needs of GMES.
The Sentinel missions include radar and super-spectral imaging for land, ocean, and atmospheric monitoring. The so-called A and B models of the first three Sentinels (Sentinel-1A/1B, Sentinel-2A/2B, and Sentinel-3A/3B) are currently under industrial development, with the first satellite, Sentinel-1A, launched in 2014. Sentinel-1 will provide all-weather, day and night C-band radar imaging for land and ocean services and will allow the continuation of the SAR measurements initiated with the ERS and Envisat missions; Sentinel-2 will provide high-resolution optical imaging for land services; Sentinel-3 will carry an altimeter as well as optical and infrared radiometers for ocean and global land monitoring as a continuation of Envisat measurements; and Sentinel-4 and Sentinel-5 will provide data for atmospheric composition monitoring (Sentinel-4 from geostationary orbit and Sentinel-5 from low Earth polar orbit). Sentinel-4 and Sentinel-5 will be instruments carried on the next generation of EUMETSAT meteorological satellites – Meteosat Third Generation (MTG) and post-EUMETSAT Polar System (EPS), respectively. However, a dedicated Sentinel-5 precursor mission is planned to be launched in 2015 to reduce the data gap between Envisat and Sentinel-5.
Copernicus Contributing Missions
Before data from the Sentinel satellites is available, missions contributing to Copernicus play a crucial role in ensuring that an adequate dataset is provided for the Copernicus services. The role of the Contributing Missions will, however, continue to be essential once the Sentinels are operational by complementing Sentinel data and ensuring that the whole range of observational requirements is satisfied. Contributing Missions are operated by national agencies or commercial entities within ESA’s Member States, EUMETSAT, or other third parties. GMES Contributing Missions data initially address services for land and ice and also focus on ocean and atmosphere. Current services mainly concentrate on the following observation techniques:
– Synthetic Aperture Radar (SAR) sensors, for all-weather day/night observations of land, ocean, and ice surfaces
– Medium- and low-resolution optical sensors for information on land cover, for example, agriculture indicators, ocean monitoring, coastal dynamics, and ecosystems
– High-resolution and medium-resolution optical sensors – panchromatic and multispectral – for regional and national land monitoring activities
– Very high-resolution optical sensors for targeting specific sites, especially in urban areas, as well as for security applications
– High accuracy radar altimeter systems for sea-level measurements and climate applications
– Radiometers to monitor land and ocean temperature
– Spectrometer measurements for air quality and atmospheric composition monitoring
A ground segment, facilitating access to both Sentinel and Contributing Missions data, complements the GMES space segment. More information on GMES data can be found at http://gmesdata.esa.int.
3.7
Meteorological Programs
Meteosat
With the launch of the first Meteosat satellite into a geostationary orbit in November 1977, Europe gained the ability to gather weather data over its own territory with its own satellite. Meteosat began as a research program for a single satellite by the European Space Research Organisation, a predecessor of ESA. Once the satellite was in orbit, the immense value of the data it provided led to the move from a research to an operational mission. ESA launched three more Meteosat satellites before the founding of EUMETSAT, the partner organization of ESA for the meteorological programs. Launched in 1997, Meteosat-7 was the last of the first generation of Meteosat satellites. The first generation of seven Meteosat satellites brought major improvements to weather forecasting. But technological advances and increasingly sophisticated weather forecasting requirements created demand for more frequent, more accurate, and higher-resolution space observation. To meet this demand, EUMETSAT started the Meteosat Second Generation (MSG) program, in coordination with ESA. In 2002, EUMETSAT launched the first MSG satellite, renamed Meteosat-8 when it began routine operations to clearly maintain the link to earlier European weather satellites. It was the first of four MSG satellites, which are gradually replacing the original Meteosat series. The Meteosat Third Generation (MTG) will take over in 2018/2019 from Meteosat-11, the last of the series of four MSG satellites. MTG will enhance the accuracy of forecasts by providing additional measurement capability, higher resolution, and more timely provision of data. Like its predecessors, MTG is a joint project between EUMETSAT and ESA that followed the success of the first-generation Meteosat satellites.
MetOp
Launched in 2006, in partnership between ESA and EUMETSAT, MetOp is Europe’s first polar-orbiting satellite dedicated to operational meteorology. 
It represents the European contribution to a new cooperative venture with the United States providing data to monitor climate and improve weather forecasting. MetOp is a series of three satellites to be launched sequentially over 14 years, forming the space segment of EUMETSAT’s Polar System (EPS). MetOp carries a set of “heritage” instruments provided by the United States and a new generation of European instruments that offer improved remote sensing capabilities to both meteorologists and climatologists. The new instruments improve the accuracy of temperature and humidity measurements, readings of wind speed and direction, and atmospheric ozone profiles.
Preparations have started for the next generation of this EUMETSAT Polar System, the so-called Post-EPS. More information can be found at http://www.eumetsat.int.
3.8
ESA Third Party Missions
ESA uses its multi-mission ground systems to acquire, process, distribute, and archive data from other satellites – known as Third Party Missions (TPM). The data from these missions are distributed under specific agreements with the owners or operators of those missions, which can be either public or private entities outside or within Europe. ESA Third Party Missions address most of the existing observation techniques:
– Synthetic Aperture Radar (SAR) sensors, e.g., L-band instrument (PALSAR) on board the Japanese Space Agency ALOS satellite
– Low-resolution optical sensors, e.g., MODIS sensors on board the US Terra and Aqua satellites, SeaWiFS sensor on board the US OrbView-2 satellite, etc.
– High-resolution optical sensors, e.g., Landsat imagery (15–30 m), SPOT-4 data (10 m), a 10 m radiometer instrument (AVNIR-2) on board ALOS, medium resolution (20–40 m) sensors on board the Disaster Monitoring Constellation (DMC), etc.
– Very high-resolution optical sensors, e.g., US Ikonos-2 (1–4 m), Korean Kompsat-2 (1–4 m), Taiwan Formosat-2 (2–8 m), Japanese PRISM (2.5 m) on board ALOS, etc.
– Atmospheric chemistry sensors, e.g., Swedish Odin, Canadian SciSat-1, and Japanese GOSAT satellites
The complete list of Third Party Missions currently supported by ESA is available at http://earth.esa.int/thirdpartymissions/.
4
Some Major Results of ESA Earth Observation Missions
Since 1991, the flow of data provided by the consecutive ERS-1, ERS-2, and Envisat missions has been converted into extensive results, giving new insights into our planet. These results encompass many fields of Earth science, including land, ocean, ice, and atmosphere studies, and have shown the importance of long-term data collection to identify trends such as those associated with climate change. The ERS and Envisat missions are valuable tools not only for Earth scientists but gradually also for public institutions providing operational services such as
sea ice monitoring for ship routing or UV index forecast. The ERS and Envisat data have also stimulated the emergence of new analysis techniques such as SAR Interferometry.
4.1
Land
One of the most striking results of the ERS and Envisat missions for land studies is the development of the interferometry technique using Synthetic Aperture Radar (SAR) instruments. SAR instruments are microwave imaging systems. Besides their cloud penetrating capabilities and day and night operational capabilities, they have also “interferometric” capabilities, i.e., capabilities to measure very accurately the travel path of the transmitted and received radiation. SAR Interferometry (InSAR) exploits the variations of the radiation travel path over the same area observed at two or more acquisition times with slightly different viewing angles. Using InSAR, scientists are able to measure millimetric surface deformations of the terrain, like the ones associated with earthquake movements, volcano deformation, landslides, or terrain subsidence (ESA 2007). Since the launch of ERS-1 in 1991, InSAR has advanced the fields of tectonics and volcanology by allowing scientists to monitor the terrain deformation anywhere in the world at any time. Some major earthquakes, such as in Landers, California, in 1992 and Bam, Iran, in 2003 or the 2009 L’Aquila earthquake in central Italy, were “imaged” by the ERS and Envisat missions, allowing geologists to better understand the fault rupture mechanisms. Using ERS and Envisat data, scientists have been able to monitor the long-term behavior of some volcanoes such as Mt. Etna, providing crucial information for understanding how the volcano’s surface deformed during the rise, storage, and eruption of magma. Changes in surface deformation, such as sinking, bulging, and rising, are indicators of different stages of volcanic activity, which may result in eruptions. Thus, precise monitoring of a volcano’s surface deformation could lead to predictions of eruptions. The InSAR technique was also exploited by merging SAR data acquired by different satellites. 
For 9 months in 1995/1996, the ERS-1 and ERS-2 satellites undertook a “tandem” mission, in which they orbited Earth only 24 h apart. The acquired image pairs provide much greater interferometric coherence than is normally possible, allowing scientists to generate detailed digital elevation maps and observe changes over a very short space of time. The same “tandem” approach was followed by the ERS-2 and Envisat satellites, adding to the ever growing set of SAR interferometric data. Because SAR instruments are able “to see” through clouds or at night time, their ability to map river flooding was quickly recognized and has been gradually exploited by civil protection authorities. Similarly, other ERS and Envisat instruments such as the ATSR infrared radiometers have been able to provide relevant forest fire statistics. Using data of the MERIS spectrometer instrument on board Envisat, the most detailed Earth global land cover map was created. This global map, which is ten
times sharper than any previous global satellite map, was derived by an automatic and regionally adjusted classification of the MERIS data global composites using land cover classes defined according to the UN Land Cover Classification System (LCCS). The map and its various composites support the international community in modeling climate change impacts, studying ecosystems, and plotting worldwide land-use trends.
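The sensitivity of SAR Interferometry described earlier in this section can be made concrete with the basic relation between interferometric phase and ground motion along the radar line of sight. The sketch below is illustrative only: it assumes a C-band radar frequency of 5.3 GHz (the chapter states only "C-band"), ignores atmospheric, orbital, and topographic phase contributions, and uses a hypothetical function name.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def los_displacement_m(phase_change_rad: float,
                       radar_frequency_hz: float = 5.3e9) -> float:
    """Line-of-sight displacement corresponding to an InSAR phase change.

    The 5.3 GHz default is an assumed C-band value, not quoted in the text.
    """
    wavelength = SPEED_OF_LIGHT / radar_frequency_hz
    # The signal travels to the ground and back, so a shift d along the
    # line of sight changes the path by 2*d and the phase by 4*pi*d/wavelength.
    return wavelength * phase_change_rad / (4.0 * math.pi)


if __name__ == "__main__":
    one_fringe = los_displacement_m(2.0 * math.pi)
    tenth_fringe = los_displacement_m(0.2 * math.pi)
    print(f"one full fringe : {one_fringe * 1000:.1f} mm")
    print(f"a tenth fringe  : {tenth_fringe * 1000:.1f} mm")
```

Under these assumptions one full phase cycle corresponds to half a wavelength, i.e., just under 3 cm, so resolving a fraction of a cycle is what makes the millimetric deformation measurements mentioned above plausible.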
4.2 Ocean
Through the availability of ERS and Envisat satellite data, scientists have gained an understanding of the ocean and its interaction with the entire Earth system that they could not otherwise have obtained. Thanks to their 20-year extent, the time series of the ERS and Envisat missions’ data allow scientists to investigate the effects of climate change, in particular on the oceans. These long-term measurements make it possible to remove the yearly variability of most geophysical parameters, providing results of fundamental significance in the context of climate change. Radar altimeters on board satellites play an important role in those long-term measurements. They work by sending thousands of separate radar pulses down to the Earth per second and then recording how long their echoes take to bounce back to the satellite platform. The sensor times its pulses’ journey to better than a nanosecond in order to calculate the distance to the planet below with an accuracy of up to 2 cm. The consecutive availability of altimeters on board ERS-1, ERS-2, and then Envisat made it possible to establish that the global mean sea level has risen by about 2–3 mm per year since the early 1990s, with important regional variations. Sea-surface temperature (SST) is one of the most stable of several geophysical variables which, when determined globally, help diagnose the state of the Earth’s climate system. Tracking SST over a long period is a reliable way of measuring the precise rate at which global temperatures are increasing, and it improves the accuracy of climate change models and weather forecasts. Measurements made by the ERS and Envisat ATSR radiometer instruments give evidence of a distinctive upward trend in global sea-surface temperatures. The ATSR instruments produced data of unrivaled accuracy on account of their unique dual view of the Earth’s surface, whereby each part of the surface is viewed twice, through two different atmospheric paths.
This not only enables scientists to correct for the effects of dust and haze, which degrade measurements of surface temperature from space, but also enables them to derive new measurements of the actual dust and haze, which are needed by climate scientists. One of the main assets of the Envisat mission is its multisensor capability, which allows geophysical phenomena to be observed from various “viewpoints.” A good example is the observation of cyclones: the data returned by Envisat include cloud structure and height at the top of the cyclone, wind and wave fields at the bottom, sea-surface temperature, and even sea height anomalies, which are indicative of the upper ocean thermal conditions that influence the cyclone’s intensity.
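The timing requirement quoted above for radar altimeters can be checked with a few lines of arithmetic. The following Python sketch converts a two-way travel time into a range and, conversely, a range accuracy into the timing precision it implies. It assumes propagation at the vacuum speed of light and an orbital height of about 800 km; both are simplifications made for illustration, and atmospheric path delays are ignored.

```python
# Illustrative sketch: two-way travel time versus range for a nadir-looking altimeter.
# Assumptions (not from the text): vacuum speed of light, no atmospheric delay,
# and an orbital height of roughly 800 km.

C = 299_792_458.0  # speed of light in vacuum, m/s


def range_from_two_way_time(t_seconds):
    """Distance to the surface from the round-trip travel time of one pulse."""
    return C * t_seconds / 2.0


def timing_precision_for_range(delta_range_m):
    """Round-trip timing precision needed to resolve a given range difference."""
    return 2.0 * delta_range_m / C


if __name__ == "__main__":
    altitude_m = 800e3  # assumed orbital height
    t = 2.0 * altitude_m / C
    print(f"round-trip time for 800 km: {t * 1e3:.3f} ms")
    dt = timing_precision_for_range(0.02)
    print(f"timing precision for 2 cm:  {dt * 1e9:.3f} ns")
```

Under these assumptions a range accuracy of 2 cm corresponds to a round-trip timing precision of a little more than a tenth of a nanosecond, which is consistent with the statement that the pulses are timed to better than a nanosecond.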
H. Laur and V. Liebig
Fig. 4 Envisat ASAR image acquired on 17 November 2002, showing a double-headed oil spill originating from the stricken Prestige tanker, lying 100 km off the Spanish coast
The ERS and Envisat SAR data also stimulated the development of maritime applications. These include the monitoring of illegal fisheries and of oil slicks (Fig. 4), which can be natural or the result of human activities. The level of shipping and offshore activities occurring in and around icebound regions is growing steadily and with it the demand for reliable sea ice information. Traditionally, ice-monitoring services were based on data from aircraft, ships, and land stations. But the area coverage available from such sources is always limited and often impeded further by bad weather. Satellite data have begun to fill this performance gap, enabling continuous wide-area ice surveillance. SAR instruments of the type flown on ERS and Envisat are able to see through clouds and darkness and are therefore particularly well suited to generating high-quality images of sea ice. Ice classification maps generated from radar imagery are now being supplied to users at sea. The maritime operational applications will be continued with the Sentinel-1 data.
4.3 Cryosphere
The cryosphere is both influenced by and has a major influence on climate. Because any increase in the melt rate of ice sheets and glaciers has the potential to greatly increase sea level, researchers are looking to the cryosphere to get a better idea of the likely scale of the impact of climate change. In addition, the melting of sea ice will increase the amount of solar radiation that is absorbed by ice-free polar oceans rather than reflected by ice-covered oceans, increasing the ocean temperature. The remoteness, darkness, and cloudiness of Earth’s polar regions make them difficult to study. Being active microwave instruments, the radars on board the ERS and Envisat missions were able to see through clouds and darkness. For about 30 years, satellites have been observing the Arctic and have witnessed reductions in the minimum Arctic sea ice extent (the lowest amount of ice recorded in the area annually) at the end of summer, from around 8 million km² in the early 1980s to the historic minimum of 3.5 million km² in 2012, changes widely viewed as a consequence of greenhouse warming. The ERS and Envisat radar instruments, i.e., the imaging radar (SAR) and the radar altimeter, witnessed this sharp decline, providing detailed measurements of sea ice area and sea ice thickness, respectively. The CryoSat mission adds accurate sea ice thickness measurements to complement the sea ice extent measurements. In addition to mapping sea ice, scientists have used repeat-pass SAR image data to map the flow velocities of glaciers. Using SAR data collected by ERS-1 and ERS-2 during their tandem mission in 1995 and by Envisat and Canada’s Radarsat-1 in 2005, scientists discovered that the Greenland glaciers are melting at a pace twice as fast as previously thought.
Such a rapid pace of melting was not considered in previous simulations of climate change, which shows the important role of Earth Observation in advancing our knowledge of climate change and improving climate models. Similar phenomena also take place in Antarctica, with some large glaciers such as the Pine Island Glacier thinning at a constantly accelerating rate, as suggested by ERS and Envisat altimetry data. In Antarctica, the stability of the glaciers is also related to their floating terminal platforms, the ice shelves. The breakup of large ice shelves (e.g., Larsen ice shelf, Wilkins ice shelf) has been observed by Envisat SAR and is likely a consequence of both sea and air temperature increase around the Antarctic Peninsula and West Antarctica.
4.4 Atmosphere
ERS-2 and Envisat were equipped with several atmospheric chemistry instruments, which can look vertically or sideways to map the atmosphere in three dimensions, producing high-resolution horizontal and vertical cross sections of trace chemicals stretching from ground level to 100 km in the air, all across a variety of scales.
Those instruments could detect holes in the thinning ozone layer, plumes of aerosols and pollutants hanging over major cities or burning forests, and exhaust trails left in the atmosphere by commercial airliners. The ERS-2 and Envisat satellites maintained a regular census of global stratospheric ozone levels from 1995 to 2012, mapping the yearly Antarctic ozone holes as they appeared. The size and precise timing of the ozone hole depend on the year-to-year variability in temperature and atmospheric dynamics, as established by satellite measurements. Envisat results benefited from improved sensor capabilities. As an example, the high spatial resolution delivered by the Envisat atmospheric instruments yields precise maps of atmospheric trace gases, even resolving individual city sources, as in the high-resolution global atmospheric map of nitrogen dioxide (NO2), an indicator of air pollution (Fig. 5). By linking these maps with the measurements started with ERS-2, scientists could note a significant increase in NO2 emissions above Eastern China, a tangible sign of the fast economic growth of China. Based on several years of Envisat observations, scientists have produced global distribution maps of the most important greenhouse gases that contribute to global warming, namely carbon dioxide (CO2) and methane (CH4). The importance of cutting emissions of these “anthropogenic,” or man-made, gases has been highlighted by European Union leaders endorsing binding targets to cut greenhouse gases in the midterm future. Careful monitoring is essential to ensuring these targets are met,
Fig. 5 Nitrogen dioxide (NO2) concentration map over Europe derived from several years of SCIAMACHY instrument data on board the Envisat satellite (Courtesy Institute of Environmental Physics, Univ. of Heidelberg)
and space-based instruments are a new means of contributing to this. The SCIAMACHY instrument on board the Envisat satellite was the first space sensor capable of measuring the most important greenhouse gases with high sensitivity down to the Earth’s surface because it observes the spectrum of sunlight shining through the atmosphere in “nadir”-looking operations. Envisat atmospheric chemistry data are useful for helping build scenarios of greenhouse gas emissions, such as those of methane, the second most important greenhouse gas after carbon dioxide. Increased methane concentrations induced mainly by human activities were observed. By comparing model results with satellite observations, the model is continually adjusted until it is able to reproduce the satellite observations as closely as possible. In this way, scientists continually improve their models and their knowledge of nature.
4.5 ESA Programs for Data Exploitation
Earth Observation is an inherently multipurpose tool. This means there is no typical Earth Observation user: it might be anyone who requires detailed characterization of any given segment of our planet, across a wide variety of scales from a single city block to a region or continent, right up to coverage of the entire globe. Earth Observation is already employed by many thousands of users worldwide. However, ESA works to further increase Earth Observation take-up by encouraging the development of new science and of new applications and services centered on user needs. New applications usually emerge from scientific research. ESA supports scientific research by providing easy access to high-quality data (see Sect. 5), by organizing dedicated workshops and symposia, by training users, by taking a proactive role in the formulation of new mission concepts, and by providing support to science. Converting basic research and development into an operational service requires the fostering of partnerships between research institutions, service companies, and user organizations. ESA’s Data User Element (DUE) program addresses institutional users tasked with collecting specific geographic or environmental data. The Data User Element aims to raise such institutions’ awareness of the applicability of Earth Observation to their day-to-day operations and to develop demonstration products tailored to increase their effectiveness. The intention is then to turn these products into sustainable services provided by public or private entities. More information on ESA’s Data User Element (DUE) program can be found at http://dup.esrin.esa.it. Complementing the Data User Element objectives is ESA’s Earth Observation Value Adding program. This provides a supportive framework within which to organize end-to-end service chains capable of turning scientific EO data into commercial tools supplied by self-supporting businesses.
More information on the ESA EO Value Adding program can be found at http://www.eomd.esa.int.
5 User Access to ESA Data
ESA endeavors to maximize the beneficial use of Earth Observation data. It does this by fostering the use of this valuable information by as many people as possible, in as many ways as possible. For users, and therefore for ESA, easy access to EO data is of paramount importance. However, the challenges for easy EO data access are many:
1. The ESA EO data policy shall be beneficial to various categories of users, ranging from global change scientists to operational services, and shall have the objective of stimulating a balanced development of Earth science, public services, and value-adding companies. ESA has always pursued an approach of low-cost fees for its satellite data, trying to provide the maximum amount of data free of charge. This approach will continue and even be reinforced in the future, by further increasing the amount of data available on the Internet and by reducing the complexity of EO missions.
2. The volume of data transmitted to the ground by ESA EO satellites is particularly high: the Envisat satellite transmitted about 270 GB of data every day; the future Sentinel-1 satellite will transmit about 900 GB of data every day. Once acquired, the data shall be transformed (i.e., processed) into products in which the information is related either to an engineering calibrated parameter (so-called Level 1 products, e.g., a SAR image) or to a geophysical parameter (so-called Level 2 products, e.g., sea surface temperature). Despite the high volume of data, the processing into Level 1 and Level 2 products shall be as fast as possible to serve increasingly demanding operational services. Finally, the data products shall be almost immediately available to users, either through a broad use of the Internet or through dedicated communication links.
3. The quality of EO data products shall be high so that users can effectively rely on the delivered information content. This means that the processing algorithms (i.e., the transformation of raw data into products) as well as the product calibration and validation shall be given strong attention. ESA has constantly given such attention to its EO missions, investing large efforts in algorithm development, particularly for innovative instruments such as the ones flying on board the Earth Explorer missions. Of equal importance for the credibility of EO data are the validation activities, which aim at comparing the geophysical information content of EO data products with similar measurements collected through, e.g., airborne campaigns or ocean buoy setups.
4. Finally, EO data handling tools shall be offered to users. Besides the general assistance given through a centralized ESA EO user service ([email protected]), the ESA user tools include:
– Online data information (http://earth.esa.int) providing EO mission news, data product descriptions, processing algorithm documentation, workshop proceedings, etc.
– Data collection visualization through online catalogues, including requests for product generation when the product does not yet exist (e.g., for future data acquisition) or direct download when the product is already available in online archives
– EO data software tools, aiming to facilitate the utilization of data products by provision of, e.g., viewing capabilities, innovative processing algorithms, format conversion, etc.
5.1 How to Access the EO Data at ESA
The Internet is the main way to access the EO satellite services and products at ESA. Besides the general description of data access given below, further assistance can be provided by the ESA EO help desk ([email protected]).
1. For most of the ESA EO data, open and free-of-charge access is granted after a simple user registration. The data products concerned are those that are systematically acquired, generated, and available online. This includes all altimetry, sea surface temperature, atmospheric chemistry, and future Earth Explorer data but also large collections of optical (MERIS, Landsat) and SAR datasets. User registration is done at http://eopi.esa.int. The detailed list of open and free-of-charge data products is available on this Web site. For users who are only interested in basic imagery (i.e., false color jpg images), ESA provides access to large galleries of free-of-charge Earth images (http://earth.esa.int/satelliteimages).
2. Some ESA EO data and services cannot be provided free of charge or openly, either because of restrictions in the distribution rights granted to ESA (e.g., some ESA Third Party Missions) or because the data/service is restrained by technical capacities and is therefore provided on demand (i.e., not systematically). The restrained dataset essentially includes on-demand SAR data acquisition and production. In this case, users shall describe the intended use of the data within a project proposal. The project proposals are collected at http://eopi.esa.int. ESA analyzes the project proposal to review its scientific objectives, to assess its feasibility, and to establish project quotas for the requested products and services (e.g., instrument tasking). The products and services are provided free of charge. Products are provided on the Internet.
6 Conclusion
During the last three decades, Earth Observation satellites have gradually taken on a fundamental role with respect to understanding and managing our planet.
Contributing to this trend, the Earth Observation program of the European Space Agency addresses a growing number of scientific issues and operational services, thanks to the continuous development of new satellites and sensors and to a considerable effort in stimulating the use of EO data. Partnerships between satellite operators and strong relations with user organizations are essential at ESA for further improving information retrieval from EO satellite data. The coming decade will see an increasing number of orbiting EO satellites, not only in Europe. This is a natural consequence of the growing user demands and expectations, but also of the gradual decrease of the costs for satellite manufacturing. The main challenge in Earth Observation will therefore be to maximize the synergies between existing satellites, both with respect to combining their respective observations and optimizing their operation concepts.
References
ESA (2006) The changing Earth – new scientific challenges for ESA’s Living Planet Programme. ESA SP-1304
ESA (2007) InSAR principles – guidelines for SAR interferometry processing and interpretation. ESA TM-19
Satellite-to-Satellite Tracking (Low-Low/High-Low SST) Wolfgang Keller
Contents
1 Introduction
2 Scientific Relevance
3 Conventions
3.1 Reference Systems
3.2 Basis Functions
4 Celestial Mechanics
5 Observation Models and Data Processing Strategies
5.1 Satellite-to-Satellite Tracking in the High–Low Mode
5.2 Satellite-to-Satellite Tracking in the Low–Low Mode
5.3 Integral Equations Approach
6 Regional Gravity Field Models
6.1 Final Remarks
7 Missions and Outcomes
7.1 CHAMP Mission
7.2 GRACE Mission
8 Conclusions
References
Abstract
This contribution reviews the mathematical ideas behind the most frequently used techniques for the processing of satellite-to-satellite tracking data. Its emphasis is on the model part rather than on all necessary technicalities in data preprocessing and numerical implementation. The main outcomes of these data-processing strategies, when applied to data of the satellite missions CHAMP and GRACE, are reviewed.
W. Keller () Geodätisches Institut, Universität Stuttgart, Stuttgart, Germany e-mail: [email protected] © Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_56
1 Introduction
The dedicated gravity field missions CHAMP and GRACE have attracted and are still attracting the attention of many researchers worldwide. An impressive number of algorithms and approaches have been developed to extract information about the gravitational field of the Earth from the original satellite-to-satellite tracking (SST) signal. This contribution aims at a mathematical description of the basic ideas behind the most frequently used and most developed approaches for the processing of SST data. It will not cover the important issue of data preprocessing, though this preprocessing is vital for the extraction of geophysically meaningful results from the SST data.
2 Scientific Relevance
Geophysical processes close to the Earth’s surface that are connected with mass transports, such as the hydrological cycle, imprint their signature on the time variability of the gravitational field of the Earth. Though gravity and gravity changes can be precisely measured at the Earth’s surface, these data have a very coarse spatial and temporal resolution. The way out of this sparse data coverage is to use artificial satellites as proof masses in the Earth’s gravitational field: the deviation of the orbits of these satellites from simple Keplerian orbits is due to the deviation of the Earth’s gravitational field from a rotationally symmetric field. The orbital heights of those satellites have to be large enough to guarantee that the atmospheric friction acting on the satellite does not exceed the anomalous gravitational acceleration. On the other hand, the intensity of the gravitational signal decreases with increasing distance from the mass center of the Earth, and this decay is faster the smaller the scales of the gravitational anomalies are. This means that a recovery of the gravitational field from orbit observations of artificial satellites smooths out the smaller details of the gravitational field. This smoothing cannot be counteracted by a lower orbit because the atmospheric friction would obscure the gravitational signal. It can only be counteracted by differential measurements, meaning that instead of the orbits of the satellites themselves, the orbit differences between two or more satellites are observed. This concept of relative orbit observation and its implications for the recovery of geophysical phenomena are explained in more detail in Rummel (2003). This satellite-to-satellite tracking (SST) principle can be put into mathematical terms in the following way: Assume that there are $n$ satellites which can “see” each other in a certain way. The orbits of these satellites will be denoted by $(x_i(t), \dot{x}_i(t))$, $i = 1, \ldots, n$.
The “things” that the satellites $2, \ldots, n$ “see” of satellite 1 can be modeled as a vectorial function $F$ of the orbits of all involved satellites:

$$ s(t) := F(x_1(t), \dot{x}_1(t), \ldots, x_n(t), \dot{x}_n(t)) + \epsilon(t). \tag{1} $$
Here $\epsilon$ comprises all unavoidable observation errors as well as the errors from imperfect data preprocessing. The signal $s$ is the primary SST signal, and all the desired information about the gravitational field and its variability in space and time has to be extracted from this signal. The gravitational field $V$ is hidden in the orbits of the satellites. Hence a more precise notation for the orbits of the involved satellites is $(x_i(t, V), \dot{x}_i(t, V))$, $i = 1, \ldots, n$, which changes the primary SST model (1) to

$$ s(t) := F(x_1(t, V), \dot{x}_1(t, V), \ldots, x_n(t, V), \dot{x}_n(t, V)) + \epsilon(t). \tag{2} $$
The recovery of the gravitational field from SST data can be modeled as the following minimization problem:

$$ V = \operatorname{argmin} \left\{ \int_0^T \left\| s(t) - F(x_1(t, V), \dot{x}_1(t, V), \ldots, x_n(t, V), \dot{x}_n(t, V)) \right\|^2 dt \;\middle|\; V \in \mathrm{Harm}(\Omega) \right\}. \tag{3} $$

The minimization is carried out over all functions $V$ which are harmonic in $\Omega$ and regular at infinity ($\mathrm{Harm}(\Omega)$). In most cases $\Omega$ will be the exterior of a sphere of radius $R$. Already at this stage two different modes of SST can be distinguished: the high–low SST (hl-SST) and the low–low SST (ll-SST). In the hl-SST mode the satellites $2, \ldots, n$ have such a high orbital altitude that the uncertainties in the gravitational field do not measurably influence their computed orbits. This means the orbits $(x_i(t, V_0), \dot{x}_i(t, V_0))$, $i = 2, \ldots, n$, can be computed using a known reference potential $V_0$ and no longer depend on the unknown gravitational potential $V$. The minimization problem simplifies to

$$ V = \operatorname{argmin} \left\{ \int_0^T \left\| s(t) - \hat{F}(x_1(t, V), \dot{x}_1(t, V)) \right\|^2 dt \;\middle|\; V \in \mathrm{Harm}(\Omega) \right\} \tag{4} $$

with

$$ \hat{F}(x_1(t, V), \dot{x}_1(t, V)) := F(x_1(t, V), \dot{x}_1(t, V), x_2(t, V_0), \dot{x}_2(t, V_0), \ldots, x_n(t, V_0), \dot{x}_n(t, V_0)). \tag{5} $$
In the ll-SST mode two satellites in the same low orbit are chasing each other and are observing their relative positions and velocities. The corresponding minimization problem is

$$ V = \operatorname{argmin} \left\{ \int_0^T \left\| s(t) - F(x_1(t, V), \dot{x}_1(t, V), x_2(t, V), \dot{x}_2(t, V)) \right\|^2 dt \;\middle|\; V \in \mathrm{Harm}(\Omega) \right\}. \tag{6} $$
Both the hl-SST (4) and the ll-SST (6) minimization problems are infinite-dimensional problems and as such not suitable for numerical implementation. In order to discretize these infinite-dimensional problems, a parametric model for the unknown gravitational potential has to be introduced:

$$ \tilde{V}(x) = \sum_{z \in C \subset \mathbb{Z}^r} c_z \Phi_z(x, \psi_z). \tag{7} $$

In this notation $\Phi_z \in \mathrm{Harm}(\Omega)$ are harmonic basis functions, which besides the location $x$ can also depend on additional parameters $\psi_z$. The weights $c_z$ in the linear combination (7) can be complex numbers, and the multi-index $z$ ranges over a certain subset $C$ of the $r$-dimensional integers. As a second discretization step, the SST signal $s(t)$ has to be sampled equidistantly:

$$ s_i := s(ih), \qquad h > 0. \tag{8} $$
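After that, the infinite-dimensional minimization problems (4) and (6) can be approximated by their finite-dimensional counterparts:

$$ \{(c_z, \psi_z) \mid z \in C\} = \operatorname{argmin} \sum_{i=1}^{N} \left\| s_i - \hat{F}\left( x_1(ih, \tilde{V}(c_z, \psi_z)), \dot{x}_1(ih, \tilde{V}(c_z, \psi_z)) \right) \right\|^2 \tag{9} $$

in the hl-SST mode and

$$ \{(c_z, \psi_z) \mid z \in C\} = \operatorname{argmin} \sum_{i=1}^{N} \left\| s_i - F\left( x_1(ih, \tilde{V}(c_z, \psi_z)), \dot{x}_1(ih, \tilde{V}(c_z, \psi_z)), x_2(ih, \tilde{V}(c_z, \psi_z)), \dot{x}_2(ih, \tilde{V}(c_z, \psi_z)) \right) \right\|^2 \tag{10} $$

in the ll-SST mode. Both (9) and (10) are finite-dimensional nonlinear least-squares problems and can be solved by certain variants of the Levenberg–Marquardt algorithm (Levenberg 1944; Marquardt 1963):

$$ (c_z, \psi_z)^{(i+1)} = (c_z, \psi_z)^{(i)} + \left( \left( \frac{\partial F}{\partial (c_z, \psi_z)} \right)^{\top} \frac{\partial F}{\partial (c_z, \psi_z)} + \alpha I \right)^{-1} \left( \frac{\partial F}{\partial (c_z, \psi_z)} \right)^{\top} \left( s - F\left( (c_z, \psi_z)^{(i)} \right) \right). \tag{11} $$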
The different algorithms for SST data processing differ in the way they compute the entries of the Jacobian $\partial F / \partial (c_z, \psi_z)$ or in how they combine the primary observations $s_i$ into secondary observations, which are then linearly related to the unknown parameters. These questions will be discussed in detail in Sect. 5.
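The structure of the Levenberg–Marquardt update (11) can be illustrated on a problem that is far smaller than any real gravity field recovery. In the Python sketch below, a two-parameter toy model $c\,e^{-kt}$ stands in for $F$ (this model is purely an assumption for illustration and has nothing to do with orbit dynamics), the signal is sampled equidistantly as in (8), and the damping parameter $\alpha$ is adapted with a simple accept/reject rule, which is one common heuristic and not necessarily the variant used in SST processing.

```python
import math


def model(params, t):
    """Toy stand-in for F: c * exp(-k * t)."""
    c, k = params
    return c * math.exp(-k * t)


def jacobian_row(params, t):
    """Partial derivatives of the toy model with respect to (c, k)."""
    c, k = params
    e = math.exp(-k * t)
    return e, -c * t * e


def cost(params, ts, ss):
    """Sum of squared residuals, the discrete analogue of (9) and (10)."""
    return sum((s - model(params, t)) ** 2 for t, s in zip(ts, ss))


def lm_step(params, ts, ss, alpha):
    """One update of the form (11): p + (J^T J + alpha I)^(-1) J^T (s - F(p))."""
    a11 = a12 = a22 = g1 = g2 = 0.0
    for t, s in zip(ts, ss):
        j1, j2 = jacobian_row(params, t)
        r = s - model(params, t)
        a11 += j1 * j1
        a12 += j1 * j2
        a22 += j2 * j2
        g1 += j1 * r
        g2 += j2 * r
    a11 += alpha
    a22 += alpha
    det = a11 * a22 - a12 * a12          # positive, since alpha > 0
    d1 = (a22 * g1 - a12 * g2) / det     # 2x2 solve by Cramer's rule
    d2 = (a11 * g2 - a12 * g1) / det
    return params[0] + d1, params[1] + d2


def fit(ts, ss, start, alpha=1e-3, iterations=100):
    """Iterate (11); shrink alpha after a successful step, enlarge it otherwise."""
    params = start
    for _ in range(iterations):
        trial = lm_step(params, ts, ss, alpha)
        if cost(trial, ts, ss) < cost(params, ts, ss):
            params, alpha = trial, alpha / 10.0
        else:
            alpha *= 10.0
    return params


if __name__ == "__main__":
    h, n_samples = 0.25, 20
    ts = [i * h for i in range(1, n_samples + 1)]   # t_i = i h as in (8)
    ss = [model((2.0, 0.5), t) for t in ts]         # noise-free synthetic signal
    print(fit(ts, ss, start=(1.5, 0.3)))
```

For small $\alpha$ the step approaches a Gauss–Newton step, while for large $\alpha$ it turns into a short step along the negative gradient of the cost; the accept/reject rule moves between the two regimes.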
3 Conventions
3.1 Reference Systems
In geodynamics a well-defined set of reference systems is in use. For the precise definition of these reference systems and the transformation methods between the individual reference systems, the IERS conventions (Petit and Luzum 2010) can be consulted. Since, by astronomical standards, the time span of SST observations is rather small, three simplified reference systems can be used:
1. A space-fixed system
2. An Earth-fixed system
3. An orbital system
The space-fixed system has its origin in the mass center of the Earth, and its $x_3$-axis points in the direction of the mean rotation axis of the Earth at the initial epoch $t_0$ of the SST observations. Its $x_1$-axis points to the intersection of the mean equatorial plane with the ecliptic at this epoch $t_0$, and the $x_2$-axis completes the former two axes to an orthogonal right-handed Cartesian system. Since the origin of the system is not in unaccelerated motion, this system is not a proper inertial system. The acceleration is due to the attracting forces of Sun, Moon, and planets. If these attracting forces are considered in the form of tidal forces in the data preprocessing, the space-fixed system becomes an inertial system. The differences between the simplified space-fixed system used here and the space-fixed system defined in the IERS conventions have to be taken into account by precession, nutation, and polar-motion corrections. These corrections can be made part of the data preprocessing and will not be discussed here. The Earth-fixed system has the same $x_3$-axis, but its $x_1$-axis lies in the meridian of Greenwich. The $x_2$-axis completes the former two axes to an orthogonal right-handed Cartesian system. The angle $\Theta$ between the $x_1$-axis of the space-fixed and the $x_1$-axis of the Earth-fixed system is called Greenwich sidereal time. Hence, the coordinates of a point change from the space-fixed to the Earth-fixed system according to

$$ x_{Ef} = R_3(\Theta)\, x_{sf}, \qquad R_3(\gamma) := \begin{pmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{12} $$
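The transformation (12) can be tried out numerically. The Python sketch below builds $R_3(\gamma)$ as a plain nested list, assuming the usual sign convention for a rotation of the coordinate axes (with $-\sin\gamma$ in the second row), and applies it to a point that lies on the Greenwich meridian in the equatorial plane. The function names and the value of the sidereal angle are hypothetical.

```python
import math


def r3(gamma):
    """Rotation of the coordinate axes about the x3-axis by the angle gamma."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]


def apply(matrix, vector):
    """Matrix-vector product for 3x3 nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def space_fixed_to_earth_fixed(x_sf, theta):
    """Coordinate change x_Ef = R3(theta) x_sf of Eq. (12)."""
    return apply(r3(theta), x_sf)


if __name__ == "__main__":
    theta = math.radians(30.0)   # hypothetical Greenwich sidereal time
    # A unit vector in the equatorial plane pointing along the Greenwich meridian,
    # expressed in space-fixed coordinates.
    x_sf = [math.cos(theta), math.sin(theta), 0.0]
    print(space_fixed_to_earth_fixed(x_sf, theta))
```

In Earth-fixed coordinates this point should lie on the $x_1$-axis, which gives a quick check of the sign convention chosen for $R_3$.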
The orbital system is related to a fictitious satellite with an exactly circular orbit in the gravitational field of a flattened Earth. Its $x_3$-axis is perpendicular to the orbital plane and its $x_1$-axis points to the fictitious satellite. The $x_2$-axis again completes these to a right-handed orthogonal Cartesian system. According to Kaula (2000), for such a satellite, the angle $\Omega$ between the $x_1$-axis of the space-fixed system and the nodal line, i.e., the intersection between equatorial and orbital plane, changes with time as

$$ \Omega = \Omega_0 - \frac{3n}{2} J_2 \left( \frac{R}{a} \right)^2 \cos i \; t, \qquad n = \sqrt{\frac{GM}{a^3}}, $$

with $GM$ being the product of the gravitational constant and the mass of the Earth; $J_2$ being the dynamic form factor (cf. Petit and Luzum 2010); $i$ being the inclination of the orbit, i.e., the angle between the equatorial and the orbital plane; and $a$ being the radius of the orbital circle (Fig. 1). The argument of latitude $u$, i.e., the angle between the nodal line and the position of the satellite, is given by
Fig. 1 Relations between reference systems
$$ u = u_0 + \left( n + \frac{3n}{4} J_2 \left( \frac{R}{a} \right)^2 \left( 3\cos^2 i - 1 \right) \right) t. \tag{13} $$
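The secular rates of the nodal angle $\Omega$ and of the argument of latitude $u$ are easy to evaluate for a given circular orbit. The Python sketch below does so for a hypothetical satellite about 500 km above the reference radius. The numerical values of $GM$, $R$, and $J_2$ are approximate, commonly quoted figures inserted here as assumptions; they are not taken from this chapter.

```python
import math

# Approximate Earth constants (assumed values for illustration).
GM = 3.986004418e14   # m^3/s^2
R = 6378136.3         # m
J2 = 1.0826e-3        # dimensionless


def mean_motion(a):
    """n = sqrt(GM / a^3) for a circular orbit of radius a."""
    return math.sqrt(GM / a ** 3)


def node_rate(a, inc):
    """Time derivative of the nodal angle Omega for a circular orbit."""
    n = mean_motion(a)
    return -1.5 * n * J2 * (R / a) ** 2 * math.cos(inc)


def arg_lat_rate(a, inc):
    """Time derivative of the argument of latitude u as in Eq. (13)."""
    n = mean_motion(a)
    return n + 0.75 * n * J2 * (R / a) ** 2 * (3.0 * math.cos(inc) ** 2 - 1.0)


if __name__ == "__main__":
    a = R + 500e3                 # hypothetical orbit radius
    inc = math.radians(89.0)      # hypothetical near-polar inclination
    print("orbital period [s]:      ", 2.0 * math.pi / mean_motion(a))
    print("node rate [deg/day]:     ", math.degrees(node_rate(a, inc)) * 86400.0)
    print("arg. of lat. rate [rad/s]:", arg_lat_rate(a, inc))
```

For an exactly polar orbit the factor $\cos i$ vanishes and the nodal line stays fixed in the space-fixed system, while for prograde inclinations below 90° the node regresses.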
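This means the transformation from the space-fixed to the orbital system is given by

$$ x_{orb} = R_3(u)\, R_1(i)\, R_3(\Omega)\, x_{sf}, \qquad R_1(\alpha) := \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{pmatrix}. \tag{14} $$

3.2 Basis Functions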
There are two types of basis functions for the representation of the gravitational potential, which are used in the SST community: spherical harmonics and radial basis functions. Nevertheless, they are used in different normalizations. Here, the standard used in Wolfram MathWorld (Weisstein) will be applied. The surface spherical harmonics are defined as
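\[
Y_{l,m}(\vartheta,\lambda) =
\begin{cases}
\sqrt{\dfrac{(2l+1)}{4\pi}\,\dfrac{(l-m)!}{(l+m)!}}\; P_l^m(\cos\vartheta)\, e^{\imath m\lambda}, & m \ge 0, \\[2ex]
(-1)^m\, \overline{Y}_{l,-m}(\vartheta,\lambda), & m < 0,
\end{cases} \tag{15}
\]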
where $\overline{Y}_{l,m}$ is the complex conjugate of $Y_{l,m}$, the $P_l^m$ are the Legendre functions, and $\vartheta, \lambda$ are spherical coordinates related to the Earth-fixed system. Assume now that the original coordinate system is rotated by the three Eulerian angles $\alpha$, $\beta$, and $\gamma$:

\[
R(\alpha,\beta,\gamma) := R_3(\gamma)\,R_2(\beta)\,R_3(\alpha), \qquad
R_2(\beta) = \begin{pmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{pmatrix},
\]

and that the spherical coordinates in the rotated system are denoted by $\vartheta', \lambda'$. Then a spherical harmonic in the non-rotated coordinates can be expressed as a linear combination of spherical harmonics of the same degree in the rotated coordinates,

\[
Y_{l,m}(\vartheta,\lambda) = \sum_{k=-l}^{l} D^{l}_{m,k}(\alpha,\beta,\gamma)\, Y_{l,k}(\vartheta',\lambda'), \tag{16}
\]

with

\[
D^{l}_{m,k}(\alpha,\beta,\gamma) = e^{\imath m\alpha}\, d^{l}_{m,k}(\beta)\, e^{\imath k\gamma} \tag{17}
\]
being the so-called Wigner functions according to Kostelec and Rockmore (2008). The representation of the gravitational potential as a spherical harmonics expansion in the Earth-fixed system is given by

\[
V(r,\vartheta,\lambda) = \frac{GM}{R} \sum_{l=0}^{\infty} \left(\frac{R}{r}\right)^{l+1} \sum_{m=-l}^{l} K_{l,m}\, Y_{l,m}(\vartheta,\lambda), \qquad K_{l,m} \in \mathbb{C}. \tag{18}
\]
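This means, in the case of spherical harmonics in an Earth-fixed system as basis functions, the general notation specializes to

\[
\Phi_{l,m}(\mathbf{x}) = \frac{GM}{R} \left(\frac{R}{r}\right)^{l+1} Y_{l,m}(\vartheta,\lambda), \qquad
(l,m) \in \mathcal{C} = \left\{ (i,j) \in \mathbb{Z}^2 \;\middle|\; i \ge 0,\ |j| \le i \right\}. \tag{19}
\]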
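Apart from the index vector $z = (l,m)$ and the position vector $\mathbf{x}$, the basis functions do not depend on additional parameters $\boldsymbol{\zeta}_z$. Since the transition from the Earth-fixed system to the orbital system is accomplished by the rotation

\[
R\!\left(\Omega - \Theta - \frac{\pi}{2},\; i,\; u + \frac{\pi}{2}\right), \tag{20}
\]

the gravitational potential at the position of the fictitious satellite is given by

\[
V(r,u,\Lambda) = \frac{GM}{R} \sum_{l=0}^{\infty} \left(\frac{R}{r}\right)^{l+1} \sum_{m=-l}^{l} \sum_{k=-l}^{l} K_{l,m}\, \imath^{\,k-m}\, d^{l}_{m,k}(i)\, \overline{P}_l^{\,k}(0)\, e^{\imath(ku + m\Lambda)}, \tag{21}
\]

with $\Lambda = \Omega - \Theta$ and

\[
\overline{P}_l^{\,m}(x) =
\begin{cases}
\sqrt{\dfrac{(2l+1)}{4\pi}\,\dfrac{(l-m)!}{(l+m)!}}\; P_l^m(x), & m \ge 0, \\[2ex]
(-1)^m\, \overline{P}_l^{\,-m}(x), & m < 0.
\end{cases} \tag{22}
\]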
Hence, in the orbital system, the general notation of a basis function is specialized to

\[
\Phi_{m,k}(u,\Lambda) = \left[ \frac{GM}{R} \sum_{l=\max\{|k|,|m|\}}^{\infty} \left(\frac{R}{r}\right)^{l+1} \imath^{\,k-m}\, d^{l}_{m,k}(i)\, \overline{P}_l^{\,k}(0) \right] e^{\imath(ku + m\Lambda)}. \tag{23}
\]

This means in the orbital system the basis functions modify to imaginary exponentials defined on the torus $[0,2\pi) \times [0,2\pi)$. The domain of definition gave this representation its name: torus approach (Sneeuw 2000). A satellite position with respect to the Earth is mapped onto the position $\Lambda, u$ on the torus. Since the Earth has the topological genus 0 and the torus has the topological genus 1, there is no one-to-one mapping between the satellite positions in an Earth-fixed system and the positions on a torus. Two different points on the torus can correspond to the same point related to the Earth: once the point is reached on an ascending and once on a descending orbital arc.

A second kind of basis functions is the so-called radial basis functions. A radial basis function (RBF) on the sphere is a function which depends only upon the distance of its argument from the north pole of the sphere, i.e.,

\[
\Phi(\mathbf{x}) = \tilde{\Phi}\!\left(\mathbf{e}_3^{\top}\boldsymbol{\xi}\right), \qquad
\boldsymbol{\xi} = \frac{\mathbf{x}}{\|\mathbf{x}\|}, \qquad
\mathbf{e}_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \tag{24}
\]
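Radial basis functions are used to approximate functions defined on the sphere as linear combinations of rotated versions of $\Phi$. An RBF, rotated to the position $\boldsymbol{\eta}$, is a function which only depends on the distance of its argument from that position:

\[
\Phi(\mathbf{x}, \boldsymbol{\eta}) = \tilde{\Phi}\!\left(\boldsymbol{\eta}^{\top}\boldsymbol{\xi}\right). \tag{25}
\]

Since $-1 \le \boldsymbol{\eta}^{\top}\boldsymbol{\xi} \le 1$ holds, any square integrable rotated RBF must have a series expansion in Legendre polynomials:

\[
\Phi(\mathbf{x}, \boldsymbol{\eta}) = \sum_{n=0}^{\infty} \sigma_n\, P_n\!\left(\boldsymbol{\eta}^{\top}\boldsymbol{\xi}\right). \tag{26}
\]

This means in an Earth-fixed system and for rotated RBFs, the general notation of a basis function specializes to

\[
\Phi_l(\mathbf{x}, \boldsymbol{\zeta}_l) = \sum_{n=0}^{\infty} \sigma_n\, P_n\!\left(\boldsymbol{\eta}_l^{\top}\boldsymbol{\xi}\right), \tag{27}
\]

with

\[
\boldsymbol{\zeta}_l = \left\{ \boldsymbol{\eta}_l, \{\sigma_n\} \right\}. \tag{28}
\]

So far, RBFs have been defined on the surface of the unit sphere only. Their harmonic continuation to the exterior of the mean Earth sphere is given by

\[
\Phi_l(\mathbf{x}, \boldsymbol{\zeta}_l) = \sum_{n=0}^{\infty} \sigma_n \left(\frac{R}{r}\right)^{n+1} P_n\!\left(\boldsymbol{\eta}_l^{\top}\boldsymbol{\xi}\right). \tag{29}
\]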
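A truncated version of the Legendre series (29) is straightforward to evaluate. The sketch below is only an illustration under stated assumptions: the series is cut off at the length of the coefficient list `sigma`, the centre `eta` is assumed to be a unit vector, and the function names are hypothetical.

```python
import math


def legendre(n_max, t):
    """Return [P_0(t), ..., P_{n_max}(t)] using the three-term recurrence."""
    p = [1.0, t]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * t * p[n] - n * p[n - 1]) / (n + 1))
    return p[: n_max + 1]


def rbf(x, eta, sigma, R=1.0):
    """Truncated series sum_n sigma_n (R/r)^(n+1) P_n(eta . xi), cf. (29).

    x     : Cartesian evaluation point
    eta   : unit vector of the RBF centre (assumed normalized)
    sigma : non-empty list of Legendre coefficients sigma_0 .. sigma_N
    """
    r = math.sqrt(sum(c * c for c in x))
    t = sum(c * e for c, e in zip(x, eta)) / r
    t = max(-1.0, min(1.0, t))  # guard against rounding outside [-1, 1]
    p = legendre(len(sigma) - 1, t)
    return sum(s * (R / r) ** (n + 1) * p[n] for n, s in enumerate(sigma))


print(rbf((0.0, 0.0, 2.0), (0.0, 0.0, 1.0), [1.0, 1.0, 1.0]))
```

The factor $(R/r)^{n+1}$ damps the higher degrees with increasing altitude, which is why the continued function becomes smoother away from the sphere.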
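Since the Legendre polynomials $P_n$ are sums of products of surface spherical harmonics

\[
\frac{(2n+1)}{4\pi}\, P_n\!\left(\boldsymbol{\eta}^{\top}\boldsymbol{\xi}\right) = \sum_{m=-n}^{n} \overline{Y}_{n,m}(\boldsymbol{\xi})\, Y_{n,m}(\boldsymbol{\eta}), \tag{30}
\]

the rotated RBF $\Phi(\mathbf{x}, \boldsymbol{\eta}_l)$ can be expressed as

\[
\Phi(\mathbf{x}, \boldsymbol{\eta}_l) = \sum_{n=0}^{\infty} \frac{4\pi\,\sigma_n}{2n+1} \sum_{m=-n}^{n} \overline{Y}_{n,m}(\boldsymbol{\xi})\, Y_{n,m}(\boldsymbol{\eta}_l). \tag{31}
\]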
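In this representation the position parameter $\boldsymbol{\eta}_l$ and the argument $\boldsymbol{\xi}$ are separated. Since the position parameter $\boldsymbol{\eta}_l$ is the following rotation of the vector $\mathbf{e}_3$

\[
\boldsymbol{\eta}_l = R_3(-\lambda_l)\,R_2(-\vartheta_l)\,\mathbf{e}_3 = \begin{pmatrix} \sin\vartheta_l \cos\lambda_l \\ \sin\vartheta_l \sin\lambda_l \\ \cos\vartheta_l \end{pmatrix}, \tag{32}
\]

the representation (31) is equivalent to

\[
\Phi(\mathbf{x}, \boldsymbol{\eta}_l) = \sum_{n=0}^{\infty} \frac{4\pi\,\sigma_n}{2n+1} \sum_{m=-n}^{n} \left( \sum_{k=-n}^{n} d^{n}_{m,k}(\vartheta_l)\, e^{\imath k\lambda_l}\, Y_{n,k}(\mathbf{e}_3) \right) \overline{Y}_{n,m}(\boldsymbol{\xi}). \tag{33}
\]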
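A comparison of (31) with (18) allows the conversion of an RBF representation into the corresponding surface spherical harmonics representation. So far, both RBF representations refer to an Earth-fixed system. The representation of an RBF in the orbital system is

\[
\begin{aligned}
\Phi(u,\Lambda,\boldsymbol{\eta}_l)
&= \sum_{n=0}^{\infty} \frac{4\pi\,\sigma_n}{2n+1} \left(\frac{R}{r}\right)^{n+1} \sum_{m=-n}^{n} \left( \sum_{k=-n}^{n} \imath^{\,k-m}\, d^{n}_{m,k}(i)\, e^{\imath(ku+m\Lambda)}\, \overline{P}_n^{\,k}(0) \right) Y_{n,m}(\boldsymbol{\eta}_l) \\
&= \sum_{m=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \left[ \sum_{n=\max\{|k|,|m|\}}^{\infty} \frac{4\pi\,\sigma_n}{2n+1} \left(\frac{R}{r}\right)^{n+1} \imath^{\,k-m}\, d^{n}_{m,k}(i)\, \overline{P}_n^{\,k}(0)\, Y_{n,m}(\boldsymbol{\eta}_l) \right] e^{\imath(ku+m\Lambda)}.
\end{aligned} \tag{34}
\]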
This means, as in the case of spherical harmonics, the RBFs in the orbital system are also imaginary exponentials on the torus. A basis function on the torus is of particular importance if the underlying orbit is a so-called “repeat orbit.” For a repeat orbit,

\[
\frac{\dot{u}}{\dot{\Lambda}} = \frac{\beta}{\alpha} \tag{35}
\]

holds, with $\alpha, \beta$ relatively prime. Then

\[
\psi_{m,k} := ku + m\Lambda = \underbrace{\frac{\dot{u}}{\beta}\,(k\beta + m\alpha)}_{\dot{\psi}_{m,k}}\; t = \dot{\psi}_{m,k}\, t \tag{36}
\]

holds. Due to this relation between the time $t$ and the location $u, \Lambda$ on the torus, a basis function evaluated along a repeat orbit changes into a periodic function of time:

\[
\Phi_{m,k}(t) = \left[ \frac{GM}{R} \sum_{l=\max\{|k|,|m|\}}^{\infty} \left(\frac{R}{r}\right)^{l+1} \imath^{\,k-m}\, d^{l}_{m,k}(i)\, \overline{P}_l^{\,k}(0) \right] e^{\imath \dot{\psi}_{m,k} t}, \tag{37}
\]

and

\[
\Phi(t, \boldsymbol{\eta}_l) = \sum_{m=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \left[ \sum_{n=\max\{|k|,|m|\}}^{\infty} \frac{4\pi\,\sigma_n}{2n+1} \left(\frac{R}{r}\right)^{n+1} \imath^{\,k-m}\, d^{n}_{m,k}(i)\, \overline{P}_n^{\,k}(0)\, Y_{n,m}(\boldsymbol{\eta}_l) \right] e^{\imath \dot{\psi}_{m,k} t}, \tag{38}
\]

respectively. The period of these functions is given by $T = \alpha$ nodal days.
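The frequency relation (36) can be checked numerically. The following Python sketch is an illustration only: the function names, the truncation `max_index`, and the sample numbers are assumptions, and the initial phases of $u$ and $\Lambda$ are taken as zero.

```python
import math


def repeat_orbit_frequencies(u_dot, alpha, beta, max_index):
    """Frequencies psi_dot[m, k] = (u_dot / beta) * (k*beta + m*alpha), cf. (36)."""
    if math.gcd(alpha, beta) != 1:
        raise ValueError("alpha and beta must be relatively prime")
    base = u_dot / beta
    idx = range(-max_index, max_index + 1)
    return {(m, k): base * (k * beta + m * alpha) for m in idx for k in idx}


def repeat_period(u_dot, beta):
    """Duration of beta revolutions, which equals alpha nodal days."""
    return beta * 2.0 * math.pi / abs(u_dot)


# Hypothetical repeat orbit: 31 revolutions in 2 nodal days
freqs = repeat_orbit_frequencies(1.1e-3, alpha=2, beta=31, max_index=2)
T = repeat_period(1.1e-3, beta=31)
print(freqs[(1, 1)] * T / (2.0 * math.pi))
```

Every frequency completes an integer number of cycles, namely $k\beta + m\alpha$, within one repeat period, which is why the basis functions become periodic in time.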
4 Celestial Mechanics

In an inertial system, the equation of motion of a unit point mass is governed by the Newtonian equations

\[
\ddot{\mathbf{x}}(t) = \nabla V(\mathbf{x}(t), t) + \mathbf{F}(\mathbf{x}(t), \dot{\mathbf{x}}(t), t), \tag{39}
\]

where $\mathbf{x}(t)$ is the Cartesian position of the unit point mass, representing the satellite, $V$ is the potential of the gravitational force acting on it, and $\mathbf{F}$ is the nonconservative force exerted, for example, by atmospheric friction, on the satellite. In a rotating system inertial forces have to be added and the equation of motion changes from (39) to

\[
\ddot{\mathbf{x}}'(t) = \nabla V'\!\left(\mathbf{x}'(t), t\right) + \mathbf{F}'\!\left(\mathbf{x}'(t), \dot{\mathbf{x}}'(t), t\right)
\underbrace{-\, \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{x}'(t)\right)}_{\text{centrifugal force}}
\underbrace{-\, 2\boldsymbol{\omega} \times \dot{\mathbf{x}}'(t)}_{\text{Coriolis force}}. \tag{40}
\]
Here $\boldsymbol{\omega}$ is the rotation vector of the system and all the quantities denoted by a prime refer to that rotating system. Since we can assume $\dot{\boldsymbol{\omega}} = 0$, the inclusion of the Eulerian force is not necessary. Here, we distinguish two rotating systems: the Earth-fixed and the orbital system. For the Earth-fixed system, $\boldsymbol{\omega} = 7.292115 \times 10^{-5}\,\mathrm{rad/s} \cdot \mathbf{e}_3$ holds, and $V'$ no longer depends explicitly on $t$. For the orbital system

\[
\boldsymbol{\omega} = \sqrt{\frac{GM}{a^3}} \begin{pmatrix} \sin i \sin\Omega \\ -\sin i \cos\Omega \\ \cos i \end{pmatrix}
\]
holds, but in this case the gravitational potential $V'$ still explicitly depends on $t$. The equations of motion can be solved as an initial value problem or as a boundary value problem. In the case of an initial value problem, position $\mathbf{x}'(0)$ and velocity $\dot{\mathbf{x}}'(0)$ at the beginning of the orbital arc have to be known, while in the case of the boundary value problem, the positions $\mathbf{x}'(0)$ at the beginning and $\mathbf{x}'(T)$ at the end of the orbital arc have to be given. In the case of a boundary value problem, it is convenient to transform the equations of motion into an equivalent integral equation (cf. Schneider 1968):

\[
\tilde{\mathbf{x}}(\tau) = \tilde{\mathbf{x}}(0)(1-\tau) + \tilde{\mathbf{x}}(1)\,\tau - T^2 \int_0^1 K(\tau,\tau')\, \mathbf{f}(\tau')\, d\tau', \tag{41}
\]

where

\[
\tau := \frac{t}{T}, \qquad \tilde{\mathbf{x}}(\tau) := \mathbf{x}'(\tau T),
\]

and the integral kernel $K$ is given by

\[
K(\tau,\tau') :=
\begin{cases}
\tau'(1-\tau), & \tau' \le \tau, \\
\tau(1-\tau'), & \tau' > \tau.
\end{cases} \tag{42}
\]
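The function $\mathbf{f}$ comprises all forces acting on the satellite:

\[
\mathbf{f}(\tau') := \nabla V'\!\left(\mathbf{x}'(\tau' T), \tau' T\right) + \mathbf{F}'\!\left(\mathbf{x}'(\tau' T), \dot{\mathbf{x}}'(\tau' T), \tau' T\right) - \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{x}'(\tau' T)\right) - 2\boldsymbol{\omega} \times \dot{\mathbf{x}}'(\tau' T). \tag{43}
\]

5 Observation Models and Data Processing Strategies

5.1 Satellite-to-Satellite Tracking in the High–Low Mode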
High–low mode is the name of the scenario in which a low-flying satellite is tracked by several high-flying satellites. Since the orbits of the high-flying satellites can be computed sufficiently precisely from existing gravity field models, their orbits do not contribute to an improvement of the knowledge about the gravitational field. Hence, the positions of the high-flying satellites are used as known reference positions and the position of the low-flying satellite is determined in reference to these known positions. In general GPS satellites are used as high-flying satellites and the low-flying satellite tracks its position and velocity in an Earth-fixed system, in relation to the known positions and velocities of the GPS satellites, by an onboard GPS receiver. If this is put in relation to the general SST model (4) and (6), we obtain

\[
\mathbf{s}(t) = \hat{F}\!\left(\mathbf{x}'(V',t), \dot{\mathbf{x}}'(V',t)\right) + \boldsymbol{\varepsilon}(t) := \begin{pmatrix} \mathbf{x}'(V',t) \\ \dot{\mathbf{x}}'(V',t) \end{pmatrix} + \boldsymbol{\varepsilon}(t) \tag{44}
\]

and

\[
V' = \operatorname*{argmin} \int_0^T \left\| \mathbf{s}(t) - \hat{F}\!\left(\mathbf{x}'(V',t), \dot{\mathbf{x}}'(V',t)\right) \right\|^2 dt. \tag{45}
\]

If this problem is discretized according to (7) and (8), it results in a nonlinear least-squares problem

\[
\left\{ (c_z, \boldsymbol{\zeta}_z) \mid z \in \mathcal{C} \right\} = \operatorname*{argmin} \sum_{i=1}^{N} \left\| \mathbf{s}_i - \hat{F}\!\left(\mathbf{x}'(V',t_i), \dot{\mathbf{x}}'(V',t_i)\right) \right\|^2.
\]

In order to solve this least-squares problem, the entries of the Jacobian

\[
\frac{\partial \hat{F}}{\partial (c_z, \boldsymbol{\zeta}_z)} = \frac{\partial \hat{F}}{\partial \mathbf{x}'}\, \frac{\partial \mathbf{x}'\!\left(\tilde{V}(c_z, \boldsymbol{\zeta}_z)\right)}{\partial (c_z, \boldsymbol{\zeta}_z)} + \frac{\partial \hat{F}}{\partial \dot{\mathbf{x}}'}\, \frac{\partial \dot{\mathbf{x}}'\!\left(\tilde{V}(c_z, \boldsymbol{\zeta}_z)\right)}{\partial (c_z, \boldsymbol{\zeta}_z)} \tag{46}
\]

have to be known. As a standard approach, they are computed by solving the so-called variational equations (Ballani 1988). Since this technique is numerically costly, some less costly techniques have been developed, which convert the position and velocity information into synthetic observations, which then are directly related to the unknown potential. Those techniques are named:

• Acceleration approach
• Energy-balance approach
• Torus approach

A technique which is strongly related to the variational equation approach is the so-called integral equation approach.
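Energy-Balance Approach

The energy-balance approach is one of the oldest ideas in dynamical satellite geodesy. It was first discussed in O’Keefe (1957), Reigber (1969), and Bjerhammar (1976). The method acquired practical importance only with the availability of temporally and spatially dense data, as, for instance, delivered by the satellite mission CHAMP. There are numerous publications about the application of the energy-balance approach to CHAMP data. Without a ranking the following contributions will be mentioned: Visser et al. (2003), Gerlach et al. (2003), and Badura et al. (2006). In an inertial frame the Lagrangian of the low-flying satellite is given as

\[
L = T - V = \frac{1}{2}\, \dot{\mathbf{x}}^{\top} \dot{\mathbf{x}} - V. \tag{47}
\]

In the Earth-fixed system, the Lagrangian changes to

\[
L = \frac{1}{2}\, \dot{\mathbf{x}}'^{\top} \dot{\mathbf{x}}' + \dot{\mathbf{x}}'^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'\right) + \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'\right) - V. \tag{48}
\]

A change from the Lagrangian to the Hamiltonian

\[
H = \mathbf{p}^{\top} \dot{\mathbf{x}}' - L, \qquad \mathbf{p} = \frac{\partial L}{\partial \dot{\mathbf{x}}'} \tag{49}
\]

yields

\[
H = \frac{1}{2}\, \dot{\mathbf{x}}'^{\top} \dot{\mathbf{x}}' - \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'\right) + V. \tag{50}
\]

Since the Hamiltonian is constant and $\mathbf{x}', \dot{\mathbf{x}}'$ are known from orbit integration, this leads to an observation equation for the unknown parameters $\{c_z, \boldsymbol{\zeta}_z\}$ in the basis function representation $\tilde{V}$ of $V$. So far all nonconservative forces acting on the satellite have been ignored. Taking them into account adds another term to the energy-balance equation:

\[
H = \frac{1}{2}\, \dot{\mathbf{x}}'^{\top} \dot{\mathbf{x}}' - \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'\right) + V + \int_0^t \mathbf{f}^{\top} \dot{\mathbf{x}}'\, dt'. \tag{51}
\]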
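This last term is the line integral of the nonconservative force $\mathbf{f}$ along the orbital path. If the nonconservative force is measured or modeled, this line integral can be evaluated for each observation epoch $t = ih$. Inserting the basis function representation $\tilde{V}$ and the known positions $\mathbf{x}'_i := \mathbf{x}'(ih)$ and velocities $\dot{\mathbf{x}}'_i := \dot{\mathbf{x}}'(ih)$ yields the following observation equation:

\[
H - \sum_{z \in \mathcal{C} \subset \mathbb{Z}^r} c_z\, \Phi_z\!\left(\mathbf{x}'_i, \boldsymbol{\zeta}_z\right) = \frac{1}{2} \left\| \dot{\mathbf{x}}'_i \right\|^2 - \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'_i\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'_i\right) + \int_0^{ih} \mathbf{f}^{\top} \dot{\mathbf{x}}'\, dt'. \tag{52}
\]

Hence the observations $s_i$ in (52) are the sum of kinetic energy, potential energy, and dissipative work

\[
s_i = \frac{1}{2} \left\| \dot{\mathbf{x}}'_i \right\|^2 - \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'_i\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'_i\right) + \int_0^{ih} \mathbf{f}^{\top} \dot{\mathbf{x}}'\, dt' \tag{53}
\]

and the least-squares problem

\[
\sum_{i=0}^{N} \left( s_i - H + \sum_{z \in \mathcal{C} \subset \mathbb{Z}^r} c_z\, \Phi_z\!\left(\mathbf{x}'_i, \boldsymbol{\zeta}_z\right) \right)^{2} \to \min \tag{54}
\]

is partly linear and partly nonlinear in the unknown parameters $H, c_z, \boldsymbol{\zeta}_z$. In order to refer the least-squares problem (54) to the generic setting (9), the following definitions have to be made:

\[
\hat{F}\!\left(\mathbf{x}'(ih, \tilde{V}), \dot{\mathbf{x}}'(ih, \tilde{V})\right) = H - \sum_{z \in \mathcal{C} \subset \mathbb{Z}^r} c_z\, \Phi_z\!\left(\mathbf{x}', \boldsymbol{\zeta}_z\right).
\]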
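The pseudo-observations (53) can be formed directly from the orbit samples. The following Python sketch is a simplified illustration: the trapezoidal rule for the work integral, the helper names, and the default rotation vector are assumptions of this example, and the sign of the work term follows (53) as printed.

```python
OMEGA_EARTH = 7.292115e-5  # rad/s, rotation rate quoted in the text


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def work_integral(forces, velocities, h):
    """Trapezoidal approximation of the integral of f^T x_dot' dt' (step h)."""
    terms = [dot(f, v) for f, v in zip(forces, velocities)]
    return h * (sum(terms) - 0.5 * (terms[0] + terms[-1]))


def energy_observation(x, v, work=0.0, omega=(0.0, 0.0, OMEGA_EARTH)):
    """Pseudo-observation s_i of (53) in the Earth-fixed system."""
    wx = cross(omega, x)
    return 0.5 * dot(v, v) - 0.5 * dot(wx, wx) + work


# Hypothetical sample: position (m) and velocity (m/s) of a low orbiter
print(energy_observation((6.8e6, 0.0, 0.0), (0.0, 100.0, 7600.0)))
```

In practice the accumulated work would be updated epoch by epoch from accelerometer data, so that each `s_i` only requires the current position, velocity, and the running integral.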
For the solution of this least-squares problem, the entries of the Jacobian 8 ˆ ˆ
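\[
X_l = \begin{pmatrix}
\dfrac{\left(\mathbf{x}'_l(t_1) - \mathbf{x}'(t_1)\right)^{\top}}{\left\| \mathbf{x}'_l(t_1) - \mathbf{x}'(t_1) \right\|}\, \boldsymbol{\xi}_1(t_1) & \dots & \dfrac{\left(\mathbf{x}'_l(t_1) - \mathbf{x}'(t_1)\right)^{\top}}{\left\| \mathbf{x}'_l(t_1) - \mathbf{x}'(t_1) \right\|}\, \boldsymbol{\xi}_N(t_1) \\
\vdots & & \vdots \\
\dfrac{\left(\mathbf{x}'_l(t_n) - \mathbf{x}'(t_n)\right)^{\top}}{\left\| \mathbf{x}'_l(t_n) - \mathbf{x}'(t_n) \right\|}\, \boldsymbol{\xi}_1(t_n) & \dots & \dfrac{\left(\mathbf{x}'_l(t_n) - \mathbf{x}'(t_n)\right)^{\top}}{\left\| \mathbf{x}'_l(t_n) - \mathbf{x}'(t_n) \right\|}\, \boldsymbol{\xi}_N(t_n)
\end{pmatrix}.
\]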
4. Collect all GPS pseudo-range residuals
\[
\mathbf{s} = \left( s_1(t_1), \dots, s_1(t_n), s_2(t_1), \dots, s_2(t_n), \dots, s_4(t_n) \right)^{\top},
\]
where the residuals are defined as the difference between observed pseudo-ranges and pseudo-ranges computed with the initial guess $\mathbf{p}_0$.
5. Assemble the design matrix
\[
X = \begin{pmatrix} X_1 \\ \vdots \\ X_4 \end{pmatrix}.
\]
6. Compute parameter corrections
\[
\Delta\mathbf{p} = \left( X^{\top} X \right)^{-1} X^{\top} \mathbf{s}.
\]
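The last step of the list above is an ordinary normal-equation solve. A minimal pure-Python sketch is given below, assuming a small dense design matrix stored as a list of rows; the function names are hypothetical, and for realistic problem sizes a dedicated linear algebra library would be used instead.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def parameter_corrections(X, s):
    """Return (X^T X)^{-1} X^T s for a design matrix X and residuals s."""
    cols = len(X[0])
    normal = [
        [sum(row[i] * row[j] for row in X) for j in range(cols)]
        for i in range(cols)
    ]
    rhs = [sum(row[i] * si for row, si in zip(X, s)) for i in range(cols)]
    return solve(normal, rhs)


# Tiny consistent example: three observations, two parameters
print(parameter_corrections([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0]))
```

Note that the right-hand side uses $X^{\top}\mathbf{s}$, which is what makes the dimensions of the normal equations consistent.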
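Colombo’s Modification of Variational Equation Approach

Since for each unknown parameter $p_k$ one differential equation has to be solved, the total numerical effort is considerable. Therefore, it is desirable to find an at least approximate closed-form solution of (77). Colombo (1984) solved this problem by treating it in the orbital system. The transformation from the Earth-fixed to the orbital system is accomplished by the rotation (20) and results in the following rotation vector with respect to the orbital system:

\[
\boldsymbol{\omega} = \begin{pmatrix} 0 \\ 0 \\ n \end{pmatrix}, \qquad n = \sqrt{\frac{GM}{a^3}}.
\]

We get

\[
-\boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \boldsymbol{\xi}_k(t)\right) - 2\boldsymbol{\omega} \times \dot{\boldsymbol{\xi}}_k(t) = \begin{pmatrix} n^2 \xi_{k,1} + 2n\,\dot{\xi}_{k,2} \\ n^2 \xi_{k,2} - 2n\,\dot{\xi}_{k,1} \\ 0 \end{pmatrix}.
\]

If now the gravitational potential $V'$ is split into its spherical part $U = \frac{GM}{r}$ and the disturbing potential $T'$, we first obtain

\[
\nabla^2 U\, \boldsymbol{\xi}_k = n^2 \begin{pmatrix} 2\,\xi_{k,1} \\ -\xi_{k,2} \\ -\xi_{k,3} \end{pmatrix}
\]

and (77) simplifies to

\[
\begin{pmatrix} \ddot{\xi}_{k,1} - 3n^2 \xi_{k,1} - 2n\,\dot{\xi}_{k,2} \\ \ddot{\xi}_{k,2} + 2n\,\dot{\xi}_{k,1} \\ \ddot{\xi}_{k,3} + n^2 \xi_{k,3} \end{pmatrix} = \nabla^2 T'(\mathbf{x}')\, \boldsymbol{\xi}_k(t) + \frac{\partial \nabla T'}{\partial p_k}. \tag{78}
\]

Since both $\boldsymbol{\xi}_k$ and $\nabla^2 T$ are small, their product can be neglected, and the final form of the variational equations in the orbital system is obtained:

\[
\begin{pmatrix} \ddot{\xi}_{k,1} - 3n^2 \xi_{k,1} - 2n\,\dot{\xi}_{k,2} \\ \ddot{\xi}_{k,2} + 2n\,\dot{\xi}_{k,1} \\ \ddot{\xi}_{k,3} + n^2 \xi_{k,3} \end{pmatrix} = \frac{\partial \nabla T'}{\partial p_k}. \tag{79}
\]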
This is an inhomogeneous ordinary differential equation with constant coefficients, which can be solved in closed form, provided the inhomogeneity is sufficiently “simple.” To ensure this, the orbit of the satellite is approximated by a so-called repeat orbit. If a basis function representation of $T$ is chosen according to (37), its evaluation along the repeat orbit results in a periodic disturbing potential:

\[
T = \sum_{m,k} c_{m,k}\, \Phi_{m,k}(t) = \sum_{m,k} c_{m,k}\, A_{m,k}(a,i)\, e^{\imath \dot{\psi}_{m,k} t}. \tag{80}
\]
Consequently, the gradient of $T$ along the repeat orbit is also a periodic vector function:

\[
\nabla T = \begin{pmatrix} \dfrac{\partial}{\partial a} \\[1.5ex] \dfrac{1}{a} \dfrac{\partial}{\partial u} \\[1.5ex] \dfrac{1}{a \cos i} \dfrac{\partial}{\partial i} \end{pmatrix} \sum_{m,k} c_{m,k}\, A_{m,k}(a,i)\, e^{\imath \dot{\psi}_{m,k} t}
= \sum_{m,k} c_{m,k} \begin{pmatrix} \dfrac{\partial A_{m,k}}{\partial a} \\[1.5ex] \dfrac{\imath k}{a}\, A_{m,k} \\[1.5ex] \dfrac{1}{a \cos i} \dfrac{\partial A_{m,k}}{\partial i} \end{pmatrix} e^{\imath \dot{\psi}_{m,k} t}. \tag{81}
\]

This means the inhomogeneity is a Fourier series, and since the differential equation is linear, the superposition principle can be applied and each term of the force function can be treated separately. For each term a differential equation with constant coefficients and a periodic inhomogeneity has to be solved. The solution is elementary and consists of a superposition of two periodic solutions: one with the eigenfrequency $n$ of the homogeneous differential equation and one with the frequency $\dot{\psi}_{m,k}$ of the excitation term. Hence, the partial derivatives $\boldsymbol{\xi}_k$ are series of periodic functions with the frequencies $n$ and $\dot{\psi}_{m,k}$. These partial derivatives have to be transformed back from the orbital system to the Earth-fixed system according to

\[
\frac{\partial \mathbf{x}'}{\partial p_k} = R\, \boldsymbol{\xi}_k = R\!\left(-u - \frac{\pi}{2},\; -i,\; -\Omega + \Theta + \frac{\pi}{2}\right) \boldsymbol{\xi}_k. \tag{82}
\]

This means that in Colombo’s modification, step 2 of the variational equation approach is replaced by the closed solution of the variational equations in the orbital frame and the back-transformation of the partial derivatives $\boldsymbol{\xi}_k$ into the Earth-fixed system.
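To make the closed-form solution concrete, consider only the third (cross-track) component of (79), which decouples into a forced harmonic oscillator. The Python sketch below is an illustration under stated assumptions: it treats a single Fourier term with complex amplitude `force_amp`, returns only the particular solution, excludes the resonant case, and uses hypothetical names and numbers.

```python
import cmath
import math


def cross_track_amplitude(n, omega, force_amp):
    """Amplitude of the particular solution of xi'' + n^2 xi = F exp(i*omega*t)."""
    if math.isclose(abs(omega), n):
        raise ValueError("resonant excitation: no bounded periodic solution")
    return force_amp / (n * n - omega * omega)


def cross_track_partial(t, n, omega, force_amp):
    """Particular solution evaluated at time t."""
    return cross_track_amplitude(n, omega, force_amp) * cmath.exp(1j * omega * t)


# Hypothetical values: mean motion and excitation frequency in rad/s
n, omega, F = 1.1e-3, 2.3e-3, 1.0e-9
print(cross_track_partial(100.0, n, omega, F))
```

The complete solution adds the homogeneous part with frequency `n`; the first two components of (79) are coupled and lead to an analogous, slightly longer closed form.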
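Colombo’s Modification

1. With an initial guess $\mathbf{p}_0$ for the unknown parameters, compute a reference orbit $\mathbf{x}'(t)$.
2. Find the orbital elements $a, i, \Omega, M(t_0)$ of the best-fitting circular repeat orbit.
3. Find the Fourier series representation of $\nabla T$ according to (81).
4. For each Fourier term, solve Eqs. (79) for the partial derivatives $\boldsymbol{\xi}_k$ in the orbital system.
5. For each GPS satellite $l$, build the matrix
\[
X_l = \begin{pmatrix}
\dfrac{\left(\mathbf{x}'_l(t_1) - \mathbf{x}'(t_1)\right)^{\top}}{\left\| \mathbf{x}'_l(t_1) - \mathbf{x}'(t_1) \right\|}\, R\,\boldsymbol{\xi}_1(t_1) & \dots & \dfrac{\left(\mathbf{x}'_l(t_1) - \mathbf{x}'(t_1)\right)^{\top}}{\left\| \mathbf{x}'_l(t_1) - \mathbf{x}'(t_1) \right\|}\, R\,\boldsymbol{\xi}_N(t_1) \\
\vdots & & \vdots \\
\dfrac{\left(\mathbf{x}'_l(t_n) - \mathbf{x}'(t_n)\right)^{\top}}{\left\| \mathbf{x}'_l(t_n) - \mathbf{x}'(t_n) \right\|}\, R\,\boldsymbol{\xi}_1(t_n) & \dots & \dfrac{\left(\mathbf{x}'_l(t_n) - \mathbf{x}'(t_n)\right)^{\top}}{\left\| \mathbf{x}'_l(t_n) - \mathbf{x}'(t_n) \right\|}\, R\,\boldsymbol{\xi}_N(t_n)
\end{pmatrix}.
\]
6. Collect all GPS pseudo-range residuals
\[
\mathbf{s} = \left( s_1(t_1), \dots, s_1(t_n), s_2(t_1), \dots, s_2(t_n), \dots, s_4(t_n) \right)^{\top},
\]
where the residuals are defined as the difference between observed pseudo-ranges and pseudo-ranges computed with the initial guess $\mathbf{p}_0$.
7. Assemble the design matrix
\[
X = \begin{pmatrix} X_1 \\ \vdots \\ X_4 \end{pmatrix}.
\]
8. Compute parameter corrections
\[
\Delta\mathbf{p} = \left( X^{\top} X \right)^{-1} X^{\top} \mathbf{s}.
\]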
5.2 Satellite-to-Satellite Tracking in the Low–Low Mode

For the determination of the gravitational field of the Earth, satellite-to-satellite tracking unfolds its full potential only in the low–low mode. In this mode two satellites measure their relative velocity, the so-called range-rate:

\[
\dot{\rho} := \left( \dot{\mathbf{x}}'_2(t,\mathbf{p}) - \dot{\mathbf{x}}'_1(t,\mathbf{p}) \right)^{\top} \mathbf{e}_{12}, \tag{83}
\]

with $\mathbf{e}_{12}$ being the line-of-sight (LOS) unit vector

\[
\mathbf{e}_{12} := \frac{\mathbf{x}'_2 - \mathbf{x}'_1}{\left\| \mathbf{x}'_2 - \mathbf{x}'_1 \right\|}. \tag{84}
\]
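Definitions (83) and (84) translate directly into code. The following is a minimal Python sketch with hypothetical function and variable names; positions and velocities are plain three-component sequences in a common frame.

```python
import math


def range_rate(x1, v1, x2, v2):
    """Range-rate (83): relative velocity projected onto the line of sight (84)."""
    dx = [b - a for a, b in zip(x1, x2)]
    dv = [b - a for a, b in zip(v1, v2)]
    rho = math.sqrt(sum(c * c for c in dx))
    e12 = [c / rho for c in dx]
    return sum(v * e for v, e in zip(dv, e12))


# Hypothetical configuration: satellites 5 units apart, slowly separating
print(range_rate((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (1.0, 0.0, 0.0)))
```

A relative velocity perpendicular to the line of sight produces no range-rate at all, which is why the measurement is sensitive only to the along-baseline part of the relative motion.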
For the determination of the parameters $\mathbf{p}$, describing the gravitational field of the Earth, the following least-squares problem has to be solved:

\[
\mathbf{p} := \operatorname*{argmin} \sum_{j=1}^{n} \left( \dot{\rho}(t_j) - \left( \dot{\mathbf{x}}'_2(t_j,\mathbf{p}) - \dot{\mathbf{x}}'_1(t_j,\mathbf{p}) \right)^{\top} \mathbf{e}_{12}(t_j,\mathbf{p}) \right)^{2}. \tag{85}
\]

As in the high–low mode, there are two groups of methods to solve this least-squares problem:

1. Conversion of the measured range-rates into artificial in situ observations as in
   • The potential-difference approach
   • The line-of-sight gradiometry approach
2. The computation of the partial derivatives $\partial\dot{\rho} / \partial\mathbf{p}$ as in
   • The variational equation approach
   • The integral equation approach
Potential-Difference Approach

The basic idea of this approach dates back to an article of Wolff in 1969. It was further developed and tested for the GRACE mission by Jekeli (1999) and Han (2004). The observation quantity is the range-rate

\[
\dot{\rho}_{12} = \dot{\mathbf{x}}_{12}^{\top}\, \mathbf{e}_{12} := \left( \dot{\mathbf{x}}_2 - \dot{\mathbf{x}}_1 \right)^{\top} \mathbf{e}_{12}. \tag{86}
\]

Using the energy conservation in the inertial frame

\[
V = \frac{1}{2}\, \dot{\mathbf{x}}^{\top} \dot{\mathbf{x}} - E_0 \tag{87}
\]

and forming the along-track derivative $d_a$ on both sides yields

\[
d_a V = \dot{\mathbf{x}}^{\top} d_a \dot{\mathbf{x}}. \tag{88}
\]

If both satellites are close to each other, one can approximate

\[
V_{12} := V(\mathbf{x}_2) - V(\mathbf{x}_1) \approx \left\| \dot{\mathbf{x}}_1 \right\| \dot{\rho}_{12}. \tag{89}
\]

So far the developments are made in an inertial system for a static potential $V$. In reality the Earth rotates and therefore the potential is time dependent. In order to remove this time dependency, the energy balance has to be considered in a rotating system. Doing this, the centrifugal potential has to be added to the energy balance, yielding the following observation equation for the potential differences:
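\[
V'(\mathbf{x}_2) - V'(\mathbf{x}_1) + E_0 \approx \left\| \dot{\mathbf{x}}_1 \right\| \dot{\rho}_{12} - \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'_2\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'_2\right) + \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'_1\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'_1\right). \tag{90}
\]

Using the basis function representation of the gravitational field, the following relationship between range-rates and gravitational field parameters can be established:

\[
\sum_{z \in \mathcal{C} \subset \mathbb{Z}^r} c_z \left( \Phi_z\!\left(\mathbf{x}'_2, \boldsymbol{\zeta}_z\right) - \Phi_z\!\left(\mathbf{x}'_1, \boldsymbol{\zeta}_z\right) \right) + E_0 \approx \left\| \dot{\mathbf{x}}_1 \right\| \dot{\rho}_{12} - \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'_2\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'_2\right) + \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'_1\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'_1\right). \tag{91}
\]

If either the basis functions do not depend on the parameters $\boldsymbol{\zeta}_z$ or if these parameters are fixed a priori, Eqs. (91) are linear in the unknown coefficients $c_z$, and these coefficients can be determined by a standard linear least-squares technique:

\[
\sum_{j=1}^{n} \left( \left\| \dot{\mathbf{x}}_1(t_j) \right\| \dot{\rho}_{12}(t_j) - \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'_2(t_j)\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'_2(t_j)\right) + \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'_1(t_j)\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'_1(t_j)\right) - \left[ \sum_{z} c_z \left( \Phi_z\!\left(\mathbf{x}'_2(t_j)\right) - \Phi_z\!\left(\mathbf{x}'_1(t_j)\right) \right) + E_0 \right] \right)^{2} \to \min. \tag{92}
\]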
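Potential-Difference Approach

1. Multiply the measured range-rates with the total velocity of the trailing satellite
\[
\bar{s}_j = \dot{\rho}_{12}(t_j)\, \left\| \dot{\mathbf{x}}'_1(t_j) \right\|.
\]
2. Reduce this pseudo-observation by the difference in the centrifugal potential
\[
s_j = \bar{s}_j - \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'_2(t_j)\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'_2(t_j)\right) + \frac{1}{2} \left(\boldsymbol{\omega} \times \mathbf{x}'_1(t_j)\right)^{\top} \left(\boldsymbol{\omega} \times \mathbf{x}'_1(t_j)\right).
\]
3. Solve the linear least-squares problem
\[
\sum_{j=1}^{n} \left( s_j - \left[ \sum_{z} c_z \left( \Phi_z\!\left(\mathbf{x}'_2(t_j)\right) - \Phi_z\!\left(\mathbf{x}'_1(t_j)\right) \right) + E_0 \right] \right)^{2} \to \min.
\]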
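Steps 1 and 2 of the list above amount to a few vector operations per epoch. The sketch below is a hedged illustration: the function name, the argument layout, and the sample values are assumptions of this example, and step 3 (the least-squares fit itself) is not shown.

```python
import math


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def potential_difference_observation(rho_dot, x1, v1, x2, omega):
    """Pseudo-observation s_j: range-rate times speed, reduced by centrifugal terms."""
    speed1 = math.sqrt(sum(c * c for c in v1))        # step 1
    wx1 = cross(omega, x1)
    wx2 = cross(omega, x2)
    centrifugal1 = 0.5 * sum(c * c for c in wx1)
    centrifugal2 = 0.5 * sum(c * c for c in wx2)
    return speed1 * rho_dot - centrifugal2 + centrifugal1  # step 2


# Hypothetical epoch with Earth's rotation vector along the third axis
s = potential_difference_observation(
    1.0e-4, (6.8e6, 0.0, 0.0), (0.0, 0.0, 7600.0), (6.8e6, 2.0e5, 0.0),
    (0.0, 0.0, 7.292115e-5),
)
print(s)
```

With the rotation vector set to zero the result reduces to the inertial approximation (89).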
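Line-of-Sight Gradiometry

A different technique to convert range-rates into in situ observations is the line-of-sight gradiometry. This technique is described in Blaha (1992), Heß and Keller (1999), and Keller and Sharifi (2005). The main idea of this approach is to compute the time derivative of the observed range-rates:

\[
\ddot{\rho} = \left( \ddot{\mathbf{x}}_2 - \ddot{\mathbf{x}}_1 \right)^{\top} \mathbf{e}_{12} + \frac{\left\| \dot{\mathbf{x}}_2 - \dot{\mathbf{x}}_1 \right\|^2 - \dot{\rho}^2}{\rho}, \tag{93}
\]

with the measured range-rate $\dot{\rho}$ and the inter-satellite range $\rho = \left\| \mathbf{x}_2 - \mathbf{x}_1 \right\|$. Since $\ddot{\mathbf{x}}_i = \nabla V(\mathbf{x}_i)$ holds, Eq. (93) can be recast into an observation equation for the differences in the potential gradients:

\[
\left( \nabla V(\mathbf{x}_2) - \nabla V(\mathbf{x}_1) \right)^{\top} \mathbf{e}_{12} = \ddot{\rho} - \frac{\left\| \dot{\mathbf{x}}_2 - \dot{\mathbf{x}}_1 \right\|^2 - \dot{\rho}^2}{\rho}. \tag{94}
\]

A Taylor expansion of the left-hand side of this equation at the satellite midpoint $\mathbf{x} := \frac{1}{2}(\mathbf{x}_1 + \mathbf{x}_2)$ yields

\[
\mathbf{e}_{12}^{\top}\, \nabla^2 V(\mathbf{x})\, \mathbf{e}_{12} \approx \frac{\ddot{\rho}}{\rho} - \frac{\left\| \dot{\mathbf{x}}_2 - \dot{\mathbf{x}}_1 \right\|^2 - \dot{\rho}^2}{\rho^2}. \tag{95}
\]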
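The left side of this equation is the gravity gradient in the direction of the line-of-sight between the two satellites, which gives this approach its name. If now the basis function representation of the gravitational potential is inserted, we arrive at a least-squares problem for the parameters of this representation:

\[
(c_z, \boldsymbol{\zeta}_z) = \operatorname*{argmin} \sum_{j=1}^{n} \left( \Gamma_{yy}(t_j) - \sum_{z \in \mathcal{C} \subset \mathbb{Z}^r} c_z\, \mathbf{e}_{12}^{\top}\, \nabla^2 \Phi_z\!\left(\mathbf{x}(t_j), \boldsymbol{\zeta}_z\right) \mathbf{e}_{12} \right)^{2}. \tag{96}
\]

In this equation the quantity $\Gamma_{yy}$ is an abbreviation for the artificial observation

\[
\Gamma_{yy} := \frac{\ddot{\rho}}{\rho} - \frac{\left\| \dot{\mathbf{x}}_2 - \dot{\mathbf{x}}_1 \right\|^2 - \dot{\rho}^2}{\rho^2}. \tag{97}
\]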
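Instead of the Earth-fixed system, the problem can also be treated in the orbital system. This has the advantage that the line-of-sight gradient can be expressed as the linear combination of the second-order derivative in $u$ direction and of the first-order derivative in radial direction:

\[
\mathbf{e}_{12}^{\top}\, \nabla^2 V(\mathbf{x})\, \mathbf{e}_{12} = \frac{1}{r^2} \frac{\partial^2 V}{\partial u^2} + \frac{1}{r} \frac{\partial V}{\partial r}, \tag{98}
\]

which makes the computation of all components of the Marussi tensor $\nabla^2 V$ obsolete. The least-squares problem simplifies to

\[
\left( c_{m,k}, \boldsymbol{\zeta}_{m,k} \right) = \operatorname*{argmin} \sum_{j=1}^{n} \left( \Gamma_{yy}(t_j) - \sum_{m,k} c_{m,k} \left[ \frac{1}{a^2} \frac{\partial^2 \Phi_{m,k}\!\left(u, \Lambda, \boldsymbol{\zeta}_{m,k}\right)}{\partial u^2}(t_j) + \frac{1}{a} \frac{\partial \Phi_{m,k}\!\left(u, \Lambda, \boldsymbol{\zeta}_{m,k}\right)}{\partial a}(t_j) \right] \right)^{2}. \tag{99}
\]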
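For spherical harmonics as basis functions, their representation (23) in the orbital system leads to the following simplified least-squares problem:

\[
\left( c_{m,k}, \boldsymbol{\zeta}_{m,k} \right) = \operatorname*{argmin} \sum_{j=1}^{n} \left( \Gamma_{yy}(t_j) - \sum_{m,k} c_{m,k}\, \Phi_{m,k}\!\left(u_j, \Lambda_j\right)(t_j) \right)^{2}, \tag{100}
\]

with

\[
\Phi_{m,k}\!\left(u_j, \Lambda_j\right) = \left[ \frac{GM}{R^3} \sum_{l=\max\{|k|,|m|\}}^{\infty} \left(\frac{R}{r}\right)^{l+3} \left( -l - 1 - k^2 \right) \imath^{\,k-m}\, d^{l}_{m,k}(i)\, \overline{P}_l^{\,k}(0) \right] e^{\imath(k u_j + m \Lambda_j)}. \tag{101}
\]

All in all, for spherical harmonics as basis functions, the line-of-sight gradiometry approach consists of the following steps:

Line-of-Sight Gradiometry Approach

1. Convert the measured range-rate $\dot{\rho}$ into synthetic relative accelerations $\ddot{\rho}$ by some numerical differentiation scheme.
2. Compute the line-of-sight gravity gradient $\Gamma_{yy}(t_j)$ according to (97).
3. Solve the linear least-squares problem
\[
\sum_{j=1}^{n} \left( \Gamma_{yy}(t_j) - \sum_{m,k} c_{m,k}\, \Phi_{m,k}\!\left(u_j, \Lambda_j\right)(t_j) \right)^{2} \to \min.
\]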
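Steps 1 and 2 of this procedure can be sketched in a few lines. The example below is an assumption-laden illustration: a simple central difference stands in for "some numerical differentiation scheme", equidistant sampling with step `h` is assumed, and the function names are hypothetical.

```python
def central_difference(values, h):
    """Central-difference derivative at the interior samples of an equidistant series."""
    return [(values[j + 1] - values[j - 1]) / (2.0 * h) for j in range(1, len(values) - 1)]


def los_gradient(rho, rho_dot, rho_ddot, dv):
    """Artificial observation (97) from range, range-rate, range acceleration
    and the relative velocity vector dv of the two satellites."""
    dv2 = sum(c * c for c in dv)
    return rho_ddot / rho - (dv2 - rho_dot**2) / rho**2


# Hypothetical range-rate samples (m/s) at 5 s spacing
rates = [0.10, 0.12, 0.15, 0.19]
accelerations = central_difference(rates, 5.0)
print(los_gradient(2.0e5, rates[1], accelerations[0], (0.12, 0.3, 0.0)))
```

Because numerical differentiation amplifies high-frequency noise, the choice of differentiation scheme matters considerably more in practice than this sketch suggests.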
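Variational Equation Approach

The variational equation approach is used by several groups concerned with the processing of GRACE data (Beutler et al. 2010a,b). It is centered around the nonlinear least-squares problem

\[
\sum_{j=1}^{n} \left( \dot{\rho}(t_j) - \left( \dot{\mathbf{x}}'_2(t_j) - \dot{\mathbf{x}}'_1(t_j) \right)^{\top} \mathbf{e}'_{12}(t_j) \right)^{2} \to \min. \tag{102}
\]

To solve this nonlinear least-squares problem, the computation of the partial derivatives of the model range-rates

\[
\dot{\rho}_M(t) = \left( \dot{\mathbf{x}}'_2(t) - \dot{\mathbf{x}}'_1(t) \right)^{\top} \mathbf{e}'_{12}(t)
\]

with respect to the unknown parameters $\mathbf{p}$ is an essential point. Obviously, we have

\[
\begin{aligned}
\frac{\partial \dot{\rho}_M}{\partial p_k}
&= \mathbf{e}'^{\top}_{12}\, \frac{\partial \left( \dot{\mathbf{x}}'_2 - \dot{\mathbf{x}}'_1 \right)}{\partial p_k} + \frac{1}{\rho} \left( \dot{\mathbf{x}}'_2 - \dot{\mathbf{x}}'_1 - \frac{\dot{\rho}}{\rho} \left( \mathbf{x}'_2 - \mathbf{x}'_1 \right) \right)^{\top} \frac{\partial \left( \mathbf{x}'_2 - \mathbf{x}'_1 \right)}{\partial p_k} \\
&= \mathbf{e}'^{\top}_{12} \left( \dot{\boldsymbol{\xi}}^{(2)}_k - \dot{\boldsymbol{\xi}}^{(1)}_k \right) + \frac{1}{\rho} \left( \dot{\mathbf{x}}'_2 - \dot{\mathbf{x}}'_1 - \frac{\dot{\rho}}{\rho} \left( \mathbf{x}'_2 - \mathbf{x}'_1 \right) \right)^{\top} \left( \boldsymbol{\xi}^{(2)}_k - \boldsymbol{\xi}^{(1)}_k \right).
\end{aligned} \tag{103}
\]

For each satellite $(i)$ the quantities $\boldsymbol{\xi}^{(i)}_k$ are computed as the solutions of the variational equations as described in Sect. 5.1. Hence, the variational equation approach for the determination of the gravity field parameters $\mathbf{p}$ from the observed range-rates $\dot{\rho}$ consists of the following steps:

Variational Equation Approach

1. For each satellite solve the variational equations for the partial derivatives $\boldsymbol{\xi}^{(i)}_k$ according to Sect. 5.1.
2. Compute the partial derivatives of the model range-rates with respect to the parameters $p_k$ as
\[
\frac{\partial \dot{\rho}_M}{\partial p_k} = \mathbf{e}'^{\top}_{12} \left( \dot{\boldsymbol{\xi}}^{(2)}_k - \dot{\boldsymbol{\xi}}^{(1)}_k \right) + \frac{1}{\rho} \left( \dot{\mathbf{x}}'_2 - \dot{\mathbf{x}}'_1 - \frac{\dot{\rho}}{\rho} \left( \mathbf{x}'_2 - \mathbf{x}'_1 \right) \right)^{\top} \left( \boldsymbol{\xi}^{(2)}_k - \boldsymbol{\xi}^{(1)}_k \right).
\]
3. Solve the linear least-squares problem
\[
\sum_{j=1}^{n} \left( \Delta\dot{\rho}(t_j) - \sum_{k} \frac{\partial \dot{\rho}_M}{\partial p_k}\, \Delta p_k \right)^{2} \to \min.
\]

In principle, an iterative solution of the nonlinear least-squares problem is also possible. But taking into account that rather good a priori values $\mathbf{p}_0$ for $\mathbf{p}$ are known and that the solution of the variational equations is costly, in most cases only one single step is carried out.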
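The chain rule behind the range-rate partials can be checked against finite differences. The following Python sketch is illustrative only: it works with the relative position `dx` and relative velocity `dv` of the two satellites, takes the differences of the orbit partials (`d_xi`, `d_xi_dot`) as given inputs rather than solving any variational equations, and all names and numbers are hypothetical.

```python
import math


def range_rate(dx, dv):
    """Model range-rate from relative position dx and relative velocity dv."""
    rho = math.sqrt(sum(c * c for c in dx))
    return sum(v * x for v, x in zip(dv, dx)) / rho


def range_rate_partial(dx, dv, d_xi, d_xi_dot):
    """Partial derivative of the range-rate with respect to one parameter.

    d_xi     : difference of the position partials of satellites 2 and 1
    d_xi_dot : difference of the velocity partials of satellites 2 and 1
    """
    rho = math.sqrt(sum(c * c for c in dx))
    rho_dot = range_rate(dx, dv)
    e12 = [c / rho for c in dx]
    first = sum(e * d for e, d in zip(e12, d_xi_dot))
    second = sum((v - rho_dot / rho * x) * d for v, x, d in zip(dv, dx, d_xi)) / rho
    return first + second


dx, dv = (3.0, 4.0, 1.0), (0.2, -0.1, 0.3)
d_xi, d_xi_dot = (0.5, -0.2, 0.1), (0.01, 0.02, -0.03)
print(range_rate_partial(dx, dv, d_xi, d_xi_dot))
```

Perturbing `dx` and `dv` along `d_xi` and `d_xi_dot` by a small step and differencing `range_rate` reproduces the analytic value, which is a convenient unit test for an implementation of step 2.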
5.3 Integral Equations Approach
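The integral equation approach, fully developed in Mayer-Gürr et al. (2007) and Mayer-Gürr (2012), also tackles the least-squares problem (102). Its difference to the variational equation approach lies in the way the quantities $\boldsymbol{\xi}^{(i)}_k$ and $\dot{\boldsymbol{\xi}}^{(i)}_k$ in the partial derivatives of the model range-rates with respect to the unknown parameters $\mathbf{p}$ are computed. In the integral equations approach, the point of departure is not the equation of motion (40) but its equivalent integral equation counterpart (41). If now in (43) the unknown potential $V'$ is replaced by its basis function representation, for each satellite $(i)$ we obtain the following approximation of the integral equation describing its motion:

\[
\tilde{\mathbf{x}}^{(i)}(\tau) = \tilde{\mathbf{x}}^{(i)}(0)(1-\tau) + \tilde{\mathbf{x}}^{(i)}(1)\,\tau - T^2 \int_0^1 K(\tau,\tau')\, \mathbf{f}^{(i)}(\tau',\mathbf{p})\, d\tau' \tag{104}
\]

with

\[
\mathbf{f}^{(i)}(\tau',\mathbf{p}) := \nabla \left( \sum_{z \in \mathcal{C} \subset \mathbb{Z}^r} c_z\, \Phi_z\!\left(\mathbf{x}'^{(i)}(\tau' T), \boldsymbol{\zeta}_z\right) \right) + \mathbf{F}'\!\left(\mathbf{x}'^{(i)}(\tau' T), \dot{\mathbf{x}}'^{(i)}(\tau' T), \tau' T\right) - \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{x}'^{(i)}(\tau' T)\right) - 2\boldsymbol{\omega} \times \dot{\mathbf{x}}'^{(i)}(\tau' T). \tag{105}
\]

The vector $\mathbf{p}$ collects all unknown parameters $c_z, \boldsymbol{\zeta}_z$ in the basis function representation of the unknown potential. For the determination of the parameters $\mathbf{p}$ from the observed positions and velocities, their partial derivatives with respect to the parameters are needed. The partial derivatives solve the integral equations

\[
\begin{aligned}
\boldsymbol{\xi}^{(i)}(\tau) := \frac{\partial \tilde{\mathbf{x}}^{(i)}(\tau)}{\partial \mathbf{p}}
&= -T^2 \int_0^1 K(\tau,\tau') \left[ \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{x}'}(\tau',\mathbf{p})\, \frac{\partial \tilde{\mathbf{x}}^{(i)}(\tau')}{\partial \mathbf{p}} + \frac{\partial \mathbf{f}^{(i)}}{\partial \dot{\mathbf{x}}'}(\tau',\mathbf{p})\, \frac{\partial \dot{\tilde{\mathbf{x}}}^{(i)}(\tau')}{\partial \mathbf{p}} + \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{p}}(\tau',\mathbf{p}) \right] d\tau' \\
&= -T^2 \int_0^1 K(\tau,\tau') \left[ \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{x}'}(\tau',\mathbf{p})\, \boldsymbol{\xi}^{(i)}(\tau') + \frac{\partial \mathbf{f}^{(i)}}{\partial \dot{\mathbf{x}}'}(\tau',\mathbf{p})\, \dot{\boldsymbol{\xi}}^{(i)}(\tau') + \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{p}}(\tau',\mathbf{p}) \right] d\tau'
\end{aligned} \tag{106}
\]

and

\[
\begin{aligned}
\dot{\boldsymbol{\xi}}^{(i)}(\tau) := \frac{\partial \dot{\tilde{\mathbf{x}}}^{(i)}(\tau)}{\partial \mathbf{p}}
&= -T^2 \int_0^1 \frac{\partial K}{\partial \tau}(\tau,\tau') \left[ \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{x}'}(\tau',\mathbf{p})\, \frac{\partial \tilde{\mathbf{x}}^{(i)}(\tau')}{\partial \mathbf{p}} + \frac{\partial \mathbf{f}^{(i)}}{\partial \dot{\mathbf{x}}'}(\tau',\mathbf{p})\, \frac{\partial \dot{\tilde{\mathbf{x}}}^{(i)}(\tau')}{\partial \mathbf{p}} + \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{p}}(\tau',\mathbf{p}) \right] d\tau' \\
&= -T^2 \int_0^1 \frac{\partial K}{\partial \tau}(\tau,\tau') \left[ \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{x}'}(\tau',\mathbf{p})\, \boldsymbol{\xi}^{(i)}(\tau') + \frac{\partial \mathbf{f}^{(i)}}{\partial \dot{\mathbf{x}}'}(\tau',\mathbf{p})\, \dot{\boldsymbol{\xi}}^{(i)}(\tau') + \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{p}}(\tau',\mathbf{p}) \right] d\tau',
\end{aligned} \tag{107}
\]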
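respectively. If now the integrals are approximated by quadrature formulas

\[
\begin{aligned}
I &:= \int_0^1 K(\tau,\tau') \left[ \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{x}'}(\tau',\mathbf{p})\, \boldsymbol{\xi}^{(i)}(\tau') + \frac{\partial \mathbf{f}^{(i)}}{\partial \dot{\mathbf{x}}'}(\tau',\mathbf{p})\, \dot{\boldsymbol{\xi}}^{(i)}(\tau') + \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{p}}(\tau',\mathbf{p}) \right] d\tau' \\
&\approx \sum_{l=1}^{n} w_l\, K(\tau,\tau'_l) \left[ \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{x}'}(\tau'_l,\mathbf{p})\, \boldsymbol{\xi}^{(i)}(\tau'_l) + \frac{\partial \mathbf{f}^{(i)}}{\partial \dot{\mathbf{x}}'}(\tau'_l,\mathbf{p})\, \dot{\boldsymbol{\xi}}^{(i)}(\tau'_l) + \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{p}}(\tau'_l,\mathbf{p}) \right],
\end{aligned} \tag{108}
\]

\[
\begin{aligned}
I' &:= \int_0^1 \frac{\partial K}{\partial \tau}(\tau,\tau') \left[ \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{x}'}(\tau',\mathbf{p})\, \boldsymbol{\xi}^{(i)}(\tau') + \frac{\partial \mathbf{f}^{(i)}}{\partial \dot{\mathbf{x}}'}(\tau',\mathbf{p})\, \dot{\boldsymbol{\xi}}^{(i)}(\tau') + \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{p}}(\tau',\mathbf{p}) \right] d\tau' \\
&\approx \sum_{l=1}^{n} w_l\, \frac{\partial K}{\partial \tau}(\tau,\tau'_l) \left[ \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{x}'}(\tau'_l,\mathbf{p})\, \boldsymbol{\xi}^{(i)}(\tau'_l) + \frac{\partial \mathbf{f}^{(i)}}{\partial \dot{\mathbf{x}}'}(\tau'_l,\mathbf{p})\, \dot{\boldsymbol{\xi}}^{(i)}(\tau'_l) + \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{p}}(\tau'_l,\mathbf{p}) \right],
\end{aligned} \tag{109}
\]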
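we obtain two linear systems of equations for the unknown partial derivatives:

\[
\boldsymbol{\xi}^{(i)}(\tau_j) = -T^2 \sum_{l=1}^{n} w_l\, K(\tau_j,\tau'_l) \left[ \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{x}'}(\tau'_l,\mathbf{p})\, \boldsymbol{\xi}^{(i)}(\tau'_l) + \frac{\partial \mathbf{f}^{(i)}}{\partial \dot{\mathbf{x}}'}(\tau'_l,\mathbf{p})\, \dot{\boldsymbol{\xi}}^{(i)}(\tau'_l) + \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{p}}(\tau'_l,\mathbf{p}) \right], \tag{110}
\]

\[
\dot{\boldsymbol{\xi}}^{(i)}(\tau_j) = -T^2 \sum_{l=1}^{n} w_l\, \frac{\partial K}{\partial \tau}(\tau_j,\tau'_l) \left[ \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{x}'}(\tau'_l,\mathbf{p})\, \boldsymbol{\xi}^{(i)}(\tau'_l) + \frac{\partial \mathbf{f}^{(i)}}{\partial \dot{\mathbf{x}}'}(\tau'_l,\mathbf{p})\, \dot{\boldsymbol{\xi}}^{(i)}(\tau'_l) + \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{p}}(\tau'_l,\mathbf{p}) \right]. \tag{111}
\]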
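If the unknown partial derivatives with respect to the $k$th parameter $p_k$ are assembled into column vectors

\[
Z^{(i)}_k := \operatorname{vec}\!\left( \boldsymbol{\xi}^{(i)}_k(\tau_1), \dots, \boldsymbol{\xi}^{(i)}_k(\tau_n), \dot{\boldsymbol{\xi}}^{(i)}_k(\tau_1), \dots, \dot{\boldsymbol{\xi}}^{(i)}_k(\tau_n) \right), \tag{112}
\]

these linear equations can be written in matrix form as

\[
\left( I + \begin{pmatrix} A^{(i)} & B^{(i)} \\ C^{(i)} & D^{(i)} \end{pmatrix} \right) Z^{(i)}_k = \mathbf{b}^{(i)}_k \tag{113}
\]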
with the block matrices, whose block in row $j$ and column $l$ ($j, l = 1, \dots, n$) is given in brackets,

\[
A^{(i)} = T^2 \left[ w_l\, K(\tau_j,\tau'_l)\, \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{x}'}(\tau'_l,\mathbf{p}) \right]_{j,l=1}^{n}, \tag{114}
\]

\[
B^{(i)} = T^2 \left[ w_l\, K(\tau_j,\tau'_l)\, \frac{\partial \mathbf{f}^{(i)}}{\partial \dot{\mathbf{x}}'}(\tau'_l,\mathbf{p}) \right]_{j,l=1}^{n}, \tag{115}
\]

\[
C^{(i)} = T^2 \left[ w_l\, \frac{\partial K}{\partial \tau}(\tau_j,\tau'_l)\, \frac{\partial \mathbf{f}^{(i)}}{\partial \mathbf{x}'}(\tau'_l,\mathbf{p}) \right]_{j,l=1}^{n}, \tag{116}
\]

\[
D^{(i)} = T^2 \left[ w_l\, \frac{\partial K}{\partial \tau}(\tau_j,\tau'_l)\, \frac{\partial \mathbf{f}^{(i)}}{\partial \dot{\mathbf{x}}'}(\tau'_l,\mathbf{p}) \right]_{j,l=1}^{n}, \tag{117}
\]

and

\[
\mathbf{b}^{(i)}_k = -T^2 \operatorname{vec}\!\left( \sum_{l=1}^{n} w_l\, K(\tau_1,\tau'_l)\, \frac{\partial \mathbf{f}^{(i)}}{\partial p_k}(\tau'_l,\mathbf{p}),\; \dots,\; \sum_{l=1}^{n} w_l\, K(\tau_n,\tau'_l)\, \frac{\partial \mathbf{f}^{(i)}}{\partial p_k}(\tau'_l,\mathbf{p}) \right). \tag{118}
\]
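The matrices $A^{(i)}, B^{(i)}, C^{(i)}, D^{(i)}$ are independent of the parameter $p_k$. Hence the main numerical effort, the LU-decomposition, has to be carried out only once per satellite, no matter how many parameters $p_k$ are to be determined. This leads to the following algorithm:

Integral Equation Approach

1. For each satellite compute the matrices $A^{(i)}, B^{(i)}, C^{(i)}, D^{(i)}$ according to (114)–(117).
2. For each parameter $p_k$
   • Compute the vector $\mathbf{b}^{(i)}_k$ according to (118).
   • Compute the partial derivatives $Z^{(i)}_k$ with respect to the parameter $p_k$ as the solution of (113).
   • Compute the partial derivatives of the model range-rates with respect to the parameters $p_k$ as
\[
\frac{\partial \dot{\rho}_M}{\partial p_k} = \mathbf{e}^{\top}_{12} \left( \dot{\boldsymbol{\xi}}^{(2)}_k - \dot{\boldsymbol{\xi}}^{(1)}_k \right) + \frac{1}{\rho} \left( \dot{\mathbf{x}}'_2 - \dot{\mathbf{x}}'_1 - \frac{\dot{\rho}}{\rho} \left( \mathbf{x}'_2 - \mathbf{x}'_1 \right) \right)^{\top} \left( \boldsymbol{\xi}^{(2)}_k - \boldsymbol{\xi}^{(1)}_k \right).
\]
3. Solve the linear least-squares problem
\[
\sum_{j=1}^{n} \left( \Delta\dot{\rho}(t_j) - \sum_{k} \frac{\partial \dot{\rho}_M}{\partial p_k}\, \Delta p_k \right)^{2} \to \min.
\]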
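The structure of the discretized system can be illustrated on a scalar toy problem. The sketch below is not the GRACE processing scheme: it assumes a one-dimensional force $f = -w^2 x + p$ without velocity dependence (so only the block corresponding to $A$ appears), uses the midpoint rule as the quadrature formula, and solves the system by plain Gaussian elimination instead of a stored LU-decomposition. All names are hypothetical.

```python
import math


def kernel(tau, tau_p):
    """Integral kernel K(tau, tau') of the boundary value problem."""
    return tau_p * (1.0 - tau) if tau_p <= tau else tau * (1.0 - tau_p)


def solve(a, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][j] * x[j] for j in range(r + 1, n))) / m[r][r]
    return x


def sensitivity(n, T, df_dx, df_dp):
    """Solve (I + A) xi = b for the partial derivative xi = dx/dp at n nodes."""
    h = 1.0 / n                      # midpoint-rule weight
    nodes = [(j + 0.5) * h for j in range(n)]
    a = [
        [(1.0 if j == l else 0.0) + T * T * h * kernel(tj, tl) * df_dx
         for l, tl in enumerate(nodes)]
        for j, tj in enumerate(nodes)
    ]
    b = [-T * T * h * sum(kernel(tj, tl) * df_dp for tl in nodes) for tj in nodes]
    return nodes, solve(a, b)


# Toy force f = -w^2 x + p with w = 1, arc length T = 1
nodes, xi = sensitivity(100, 1.0, df_dx=-1.0, df_dp=1.0)
print(nodes[50], xi[50])
```

For this toy force the exact sensitivity is $\xi(\tau) = 1 - \cos\tau - \frac{1-\cos 1}{\sin 1}\sin\tau$, which the discrete solution reproduces to within the quadrature error. The system matrix depends only on `df_dx`, so additional parameters only change the right-hand side, mirroring the remark about the single decomposition per satellite.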
6 Regional Gravity Field Models

Spherical harmonics are suitable basis functions for a global recovery of the gravitational field of the Earth. Because of their global support on the sphere, they are best suited for a homogeneous data distribution on the sphere. Since in general SST satellites have almost polar orbits, their orbital arcs converge toward the poles. Therefore, the data distribution at the poles is much denser than around the equator. In this situation it is reasonable to combine a global gravity field model, based on a spherical harmonics expansion, with regional improvements of this global field. The latter are represented by basis functions with a local support or by basis functions which are at least rapidly decaying. A technique relying on basis functions with a local support is the so-called mascons technique. This technique was developed by Muller and Sjogren (1968) for the recovery of the lunar gravitational field. Due to the fact that constraints between the basis functions can be introduced, this technique has been applied to GRACE data by several authors (Luthcke et al. 2008; Rowlands et al. 2010). Nevertheless, the majority of authors use RBFs for the regional improvement of a global gravity field solution. In this respect the contributions of Schmidt et al. (2006), Fengler et al. (2007), Eicker (2012), Klees et al. (2008) and others have to be mentioned. The basic idea of the regional improvement by RBFs is the separation of the data set $\{\mathbf{s}_i\}_{i=1}^{N}$ in Eq. (10) into two disjoint subsets $\{\mathbf{s}_i\}_{i=1}^{M} \cup \{\mathbf{s}_i\}_{i=M+1}^{N}$. The first subset is used to derive a global spherical harmonics model

\[
\{K_{l,m}\} = \operatorname*{argmin} \sum_{i=1}^{M} \left\| \mathbf{s}_i - F\!\left( \mathbf{x}_1\!\left(ih, \tilde{V}(K_{l,m})\right), \dot{\mathbf{x}}_1\!\left(ih, \tilde{V}(K_{l,m})\right), \mathbf{x}_2\!\left(ih, \tilde{V}(K_{l,m})\right), \dot{\mathbf{x}}_2\!\left(ih, \tilde{V}(K_{l,m})\right) \right) \right\|^2 \tag{119}
\]

with

\[
\tilde{V} = \frac{GM}{R} \sum_{l=0}^{\infty} \left(\frac{R}{r}\right)^{l+1} \sum_{m=-l}^{l} K_{l,m}\, Y_{l,m}(\vartheta,\lambda). \tag{120}
\]

Once the global model (120) has been determined, the residual observations between the original data and the synthetic data are computed for both subsets:

\[
\mathbf{r}_i = \mathbf{s}_i - F\!\left( \mathbf{x}_1\!\left(ih, \tilde{V}(K_{l,m})\right), \dot{\mathbf{x}}_1\!\left(ih, \tilde{V}(K_{l,m})\right), \mathbf{x}_2\!\left(ih, \tilde{V}(K_{l,m})\right), \dot{\mathbf{x}}_2\!\left(ih, \tilde{V}(K_{l,m})\right) \right), \qquad i = 1, \dots, N. \tag{121}
\]
Satellite-to-Satellite Tracking (Low-Low/High-Low SST)
205
The residuals then undergo an analysis with respect to RBFs as basis functions

$$\{(c_l, \psi_l)\} = \operatorname{argmin} \sum_{i=1}^{N} \left\| r_i - F\left( x_1(ih; \hat V(c_l, \psi_l)),\ \dot x_1(ih; \hat V(c_l, \psi_l)),\ x_2(ih; \hat V(c_l, \psi_l)),\ \dot x_2(ih; \hat V(c_l, \psi_l)) \right) \right\|^2 \qquad (122)$$

with

$$\hat V = \sum_{l=1}^{L} c_l \Phi(x, \psi_l) \qquad (123)$$

and $\Phi$ according to (27). In most applications, the parameters $\psi_l$ are assigned fixed a priori values, which makes the regional improvement technique a linear least-squares problem. One of the few publications in which the parameters $\psi_l$ are also subject to optimization, i.e., in which the basis functions are allowed to change shape and position during the minimization process, is Antoni (2012).
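The two-step idea (global fit on one subset, then an RBF fit to the residuals of all data) can be illustrated by a deliberately simplified sketch. It is not the processing chain of any of the cited authors: the sphere is replaced by a circle, the observation functional F is taken as the identity (the field is sampled directly), low-order sines and cosines stand in for the spherical harmonics, and Gaussian bumps with fixed positions and width stand in for the RBFs, so that the second step is linear. All function names and numerical values are hypothetical.

```python
import numpy as np

def global_design(t, max_degree):
    """Columns 1, cos(k t), sin(k t), k = 1..max_degree (stand-in for the Y_lm)."""
    cols = [np.ones_like(t)]
    for k in range(1, max_degree + 1):
        cols.extend([np.cos(k * t), np.sin(k * t)])
    return np.column_stack(cols)

def rbf_design(t, centers, width):
    """Gaussian bumps of fixed position and width (stand-in for the RBFs)."""
    d = np.angle(np.exp(1j * (t[:, None] - centers[None, :])))  # wrapped distance
    return np.exp(-0.5 * (d / width) ** 2)

# Synthetic data: a smooth global part plus one localized feature near t = 1.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 2.0 * np.pi, 400))
signal = 1.0 + 0.5 * np.cos(2 * t) + 0.3 * np.exp(-0.5 * ((t - 1.0) / 0.15) ** 2)
s = signal + 1e-3 * rng.standard_normal(t.size)

# Step 1: global model estimated from the first subset only (every second sample).
subset = np.arange(0, t.size, 2)
A_glob = global_design(t, max_degree=3)
k_hat, *_ = np.linalg.lstsq(A_glob[subset], s[subset], rcond=None)

# Step 2: residuals for ALL samples, then a linear fit of the RBF coefficients.
r = s - A_glob @ k_hat
centers = np.linspace(0.4, 1.6, 9)          # fixed a priori positions
A_rbf = rbf_design(t, centers, width=0.15)  # fixed a priori shape
c_hat, *_ = np.linalg.lstsq(A_rbf, r, rcond=None)

rms_global = np.sqrt(np.mean(r ** 2))
rms_combined = np.sqrt(np.mean((r - A_rbf @ c_hat) ** 2))
print(f"RMS residual, global model only: {rms_global:.4f}")
print(f"RMS residual, global + regional: {rms_combined:.4f}")
```

Because positions and width are fixed, the second step is an ordinary linear least-squares fit. Letting them vary as well would turn it into a nonlinear problem of the kind treated by Antoni (2012).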
6.1 Final Remarks
So far, all arguments give only a coarse sketch of the methods applied in the processing of SST data. Only the basic ideas behind these methods have been presented. The practical application differs from the given arguments in the following aspects:
1. An important step in data preprocessing is the so-called de-aliasing. Though this step is absolutely vital for the derivation of meaningful results from the SST data, it could not be discussed here. De-aliasing means that all forces that do not fit into the simple observation model have to be either measured or modeled, and the genuine SST data have to be reduced for these forces. Hence, the concept of de-aliasing includes the following reductions:
   • For atmospheric friction
   • For tidal effects
   • For precession and nutation effects
   • For atmospheric and oceanographic loading effects
   A comprehensive description of the de-aliasing procedures is given in Bettadpur (2012).
2. In general, not the de-aliased data themselves but the de-aliased residual data are the input for the subsequent analysis. Residual data means that synthetic observations,
W. Keller
which are computed from an a priori gravity field model, are subtracted from the de-aliased data. This step reduces the effect of all approximations made in the derivation of the observation model. The consequence of forming residuals is that not the gravity field parameters themselves but only their differences from the parameters of the a priori model can be determined.
3. In general, the normal equation matrices of the least-squares problems are poorly conditioned. Therefore, in most cases different kinds of regularization are applied.
4. Due to imperfect de-aliasing, the monthly variations in GRACE-derived gravity field models show dominant north–south stripes, which obscure the desired geophysical information. Therefore, a number of filter strategies are applied in post-processing to remove these stripes (Kusche et al. 2009).
5. Since the GRACE satellites also carry onboard GPS receivers, the GRACE mission is both an hl and an ll SST mission. While the hl data are insensitive to the short wavelengths in the gravity field, the ll data are “ignorant” of the long-wavelength features. For an optimal resolution both data types have to be combined. Because of their different error budgets, a proper combination requires a preceding variance component estimation.
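The effect of regularization on poorly conditioned normal equations (remark 3) can be sketched with a small synthetic example. Tikhonov regularization with a multiple of the identity matrix is used here only as one common representative of the "different kinds of regularization". The design matrix, the noise level, and the parameter `alpha` are hypothetical and are not taken from any SST processing.

```python
import numpy as np

def solve_normal_equations(A, y, alpha=0.0):
    """Solve (A^T A + alpha I) x = A^T y; alpha = 0 is the unregularized case."""
    N = A.T @ A
    return np.linalg.solve(N + alpha * np.eye(N.shape[0]), A.T @ y)

# Synthetic ill-conditioned problem: monomials 1, t, ..., t^11 on [0, 1]
# are nearly collinear, so the normal equation matrix is close to singular.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 50)
A = np.vander(t, 12, increasing=True)
x_true = np.ones(12)
y = A @ x_true + 1e-6 * rng.standard_normal(t.size)

alpha = 1e-8                      # hypothetical regularization parameter
N = A.T @ A
cond_plain = np.linalg.cond(N)
cond_reg = np.linalg.cond(N + alpha * np.eye(12))
x_reg = solve_normal_equations(A, y, alpha=alpha)

print(f"condition number without regularization: {cond_plain:.2e}")
print(f"condition number with regularization:    {cond_reg:.2e}")
print(f"norm of the fit residual:                {np.linalg.norm(A @ x_reg - y):.2e}")
```

The regularization term bounds the smallest eigenvalue of the normal equation matrix from below by `alpha`. The price is a bias in the estimated parameters, so the choice of `alpha` is a trade-off between stability and fidelity.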
7 Missions and Outcomes
7.1 CHAMP Mission
The CHAMP mission was dedicated not only to the gravity field but also to the magnetic field of the Earth. Another important experiment carried out with CHAMP was the study of the atmosphere by radio occultation. An overview of the results after 5 years in orbit is given in Reigber et al. (2006). Different institutions processed the CHAMP data for different orbit lengths and derived individual spherical harmonic gravity field models. All these models have been collected by the International Centre for Global Earth Models (ICGEM) and can be accessed via its website (ICGEM). The models differ in the time span of data and in the resolution limit in degree and order. The highest resolution, up to degree and order 140, is given by the model EIGEN-CHAMP03S (Reigber et al. 2004). For its derivation CHAMP data from October 2000 until June 2003 were analyzed. The normal equation regularization for this model started from degree and order 60. The quality of the derived models can be evaluated by a comparison with a model of higher quality derived from GRACE data, e.g., with the GRACE-EIGEN6C2 model. Taking this model as the truth, one can see that the estimation error of the EIGEN-CHAMP03S model matches the signal strength at about degree 90. Roughly speaking, this means that from this degree on, the estimation error of a coefficient exceeds its magnitude. While this is a relative evaluation, an absolute evaluation can be done by a comparison of geoid heights, obtained by a combination of GPS heights and spirit-leveling heights, with gravity field-derived geoid heights. In this comparison the
GPS heights minus leveling heights solution is considered the master solution. The comparison shows a disagreement of about 0.8 m in North America and 1.2 m in Europe for EIGEN-CHAMP03S.
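The relative evaluation described above, i.e., finding the degree at which the estimation error reaches the signal strength, amounts to comparing two amplitude curves per degree. The following sketch shows only this comparison step. The signal curve follows Kaula's rule of thumb (amplitude of about 10^-5 / l^2), and the error curve is a purely synthetic exponential. Neither is taken from EIGEN-CHAMP03S or any other real model, and the function name is hypothetical.

```python
import numpy as np

def crossover_degree(degrees, signal_amp, error_amp):
    """Return the first degree at which the error amplitude reaches the signal."""
    degrees = np.asarray(degrees)
    exceeds = np.asarray(error_amp) >= np.asarray(signal_amp)
    if not exceeds.any():
        return None          # the error stays below the signal everywhere
    return int(degrees[np.argmax(exceeds)])

degrees = np.arange(2, 141)
signal_amp = 1e-5 / degrees.astype(float) ** 2   # Kaula-type rule of thumb
error_amp = 1e-11 * np.exp(0.06 * degrees)       # synthetic error growth (assumption)

l_cross = crossover_degree(degrees, signal_amp, error_amp)
print("error reaches signal at degree", l_cross)
```

In practice the error amplitudes would be formed from the coefficient differences to the reference model, summed per degree, instead of from an analytic curve.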
7.2 GRACE Mission
Besides static solutions for the time-invariant part of the gravitational field, the GRACE mission also provides monthly solutions. The static solutions can be accessed via the ICGEM website (ICGEM). The models differ in the time span of data and in the resolution limit in degree and order. The highest resolution, up to degree and order 180, is given by the model ITG-GRACE2010s (Mayer-Gürr et al. 2010). It uses data from August 2002 until August 2009. If evaluated against GRACE-EIGEN6C2, it shows that the estimation error matches the signal strength at about degree 160. This means an improvement by a factor of 2 compared to the best CHAMP solution. Also the comparison of the ITG-GRACE2010s model with GPS-leveling results in a disagreement of about 0.5 m for all continents, which is also better by a factor of 2 than the best CHAMP solution in Europe. The essential outcome of the GRACE mission is the monthly gravity field solutions, because they carry the imprint of mass transport processes at or close to the surface of the Earth. They can also be accessed via the ICGEM website (ICGEM). Besides the unmodified monthly solutions, monthly solutions that are de-striping filtered according to Kusche et al. (2009) are also provided. The importance of the monthly GRACE solutions for geophysical interpretation rests on the assumption that all mass changes observed by GRACE take place in a thin layer at the surface of the Earth. Under this assumption, observed changes $\Delta K_{l,m}$ in the spherical harmonics coefficients can be converted into changes of surface layer density $\Delta\sigma$, i.e., mass change per area (cf. Chao and Gross 1987):
$$\Delta\sigma(\vartheta, \lambda) = \frac{R\rho}{3} \sum_{l=0}^{\infty} \frac{2l+1}{1+k_l} \sum_{m=-l}^{l} \Delta K_{l,m} Y_{l,m}(\vartheta, \lambda). \qquad (124)$$

In this equation $\rho$ stands for the average density of the Earth and the coefficients $k_l$ are the so-called Love numbers (Han and Wahr 1995). Since the majority of mass changes that can be observed by GRACE are related to water and ice redistributions, these density changes can be converted into changes of equivalent water thickness by

$$\Delta EWT = \frac{\Delta\sigma}{\rho_w}, \qquad (125)$$
with $\rho_w$ being the density of water. This change in EWT corresponds to the vertically integrated mass changes inside aquifers, soil, surface reservoirs, and snow and ice packs. It can be observed by GRACE with an accuracy of a few millimeters for a spatial resolution of about 400 km. For this reason the EWT observed by GRACE has an important impact on:
• Hydrology
• Oceanography
• Glaciology

In continental hydrology, GRACE estimates of continental water storage improve the understanding of hydrological and atmospheric processes. The great advantage of GRACE estimates is that they are averages over a few hundred kilometers, while traditional ground-based hydrological data refer to scales of 10 km or less. The weakness of GRACE results is that they cannot distinguish between water on the surface and in the soil. Nor can they discriminate between water, snow, and ice. All in all, the GRACE estimates are in good agreement with estimates derived from hydrological models such as WGHM or GLDAS. GRACE can give reliable estimates of the total water budget over large regions, and with this information, it can contribute to the improvement of hydrological models.

In oceanography, the main contribution of GRACE is the separation between steric and non-steric sea-level rise, because only the latter is related to mass transports. Precise information about the steric sea-level rise makes it possible to estimate the change in heat storage in the oceans, which is important for climate change prediction.

In glaciology, information about the mass balance of the Arctic and Antarctic ice sheets could, in the pre-GRACE era, only be derived from measurements at a small number of benchmark glaciers. The parameters derived from these measurements are hardly representative of the entire ice sheet. Here the mass change derived from GRACE is an important additional source of information, because it refers to averages over a few hundred kilometers. But the total mass change observed by GRACE is not only due to the mass loss or mass gain of ice sheets but additionally stems from the effect of postglacial rebound. The reduction of GRACE observations for the influence of postglacial rebound, computed from existing models, gives valuable constraints for ice-dynamics models.
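The conversion from coefficient changes to equivalent water thickness in Eqs. (124) and (125) is a degree-dependent scaling followed by a summation. It can be sketched as follows. The constants are rounded textbook values, the Love numbers in the usage example are illustrative placeholders (not the values of Han and Wahr 1995), and the function names are hypothetical. The values of Y_lm at the evaluation point are assumed to be supplied in the same normalization as the coefficients.

```python
R_EARTH = 6.378e6     # m, approximate Earth radius (rounded, assumed)
RHO_EARTH = 5517.0    # kg/m^3, approximate average density of the Earth (assumed)
RHO_WATER = 1000.0    # kg/m^3, density of fresh water (rounded)

def ewt_scale(degree, love_number):
    """Factor (in m) that maps a coefficient change of the given degree to EWT.

    Combines the degree-dependent factor of Eq. (124) with the division by the
    density of water in Eq. (125).
    """
    return (R_EARTH * RHO_EARTH / (3.0 * RHO_WATER)
            * (2 * degree + 1) / (1.0 + love_number))

def ewt_at_point(terms, love_numbers):
    """Sum the EWT contributions of a list of (l, delta_K_lm, Y_lm_value) terms."""
    return sum(ewt_scale(l, love_numbers[l]) * dK * Y for l, dK, Y in terms)

# Usage with made-up numbers: two coefficient changes and placeholder Love numbers.
love_numbers = {2: -0.3, 3: -0.2}                 # illustrative values only
terms = [(2, 1.0e-11, 0.8), (3, -2.0e-11, 0.5)]   # (degree, delta K, Y at the point)
ewt = ewt_at_point(terms, love_numbers)
print(f"EWT change at the evaluation point: {ewt * 1000:.3f} mm")
```

The factor (2l + 1) amplifies high degrees, which is one reason why the noisy short-wavelength coefficients have to be filtered before a map of EWT is formed.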
8 Conclusions
Satellite-to-satellite tracking techniques have proved capable of monitoring both the static and the time-variable gravity field of the Earth with unprecedented accuracy and resolution. A number of different algorithms have been developed to convert the genuine SST observations into parameters of a basis function representation of the gravitational field of the Earth. Nevertheless, a number of problems remain unsolved, such as the de-striping of monthly GRACE solutions, spectral leakage, and aliasing, which still constitute a challenge for further research.
References
Antoni M (2012) Nichtlineare Optimierung regionaler Gravitationsfeldmodelle aus SST Daten. PhD thesis, Universität Stuttgart
Badura T, Sakulin C, Gruber T, Klostius R (2006) Derivation of the CHAMP-only gravity field model TUG-CHAMP04 applying the energy integral approach. Stud Geophys Geod 50:57–74
Ballani L (1988) Partielle Ableitungen und Variationsgleichungen zur Modellierung von Satellitenbahnen und Parameterbestimmung. Vermessungstechnik 36:192–194
Bettadpur S (2012) Level-2 gravity field product user handbook rev. 3.0, May 29. ftp://podaac-ftp.jpl.nasa.gov/GeodeticsGravity/grace/L1B/JPL/RL01/docs/L2-UserHandbook_v3.0.pdf
Beutler G, Jäggi A, Mervart L, Meyer U (2010a) The celestial mechanics approach: theoretical foundations. J Geodesy 84:605–624
Beutler G, Jäggi A, Mervart L, Meyer U (2010b) The celestial mechanics approach: application to data of the GRACE mission. J Geodesy 84:661–681
Bjerhammar A (1976) On the energy integral for satellites. Technical report, Report of the Royal Institute of Technology, Stockholm
Blaha G (1992) Refinement of the satellite-to-satellite line-of-sight model in residual gravity field. Manuscr Geod 17:321–333
Chao BF, Gross RS (1987) Changes in the Earth’s rotation and low-degree gravitational field induced by earthquakes. J R Astron Soc 91:569–596
Colombo O (1984) Global mapping of gravity with two satellites. Technical report vol 7 Nr 3, Netherlands Geodetic Commission
Eicker A (2012) Gravity field refinement by radial basis functions from in-situ satellite data. Technical report, DGK Reihe C, Bd. 676
Fengler MJ, Freeden W, Kohlhaas A, Michel V, Peters T (2007) Wavelet modeling of regional variations of the Earth’s gravitational potential observed by GRACE. J Geodesy 81:5–15
Gerlach CL, Földvary L, Švehla D, Gruber T, Wermuth M, Sneeuw N, Frommknecht B, Oberhofer H, Peters T, Rothacher M, Rummel R, Steigenberger P (2003) A CHAMP-only gravity field model from kinematic orbits using the energy integral. Geophys Res Lett, doi:10.1029/2003GL018025
Han D, Wahr J (1995) The viscoelastic relaxation of a realistic stratified Earth and further analysis of post-glacial rebound. Geophys J Int 120:287–311
Han S-C (2004) Efficient determination of global gravity field from satellite-to-satellite tracking mission. Celest Mech Dyn Astron 88:69–102
Heß D, Keller W (1999) Gradiometrie mit GRACE. Z Vermess 124:137–144
ICGEM. http://icgem.gfz-potsdam.de/ICGEM/ICGEM.html, 2014
Jekeli C (1999) The determination of gravitational potential differences from satellite-to-satellite tracking. Celest Mech Dyn Astron 75:85–101
Kaula WM (2000) Theory of satellite geodesy. Applications of satellites to geodesy. Dover, New York
Keller W, Sharifi MA (2005) Satellite gradiometry using a satellite pair. J Geodesy 78:544–557
Klees R, Liu X, Wittwer T, Gunter BC, Revtova EA, Tenzer R, Ditmar P, Winsemius HC, Savenije HHG (2008) A comparison of global and regional GRACE models for land hydrology. Surv Geophys 29:335–359
Kostelec PJ, Rockmore DN (2008) FFTs on the rotation group. J Fourier Anal Appl 14:145–179
Kusche J, Schmidt R, Petrovic S, Rietbroek R (2009) Decorrelated GRACE time-variable gravity field solutions by GFZ, and their validation using a hydrological model. J Geodesy 83:903–913
Levenberg KA (1944) A method for the solution of certain problems in least squares. Q Appl Math 2:164–168
Luthcke SB, Arendt AA, Rowlands DD, McCarthy JJ, Larsen CF (2008) Recent glacier mass changes in the Gulf of Alaska region from GRACE mascons solutions. J Glaciol 54:767–777
Marquardt D (1963) An algorithm for least-squares estimation of nonlinear parameters. SIAM J Appl Math 11:431–443
Mayer-Gürr T (2012) Gravitationsfeldbestimmung aus der Analyse kurzer Bahnbögen am Beispiel der Satellitenmissionen CHAMP und GRACE. Technical report, DGK Reihe C, Bd. 675
Mayer-Gürr T, Eicker A, Ilk K-H (2007) ITG-Grace02s: a GRACE gravity field derived from range measurements of short arcs. In: Gravity field of the Earth, proceedings of the 1st international symposium of the international gravity field service (IGFS), Istanbul
Mayer-Gürr T, Ilk H, Eicker A, Feuchtinger M (2005) ITG-CHAMP01: a CHAMP gravity field model from short kinematic arcs over a one-year observation period. J Geodesy 78:462–480
Mayer-Gürr T, Kurtenbach E, Eicker A (2010) ITG-grace2010 gravity field model. http://www.igg.uni-bonn.de/apmg/index.php?id=itg-grace2010
Muller PM, Sjogren WL (1968) Mascons: lunar mass concentrations. Science 161:680–684
O’Keefe JA (1957) An application of Jacobi’s integral to the motion of an Earth satellite. Astron J 62:265–266
Petit G, Luzum B (2010) IERS conventions (2010) (IERS technical note 36). Technical report, Verlag des Bundesamtes für Kartographie und Geodäsie, Frankfurt am Main
Reigber C (1969) Zur Bestimmung des Gravitationsfeldes der Erde aus Satellitenbeobachtungen. Technical report, DGK Reihe C, Bd. 137
Reigber C, Jochmann H, Wünsch J, Petrovic S, Schwintzer P, Barthelmes F, Neumayer K-H, König R, Förste C, Balmino G, Biancale R, Lemoine J-M, Loyer S, Perosanz F (2004) Earth gravity field and seasonal variability from CHAMP. In: Reigber C, Lühr H, Schwintzer P, Wickert J (eds) Earth observation with CHAMP – results from three years in orbit. Springer, Berlin, pp 25–30
Reigber C, Lühr H, Grunwald L, Förste C, König R (2006) CHAMP mission 5 years in orbit. In: Flury J, Rummel R, Reigber C, Rothacher M, Boedecker G, Schreiber U (eds) Observation of the Earth system from space. Springer, Berlin/Heidelberg/New York
Reubelt T (2009) Harmonische Gravitationsfeldanalyse aus GPS-vermessenen kinematischen Bahnen niedrig fliegender Satelliten vom Typ CHAMP, GRACE, GOCE mit einem hochauflösenden Beschleunigungsansatz. Technical report, DGK Reihe C, Bd. 632
Reubelt T, Austen G, Grafarend EW (2003) Harmonic analysis of the Earth’s gravitational field by means of semi-continuous ephemerides of a low Earth orbiting GPS-tracked satellite. Case study: CHAMP. J Geodesy 77:257–278
Rowlands DD, Luthcke SB, McCarthy JJ, Klosko SM, Chinn DS, Lemoine FG, Boy J-P, Sabaka TS (2010) Global mass-flux solutions from GRACE: a comparison of parameter estimation strategies – mass concentrations versus Stokes coefficients. J Geophys Res 115:B01403
Rummel R (2003) How to climb the gravity wall. Space Sci Rev 108:1–14
Schmidt M, Han S-C, Kusche J, Sanchez L, Shum CK (2006) Regional high-resolution spatiotemporal gravity modeling from GRACE data using spherical wavelets. Geophys Res Lett 33:L08403
Schneider M (1968) A general method of orbit determination. Technical report, Library Translations, Aircraft Establishment, Ministry of Technology, Farnborough
Sneeuw N (2000) A semi-analytical approach to gravity field analysis from satellite observations. Technical report, DGK Reihe C, Bd. 527
Visser PNAM, Sneeuw N, Gerlach C (2003) Energy integral method for gravity field determination from satellite orbit coordinates. J Geodesy 77:207–216
Weisstein E. Wolfram MathWorld. http://mathworld.wolfram.com, 2014
Wolff M (1969) Direct measurements of the Earth’s gravitational field using a satellite pair. J Geophys Res 74:5295–5300
GOCE: Gravitational Gradiometry in a Satellite Reiner Rummel
Contents
1 Introduction: GOCE and Earth Sciences
2 GOCE Gravitational Sensor System
3 Gravitational Gradiometry
4 GOCE Status
5 Conclusions: GOCE Science Applications
References
Abstract
In spring 2009 the satellite Gravity and steady-state Ocean Circulation Explorer (GOCE), equipped with a gravitational gradiometer, was launched by the European Space Agency (ESA). Its purpose is the detailed determination of the spatial variations of the Earth’s gravitational field, with applications in oceanography, geophysics, geodesy, glaciology, and climatology. Gravitational gradients are derived from the differences between the measurements of an ensemble of three orthogonal pairs of accelerometers, located around the center of mass of the spacecraft. Gravitational gradiometry is complemented by gravity analysis from orbit perturbations. The orbits are derived from uninterrupted, three-dimensional GPS tracking of GOCE. The gravitational tensor consists of the nine second derivatives of the Earth’s gravitational potential. Due to its symmetry only six of them are independent. These six components can also be interpreted in terms of the local curvature of the field or in terms of components of the tidal
R. Rummel () Institut für Astronomische und Physikalische Geodäsie, TU Munich, Munich, Germany e-mail: [email protected] © Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_4
field generated by the Earth inside the spacecraft. Four of the six components are measured with high precision ($10^{-11}\,\mathrm{s}^{-2}$ per square root of Hz); the others are less precise. Several strategies exist for the determination of the gravity field at the Earth’s surface from the measured tensor components at altitude. The mission ended in November 2013. By August 2012, a total of 2.3 years of data had been collected. They entered into ESA’s fourth release of GOCE gravity models. After August 2012 the orbit altitude was lowered in several steps by altogether 31 km in order to test the enhanced gravitational sensitivity at lower orbit heights.
The fields of application range from solid Earth physics, via geodesy and oceanography, to atmospheric physics. For example, several studies are concerned with the state of isostatic mass compensation in regions such as South America, Africa, Himalaya, and Antarctica. GOCE will help to unify height systems worldwide and enable the direct conversion of GPS-based ellipsoidal heights to accurate and globally consistent heights above the geoid. For the first time, it became possible to derive mean dynamic ocean topography and geostrophic ocean velocities with high spatial resolution and accuracy directly from space, combining the altimetric mean sea surface and the GOCE geoid. Assimilation into numerical ocean circulation models will help to improve estimates of ocean mass and heat transport. Common-mode accelerations as measured by GOCE lead to improved atmospheric density and wind estimates at GOCE altitudes.
1 Introduction: GOCE and Earth Sciences
On March 17, 2009, the European Space Agency (ESA) launched the satellite Gravity and steady-state Ocean Circulation Explorer (GOCE). It is the first satellite of ESA’s Living Planet Programme (see ESA 1999a, 2006). It is also the first one that is equipped with a gravitational gradiometer. The purpose of the mission is to measure the spatial variations of the Earth’s gravitational field globally with maximum resolution and accuracy. Its scientific purpose is essentially twofold. First, the gravitational field reflects the density distribution of the Earth’s interior. There are no direct ways to probe the deep Earth interior, only indirect ones, in particular seismic tomography, gravimetry, and magnetometry. The gravimetry part is now being taken care of by GOCE. Also, in the field of space magnetometry, an ESA mission was launched in fall 2013; it is denoted “Swarm” and consists of three satellites. Seismic tomography is based on a worldwide integrated network of seismic stations. From a joint analysis of all seismic data, a tomographic image of the spatial variations of the propagation velocity of seismic waves in the Earth’s interior is derived. The three methods together establish the experimental basis for the study of solid Earth physics, or more specifically, of phenomena such as core-mantle topography, mantle convection, mantle plumes, ocean ridges, subduction zones, mountain building, and mass compensation. Inversion of gravity alone is non-unique but joint inversion
together with seismic tomography, magnetic field measurements, surface data of plate velocities and topography, and models from mineralogy will lead to a more and more comprehensive picture of the dynamics and structure of the Earth’s interior (see, e.g., Bunge et al. 1998; Hager and Richards 1989; Kaban et al. 2004; Lithgow-Bertelloni and Richards 1998). Second, the gravitational field, and therefore the mass distribution of the Earth, determines the geometry of level surfaces, plumb lines, and lines of force. This geometry constitutes the natural reference in our physical and technical world. In particular, in cases where small potential differences matter, such as in ocean dynamics and large civil constructions, precise knowledge of this reference is an important source of information. The most prominent example is ocean circulation. Dynamic ocean topography, the small deviation (up to about 2 m) of the actual ocean surface from an equipotential surface, can be directly translated into ocean surface circulation. The equipotential surface at mean ocean level is referred to as the geoid, and it represents the hypothetical surface of the world oceans at complete rest. GOCE, in conjunction with satellite altimetry missions such as Jason, will allow for the first time direct and detailed measurement of ocean circulation (see discussions in Albertella and Rummel 2009; Ganachaud et al. 1997; LeGrand and Minster 1999; Losch et al. 2002; Maximenko et al. 2009; Wunsch and Gaposchkin 1980). GOCE is an important satellite mission for oceanography, solid Earth physics, geodesy, and climate research (compare ESA 1999b; Johannessen et al. 2003; Rummel et al. 2002).
2 GOCE Gravitational Sensor System
In the following, the main characteristics of the GOCE mission, which is unique in several ways, will be summarized (see also Gauss’ and Weber’s “Atlas of Geomagnetism” (1840) Was not the first: the History of the Geomagnetic Atlases). The mission consists of two complementary gravity sensing systems. The large-scale spatial variations of the Earth’s gravitational field are derived from its orbit, while the medium to short scales are measured by a so-called gravitational gradiometer. Even though satellite gravitational gradiometry was proposed as early as the late 1950s by Carroll and Savet (1959) (see also Wells 1984), the GOCE gradiometer is the first instrument of its kind to be put into orbit. The principles of satellite gradiometry will be described in Sect. 3. The purpose of gravitational gradiometry is the measurement of the gradients of gravitational acceleration or, equivalently, the second derivatives of the gravitational potential. In total, there exist nine second derivatives in the orthogonal coordinate system of the instrument. The GOCE gradiometer is a three-axis instrument and its measurements are based on the principle of differential acceleration measurement. It consists of three pairs of accelerometers, mounted orthogonally to each other, each accelerometer having three axes (see Fig. 1). The baseline of each one-axis gradiometer is 50 cm. The precision of each accelerometer is about $2 \times 10^{-12}$ m/s²
Fig. 1 GOCE gravitational gradiometer (courtesy ESA)
per square root of Hz along two sensitive axes; the third axis has much lower sensitivity. This results in a precision of the gravitational gradients of $10^{-11}\,\mathrm{s}^{-2}$ or 10 mE per square root of Hz (1 E = $10^{-9}\,\mathrm{s}^{-2}$ = 1 Eötvös unit). From the measured gravitational acceleration differences, the three main diagonal terms and one off-diagonal term of the gravitational tensor can be determined with high precision. These are the three diagonal components $V_{xx}$, $V_{yy}$, $V_{zz}$ as well as the off-diagonal component $V_{xz}$, while the components $V_{xy}$ and $V_{yz}$ are less accurate. The coordinate axes of the instrument point in flight direction (x), cross
direction (y), and radially toward the Earth (z). The extremely high gradiometric performance of the instrument is confined to the so-called measurement band (MB), while outside the measurement band noise is increasing. Strictly speaking, the derivation of the gradients from accelerometer differences assumes all six accelerometers (three pairs) to be perfect twins and all accelerometer test masses to be perfectly aligned. In the real world, small deviations from such an idealization exist. Thus, the calibration of the gradiometer is of high importance. Calibration is essentially the process of determining a set of scale, misalignment, and angular corrections. They are the parameters of an affine transformation between an ideal and the actual set of six accelerometers. Calibration in orbit is done by random shaking of the satellite by means of a set of cold gas thrusters and comparison of the actual output with the theoretically correct one. Before calibration the nonlinearities of each accelerometer are removed electronically; in other words, the proof mass of each accelerometer inside the electrodes of the capacitive electronic feedback system is brought into its linear range.

The gravitational signal is superimposed by the effects of angular velocity and angular acceleration of the satellite in space. Knowledge of the latter is required for the removal of the angular effects from the gradiometer data and for angular control. The separation of angular acceleration from the gravitational signal is possible from a particular combination of the nine measured acceleration differences. The angular rates (in the MB) as derived from the gradiometer data, in combination with those deduced from the star sensor readings, are used for attitude control of the spacecraft. The satellite has to be well controlled and guided smoothly around the Earth. It is Earth pointing, which implies that it performs one full revolution in inertial space per full orbit cycle. Angular control is attained via magnetic torquers, i.e., using the Earth’s magnetic field lines for orientation. This approach leaves one directional degree of freedom uncontrolled at any moment. In order to prevent non-gravitational forces, in particular atmospheric drag, from “sneaking” into the measured differential accelerations as a secondary effect, the satellite is kept “drag-free” in along-track direction by means of a pair of ion thrusters. The necessary control signal is derived from the available “common-mode” accelerations (sum instead of differences of the measured accelerations) along the three orthogonal axes of the accelerometer pairs of the gradiometer. Some residual angular contribution may also add to the common-mode acceleration, due to the imperfect symmetry of the gradiometer relative to the spacecraft’s center of mass. This effect has to be modeled.

The second gravity sensor device is a newly developed European GPS receiver. From its measurements, the orbit trajectory is computed to within a few centimeters, either purely geometrically, the so-called kinematic orbit, or by the method of reduced dynamic orbit determination (compare Bock et al. 2011; Jäggi 2007; Švehla and Rothacher 2004). As the spacecraft is kept in an almost drag-free mode (at least in along-track direction), the orbit motion can be regarded as purely gravitational. It complements the gradiometric gravity field determination and covers the long-wavelength part of the gravity signal.
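The principle of differential and common-mode acceleration measurement can be illustrated for a single accelerometer pair. The sketch below is a toy model under strong assumptions: the two accelerometers sit at plus and minus half the baseline on one axis, each reading is modeled as a common non-gravitational acceleration plus the gradient times the offset (sign convention chosen for simplicity), and all rotational terms are ignored. Function names and the drag value are hypothetical. Only the 0.5 m baseline and the accelerometer noise level are taken from the text.

```python
import numpy as np

BASELINE = 0.5  # m, baseline of a one-axis gradiometer as stated in the text

def common_and_differential(a1, a2):
    """Split the readings of an accelerometer pair into common and differential mode."""
    a1, a2 = np.asarray(a1, dtype=float), np.asarray(a2, dtype=float)
    return 0.5 * (a1 + a2), 0.5 * (a1 - a2)

def gradient_from_pair(a1, a2, baseline=BASELINE):
    """Finite-difference gradient along the arm: (a1 - a2) / baseline."""
    return (np.asarray(a1, dtype=float) - np.asarray(a2, dtype=float)) / baseline

# Toy readings: the same drag on both sensors, opposite gradient contributions.
V_xx = -1.37e-6   # 1/s^2, order of magnitude of a horizontal gradient (illustrative)
drag = 1.0e-7     # m/s^2, hypothetical non-gravitational acceleration
a1 = drag + V_xx * (+BASELINE / 2)
a2 = drag + V_xx * (-BASELINE / 2)

common, differential = common_and_differential(a1, a2)
recovered = gradient_from_pair(a1, a2)
print("common mode (drag):", common)
print("recovered gradient:", recovered)

# Noise propagation for two uncorrelated accelerometers with 2e-12 m/s^2 per sqrt(Hz).
gradient_noise = np.sqrt(2.0) * 2e-12 / BASELINE
print("gradient noise per sqrt(Hz):", gradient_noise)
```

In this idealized model the drag cancels in the difference and appears only in the common mode, which is why the common-mode signal can drive the drag-free control. The propagated noise of roughly 6 × 10^-12 s^-2 per square root of Hz is of the same order as the gradient precision quoted in the text.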
Fig. 2 GOCE satellite and main instruments (courtesy ESA). Labeled components: CESS (coarse Earth and Sun sensor), xenon tank, ion propulsion module, MT (magneto torquer), gradiometer, STR (star tracker), SSTI (satellite-to-satellite tracking instrument), CDM (command and data management unit), LRR (laser retro reflector), GCD tank
The orbit altitude is extremely low, only about 255 km at perigee. This is essential for a high gravitational sensitivity. No scientific satellite has been flown at such low altitudes so far. Its altitude is maintained through the drag-free control and additional orbit maneuvers, which are carried out at regular intervals. As said above, this very low altitude results in high demands on drag-free and attitude control. Finally, any time-varying gravity signal of the spacecraft itself, the so-called self-gravitation, must be excluded. This results in extremely tight requirements on metrical stiffness and thermal control. In summary, GOCE is a technologically very complex and innovative mission. The gravitational field sensor system consists of a gravitational gradiometer and GPS receiver as core instruments. Orientation in inertial space is derived from star sensors. Common-mode and differential-mode accelerations from the gradiometer and orbit positions from GPS are used together with ion thrusters for drag-free control and together with magneto-torquers for angular control. The satellite and its instruments are shown in Fig. 2. The system elements are summarized in Table 1.
3 Gravitational Gradiometry
Gravitational gradiometry is the measurement of the second derivatives of the gravitational potential V. Its principles are described in textbooks such as Misner et al. (1970), Falk and Ruppel (1974), and Ohanian and Ruffini (1994) or in articles like Rummel (1986); compare also Colombo (1989) and Rummel (1997). Despite the high precision of the GOCE gradiometer instrument, the theory can still be
Table 1 Sensor elements and type of measurement delivered by them (approximate orientation of the instrument triad: x = along-track, y = out-of-orbit-plane, z = radially downward)

| Sensor | Measurements |
| --- | --- |
| Three-axis gravity gradiometer | Gravity gradients $V_{xx}$, $V_{yy}$, $V_{zz}$, $V_{xz}$ in instrument system and in MBW (measurement bandwidth) |
| | Angular accelerations (highly accurate around y-axis, less accurate around x, z axes) |
| | Common-mode accelerations |
| Star sensors (STR) | High-rate and high-precision inertial orientation |
| GPS receiver (SSTI) | Orbit trajectory with centimeter precision |
| Drag control with two ion thrusters | Based on common-mode accelerations from gradiometer and GPS orbit |
| Angular control with magnetic torquers | Based on angular rates from star sensors and gradiometer |
| Orbit altitude maintenance | Based on GPS orbit |
| Internal calibration (and quadratic factors removal) of gradiometer | Calibration signal from random shaking by cold gas thrusters (and electronic proof mass shaking) |
formulated by classical Newton mechanics. Let us denote the gravitational tensor, expressed in the instrument frame as 1 0 @2 V Vxx Vxy Vxz @x 2 B @2 V A @ D Vij D Vyx Vyy Vyz D @ @y@x @2 V Vzx Vzy Vzz 0
@2 V @x@y @2 V @y 2 @2 V @z@x @z@y
@2 V @x@z @2 V @y@z @2 V @z2
1 C A;
(1)
where the gravitational potential represents the integration over all Earth masses (cf. Classical Physical Geodesy and Stokes Problem, Layer Potentials and Regularizations, and Multiscale Applications)

$$
V_P = G \iiint \frac{\rho_Q}{\ell_{PQ}} \, d\Sigma_Q ,
\tag{2}
$$
where $G$ is the gravitational constant, $\rho_Q$ the density, $\ell_{PQ}$ the distance between the mass element in $Q$ and the computation point $P$, and $d\Sigma_Q$ the infinitesimal volume element. We may assume the gravitational effect of the atmosphere to be negligible. Then the space outside of the Earth is empty and it holds $\nabla \cdot \nabla V = 0$ (source free) in addition to $\nabla \times \nabla V = 0$ (vorticity free). This corresponds to saying in Eq. 1 that $V_{ij} = V_{ji}$ and $\sum_i V_{ii} = 0$. It leaves only five independent components at each point and offers important cross-checks between the measured components. If the Earth were a homogeneous sphere, the off-diagonal terms would be zero and in a local triad {north, east, radial} one would find
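$$
V_{ij} =
\begin{pmatrix}
V_{xx} & V_{xy} & V_{xz} \\
V_{yx} & V_{yy} & V_{yz} \\
V_{zx} & V_{zy} & V_{zz}
\end{pmatrix}
= \frac{GM}{r^3}
\begin{pmatrix}
-1 & 0 & 0 \\
0 & -1 & 0 \\
0 & 0 & 2
\end{pmatrix},
\tag{3}
$$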
where $M$ is the mass of the spherical Earth. This simplification gives an idea about the involved orders of magnitude. At GOCE satellite altitude, it is $V_{zz} = 2{,}740$ E. This also implies that at a distance of 0.5 m from the spacecraft’s center of mass, the maximum gravitational acceleration is about $1.5 \times 10^{-6}$ m/s$^2$. In an alternative interpretation, one can show that the $V_{ij}$ express the local geometric curvature structure of the gravitational field, i.e.,

$$
V_{ij} = g
\begin{pmatrix}
k_1 & t_1 & f_1 \\
t_2 & k_2 & f_2 \\
f_1 & f_2 & H
\end{pmatrix},
\tag{4}
$$
where $g$ is gravity, $k_1$ and $k_2$ express the local curvature of the level surfaces in north and east directions, $t_1$ and $t_2$ are the torsion, $f_1$ and $f_2$ the north and east components of the curvature of the plumb line, and $H$ the mean curvature. For a derivation, refer to Marussi (1985). This interpretation of gravitational gradients in terms of gravitational geometry provides a natural bridge to Einstein’s general relativity, where gravitation is interpreted in terms of space-time curvature, for it holds

$$
R^{i}_{0j0} = \frac{1}{c^2} V_{ij}
\tag{5}
$$
for the nine components of the tidal force tensor, which are components of $R^{i}_{kjl}$, the Riemann curvature tensor, with its indices running over 0, 1, 2, 3 (Ohanian and Ruffini 1994, p. 41; Moritz and Hofmann-Wellenhof 1993, Chap. 5).

A third interpretation of gravitational gradiometry is in terms of tides. Sun and moon produce a tidal field on Earth. It is zero at the Earth’s center of mass and maximum at its surface. Analogously, the Earth is producing a tidal field in every Earth-orbiting satellite. At the center of mass of a satellite, the tidal acceleration is zero; i.e., the acceleration relative to the center of mass is zero; this leads to the terminology “zero-g.” The tidal acceleration increases with distance from the satellite’s center of mass like

$$
a_i = V_{ij} \, dx^j
\tag{6}
$$
with the measurable components of tidal acceleration $a_i$ and of relative position $dx^j$, both taken in the instrument reference frame. Unlike sun and moon relative to the Earth, GOCE is always Earth pointing with its z-axis. This implies that the gradiometer permanently measures “high-tide” in z-direction and “low-tide”
in x- and y-directions. The gradient components are deduced by taking the difference between the accelerations at two points located along one gradiometer axis, symmetrically with respect to the satellite’s center of mass:

$$
V_{ij} = \frac{a_i(1) - a_i(2)}{2 \, dx^j} .
\tag{7}
$$
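The orders of magnitude quoted for Eq. 3 and the differencing principle of Eqs. 6 and 7 can be checked numerically. The following is a minimal sketch, not the mission processing: the values of `GM` and `R_EARTH` and all function names are assumptions introduced for illustration, and the 0.5 m offset is simply the distance used in the text.

```python
import numpy as np

GM = 3.986004418e14   # m^3/s^2, Earth's gravitational parameter (assumed value)
R_EARTH = 6.371e6     # m, mean Earth radius (assumed value)
ALTITUDE = 255e3      # m, perigee altitude quoted in the text
EOTVOS = 1e-9         # 1 E = 1e-9 s^-2


def spherical_gradient_tensor(r):
    """Eq. 3: gradient tensor of a homogeneous sphere in a {north, east, radial} triad."""
    return GM / r**3 * np.diag([-1.0, -1.0, 2.0])


def tidal_acceleration(V, dx):
    """Eq. 6: a_i = V_ij dx^j."""
    return V @ dx


def gradient_from_pair(a1, a2, half_arm):
    """Eq. 7: difference of two accelerations taken at +/- half_arm along one axis."""
    return (a1 - a2) / (2.0 * half_arm)


V = spherical_gradient_tensor(R_EARTH + ALTITUDE)
half_arm = 0.5                            # m, the distance used in the text
dx = np.array([0.0, 0.0, half_arm])       # offset along the radial (z) axis
a_plus = tidal_acceleration(V, dx)        # acceleration at +dx
a_minus = tidal_acceleration(V, -dx)      # acceleration at -dx
V_column_z = gradient_from_pair(a_plus, a_minus, half_arm)

print(f"V_zz = {V[2, 2] / EOTVOS:.0f} E")
print(f"tidal acceleration at 0.5 m = {a_plus[2]:.2e} m/s^2")
print("trace (Laplace condition) =", np.trace(V))
```

Under these assumptions the radial component comes out close to the 2,740 E stated in the text, the trace vanishes as required by the Laplace condition, and differencing the two accelerations returns the z-column of the tensor.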
Remark 1. In addition to the tidal acceleration of the Earth, GOCE is measuring the direct and indirect tidal signal of sun and moon. This signal is much smaller, well-known and taken into account.

If the gravitational attraction of the atmosphere is taken care of by an atmospheric model, the Earth’s outer field can be regarded as source free and Laplace’s equation holds. It is common practice to solve Laplace’s equation in terms of spherical harmonic functions. The use of alternative base functions is discussed, for example, in Schreiner (1994) or Freeden et al. (1998). For a spherical surface, the solution of a Dirichlet boundary value problem yields the gravitational potential of the Earth in terms of normalized spherical harmonic functions $Y_{nm}(\Omega_P)$ of degree $n$ and order $m$ as

$$
V(P) = \sum_n \left( \frac{R}{r_P} \right)^{n+1} \sum_m t_{nm} Y_{nm}(\Omega_P) = Y t
\tag{8}
$$
with f˝P , rP g D fP , P , rP g the spherical coordinates of P and tnm the spherical harmonic coefficients. The gravitational tensor at P is then Vij .P / D
(
X X n
m
tnm @ij
R rP
)
nC1 Ynm .P /
D Y fij gt :
(9)
The spherical harmonic coefficients $t_{nm}$ are derived from the measured gradiometric components $V_{ij}$ by least squares adjustment.

Remark 2. In the case of GOCE gradiometry the situation is as follows. Above degree and order $n = m = 220$, noise starts to dominate signal. Thus, the series has to be truncated in some intelligent manner, minimizing aliasing and leakage effects. GOCE covers the globe in 61 days with a dense pattern of ground tracks. Original planning of its mission lifetime assumed the completion of at least three such 61-day cycles. The orbit inclination is 96.7°. This leaves the two polar areas (opening angle 6.7°) free of observations, the so-called polar gaps. Various strategies have been
suggested for minimization of the effect of the polar gaps on the determination of the global field (compare, e.g., Baur et al. 2009).

Instead of analyzing the individual tensor components in a least-squares adjustment, one could consider the study of particular combinations. A very elegant approach is the use of the invariants of the gravitational tensor. They are independent of the orientation of the gradiometer triad; see Rummel (1986) and in particular Baur and Grafarend (2006) and Baur (2007). The first invariant cannot be used for gravity field analysis because it is the Laplace trace condition; the two others can be used. They are nonlinear and lead to an iterative adjustment.

Let us assume for a moment that the gradiometer components $V_{ij}$ were given in an Earth-fixed spherical {north, east, radial}-triad. In that case, the tensor can be expanded in tensor spherical harmonics and decomposed into the irreducible radial, mixed normal-tangential, and pure tangential parts with the corresponding eigenvalues (Martinec 2003; Nutz 2002; Rummel 1997; Rummel and van Gelderen 1992; Schreiner 1994)

$$
\begin{array}{ll}
(n+1)(n+2) & \text{for } V_{zz} \text{ and } V_{xx} + V_{yy}, \\[4pt]
(n+2)\sqrt{n(n+1)} & \text{for } \{V_{xz}, V_{yz}\}, \\[4pt]
\sqrt{\dfrac{(n+2)!}{(n-2)!}} & \text{for } \{V_{xx} - V_{yy}, \, -2V_{xy}\}.
\end{array}
$$

All eigenvalues are of the order of $n^2$. In Schreiner (1994) and Freeden et al. (1998), as well as in Geodetic Boundary Value Problem of this handbook, it is also shown that $\{V_{xz}, V_{yz}\}$ and $\{V_{xx} - V_{yy}, -2V_{xy}\}$ are insensitive to degree zero, the latter combination also to degree one.

In the case of GOCE, the above properties cannot be employed in a straightforward manner, because (1) not all components are of comparable precision and, more importantly, (2) the gradiometric components are measured in the instrument frame, whose orientation follows the orbit and the attitude control commands.
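The statement that all eigenvalues are of the order of $n^2$ can be made explicit with a short worked step. This is a sketch that considers magnitudes only; it starts from the radial factor $(R/r)^{n+1}$ of Eq. 8 and does not reproduce the tensor spherical harmonic derivation of the cited references.

$$
\begin{aligned}
\frac{\partial^2}{\partial r^2} \left( \frac{R}{r} \right)^{n+1}
&= \frac{(n+1)(n+2)}{r^2} \left( \frac{R}{r} \right)^{n+1}, \\[4pt]
\frac{(n+2)!}{(n-2)!} &= (n-1)\, n\, (n+1)\, (n+2), \\[4pt]
\lim_{n \to \infty} \frac{(n+1)(n+2)}{n^2}
&= \lim_{n \to \infty} \frac{(n+2)\sqrt{n(n+1)}}{n^2}
= \lim_{n \to \infty} \frac{\sqrt{(n-1)\,n\,(n+1)\,(n+2)}}{n^2} = 1 .
\end{aligned}
$$

The first line shows where the radial eigenvalue comes from; the last line shows that the three expressions differ only in lower-order terms, so each tensor combination amplifies a degree-$n$ signal by roughly $n^2$.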
There exist various competing strategies for the actual determination of the field coefficients $t_{nm}$, depending on whether the gradients are regarded as in situ measurements on a geographical grid, along the orbit tracks, as a time series along the orbit or as Fourier coefficients derived from the latter (compare, e.g., Brockmann et al. 2009; Migliaccio et al. 2004; Pail and Plank 2004; Stubenvoll et al. 2009). These methods take into account the noise characteristics of the components and their orientation in space. The principles of the methods of gravity modeling are described by Roland Pail in this handbook (Pail 2014). There, the stochastic as well as the functional model are discussed. That chapter also contains a summary of the methods of determination of angular rates from a joint analysis of star tracking and gradiometry applying Wiener filtering. Because of the polar gaps, gravity modeling that combines gravitational gradiometry and GPS-based kinematic orbits leads to numerical instabilities and requires some form of regularization (Kusche and Klees 2002). A further step will be regional refinement by combination of GOCE with terrestrial data sets (e.g., Eicker et al. 2009; Stubenvoll et al. 2009 or Förste et al. 2011).
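The role of regularization in an unstable least-squares adjustment can be illustrated with a toy problem. The sketch below is not the GOCE processing chain: the design matrix is random, two nearly identical columns merely imitate parameters that the data cannot separate, and the diagonal matrix `K` stands in for a Kaula-type zero constraint under the assumption that the expected signal decays like $n^{-2}$. All names and numbers are hypothetical.

```python
import numpy as np


def regularized_lsq(A, y, K, alpha):
    """Solve the regularized normal equations (A^T A + alpha K) x = A^T y."""
    return np.linalg.solve(A.T @ A + alpha * K, A.T @ y)


rng = np.random.default_rng(0)
n_obs, n_par = 200, 20
degrees = np.arange(1, n_par + 1)

A = rng.standard_normal((n_obs, n_par))
# Two almost identical columns imitate parameters the data cannot separate.
A[:, -1] = A[:, -2] + 1e-6 * rng.standard_normal(n_obs)

x_true = 1.0 / degrees**2                           # toy signal decaying with "degree"
y = A @ x_true + 1e-3 * rng.standard_normal(n_obs)  # noisy observations

# Zero constraint weighted by 1/sigma_n^2, with assumed sigma_n proportional to n^-2.
K = np.diag(degrees.astype(float) ** 4)

x_plain = regularized_lsq(A, y, K, alpha=0.0)       # unregularized solution
x_reg = regularized_lsq(A, y, K, alpha=1e-6)        # regularized solution

err_plain = np.linalg.norm(x_plain - x_true)
err_reg = np.linalg.norm(x_reg - x_true)
print(f"error without regularization: {err_plain:.3e}")
print(f"error with regularization:    {err_reg:.3e}")
```

The constraint adds little to the well-determined parameters but stabilizes the nearly unobservable combination, which is the qualitative behavior one expects from a zero constraint applied where the data coverage is poor.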
Table 2 ESA release 4 GOCE gravity models DIR4 and TIM4 and their characteristics

| | DIR4 | TIM4 |
| --- | --- | --- |
| Maximum D/O | 260 | 250 |
| GOCE data volume | 01.11.09–01.08.12; 2.3 years (net) | 01.11.09–19.06.12; 2.2 years (net) |
| Gravity gradients | Vxx, Vyy, Vzz, Vxz; 288 Mio. obs. | Vxx, Vyy, Vzz, Vxz; 279 Mio. obs. |
| Gradient filter | Band-pass filter | ARMA filter per segment |
| GOCE SST (GPS) | – | Short arc approach (d/o 130) |
| GRACE SST (K-band) | 2003–2012; GRGS RL02 (d/o 55), GFZ RL05 (d/o 56–180) | – |
| LAGEOS1/2 (SLR) | 1985–2010, 25 years | – |
| Regularization | Iterative spherical cap (d/o 260) based on GRACE/LAGEOS; Kaula zero constraint (d/o >200) | Kaula zero constraint (near zonals and for d/o >180) |

4 GOCE Status
GOCE was launched on March 17, 2009. After a commissioning and calibration phase, the first operational measurement cycle started on November 1, 2009. The mission was originally planned for only 20 months; because of its smooth performance, it was ultimately extended until November 2013. All sensor systems worked well, with the exception of a slightly higher than nominal noise level in the radial components $V_{zz}$ and $V_{xz}$, for reasons still not understood. Three major interruptions occurred: from February 12 to March 2 and from July 2 to September 25, 2010, due to problems with the processor units, and from January 1 to January 21, 2011, because of a software problem related to the GPS receiver. The mission ended on November 11, 2013 with the reentry of the spacecraft over the South Atlantic Ocean near the Falkland Islands. By August 2013, altogether 2.3 years of data were available and entered the release 4 models. The ESA release 4 GOCE gravity models are summarized in Table 2. From August 1, 2012 on, the orbit altitude was lowered in four steps, with at least one full measurement cycle in between, by 9, 6, 5, and 11 km, i.e., altogether by 31 km. The lowest altitude, 224 km, was attained on May 31, 2013. This was done in order to test whether the increased gravitational sensitivity can be exploited despite the increase of atmospheric drag at lower altitudes. Preliminary analysis shows an increase in spatial resolution. Release five is expected to be published in summer 2014. It will include the complete GOCE data set, including the data from the lower orbit altitudes.
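Why a modest lowering of the orbit pays off can be estimated from the radial factor of Eq. 8. Differentiating $(R/r)^{n+1}$ twice with respect to $r$ shows that a degree-$n$ contribution to the radial gradient scales like $(R/r)^{n+3}$. The sketch below evaluates the resulting gain for the 31 km lowering; the mean Earth radius, the starting altitude of 255 km (the perigee value quoted earlier), and the function name are assumptions for illustration, and noise and drag effects are ignored.

```python
R_EARTH = 6.371e6   # m, mean Earth radius (assumed value)


def gradient_attenuation(n, h):
    """Upward-continuation factor (R/(R+h))^(n+3) of the degree-n radial gradient."""
    return (R_EARTH / (R_EARTH + h)) ** (n + 3)


steps_km = [9, 6, 5, 11]                 # orbit-lowering steps quoted in the text
h_start = 255e3                          # m, assumed initial altitude
h_low = h_start - sum(steps_km) * 1e3    # m, altitude after all four steps

# Ratio of the degree-n signal at the low altitude to that at the initial altitude.
gain = {n: gradient_attenuation(n, h_low) / gradient_attenuation(n, h_start)
        for n in (100, 200, 250)}

print(f"lowest altitude: {h_low / 1e3:.0f} km")
for n, g in gain.items():
    print(f"degree {n}: signal larger by a factor of {g:.2f}")
```

The gain grows with degree, which is consistent with the reported increase in spatial resolution: the short-wavelength part of the signal benefits most from flying lower.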
5 Conclusions: GOCE Science Applications
Data exploitation of GOCE for science and application is far from being completed. We observe that the research fields are essentially those identified already in the
Fig. 3 Vertical gravity gradient field of Antarctica as measured by GOCE (units: milliEötvös = $10^{-12}$ s$^{-2}$) and some prominent tectonic features; the GOCE polar gap is marked in gray
GOCE mission proposal (ESA 1999b). There is one exception. The expectation was that GOCE would be unable to sense temporal variations of gravity and geoid. This is true in general; however, some large mass movements such as those associated with the big Japan earthquake seem to be detectable from the gradiometer data (Bouman et al. 2013).

Solid earth physics. The gravity field as sensed by GOCE reflects the density distribution of the earth’s masses, primarily from topography, crust, and lithosphere and in an attenuated form from the upper and lower mantle down to the core. In ocean areas, gravity is well-known already from satellite altimetry. The situation is different in continental areas. Comparisons of GOCE gravity models with EGM2008 (e.g., Yi and Rummel 2014) reveal good agreement in well-surveyed parts of the earth such as North America, Europe, Australia, New Zealand, and Japan but rather poor agreement in large parts of South America, Africa, Himalaya, and South-East Asia. EGM2008 is a combination of GRACE satellite gravimetry with a global set of terrestrial gravity anomalies. Some of the regions with poor terrestrial gravity are of high geodynamic relevance. Currently several studies are underway, looking into the state of isostatic equilibrium and into the elastic thickness of the lithosphere (e.g., Sampietro et al. 2012 or van der Meijde et al. 2013). A special case is Antarctica, where terrestrial gravity data are sparse. GRACE delivered gravity and geoid information in Antarctica, but confined to rather large spatial scales. GOCE now shows the gravity field and with it tectonic processes hidden under an ice layer with a thickness of several kilometers (e.g., Ferraccioli et al. 2011). Figure 3 shows the vertical gravity gradient field of Antarctica as measured by GOCE and some prominent tectonic features. The polar gap is marked in gray.
Height systems. Official heights are provided to the user either as gravity potential differences between terrain points or as metric heights derived from the potential differences such as orthometric or normal heights. The official height systems refer to a zero value at some adopted reference marker at a tide gauge, representing mean sea level there. However, mean sea level varies from location to location due to the variations in coastal oceanic conditions. The variations are not big, usually less than 2 m, but they are responsible for unknown height offsets between the various worldwide official height systems. GOCE provides the best possible geoid surface (ideally refined locally by shorter scales from some regional geoid computation). It represents the theoretical sea surface at rest and can be introduced as an ideal worldwide height reference. It also permits determination of the height offsets between the various height systems, not detectable in the past. This process of height unification is of great value for mapping, large civil constructions and sea level research. As a demonstration of the value of the GOCE geoid for height determination, Woodworth et al. (2012) revealed the bias of the North American height system which is based on classical spirit leveling. In the near future, GPS positions can be translated to heights above the GOCE geoid yielding globally consistent and physically meaningful heights worldwide.

Oceanography. In his classical textbook “Atmosphere-Ocean Dynamics”, A.E. Gill (1982) writes on page 46: “If the sea were at rest, its surface would coincide with the geopotential surface.” This geopotential surface is the geoid and with GOCE its shape is determined with an accuracy of 2–3 cm.
In reality the sea is not at rest; it is in motion, driven by winds, atmospheric pressure differences, and tides, and as a consequence the ocean surface deviates from a geopotential surface by up to 1 or 2 m at the major ocean currents. This deviation is called dynamic ocean topography. The actual sea surface is measured from space by satellite altimetry. More than 20 years of altimetry yield models of the mean sea surface (MSS) accurate to a few centimeters. Subtraction of the GOCE geoid from the MSS gives the geodetic mean dynamic ocean topography (MDT). It is maintained by the balance of the pressure differences due to the MDT and Coriolis acceleration.
Fig. 4 Geostrophic velocities (in cm/s) derived from drifter measurements (left) and from geodetic mean dynamic ocean topography (right)
Its slope is proportional to the geostrophic velocities of ocean circulation. GOCE together with altimetry yields the MDT and the geostrophic velocity field without the use of any oceanographic in situ data. Geodetic MDT and geostrophic velocities represent a new type of input quantity for numerical ocean circulation models and help to improve ocean mass and heat transport estimates, e.g., in the area of the Weddell Sea in the Southern Ocean, one of the tipping points of our climate system. Figure 4 shows the magnitude of the geostrophic velocities in the North Atlantic based on the Danish MSS model DTU-10 (right) and geostrophic velocities derived from drifter data (left). We observe higher signal strength of the geodetic estimate. A large number of ocean studies based on GOCE have already been published. Examples are Bingham et al. (2011), Janjić et al. (2012) or Le Traon et al. (2011).

Atmosphere. GOCE was kept drag-free in flight direction using ion thrusters. The feedback signal for drag-free control was the measured common-mode accelerations of the six accelerometers of the gradiometer. They are a measure of the nongravitational forces acting on the GOCE spacecraft and open the possibility for studies of atmospheric density and winds (Doornbos et al. 2013). At GOCE altitude, no other data source is available.
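The link between the slope of the MDT and the geostrophic velocities mentioned in the oceanography paragraph can be sketched with the standard geostrophic balance, $u = -(g/f)\,\partial\eta/\partial y$ and $v = (g/f)\,\partial\eta/\partial x$ with Coriolis parameter $f = 2\Omega \sin\varphi$. The example below uses a purely synthetic MDT on a small grid; the constants, the grid, and the function name are assumptions for illustration and do not correspond to any GOCE product.

```python
import numpy as np

OMEGA = 7.292115e-5   # rad/s, Earth's rotation rate (assumed value)
G0 = 9.81             # m/s^2, nominal gravity (assumed value)
R_EARTH = 6.371e6     # m, mean Earth radius (assumed value)


def geostrophic_velocity(mdt, lat_deg, lon_deg):
    """Geostrophic velocities (u east, v north) from an MDT grid of shape (lat, lon)."""
    lat = np.deg2rad(lat_deg)
    lon = np.deg2rad(lon_deg)
    f = 2.0 * OMEGA * np.sin(lat)[:, None]                 # Coriolis parameter per latitude
    deta_dy = np.gradient(mdt, lat, axis=0) / R_EARTH      # northward slope
    deta_dx = np.gradient(mdt, lon, axis=1) / (R_EARTH * np.cos(lat)[:, None])  # eastward slope
    u = -G0 / f * deta_dy
    v = G0 / f * deta_dx
    return u, v


# Synthetic MDT: the surface drops by 1 m from 30N to 40N, uniform in longitude.
lat_deg = np.linspace(30.0, 40.0, 11)
lon_deg = np.linspace(-60.0, -50.0, 11)
mdt = -((lat_deg - 30.0) / 10.0)[:, None] * np.ones((1, lon_deg.size))

u, v = geostrophic_velocity(mdt, lat_deg, lon_deg)
print(f"eastward velocity at 35N: {100 * u[5, 5]:.1f} cm/s")
```

A drop of 1 m over roughly 1,100 km thus corresponds to an eastward flow of about 10 cm/s at mid-latitudes, which indicates why an MDT accurate to a few centimeters is needed to resolve ocean currents. The relation breaks down near the equator, where $f$ approaches zero.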
References

Albertella A, Rummel R (2009) On the spectral consistency of the altimetric ocean and geoid surface: a one-dimensional example. J Geod 83(9):805–815
Baur O (2007) Die Invariantendarstellung in der Satellitengradiometrie. DGK, Reihe C, Beck, München
Baur O, Grafarend EW (2006) High performance GOCE gravity field recovery from gravity gradient tensor invariants and kinematic orbit information. In: Flury J, Rummel R, Reigber Ch, Rothacher M, Boedecker G, Schreiber U (eds) Observation of the earth system from space. Springer, Berlin, pp 239–254
Baur O, Cai J, Sneeuw N (2009) Spectral approaches to solving the polar gap problem. In: Flechtner F, Mandea M, Gruber Th, Rothacher M, Wickert J, Güntner A, Schöne T (eds) System earth via geodetic-geophysical space techniques. Springer, Berlin
Bingham RJ, Knudsen P, Andersen O, Pail R (2011) An initial estimate of the North Atlantic steady-state geostrophic circulation from GOCE. Geophys Res Lett 38:L01606. doi:10.1029/2010GL045633
Bock H, Jäggi A, Meyer U, Visser P, van den Ijssel J, van Helleputte T, Heinze M, Hugentobler U (2011) GPS-derived orbits of the GOCE satellite. J Geod 85(11):807–818
Bouman J, Visser P, Fuchs M, Broerse T, Haberkorn C, Lieb V, Schmidt M, Schrama E, van der Wal W (2013) GOCE gravity gradients and the Earth’s time varying gravity field. ESA Living Planet, Edinburgh
Brockmann JM, Kargoll B, Krasbutter I, Schuh WD, Wermuth M (2009) GOCE data analysis: from calibrated measurements to the global earth gravity field. In: Flechtner F, Mandea M, Gruber Th, Rothacher M, Wickert J, Güntner A, Schöne T (eds) System earth via geodetic-geophysical space techniques. Springer, Berlin
Bunge H-P, Richards MA, Lithgow-Bertelloni C, Baumgardner JR, Grand SP, Romanowicz BA (1998) Time scales and heterogeneous structure in geodynamic earth models. Science 280:91–95
Carroll JJ, Savet PH (1959) Gravity difference detection. Aerosp Eng 18:44–47
Colombo O (1989) Advanced techniques for high-resolution mapping of the gravitational field. In: Sansò F, Rummel R (eds) Theory of satellite geodesy and gravity field determination. Lecture notes in earth sciences, vol 25. Springer, Heidelberg, pp 335–369
Doornbos E, Bruinsma S, Fritsche B, Visser P, v/d Ijssel J, Teixeira Encarnação J, Kern M (2013) Air density and wind retrieval using GOCE data. ESA Living Planet, Edinburgh
Eicker A, Mayer-Gürr T, Ilk KH, Kurtenbach E (2009) Regionally refined gravity field models from in situ satellite data. In: Flechtner F, Mandea M, Gruber Th, Rothacher M, Wickert J, Güntner A, Schöne T (eds) System earth via geodetic-geophysical space techniques. Springer, Berlin
ESA (1999a) Introducing the “Living Planet” Programme-the ESA strategy for earth observation. ESA SP-1234. ESA Publication Division, ESTEC, Noordwijk
ESA (1999b) Gravity field and steady-state ocean circulation mission. Reports for mission selection, SP-1233 (1). ESA Publication Division, ESTEC, Noordwijk. http://www.esa.int./livingplanet/goce
ESA (2006) The changing earth-new scientific challenges for ESA’s Living Planet Programme. ESA SP-1304. ESA Publication Division, ESTEC, Noordwijk
Falk G, Ruppel W (1974) Mechanik, Relativität, Gravitation. Springer, Berlin
Ferraccioli F, Finn CA, Jordan TA, Bell RE, Anderson LM, Damaske D (2011) East Antarctic rifting triggers uplift of the Gamburtsev Mountains. Nature 479:388–392
Förste C, Bruinsma S, Shako R, Marty J-C, Flechtner F, Abrikosov O, Dahle C, Lemoine J-M, Neumayer H, Biancale R, Barthelmes F, König R, Balmino G (2011) EIGEN-6: a new combined global gravity field model including GOCE data from the collaboration of GFZ-Potsdam and GRGS-Toulouse, EGU2011-3242
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere. Oxford Science Publications, Oxford
Ganachaud A, Wunsch C, Kim M-Ch, Tapley B (1997) Combination of TOPEX/POSEIDON data with a hydrographic inversion for determination of the oceanic general circulation and its relation to geoid accuracy. Geophys J Int 128:708–722
Gill AE (1982) Atmosphere-ocean dynamics. Academic, New York
Hager BH, Richards MA (1989) Long-wavelength variations in Earth’s geoid: physical models and dynamical implications. Philos Trans R Soc Lond A 328:309–327
Jäggi A (2007) Pseudo-stochastic orbit modelling of low earth satellites using the global positioning system. Geodätisch-geophysikalische Arbeiten in der Schweiz, 73
Janjić T, Schröter J, Savcenko R, Bosch W, Albertella A, Rummel R, Klatt O (2012) Impact of combining GRACE and GOCE gravity data on ocean circulation estimates. Ocean Sci 8:65–79. doi:10.5194/os-8-65-2012
Johannessen JA, Balmino G, LeProvost C, Rummel R, Sabadini R, Sünkel H, Tscherning CC, Visser P, Woodworth P, Hughes CH, LeGrand P, Sneeuw N, Perosanz F, Aguirre-Martinez M, Rebhan H, Drinkwater MR (2003) The European gravity field and steady-state ocean circulation explorer satellite mission: its impact on geophysics. Surv Geophys 24:339–386
Kaban MK, Schwintzer P, Reigber Ch (2004) A new isostatic model of the lithosphere and gravity field. J Geod 78:368–385
Kusche J, Klees R (2002) Regularization of gravity field estimation from satellite gravity gradients. J Geod 76:359–368
LeGrand P, Minster J-F (1999) Impact of the GOCE gravity mission on ocean circulation estimates. Geophys Res Lett 26(13):1881–1884
Le Traon PY, Schaeffer P, Guinehut S, Rio MH, Hernandez F, Larnicol G, Lemoine JM (2011) Mean ocean dynamic topography from GOCE and altimetry, ESA SP 686
Lithgow-Bertelloni C, Richards MA (1998) The dynamics of cenozoic and mesozoic plate motions. Rev Geophys 36(1):27–78
Losch M, Sloyan B, Schröter J, Sneeuw N (2002) Box inverse models, altimetry and the geoid; problems with the omission error. J Geophys Res 107(C7):15-1–15-13
Martinec Z (2003) Green’s function solution to spherical gradiometric boundary-value problems. J Geod 77:41–49
Marussi A (1985) Intrinsic geodesy. Springer, Berlin
Maximenko N, Niiler P, Rio M-H, Melnichenko O, Centurioni L, Chambers D, Zlotnicki V, Galperin B (2009) Mean dynamic topography of the ocean derived from satellite and drifting buoy data using three different techniques. J Atmos Ocean Technol 26:1910–1919
Migliaccio F, Reguzzoni M, Sansò F (2004) Space-wise approach to satellite gravity field determination in the presence of colored noise. J Geod 78:304–313
Misner CW, Thorne KS, Wheeler JA (1970) Gravitation. Freeman, San Francisco
Moritz H, Hofmann-Wellenhof B (1993) Geometry, relativity, geodesy. Wichmann, Karlsruhe
Nutz H (2002) A unified setup of gravitational observables. Dissertation. Shaker Verlag, Aachen
Ohanian HC, Ruffini R (1994) Gravitation and spacetime. Norton & Comp., New York
Pail R (2014) It is all about statistics: global gravity field modelling from GOCE and complementary data. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics. Springer, Heidelberg
Pail R, Plank R (2004) GOCE gravity field processing strategy. Stud Geophys Geod 48:289–309
Rummel R (1986) Satellite gradiometry. In: Sünkel H (ed) Mathematical and numerical techniques in physical geodesy. Lecture notes in earth sciences, vol 7. Springer, Berlin, pp 317–363. ISBN (Print):978-3-540-16809-6, doi:10.1007/BFb0010135
Rummel R (1997) Spherical spectral properties of the earth’s gravitational potential and its first and second-derivatives. In: Sansò F, Rummel R (eds) Geodetic boundary value problems in view of the one centimeter geoid. Lecture notes in earth sciences, vol 65. Springer, Berlin, pp 359–404. ISBN:3-540-62636-0
Rummel R, van Gelderen M (1992) Spectral analysis of the full gravity tensor. Geophys J Int 111:159–169
Rummel R, Balmino G, Johannessen J, Visser P, Woodworth P (2002) Dedicated gravity field missions-principles and aims. J Geodyn 33/1–2:3–20
Sampietro D, Reguzzoni M, Braitenberg C (2012) The GOCE estimated Moho beneath the Tibetan Plateau and Himalaya. In: Rizos C, Willis P (eds) Earth on the edge: science for a sustainable planet, International Association of Geodesy Symposia, vol 139. Springer, pp 391–397. doi:10.1007/978-3-642-37222-3_52
Schreiner M (1994) Tensor spherical harmonics and their application in satellite gradiometry. Dissertation, Universität Kaiserslautern
Stubenvoll R, Förste Ch, Abrikosov O, Kusche J (2009) GOCE and its use for a high-resolution global gravity combination model. In: Flechtner F, Mandea M, Gruber Th, Rothacher M, Wickert J, Güntner A, Schöne T (eds) System earth via geodetic-geophysical space techniques. Springer, Berlin
Svehla D, Rothacher M (2004) Kinematic precise orbit determination for gravity field determination. In: Sansò F (ed) The proceedings of the international association of geodesy: a window on the future of geodesy. Springer, Berlin, pp 181–188
Van der Meijde M, Julia J, Assumpcao M (2013) Gravity derived Moho for South America. Tectonophysics 609:456–467
Wells WC (ed) (1984) Spaceborne gravity gradiometers. NASA conference publication, vol 2305, Greenbelt
Woodworth PL, Hughes CW, Bingham RJ, Gruber T (2012) Towards worldwide height system unification using ocean information. J Geodetic Sci 2(4):302–318. doi:10.2478/v10156-012-0004-8
Wunsch C, Gaposchkin EM (1980) On using satellite altimetry to determine the general circulation of the oceans with application to geoid improvement. Rev Geophys 18:725–745
Yi W, Rummel R (2014) A comparison of GOCE gravitational models with EGM2008. J Geodyn 73:14–22
Sources of the Geomagnetic Field and the Modern Data That Enable Their Investigation

Nils Olsen, Gauthier Hulot, and Terence J. Sabaka
Contents
1 Sources of the Earth’s Magnetic Field
1.1 Internal Field Sources: Core and Crust
1.2 Ionospheric, Magnetospheric, and Earth-Induced Field Contributions
2 Modern Geomagnetic Field Data
2.1 Definition of Magnetic Elements and Coordinates
2.2 Ground Data
2.3 Satellite Data
3 Making the Best of the Data to Investigate the Various Field Contributions: Geomagnetic Field Modeling
References
N. Olsen: DTU Space, Technical University of Denmark, Kgs. Lyngby, Denmark
G. Hulot: Equipe de Géomagnétisme, Institut de Physique du Globe de Paris, Sorbonne Paris Cité, Université Paris Diderot, Paris, France
T.J. Sabaka: Planetary Geodynamics Laboratory, Code 698, NASA Goddard Space Flight Center, Greenbelt, MD, USA

© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_5

Abstract

The geomagnetic field one can measure at the Earth’s surface or on board satellites is the sum of contributions from many different sources. These sources have different physical origins and can be found both below (in the form of electrical currents and magnetized material) and above (only in the form of electrical currents) the Earth’s surface. Each source happens to produce a contribution
with rather specific spatio-temporal properties. This fortunate situation is what makes the identification and investigation of the contribution of each source possible, provided appropriate observational data sets are available and analyzed in an adequate way to produce the so-called geomagnetic field models. Here we provide a general overview of the various sources that contribute to the observed geomagnetic field, and of the modern data that enable their investigation via such procedures.

The Earth has a large and complicated magnetic field, a major part of which is produced by a self-sustaining dynamo operating in the fluid outer core. What is measured at or near the surface of the Earth, however, is the superposition of the core field and of additional fields caused by magnetized rocks in the Earth’s crust, by electric currents flowing in the ionosphere, magnetosphere and oceans, and by currents induced in the Earth by the time-varying external fields. The sophisticated separation of these various fields and the accurate determination of their spatial and temporal structure based on magnetic field observations is a significant challenge, which requires advanced modeling techniques (see e.g., Hulot et al. 2007). These techniques rely on a number of mathematical properties which we review in the accompanying chapter by Sabaka et al. (2010), entitled Mathematical Properties Relevant to Geomagnetic Field Modeling. But as many of those properties have been derived by relying on assumptions motivated by the nature of the various sources of the Earth’s magnetic field and of the available observations, it is important that a general overview of those sources and observations be given. This is precisely the purpose of the present chapter. It will first describe the various sources that contribute to the Earth’s magnetic field (Sect. 1) and next discuss the observations currently available to investigate them (Sect. 2). 
Special emphasis is placed on data collected by satellites, since these are extensively used for modeling the present magnetic field. We conclude with a few words on how the fields these sources produce can be identified and investigated through geomagnetic field modeling.
1 Sources of the Earth’s Magnetic Field
Several sources contribute to the magnetic field that is measured at or above the surface; the most important ones are sketched in Fig. 1. The main part of the field is due to electrical currents in the Earth’s fluid outer core at depths larger than 2,900 km; this is the so-called core field. Its strength at the Earth’s surface varies from less than 30,000 nT near the equator to about 60,000 nT near the poles, which makes the core field responsible for more than 95 % of the observed field at ground. Magnetized material in the crust (the uppermost few kilometers of Earth) causes the crustal field; it is relatively weak and accounts on average only for a few percent of the observed field at ground. Core and crustal fields together make the internal field (since their sources are internal to the Earth’s surface). External magnetic field contributions are caused by electric currents in the ionosphere (at altitudes 90–1,000 km) and magnetosphere (at altitudes of several Earth radii). On average,
Sources of the Geomagnetic Field and the Modern Data That Enable Their. . .
Fig. 1 Sketch of the various sources contributing to the near-Earth magnetic field
their contribution is also relatively weak (a few percent of the total field at ground during geomagnetically quiet conditions). However, if not properly considered, they disturb the precise determination of the internal field. It is therefore of crucial importance to account for external fields (by data selection, data correction, and/or field co-estimation) in order to obtain reliable models of the internal fields. Finally, electric currents induced in the Earth’s crust and mantle by the time-varying fields of external origin, and the movement of electrically conducting seawater, cause magnetic field contributions that are of internal origin like the core and crustal fields; however, typically only the core and crustal fields are meant when speaking about “internal sources.” A useful way of characterizing the spatial behavior of the geomagnetic field is to make use of the concept of spatial power spectra (e.g., Lowes 1996 and section 4 of Sabaka et al. 2010). Figure 2 shows the spectrum of the field of internal origin, often referred to as the Lowes-Mauersberger spectrum, which gives the mean square magnetic field $R_n$ at Earth’s surface due to contributions with horizontal wavelength $\lambda_n$ corresponding to spherical harmonic degree $n$. The spectrum of the observed magnetic field (based on a combination of the recent field models derived by Olsen et al. 2009 and Maus et al. 2008) is shown by black dots, while theoretical spectra for the core and crustal fields (Voorhies et al. 2002) are shown as blue,
N. Olsen et al.
[Figure 2 plot: spatial power spectrum $R_n$ [(nT)²] (logarithmic axis, 10⁰ to 10⁸) versus spherical harmonic degree $n$ (0 to 80, bottom axis) and horizontal wavelength $\lambda_n$ [km] (40,000 to 500, top axis); curves labeled Core and Crust]
Fig. 2 Spatial power spectrum of the geomagnetic field at Earth’s surface. Black dots represent the spectrum of a recent field model (Maus et al. 2008; Olsen et al. 2009). Also shown are theoretical spectra (Voorhies et al. 2002) for the core (blue) and crustal (magenta) part of the field, as well as their superposition (red curve)
resp. magenta, curves. Each of these two theoretical spectra has two free parameters which have been fitted to the observed spectrum; their sum (red curve) provides a remarkably good fit to the observed spectrum. There is a sharp “knee” at about degree $n = 14$, which indicates that contributions from the core field are dominant at large scales ($n < 14$) while those of the crustal field dominate at smaller scales ($n > 14$). We now proceed with a more detailed overview of the various field sources and their characteristics.
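To make the notion of a spatial power spectrum more concrete, the following minimal sketch evaluates the standard Lowes-Mauersberger expression $R_n = (n+1)(a/r)^{2n+4}\sum_m [(g_n^m)^2 + (h_n^m)^2]$ from a set of Gauss coefficients. The reference radius of 6371.2 km, the function names, and the dipole coefficients are assumptions chosen for illustration only (the coefficients are rounded, made-up values, not taken from any field model). The wavelength helper assumes the simple relation $\lambda_n \approx 2\pi a/n$, which appears consistent with the axis labels of Fig. 2.

```python
import math

A_KM = 6371.2  # geomagnetic reference radius in km (assumed convention)

def lowes_mauersberger(gauss, r_km=A_KM):
    """Return {n: R_n} in nT^2 from Gauss coefficients.

    gauss maps (n, m) -> (g_nm, h_nm) in nT (Schmidt semi-normalized).
    R_n = (n + 1) * (a / r)**(2n + 4) * sum over m of (g_nm**2 + h_nm**2)
    """
    spectrum = {}
    for (n, m), (g, h) in gauss.items():
        factor = (n + 1) * (A_KM / r_km) ** (2 * n + 4)
        spectrum[n] = spectrum.get(n, 0.0) + factor * (g * g + h * h)
    return spectrum

def wavelength_km(n, r_km=A_KM):
    """Approximate horizontal wavelength for degree n >= 1 (assumed 2*pi*r/n)."""
    return 2.0 * math.pi * r_km / n

# Illustrative, rounded dipole coefficients in nT (hypothetical values).
example = {(1, 0): (-30000.0, 0.0), (1, 1): (-2000.0, 5000.0)}
spectrum = lowes_mauersberger(example)
```

Evaluating the same coefficients at a larger radius (for example, satellite altitude) reduces every $R_n$, and does so more strongly for higher degrees, which is why the crustal part of the spectrum is harder to observe from orbit.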
1.1 Internal Field Sources: Core and Crust
Core Field
Although the Earth’s magnetic field has been known for at least several thousands of years (see e.g., Merrill et al. 1998), the nature of its sources has eluded scientific understanding for a very long time. It was not until the nineteenth century that its main source was finally proven to be internal to the Earth. We now know that this
main source is most likely a self-sustaining dynamo within the Earth’s core (see e.g., Roberts 2007; Wicht et al. 2010). This dynamo is the result of the fact that the liquid electrically conducting outer core (consisting of a Fe-Ni alloy) is cooling down and convecting vigorously enough to maintain electrical currents and a magnetic field. The basic process is one whereby the convective motion of the conducting fluid within the magnetic field induces electromotive forces which maintain electrical currents, and therefore also the magnetic field, against Ohmic dissipation. The Earth’s core dynamo has several specific features. First, the core contains a solid, also conducting, inner core. This inner core is thought to be the solidified part of the core (Jacobs 1953), the growth of which is the result of the cooling of the core. Because the Fe-Ni alloy that makes the core must in fact contain additional light elements, the solidification of the inner core releases so-called compositional buoyancy at the inner core boundary. This buoyancy will add up to the thermal buoyancy and is thought to be a major source of energy for the convection (see e.g., Nimmo 2007). Second, the core, together with the whole Earth, is rotating fast, at a rate of one rotation per day. This leads Coriolis forces to play a major role in organizing the convection, and the way the dynamo works. In particular, spherical symmetry is dynamically broken, and a preferential axial (North-South) dipole field can be produced (see e.g., Gubbins and Zhang 1993). At any given instant, however, the field produced cannot be too simple (a requirement which has been formalized in terms of anti-dynamo theorems, starting with the best-known Cowling theorem (Cowling 1957)). In particular, no fully axisymmetric field can be produced by a dynamo. 
In effect, all dynamo numerical simulations run so far with conditions approaching that of the core dynamo produce quite complex fields in addition to a dominant axial dipole. This complexity does not only affect the so-called toroidal component of the magnetic field which remains for the most part within the fluid core (such toroidal components are nonzero only where their sources lie (cf. section 3 of Sabaka et al. 2010), and the poorly conducting mantle forces those to essentially remain within the core). It also affects the poloidal component of the field which can escape the core by taking the form of a potential field (such poloidal components can indeed escape their source region in the form of a Laplacian potential field, see again Sabaka et al. (2010)), reach the Earth’s surface and make the core field we can observe. The core field thus has a rich spatial spectrum beyond a dominant axial dipole component. It also has a rich temporal spectrum (with typical time scales from decades to centuries (see e.g., Hulot and Le Mouël 1994)) directly testifying for the turmoil of the poloidal field produced by the dynamo at the core surface. However, what is observable at and above the Earth’s surface is only part of the core field that reaches it. Spatially, its small-scale contributions are masked by the crustal field, as shown in Fig. 2, and therefore only its largest scales (corresponding to spherical harmonic degrees smaller than 14) can be recovered. And temporally, the high frequency part of the core field (corresponding to periods shorter than a few months) is screened by the finite conductivity of the mantle (see e.g., Alexandrescu et al. 1999). This puts severe limitations on the possibility to
recover the spatiotemporal structure of the core field, regardless of the quality of the magnetic field observations. More information on our present knowledge of the core field based on recent data can be found in e.g., Hulot et al. (2007), Jackson and Finlay (2007), and Finlay et al. (2010).
Crustal Field
The material that makes up the mantle and crust contains substantial amounts of magnetic minerals. Those minerals can become magnetized in the presence of an applied magnetic field. To produce any significant magnetic signal, however, this magnetism must be of the ferromagnetic type, which also requires the material to be at a low enough temperature (below the so-called Curie temperature of the minerals, see e.g., Dunlop and Özdemir 2007). Those conditions are only met within the Earth’s upper layers, above the so-called Curie isotherm. Its depth can vary between zero (such as at mid-oceanic ridges) and several tens of kilometers, with a typical value on the order of 20 km in continental regions. Magnetized rocks can thus only be found in those layers. Magnetized rocks essentially carry two types of magnetization, induced magnetization and remanent magnetization. Induced magnetization is one that is proportional, both in strength and direction, to the ambient field within which the rock is embedded. The ability of such a rock to acquire this magnetization is a function of the nature and proportion of the magnetic minerals it contains. It is measured in terms of a proportionality factor known as the magnetic susceptibility. Were the core field (or more correctly the local field experienced by the rock) to disappear, this induced magnetization would also disappear. Then, only the second type of magnetization, remanent magnetization, would remain. This remanent magnetization could have been acquired by the rock in many different ways (see e.g., Dunlop and Özdemir 2007), for instance at the time of deposition for a sedimentary rock, or via chemical transformation if the rock has been chemically altered. The most ubiquitous process, however, which also usually leads to the strongest remanent magnetization, is thermal. It is the way igneous and metamorphic rocks acquire their remanent magnetization when they cool down below their Curie temperature.
The rock becomes magnetized in proportion, both in strength and direction, to the ambient magnetic field that the rock experiences at the time it cools down (the proportionality factor being again a function of the magnetic minerals contained in the rock). Remanent magnetization from a properly sampled rock thus can provide information about the ancient core field (see e.g., Hulot et al. 2010). There is no way one can identify the signature of the present core field without taking the crustal field into account (in fact, modern ways of modeling the core field from satellite data often also involve modeling the crustal field). It is therefore important to also mention some of the most important spatio-temporal characteristics of the field produced by magnetized sources. It should first be recalled that not all magnetized sources will produce observable fields at the Earth’s surface. In particular, if the upper layers of the Earth consisted of a spherical shell
of uniform magnetic properties magnetized within the core field at a given instant, they would produce no observable field at the Earth’s surface. This is known as Runcorn’s theorem (Runcorn 1975), an important implication of which is that the magnetic field observed at the Earth’s surface is not sensitive to the induced magnetization due to the average susceptibility of a spherical shell best describing the upper magnetic layers of the Earth. It will only sense the departure of those layers from sphericity (see e.g., Lesur and Jackson 2000), either because of the Earth’s flattening, because of the variable depth of the Curie isotherm, or because of the contrasts in magnetization due to the variable nature and susceptibility of rocks within those layers (although even such contrasts can sometimes also fail to produce observable fields (Maus and Haak 2003)). Another issue of importance is that of the relative contributions of induced and remanent magnetization. Induced magnetization is most likely the main source of large scale magnetization, while remanent magnetization plays a significant role only at regional (especially in oceans) and local scales (e.g., Purucker et al. 2002; Hemant and Maus 2005). Finally, it is important that we also briefly mention the poorly known issue of possible temporal changes in crustal magnetization on a human time scale. At a local scale, any dynamic process that can alter the magnetic properties of the rocks, or change the geological setting (such as an active volcano), would produce such changes. On a planetary scale, by contrast, significant changes can only occur in the induced magnetization because of the slowly time-varying core field, as has been recently demonstrated by Hulot et al. (2009) and Thebault et al. (2009). Recent reviews of the crustal field are given by Purucker and Whaler (2007) and Thébault et al. (2010).
1.2 Ionospheric, Magnetospheric, and Earth-Induced Field Contributions
Ionospheric and magnetospheric currents (which produce the field of external origin), as well as Earth-induced currents (which produce externally induced fields), contribute in a nonnegligible way to the observed magnetic field, both at ground and at satellite altitude. It is therefore important to consider them in order to properly identify and separate their signal from that of the field of internal origin.
Ionospheric Contributions
Geomagnetic daily variations at nonpolar latitudes (known as Sq variations) are caused by diurnal wind systems in the upper atmosphere: heating on the dayside and cooling on the nightside generate tidal winds which drive ionospheric plasma against the core field, inducing electric fields and currents in the ionospheric E-region dynamo region between 90 and 150 km altitude (Richmond 1989; Campbell 1989; Olsen 1997b). The currents are concentrated at an altitude of about 110–115 km and hence can be represented by a sheet current at that altitude (cf. sections 3.4 and 3.5 of Sabaka et al. 2010). They remain relatively fixed with
respect to the Earth-Sun line and produce regular daily variations which are directly seen in the magnetograms of magnetically quiet days. On magnetically disturbed days there is an additional variation which includes superimposed magnetic storm signatures of magnetospheric and high-latitude ionospheric origin. Typical peak-to-peak Sq amplitudes at middle latitudes are 20–50 nT; amplitudes during solar maximum are about twice as large as those during solar minimum. Sq variations are restricted to the dayside (i.e., sunlit) hemisphere, and thus depend mainly on local time. Selecting data from the nightside when deriving models of the internal field is therefore useful to minimize field contributions from the nonpolar ionospheric E-region. Because the geomagnetic field is strictly horizontal at the dip equator, there is about a fivefold enhancement of the effective (Hall) conductivity in the ionospheric dynamo region, which results in about a fivefold enhanced eastward current, called the Equatorial Electrojet (EEJ), flowing along the dayside dip equator (Rastogi 1989). Its latitudinal width is about 6°–8°. In addition, auroral electrojets (AEJ) flow in the auroral belts (near ±(65°–70°) magnetic latitude) and vary widely in amplitude with different levels of magnetic activity, from a few tens of nT during quiet periods to several thousand nT during major magnetic storms. As a general rule, ionospheric fields at polar latitudes are present even at magnetically quiet times and on the nightside (i.e., dark) hemisphere, which makes it difficult to avoid their contribution by data selection. Electric currents at altitudes above 120 km, i.e., in the so-called ionospheric F region (up to 1,000 km altitude), cause magnetic fields that are detectable at satellite altitude as nonpotential (e.g., toroidal) magnetic fields (Olsen 1997a; Richmond 2002; Maus and Lühr 2006).
Their contributions in nonpolar regions are also important during local nighttime when the E-region conductivity vanishes and therefore contributions from Sq and the Equatorial Electrojet are absent.
Magnetospheric Contributions
The field originating in Earth’s magnetosphere is due primarily to the ring-current and to currents on the magnetopause and in the magnetotail (Kivelson and Russell 1995). Currents flowing on the outer boundary of the magnetospheric cavity, the magnetopause currents, cancel the Earth’s field outside and distend the field within the cavity. This produces an elongated tail in the antisolar direction within which so-called neutral-sheet currents are established in the equatorial plane. Interaction of these currents with the radiation belts near the Earth produces a ring-current in the dipole equatorial plane which partially encircles the Earth, but achieves closure via field-aligned currents (FAC) (currents which flow along core field lines) into and out of the ionosphere. The resulting fields have magnitudes on the order of 20–30 nT near the Earth during magnetically quiet periods, but can increase to several hundreds of nT during disturbed times. In polar regions (poleward of, say, ±65° dipole latitude), the auroral ionosphere and magnetosphere are coupled by field-aligned currents. The fields from these FAC have magnitudes that vary with the magnetic disturbance level. However, they are
always present, on the order of 30–100 nT during quiet periods and up to several thousand nT during substorms. There are also currents which couple the ionospheric Sq current systems in the two hemispheres that flow, at least in part, along magnetic field lines. The associated magnetic fields are generally 10 nT or less. Finally, there exists a meridional current system which is connected to the EEJ with upward directed currents at the dip equator and field-aligned downward directed currents at low latitudes. These currents result in magnetic fields of about a few tens of nT at 400 km altitude.
Induction in the Solid Earth and the Oceans
Time-varying external fields produce secondary, induced, currents in the oceans and the Earth’s interior; this contribution is what we refer to as externally induced fields, which are the topic of electromagnetic induction studies (see Parkinson and Hutton 1989; Constable 2007; Kuvshinov 2012). In addition, the motion of electrically conducting seawater through the core field, via a process referred to as motional induction, also produces secondary currents (e.g., Tyler et al. 2003; Kuvshinov and Olsen 2005; Maus 2007b). The oceans thus contribute twofold to the observed magnetic field: by secondary currents induced by primary current systems in the ionosphere and magnetosphere; and by motion-induced currents due to the movement of seawater, for instance by tides. The amplitude of induced contributions generally decreases with the period. As an example, about one third of the observed daily Sq variation in the horizontal components is of induced origin (Schmucker 1985). But induction effects also depend on the scale of the source (i.e., the ionospheric and magnetospheric current systems); as a result, the induced contribution due to the daily variation of the Equatorial Electrojet is, for instance, much smaller than the above-mentioned one third, which is typical of the large-scale Sq currents (Olsen 2007b).
2 Modern Geomagnetic Field Data
2.1 Definition of Magnetic Elements and Coordinates
Measurements of the geomagnetic field taken at ground or in space form the basis for modeling the Earth’s magnetic field. Observations taken at the Earth’s surface are typically given in a local topocentric (geodetic) coordinate system (i.e., relative to a reference ellipsoid as an approximation for the geoid). The magnetic elements $X, Y, Z$ are the components of the field vector $\mathbf{B}$ in an orthogonal right-handed coordinate system, the axes of which point towards geographic North, geographic East, and vertically down, as shown in Fig. 3. Derived magnetic elements are: the angle between geographic North and the (horizontal) direction in which a compass needle is pointing, denoted as declination $D = \arctan(Y/X)$; the angle between the local horizontal plane and the field vector,
Fig. 3 The magnetic elements in the local topocentric coordinate system, seen from North-East
denoted as inclination $I = \arctan(Z/H)$; horizontal intensity $H = \sqrt{X^2 + Y^2}$; and total intensity $F = \sqrt{X^2 + Y^2 + Z^2}$. In contrast to magnetic observations taken at or near ground, satellite data are typically provided in the geocentric coordinate system as spherical components $B_r, B_\theta, B_\varphi$, where $r$, $\theta$, $\varphi$ are radius, colatitude, and longitude, respectively. Equations for transforming between geodetic components $X, Y, Z$ and geocentric components $B_r, B_\theta, B_\varphi$ can be found in, e.g., section 5.02.2.1.1 of Hulot et al. (2007). The distribution in space of the observations at a given time determines the spatial resolution to which the field can be determined for that time. Internal sources are often fixed with respect to the Earth (magnetic fields due to induced currents in the Earth’s interior are an exception) and thus follow its rotation. Internal sources are therefore best described in an Earth-Centered-Earth-Fixed (ECEF) coordinate frame like that given by the geocentric coordinates $r$, $\theta$, $\varphi$. In contrast, many external fields are fixed with respect to the position of the Sun, and therefore the use of a coordinate frame that follows the (apparent) movement of the Sun is advantageous. Solar magnetospheric (SM) coordinates for describing near magnetospheric currents like the ring-current, and geocentric solar magnetospheric (GSM) coordinates for describing far magnetospheric current systems like the tail currents have turned out to be useful when determining models of Earth’s magnetic field (Maus and Lühr 2005; Olsen et al. 2006, 2009, 2010b).
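As a small illustration of the derived magnetic elements defined above, the following sketch computes declination, inclination, horizontal intensity, and total intensity from the components X, Y, Z. The function name and the sample values are hypothetical. The use of `atan2` rather than a plain arctangent is an implementation choice (it keeps the correct quadrant when X is negative) and is not prescribed by the text.

```python
import math

def magnetic_elements(x, y, z):
    """Derive D, I (degrees) and H, F (same unit as inputs) from X, Y, Z.

    X points to geographic North, Y to geographic East, Z vertically down.
    atan2 is used so that D stays in the correct quadrant for negative X
    (an implementation choice, not taken from the text).
    """
    h = math.hypot(x, y)                     # H = sqrt(X^2 + Y^2)
    f = math.sqrt(x * x + y * y + z * z)     # F = sqrt(X^2 + Y^2 + Z^2)
    d = math.degrees(math.atan2(y, x))       # D = arctan(Y / X)
    i = math.degrees(math.atan2(z, h))       # I = arctan(Z / H)
    return {"D": d, "I": i, "H": h, "F": f}

# Made-up field vector in nT: purely northward and downward, equal parts.
elements = magnetic_elements(20000.0, 0.0, 20000.0)
```

For this made-up vector the declination is zero and the inclination is 45°, since the horizontal and vertical parts are equal.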
2.2 Ground Data
[Figure 4, top panel: number of observatories (0 to 200) versus year (1800 to 2000) for annual means, hourly means, and 1-min values; annotated campaigns: Göttingen Magnetic Union, 1st IPY, 2nd IPY, IGY/C, IQSY, IMS, Preparation for Ørsted]
Presently about 150 geomagnetic observatories monitor the time changes of the Earth’s magnetic field. Their global distribution, shown in the lower part of Fig. 4,
Fig. 4 Distribution of ground observatory magnetic field data in time (top) and space (bottom)
is very uneven, with large uncovered areas especially over the oceans. Red symbols indicate sites that provide data (regardless of the observation time instant and the duration of the time series), while the yellow dots show observatories that have provided hourly mean values for recent years. These observatories provide data of different temporal resolution which are distributed through the World Data Center system (e.g., http://www.ngdc.noaa.gov/wdc, http://wdc.kugi.kyoto-u.ac.jp, http://www.wdc.bgs.ac.uk). Traditionally, annual mean values have been used for deriving field models, but the availability of hourly mean values (or even 1 min values) in digital form for recent years allows for a better characterization of external field variations. The upper part of Fig. 4 shows the distribution in time of observatory data of various sampling rates. International campaigns, like the Göttingen Magnetic Union, the 1st and 2nd International Polar Year (IPY), the International Geophysical Year (IGY/C), the International Quiet Solar Year (IQSY), the International Magnetospheric Study (IMS), and the preparation of the Ørsted satellite mission have stimulated observatory data processing and the establishment of new observatories. Geomagnetic observatories aim at measuring the magnetic field in the geodetic reference frame with an absolute accuracy of 1 nT (Jankowski and Sucksdorff 1996). However, it is presently not possible to take advantage of that measurement accuracy due to the (unknown) contribution from nearby crustal sources. When using observatory data for field modeling it is therefore common practice to either use first time differences of the observations (thereby eliminating the static crustal field contribution) or to co-estimate together with the field model an “observatory bias” for each site and element, following a procedure introduced by Langel et al. (1982).
A joint analysis of observatory and satellite data allows one to determine these observatory biases. Knowledge of the absolute baseline is therefore less critical during periods for which satellite data are available. Recognizing this will simplify the observation practice, especially for ocean-bottom magnetometers, for which the exact determination of true north is very difficult and expensive. In addition to geomagnetic observatories (which monitor the time changes of the geomagnetic field at a given location), magnetic “repeat stations” are sites where high-quality magnetic measurements are taken every few years for a couple of hours or even days (Newitt et al. 1996; Turner et al. 2007). The main purpose of repeat stations is to measure the time changes of the core field (secular variation); they offer better spatial resolution than observatory data but do not provide continuous time series.
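The first-difference approach mentioned above for observatory data can be illustrated with a toy example. The sketch below assumes that an observed series is the sum of a slowly varying core-field signal and a constant (static) crustal offset; all numbers and names are hypothetical and serve only to show that the constant offset cancels when successive values are differenced.

```python
def first_differences(series):
    """Return successive differences B[k+1] - B[k] of a time series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Hypothetical annual means (nT) of one element at one observatory:
# a slowly varying core-field signal plus an unknown static crustal bias.
core_signal = [48000.0, 48020.0, 48045.0, 48075.0]
crustal_bias = 137.0  # made-up constant offset from nearby magnetized rocks
observed = [value + crustal_bias for value in core_signal]

# The static bias drops out; only the time change of the signal remains.
secular_variation = first_differences(observed)
```

The alternative described in the text, co-estimating an observatory bias per site and element, keeps the absolute level of the data at the cost of additional model parameters.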
2.3 Satellite Data
The possibility to measure the Earth’s magnetic field from space has revolutionized geomagnetic field modeling. Magnetic observations taken by low-earth-orbiting (LEO) satellites at altitudes below 1,000 km form the basis of recent models of the geomagnetic field.
There are several advantages of using satellite data for field modeling:
1. Satellites sample the magnetic field over the entire Earth (apart from the polar gap, a region around the geographic poles that is left unsampled if the satellite orbit is not perfectly polar).
2. Measuring the magnetic field from an altitude of 400 km or so corresponds roughly to averaging over an area of this dimension. Thus the effect of local heterogeneities, for instance caused by local crustal magnetization, is reduced.
3. The data are obtained over different regions with the same instrumentation, which helps to reduce spurious effects.
There are, however, some points to consider when using satellite data instead of ground-based data:
1. Since the satellite moves (with a velocity of about 8 km/s at 400 km altitude) it is not possible to decide whether an observed magnetic field variation is due to a temporal or spatial change of the field. Thus, there is a risk of time-space aliasing.
2. It is necessary to measure the magnetic field with high accuracy, not only regarding resolution, but also regarding orientation and absolute values.
3. Due to the Earth’s rotation, the satellite revisits a specific region after about 1 day.¹ Hence the magnetic field in a selected region is modeled from time series with a sampling interval of 1 day. However, since the measurements were not low-pass filtered before “resampling,” aliasing may occur.
4. Satellites usually acquire data not at one fixed altitude, but over a range of altitudes. The decay of altitude through mission lifetime often leads to time series that are unevenly distributed in altitude.
5. Finally, the satellite moves through an electric plasma, and the existence of electric currents at satellite altitude means that, in principle, the observed field cannot be described as the gradient of a Laplacian potential.
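The satellite velocity quoted in the list above can be checked with a back-of-the-envelope calculation. The sketch below assumes a circular Keplerian orbit around a spherical Earth; the gravitational parameter, the reference radius, and the function name are assumptions introduced here for illustration.

```python
import math

GM_EARTH = 398600.4418    # km^3/s^2, standard gravitational parameter of Earth
EARTH_RADIUS_KM = 6371.2  # reference radius in km (assumed)

def circular_orbit(altitude_km):
    """Return (speed in km/s, period in minutes) for a circular orbit."""
    r = EARTH_RADIUS_KM + altitude_km
    speed = math.sqrt(GM_EARTH / r)               # v = sqrt(GM / r)
    period_min = 2.0 * math.pi * r / speed / 60.0  # T = 2*pi*r / v
    return speed, period_min

speed_400, period_400 = circular_orbit(400.0)
```

Under these assumptions the speed at 400 km altitude comes out between 7.6 and 7.7 km/s, in line with the rounded value of about 8 km/s, and the orbital period is a little over 92 min, so the satellite completes roughly 15 to 16 orbits per day.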
An overview of previous and present satellites that have been used for geomagnetic field modeling is given in Table 1 (see also Table 3.3 of Langel and Hinze 1998). The quality of the magnetic field measurements is rather different for the listed satellites, and before the launch of the Ørsted satellite in 1999, the POGO satellite series (Cain 2007) that flew in the second half of the 1960s, and Magsat (Purucker 2007), which flew for 6 months around 1980, were the only high-precision magnetic satellites. A timeline of high-precision missions is shown in Fig. 5. After a gap of almost 20 years with no high-precision satellites in orbit, the launch of the Danish
1 Actually the satellite revisits that region already after about 12 h, but this will be for a different local time. Because of external field contributions – which heavily depend on local time – it is safer to rely on data taken at similar local time conditions, which results in the above stated sampling recurrence of 24 h.
Table 1 Satellite missions of relevance for geomagnetic field modeling

Satellite    Years       Inclination i   Altitude/km   Accuracy/nT   Remarks
Cosmos 49    1964        50°             261–488       22            Scalar only
POGO
  OGO-2      1965–1967   87°             413–1,510     6             Scalar only
  OGO-4      1967–1969   86°             412–908       6             Scalar only
  OGO-6      1969–1971   82°             397–1,098     6             Scalar only
Magsat       1979–1980   97°             325–550       6             Vector and scalar
DE-1         1981–1991   90°             568–23,290    ?             Vector (spinning)
DE-2         1981–1983   90°             309–1,012     30(F)/100     Low accuracy vector
POGS         1990–1993   90°             639–769       ?             Low accuracy vector, timing problems
UARS         1991–1994   57°             560           ?             Vector (spinning)
Ørsted       1999–       97°             650–850       4             Scalar and vector
CHAMP        2000–2010   87°             250–450       3             Scalar and vector
SAC-C        2001–2004   97°             698–705       4             Scalar only
Swarm        2013–       88°/87°         530/<450      2             Scalar and vector

[Figure 5 timeline, 1965 to 2020; mission labels: POGO (OGO-2, -4, -6), Magsat, Ørsted, CHAMP, SAC-C, Swarm; annotations: scalar only; vector and scalar; 3-sat constellation, vector and scalar]
Fig. 5 Distribution of high-precision satellite missions in time
Ørsted satellite (Olsen 2007a) in February 1999 marked the beginning of a new epoch for exploring the Earth’s magnetic field from space. Ørsted was followed by the German CHAMP satellite (Maus 2007a) and the US/Argentinian/Danish SAC-C satellite, launched in July and November 2000, respectively. All three satellites carry essentially the same instrumentation and provide high-quality and high-resolution magnetic field observations from space. They sense the various internal and external
Fig. 6 Left: The path of a satellite at inclination i in orbit around the Earth. Right: Ground track of 24 h of the Ørsted satellite on January 2, 2001 (yellow curve). The satellite starts at −57°N, 72°E at 00 UT, moves northward on the morning side of the Earth, and crosses the Equator at 58°E (yellow arrow). After crossing the polar cap it moves southward on the evening side and crosses the Equator at 226°E (yellow open arrow) 50 min after the first Equator crossing. The next Equator crossing (after an additional 50 min) is at 33°E (red arrow), 24° westward of the first crossing 100 min earlier, while moving again northward
field contributions differently, due to their different altitudes and drift rates through local time. A closer look at the characteristics of satellite data sampling is helpful. A satellite moves around Earth in an elliptical orbit. However, the ellipticity of the orbit is small for many of the satellites used for field modeling, and for illustration purposes we will concentrate on circular orbits. As sketched on the left side of Fig. 6, orbit inclination $i$ is the angle between the orbit plane and the equatorial plane. A perfectly polar orbit implies $i = 90^\circ$, but for practical reasons most satellite orbits have inclinations that are different from $90^\circ$. This results in “polar gaps,” which are regions around the geographic poles that are left unsampled. The right part of Fig. 6 shows the ground track of 1 day (January 2, 2001) of Ørsted satellite data. It is obvious that the coverage in latitude and longitude provided even by only a few days of satellite data is much better than that of the present ground-based observatory network (cf. Fig. 4). The polar gaps, the regions of half-angle $|90^\circ - i|$ around the geographic poles, are obvious when looking at the orbits in a polar view in an Earth-fixed coordinate system, as done in the left part of Fig. 7 for the Ørsted (red), resp. CHAMP (blue), satellite tracks of January 2, 2001. The polar gaps are larger for Ørsted (inclination $i = 97^\circ$) compared to CHAMP ($i = 87^\circ$), as confirmed by the figure. As mentioned before, internal magnetic field sources are often fixed (or slowly changing, in the case of the core field) with respect to the Earth while most external fields have relatively fixed geometries with respect to the Sun. A good description of the various field contributions requires a good sampling of the data in the respective coordinate systems.
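The polar gap half-angle $|90^\circ - i|$ discussed above is easy to evaluate for the two inclinations quoted in the text. The sketch below also estimates, under the assumption of a spherical Earth, which fraction of the surface lies inside the two polar caps; the function names are hypothetical.

```python
import math

def polar_gap_half_angle(inclination_deg):
    """Half-angle |90 deg - i| of the unsampled cap around each geographic pole."""
    return abs(90.0 - inclination_deg)

def unsampled_fraction(inclination_deg):
    """Fraction of a sphere's area inside both polar gaps.

    Each cap of half-angle g covers 2*pi*R^2*(1 - cos g); two caps divided by
    the full area 4*pi*R^2 give 1 - cos g.
    """
    gap = math.radians(polar_gap_half_angle(inclination_deg))
    return 1.0 - math.cos(gap)

gap_oersted = polar_gap_half_angle(97.0)  # Ørsted inclination quoted in the text
gap_champ = polar_gap_half_angle(87.0)    # CHAMP inclination quoted in the text
```

With these inclinations the half-angle is 7° for Ørsted and 3° for CHAMP, and in both cases the unsampled caps amount to less than 1 % of the sphere.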
Fig. 7 Left: Ground track of 1 day (January 2, 2001) of the Ørsted (red) and CHAMP (blue) satellites in dependence on geographic coordinates. Right: Orbit in the solar-magnetospheric (SM) reference frame
Good coverage in latitude and longitude is essential for modeling the internal field. There are, however, pitfalls due to peculiarities of the satellite orbits, which may result in less than optimal sampling. The top panel of Fig. 8 shows the longitude of the ascending node (the equator crossing of the satellite going from south to north) of the Ørsted (left) and CHAMP (right) satellite orbits. Depending on orbit altitude (shown in the middle part of the figure) there are periods with pronounced “revisiting patterns”: In June and July 2003, for instance, the CHAMP satellite samples the field near the equator only at longitudes φ = 7.6°, 19.2°, 30.8°, …, 356.0°. This longitudinal sampling of Δφ = 11.6° hardly allows features of the field with spatial scales corresponding to spherical harmonic degrees above n = 15 to be resolved. Another issue that has to be considered when deriving field models from satellite data concerns satellite altitude. The middle panel of Fig. 8 shows the altitude evolution for the Ørsted and CHAMP satellites. Various altitude maneuvers are the reason for the sudden increases in the altitude of CHAMP. At lower altitudes the magnetic signal of small-scale features of the internal magnetic field (corresponding to higher spherical harmonic degrees) is amplified more strongly than that of large-scale features (represented by low-degree spherical harmonics). The crustal field signal measured by a satellite is thus normally stronger towards the end of the mission lifetime due to the lower altitude. However, if the crustal field is not properly accounted for, the decreasing altitude may hamper the determination of the core field time changes. Good sampling in the Earth-fixed coordinate system, which is essential for determining the internal magnetic field, can, at least in principle, be obtained from a few days of satellite data. However, good sampling in Sun-fixed coordinates is required for a reliable determination of the external field contributions. 
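The two sampling effects discussed above can be put into rough numbers. The sketch below is illustrative only: it assumes a Nyquist-type rule (at least two samples per wavelength 360°/n along the equator) to relate track spacing to the highest resolvable degree, and it uses the standard (a/r)^(n+2) radial dependence of the degree-n internal field components. The reference radius and the two altitudes are assumed values chosen for illustration, not mission data, and all function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.2  # assumed reference radius a


def tracks_per_revisit(delta_lon_deg):
    """Approximate number of equally spaced equator crossings in a revisiting pattern."""
    return round(360.0 / delta_lon_deg)


def max_resolvable_degree(delta_lon_deg):
    """Highest degree n with at least two samples per wavelength 360/n (assumed rule)."""
    return math.floor(180.0 / delta_lon_deg)


def relative_gain(n, h_high_km, h_low_km, a_km=EARTH_RADIUS_KM):
    """Growth factor of a degree-n internal field term when the satellite
    descends from h_high_km to h_low_km, using the (a/r)**(n+2) dependence."""
    return ((a_km + h_high_km) / (a_km + h_low_km)) ** (n + 2)


spacing_deg = 11.6  # longitudinal spacing quoted in the text
print("tracks:", tracks_per_revisit(spacing_deg))
print("max degree:", max_resolvable_degree(spacing_deg))
for n in (1, 15, 60):
    # 450 km and 350 km are illustrative altitudes only.
    print("degree", n, "gain", round(relative_gain(n, 450.0, 350.0), 2))
```

For the 11.6° spacing the assumed rule gives n = 15, in line with the limit quoted in the text. The gain increases with n, which is the sense in which small-scale (crustal) signals are amplified more strongly as the orbit decays.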
The right part of Fig. 7 shows the distribution of the Ørsted, resp. CHAMP, satellite
Sources of the Geomagnetic Field and the Modern Data That Enable Their. . .
Fig. 8 Some orbit characteristics of the Ørsted (left) and CHAMP (right) satellites as a function of time. Top: Longitude of the ascending node, illustrating longitudinal “revisiting patterns.” Middle: Mean altitude. Bottom: Local time of the ascending (red) and descending (blue) nodes
data of January 2, 2001, in the SM coordinate system, i.e., as a function of the distance from the geomagnetic North pole (which is at the center of the plot) and magnetic local time (MLT). Despite the rather good sampling in the geocentric frame (left panel), the distribution in the Sun-fixed system (right panel) is rather coarse, especially when only data from one satellite are considered. Data obtained at different local times are essential for a proper description of external fields. The bottom panel of Fig. 8 shows how the (geographic) local time of the satellite orbits
changes through the mission lifetime. The Ørsted satellite scans all local times within 790 days (2.2 years), while the local time drift rate of CHAMP is much higher: CHAMP covers all local times within 130 days. Combining observations from different spacecraft flying at different local times helps to improve data coverage in the various coordinate systems. In particular, the Swarm satellite constellation mission (Friis-Christensen et al. 2006, 2009), consisting of three satellites launched in November 2013, has been specifically designed to reduce the time-space ambiguity that is typical of single-satellite missions.
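The quoted coverage times translate into mean local time drift rates. The sketch below assumes, as an interpretation that is not stated in the text, that all local times are covered once the orbit plane has drifted by 12 h, because the ascending and descending nodes are about 12 h apart in local time. The function name is hypothetical.

```python
def mean_local_time_drift(days_for_full_coverage, hours_of_drift=12.0):
    """Mean local time drift rate in minutes per day.

    hours_of_drift = 12 is an assumption: ascending and descending nodes
    together cover 24 h of local time after a 12 h drift of the orbit plane.
    """
    return hours_of_drift * 60.0 / days_for_full_coverage


# Coverage times as quoted in the text.
for name, days in (("Ørsted", 790.0), ("CHAMP", 130.0)):
    print(f"{name}: about {mean_local_time_drift(days):.2f} min/day")
```

Under that assumption the drift is roughly 0.9 min/day for Ørsted and 5.5 min/day for CHAMP, i.e., about six times faster for CHAMP.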
3 Making the Best of the Data to Investigate the Various Field Contributions: Geomagnetic Field Modeling
Using the observations of the magnetic field described in the previous section to identify the various magnetic fields described in Sect. 1 is the main purpose of geomagnetic field modeling. This requires the use of mathematical representations of such fields in both space and time. The mathematical tools that make such a representation in space possible are described in the accompanying chapter by Sabaka et al. (2010). But because the fields vary in time, some temporal representations are also needed. Using such spatio-temporal representations formally makes it possible to represent all the fields that contribute to the observed data in the form of a linear superposition of elementary functions. The set of numerical coefficients that define this linear combination is then what one refers to as a geomagnetic field model. It can be recovered from the data via inverse theory, which in turn makes it possible to identify the various field contributions. In practice, however, one has to face many pitfalls, not least because the data are limited in number and not ideally distributed. In particular, although satellite data are numerous, their usefulness is limited by the time needed for satellites to complete an orbit, during which some of the fields can change significantly. This can then translate into some ambiguity in terms of the spatial/temporal representations. But advantage can be taken of the known spatiotemporal properties of the various fields described in Sect. 1 and of the combined use of ground and satellite data. Fast-changing fields, with periods up to typically a month, are for instance known to be mainly of external origin (both ionospheric and magnetospheric), but with some electrically induced internal fields. Those are best identified with the help of observatory data, which can be temporally band filtered, and then spatially analyzed with the help of the tools described in Sabaka et al. (2010). 
This then makes it possible to identify the contribution from sources above and below the Earth’s surface. The relative magnitude and temporal phase shifts between the (induced) internal and (inducing) external fields can then be computed, which provide very useful information with respect to the distribution of the electrical conductivity within the solid Earth (see e.g., Constable 2007; Kuvshinov 2008). Satellite data can also be used for similar purposes, but this is a much more difficult endeavor since, as we already noted, one then has to deal with additional space/time separation
issues related to the fact that satellites sense both changes due to their motion over stationary sources (such as the crustal field) and true temporal field changes. Much effort is currently devoted to dealing with those issues and making the best of such data for recovering the solid Earth electrical conductivity distribution, with encouraging preliminary successes (see e.g., Kuvshinov and Olsen 2006), especially in view of the Swarm satellite constellation mission (Kuvshinov et al. 2006). As a matter of fact, and for the time being, satellite data turn out to be most useful for the investigation of the field of internal origin (the core field and the crustal field). But even recovering those fields requires considerable care and advanced modeling strategies. In principle, and as explained in Sabaka et al. (2010), full vector measurements can be combined with ground-based vector data to infer the field of internal origin, the E-region ionospheric field, the local F-region ionospheric field, and the magnetospheric field. But there are many practical limits to this possibility, again because satellites do not provide instantaneous sets of measurements on a sphere at all times, and also because the data distribution at the Earth’s surface is quite sparse. This sets a limit on the quality of the E-region ionospheric field one can possibly hope to recover by a joint use of ground-based and satellite data. But appropriate knowledge of the spatiotemporal behavior of each type of source can again be used. This basically leads to the following two possible strategies to infer the contribution of each field from satellite data. A first strategy consists in acknowledging that the field due to the nonpolar ionospheric E-region is weak at night, especially at so-called magnetically quiet times (as may be inferred from ground-based magnetic data), and selecting satellite data accordingly, so as to minimize contributions from the ionosphere. 
Those satellite data can then be used alone to infer both the field of internal origin and the field of external (then only magnetospheric) origin (though this usually still requires some care when dealing with polar latitude data, because those are always, even at night and during quiet conditions, affected by some ionospheric and local field-aligned currents). This is a strategy that can be used to focus on the field of internal origin, and in particular the crustal field (see e.g., Maus et al. 2008). A second strategy consists in making use of both observatory data and satellite data, and simultaneously parameterizing the spatial and temporal behavior of as many sources as possible. This strategy has been used in particular to improve the recovery of the core field and its slow secular changes (see e.g., Thomson and Lesur 2007; Lesur et al. 2008; Olsen et al. 2009), but can more generally be used to try to recover all field sources simultaneously, using the so-called Comprehensive Modeling approach (Sabaka et al. 2002, 2004), to investigate the temporal evolution of all fields over long periods of time when satellite data are available. This strategy looks particularly promising in view of the Swarm satellite constellation mission (Sabaka and Olsen 2006). Finally, it is worth pointing out that, whatever strategy is used, residuals from the modeled fields may always be used to investigate additional nonmodeled sources, such as local ionospheric F-region sources (Lühr and Maus 2006; Lühr et al. 2002), to which satellite data are very sensitive.
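The notion, introduced at the start of this section, of a field model as the set of coefficients of a linear superposition of elementary functions recovered from data by inversion can be illustrated with a deliberately minimal sketch. It fits a constant plus a linear trend in time (two basis functions) to synthetic values by ordinary least squares. The basis, the numbers, and the function name are hypothetical and far simpler than any real spatio-temporal field model.

```python
def fit_linear_model(times, data):
    """Least-squares estimate of (m0, m1) in d(t) = m0 * 1 + m1 * t.

    Solves the 2x2 normal equations directly; the two basis functions
    1 and t stand in for the 'elementary functions' of a field model.
    """
    n = len(times)
    s_t = sum(times)
    s_tt = sum(t * t for t in times)
    s_d = sum(data)
    s_td = sum(t * d for t, d in zip(times, data))
    det = n * s_tt - s_t * s_t
    m0 = (s_d * s_tt - s_t * s_td) / det
    m1 = (n * s_td - s_t * s_d) / det
    return m0, m1


# Synthetic, noise-free data: a static part plus a slow linear change
# (values are arbitrary and only serve to exercise the fit).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
data = [30000.0 + 50.0 * t for t in times]
m0, m1 = fit_linear_model(times, data)
print(f"static part: {m0:.1f}, trend: {m1:.1f} per unit time")
```

Real field models use many more basis functions (for instance spherical harmonics in space and splines in time) and regularized inversions, but the structure, data expressed as a linear combination whose coefficients constitute the model, is the same.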
Considerably more detail about all these strategies and other possible future strategies can be found in, e.g., Hulot et al. (2007), Olsen et al. (2010a), Lesur et al. (2011), and Schott and Thebault (2011), to which the reader is referred and where many more references can be found.
Acknowledgements This is IPGP contribution 2595 (updated).
References Alexandrescu MM, Gibert D, Le Mouël JL, Hulot G, Saracco G (1999) An estimate of average lower mantle conductivity by wavelet analysis of geomagnetic jerks. J Geophys Res 104: 17735–17745 Cain JC (2007) POGO (OGO-2, -4 and -6 spacecraft). In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg, pp 828–829 Campbell WH (1989) The regular geomagnetic field variations during quiet solar conditions. In: Jacobs JA (ed) Geomagnetism, vol 3. Academic, London, pp 385–460 Constable S (2007) Geomagnetic induction studies. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 237–276 Cowling TG (1957) Magnetohydrodynamics. Wiley Interscience, New York Dunlop D, Özdemir Ö (2007) Magnetizations in rocks and minerals. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 277–336 Finlay CF, Dumberry M, Chulliat A, Pais A (2010) Short timescale core dynamics: theory and observations. Space Sci Rev 155:177–218. doi:10.1007/s11214-010-9691-6 Friis-Christensen E, Lühr H, Hulot G (2006) Swarm: a constellation to study the Earth’s magnetic field. Earth Planets Space 58:351–358 Friis-Christensen E, Lühr H, Hulot G, Haagmans R, Purucker M (2009) Geomagnetic research from space. EOS Trans AGU 90(25):213–215 Gubbins D, Zhang K (1993) Symmetry properties of the dynamo equations for paleomagnetism and geomagnetism. Phys Earth Planet Int 75:225–241 Hemant K, Maus S (2005) Geological modeling of the new CHAMP magnetic anomaly maps using a geographical information system technique. J Geophys Res 110:B12103. doi:10.1029/2005JB003837 Hulot G, Le Mouël JL (1994) A statistical approach to the Earth’s main magnetic field. Phys Earth Planet Int 82:167–183. doi:10.1016/0031-9201(94)90070-1 Hulot G, Sabaka TJ, Olsen N (2007) The present field. In: Kono M (ed) Treatise on geophysics, vol 5. 
Elsevier, Amsterdam, pp 33–75 Hulot G, Olsen N, Thébault E, Hemant K (2009) Crustal concealing of small-scale core-field secular variation. Geophys J Int 177:361–366. doi:10.1111/j.1365-246X.2009.04119.x Hulot G, Finlay C, Constable C, Olsen N, Mandea M (2010) The magnetic field of planet Earth. Space Sci Rev 152:159–222. doi:10.1007/s11214-010-9644-0 Jackson A, Finlay CC (2007) Geomagnetic secular variation and its application to the core. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam Jacobs JA (1953) The earth’s inner core. Nature 172:297–300 Jankowski J, Sucksdorff C (1996) IAGA guide for magnetic measurements and observatory practice. IAGA, Warszawa Kivelson MG, Russell CT (1995) Introduction to space physics. Cambridge University Press, Cambridge Kuvshinov A (2008) 3-D global induction in the oceans and solid earth: recent progress in modeling magnetic and electric fields from sources of magnetospheric, ionospheric and oceanic origin. Surv Geophys 29(2):139–186
Kuvshinov A (2012) Deep electromagnetic studies from land, sea, and space: progress status in the past 10 years. Surv Geophys 33:169–209. doi:10.1007/s10712-011-9118-2 Kuvshinov AV, Olsen N (2005) 3D modeling of the magnetic field due to ocean flow. In: Reigber C, Lühr H, Schwintzer P, Wickert J (eds) Earth observation with CHAMP, results from three years in orbit. Springer, Berlin Kuvshinov AV, Olsen N (2006) A global model of mantle conductivity derived from 5 years of CHAMP, Ørsted, and SAC-C magnetic data. Geophys Res Lett 33:L18301. doi:10.1029/2006GL027083 Kuvshinov AV, Sabaka TJ, Olsen N (2006) 3-D electromagnetic induction studies using the Swarm constellation. Mapping conductivity anomalies in the Earth’s mantle. Earth Planets Space 58:417–427 Langel RA, Hinze WJ (1998) The magnetic field of the Earth’s lithosphere: the satellite perspective. Cambridge University Press, Cambridge Langel RA, Estes RH, Mead GD (1982) Some new methods in geomagnetic field modeling applied to the 1960–1980 epoch. J Geomagn Geoelectron 34:327–349 Lesur V, Jackson A (2000) Exact solution for internally induced magnetization in a shell. Geophys J Int 140:453–459 Lesur V, Wardinski I, Rother M, Mandea M (2008) GRIMM: the GFZ reference internal magnetic model based on vector satellite and observatory data. Geophys J Int 173:382–394 Lesur V, Olsen N, Thomson AW (2011) Geomagnetic core field models in the satellite era. In: Mandea M, Korte M (eds) Geomagnetic observations and models. IAGA special Sopron book series, chap 11, vol 5. Springer, Heidelberg, pp 277–294. doi:10.1007/978-90-481-9858-0_11 Lowes FJ (1966) Mean-square values on sphere of spherical harmonic vector fields. J Geophys Res 71:2179 Lühr H, Maus S (2006) Direct observation of the F region dynamo currents and the spatial structure of the EEJ by CHAMP. Geophys Res Lett 33:L24102. doi:10.1029/2006GL028374 Lühr H, Maus S, Rother M (2002) First in situ observation of night-time F region currents with the CHAMP satellite. 
Geophys Res Lett 29(10). doi:10.1029/2001GL013845 Maus S (2007a) CHAMP magnetic mission. In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg, pp 59–60 Maus S (2007b) Electromagnetic ocean effects. In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg Maus S, Haak V (2003) Magnetic field annihilators: invisible magnetization at the magnetic equator. Geophys J Int 155:509–513 Maus S, Lühr H (2005) Signature of the quiet-time magnetospheric magnetic field and its electromagnetic induction in the rotating Earth. Geophys J Int 162:755–763 Maus S, Lühr H (2006) A gravity-driven electric current in the Earth’s ionosphere identified in CHAMP satellite magnetic measurements. Geophys Res Lett 33:L02812. doi:10.1029/2005GL024436 Maus S, Yin F, Lühr H, Manoj C, Rother M, Rauberg J, Michaelis I, Stolle C, Müller R (2008) Resolution of direction of oceanic magnetic lineations by the sixth-generation lithospheric magnetic field model from CHAMP satellite magnetic measurements. Geochem Geophys Geosyst 9(7):Q07021. doi:10.1029/2008GC001949 Merrill R, McFadden P, McElhinny M (1998) The magnetic field of the earth: paleomagnetism, the core, and the deep mantle. Academic, San Diego Newitt LR, Barton CE, Bitterly J (1996) Guide for magnetic repeat station surveys. International Association of Geomagnetism and Aeronomy, Boulder Nimmo F (2007) Energetics of the core. In: Treatise on geophysics, G. Schubert (ed), vol 8. Elsevier, Amsterdam, pp 31–65 Olsen N (1997a) Ionospheric F region currents at middle and low latitudes estimated from Magsat data. J Geophys Res 102(A3):4563–4576 Olsen N (1997b) Geomagnetic tides and related phenomena. In: Wilhelm H, Zürn W, Wenzel H-G (eds) Tidal phenomena. Lecture notes in earth sciences, vol 66. Springer, Berlin/New York
Olsen N (2007a) Ørsted. In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg, pp 743–745 Olsen N (2007b) Natural sources for electromagnetic induction studies. In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg Olsen N, Lühr H, Sabaka TJ, Mandea M, Rother M, Tøffner-Clausen L, Choi S (2006) CHAOS – a model of Earth’s magnetic field derived from CHAMP, Ørsted, and SAC-C magnetic satellite data. Geophys J Int 166:67–75. doi:10.1111/j.1365-246X.2006.02959.x Olsen N, Mandea M, Sabaka TJ, Tøffner-Clausen L (2009) CHAOS-2 – a geomagnetic field model derived from one decade of continuous satellite data. Geophys J Int 179(3):1477–1487. doi:10.1111/j.1365-246X.2009.04386.x Olsen N, Hulot G, Sabaka TJ (2010a) Measuring the Earth’s magnetic field from space: concepts of past, present and future missions. Space Sci Rev 155:65–93. doi:10.1007/s11214-010-9676-5 Olsen N, Mandea M, Sabaka TJ, Tøffner-Clausen L (2010b) The CHAOS-3 geomagnetic field model and candidates for the 11th generation of IGRF. Earth Planets Space 62:719–727 Parkinson WD, Hutton VRS (1989) The electrical conductivity of the earth. In: Jacobs JA (ed) Geomagnetism, vol 3. Academic, London, pp 261–321 Purucker ME (2007) Magsat. In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg, pp 673–674 Purucker M, Whaler K (2007) Crustal magnetism. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 195–235 Purucker M, Langlais B, Olsen N, Hulot G, Mandea M (2002) The southern edge of cratonic North America: evidence from new satellite magnetometer observations. Geophys Res Lett 29(15):8000. doi:10.1029/2001GL013645 Rastogi RG (1989) The equatorial electrojet: magnetic and ionospheric effects. In: Jacobs JA (ed) Geomagnetism, vol 3. Academic, London, pp 461–525 Richmond AD (1989) Modeling the ionospheric wind dynamo: a review. 
In: Campbell WH (ed) Quiet daily geomagnetic fields. Birkhäuser Verlag, Basel Richmond AD (2002) Modeling the geomagnetic perturbations produced by ionospheric currents, above and below the ionosphere. J Geodynamics 33:143–156 Roberts PH (2007) Theory of the geodynamo. In: Treatise on geophysics, vol 8. Elsevier, Amsterdam, pp 67–106 Runcorn SK (1975) On the interpretation of lunar magnetism. Phys Earth Planet Int 10:327–335 Sabaka TJ, Olsen N (2006) Enhancing comprehensive inversions using the Swarm constellation. Earth Planets Space 58:371–395. http://www.terrapub.co.jp/journals/EPS/pdf/2006/5804/58040371.pdf Sabaka TJ, Olsen N, Purucker ME (2004) Extending comprehensive models of the Earth’s magnetic field with Ørsted and CHAMP data. Geophys J Int 159:521–547. doi:10.1111/j.1365-246X.2004.02421.x Sabaka TJ, Olsen N, Langel RA (2002) A comprehensive model of the quiet-time near-Earth magnetic field: phase 3. Geophys J Int 151:32–68 Sabaka TJ, Hulot G, Olsen N (2010) Mathematical properties relevant to geomagnetic field modeling. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, chap 17. Springer, Heidelberg, pp 504–538. doi:10.1007/978-3-642-01546-5_17 Schmucker U (1985) Magnetic and electric fields due to electromagnetic induction by external sources. In: Landolt-Börnstein, new-series, 5/2b, W. Martienssen (ed). Springer, Berlin/Heidelberg, pp 100–125 Schott J-J, Thebault E (2011) Modeling the Earth’s magnetic field from global to regional scales. In: Mandea M, Korte M (eds) Geomagnetic observations and models. IAGA special Sopron book series, chap 9, vol 5. Springer, Heidelberg. doi:10.1007/978-90-481-9858-0_2 Thebault E, Hemant K, Hulot G, Olsen N (2009) On the geographical distribution of induced time-varying crustal magnetic fields. Geophys Res Lett 36:L01307. doi:10.1029/2008GL036416 Thébault E, Purucker M, Whaler KA, Langlais B, Sabaka TJ (2010) The magnetic field of the Earth’s lithosphere. Space Sci Rev 155:95–127. doi:10.1007/s11214-010-9667-6
Thomson AWP, Lesur V (2007) An improved geomagnetic data selection algorithm for global geomagnetic field modeling. Geophys J Int 169(3):951–963 Turner GM, Rasson JL, Reeves CV (2007) Observation and measurement techniques. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam Tyler RH, Maus S, Lühr H (2003) Satellite observations of magnetic fields due to ocean tidal flow. Science 299:239–241 Voorhies CV, Sabaka TJ, Purucker M (2002) On magnetic spectra of Earth and Mars. J Geophys Res 107(E6):5034 Wicht J, Harder H, Stellmach S (2010) Numerical dynamo simulations – from basic concepts to realistic models. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics. Springer, Heidelberg
Part III Modeling of the System Earth (Geosphere, Cryosphere, Hydrosphere, Atmosphere, Biosphere, Anthroposphere)
Classical Physical Geodesy Helmut Moritz
Contents
1 Introduction
1.1 Preliminary Remarks
1.2 What is Geodesy?
1.3 Reference Systems
2 Basic Principles
2.1 Gravitational Potential and Gravity Field
2.2 The Normal Field
2.3 The Geoid and Height Systems
2.4 Gravity Gradients and General Relativity
3 Key Issues of Theory: Harmonicity, Analytical Continuation, and Convergence Problems
3.1 Harmonic Functions and Spherical Harmonics
3.2 Convergence and Analytical Continuation
3.3 More About Convergence
3.4 Krarup’s Density Theorem
4 Key Issues of Applications
4.1 Boundary-Value Problems of Physical Geodesy
4.2 Collocation
5 Future Directions
5.1 The Earth as a Nonrigid Body
5.2 The Smoothness of the Earth’s Surface
5.3 Inverse Problems
6 Conclusion
6.1 The Geoid
References
Monographs
Collections
Journal Articles
H. Moritz () Institute of Navigation, Graz University of Technology, Graz, Austria e-mail: [email protected] © Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_6
Abstract
Geodesy can be defined as the science of the figure of the Earth and its gravitational field, as well as their determination. Even though today the figure of the Earth, understood as the visible Earth’s surface, can be determined purely geometrically by satellites, using the Global Positioning System (GPS) for the continents and satellite altimetry for the oceans, it would be pretty useless without gravity. One could not even stand upright or walk without being “told” by gravity where the upright direction is. So as soon as one wants to work with the Earth’s surface, one does need the gravitational field. (Not to mention the fact that, without this gravitational field, no satellites could orbit the Earth.) To be different from the existing textbooks, a working knowledge of professional mathematics can be taken for granted. In some areas where professors of geodesy are hesitant to enter too deeply, afraid of losing their students, some fundamental problems can be studied. Of course, there is a brief introduction to terrestrial gravitation as treated in the first few chapters of every textbook of geodesy, such as gravitation and gravity (gravitation plus the centrifugal force of the Earth’s rotation), the geoid, and heights above the ellipsoid (now determined directly by GPS) and above sea level (a surprisingly difficult problem!). But then, as accuracies rise from $10^{-6}$ in 1960 (about 0.
However much the modest sand grain tries to be unobtrusive, it will introduce a singularity into the originally regular geopotential model and thus create a convergence sphere (Fig. 2). This shows that the possibility of regular analytic continuation of the geopotential, and with it convergence, is a highly complicated and unstable problem. A single sand grain may change convergence into divergence, and think of how many mass points, rocks, molecules, and electrons make up the Earth’s body and cause the most fanciful singularities.
A counterexample. 
This would indicate that for all solid bodies, consisting of many mass points, the analytical continuation of the outer potential into the interior is automatically singular. A counterexample is the homogeneous sphere of radius $R$. Its external potential is well known to be that of a point mass $M$ situated at its center:
$$V_{\mathrm{ext}} = \frac{GM}{r}, \quad r > R, \tag{37}$$
whereas the potential inside the Earth is (int = interior)
$$V_{\mathrm{int}} = 2\pi G\rho\left(R^2 - \frac{1}{3}\,r^2\right), \tag{38}$$
which satisfies Poisson’s equation $\Delta V_{\mathrm{int}} = -4\pi G\rho$. The analytical continuation of $V_{\mathrm{ext}}$ is evidently the harmonic function
$$V_{\mathrm{cont}} = \frac{GM}{r}, \quad 0 \ldots$$
… $R\}$ and having on $\tilde S$ a generalized trace $v|_{\tilde S} \in L^2(\tilde S)$. In fact, thanks to the unique solvability of the Dirichlet problem for $\Delta$ in $\tilde\Omega$ with data in $L^2(\tilde S)$ (see Cimmino (1955); McLean (2000); Miranda (2008)), we can identify $HH_0 \equiv \{v;\ \Delta v = 0 \text{ in } \tilde\Omega;\ v|_{\tilde S} \in L^2(\tilde S)\}$ with the space $H_0$ of traces of functions $v$, also extending the topology (57) to $HH_0$. So from now on we shall agree that $HH_0 \equiv H_0$. Now consider in $H_0$ the functionals
$$L_{jk}(v) = \frac{1}{4\pi}\int v(R,\sigma)\,Y_{jk}(\sigma)\,d\sigma \tag{58}$$
with $R > R_+$ as in (33); since the points $(R,\sigma)$ are in the harmonicity domain of $v$, such functionals are bounded on $H_0$. As a matter of fact, if we put
$$x \equiv (R,\sigma), \quad y \equiv (R_{\sigma'},\sigma'), \quad F(\sigma') = v|_{\tilde S} \tag{59}$$
and we call $G(x,y)$ the exterior Green function of $\tilde S$, by denoting
F. Sansò
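$$G_{n_y}(x,y) \equiv H(\sigma,\sigma'), \tag{60}$$
we have
$$v(R,\sigma) = \frac{1}{4\pi}\int H(\sigma,\sigma')\,F(\sigma')\,dS = \frac{1}{4\pi}\int H(\sigma,\sigma')\,F(\sigma')\,R_{\sigma'}^2\,J_{\sigma'}\,d\sigma'. \tag{61}$$
Therefore, putting $H_+ = \max_{\sigma,\sigma'} H(\sigma,\sigma')$, which is bounded because $R > R_+$, we have
$$|v(R,\sigma)| \le \frac{H_+\,R_+^{3/2}\,J_+}{\sqrt{4\pi}}\,\|F\|_0 \equiv K\,\|v\|_0. \tag{62}$$
Since
$$\frac{1}{4\pi}\int |Y_{jk}|\,d\sigma \le 1,$$
so that
$$|L_{jk}(v)| \le \sup_{\sigma}|v(R,\sigma)|,$$
(62) shows that $L_{jk}$ are bounded in $H_0$. More precisely we have
$$\|L_{jk}\| \le K. \tag{63}$$
Accordingly, thanks to the Riesz theorem, there are in $H_0$ functions $\{\psi_{jk}(\sigma)\}$ such that
$$L_{jk}(v) \equiv \langle \psi_{jk}, v\rangle_0, \quad \|\psi_{jk}\|_0 \le K. \tag{64}$$
We shall not need the explicit form of $\{\psi_{jk}\}$, so we will not give it here; rather we need only to be sure that such functions are linearly independent. But this is immediate because if we take the functions $s_{nm}(x) \equiv \left(\frac{R}{r}\right)^{n+1} Y_{nm}(\sigma)$, which are continuous on $\tilde S$, because $R > 0$, and therefore belong to $H_0$, we have
$$\langle \psi_{jk}, s_{nm}\rangle_0 = \frac{1}{4\pi}\int Y_{nm}(\sigma)\,Y_{jk}(\sigma)\,d\sigma = \delta_{nj}\,\delta_{mk}. \tag{65}$$
Accordingly, if we have for some constants $a_{jk}$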
Geodetic Boundary Value Problem
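$$\sum_{j,k=0}^{L} a_{jk}\,\psi_{jk} = 0,$$
we also have, $\forall (n,m)$ with $n \le L$, $|m| \le n$,
$$0 = \Big\langle \sum_{j,k=0}^{L} a_{jk}\,\psi_{jk},\ s_{nm}\Big\rangle_0 = a_{nm},$$
which proves that $\{\psi_{jk}\}$ are linearly independent. Note that, due to our previous convention, we can think of $\psi_{jk}$ as well as functions harmonic in $\tilde\Omega$. We note also that $\mathrm{Span}\{\psi_{jk};\ 0 \le j \le L\} \equiv \underline{H}_0$ is an $(L+1)^2 = N$ dimensional subspace of $H_0$, and therefore, as a finite-dimensional subspace, it is naturally closed. We will call $P_N$ the orthogonal projector on $\underline{H}_0$. We will consider also the complementary subspace $\overline{H}_0$, such that
$$H_0 = \overline{H}_0 \oplus \underline{H}_0, \quad \overline{H}_0 \perp \underline{H}_0;$$
clearly $\overline{H}_0$ is closed and the orthogonal projector on $\overline{H}_0$ is $I - P_N$. Since
$$u \in \overline{H}_0 \Leftrightarrow P_N u = 0 \Leftrightarrow \langle \psi_{jk}, u\rangle_0 = 0, \quad 0 \le j \le L,$$
the following proposition is elementary to prove.
Proposition 2. $\overline{H}_0$ is precisely the closed subspace of $H_0$ of functions such that
$$u = O\!\left(\frac{1}{|x|^{L+2}}\right), \quad |x| \to \infty.$$
Now we introduce the solution space $H_1$, defined as the subspace of $H_0$
$$H_1 \equiv \Big\{u \in H_0;\ \int_{\tilde S} |\nabla u|^2 R_\sigma^3\,d\sigma < +\infty\Big\}. \tag{66}$$
Indeed the topology in $H_1$ is defined by the norm
$$\|u\|_1 = \Big[\int_{\tilde S} |\nabla u|^2 R_\sigma^3\,d\sigma\Big]^{1/2}. \tag{67}$$
Remark 1. That $H_1$ is a subspace of $H_0$ comes from the following reasoning. $H_0$ is equivalent to $L^2(\tilde S)$, and indeed the Sobolev space $H^1(\tilde S)$ is embedded in it,
$$\|u\|_0 \le c\,\|u\|_{L^2(\tilde S)} \le c\,\|u\|_{H^1(\tilde S)}. \tag{68}$$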
On the other hand, known estimates of the Stekelov-Poincaré operator, @n u, when u is harmonic and SQ is Lipschitz, say that (McLean (2000), p. 145) 0 k u kH1 .S/ Q c k @n u kL2 .S/ Q :
Therefore, on account of the inequality j@n uj jruj; we have too 0 00 k u kH1 .S/ Q c k ru kL2 .S/ Q c k u k1 :
(69)
The relations (68) and (69) together prove that, under the present hypotheses on SQ , k u k0 A k u k1 for some constant A, i.e., H1 H0 . Since it will be useful in the sequel, a numerical determination of an upper bound of A is performed in the Appendix. Because of the above remark, the following proposition makes sense. Proposition 3. Define the subspace of H1 H 1 fu 2 H1 I
0 D 0 ; 0 j Lg I
(70)
then H 1 is a closed subspace of H1 , with the following characterization:
$$u \in H_1 : \quad u \in \overline{H}_1 \iff u = O\!\left(\frac{1}{|x|^{L+2}}\right), \qquad |x| \to \infty . \tag{71}$$
With the above definitions, we see that our SiMP can be expressed in a more synthetic form. Namely, recalling the SiMP boundary operator Bu D ru0 C 2u ;
(72) 2
the problem (48) is equivalent to: find u 2 H 1 and faj k g 2 R.LC1/ such that BujSQ D F C
L X
aj k
j;kD0
for every F 2 H0 .
jk
;
(73)
With the help of the projector PN , (73) can be even split into two parts. To do that we have to exploit the following lemma. Lemma 1. The operator B maps H 1 into H 0 , whenever L 1. Proof. Let us put v D Br D
X
xi @i u C 2u I
(74)
i
then v is harmonic in ˝. In fact, noting that xi D 0 and @i u D 0, we have, in ˝ v D 2
X
@j xi @j i u C 2u D 4u D 0:
i;j
Furthermore we note that r@r is a homogenous differential operator and Bsnm .r; / D .n 1/snm .r; / !nC1 R snm .r; / D Ynm ./ : r
(75)
Therefore in ˝ D fr Rg, setting C1 X
uD
unm snm .r; / ;
(76)
n;mDLC1
we derive
v D Bu D
C1 X
.n 1/unm snm .r; / ;
(77)
n;mDLC1
namely, v nm D .n 1/unm :
(78)
Such a relation shows first of all that vDO
1 jxjLC2
; jxj ! 1
(79)
and second that the relation Bu D v is invertible at least in ˝. But then, by the unique continuation property of harmonic functions (see Sansò and Sideris (2013), p. 623), we see that v D 0 in ˝ implies u D 0 in ˝, i.e., B is into. Finally, if u 2 H 1 H1 , since ju0 j jruj, we see that v D ru0 C2u is in L2 .SQ /, i.e., v 2 H0 . This together with (79) proves that v 2 H 0 . t u On the basis of Lemma 1, we see that 8u 2 H 1
.I PN /Bu D Bu
(80)
PN Bu D 0 :
(81)
and 8u 2 H 1
So if we multiply (73) by .I PN / and PN , respectively, we find .I PN /BujSQ D BujSQ D .I PN /F F ; PN Bu D 0 D PN F C
L X aj k
jk
(82)
:
(83)
Given any F 2 H0 (83) determines uniquely faj k g because f j k g are a finite set of linearly independent functions. On the other hand, we can observe that F D .I PN /F can be seen as a function in H 0 , so our SiMP is reduced to find a solution of BujSQ D F
(84)
with F 2 H 0 and u 2 H 1 . Since B maps H 1 into H 0 , which we have already seen, the problem now is to prove that B is also onto. We will do it in the next paragraph by showing that if v 2 H 0 ; vjSQ D F , then there is a solution u of (84) harmonic in ˝ and such that k u k1 < C1.
4 Solution of the SiMP and of the LSMP
The first part of this section is devoted to prove that the SiMP, represented by (48) and subsequently summarized in (84), has one and only one solution in H 1 for every datum F in H 0 . Then, in the last part of the section, we generalize the theorem of existence and uniqueness to the LSMP, by a simple perturbative argument. Theorem 1. The SiMP, formulated in (48), has one and only one solution 2
u 2 H 1 ; faj k I u j L; j K j g 2 R.LC1/ ;
(85)
for every datum F 2 H 0 , on the condition that 3JC COL < 1 ;
(86)
where (recall (50), (51) and (56) for the definition of the symbols)
$$C_{0L} = \left(\frac{\delta R}{R_+} + \frac{1}{L+2}\right) J_+ . \tag{87}$$
Moreover, under (86), the solution depends continuously on the data, since
$$\|u\|_1 \le C_L\,\|F\|_0 , \tag{88}$$
$$C_L = \frac{J_+}{1 - 3 J_+ C_{0L}} \tag{89}$$
and p p jaj k j 4 RC
R R
!LC1 k F k0 :
(90)
Proof. We already know from the previous section that the SiMP can be decomposed into a finite system (see (83)) j L X X
aj k
jk
D PN F
(91)
j D0kDj
and the equation u 2 H 1 ; BujSQ D F .I PN /F : As for (91) we can exploit the bi-orthogonality of f j C1 R Yj k ./g, (see (65)), to get r
(92) jkg
with fsj k
D
jmj n ; 0 n L I anm D< snm ; PN F >0 : Therefore janm j k snm k0 k F k0 :
(93)
On the other hand Z k
snm k20
D
Ynm ./
4
2
R R
!2nC2
R R
R d
!2LC2 RC I
(94)
by using (94) in (93) one gets (90). We come now to (92). We know from Lemma 1 that B maps continuously H 1 into H 0 . We further on prove that 8u 2 H 1 , if (86) holds, the majorization k u k1 CL k Bu k0 D CL k F k0
(95)
is true. Then (95) implies that the image of B is closed in H 0 . In the second part of the theorem, we will show that there is a subset of the image of B, which is dense in H 0 . Therefore B is into and onto H 0 so that the solution of (92) exists and is unique, 8F 2 H 0 . Since F D .I PN /F , i.e., k F k0 k F k0 , existence and uniqueness of the solution of the SiMP will result; the majorization (88) then is nothing but (95). So we start from proving (95). We will apply the so-called energy identity, already used in Hörmander (1976) and Sansò and Venuti (2008). A short way to get it is to apply the differential identity r fxjruj2 2ru0 ru urug D 0
(96)
(see Jerison and Kenig (1982), p. 37), valid for every u harmonic in ˝, and use the Gauss theorem, noting also that r ndS D R3 d . So we get Z 2 (97) k u k1 D jruj2 R3 d D Z D
Z 2ru0 un dS C Z
2
uun dS Z
F un dS 3
uundS
where we have denominated ˇ F D ru0 C 2u D BuˇSQ :
(98)
We further note that Z Z j F un dS j D j F ./un ./R2 J d j
(99)
JC k F k0 k un R k0 JC k F k0 k u k1
as well as (see (131) in the Appendix) Z j
uun dS j JC k u k0 k u k1
(100)
JC COL k u k21 : Putting (97), (99), and (100) together, we obtain k u k21 JC k F k0 k u k1 CJC COL k u k21 ;
(101)
which simplified and solved for k u k1 , if (86) is satisfied, yields (95), with CL as in (89). The geometric meaning of (86) will be discussed later on. We come now to the second part of the proof. Let us consider the space of finite linear combinations of the solid spherical harmonics, i.e., Q D Span fsn;m .x/ I jmj u; :n D 0; : : :g H.˝/
(102)
Since fsnm .x/g and fjrsnm .x/jg are smooth functions in ˝, and in particular bounded on SQ , it is obvious that both relations Q H0 ; H.˝/ Q H1 ; H.˝/
(103)
hold true. Q the harmonics up to degree L; namely we define Now we take away from H.˝/ Q D Span fsnm .x/ I jmj n ; n D L C 1; : : :g I H.˝/
(104)
Q H 0 ; H.˝/ Q H1 : H.˝/
(105)
it is clear that
On the other hand, on account of the relation (75), Bsnm D .n 1/snm ;
(106)
Q with itself. we see that, since L > 1 by hypothesis, B is an isomorphism of H.˝/ Q We shall prove that H.˝/ is dense in H 0 , and so, by the above comment, the theorem will be completely proved. Q is dense in H0 , We achieve our result in two steps: first we will prove that H.˝/ which is well known but will be repeated here for the sake of completeness; then we Q is dense in H 0 . shall constructively prove that H.˝/ As for the first part, we note that it is enough to show that if u 2 H0 is also such that u?H, i.e.,
< u; snm >0 D 0 ;
8n; m;
(107)
then necessarily u D 0. In fact (107) can be written too as Z Ynm ./ u.R ; / nC1 R d D 0 I R
(108)
so if we call u.R ; / ; JR
(109)
Ynm ./ dS D 0 : RnC1
(110)
./ D (108) becomes
Z ./ SQ
Multiplying (110) by Z w.r; 0 / D
rn 0 2nC1 Ynm . /,
with r < R, and adding, we get
1 ./ p 2 2 r C R 2rR cos
0
dS D 0 ;
(111)
meaning that the single layer potential w.r; / is identically zero in a neighborhood of the origin. Whence w.x/ is zero everywhere inside SQ . On the other hand we know that (McLean (2000), Theorem 6.11) w.x/ satisfies the jump relation ˇ ˇ ˇ @w ˇˇ @w ˇˇ ˇ wC jSQ w SQ D 0 ; D ; @nC ˇSQ @n ˇSQ
(112)
if SQ is at least Lipschitz, as we have supposed, and is at least in H 1=2 .SQ /, what is the case because in the present application is in L2 .SQ /. The first of (112) tells us that wC jSQ D 0, namely, that w.x/ is zero everywhere, but for SQ itself. The second relation then implies that D0)uD0; as it was to be proved. Q that it is to a distance Now take any u 2 H 0 ; we prove that there is a v 2 H.˝/ Q such that, for closer than a fixed > 0 from u in H0 . We start by finding vQ 2 H.˝/ an arbitrary " > 0, k u vQ k0 < " ; Q in H0 . what is always possible by virtue of the density of H.˝/
(113)
Naturally in general vQ does not belong to H 0 , but we can put vQ D
L X
N X
cnm snm C
n;mD0
cnm snm D vQ L C v
(114)
n;mDLC1
Q with v 2 H.˝/. Now, for all m and n L, since u 2 H 0 , cnm D
0 ;
(115)
jcnm j K k u vQ k0 K" :
(116)
k snm k0 C
(117)
k vL k .L C 1/2 KC " :
(118)
implying that (recall (57))
Since also
when n L, we have too
Accordingly we have, with v 2 H 0 , k u v k0 Dk u vQ C vQ L k0 .1 C .L C 1/2 KC /" :
(119)
Therefore it is enough to choose " such that .1 C .L C 1/2 KC /" < and the proof of the theorem is complete. t u We can now pass to the analysis of the LSMP, that was formulated by (47) as 8 ˆ < u D 0 P Bu D f C aj k ˆ 1 :u D O jxjLC2
jk
in ˝Q Du on SQ jxj ! 1 ;
with f ./ D R g./ ; D D R " r
j"j "C j j C :
(120)
The first remark is that, as we have done for the SiMP, such a problem can be reformulated with the help of the projector PN , as: find u 2 H 1 such that Bu D .I PN /.f Du/
(121)
and numbers faj k g such that L X
aj k
jk
D PN .f Du/ :
(122)
j kD0
The idea is to determine conditions for the existence of u 2 H 1 as solution of (121), when f 2 H 0 and then substitute u in (122), to determine faj k g. Theorem 2. Assume that f 2 H 0 and that the condition JC Œ.3 C C /COL C "C < 1
(123)
is satisfied. Then there is one and only one solution u 2 H 1 of (121) satisfying the inequality k u k1
CL k f k0 I 1 CL ."C C C COL /
(124)
moreover, since k Du k0 < C1, the Eq. (122) determines uniquely the constants faj k g. Proof. We first observe that (90) is identical with (48) if we put F D f Du D f .R " ru u/ :
(125)
Accordingly, since if (123) is satisfied, a fortiori (86) is satisfied too, we can claim that the Eq. (92) is equivalent to (121), and its solution, in force of (88), satisfies the inequality k u k1 CL k f k0 CCL k r" ru k0 CCL k u k0 :
(126)
On the other hand, the following inequalities hold k r" ru k0 "C k u k1 k u k0 C k u k0 C COL k u k1 : Summarizing we obtain the following a priori majorization for the solution of (121)
k u k1 CL k f k0 CCL ."C C C COL / k u k1 :
(127)
Now if the condition CL ."C C C COL / < 1
(128)
is satisfied, we find that the operator 1
B 1 D W H ! H
1
is just a contraction, i.e., k B 1 D k< 1, so that (121) has one and only one solution for every f 2 H 0 . Actually, recalling that CL D
JC ; 1 3JC COL
we see that (128) is equivalent to (123). Finally (124) descends immediately from (127). The proof is complete. t u
5 Conclusions
The proofs reported in Sect. 4 are largely based on the paper Sansò and Venuti (2008). However if one compares the present article with the above, one can realize that the proof of the existence of the solution has been added and, maybe even more important, that almost a factor of 2 has been gained in the number of harmonic coefficients to be provided as input data. Roughly speaking, we can summarize the present understanding of the LSMP by saying that if a telluroid with a maximum inclination of 60ı is given and a datum f 2 L2 .SQ / plus the first 12 degrees of the asymptotic expansion of the potential, Q In fact, going to the key condition (123) then a unique solution exists in H 1;2 .S/. and substituting IC D 60ı ; JC D 2; "C Š 0:0067; C Š 0:0134; L D 12; RC D a C 6 km Š 6;384 km; ıR D a b C 6 km Š 26:5 km; one obtains 1 ıR JC .3 C C / JC C "C D 0:9244 < 1 I C RC LC2
in reality the limit value is $I_+ = 61.5^\circ$. Note that the 6 km added to the equatorial radius corresponds roughly to the height of the Chimborazo mountain, which is almost on the equator. To the knowledge of the author, this is the best result known on this problem. However it can be argued that, possibly applying some interpolation theory to the majorization of the $L^2(\tilde S)$ norm with that of $H^{1,2}(\tilde S)$, one could obtain geometric conditions on $\tilde S$ in the form of some norm $L^p(\tilde S)$ of $J = (\cos I)^{-1}$, as has been done in Hörmander's paper (Hörmander (1976)). One has the impression that, although $I_+ \le 60^\circ$ is not very restrictive, a result in the mean could be the best. A final remark on the choice of $L^2(\tilde S)$ as topology for the data is that with such a space it is easy to translate the result to a space of generalized random fields, including a white noise on $\tilde S$ (see Grothaus and Raskop (2010)). This means that our solution should be stable even under a white noise perturbation. But the analysis of this argument is out of the scope of the present work.
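The numerical check quoted above can be reproduced in a few lines. The following is only a sketch under stated assumptions: the function names are illustrative, `eta_plus` is a name chosen here for the bound quoted as 0.0134, and the relation $J = 1/\cos I$ is taken from the remark on $J = (\cos I)^{-1}$. It evaluates the constants (87) and (89) and the left-hand sides of the conditions (86), (123) and (128) for the values given in the text.

```python
import math

def c0l(delta_r, r_plus, degree_l, j_plus):
    """Constant C_0L of Eqs. (87)/(132)."""
    return (delta_r / r_plus + 1.0 / (degree_l + 2)) * j_plus

def simp_condition(j_plus, c_0l):
    """Left-hand side of condition (86); it must be < 1."""
    return 3.0 * j_plus * c_0l

def lsmp_condition(j_plus, c_0l, eps_plus, eta_plus):
    """Left-hand side of condition (123); it must be < 1."""
    return j_plus * ((3.0 + eta_plus) * c_0l + eps_plus)

def stability_constant(j_plus, c_0l):
    """C_L of Eq. (89); only meaningful when (86) holds."""
    return j_plus / (1.0 - 3.0 * j_plus * c_0l)

# Values quoted in the text (lengths in km).
inclination_deg = 60.0
j_plus = 1.0 / math.cos(math.radians(inclination_deg))  # J = 1/cos I
eps_plus = 0.0067
eta_plus = 0.0134   # name assumed for this sketch
degree_l = 12
r_plus = 6384.0
delta_r = 26.5

c_0l = c0l(delta_r, r_plus, degree_l, j_plus)
lhs_86 = simp_condition(j_plus, c_0l)
lhs_123 = lsmp_condition(j_plus, c_0l, eps_plus, eta_plus)
c_l = stability_constant(j_plus, c_0l)
lhs_128 = c_l * (eps_plus + eta_plus * c_0l)

print(f"C_0L  = {c_0l:.4f}")
print(f"(86)  : {lhs_86:.4f}")
print(f"(123) : {lhs_123:.4f}")
print(f"(128) : {lhs_128:.4f}")
```

With these inputs the left-hand side of (123) evaluates to about 0.9244, the figure quoted in the text, and (123) and (128) are satisfied together, as the proof of Theorem 2 states.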
Appendix In this appendix we aim to prove the subsequent proposition. Proposition 4. Let u 2 H 1 , namely, Z k u k1 D
SQ
and
jruj2 R2 d
12 < C1
(129)
Z
0 D
u.R; /Yj k ./d D 0
(130)
0 j L ; j k j I then Z k u k0 D
12
2
u .R ; /R d
COL k u k1
(131)
where
$$C_{0L} = \left(\frac{\delta R}{R_+} + \frac{1}{L+2}\right) J_+ . \tag{132}$$
Proof. Let us put uC ./ D u.RC ; /
(133)
and observe that k u k0 k u uC k0 C k uC k0 12 Z 2 Œu.R ; / u.RC ; / R d Z
12
2
C
(134)
u.RC ; / R d
:
On the other hand, we can write ˇZ ˇ ju uC j2 D ˇˇ
RC R
ˇ2 Z ˇ u0 dr ˇˇ
ıR R RC
Z
RC
RC
u02 rdr
R
1 1 R RC
jruj2 r 2 dr I
R
therefore multiplying by R and integrating over the unit sphere, we get k u uC k20
ıR RC
Z jruj2 d ˝
(135)
˝n˝C
where ˝C fRC rg
(136)
˝n˝C fR r RC g :
(137)
Furthermore, we can write k
uC k20
C1 X
2 RC 4 uC nm n;mDLC1
Z D RC
u.RC ; /2 d ;
(138)
with uC nm D
1 4
Z u.RC ; /Ynm ./d :
(139)
Therefore we can claim too that k
uC k20
C1 X 1 2 .n C 1/uC RC 4 nm LC2 n;mDLC1
! :
(140)
On the other hand it is easy to verify that Z
Z 2
jruj d ˝ D ˝C
uu SC
0
2 RC d
C1 X
D RC 4
2
.n C 1/uC nm
n;mDLC1
so that (140) can be put into the form k
uC k20
1 LC2
Z jruj2 d ˝:
(141)
˝C
Collecting (134)–(136) yields s ıR RC
sZ
1
jruj2 d ˝ C p LC2 s sZ ıR 1 C jruj2 d ˝ : RC LC2 ˝
k u k0
ı˝
sZ jruj2 d ˝ ˝C
So, by applying the Gauss theorem, k u k20
1 ıR C RC LC2
Z uun R2 Jd
(142)
S
COL k u k0 k un R k0 COL k u k0 k u k1 : Dividing both members of (142) by k u k0 , we get (131).
t u
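The passage from the two-term bound to the single square root in the inequality chain preceding (142) is an application of the Cauchy-Schwarz inequality in $\mathbb{R}^2$. A short worked version follows; the shorthand $a$, $b$, $x$, $y$ is introduced here only for this purpose, and the layer integral is assumed to extend over $\Omega \setminus \Omega_+$ as defined in (137).

```latex
% Shorthand (introduced for this note only):
%   a = \delta R / R_+ ,   b = 1/(L+2) ,
%   x = \int_{\Omega \setminus \Omega_+} |\nabla u|^2 \, d\Omega ,
%   y = \int_{\Omega_+} |\nabla u|^2 \, d\Omega .
\[
  \sqrt{a}\,\sqrt{x} + \sqrt{b}\,\sqrt{y}
  \;\le\; \sqrt{a+b}\,\sqrt{x+y},
  \qquad a, b, x, y \ge 0 ,
\]
\[
  x + y = \int_{\Omega} |\nabla u|^2 \, d\Omega
  \quad\Longrightarrow\quad
  \|u\|_0^2 \;\le\;
  \left( \frac{\delta R}{R_+} + \frac{1}{L+2} \right)
  \int_{\Omega} |\nabla u|^2 \, d\Omega .
\]
```

The factor in parentheses is the one that appears in (142), and multiplied by $J_+$ it gives $C_{0L}$ in (132).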
References
Cimmino G (1955) Spazi hilbertiani di funzioni armoniche e questioni connesse. In: Equazioni lineari alle derivate parziali. UMI Roma
Freeden W, Gerhards C (2013) Geomathematically oriented potential theory. Chapman & Hall/CRC/Taylor Francis, Boca Raton
Friedman A (1982) Variational principles and free boundary value problems. Wiley, New York
Grothaus M, Raskop T (2010) Oblique stochastic boundary value problems. Handbook of GeoMath, Springer, pp 1052–1076
Hörmander L (1976) The boundary problems of physical geodesy. Arch Ration Mech Anal 62:1–52
Jerison DS, Kenig C (1982) Boundary value problems on Lipschitz domains. MAA studies in mathematics, University of Pennsylvania, Minneapolis, vol 23, pp 1–68
McLean W (2000) Strongly elliptic systems and boundary integral equations. Cambridge University Press, Cambridge
Miranda C (2008) Partial differential equations of elliptic type. Springer, Berlin
Molodensky MS, Eremeev VF, Yurkina MI (1960) Methods for the study of the gravitational field of the Earth. Russian Israel Program for Scientific Translations, Jerusalem
Moser J (1966) A rapidly convergent iteration method and non-linear differential equations. Acc Sc Norm Sup Pisa 20:265–315
Otero J, Sansò F (1999) An analysis of the scalar geodetic boundary value problems with natural regularity results. J Geod 73:437–435
Sacerdote F, Sansò F (1986) The scalar boundary value problem of physical geodesy. Man Geod 11:15–28
Sansò F (1977) The geodetic boundary value problem in gravity space. Memorie Accademia nazionale dei Lincei, Rome, Italy, vol 14, ser 8, no 3
Sansò F, Sideris M (2013) Geoid determination: theory and methods. Springer, Berlin/Heidelberg
Sansò F, Venuti G (2008) On the explicit determination of stability constants for the linearized geodetic boundary value problems. J Geod 82:909–916
Time-Variable Gravity Field and Global Deformation of the Earth Jürgen Kusche
Contents
1 Introduction
2 Mass and Mass Redistribution
3 Earth Model
4 Analysis of TVG and Deformation Pattern
5 Future Directions
6 Conclusions
References
Abstract
The analysis of the Earth’s time-variable gravity field and its changing patterns of deformation plays an important role in Earth system research. These two observables provide, for the first time, a direct measurement of the amount of mass that is redistributed at or near the surface of the Earth by oceanic and atmospheric circulation and through the hydrological cycle. In this chapter, we will first reconsider the relations between gravity and mass change. We will in particular discuss the role of the hypothetical surface mass change that is commonly used to facilitate the inversion of gravity change to density. Then, after a brief discussion of the elastic properties of the Earth, the relation between surface mass change and the three-dimensional deformation field is considered. Both types of observables are then discussed in the framework of inversion. None of our findings are entirely new; we merely aim at a systematic compilation and discuss some frequently made assumptions. Finally, some directions for future research are pointed out.
J. Kusche () Astronomical, Physical and Mathematical Geodesy Group, Bonn University, Bonn, Germany e-mail: [email protected] © Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_8
1 Introduction
Mass transports inside the Earth and at or above its surface generate changes in the external gravity field and in the geometrical shape of the Earth. Depending on their magnitude and their spatial and temporal scales, they become visible in the observations of modern space-geodetic and terrestrial techniques. Examples of mass transport processes that are sufficiently large to be observed include changes in continental water storage in greater river basins and catchment areas, large-scale snow coverage, present-day ice mass changes occurring at Greenland and Antarctica and at the continental glacier systems, atmospheric pressure changes, sea level change and the redistribution of ocean circulation systems, land-ocean exchange of water, and glacial-isostatic adjustment (cf. Fig. 1). As many of these processes are directly linked to climate, an improved quantification and understanding of their present-day trends and interannual variations from geodetic data will help to separate them from the long-term evolution, typically inferred from proxy data. The order of magnitude for atmospheric, hydrological, and oceanic loads, in terms of the associated change of the geoid, an equipotential surface of the Earth's gravity field, is about 1 cm or $1 \times 10^{-9}$ in relative terms. For example, in the Amazon region of South America, this geoid change is largely caused by an annual basin-wide oscillation in surface and ground water that amounts to several dm.
The variety of geodetic techniques that are sufficiently sensitive includes intersatellite tracking as currently conducted with the Gravity Recovery and Climate Experiment (GRACE) mission, satellite laser ranging, superconducting relative and free-fall absolute gravimetry, the monitoring of deformations with the Global Positioning System (GPS), with the Very Long Baseline Interferometry (VLBI) network of radio telescopes, and with Interferometric Synthetic Aperture Radar (InSAR), and the monitoring of the ocean surface with satellite altimetry and tide gauges. Here, we will limit ourselves to a discussion of the fundamental relations that concern present-day mass redistributions and their observability in time-variable gravity (TVG) and time-variable deformation of the Earth. This is not intended to form the basis for real-data inversion schemes. Rather, we would like to point out essential assumptions and fundamental limitations in some commonly applied algorithms. To this end, it will be sufficient to consider the response of the solid Earth to the mass loading as purely elastic. In fact, this is a first assumption that is no longer permissible in the discussion of sea level, when geological timescales are involved at which the Earth's response is driven by viscous or viscoelastic behavior. Furthermore, we will implicitly consider only that part of mass redistribution or mass transport that is actually associated with local change of density. Mass transport that does not change the local density (stationary currents in the hydrological cycle; i.e., the "motion term" in Earth's rotation analysis) is in the null-space of the Newton and deformation operators; it cannot be inferred from observations of time-variable gravity or from displacement data.
Fig. 1 Spatial and temporal scales of mass transport processes and the resolution limits of satellite gravity missions CHAMP, GRACE, and GOCE (From Ilk et al. 2005). Axes: spatial resolution [km] versus time resolution (from static to instantaneous)
2 Mass and Mass Redistribution
In what follows, we will adopt an Eulerian representation of the redistribution of mass, where one considers mass density as a 4D field rather than following the path of individual particles (Lagrangian representation). The mass density inside the Earth, including the oceans and the atmosphere, is then described by
$$\rho = \rho(\mathbf{x}', t) . \tag{1}$$
Mass density is the source of the external gravitational potential, described by the Newton integral
$$V(\mathbf{x}, t) = G \int_v \frac{\rho(\mathbf{x}', t)}{|\mathbf{x} - \mathbf{x}'|}\, dv , \tag{2}$$
where $v$ is the volume of the Earth including its fluid and gaseous envelope. The inverse distance (Poisson) kernel can be developed into
$$\frac{1}{|\mathbf{x} - \mathbf{x}'|} = \frac{1}{r'} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \left(\frac{r'}{r}\right)^{n+1} \frac{1}{2n+1}\, Y_{nm}(\mathbf{e})\, Y_{nm}(\mathbf{e}') , \tag{3}$$
where $r = |\mathbf{x}|$, $r' = |\mathbf{x}'|$, and
$$\mathbf{e} = \frac{\mathbf{x}}{r}, \qquad \mathbf{e}' = \frac{\mathbf{x}'}{r'} .$$
$Y_{nm}$ are the $4\pi$-normalized surface spherical harmonics. Using spherical coordinates $\lambda$ (spherical longitude) and $\theta$ (spherical colatitude),
$$Y_{nm} = \bar P_{n|m|}(\cos\theta) \begin{cases} \cos m\lambda, & m \ge 0, \\ \sin |m|\lambda, & m < 0 . \end{cases}$$
Here, the $\bar P_{nm} = \Pi_{nm} P_n^m$ are the $4\pi$-normalized associated Legendre functions. $\Pi_{nm} = \left( (2 - \delta_{0m})(2n+1) \frac{(n-m)!}{(n+m)!} \right)^{1/2}$ denotes a normalization factor that depends only on harmonic degree $n$ and order $m$. The associated Legendre functions relate to the Legendre polynomials by $P_n^m(u) = (1 - u^2)^{m/2} \frac{d^m P_n(u)}{du^m}$, and the Legendre polynomials may be expressed by the Rodrigues formula, $P_n(u) = \frac{1}{2^n n!} \frac{d^n (u^2 - 1)^n}{du^n}$. On plugging Eq. (3) into Eq. (2), the exterior gravitational potential of the Earth takes on the representation
$$V(\mathbf{x}, t) = G \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \frac{1}{(2n+1)\, r^{n+1}} \left[ \int_v \rho(\mathbf{x}', t)\, r'^{\,n}\, Y_{nm}(\mathbf{e}')\, dv \right] Y_{nm}(\mathbf{e}) , \tag{4}$$
which converges outside of a sphere that tightly encloses the Earth. On the other hand, by introducing a reference scale $a$, the exterior potential $V$ can be written as a general solution of the Laplace equation:
$$V(\mathbf{x}, t) = \frac{GM}{a} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \left(\frac{a}{r}\right)^{n+1} v_{nm}(t)\, Y_{nm}(\mathbf{e}) \tag{5}$$
with $4\pi$-normalized spherical harmonic coefficients
$$v_{nm} = \begin{cases} c_{nm}, & m \ge 0, \\ s_{n|m|}, & m < 0 . \end{cases}$$
A direct comparison of Eqs. (4) and (5) provides the source representation of the spherical harmonic coefficients (see also chapter Stokes Problem, Layer Potentials and Regularizations, and Multiscale Applications):
$$v_{nm}(t) = \frac{1}{M}\, \frac{1}{2n+1}\, \frac{1}{a^n} \int_v r'^{\,n}\, Y_{nm}(\mathbf{e}')\, \rho(\mathbf{x}', t)\, dv . \tag{6}$$
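The chain from the Newton integral (2) to the coefficients (6) and the series (5) can be checked numerically. The sketch below is illustrative only: it replaces the density by three hypothetical point masses (so the volume integral in Eq. (6) becomes a finite sum), uses assumed values for $G$ and the reference scale $a$, and all function names are chosen for this example. It compares the truncated series (5) with the direct sum of Eq. (2) at an exterior point.

```python
import math

G = 6.674e-11      # gravitational constant [m^3 kg^-1 s^-2]
A_REF = 6.378e6    # reference scale a [m] (assumed value)

def legendre_assoc(n, m, u):
    """Unnormalized P_n^m(u) = (1-u^2)^(m/2) d^m P_n/du^m for 0 <= m <= n."""
    s = math.sqrt(max(0.0, 1.0 - u * u))
    pmm = 1.0
    for k in range(1, m + 1):          # P_m^m = (2m-1)!! (1-u^2)^(m/2)
        pmm *= (2 * k - 1) * s
    if n == m:
        return pmm
    pm1 = u * (2 * m + 1) * pmm        # P_{m+1}^m
    for k in range(m + 2, n + 1):      # upward recursion in degree
        pmm, pm1 = pm1, ((2 * k - 1) * u * pm1 - (k + m - 1) * pmm) / (k - m)
    return pm1

def ynm(n, m, colat, lon):
    """4pi-normalized surface spherical harmonic Y_nm."""
    am = abs(m)
    norm = math.sqrt((1 if am == 0 else 2) * (2 * n + 1)
                     * math.factorial(n - am) / math.factorial(n + am))
    trig = math.cos(m * lon) if m >= 0 else math.sin(am * lon)
    return norm * legendre_assoc(n, am, math.cos(colat)) * trig

def to_xyz(r, colat, lon):
    return (r * math.sin(colat) * math.cos(lon),
            r * math.sin(colat) * math.sin(lon),
            r * math.cos(colat))

# Hypothetical point masses: (mass [kg], r' [m], colatitude [rad], longitude [rad]).
POINT_MASSES = [(4.0e18, 6.0e6, 0.7, 1.1),
                (2.5e18, 6.3e6, 2.0, -0.4),
                (1.0e18, 5.0e6, 1.3, 2.9)]
M_TOTAL = sum(p[0] for p in POINT_MASSES)

def coefficient(n, m):
    """Eq. (6) with the volume integral replaced by a sum over point masses."""
    s = sum(mk * (rk / A_REF) ** n * ynm(n, m, tk, lk)
            for mk, rk, tk, lk in POINT_MASSES)
    return s / (M_TOTAL * (2 * n + 1))

def potential_series(r, colat, lon, nmax):
    """Eq. (5), truncated at degree nmax."""
    total = 0.0
    for n in range(nmax + 1):
        for m in range(-n, n + 1):
            total += (A_REF / r) ** (n + 1) * coefficient(n, m) * ynm(n, m, colat, lon)
    return G * M_TOTAL / A_REF * total

def potential_direct(r, colat, lon):
    """Eq. (2) for point masses: G * sum m_k / |x - x_k|."""
    x = to_xyz(r, colat, lon)
    return G * sum(mk / math.dist(x, to_xyz(rk, tk, lk))
                   for mk, rk, tk, lk in POINT_MASSES)

v_series = potential_series(1.2e7, 0.9, 0.3, 30)
v_direct = potential_direct(1.2e7, 0.9, 0.3)
rel_error = abs(v_series - v_direct) / abs(v_direct)
print(f"relative difference series vs. direct: {rel_error:.2e}")
```

The evaluation radius is chosen well outside the masses so that the series converges quickly; closer to the enclosing sphere a much higher truncation degree would be needed.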
There are 2n C 1 coefficients for each degree n, and each coefficient follows by projecting the density on a single 3D harmonic function (solid spherical harmonic) r n Ynm . An equivalent approach is to expand the inverse distance into a Taylor series: 0 1 3 1 3 X 3 X X X 1 .n/ @ i1 i2 :::in xi01 xi02 : : : xi0n A D jx x0 j nD0 i D1 i D1 i D1 1
2
(7)
n
with components x10 , x20 , x30 of x0 , and .n/ i1 i2 :::in
D
1 nŠ
"
1 @n jxx 0j
@xi01 @xi02 : : : @xi0n
# :
(8)
xDx0
This leads to a representation of the exterior potential through mass moments .n/ Mi1 i2 :::in of degree n: 0 1 3 1 3 X 3 X X X .n/ .n/ @ V .x; t/ D G i1 i2 :::in Mi1 i2 :::in A nD0
i1 D1 i2 D1
(9)
in D1
with
$$M^{(n)}_{i_1 i_2 \ldots i_n} = \int_v x'_{i_1} x'_{i_2} \cdots x'_{i_n}\, \rho(\mathbf{x}', t)\, dv' . \tag{10}$$
Note that the integrals in Eqs. (6) and (10) can be interpreted as inner products $(\cdot, \rho)$ in a Hilbert space with integrable density defined on $v$. However, whereas the homogeneous Cartesian polynomials in Eq. (10) do provide a complete basis for the approximation of the density, the solid spherical harmonics in Eq. (6) fail to do so (in fact, they allow one to represent the "harmonic density" part $\rho^{+}$ for which $\Delta \rho^{+} = 0$). Due to the symmetry of Eq. (10), there are $\frac{(n+1)(n+2)}{2}$ independent mass moments of degree $n$. Of those only $2n + 1$ become visible in the gravitational potential, which means that the remaining $\frac{(n+1)(n+2)}{2} - (2n+1) = \frac{(n-1)n}{2}$ span the null-space of the Newton integral (Chao 2005). Some of the low-degree coefficients and mass moments deserve a special discussion, e.g., obviously
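$$v_{00} = \frac{1}{M} \int_v \rho(\mathbf{x}', t)\, dv = \frac{M^{(0)}}{M}$$
corresponds to the total mass of the Earth, scaled by a reference value $M$. Since this is largely constant in time, inversion of geodetic data usually assumes $\frac{d}{dt} v_{00} = 0$ (or $\delta v_{00} = 0$; see below). Moreover, the degree-1 coefficients directly correspond to the coordinates of the center of mass $\mathbf{x}^0(t)$; with the factor $1/a^n$ of Eq. (6) for $n = 1$,
$$v_{10}(t) = \frac{\sqrt{3}}{3Ma} \int_v x'_3\, \rho\, dv = \frac{\sqrt{3}}{3a}\, \frac{M^{(1)}_3}{M} = \frac{1}{\sqrt{3}\, a}\, x^0_3(t)$$
and
$$v_{11}(t) = \frac{1}{\sqrt{3}\, a}\, x^0_1(t), \qquad v_{1,-1}(t) = \frac{1}{\sqrt{3}\, a}\, x^0_2(t)$$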
(in the reference system to which the spherical coordinates $\lambda$, $\theta$ refer). Space-geodetic evidence suggests that if the considered reference frame is fixed to the crust of the Earth, the temporal variation of the center of mass is no more than a few mm, with the $x^0_3$-component being largest. This is the case for realizations of the International Terrestrial Reference System (ITRS). Yet rather large mass redistributions are required to shift the geocenter by a few mm, and the study of these effects is the subject of current research. Similarly, the degree-2 spherical harmonic coefficients $v_{2m}$ can be related to the tensor of inertia of the Earth. However, of the $\frac{(2+1)(2+2)}{2} = 6$ mass moments $M^{(2)}_{i_1 i_2}$, only $2 \cdot 2 + 1 = 5$ become visible in the five spherical harmonic coefficients of the gravity field, with the null-space of the Newton integral being of dimension $6 - 5 = 1$ for degree 2. The annual variation of the "flattening" coefficient $v_{20}$ is of the order $1 \times 10^{-10}$, the linear rate $\frac{d}{dt} v_{20}$ (predominantly due to the continuing rebound of the Earth in response to deglaciation after the last ice age) at the $1 \times 10^{-11}\ \mathrm{y}^{-1}$ level. A special case is the variation of the degree 2, order 1 coefficients ($v_{21}$, $v_{2,-1}$), as those correspond to the position of the figure axis of the Earth (in the reference system to which $\lambda$, $\theta$ refer). Since the mean figure axis is close to the mean rotation axis and since the latter can be determined with high precision from the measurement of Earth's rotation, additional observational constraints for their time variation (of the order of $1 \times 10^{-11}\ \mathrm{y}^{-1}$) exist. Nowadays, the $v_{nm}$ can be observed from precise satellite tracking with global navigation systems, intersatellite ranging, and satellite accelerometry with temporal resolution of 1 month or below and spatial resolution of up to $n_{\max} = 120$. However, as mentioned earlier, one cannot uniquely invert gravity change into density change.
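The counting argument used above (independent mass moments versus spherical harmonic coefficients per degree) is easy to tabulate. The following minimal sketch uses function names chosen here for illustration; it lists, for the first few degrees, the number of independent mass moments, the number of coefficients visible in the potential, and the resulting dimension of the null-space of the Newton integral.

```python
def independent_mass_moments(n):
    """Number of independent (symmetric) Cartesian mass moments of degree n."""
    return (n + 1) * (n + 2) // 2

def visible_coefficients(n):
    """Number of spherical harmonic coefficients v_nm at degree n."""
    return 2 * n + 1

def null_space_dimension(n):
    """Mass moments of degree n that leave the exterior potential unchanged."""
    return independent_mass_moments(n) - visible_coefficients(n)

for n in range(7):
    print(n, independent_mass_moments(n), visible_coefficients(n), null_space_dimension(n))
```

For degrees 0 and 1 the null-space is empty, for degree 2 it has dimension 1, and it grows quadratically with the degree, in line with the closed form $(n-1)n/2$.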
The question has been raised which physically plausible assumptions nevertheless allow one to locate the sources of these gravity changes. A common way to regularize this "gravitational-change inverse problem" (GCIP) is to restrict the solution space to density changes within an infinitely thin spherical shell of radius $a$ (Wahr et al. 1998). This corresponds to the determination of surface mass from an external potential. The spherical harmonic coefficients $v_{nm}$ of the potential caused by a surface density $\sigma$ on a sphere are
$$v_{nm}(t) = \frac{1}{M}\, \frac{a^2}{2n+1} \int_s \sigma(\mathbf{e}', t)\, Y_{nm}(\mathbf{e}')\, ds . \tag{11}$$
Since only changes with respect to a reference status, which can be realized through the measurements (e.g., a long-term average), are observable, one defines $\delta\sigma(\mathbf{e}', t) = \sigma(\mathbf{e}', t) - \bar\sigma(\mathbf{e}')$ and the coefficients of potential change follow from
$$\delta v_{nm}(t) = v_{nm}(t) - \bar v_{nm} = \frac{1}{M}\, \frac{a^2}{2n+1} \int_s \delta\sigma(\mathbf{e}', t)\, Y_{nm}(\mathbf{e}')\, ds , \tag{12}$$
where $\bar v_{nm} = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} v_{nm}(t')\, dt'$. Or, involving the spherical harmonic expansion of $\delta\sigma$,
$$\delta v_{nm} = \frac{1}{M}\, \frac{1}{2n+1}\, 4\pi a^2\, \delta\sigma_{nm} = \frac{3}{a \rho_e}\, \frac{1}{2n+1}\, \delta\sigma_{nm} , \tag{13}$$
where we made use of the average density of the Earth, $\rho_e = \frac{M}{v_a}$, $v_a = \frac{4}{3}\pi a^3$ being the volume of a sphere of radius $a$. Obviously, this relation is coefficient-by-coefficient invertible. Corresponding to the $2n + 1$ (unnormalized) coefficients $\delta v_{nm}$ at degree $n$, there are just $2n + 1$ surface mass moments, with components $e'_1$, $e'_2$, $e'_3$ of $\mathbf{e}'$,
$$\delta A^{(n)}_{i_1 i_2 \ldots i_n} = a^n \int_s e'_{i_1} e'_{i_2} \cdots e'_{i_n}\, \delta\sigma(\mathbf{e}', t)\, ds . \tag{14}$$
The integrals in Eqs. (12) and (14) can be viewed as inner products (; ı) in a Hilbert space of integrable (density) functions defined on the sphere. We will again look at low-degree terms. Obviously, ıv00 D
4a2 3 a2 .0/ ı00 D ı00 D ıA M e 4
corresponds to the change in total surface mass. This should be zero as long as P ı considers all subsystems ı D s ı.s/ , whereas a shift of total mass from, e.g., the ocean to the atmosphere may happen very well. For the center-of-mass shift referred to the Earth (strictly speaking, to the center of the sphere where is located on) with mass M ,
4a2 1 1 1 .1/ ı10 D ı10 D p x30 D p ıA3 3M 3e 3 3e
ıv10 .t/ D
and similarly for $\delta v_{11}$ and $\delta v_{1,-1}$. Restriction of the density to a spherical shell serves to eliminate the null-space of the problem, thus allowing a unique solution for density determination from gravity. In fact, the surface layer could be located on the surface of an ellipsoid of revolution or any other sufficiently smooth surface as well. This could be implemented in Eqs. (11) and (12) by including an upward continuation integral term; however, it would no longer allow a simple scaling of the coefficients as with Eq. (13). On the other hand, in the light of comparing inferred surface densities to modeled densities (e.g., from ocean or hydrology models), one could determine point-wise densities on a nonspherical surface from those on a spherical shell by applying an $\left(\frac{a}{r}\right)^n$ term (Chao 2005). The 3D density $\rho$ and the 2D density $\sigma$ are of course related to each other. This can be best seen by writing the source representation of the spherical harmonic coefficients, Eq. (6), in the following way:
$$v_{nm}(t) = \frac{1}{M}\, \frac{a^2}{2n+1} \int_s \left\{ \int_0^{r_{\max}} \left(\frac{r'}{a}\right)^{n+2} \rho(\mathbf{x}', t)\, dr' \right\} Y_{nm}(\mathbf{e}')\, ds \tag{15}$$
and comparing the term in {...} brackets to Eq. (11). Surface mass change $\delta\sigma$ can thus be considered as a column-integrating "mapping" of $\delta\rho$, say, $L^2(s) \times L^1([0, r_{\max}]) \to L^2(s)$ if we assume square integrability. A coarse approximation of the integral in Eq. (15) is used if surface density change $\delta\sigma$ is transformed to "equivalent water height" change $\delta h_w = \frac{\delta\sigma}{\rho_w}$ with a constant reference value $\rho_w$ for the density of water, or if (real) water height change (e.g., from ocean model output) is transformed to surface density change $\delta\sigma = \rho_w\, \delta h_w$. In this case, the assumption is implicitly made that $\frac{a}{r} \approx 1$, introducing an error of the order of $(n + 2) f\, \delta\sigma_{nm}$, where $f$ is the flattening of the Earth (about 1/300). For computing the contribution of the Earth's atmosphere to observed gravity change, it is nowadays accepted that 3D integration should be preferred to the simpler approach where surface pressure anomalies are converted to surface density. For example, in the GRACE analysis (Flechtner 2007) $\rho(\mathbf{x}', t)\, dr'$ in Eq. (15) is replaced by $-\frac{dp}{g(r)}$, assuming hydrostatic equilibrium. Then with $g(r) \approx g \left(\frac{a}{r}\right)^2$ and $r = a + N + \frac{\Phi}{1 - \Phi/a}$,
$$v_{nm}(t) = -\frac{1}{M}\, \frac{a^2}{2n+1}\, \frac{1}{g} \int_s \left\{ \int_{p_s}^{0} \left( \frac{a}{a - \Phi} + \frac{N}{a} \right)^{n+4} dp \right\} Y_{nm}(\mathbf{e}')\, ds . \tag{16}$$
In Eq. (16), $N$ is the height of the mean geoid above the ellipsoid, $p_s$ is surface pressure, and $\Phi$ is the geopotential height, which is derived from vertical level data on temperature, pressure, and humidity. A completely different way to solve the gravitational inverse problem (GIP) (and the GCIP) has been suggested by Michel (2005), who derives the harmonic part $\rho_{+}$ of the density from gravity observations.
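As a rough illustration of the inner integral in Eq. (16), the following Python sketch integrates the weight $(a/(a-\Phi) + N/a)^{n+4}$ over pressure for a single atmospheric column, using the trapezoidal rule. The function name `column_integral`, the level values, and the value of `a` are assumptions made for this example only; they are not taken from the GRACE processing described by Flechtner (2007). The integral is evaluated from 0 to $p_s$, which equals the negative of the integral from $p_s$ to 0.

```python
def column_integral(n, pressure, geopot_height, geoid_height=0.0, a=6378137.0):
    """Trapezoidal estimate of int_0^{p_s} (a/(a - Phi) + N/a)^(n+4) dp.

    pressure      -- pressure levels of one column in Pa (any monotonic order)
    geopot_height -- geopotential height Phi at the same levels in m
    geoid_height  -- height N of the mean geoid above the ellipsoid in m
    a             -- reference radius in m (assumed value)
    """
    if len(pressure) != len(geopot_height) or len(pressure) < 2:
        raise ValueError("need at least two levels with matching heights")

    def weight(phi):
        return (a / (a - phi) + geoid_height / a) ** (n + 4)

    total = 0.0
    for i in range(len(pressure) - 1):
        dp = abs(pressure[i + 1] - pressure[i])
        total += 0.5 * (weight(geopot_height[i]) + weight(geopot_height[i + 1])) * dp
    return total


# Illustrative (not measured) column: surface, two mid levels, top of atmosphere.
levels_pa = [101325.0, 50000.0, 10000.0, 0.0]
heights_m = [0.0, 5500.0, 16000.0, 50000.0]

for degree in (2, 20, 60):
    ratio = column_integral(degree, levels_pa, heights_m) / levels_pa[0]
    print(degree, ratio)
```

With $\Phi = 0$ and $N = 0$ the weight equals 1 and the sketch returns $p_s$, which corresponds to the thin-layer approximation; the printed ratios show how the departure from it grows with degree $n$.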
3 Earth Model
A caveat must be stated at this point, since so far we have considered the Earth as rigid. In reality, any mass redistribution at the surface or even inside the Earth is accompanied by a deformation of the solid Earth in its surroundings, which may be considered as an elastic, instantaneous response for short timescales and a viscoelastic response for longer timescales. This deformation causes an additional, “indirect” change of the gravity potential, which is not negligible and generally depends on the harmonic degree $n$ of the load $\delta\sigma_{nm}$. The linear differential equations that describe the deformation and gravity change of an elastic or viscoelastic, symmetric Earth, forced by surface loading of redistributing mass, are usually derived by considering a small perturbation of a radially symmetric, hydrostatically prestressed background state. In the linearized equation of momentum,

\[ \nabla\cdot\boldsymbol{\tau} + \nabla\left(\rho_0\, g_0\, u\cdot e_r\right) - \rho_0 \nabla\delta V - \delta\rho\, g_0\, e_r - \rho_0\,\frac{d^2}{dt^2}\,u = 0 \tag{17} \]
the first term describes the contribution of the stresses; the second term the advection of the hydrostatic prestress (related to the Lagrangian description of a displaced particle); the third term represents self-gravitation, expressing the change in gravity due to deformation; and the fourth term describes the change in density if one accounts for compressibility. Here, $\boldsymbol{\tau}$ is the incremental stress tensor, $u = u(x,t)$ the displacement of a particle at position $x$, $\delta V$ the perturbation of the gravitational potential, and $\delta\rho$ the perturbation of density. The fifth term in Eq. (17) is necessary when one is interested in the (free or forced) eigenoscillations of the Earth or in the body tides of the Earth caused by planetary potentials; however, the dominant timescales for external loads and the “integration times” for observing systems such as GRACE are rather long, with the consequence that one is usually content with the quasi-static solutions. Assuming the Earth to be in hydrostatic equilibrium prior to the deformation is certainly a gross simplification; it is, however, a necessary assumption to find “simple” solutions with the perturbation methods that are usually applied (e.g., Farrell 1972). Generally, the density perturbation can be expressed by the continuity equation:

\[ \delta\rho = -\nabla\cdot(\rho_0\, u). \tag{18} \]
For the perturbation of the potential, the Poisson equation

\[ \Delta\,\delta V = 4\pi G\,\delta\rho \tag{19} \]
must hold. These equations are to be completed by a rheological law (e.g., $\boldsymbol{\tau} = f(\boldsymbol{\epsilon})$ for elastic behavior, $\boldsymbol{\tau} = f(\boldsymbol{\epsilon}, \dot{\boldsymbol{\epsilon}})$ for viscoelastic Maxwell behavior, $\boldsymbol{\epsilon}$ being the strain tensor) and boundary conditions for internal interfaces of a stratified Earth and for the free surface. For an elastic model consisting of $Z$ layers, the rheology is usually prescribed by polynomial functions $\lambda = \lambda^{(z)}(r)$, $\mu = \mu^{(z)}(r)$, $\rho = \rho^{(z)}(r)$ of the Lamé parameters $\lambda$, $\mu$ and the density $\rho$ within each layer, $r_{\min}^{(z)} \le r \le r_{\max}^{(z)}$. Solutions for potential change and deformation at the surface, obtained for the boundary condition of surface loading, are often expressed through Green’s functions:

\[ \delta V(x,t) = \int_{\tau'=0}^{\infty} \int_s K_V(x, x', \tau')\, \delta\sigma(x', t-\tau')\, ds\, d\tau' \]

($x' = a e'$) and

\[ u(x,t) = \int_{\tau'=0}^{\infty} \int_s K_u(x, x', \tau')\, \delta\sigma(x', t-\tau')\, ds\, d\tau', \]
where the kernels for general (anisotropic, rotating, ellipsoidal, viscoelastic) Earth models can be represented through location- and frequency-dependent coefficients, e.g., $K(x,x',\tau') = \sum_{nm} k'_{nm}(\lambda,\theta,z)\,k'_{nm}(\lambda',\theta',z)$, where $z$ is the Laplace transform parameter with unit s$^{-1}$. Couplings between the degree-$n$, order-$m$ terms of the load and potential-change and deformation harmonics of other degrees and orders have to be taken into account (e.g., Wang 1991). It should be noted already here that Green’s functions essentially represent the impulse response of the Earth (i.e., Eqs. (17)–(19) together with a rheological law) and, as such, may be determined from measurements under a known load “forcing.”

For spherically symmetric, nonrotating, elastic, and isotropic (SNREI) models of the Earth, the $k'_{nm}(\lambda,\theta,z) = k'_n$ are simply a function of the harmonic degree, and the Green’s function kernels depend on the spatial distance of $x$ and $x'$ only. Numerical solutions for the $k'_n$ start with the observation that in Eqs. (17)–(19) the reference quantities $\rho_0$, $g_0$ and the Lamé parameters that enter Hooke’s law all depend on the radius $r$ only. Figure 2 shows the load Love numbers $-h'_n$, $n l'_n$, and $-n k'_n$ that Farrell (1972) computed, using a Gutenberg-Bullen model for the Earth’s rheology, for surface potential change and displacement. (It should be noted that Love numbers are radius dependent, but in geodesy usually only the surface limits are applied.)

Fig. 2 Load Love numbers after Farrell (1972), $-h'_n$ (+), $n l'_n$, and $-n k'_n$ versus degree $n$

The solution for the gravitational response to a surface mass distributed on top of an elastic Earth can be best described in spherical harmonics:

\[ \delta v_{nm}(t) = \frac{1}{M}\,\frac{a^2}{(2n+1)}\left(1 + k'_n\right) \int_s \delta\sigma(x',t)\, Y_{nm}(e')\, ds \tag{20} \]

or

\[ \delta v_{nm} = \frac{4\pi a^2}{M}\,\frac{1 + k'_n}{(2n+1)}\,\delta\sigma_{nm} = \frac{3}{\rho_e}\,\frac{1 + k'_n}{(2n+1)}\,\delta\sigma_{nm}, \tag{21} \]
where the “1” term is the potential change caused by $\delta\sigma$ and the load Love number “$k'_n$” term describes the incremental potential due to solid Earth deformation. Again, degree 0 and 1 terms deserve special attention: Due to mass conservation, $k'_0$ must be zero. Also, $l'_0 = 0$ (see below) is obvious since, for symmetry reasons, a uniform load on a spherical Earth cannot cause horizontal displacements. In contrast, $h'_0$ corresponds to the average compressibility of the elastic Earth and will not vanish. In the local spherical East-North-Up frame, the solution for $u$ is

\[ u(x,t) = \begin{pmatrix} u^{*}(x,t) \\ \delta h(x,t) \end{pmatrix} = a \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \begin{pmatrix} \delta\psi_{nm}(t)\,\nabla^{*} \\ \delta h_{nm}(t) \end{pmatrix} Y_{nm}(e) \tag{22} \]

with $\nabla^{*}$ being the tangential part of the gradient operator $\nabla$. In spherical coordinates,

\[ \nabla^{*} = \frac{1}{r}\frac{\partial}{\partial\theta}\, e_\theta + \frac{1}{r\sin\theta}\frac{\partial}{\partial\lambda}\, e_\lambda, \qquad \nabla = \nabla^{*} + e_r\,\frac{\partial}{\partial r}. \]
The radial displacement function is

\[ \delta h_{nm}(t) = \frac{h'_n\, a^2}{g\,M\,(2n+1)} \int_s \delta\sigma(x',t)\, Y_{nm}(e')\, ds \]

and at location $x$ of the Earth’s surface, the radial displacement is

\[ \delta h(x,t) = a \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \delta h_{nm}\, Y_{nm}(e). \tag{23} \]
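The lateral displacement function is

\[ \delta\psi_{nm}(t) = \frac{l'_n\, a^2}{g\,M\,(2n+1)} \int_s \delta\sigma(x',t)\, Y_{nm}(e')\, ds. \tag{24} \]

At location $x$, the East and North displacement components are

\[ \delta e(x,t) = a \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \delta\psi_{nm}\, \frac{\partial}{\partial e} Y_{nm}(e) \]

and

\[ \delta n(x,t) = a \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \delta\psi_{nm}\, \frac{\partial}{\partial n} Y_{nm}(e). \]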
A reference system fixed to the center of mass of the solid Earth (CE system) is the natural system to compute the dynamics of solid Earth deformation and to model load Love numbers (Blewitt 2003). This system is obviously “blind” to mass transports that shift the center of mass of the Earth (excluding ocean, atmosphere, etc.); hence, by definition, the degree-1 potential Love number is $k'_1 = 0$. Note that this is not true for the other degree-1 Love numbers; e.g., Farrell (1972) found $h'_1 = -0.290$ and $l'_1 = 0.113$. However, the CE system is difficult to realize in practice by space-geodetic “markers.” There are two principal ways to compute deformations for other reference systems: (a) first compute in CE, and subsequently apply a translation; (b) transform the degree-1 Love numbers, and compute in the new system. Blewitt (2003) showed that a translation of the reference system origin along the direction of the load moment can be absorbed in the three load Love numbers by subtracting the “isomorphic parameter” $\alpha$ from them (we follow the convention of Blewitt 2003): $\tilde{k}'_1 = k'_1 - \alpha$, $\tilde{h}'_1 = h'_1 - \alpha$, and $\tilde{l}'_1 = l'_1 - \alpha$. For example, when transforming from the CE system to the center of mass of the Earth system (CM system), which is fixed to the center of mass of the Earth including the atmospheric and oceanographic loads, one has $\alpha = 1$. However, in reality the network shift (and rotation and scaling) might not be entirely known, as is supposed for the approaches mentioned. In this situation, Kusche and Schrama (2005) and Wu et al. (2006) showed that from 3D displacements in a realistic network and for a given set of Love numbers with $h'_1 \neq l'_1$, it is possible to separate residual unknown network translation and degree-1 deformation in an inversion approach. For realistic loads, the theory predicts vertical deformation of up to about 1 cm and horizontal deformation of a few mm. The solution in Eq. (22) refers to the local spherical East-North-Up frame, since it refers to an SNREI model. This must be kept in mind since geodetic displacement vectors are commonly referred to a local ellipsoidal East-North-Up frame. Otherwise, a small but systematic error will be introduced from erroneously projecting height displacements onto North displacements.
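The degree-1 transformation described above amounts to a single subtraction per Love number. The following Python sketch applies it to the CE values quoted from Farrell (1972) with $\alpha = 1$ for the CM system; the function name `shift_degree1_love_numbers` is an assumption of this example.

```python
def shift_degree1_love_numbers(k1, h1, l1, alpha):
    """Subtract the isomorphic parameter alpha from the three degree-1 numbers."""
    return (k1 - alpha, h1 - alpha, l1 - alpha)


# Degree-1 load Love numbers in the CE system as quoted in the text.
k1_ce, h1_ce, l1_ce = 0.0, -0.290, 0.113

# CE -> CM: alpha = 1.
k1_cm, h1_cm, l1_cm = shift_degree1_love_numbers(k1_ce, h1_ce, l1_ce, alpha=1.0)
print(k1_cm, h1_cm, l1_cm)

# The difference h'_1 - l'_1 does not depend on alpha.
print(h1_ce - l1_ce, h1_cm - l1_cm)
```

Because the same $\alpha$ is subtracted from all three numbers, the difference $h'_1 - l'_1$ is the same in every such frame, which is the quantity that has to be nonzero for the separation of network translation and degree-1 deformation mentioned above.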
In reality, the Earth is neither spherically symmetric nor purely elastic or isotropic, and it rotates in space. The magnitude of these effects is generally thought to be small. For example, Métivier et al. (2005) showed that ellipticity of the Earth, under zonal (atmospheric) loading, leads to negligible amplitude and phase perturbations in low-degree geopotential harmonics. Another issue is that, even within the class of SNREI models, differences in rheology cause differences in the response to loading. For the preliminary reference Earth model (PREM) and the parametric Earth model, continental (PEMC), which differ in lithospheric properties, Plag et al. (1996) found differences of up to 20 % in the vertical displacement, up to 40 % in the horizontal displacement, and up to 20 % in gravity change in the vicinity of the load (closer than 2° distance), but identical responses at greater distances.
4 Analysis of TVG and Deformation Pattern
In many cases, the aim of the analysis of time-variable gravity and deformation patterns is, at least in an intermediate sense, the determination of mass changes within a specific volume:

\[ \delta M = \int_v \left( \rho(x',t) - \bar{\rho}(x') \right) dv, \tag{25} \]

e.g., an ocean basin or a hydrological catchment area. Under the assumption that the thin-layer hypothesis provides an adequate description, $\delta M$ can be found by integrating $\delta\sigma$ over the (spherical) surface $s$ of $v$ (Wahr et al. 1998). The integration is usually performed in the spectral domain, with the $4\pi$-normalized spherical harmonic coefficients $s_{nm}$ of the characteristic function $S$ for the area $s$:

\[ \delta M = \int_s \delta\sigma(e',t)\, ds = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} s_{nm}\, \delta\sigma_{nm}(t), \tag{26} \]
or, in brief, $\delta M = (S, \delta\sigma)$. Following the launch of the GRACE mission, the analysis of time-variable surface mass loads from GRACE-derived spherical harmonic coefficients via the inverse relation (Wahr et al. 1998)

\[ \delta\sigma_{nm} = \frac{\rho_e}{3}\,\frac{(2n+1)}{1 + k'_n}\,\delta v_{nm} \tag{27} \]

has been applied in a fast-growing number of studies (for an overview, see the publication database of the GRACE project at http://www-app2.gfz-potsdam.de/pb1/op/grace/). Since GRACE-derived monthly or weekly $\delta v_{nm} = v_{nm} - \bar{v}_{nm}$ are corrupted with low-frequency noise, and since they exhibit longitudinal artifacts when projected in the space domain, isotropic or anisotropic
filter kernels are usually applied to these estimates. See Kusche (2007) for a review of filter methods, where in general a set of spectrally weighted coefficients $\delta\tilde{v}_{nm} = \sum_{p,q} w_{nm}^{pq}\, \delta v_{pq}$ is derived.
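The inverse relation (27) and its forward counterpart, Eq. (21), act degree by degree and can be sketched in a few lines of Python. This is only an illustration in the normalization of Eqs. (21) and (27) as printed here; the function names, the value of `RHO_E`, and the degree-2 Love number `K2` are assumptions chosen for the example and are not taken from the chapter.

```python
RHO_E = 5517.0  # kg/m^3, mean density of the Earth (assumed value)


def load_to_potential(dsigma_nm, n, k_n, rho_e=RHO_E):
    """Eq. (21): potential coefficient change caused by a load coefficient."""
    return 3.0 / rho_e * (1.0 + k_n) / (2 * n + 1) * dsigma_nm


def potential_to_load(dv_nm, n, k_n, rho_e=RHO_E):
    """Eq. (27): load coefficient inferred from a potential coefficient change."""
    if abs(1.0 + k_n) < 1e-12:
        # Happens, e.g., for degree 1 in a frame where the potential Love
        # number equals -1; gravity then carries no information on the load.
        raise ValueError("1 + k'_n vanishes: degree cannot be inverted from gravity")
    return rho_e / 3.0 * (2 * n + 1) / (1.0 + k_n) * dv_nm


K2 = -0.303  # illustrative degree-2 load Love number (assumption)

dv_20 = load_to_potential(1.0, 2, K2)
print(dv_20, potential_to_load(dv_20, 2, K2))
```

The guard reflects the fact that the factor $1 + k'_n$ appears in the denominator of Eq. (27): for a degree at which it vanishes, no load coefficient can be recovered from the potential coefficient alone.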
In principle, spherical harmonic models of surface load can be inferred coefficient-by-coefficient as well once a spherical harmonic model of vertical deformation has been derived:

\[ \delta\sigma_{nm} = \frac{\rho_e}{3}\,\frac{(2n+1)}{h'_n}\,\delta h_{nm} \tag{28} \]

or of lateral deformation $\psi_{nm}$:

\[ \delta\sigma_{nm} = \frac{\rho_e}{3}\,\frac{(2n+1)}{l'_n}\,\delta\psi_{nm}. \tag{29} \]
For the combination of vertical and lateral deformations in an inversion, two approaches exist: (a) either the $\delta\sigma_{nm}$ are estimated directly from discrete $u(x_i)$ or (b) in a two-step procedure, first the $h_{nm}$ and $\psi_{nm}$ are estimated separately from discrete height and lateral displacements, and the $\delta\sigma_{nm}$ are subsequently inferred from those. In practice, 3D displacement vectors are available at discrete points of the Earth’s surface. For the network of the International GNSS Service (IGS), a few hundred stations provide continuous data, so that harmonics well beyond degree and order 20 might be resolved. However, several studies have shown that this rather theoretical resolution cannot be reached. The reason is that spatial aliasing is present in the signal measured at a single site, which measures the sum of all harmonics up to infinity, unless the signal above the theoretical resolution can be removed using other data such as time-variable gravity. Inversion approaches must be seen in the light of the positioning accuracies of modern worldwide networks, which are currently of a similar order as the deformation signals (a few mm horizontally and a factor of 2–3 worse in the vertical direction). Whereas random errors may be significantly reduced in the spatial averaging process that is required to form estimates for low-degree harmonics, any systematic errors at the spatial scale of the signals of interest are potentially dangerous. A particular problem is the presence of secular trends in the displacement data that are dominated by nonloading phenomena like glacial isostatic adjustment (GIA), plate motion, and local monument subsidence.
If the load $\delta\sigma(t)$ is known from independent measurements (e.g., water level measurements from gauges or surface pressure from barometric measurements and meteorological modeling), the LLNs $k'_n$, $h'_n$, $l'_n$ could, in principle, be determined experimentally from measurements of gravity change and displacement. Plag et al. (1996) coined the term “loading inversion” (and as an alternative “loading tomography,” noting that tomography is usually based on probing along rays and not on a body-integrated response) and suggested a method to invert for certain spherical harmonic expansion coefficients of density $\delta\rho_{nm}$ and Lamé parameter perturbations $\delta\lambda_{nm}$, $\delta\mu_{nm}$, together with polynomial parameters that describe their
variation in the radial direction. This inverse approach to recover elastic properties from measurements of gravity and deformation is also followed in planetary exploration, however, for the body Love numbers $k_2$ and $h_2$. There, the role of the known load is assumed by the known tidal potentials in the solar system. On the other hand, one has the possibility to eliminate the $\delta\sigma_{nm}$ from the equations and determine the ratios

\[ \frac{h'_n}{1 + k'_n} = \frac{\delta h_{nm}}{\delta v_{nm}} \quad \text{for each } m = -n, \ldots, n \]

and

\[ \frac{l'_n}{1 + k'_n} = \frac{\delta\psi_{nm}}{\delta v_{nm}} \quad \text{for each } m = -n, \ldots, n \]
from a combination of gravity and displacement data. If all spherical harmonic coefficients up to degree $n_{\max}$ are well determined from data, these equations provide $2n+1$ relations per degree $n$. Mendes Cerveira et al. (2006) proposed that this freedom might in principle be used to uniquely solve for the ratios $h'_{nm}/(1 + k'_{nm})$ and $l'_{nm}/(1 + k'_{nm})$ of global, “azimuth-dependent” LLNs.

Other observations of gravity change and deformation of the Earth’s solid and fluid surface may be considered as well. With the global network of superconducting gravimeters, it is believed that annual and short-term variations in mass loading can be observed and compared with GRACE results, after an appropriate removal of local effects (Neumeyer et al. 2006). Absolute (free-fall) gravimeters provide stable time series from which trends in gravity can be obtained and with which time series of superconducting gravimeters can be calibrated. Gravity change $\delta g = |\nabla V| - |\nabla \bar{V}|$ as sensed by a terrestrial gravimeter, which is situated on the deforming Earth surface, reads (in spherical approximation, where the magnitude of the gradient is replaced by the radial derivative)

\[ \delta g(x,t) = \frac{GM}{a^2} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \left( n + 1 - 2h'_n \right) \delta v_{nm}(t)\, Y_{nm}(e) \tag{30} \]

and when related to surface mass change,

\[ \delta g(x,t) = \frac{GM}{a^2}\,\frac{3}{\rho_e} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \frac{n + 1 - 2h'_n}{2n+1}\, \delta\sigma_{nm}(t)\, Y_{nm}(e). \tag{31} \]
In principle, observations of the sea level might be considered in a multi-data inversion scheme as well. This requires that the steric sea-level change, which is related to changes in temperature and salinity of the ocean, can be removed from the measured total or volumetric sea level. Sea-level change can be measured using satellite altimeters, as with the current Jason-1 and Jason-2 missions, in an absolute sense, since altimetric orbits refer to an ITRS-type global reference system. Tide gauge observations, on the other hand, provide sea level in a relative sense since they refer to land benchmarks. Ocean bottom pressure recorders measure the load change directly. If one assumes that the ocean response is largely “passive” (Blewitt and Clarke 2003), i.e., the ocean surface follows an equipotential surface, the absolute sea-level change is related to surface load as

\[ \delta s(x,t) = \frac{GM}{a}\,\frac{3}{\rho_e} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \frac{1 + k'_n}{2n+1}\, \delta\sigma_{nm}(t)\, Y_{nm}(e) + \delta s_0(t) \tag{32} \]

and relative sea level as

\[ \delta\check{s}(x,t) = \frac{GM}{a}\,\frac{3}{\rho_e} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \frac{1 + k'_n - h'_n}{2n+1}\, \delta\sigma_{nm}(t)\, Y_{nm}(e) + \delta\check{s}_0(t), \tag{33} \]

where $\delta s_0(t)$ and $\delta\check{s}_0(t)$ are spatially uniform terms that account for mass conservation (cf. Blewitt 2003). All spectral operators discussed in this chapter are summarized in Fig. 3.

Fig. 3 Spectral operators for space and terrestrial gravity, displacement, and sea-level observables. Operators shown: space gravity, $(a/r)^{n+1}$ and $(1 + k'_n)/(2n+1)$; vertical displacement, $h'_n/(2n+1)$; lateral displacement, $l'_n/(2n+1)$; terrestrial gravity, $(n + 1 - 2h'_n)/(2n+1)$; absolute sea level, $(1 + k'_n)/(2n+1)$; relative sea level, $(1 + k'_n - h'_n)/(2n+1)$
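The degree-dependent factors that appear in Eqs. (21), (28), (29), and (31)–(33) and in Fig. 3 can be collected in one small lookup, as in the following Python sketch. The function name `spectral_operators` and the dictionary keys are assumptions of this example; the common prefactors (such as $3/\rho_e$, $GM/a^2$, or the upward continuation term $(a/r)^{n+1}$) are deliberately left out so that only the Love-number-dependent part is shown.

```python
def spectral_operators(n, k_n, h_n, l_n):
    """Degree-dependent factors linking a load coefficient to each observable."""
    d = 2 * n + 1
    return {
        "space_gravity": (1.0 + k_n) / d,
        "vertical_displacement": h_n / d,
        "lateral_displacement": l_n / d,
        "terrestrial_gravity": (n + 1 - 2.0 * h_n) / d,
        "absolute_sea_level": (1.0 + k_n) / d,
        "relative_sea_level": (1.0 + k_n - h_n) / d,
    }


# Degree 1 in the CE system, using the Love numbers quoted from Farrell (1972).
for name, factor in spectral_operators(1, 0.0, -0.290, 0.113).items():
    print(f"{name:24s} {factor: .4f}")
```

Listing the factors side by side makes it easy to see which observables respond identically to a load (space gravity and absolute sea level) and which differ only through $h'_n$ (absolute versus relative sea level).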
5 Future Directions
Within the limits of accuracy and spatial resolution of current observing systems, inversions for (surface) mass appear to have almost reached their potential. Limitations are, in particular, the achievable accuracy of spherical harmonic coefficients
from GRACE, systematic errors nicknamed “striping,” the presence of systematic errors in displacement vectors from space-geodetic techniques, and the spatial density and inhomogeneous distribution of global networks (space-geodetic techniques, absolute and superconducting gravimetry). However, moderate improvements in data quality and consistency can be expected from reprocessing efforts such as the anticipated GRACE RL05 products or the IGS reprocessing project. In the long run, GRACE Follow-On missions and geometric positioning in the era of GALILEO will hopefully offer bright prospects. At the time of writing, some groups focus on the combination of gravity and geometrical observations with a priori models of mass transport (so-called joint inversions; cf. Wu et al. 2006; Jansen et al. 2009; Rietbroek et al. 2009). The benefit of this strategy is that certain weaknesses of individual techniques can be covered by other techniques. For example, it has been shown that the geocenter motion or degree-1 surface load, which is not observed with GRACE, can be recovered to some extent from GPS and/or ocean modeling. Research is ongoing in this direction. Another issue is that if the space agencies fail to replace the GRACE mission in time with a follow-on satellite mission, a gap in the observation of the time-variable gravity field might occur. To some extent, the very low degrees of mass loading processes might be recovered during the gap from geometrical techniques and loading inversion, provided that the limitations mentioned above can be overcome and a proper cross-calibration with satellite gravity data is carried out in the overlapping periods (Bettadpur et al. 2008; Plag and Gross 2008). The same situation occurs if one tries to go back in time before the launch of GRACE, e.g., using (reprocessed) GPS solutions. In any case, the problem remains extremely challenging.
6 Conclusions
We have reviewed concepts that are currently used in the interpretation of time-variable gravity and deformation fields in terms of mass transports and Earth system research. Potential for future research is seen in particular in combinations of different observables.
References

Bettadpur S, Ries J, Save H (2008) Time-variable gravity, low Earth orbiters, and bridging gaps. In: GRACE Science Team Meeting 2008, San Francisco, 12–13 Dec 2008
Blewitt G (2003) Self-consistency in reference frames, geocenter definition, and surface loading of the solid Earth. J Geophys Res 108(B2):2103. doi:10.1029/2002JB002082
Blewitt G, Clarke P (2003) Inversion of Earth’s changing shape to weigh sea level in static equilibrium with surface mass redistribution. J Geophys Res 108(B6):2311. doi:10.1029/2002JB002290
Chao BF (2005) On inversion for mass distribution from global (time-variable) gravity field. J Geodyn 39:223–230
Farrell W (1972) Deformation of the Earth by surface loads. Rev Geophys Space Phys 10(3):761–797
Flechtner F (2007) AOD1B Product Description Document for Product Releases 01 to 04 (Rev. 3.1, 13 Apr 2007), GR-GFZ-AOD-0001, GFZ Potsdam
Ilk KH, Flury J, Rummel R, Schwintzer P, Bosch W, Haas C, Schröter J, Stammer D, Zahel W, Miller H, Dietrich R, Huybrechts P, Schmeling H, Wolf D, Götze HJ, Riegger J, Bardossy A, Güntner A, Gruber T (2005) Mass transport and mass distribution in the Earth system. Contribution of the new generation of satellite gravity and altimetry missions to geosciences, GFZ Potsdam
Jansen MWF, Gunter BC, Kusche J (2009) The impact of GRACE, GPS and OBP data on estimates of global mass redistribution. Geophys J Int. doi:10.1111/j.1365-246X.2008.04031.x
Kusche J (2007) Approximate decorrelation and non-isotropic smoothing of time-variable GRACE-type gravity fields. J Geodesy 81(11):733–749
Kusche J, Schrama EJO (2005) Surface mass redistribution inversion from global GPS deformation and Gravity Recovery and Climate Experiment (GRACE) gravity data. J Geophys Res 110:B09409. doi:10.1029/2004JB003556
Mendes Cerveira P, Weber R, Schuh H (2006) Theoretical aspects connecting azimuth-dependent Load Love Numbers, spatiotemporal surface geometry changes, geoid height variations and Earth rotation data. In: WIGFR2006, Smolenice Castle, 8–9 May 2006
Métivier L, Greff-Lefftz M, Diament M (2005) A new approach to computing accurate gravity time variations for a realistic earth model with lateral heterogeneities. Geophys J Int 162:570–574
Michel V (2005) Regularized wavelet-based multiresolution recovery of the harmonic mass density distribution from data of the Earth’s gravitational field at satellite height. Inverse Probl 21:997–1025
Neumeyer J, Barthelmes F, Dierks O, Flechtner F, Harnisch M, Harnisch G, Hinderer J, Imanishi Y, Kroner C, Meurers B, Petrovic S, Reigber C, Schmidt R, Schwintzer P, Sun H-P, Virtanen H (2006) Combination of temporal gravity variations resulting from superconducting gravimeter (SG) recordings, GRACE satellite observations and global hydrology models. J Geodesy 79:573–585
Plag H-P, Gross R (2008) Exploring the link between Earth’s gravity field, rotation and geometry in order to extend the GRACE-determined terrestrial water storage to non-GRACE times. In: GRACE Science Team Meeting 2008, San Francisco, 12–13 Dec 2008
Plag H-P, Jüttner H-U, Rautenberg V (1996) On the possibility of global and regional inversion of exogenic deformations for mechanical properties of the Earth’s interior. J Geodyn 21(3):287–308
Rietbroek R, Brunnabend S-E, Dahle C, Kusche J, Flechtner F, Schröter J, Timmermann R (2009) Changes in total ocean mass derived from GRACE, GPS, and ocean modeling with weekly resolution. J Geophys Res 114:C11004. doi:10.1029/2009JC005449
Wahr J, Molenaar M, Bryan F (1998) Time variability of the Earth’s gravity field: hydrological and oceanic effects and their possible detection using GRACE. J Geophys Res 103(B12):30205–30229
Wang R (1991) Tidal deformations on a rotating, spherically asymmetric, visco-elastic and laterally heterogeneous Earth. Peter Lang, Frankfurt/Main
Wu X, Heflin MB, Ivins ER, Fukumori I (2006) Seasonal and interannual global surface mass variations from multisatellite geodetic data. J Geophys Res 111:B09401. doi:10.1029/2005JB004100
Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution

Willi Freeden and Michael Schreiner
Contents

1 Introduction
2 SGG in Potential Theoretic Perspective
3 Decomposition of Tensor Fields by Means of Tensor Spherical Harmonics
4 Formulation as Pseudodifferential Equation
4.1 SGG as Pseudodifferential Equation
4.2 Upward/Downward Continuation
4.3 Operator of the First-Order Radial Derivative
4.4 Pseudodifferential Operator for SST
4.5 Pseudodifferential Operator of the Second-Order Radial Derivative
4.6 Pseudodifferential Operator for Satellite Gravity Gradiometry
4.7 Survey on Pseudodifferential Operators Relevant in Satellite Technology
4.8 Classical Boundary Value Problems and Satellite Problems
4.9 A Short Introduction to the Regularization of Ill-Posed Problems
4.10 Regularization of the Exponentially Ill-Posed SGG-Problem
5 Future Directions
6 Conclusion
References
Abstract
Satellite gravity gradiometry (SGG) is an ultrasensitive detection technique of the space gravitational gradient (i.e., the Hesse tensor of the Earth’s gravitational potential). In this note, SGG, understood as a spacewise inverse problem of satellite technology, is discussed under three mathematical aspects: First, SGG is considered from a potential theoretic point of view as a continuous problem of “harmonic downward continuation.” The space-borne gravity gradients are assumed to be known continuously over the “satellite (orbit) surface”; the purpose is to specify sufficient conditions under which uniqueness and existence can be guaranteed. In a spherical context, mathematical results are outlined by the decomposition of the Hesse matrix in terms of tensor spherical harmonics. Second, the potential theoretic information leads us to a reformulation of the SGG-problem as an ill-posed pseudodifferential equation. Its solution is dealt with by classical regularization methods, based on filtering techniques. Third, a very promising method is worked out for developing an immediate interrelation between the Earth’s gravitational potential at the Earth’s surface and the known gravitational tensor.

W. Freeden, Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
M. Schreiner, Institute for Computational Engineering, University of Buchs, Buchs, Switzerland
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_9
1 Introduction
Due to the nonspherical shape, the irregularities of its interior mass density, and the movement of the lithospheric plates, the external gravitational field of the Earth shows significant variations. The recognition of the structure of the Earth’s gravitational potential is of tremendous importance for many questions in geosciences, for example, the analysis of present day tectonic motions, the study of the Earth’s interior, models of deformation analysis, the determination of the sea surface topography, and circulations of the oceans, which, of course, have a great influence on the global climate and its change. Therefore, a detailed knowledge of the global gravitational field including the local high-resolution microstructure is essential for various scientific disciplines.

Fig. 1 The principle of a gradiometer

Satellite gravity gradiometry (SGG) is a modern domain of studying the characteristics, the structure, and the variation process of the Earth’s gravitational field. The principle of satellite gradiometry can be explained roughly by the following model (cf. Fig. 1): several test masses in a low orbiting satellite feel, due to their distinct positions and the local changes of the gravitational field, different forces, thus yielding different accelerations. The measurements of the relative accelerations
between two test masses provide information about the second-order partial derivatives of the gravitational potential. To be more concrete, differences between the displacements of opposite test masses are measured. This yields information on the differences of the forces. Since the gradiometer itself is small, these differences can be identified with differentials, so that a so-called full gradiometer gives information on the whole tensor consisting of all second-order partial derivatives of the gravitational potential, i.e., the Hesse matrix. In an ideal case, the full Hesse matrix can be observed by an array of test masses. On 17 March 2009, the European Space Agency (ESA) began to realize the concept of SGG with the launch of the most sophisticated mission ever to investigate the Earth’s gravitational field, viz. GOCE (Gravity Field and Steady-State Ocean Circulation Explorer). ESA’s 1-ton spacecraft carried a set of six state-of-the-art, high-sensitivity accelerometers to measure the components of the gravity field along all three axes (see the contribution of R. Rummel in this issue for more details on the measuring devices of this satellite). GOCE produced a coverage of the entire Earth with measurements (apart from gaps at the polar regions). For around 20 months, GOCE gathered gravitational data. After running out of propellant, the GOCE satellite began dropping out of its orbit in October 2013 and made an uncontrolled reentry on 11 November 2013. In order to make this mission successful, ESA and its partners had to overcome an impressive technical challenge by designing a satellite that orbits the Earth close enough (at an altitude of only 250 km) to collect high-accuracy gravitational data while being able to filter out disturbances caused, e.g., by the remaining traces of the atmosphere.
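The identification of acceleration differences with second derivatives can be checked numerically. The following Python sketch uses the potential $V = GM/|x|$ of a point mass as a stand-in for the Earth’s field (an assumption of this example, as are the function names, the value of `GM`, and the 0.5 m baseline) and compares the difference of the accelerations at two test-mass positions with the Hesse matrix applied to the baseline vector.

```python
import math

GM = 3.986004418e14  # m^3 s^-2, geocentric gravitational constant (assumed value)


def gravitation(x):
    """Gradient of V = GM/|x|, i.e., the acceleration caused by a point mass."""
    r = math.sqrt(sum(c * c for c in x))
    return [-GM * c / r**3 for c in x]


def hesse_matrix(x):
    """Analytic Hesse matrix of V = GM/|x|: GM (3 x_i x_j - r^2 delta_ij) / r^5."""
    r2 = sum(c * c for c in x)
    r5 = r2 ** 2.5
    return [[GM * (3.0 * (x[i] * x[j]) - (r2 if i == j else 0.0)) / r5
             for j in range(3)] for i in range(3)]


def differential_acceleration(x, baseline):
    """Acceleration difference of two test masses located at x +/- baseline/2."""
    upper = [xi + 0.5 * bi for xi, bi in zip(x, baseline)]
    lower = [xi - 0.5 * bi for xi, bi in zip(x, baseline)]
    return [gu - gl for gu, gl in zip(gravitation(upper), gravitation(lower))]


def tensor_times_baseline(x, baseline):
    """Hesse matrix at x applied to the baseline vector."""
    h = hesse_matrix(x)
    return [sum(h[i][j] * baseline[j] for j in range(3)) for i in range(3)]


# A point 250 km above a sphere of radius 6378137 m, radial baseline of 0.5 m.
position = [6378137.0 + 250000.0, 0.0, 0.0]
baseline = [0.5, 0.0, 0.0]
print(differential_acceleration(position, baseline)[0])
print(tensor_times_baseline(position, baseline)[0])
```

Both printed numbers are of the order of $10^{-6}$ m s$^{-2}$, which gives a feeling for how small the measured differences are compared with the acceleration itself. The trace of the Hesse matrix vanishes in this example because the point-mass potential is harmonic outside the mass.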
It is not surprising that, during the last decade, the ambitious GOCE mission motivated many scientific activities, so that a large body of written material is available in different fields concerned with special user group activities, mission synergy, calibration as well as validation procedures, geoscientific progress (in fields like gravity field recovery, ocean circulation, hydrology, glaciology, deformation, climate modeling, etc.), data management, and so on. A survey of the recent status is given by the “ESA Living Planet Programme”, which also contains a list of GOCE publications (see also the contribution by the ESA-Frascati Group in this issue; for information from a geodetic point of view the reader is referred, e.g., to Beutler et al. 2003; ESA 1999, 2007; Rummel et al. 1993). Mathematically, the literature dealing with solution procedures for problems related to SGG can be divided essentially into two classes: the timewise approach and the spacewise approach. The former considers the measured data as a time series, while the latter supposes that the data are given in advance on a (closed) surface. This chapter is part of the spacewise approach. Its goal is a potential theoretically reflected approach to SGG with strong interest in the characterization of SGG-data types and the tensorially oriented solution of the occurring (pseudodifferential) SGG-equations by regularization. Particular emphasis is laid on the transition from scalar data types (such as the second-order radial derivative) to full tensor data of the Hesse matrix.
W. Freeden and M. Schreiner

2 SGG in Potential Theoretic Perspective
Gravity as observed on the Earth’s surface is the combined effect of the gravitational mass attraction and the centrifugal force due to the Earth’s rotation. The force of gravity provides a directional structure to the space above the Earth’s surface. It is tangential to the vertical plumb lines and perpendicular to all level surfaces. Any water surface at rest is part of a level surface. If the Earth were a homogeneous, spherical body, gravity would be constant all over the Earth’s surface, the well-known quantity $9.8\ \mathrm{m\,s^{-2}}$. The plumb lines are directed toward the Earth’s center of mass, and this implies that all level surfaces are nearly spherical, too. However, gravity decreases from the poles to the equator by about $0.05\ \mathrm{m\,s^{-2}}$. First, this is caused by the flattening of the Earth’s figure and the negative effect of the centrifugal force, which is maximal at the equator. Second, high mountains and deep ocean trenches cause the gravity to vary. Third, materials within the Earth’s interior are not uniformly distributed. The irregular gravity field shapes a virtual surface, the geoid. The level surfaces are ideal reference surfaces, for example, for heights. In more detail, the gravity acceleration (gravity) $w$ is the resultant of gravitation $v$ and centrifugal acceleration $c$, i.e., $w = v + c$. The centrifugal force $c$ arises as a result of the rotation of the Earth about its axis. We assume here a rotation of constant angular velocity $\omega_0$ about the rotational axis $x_3$, which is further assumed to be fixed with respect to the Earth. The centrifugal acceleration acting on a unit mass is directed outward perpendicularly to the spin axis. If the $\varepsilon^3$-axis of an Earth-fixed coordinate system coincides with the axis of rotation, then we have $c(x) = -\omega_0^2\, \varepsilon^3 \wedge (\varepsilon^3 \wedge x)$. Using the so-called centrifugal potential $C(x) = (1/2)\,\omega_0^2\,(x_1^2 + x_2^2)$ we can write $c = \nabla C$.
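The identity $c = \nabla C$ can be checked numerically. The following is a minimal sketch, not part of the original chapter: the rotation rate `OMEGA_0` is an assumed nominal value, and the helper names are purely illustrative. It evaluates the double cross product and compares it with a central-difference gradient of the centrifugal potential.

```python
import numpy as np

OMEGA_0 = 7.292115e-5  # rad/s, nominal Earth rotation rate (assumed value)

def centrifugal_acceleration(x, omega=OMEGA_0):
    """c(x) = -omega^2 * e3 ^ (e3 ^ x), with e3 the rotation axis."""
    e3 = np.array([0.0, 0.0, 1.0])
    return -omega**2 * np.cross(e3, np.cross(e3, x))

def centrifugal_potential(x, omega=OMEGA_0):
    """C(x) = (1/2) * omega^2 * (x1^2 + x2^2)."""
    return 0.5 * omega**2 * (x[0]**2 + x[1]**2)

def numerical_gradient(f, x, h=100.0):
    """Central-difference gradient (exact for quadratics up to rounding)."""
    grad = np.zeros(3)
    for i in range(3):
        step = np.zeros(3)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * h)
    return grad

# Hypothetical point in an Earth-fixed frame, coordinates in metres.
x = np.array([4.0e6, 3.0e6, 3.5e6])
c_cross = centrifugal_acceleration(x)
c_grad = numerical_gradient(centrifugal_potential, x)
print(c_cross, c_grad)
```

Both vectors point outward, perpendicular to the spin axis, and have no component along $\varepsilon^3$, in line with the description above.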
The direction of the gravity $w$ is known as the direction of the plumb line; the quantity $|w|$ is called the gravity intensity (often just gravity). The gravity potential $W$ of the Earth is the sum of the gravitational potential $V$ and the centrifugal potential $C$, i.e., $W = V + C$. The gravity acceleration $w$ is given by $w = \nabla W = \nabla V + \nabla C$. The surfaces of constant gravity potential, $W(x) = \mathrm{const}$, $x \in \mathbb{R}^3$, are designated as equipotential (level, or geopotential) surfaces of gravity. In an Earth-fixed coordinate system, the centrifugal potential $C$ is explicitly known. Hence, the determination of equipotential surfaces of the potential $W$ is strongly related to the knowledge of the potential $V$. The gravity vector $w$ given by $w(x) = \nabla_x W(x)$, where the point $x \in \mathbb{R}^3$ is located outside or on a sphere around the origin with Earth’s radius $R$, is normal to the equipotential surface passing through the same point. Thus, equipotential surfaces intuitively express the notion of tangential surfaces, as they are normal to the plumb lines given by the direction of the gravity vector (for more details see, for example, Heiskanen and Moritz (1967), Freeden and Schreiner (2009) and the contributions by H. Moritz in this issue). According to the classical Newton’s Law of Gravitation (1687), knowing the density distribution $\rho$ of a body, the gravitational potential can be computed everywhere in $\mathbb{R}^3$. More explicitly, the gravitational potential $V$ of the Earth’s exterior is given by
Satellite Gravity Gradiometry (SGG): From Scalar to Tensorial Solution
\[
V(x) = G \int_{\mathrm{Earth}} \frac{\rho(y)}{|x-y|}\, dV(y), \qquad x \in \mathbb{R}^3 \setminus \mathrm{Earth}, \tag{1}
\]
where $G$ is the gravitational constant ($G = 6.6742 \cdot 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}}$) and $dV$ is the (Lebesgue) volume measure. The properties of the gravitational potential (1) in the Earth’s exterior are appropriately described by the Laplace equation:
\[
\Delta V(x) = 0, \qquad x \in \mathbb{R}^3 \setminus \mathrm{Earth}. \tag{2}
\]
The gravitational potential $V$ as defined by (1) is regular at infinity, i.e.,
\[
|V(x)| = O\!\left(\frac{1}{|x|}\right), \qquad |x| \to \infty. \tag{3}
\]
For practical purposes, the problem is that in reality the density distribution is very irregular and known only for parts of the upper crust of the Earth. In fact, geoscientists would like to determine it from measurements of the gravitational field. Even if the Earth is supposed to be spherical, the determination of the gravitational potential by integrating Newton’s potential is not achievable. This is the reason why, in simplifying spherical nomenclature, we first expand the so-called reciprocal distance in terms of harmonics (related to the Earth’s mean radius $R$) as a series
\[
\frac{1}{|x-y|} = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \frac{4\pi R}{2n+1}\, H^R_{-n-1,k}(x)\, H^R_{n,k}(y), \tag{4}
\]
where $H^R_{n,k}$ is an inner harmonic of degree $n$ and order $k$ given by
\[
H^R_{n,k}(x) = \frac{1}{R} \left(\frac{|x|}{R}\right)^{n} Y_{n,k}(\xi), \qquad x = |x|\,\xi,\ \xi \in \Omega, \tag{5}
\]
and $H^R_{-n-1,k}$ is an outer harmonic of degree $n$ and order $k$ given by
\[
H^R_{-n-1,k}(x) = \frac{1}{R} \left(\frac{R}{|x|}\right)^{n+1} Y_{n,k}(\xi), \qquad x = |x|\,\xi,\ \xi \in \Omega \tag{6}
\]
($\Omega$ is the unit sphere in $\mathbb{R}^3$). Note that the family $\{Y_{n,k}\}_{n=0,1,\ldots;\ k=1,\ldots,2n+1}$ is an $L^2(\Omega)$-orthonormal system of scalar spherical harmonics (for more details concerning spherical harmonics see, e.g., Müller (1966), Freeden et al. (1998), Freeden and Schreiner (2009), Freeden and Gerhards (2013), and Freeden and Gutting (2013)). Insertion of the series expansion (4) into Newton’s formula for the external gravitational potential yields
\[
V(x) = G \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \frac{4\pi R}{2n+1} \int_{\Omega_R^{\mathrm{int}}} \rho(y)\, H^R_{n,k}(y)\, dV(y)\; H^R_{-n-1,k}(x). \tag{7}
\]
The expansion coefficients of the series (7) are not computable, since their determination requires the knowledge of the density function $\rho$ in the Earth’s interior (see the introductory chapter and the contribution of V. Michel in this issue). In fact, it turns out that there are infinitely many mass distributions which have the given gravitational potential of the Earth as exterior potential. Nevertheless, collecting the results from potential theory on the Earth’s gravitational field $v$ for the outer space (in spherical approximation), we are confronted with the following (mathematical) characterization: $v$ is an infinitely often differentiable vector field in the exterior of the Earth such that (v1) $\operatorname{div} v = \nabla \cdot v = 0$, $\operatorname{curl} v = L \cdot v = 0$ in the Earth’s exterior, (v2) $v$ is regular at infinity: $|v(x)| = O(1/|x|^2)$, $|x| \to \infty$. Seen from a mathematical point of view, the properties (v1) and (v2) imply that the Earth’s gravitational field $v$ in the exterior of the Earth is a gradient field $v = \nabla V$, where the gravitational potential $V$ fulfills the following properties: $V$ is an infinitely often differentiable scalar field in the exterior of the Earth such that (V1) $V$ is harmonic in the Earth’s exterior, and vice versa. Moreover, the gradient field of the Earth’s gravitational field (i.e., the Jacobi matrix field) $\mathbf{v} = \nabla v$ obeys the following properties: $\mathbf{v}$ is an infinitely often differentiable tensor field in the exterior of the Earth such that ($\mathbf{v}$1) $\operatorname{div} \mathbf{v} = \nabla \cdot \mathbf{v} = 0$, $\operatorname{curl} \mathbf{v} = 0$ in the Earth’s exterior, ($\mathbf{v}$2) $\mathbf{v}$ is regular at infinity: $|\mathbf{v}(x)| = O(1/|x|^3)$, $|x| \to \infty$, and vice versa. Combining our identities, we finally see that $\mathbf{v}$ can be represented as the Hesse tensor of the scalar field $V$, i.e., $\mathbf{v} = \nabla \otimes \nabla V = \nabla^{(2)} V$. The technological SGG-principle of determining the tensor field $\mathbf{v}$ at satellite altitude is illustrated graphically in Fig. 2. The position of a low orbiting satellite is tracked using GPS. Inside the satellite there is a gradiometer.
A simplified model of a gradiometer is sketched in Fig. 1. A photo of the GOCE satellite is contained in the contribution of R. Rummel in this issue. An array of test masses is connected with springs. Once more, the measured quantities are the differences between the displacements of opposite test masses. According to Hooke’s law, the mechanical configuration provides information on the differences of the forces. They, however, are due to local differences of $\nabla V$. Since the gradiometer itself is small, these differences can be identified with differentials, so that a so-called full gradiometer gives information on the whole tensor consisting of all second-order partial derivatives of $V$, i.e., the Hesse matrix $\mathbf{v}$ of $V$. From our preparatory remarks, it becomes obvious that the potential theoretic situation for the SGG-problem can be formulated briefly as follows: Suppose the satellite data $\mathbf{v} = \nabla \otimes \nabla V$ are known continuously over the “orbital surface”; then the satellite gravity gradiometry problem amounts to the problem of determining $V$ from $\mathbf{v} = \nabla \otimes \nabla V$ at the “orbital surface.” Mathematically, SGG is a nonstandard problem of potential theory. The reasons are obvious:
Fig. 2 The principle of satellite gravity gradiometry (from ESA (1999)); labels in the figure: GPS satellites, gradiometry, mass anomaly, Earth
• SGG is ill-posed since the data are not given on the boundary of the domain of interest, i.e., on the Earth’s surface, but on a surface in the exterior domain of the Earth, i.e., at a certain height.
• Tensorial SGG-data (or scalar manifestations of them) do not form the standard equipment of potential theory (such as, e.g., Dirichlet or Neumann data). Thus, it is – at first sight – not clear whether these data ensure the uniqueness of the SGG-problem or not.
• SGG-data have their natural limit because of the strong damping of the high-frequency parts of the (spherical harmonic expansion of the) gravitational potential with increasing satellite height.

For a heuristic explanation of this calamity, let us start from the assumption that the gravitational potential outside the spherical Earth’s surface $\Omega_R$ with the mean radius $R$ is given by the ordinary expansion in terms of outer harmonics (confer the identity (7))
\[
V(x) = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \int_{\Omega_R} V(y)\, H^R_{-n-1,k}(y)\, d\omega(y)\; H^R_{-n-1,k}(x) \tag{8}
\]
($d\omega$ is the usual surface measure). Then it is not hard to see that those parts of the gravitational potential belonging to the outer harmonics $H^R_{-n-1,k}$ of degree $n$ are damped at height $H$ above the Earth’s surface $\Omega_R$ by a factor $[R/(R+H)]^{n+1}$. A way out of this difficulty is seen in SGG, where, e.g., second-order radial derivatives of the gravitational potential are available at a height of typically about 250 km. The second derivatives cause (roughly speaking) an amplification of the potential coefficients by a factor of order $n^2$. This compensates the
damping effect due to the satellite’s height if $n$ is not too large. Nevertheless, in spite of the amplification, the SGG-problem still remains (exponentially) ill-posed. Altogether, the gravitational potential decreases exponentially with increasing height, and therefore the process of transforming the data down to the Earth’s surface (usually called “downward continuation”) is unstable.

The non-canonical (SGG-)situation of uniqueness within the potential theoretic framework can be demonstrated already by a simple example in spherical context: Suppose that one scalar component of the Hesse tensor is prescribed for all points $x$ on the sphere $\Omega_{R+H} = \{x \in \mathbb{R}^3 \mid |x| = R+H\}$. Is the gravitational potential $V$ unique on the sphere $\Omega_R = \{x \in \mathbb{R}^3 \mid |x| = R\}$? The answer is negative, in general. To see this, we construct a counterexample: If $b \in \mathbb{R}^3$ with $|b| = 1$ is given, the second-order directional derivative of $V$ at the point $x$ is $b^T (\nabla \otimes \nabla V(x))\, b$. Given a potential $V$, we construct a vector field $b$ on $\Omega_{R+H}$ such that the second-order directional derivative $b^T (\nabla \otimes \nabla V)\, b$ is zero: Assume that $V$ is a solution of (2) and (3). For each $x \in \Omega_{R+H}$, we know that the Hesse tensor $\nabla \otimes \nabla V(x)$ is symmetric. Thus, there exists an orthogonal matrix $A(x)$ so that $A(x)^T (\nabla \otimes \nabla V(x)) A(x) = \operatorname{diag}(\lambda_1(x), \lambda_2(x), \lambda_3(x))$, where $\lambda_1(x), \lambda_2(x), \lambda_3(x)$ are the eigenvalues of $\nabla \otimes \nabla V(x)$. From the harmonicity of $V$ it is clear that $0 = \Delta V(x) = \lambda_1(x) + \lambda_2(x) + \lambda_3(x)$. Let $\mu_0 = 3^{-1/2}(1, 1, 1)^T$. We define the vector field $\mu: \Omega_{R+H} \to \mathbb{R}^3$ by $\mu(x) = A(x)\mu_0$, $x \in \Omega_{R+H}$. Then we obtain
\[
\mu^T(x)\,(\nabla \otimes \nabla V(x))\,\mu(x) = \mu_0^T A(x)^T (\nabla \otimes \nabla V(x)) A(x)\, \mu_0 \tag{9}
\]
\[
= \frac{1}{3}\,(1\ \ 1\ \ 1) \begin{pmatrix} \lambda_1(x) & 0 & 0 \\ 0 & \lambda_2(x) & 0 \\ 0 & 0 & \lambda_3(x) \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{3}\,(\lambda_1(x) + \lambda_2(x) + \lambda_3(x)) = 0. \tag{10}
\]
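The construction in (9) and (10) is easy to reproduce numerically. The sketch below is an illustration only: a random symmetric, trace-free matrix stands in for the Hesse tensor of a harmonic potential at one point (an assumption, since no concrete $V$ is fixed here), and the symbol $\mu$ follows the reconstruction used above.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_harmonic_hessian(rng):
    """Random symmetric, trace-free 3x3 matrix.

    Stand-in for the Hesse tensor of a harmonic potential at a single point:
    symmetry reflects equality of mixed derivatives, zero trace reflects
    the Laplace equation.
    """
    a = rng.standard_normal((3, 3))
    h = 0.5 * (a + a.T)
    return h - np.trace(h) / 3.0 * np.eye(3)

hessian = random_harmonic_hessian(rng)

# Columns of A are orthonormal eigenvectors, so A^T H A = diag(eigenvalues).
eigenvalues, A = np.linalg.eigh(hessian)

mu0 = np.ones(3) / np.sqrt(3.0)   # mu_0 = 3^(-1/2) (1, 1, 1)^T
mu = A @ mu0                      # direction mu(x) = A(x) mu_0

# Second-order directional derivative in the direction mu.
second_derivative = mu @ hessian @ mu
print(second_derivative)
```

The printed value vanishes up to rounding for any symmetric trace-free input, which is exactly why this single directional derivative carries no information about $V$.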
Hence, we have constructed a vector field $\mu$ such that the second-order directional derivative of $V$ in the direction of $\mu(x)$ is zero for every point $x \in \Omega_{R+H}$. It can be easily seen that, for a given $V$, there exist many vector fields showing the same behavior with respect to uniqueness as the vector field $\mu$. Observing these arguments, we are led to the conclusion that the function $V$ is undetectable from the directional derivatives corresponding to $\mu$ (see also Schreiner 1994a,b). The good news, however, is that we are not lost here: As a matter of fact, there do exist conditions under which only one quantity of the Hesse tensor yields a unique solution (at least up to low order harmonics). In order to formulate these results, a certain decomposition of the Hesse tensor is necessary, which strongly depends on the separation of the Laplace operator in terms of polar coordinates. In order to follow this path, we start to reformulate the SGG-problem more easily in spherical context. For that purpose we start with some basic facts specifically formulated on the unit sphere $\Omega = \{x \in \mathbb{R}^3 \mid |x| = 1\}$. As is well-known, any $x \in \mathbb{R}^3$, $x \neq 0$,
can be decomposed uniquely in the form $x = r\xi$, where the directional part is an element of the unit sphere: $\xi \in \Omega$. Let $\{Y_{n,m}\}$, $Y_{n,m}: \Omega \to \mathbb{R}$, $n = 0, 1, \ldots$, $m = 1, \ldots, 2n+1$, be an orthonormal set of spherical harmonics. As is well-known (see, e.g., Freeden and Schreiner 2009), the system is complete in $L^2(\Omega)$; hence, each function $F \in L^2(\Omega)$ can be represented by the spherical harmonic expansion
\[
F(\xi) = \sum_{n=0}^{\infty} \sum_{m=1}^{2n+1} F^{\wedge}(n,m)\, Y_{n,m}(\xi), \qquad \xi \in \Omega, \tag{11}
\]
with “Fourier coefficients” given by
\[
F^{\wedge}(n,m) = (F, Y_{n,m})_{L^2(\Omega)} = \int_{\Omega} F(\xi)\, Y_{n,m}(\xi)\, d\omega(\xi). \tag{12}
\]
Furthermore, the (outer) harmonics $H_{-n-1,m}: \mathbb{R}^3 \setminus \{0\} \to \mathbb{R}$ related to the unit sphere $\Omega$ are denoted by $H_{-n-1,m}(x) = H^1_{-n-1,m}(x)$, where $H^1_{-n-1,m}(x) = (1/|x|^{n+1})\, Y_{n,m}(x/|x|)$. Clearly, they are harmonic functions and their restrictions coincide on $\Omega$ with the corresponding spherical harmonics. Any function $F \in L^2(\Omega)$ can, thus, be identified with a harmonic potential via the expansion (11); in particular, this holds true for the Earth’s external gravitational potential. This motivates the following mathematical model situation of the SGG-problem to be considered next:

(i) Isomorphism: Consider the sphere $\Omega_R \subset \mathbb{R}^3$ around the origin with radius $R > 0$. $\Omega_R^{\mathrm{int}}$ is the inner space of $\Omega_R$, and $\Omega_R^{\mathrm{ext}}$ is the outer space. By virtue of the isomorphism $\Omega \ni \xi \mapsto R\xi \in \Omega_R$ we assume functions $F: \Omega_R \to \mathbb{R}$ to be defined on $\Omega$. It is clear that the function spaces defined on $\Omega$ admit their natural generalizations as spaces of functions defined on $\Omega_R$. Obviously, an $L^2(\Omega)$-orthonormal system of spherical harmonics forms an orthogonal system on $\Omega_R$ (with respect to $(\cdot,\cdot)_{L^2(\Omega_R)}$). Moreover, with the relationship $\xi \leftrightarrow R\xi$, the differential operators on $\Omega_R$ can be related to operators on the unit sphere $\Omega$. In more detail, the surface gradient $\nabla^{*;R}$, the surface curl gradient $L^{*;R}$, and the Beltrami operator $\Delta^{*;R}$ on $\Omega_R$, respectively, admit the representation $\nabla^{*;R} = (1/R)\nabla^{*;1} = (1/R)\nabla^{*}$, $L^{*;R} = (1/R)L^{*;1} = (1/R)L^{*}$, $\Delta^{*;R} = (1/R^2)\Delta^{*;1} = (1/R^2)\Delta^{*}$, where $\nabla^{*}$, $L^{*}$, $\Delta^{*}$ are the surface gradient, surface curl gradient, and the Beltrami operator of the unit sphere $\Omega$, respectively. For $Y_n$ being a spherical harmonic of degree $n$ we have $\Delta^{*;R} Y_n = -(1/R^2)\, n(n+1)\, Y_n = (1/R^2)\, \Delta^{*} Y_n$.

(ii) Runge Property: Instead of looking for a harmonic function outside and on the (real) Earth, we search for a harmonic function outside the unit sphere $\Omega$ (assuming the units are chosen in such a way that the sphere with radius 1 is inside of the Earth and at the same time not too “far away” from the Earth’s boundary). The justification of this simplification (see Fig. 3) is based on the Runge approach (see, e.g., Freeden 1980a; Freeden and Michel 2004): To any harmonic function $V$ outside of the (real) Earth and any given $\varepsilon > 0$, there exists a harmonic function $U$ outside of a sphere inside the (real) Earth such that the absolute error $|V(x) - U(x)| < \varepsilon$ holds true for all points $x$ outside and on the (real) Earth’s surface.

Fig. 3 The role of the “Runge sphere” within the spherically reflected SGG-problem; labels in the figure: $\Omega_{R+H}$, $\Omega_R$, $H$, $R$, Earth
3 Decomposition of Tensor Fields by Means of Tensor Spherical Harmonics
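Let us recapitulate that any point $\xi \in \Omega$ may be represented by polar coordinates in a standard way
\[
\xi = t\,\varepsilon^3 + \sqrt{1-t^2}\,(\cos\varphi\,\varepsilon^1 + \sin\varphi\,\varepsilon^2), \qquad -1 \le t \le 1,\ 0 \le \varphi < 2\pi,\ t = \cos\vartheta, \tag{13}
\]
($\vartheta \in [0,\pi]$: (co-)latitude, $\varphi$: longitude, $t$: polar distance). Consequently, any element $\xi \in \Omega$ may be represented using its coordinates $(\varphi, t)$ in accordance with (13). For the representation of vector and tensor fields on the unit sphere $\Omega$, we are led to use a local triad of orthonormal unit vectors $\varepsilon^r$, $\varepsilon^\varphi$, $\varepsilon^t$ in the directions $r$, $\varphi$, and $t$ as shown by Fig. 4 (for more details the reader is referred to Freeden and Schreiner (2009) and the references therein). As is well known, the second-order tensor fields on the unit sphere, i.e., $\mathbf{f}: \Omega \to \mathbb{R}^3 \otimes \mathbb{R}^3$, can be separated into their tangential and normal parts as follows:
\[
p_{*,\mathrm{nor}}\,\mathbf{f} = (\mathbf{f}\xi) \otimes \xi, \tag{14}
\]
\[
p_{\mathrm{nor},*}\,\mathbf{f} = \xi \otimes (\xi^T \mathbf{f}), \tag{15}
\]
\[
p_{*,\mathrm{tan}}\,\mathbf{f} = \mathbf{f} - p_{*,\mathrm{nor}}\,\mathbf{f} = \mathbf{f} - (\mathbf{f}\xi) \otimes \xi, \tag{16}
\]
\[
p_{\mathrm{tan},*}\,\mathbf{f} = \mathbf{f} - p_{\mathrm{nor},*}\,\mathbf{f} = \mathbf{f} - \xi \otimes (\xi^T \mathbf{f}), \tag{17}
\]
\[
p_{\mathrm{nor},\mathrm{tan}}\,\mathbf{f} = p_{\mathrm{nor},*}(p_{*,\mathrm{tan}}\,\mathbf{f}) = p_{*,\mathrm{tan}}(p_{\mathrm{nor},*}\,\mathbf{f}) = \xi \otimes (\xi^T \mathbf{f}) - (\xi^T \mathbf{f}\,\xi)\, \xi \otimes \xi. \tag{18}
\]
The operators $p_{\mathrm{nor},\mathrm{nor}}$, $p_{\mathrm{tan},\mathrm{nor}}$, and $p_{\mathrm{tan},\mathrm{tan}}$ are defined analogously. A tensor field $\mathbf{f}: \Omega \to \mathbb{R}^3 \otimes \mathbb{R}^3$ is called normal if $\mathbf{f} = p_{\mathrm{nor},\mathrm{nor}}\,\mathbf{f}$ and tangential if $\mathbf{f} = p_{\mathrm{tan},\mathrm{tan}}\,\mathbf{f}$. It is called left normal if $\mathbf{f} = p_{\mathrm{nor},*}\,\mathbf{f}$, left normal/right tangential if $\mathbf{f} = p_{\mathrm{nor},\mathrm{tan}}\,\mathbf{f}$, and so on. The constant tensor fields $\mathbf{i}_{\mathrm{tan}}$ and $\mathbf{j}_{\mathrm{tan}}$ can be defined using the local triads by
\[
\mathbf{i}_{\mathrm{tan}} = \varepsilon^\varphi \otimes \varepsilon^\varphi + \varepsilon^t \otimes \varepsilon^t, \qquad \mathbf{j}_{\mathrm{tan}} = \xi \wedge \mathbf{i}_{\mathrm{tan}} = \varepsilon^t \otimes \varepsilon^\varphi - \varepsilon^\varphi \otimes \varepsilon^t. \tag{19}
\]

Fig. 4 Local triads $\varepsilon^r$, $\varepsilon^\varphi$, $\varepsilon^t$ with respect to two different points $\xi$ and $\eta$ on the unit sphere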
Spherical tensor fields can be discussed in an elegant manner by the use of certain differential processes. Let $u$ be a continuously differentiable vector field on $\Omega$, i.e., $u \in c^{(1)}(\Omega)$, given in its coordinate form by
\[
u(\xi) = \sum_{i=1}^{3} U_i(\xi)\, \varepsilon^i, \qquad \xi \in \Omega,\ U_i \in C^{(1)}(\Omega). \tag{20}
\]
Then we define the operators $\nabla^* \otimes$ and $L^* \otimes$ by
\[
\nabla^* \otimes u(\xi) = \sum_{i=1}^{3} (\nabla^* U_i(\xi)) \otimes \varepsilon^i, \qquad \xi \in \Omega, \tag{21}
\]
\[
L^* \otimes u(\xi) = \sum_{i=1}^{3} (L^* U_i(\xi)) \otimes \varepsilon^i, \qquad \xi \in \Omega. \tag{22}
\]
Clearly, $\nabla^* \otimes u$ and $L^* \otimes u$ are left tangential. It is an important fact, however, that even if $u$ is tangential, the tensor fields $\nabla^* \otimes u$ and $L^* \otimes u$ are generally not tangential.

It is obvious that the product rule is valid. To be specific, let $F \in C^{(1)}(\Omega)$ and $u \in c^{(1)}(\Omega)$ (once more, note that the notation $u \in c^{(1)}(\Omega)$ means that the vector field $u$ is continuously differentiable on $\Omega$); then
\[
\nabla^* \otimes (F(\xi)\, u(\xi)) = \nabla^* F(\xi) \otimes u(\xi) + F(\xi)\, \nabla^* \otimes u(\xi), \qquad \xi \in \Omega. \tag{23}
\]
In view of the above equations and definitions, we accordingly introduce operators $o^{(i,k)}: C^{(2)}(\Omega) \to \mathbf{c}^{(0)}(\Omega)$ (note that $\mathbf{c}^{(0)}(\Omega)$ is the class of continuous second-order tensor fields on the unit sphere $\Omega$) by
\[
o^{(1,1)} F(\xi) = \xi \otimes \xi\, F(\xi), \tag{24}
\]
\[
o^{(1,2)} F(\xi) = \xi \otimes \nabla^* F(\xi), \tag{25}
\]
\[
o^{(1,3)} F(\xi) = \xi \otimes L^* F(\xi), \tag{26}
\]
\[
o^{(2,1)} F(\xi) = \nabla^* F(\xi) \otimes \xi, \tag{27}
\]
\[
o^{(3,1)} F(\xi) = L^* F(\xi) \otimes \xi, \tag{28}
\]
\[
o^{(2,2)} F(\xi) = \mathbf{i}_{\mathrm{tan}}(\xi)\, F(\xi), \tag{29}
\]
\[
o^{(2,3)} F(\xi) = \left(\nabla^* \otimes \nabla^* - L^* \otimes L^*\right) F(\xi) + 2\,\nabla^* F(\xi) \otimes \xi, \tag{30}
\]
\[
o^{(3,2)} F(\xi) = \left(\nabla^* \otimes L^* + L^* \otimes \nabla^*\right) F(\xi) + 2\,L^* F(\xi) \otimes \xi, \tag{31}
\]
\[
o^{(3,3)} F(\xi) = \mathbf{j}_{\mathrm{tan}}(\xi)\, F(\xi), \tag{32}
\]
$\xi \in \Omega$. After our preparations involving spherical second-order tensor fields it is not difficult to prove the following lemma.

Lemma 1. Let $F: \Omega \to \mathbb{R}$ be sufficiently smooth. Then the following statements are valid:
1. $o^{(1,1)} F$ is a normal tensor field.
2. $o^{(1,2)} F$ and $o^{(1,3)} F$ are left normal/right tangential.
3. $o^{(2,1)} F$ and $o^{(3,1)} F$ are left tangential/right normal.
4. $o^{(2,2)} F$, $o^{(2,3)} F$, $o^{(3,2)} F$ and $o^{(3,3)} F$ are tangential.
5. $o^{(1,1)} F$, $o^{(2,2)} F$, $o^{(2,3)} F$ and $o^{(3,2)} F$ are symmetric.
6. $o^{(3,3)} F$ is skew-symmetric.
7. $(o^{(1,2)} F)^T = o^{(2,1)} F$ and $(o^{(1,3)} F)^T = o^{(3,1)} F$.
8. For $\xi \in \Omega$,
\[
\operatorname{trace}\, o^{(i,k)} F(\xi) = \begin{cases} F(\xi) & \text{for } (i,k) = (1,1), \\ 2F(\xi) & \text{for } (i,k) = (2,2), \\ 0 & \text{for } (i,k) \neq (1,1), (2,2). \end{cases}
\]

The tangent representation theorem (cf. Backus 1966, 1967) asserts that if $p_{\mathrm{tan},\mathrm{tan}}\,\mathbf{f}$ is the tangential part of a tensor field $\mathbf{f} \in \mathbf{c}^{(2)}(\Omega)$, as defined above, then there exist unique scalar fields $F_{2,2}$, $F_{3,3}$, $F_{2,3}$, $F_{3,2}$ such that
\[
\int_{\Omega} F_{2,2}(\xi)\, d\omega(\xi) = \int_{\Omega} F_{3,3}(\xi)\, d\omega(\xi) = 0, \tag{33}
\]
\[
\int_{\Omega} F_{3,2}(\xi)\,(\varepsilon^i \cdot \xi)\, d\omega(\xi) = \int_{\Omega} F_{2,3}(\xi)\,(\varepsilon^i \cdot \xi)\, d\omega(\xi) = 0, \qquad i = 1, 2, 3, \tag{34}
\]
and
\[
p_{\mathrm{tan},\mathrm{tan}}\,\mathbf{f} = o^{(2,2)} F_{2,2} + o^{(2,3)} F_{2,3} + o^{(3,2)} F_{3,2} + o^{(3,3)} F_{3,3}. \tag{35}
\]
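Furthermore, the following orthogonality relations may be formulated: Let $F, G: \Omega \to \mathbb{R}$ be sufficiently smooth. Then for all $\xi \in \Omega$, we have $o^{(i,k)} F(\xi) \cdot o^{(i',k')} G(\xi) = 0$ whenever $(i,k) \neq (i',k')$. The adjoint operators $O^{(i,k)}$ satisfying
\[
\int_{\Omega} o^{(i,k)} F(\xi) \cdot \mathbf{f}(\xi)\, d\omega(\xi) = \int_{\Omega} F(\xi)\, O^{(i,k)} \mathbf{f}(\xi)\, d\omega(\xi), \tag{36}
\]
for all sufficiently smooth functions $F: \Omega \to \mathbb{R}$ and tensor fields $\mathbf{f}: \Omega \to \mathbb{R}^3 \otimes \mathbb{R}^3$ can be deduced by elementary calculations. In more detail, for $\mathbf{f} \in \mathbf{c}^{(2)}(\Omega)$, we find (cf. Freeden and Schreiner 2009)
\[
O^{(1,1)} \mathbf{f}(\xi) = \xi^T \mathbf{f}(\xi)\, \xi, \tag{37}
\]
\[
O^{(1,2)} \mathbf{f}(\xi) = -\nabla^* \cdot p_{\mathrm{tan}}(\xi^T \mathbf{f}(\xi)), \tag{38}
\]
\[
O^{(1,3)} \mathbf{f}(\xi) = -L^* \cdot p_{\mathrm{tan}}(\xi^T \mathbf{f}(\xi)), \tag{39}
\]
\[
O^{(2,1)} \mathbf{f}(\xi) = -\nabla^* \cdot p_{\mathrm{tan}}(\mathbf{f}(\xi)\, \xi), \tag{40}
\]
\[
O^{(3,1)} \mathbf{f}(\xi) = -L^* \cdot p_{\mathrm{tan}}(\mathbf{f}(\xi)\, \xi), \tag{41}
\]
\[
O^{(2,2)} \mathbf{f}(\xi) = \mathbf{i}_{\mathrm{tan}}(\xi) \cdot \mathbf{f}(\xi), \tag{42}
\]
\[
O^{(2,3)} \mathbf{f}(\xi) = \nabla^* \cdot p_{\mathrm{tan}}\left(\nabla^* \cdot p_{\mathrm{tan},*}\,\mathbf{f}(\xi)\right) - L^* \cdot p_{\mathrm{tan}}\left(L^* \cdot p_{\mathrm{tan},*}\,\mathbf{f}(\xi)\right) - 2\,\nabla^* \cdot p_{\mathrm{tan}}(\mathbf{f}(\xi)\, \xi), \tag{43}
\]
\[
O^{(3,2)} \mathbf{f}(\xi) = L^* \cdot p_{\mathrm{tan}}\left(\nabla^* \cdot p_{\mathrm{tan},*}\,\mathbf{f}(\xi)\right) + \nabla^* \cdot p_{\mathrm{tan}}\left(L^* \cdot p_{\mathrm{tan},*}\,\mathbf{f}(\xi)\right) - 2\,L^* \cdot p_{\mathrm{tan}}(\mathbf{f}(\xi)\, \xi), \tag{44}
\]
\[
O^{(3,3)} \mathbf{f}(\xi) = \mathbf{j}_{\mathrm{tan}}(\xi) \cdot \mathbf{f}(\xi), \tag{45}
\]
$\xi \in \Omega$. Provided that $F: \Omega \to \mathbb{R}$ is sufficiently smooth we see that
\[
O^{(i',k')} o^{(i,k)} F(\xi) = 0 \quad \text{if } (i,k) \neq (i',k'), \tag{46}
\]
whereas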
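[...] 0, $\xi \in \Omega$, we obtain the following decomposition of the Hesse matrix on the sphere $\Omega_{R+H}$, i.e., for $x \in \mathbb{R}^3$ with $|x| = R + H$:
\[
\begin{aligned}
\nabla \otimes \nabla H_{-n-1,m}((R+H)\xi) ={}& (n+1)(n+2)\, \frac{1}{(R+H)^{n+3}}\, o^{(1,1)} Y_{n,m}(\xi) \\
&- (n+2)\, \frac{1}{(R+H)^{n+3}} \left( o^{(1,2)} Y_{n,m}(\xi) + o^{(2,1)} Y_{n,m}(\xi) \right) \\
&- \frac{(n+1)(n+2)}{2}\, \frac{1}{(R+H)^{n+3}}\, o^{(2,2)} Y_{n,m}(\xi) \\
&+ \frac{1}{2}\, \frac{1}{(R+H)^{n+3}}\, o^{(2,3)} Y_{n,m}(\xi).
\end{aligned} \tag{66}
\]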
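As a plausibility check of (66), the simplest case $n = 0$ can be tested numerically. In the sketch below (an illustration with hypothetical helper names, not the authors' code) the constant harmonic $Y_{0,1} = 1/\sqrt{4\pi}$ is used, so that all terms involving $\nabla^*$ or $L^*$ vanish and only the $o^{(1,1)}$ and $o^{(2,2)}$ contributions remain; the tangential identity tensor is taken as $\mathbf{i}_{\mathrm{tan}} = \mathbf{i} - \xi \otimes \xi$. The result is compared with a finite-difference Hessian of $x \mapsto Y_{0,1}/|x|$.

```python
import numpy as np

Y00 = 1.0 / np.sqrt(4.0 * np.pi)   # constant spherical harmonic of degree 0

def outer_harmonic_degree0(x):
    """H_{-1,1}(x) = Y_{0,1} / |x| (outer harmonic of degree 0, unit sphere)."""
    return Y00 / np.linalg.norm(x)

def finite_difference_hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hesse matrix of f at x."""
    n = x.size
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            hess[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                          - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return hess

def formula_66_degree0(xi, radius):
    """Right-hand side of (66) for n = 0; gradient terms of the constant Y vanish."""
    n = 0
    normal_part = np.outer(xi, xi)              # o^(1,1) Y / Y = xi (x) xi
    tangential_part = np.eye(3) - normal_part   # o^(2,2) Y / Y = i_tan
    factor = Y00 / radius ** (n + 3)
    return factor * ((n + 1) * (n + 2) * normal_part
                     - 0.5 * (n + 1) * (n + 2) * tangential_part)

xi = np.array([1.0, 2.0, 2.0]) / 3.0   # a unit vector
radius = 1.04                          # plays the role of R + H with R = 1 (assumed)
hessian_fd = finite_difference_hessian(outer_harmonic_degree0, radius * xi)
hessian_66 = formula_66_degree0(xi, radius)
max_deviation = np.max(np.abs(hessian_fd - hessian_66))
print(max_deviation)
```

The trace of the right-hand side vanishes because the $o^{(1,1)}$ and $o^{(2,2)}$ coefficients cancel (traces $1$ and $2$, respectively), which reflects the harmonicity of the outer harmonics.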
Keeping in mind that any solution of the SGG-problem can be expressed as a series of outer harmonics and using the completeness of the spherical harmonics in the space of square-integrable functions on the unit sphere, it follows that the SGG-problem is uniquely solvable (up to some low order spherical harmonics) by the $O^{(1,1)}$, $O^{(1,2)}$, $O^{(2,1)}$, $O^{(2,2)}$, and $O^{(2,3)}$ components. To be more specific, we are able to formulate the following theorem:

Theorem 2. Let $V$ satisfy the following condition: $V \in \mathrm{Pot}(C^{(0)}(\Omega))$, i.e.,
\[
V \in C^{(0)}(\overline{\Omega^{\mathrm{ext}}}) \cap C^{(2)}(\Omega^{\mathrm{ext}}), \tag{67}
\]
\[
\Delta V(x) = 0, \qquad x \in \Omega^{\mathrm{ext}}, \tag{68}
\]
\[
|V(x)| = O\!\left(\frac{1}{|x|}\right), \qquad |x| \to \infty, \text{ uniformly for all directions.} \tag{69}
\]
Then the following statements are valid:
1. $O^{(i,k)}\, \nabla \otimes \nabla V((R+H)\xi) = 0$ if $(i,k) \in \{(1,3), (3,1), (3,2), (3,3)\}$.
2. $O^{(i,k)}\, \nabla \otimes \nabla V((R+H)\xi) = 0$ for $(i,k) \in \{(1,1), (2,2)\}$ if and only if $V = 0$.
3. $O^{(i,k)}\, \nabla \otimes \nabla V((R+H)\xi) = 0$ for $(i,k) \in \{(1,2), (2,1)\}$ if and only if $V|_{\Omega}$ is constant.
4. $O^{(2,3)}\, \nabla \otimes \nabla V((R+H)\xi) = 0$ if and only if $V|_{\Omega}$ is a linear combination of spherical harmonics of degree 0 and 1.

This theorem gives detailed information on which tensor components of the Hesse tensor ensure the uniqueness of the SGG-problem (see also the considerations due to Schreiner (1994a), Freeden et al. (2002), and Freeden and Nutz (2011)). Anyway, for a potential $V$ of class $\mathrm{Pot}(C^{(0)}(\Omega))$ with vanishing spherical harmonic moments of degree 0 and 1, such as the Earth’s disturbing potential (see, e.g., Heiskanen and Moritz (1967) for its definition), uniqueness is assured in all cases (listed in Theorem 2). Since we now know, at least in the spherical setting, which conditions guarantee the uniqueness of an SGG-solution, we can turn to the question of how to find a solution and what we mean by a solution, since we have to take into account the ill-posedness. To this end, we are interested here in analyzing the problem step by step. We start with the reformulation of the SGG-problem as a pseudodifferential equation on the sphere, give a short overview of regularization, and show how these ingredients can be put together to regularize the SGG-data. In doing so, we find great help in discussing how classical boundary value problems for the gravitational field of the Earth as well as modern satellite problems may be transferred into pseudodifferential equations, thereby always assuming the spherically oriented geometry. Indeed, it is helpful to treat the classical Dirichlet and Neumann boundary value problems as well as significant satellite problems such as satellite-to-satellite tracking (SST) and SGG.
4.1 SGG as Pseudodifferential Equation
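Let $\Sigma \subset \mathbb{R}^3$ be a regular surface, i.e., we assume the following properties: (i) $\Sigma$ divides the Euclidean space $\mathbb{R}^3$ into the bounded region $\Sigma^{\mathrm{int}}$ (inner space) and the unbounded region $\Sigma^{\mathrm{ext}}$ (outer space) so that $\Sigma^{\mathrm{ext}} = \mathbb{R}^3 \setminus \overline{\Sigma^{\mathrm{int}}}$, $\Sigma = \overline{\Sigma^{\mathrm{int}}} \cap \overline{\Sigma^{\mathrm{ext}}}$ with $\emptyset = \Sigma^{\mathrm{int}} \cap \Sigma^{\mathrm{ext}}$, (ii) $\Sigma^{\mathrm{int}}$ contains the origin, (iii) $\Sigma$ is a closed and compact surface free of double points, (iv) $\Sigma$ is locally of class $C^{(2)}$ (see Freeden and Schreiner (2004), Freeden and Gerhards (2013) for more details concerning regular surfaces). From our preparatory considerations (in particular, from the Introduction), it can be deduced that a gravitational potential of interest may be understood to be a member of the class $\mathrm{Pot}(C^{(0)}(\Sigma))$, i.e.,
\[
V \in C^{(0)}(\overline{\Sigma^{\mathrm{ext}}}) \cap C^{(2)}(\Sigma^{\mathrm{ext}}), \tag{70}
\]
\[
\Delta V(x) = 0, \qquad x \in \Sigma^{\mathrm{ext}}, \tag{71}
\]
\[
|V(x)| = O\!\left(\frac{1}{|x|}\right), \qquad |x| \to \infty, \text{ uniformly for all directions.} \tag{72}
\]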
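Assume that $\Omega_R = \{x \in \mathbb{R}^3 \mid |x| = R\}$ is a (Runge) sphere with radius $R$ around the origin, i.e., a sphere that lies entirely inside $\Sigma$, i.e., $\Omega_R \subset \Sigma^{\mathrm{int}}$. On the class $L^2(\Omega_R)$ we impose the inner product $(\cdot,\cdot)_{L^2(\Omega_R)}$. Then we know that the functions $\frac{1}{R} Y_{n,m}\!\left(\frac{\cdot}{R}\right)$ form an orthonormal set of functions on $\Omega_R$, i.e., given $F \in L^2(\Omega_R)$, its Fourier expansion reads
\[
F(x) = \sum_{n=0}^{\infty} \sum_{m=1}^{2n+1} \frac{1}{R^2} \left(F, Y_{n,m}\!\left(\frac{\cdot}{R}\right)\right)_{L^2(\Omega_R)} Y_{n,m}\!\left(\frac{x}{R}\right), \qquad x \in \Omega_R. \tag{73}
\]
Instead of considering potentials that are harmonic outside $\Sigma$ and continuous on $\Sigma$, we now consider potentials that are harmonic outside $\Omega_R$ and that are of class $L^2(\Omega_R)$. In accordance with our notation we define
\[
\mathrm{Pot}(L^2(\Omega_R)) = \left\{ x \mapsto \sum_{n=0}^{\infty} \sum_{m=1}^{2n+1} \frac{1}{R^2} \left(F, Y_{n,m}\!\left(\frac{\cdot}{R}\right)\right)_{L^2(\Omega_R)} \frac{R^{n+1}}{|x|^{n+1}}\, Y_{n,m}\!\left(\frac{x}{|x|}\right) \;\middle|\; F \in L^2(\Omega_R) \right\}. \tag{74}
\]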
Clearly, $\mathrm{Pot}(L^2(\Omega_R))$ is a “subset” of $\mathrm{Pot}(C^{(0)}(\Sigma))$ in the sense that if $V \in \mathrm{Pot}(L^2(\Omega_R))$, then $V|_{\overline{\Sigma^{\mathrm{ext}}}} \in \mathrm{Pot}(C^{(0)}(\Sigma))$. The “difference” of these two spaces is not “too large”: Indeed, we know from the Runge approximation theorem (cf. Freeden 1980a) that for every $\varepsilon > 0$ and every $V \in \mathrm{Pot}(C^{(0)}(\Sigma))$ there exists a $\hat{V} \in \mathrm{Pot}(L^2(\Omega_R))$ such that $\sup_{x \in \overline{\Sigma^{\mathrm{ext}}}} |V(x) - \hat{V}(x)| < \varepsilon$. Thus, in all geosciences, it is common (but not strictly consistent with the Runge argumentation) to identify $\Omega_R$ with the surface of the Earth and to assume that the restriction $V|_{\Omega_R}$ is of class $L^2(\Omega_R)$. Clearly, we have a canonical isomorphism between $L^2(\Omega_R)$ and $\mathrm{Pot}(L^2(\Omega_R))$, which is defined via the trace operator, i.e., the restriction to $\Omega_R$, and its harmonic continuation, respectively.
4.2 Upward/Downward Continuation
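Let $\Omega_{R+H}$ be the sphere with radius $R + H$. The system $\frac{1}{R+H} Y_{n,m}\!\left(\frac{\cdot}{R+H}\right)$ is then orthonormal in $L^2(\Omega_{R+H})$. (We assume $H$ to be the height of a satellite above the Earth’s surface.) Let $F \in \mathrm{Pot}(L^2(\Omega_R))$ be represented in the form
\[
x \mapsto \sum_{n=0}^{\infty} \sum_{m=1}^{2n+1} \frac{1}{R^2} \left(F, Y_{n,m}\!\left(\frac{\cdot}{R}\right)\right)_{L^2(\Omega_R)} \frac{R^{n+1}}{|x|^{n+1}}\, Y_{n,m}\!\left(\frac{x}{|x|}\right). \tag{75}
\]
Then the restriction of $F$ on $\Omega_{R+H}$ reads
\[
F|_{\Omega_{R+H}}: x \mapsto \sum_{n=0}^{\infty} \sum_{m=1}^{2n+1} \frac{1}{R^2} \left(F, Y_{n,m}\!\left(\frac{\cdot}{R}\right)\right)_{L^2(\Omega_R)} \frac{R^{n+1}}{(R+H)^{n+1}}\, Y_{n,m}\!\left(\frac{x}{R+H}\right). \tag{76}
\]
Hence, any element $\frac{1}{R} Y_{n,m}\!\left(\frac{\cdot}{R}\right)$ of the orthonormal system in $L^2(\Omega_R)$ is mapped to a function $\frac{R^n}{(R+H)^n}\, \frac{1}{R+H}\, Y_{n,m}\!\left(\frac{\cdot}{R+H}\right)$. The operation defined in such a way is called upward continuation. It is representable by the pseudodifferential operator (for more details on pseudodifferential operators the reader should consult Svensson (1983), Schneider (1997), Freeden et al. (1998), Freeden (1999), and Freeden and Schreiner (2009)) $\Lambda^{R,H}_{\mathrm{up}}: L^2(\Omega_R) \to L^2(\Omega_{R+H})$ with associated symbol
\[
\left(\Lambda^{R,H}_{\mathrm{up}}\right)^{\wedge}(n) = \frac{R^n}{(R+H)^n}. \tag{77}
\]
In other words, we have
\[
\Lambda^{R,H}_{\mathrm{up}}\, \frac{1}{R}\, Y_{n,m}\!\left(\frac{\cdot}{R}\right) = \left(\Lambda^{R,H}_{\mathrm{up}}\right)^{\wedge}(n)\, \frac{1}{R+H}\, Y_{n,m}\!\left(\frac{\cdot}{R+H}\right). \tag{78}
\]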
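To get a feeling for the size of the symbol (77), the following sketch tabulates it for a few degrees. The radius and height are assumed round values (a mean Earth radius of 6371 km and the 250 km altitude quoted earlier for GOCE), and the function names are illustrative only.

```python
import numpy as np

R_KM = 6371.0   # mean Earth radius in km (assumed value)
H_KM = 250.0    # satellite height in km, as quoted for GOCE

def upward_symbol(n, radius=R_KM, height=H_KM):
    """Symbol (R / (R + H))^n of the upward continuation operator, cf. (77)."""
    return (radius / (radius + height)) ** np.asarray(n, dtype=float)

def downward_symbol(n, radius=R_KM, height=H_KM):
    """Reciprocal symbol; it grows exponentially with the degree n."""
    return 1.0 / upward_symbol(n, radius, height)

degrees = np.array([0, 50, 100, 200, 300])
for n, up, down in zip(degrees, upward_symbol(degrees), downward_symbol(degrees)):
    print(f"n = {n:3d}   upward: {up:.3e}   downward: {down:.3e}")
```

The exponential decay of the upward symbol, and hence the exponential growth of its reciprocal, is the quantitative face of the instability of downward continuation discussed above.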
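The image of $\Lambda^{R,H}_{\mathrm{up}}$ is given by Picard’s criterion (cf. Theorem 4):
\[
\Lambda^{R,H}_{\mathrm{up}}(L^2(\Omega_R)) = \left\{ F \in L^2(\Omega_{R+H}) \;\middle|\; \sum_{n=0}^{\infty} \sum_{m=1}^{2n+1} \left(\frac{(R+H)^n}{R^n}\right)^{2} \left(F, \frac{1}{R+H}\, Y_{n,m}\!\left(\frac{\cdot}{R+H}\right)\right)^{2}_{L^2(\Omega_{R+H})} < \infty \right\}.
\]
[...] 0, so that
\[
\lim_{\alpha \to 0} R_\alpha \Lambda x = x \quad \text{for all } x \in H,
\]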
i.e., the operators $R_\alpha \Lambda$ converge pointwise to the identity. From the theory of inverse problems (see, e.g., Kirsch 1996) it is also clear that if $\Lambda: H \to K$ is compact and $H$ has infinite dimension (as is the case for the application we have in mind), then the operators $R_\alpha$ are not uniformly bounded, i.e., there exists a sequence $(\alpha_j)$ with $\lim_{j \to \infty} \alpha_j = 0$ and
\[
\|R_{\alpha_j}\|_{L(K,H)} \to \infty \quad \text{for } j \to \infty. \tag{106}
\]
Note that the convergence of $R_\alpha \Lambda x$ in Definition 1 is based on $y = \Lambda x$, i.e., on unperturbed data. In practice, the right-hand side is affected by errors and then no convergence is achieved. Instead, one is (or has to be) satisfied with an approximate solution based on a certain choice of the regularization parameter. Let us discuss the error of the solution. For this purpose, we let $y \in R(\Lambda)$ be the (unknown) exact right-hand side and $y^\delta \in K$ be the measured data with
\[
\|y - y^\delta\|_K < \delta. \tag{107}
\]
For a fixed $\alpha > 0$, we let
\[
x^{\alpha,\delta} = R_\alpha y^\delta, \tag{108}
\]
and look at $x^{\alpha,\delta}$ as an approximation of the solution $x$ of $\Lambda x = y$. Then the error can be split as follows:
\[
\|x^{\alpha,\delta} - x\|_H = \|R_\alpha y^\delta - x\|_H \le \|R_\alpha y^\delta - R_\alpha y\|_H + \|R_\alpha y - x\|_H \le \|R_\alpha\|_{L(K,H)}\, \|y^\delta - y\|_K + \|R_\alpha y - x\|_H, \tag{109}
\]
such that
\[
\|x^{\alpha,\delta} - x\|_H \le \delta\, \|R_\alpha\|_{L(K,H)} + \|R_\alpha \Lambda x - x\|_H. \tag{110}
\]
We see that the error between the exact and the approximate solution consists of two parts: The first term is the product of the bound for the error in the data and the norm of the regularization operator $R_\alpha$. This term will usually tend to infinity for $\alpha \to 0$ if the inverse $\Lambda^{-1}$ is unbounded and $\Lambda$ is compact (cf. (106)). The second term denotes the approximation error $\|(R_\alpha - \Lambda^{-1})y\|_H$ for the exact right-hand side $y = \Lambda x$. This error tends to zero as $\alpha \to 0$ by the definition of a regularization strategy. Thus, the two parts of the error show diametrically opposed behavior. A typical picture of the errors in dependence on the regularization parameter $\alpha$ is sketched in Fig. 9.

Fig. 9 Typical behavior of the total error in a regularization process (error plotted against $\alpha$; curves: total error, $\delta\,\|R_\alpha\|_{L(K,H)}$, and $\|R_\alpha \Lambda x - x\|_H$)

Thus, a strategy is needed to choose $\alpha$ dependent on $\delta$ in order to keep the error as small as possible, i.e., we would like to minimize
\[
\delta\, \|R_\alpha\|_{L(K,H)} + \|R_\alpha \Lambda x - x\|_H. \tag{111}
\]
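The trade-off expressed by (111) can be made visible with a small toy computation. The sketch below rests on several assumptions that are not taken from the text: a diagonal model operator with singular values $0.96^n$ (loosely mimicking an upward continuation symbol), hypothetical solution coefficients, a noise level `delta`, and the Tikhonov filter $\sigma^2/(\alpha + \sigma^2)$ as one concrete regularization.

```python
import numpy as np

# Toy diagonal operator: exponentially decaying singular values (assumed ratio 0.96).
degrees = np.arange(0, 200)
sigma = 0.96 ** degrees
x_true = 1.0 / (1.0 + degrees) ** 2   # hypothetical coefficients of the exact solution
delta = 1e-4                          # assumed bound on the data error

def tikhonov_filter(alpha, s):
    """Filter factor s^2 / (alpha + s^2) (one standard choice of regularization)."""
    return s**2 / (alpha + s**2)

def error_terms(alpha):
    """Return (delta * ||R_alpha||, ||R_alpha Lambda x - x||) for the toy operator."""
    q = tikhonov_filter(alpha, sigma)
    operator_norm = np.max(q / sigma)                 # norm of the diagonal operator R_alpha
    approximation_error = np.linalg.norm((q - 1.0) * x_true)
    return delta * operator_norm, approximation_error

alphas = np.logspace(-10, 0, 101)
terms = np.array([error_terms(a) for a in alphas])
total = terms.sum(axis=1)
best_alpha = alphas[np.argmin(total)]
print(f"alpha minimizing the bound: {best_alpha:.2e}")
```

In this toy setting the data term decreases and the approximation term increases as $\alpha$ grows, so the bound attains its minimum at an intermediate value of $\alpha$, in agreement with the qualitative picture of Fig. 9.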
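In principle, we distinguish two classes of parameter choice rules: If $\alpha = \alpha(\delta)$ does not depend on $y^\delta$, we call $\alpha = \alpha(\delta)$ an a priori parameter choice rule. Otherwise $\alpha$ also depends on $y^\delta$ and we call $\alpha = \alpha(\delta, y^\delta)$ an a posteriori parameter choice rule. It is usual to say that a parameter choice rule is convergent if, for $\delta \to 0$, the rule is such that
\[
\lim_{\delta \to 0} \sup\left\{ \|R_{\alpha(\delta, y^\delta)}\, y^\delta - \Lambda^{+} y\|_H \;\middle|\; y^\delta \in K,\ \|y^\delta - y\|_K \le \delta \right\} = 0 \tag{112}
\]
and
\[
\lim_{\delta \to 0} \sup\left\{ \alpha(\delta, y^\delta) \;\middle|\; y^\delta \in K,\ \|y - y^\delta\|_K \le \delta \right\} = 0. \tag{113}
\]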
We stop the discussion of parameter choice rules here. For more material the interested reader is referred to, e.g., Engl et al. (1996) and Kirsch (1996). The remaining part of this section is devoted to the case that $\Lambda$ is compact, since then we gain benefits from the spectral representations of the operators. If $\Lambda: H \to K$ is compact, a singular system $(\sigma_n; v_n, u_n)$ is defined as follows: $\{\sigma_n^2\}_{n \in \mathbb{N}}$ are the nonzero eigenvalues of the self-adjoint operator $\Lambda^*\Lambda$ ($\Lambda^*$ is the adjoint operator of $\Lambda$), written down in decreasing order with multiplicity. The family $\{v_n\}_{n \in \mathbb{N}}$ constitutes a corresponding complete orthonormal system of eigenvectors of $\Lambda^*\Lambda$. We let $\sigma_n > 0$ and define the family $\{u_n\}_{n \in \mathbb{N}}$ via $u_n = \Lambda v_n / \|\Lambda v_n\|_K$. The sequence $\{u_n\}_{n \in \mathbb{N}}$ forms a complete orthonormal system of eigenvectors of $\Lambda\Lambda^*$, and the following formulas are valid:
\[
\Lambda v_n = \sigma_n u_n, \tag{114}
\]
\[
\Lambda^* u_n = \sigma_n v_n, \tag{115}
\]
\[
\Lambda x = \sum_{n=1}^{\infty} \sigma_n\, (x, v_n)_H\, u_n, \qquad x \in H, \tag{116}
\]
\[
\Lambda^* y = \sum_{n=1}^{\infty} \sigma_n\, (y, u_n)_K\, v_n, \qquad y \in K. \tag{117}
\]
The convergence of the infinite series is understood with respect to the Hilbert space norms under consideration. The identities (116) and (117) are called the singular value expansions of the corresponding operators. If there are infinitely many singular values, they accumulate (only) at 0, i.e., $\lim_{n\to\infty} \sigma_n = 0$.
Theorem 4. Let $(\sigma_n; v_n, u_n)$ be a singular system for the compact linear operator $\Lambda$, $y \in K$. Then we have $y \in D(\Lambda^{+})$ if and only if
\[ \sum_{n=1}^{\infty} \frac{|(y, u_n)_K|^2}{\sigma_n^2} < \infty, \tag{118} \]
and for $y \in D(\Lambda^{+})$ it holds
\[ \Lambda^{+} y = \sum_{n=1}^{\infty} \frac{(y, u_n)_K}{\sigma_n}\, v_n. \tag{119} \]
The condition (118) is the Picard criterion. It says that a best-approximate solution of $\Lambda x = y$ exists only if the Fourier coefficients of $y$ decay fast enough relative to the singular values. The representation (119) of the best-approximate solution motivates a method for the construction of regularization operators, namely by damping the factors $1/\sigma_n$ in such a way that the series converges for all $y \in K$. We are looking for filters
\[ q : (0, \infty) \times (0, \|\Lambda\|_{L(H,K)}) \to \mathbb{R} \tag{120} \]
such that
\[ R_\alpha y = \sum_{n=1}^{\infty} \frac{q(\alpha, \sigma_n)}{\sigma_n}\, (y, u_n)_K\, v_n, \quad y \in K, \]
is a regularization strategy. The following statement is known, e.g., from Kirsch (1996).
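Theorem 5. Let $\Lambda : H \to K$ be compact with singular system $(\sigma_n; v_n, u_n)$. Assume that $q$ from (120) has the following properties:
1. $|q(\alpha, \sigma)| \le 1$ for all $\alpha > 0$ and $0 < \sigma \le \|\Lambda\|_{L(H,K)}$.
2. For every $\alpha > 0$ there exists a $c(\alpha)$ so that $|q(\alpha, \sigma)| \le c(\alpha)\,\sigma$ for all $0 < \sigma \le \|\Lambda\|_{L(H,K)}$.
3. $\lim_{\alpha \to 0} q(\alpha, \sigma) = 1$ for every $0 < \sigma \le \|\Lambda\|_{L(H,K)}$.
Then the operator $R_\alpha : K \to H$, $\alpha > 0$, defined by
\[ R_\alpha y = \sum_{n=1}^{\infty} \frac{q(\alpha, \sigma_n)}{\sigma_n}\, (y, u_n)_K\, v_n, \quad y \in K, \tag{121} \]
is a regularization strategy with $\|R_\alpha\|_{L(K,H)} \le c(\alpha)$. The function $q$ is called a regularizing filter for $\Lambda$. Two important examples should be mentioned:
\[ q(\alpha, \sigma) = \frac{\sigma^2}{\alpha + \sigma^2} \tag{122} \]
defines the Tikhonov regularization, whereas
\[ q(\alpha, \sigma) = \begin{cases} 1, & \sigma^2 \ge \alpha, \\ 0, & \sigma^2 < \alpha, \end{cases} \tag{123} \]
leads to the regularization by truncated singular value decomposition.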
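The two filters can be tried out on a small model problem. The following is a minimal sketch, not taken from the text: it assumes a hypothetical diagonal operator with singular values $\sigma_n = n^{-2}$, so that all vectors are represented directly by their coefficients with respect to $\{v_n\}$ and $\{u_n\}$, and it uses a deterministic perturbation of size $\delta$. The function names are illustrative.

```python
import math

def tikhonov_filter(alpha, sigma):
    """Tikhonov filter q(alpha, sigma) = sigma^2 / (alpha + sigma^2), Eq. (122)."""
    return sigma**2 / (alpha + sigma**2)

def tsvd_filter(alpha, sigma):
    """Truncated-SVD filter of Eq. (123): keep sigma^2 >= alpha, drop the rest."""
    return 1.0 if sigma**2 >= alpha else 0.0

def regularize(q, alpha, sigmas, y_coeffs):
    """Coefficients of R_alpha y w.r.t. v_n: q(alpha, sigma_n)/sigma_n * (y, u_n), Eq. (121)."""
    return [q(alpha, s) / s * c for s, c in zip(sigmas, y_coeffs)]

# Assumed model problem (hypothetical): sigma_n = n^-2, exact coefficients x_n = 1/n.
N = 50
sigmas = [1.0 / n**2 for n in range(1, N + 1)]
x_exact = [1.0 / n for n in range(1, N + 1)]
y_exact = [s * x for s, x in zip(sigmas, x_exact)]

# Deterministic data error with discrete l2 norm equal to delta.
delta = 1e-3
noise = [delta / math.sqrt(N) * (-1) ** n for n in range(N)]
y_noisy = [y + e for y, e in zip(y_exact, noise)]

def error(x_rec):
    """l2 distance between a reconstruction and the exact coefficients."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_rec, x_exact)))

naive = [c / s for s, c in zip(sigmas, y_noisy)]  # unregularized inversion
for alpha in (1e-2, 1e-4, 1e-6, 1e-8):
    e_tik = error(regularize(tikhonov_filter, alpha, sigmas, y_noisy))
    e_svd = error(regularize(tsvd_filter, alpha, sigmas, y_noisy))
    print(f"alpha={alpha:.0e}  Tikhonov error={e_tik:.3e}  TSVD error={e_svd:.3e}")
print(f"unregularized error={error(naive):.3e}")
```

For the Tikhonov filter, property 2 of Theorem 5 holds with $c(\alpha) = 1/(2\sqrt{\alpha})$, and for the truncated singular value decomposition with $c(\alpha) = 1/\sqrt{\alpha}$; both follow directly from (122) and (123).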
4.10 Regularization of the Exponentially Ill-Posed SGG-Problem
We are now in a position to have a closer look at the role of the regularization techniques that work particularly well for the SGG-problem. In (95), the SGG-problem is formulated as a pseudodifferential equation: Given $G \in L^2(\Omega)$, find $F \in L^2(\Omega)$ so that $\Lambda^{R;H}_{SGG} F = G$ with
\[ \left(\Lambda^{R;H}_{SGG}\right)^{\wedge}(n) = \frac{(n+1)(n+2)}{(R+H)^2}\, \frac{R^n}{(R+H)^n}. \tag{124} \]
Switching now to a finite dimensional space (which is then the realization of the regularization by a singular value truncation), we are interested in a solution of the representation
\[ F_N = \sum_{n=1}^{N} \sum_{m=1}^{2n+1} F^{\wedge}(n,m)\, Y_{n,m}. \tag{125} \]
Using a decomposition of $G$ of the form
\[ G = \sum_{n=1}^{N} \sum_{m=1}^{2n+1} G^{\wedge}(n,m)\, Y_{n,m}, \tag{126} \]
we end up with the spectral equations
\[ \left(\Lambda^{R;H}_{SGG}\right)^{\wedge}(n)\, F^{\wedge}(n,m) = G^{\wedge}(n,m), \quad n = 1, \dots, N;\ m = 1, \dots, 2n+1. \tag{127} \]
In other words, in connection with (125) and (126), we find the relations
\[ F^{\wedge}(n,m) = \frac{G^{\wedge}(n,m)}{\left(\Lambda^{R;H}_{SGG}\right)^{\wedge}(n)}, \quad n = 1, \dots, N;\ m = 1, \dots, 2n+1. \tag{128} \]
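The exponential decay of the symbol (124), and hence the exponential growth of the factor $1/(\Lambda^{R;H}_{SGG})^{\wedge}(n)$ in (128), can be made visible with a few lines of code. This is an illustrative sketch only: the values of $R$ and $H$ below are assumptions chosen for the example (a mean Earth radius and a satellite height of 250 km), and the coefficient container is a hypothetical dictionary keyed by $(n, m)$.

```python
def sgg_symbol(n, R, H):
    """Symbol of Eq. (124): (n+1)(n+2)/(R+H)^2 * (R/(R+H))^n."""
    return (n + 1) * (n + 2) / (R + H) ** 2 * (R / (R + H)) ** n

def spectral_solution(G_hat, R, H):
    """Eq. (128): divide each coefficient G^(n, m) by the symbol of its degree n.

    G_hat maps (n, m) -> coefficient; the result uses the same keys.
    """
    return {(n, m): g / sgg_symbol(n, R, H) for (n, m), g in G_hat.items()}

# Illustrative values only (assumed, not taken from the text):
R = 6371.0e3   # mean Earth radius in metres
H = 250.0e3    # satellite height in metres

for n in (2, 50, 100, 200, 300):
    s = sgg_symbol(n, R, H)
    print(f"n={n:3d}  symbol={s:.3e}  amplification 1/symbol={1.0 / s:.3e}")
```

Any error in a coefficient $G^{\wedge}(n,m)$ is multiplied by the amplification factor of its degree, which is why the truncation degree $N$ acts as the regularization parameter here.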
For the realization of this solution we have to find the coefficients G ^ .n, m/. Of course, we are confronted with the usual problems of integration, aliasing, and so on. The identity (128) also opens the perspective for SGG-applications by bandlimited regularization wavelets in Earth’s gravitational field determination. For more details, we refer to Freeden et al. (1997), Schneider (1996, 1997), Freeden and Schneider (1998), Glockner (2002), Hesse (2003), and Freeden and Nutz (2011). The book written by Freeden (1999) contains non-bandlimited versions of (harmonic) regularization wavelets. Multiscale regularization by use of spherical up functions is the content of the papers by Freeden and Schreiner (2004) and Schreiner (2004).
5 Future Directions
The regularization schemes described above are based on the decomposition of the Hesse tensor at satellite height into scalar ingredients due to geometrical properties (normal, tangential, mixed) as well as to analytical properties originating from differentiation processes involving physically defined quantities (such as divergence, curl, etc.). SGG-regularization, however, is more suitable and effective if it is based on algorithms involving the full Hesse tensor such as from the GOCE mission (for more insight into the tensorial decomposition of GOCE-data, the reader is referred to the contribution of R. Rummel in this issue). In addition, see Rummel and van Gelderen (1992) and Rummel (1997). Our context initiates another approach to tensor spherical harmonics. Based on cartesian operators (see Freeden and Schreiner 2009), the construction principle starts from operators $\tilde{o}_n^{(i,k)}$, $i, k \in \{1, 2, 3\}$, given by
\[ \tilde{o}_n^{(1,1)} F(x) = \left((2n+3)x - |x|^2 \nabla_x\right) \otimes \left((2n+1)x - |x|^2 \nabla_x\right) F(x), \tag{129} \]
\[ \tilde{o}_n^{(1,2)} F(x) = \left((2n-1)x - |x|^2 \nabla_x\right) \otimes \nabla_x F(x), \tag{130} \]
\[ \tilde{o}_n^{(1,3)} F(x) = \left((2n+1)x - |x|^2 \nabla_x\right) \otimes (x \wedge \nabla_x) F(x), \tag{131} \]
\[ \tilde{o}_n^{(2,1)} F(x) = \nabla_x \otimes \left((2n+1)x - |x|^2 \nabla_x\right) F(x), \tag{132} \]
\[ \tilde{o}_n^{(2,2)} F(x) = \nabla_x \otimes \nabla_x F(x), \tag{133} \]
\[ \tilde{o}_n^{(2,3)} F(x) = \nabla_x \otimes (x \wedge \nabla_x) F(x), \tag{134} \]
\[ \tilde{o}_n^{(3,1)} F(x) = (x \wedge \nabla_x) \otimes \left((2n+1)x - |x|^2 \nabla_x\right) F(x), \tag{135} \]
\[ \tilde{o}_n^{(3,2)} F(x) = (x \wedge \nabla_x) \otimes \nabla_x F(x), \tag{136} \]
\[ \tilde{o}_n^{(3,3)} F(x) = (x \wedge \nabla_x) \otimes (x \wedge \nabla_x) F(x) \tag{137} \]
for $x \in \mathbb{R}^3$ and sufficiently smooth functions $F : \mathbb{R}^3 \to \mathbb{R}$. Elementary calculations in cartesian coordinates lead us in a straightforward way to the following result.
Lemma 2. Let $H_n$, $n \in \mathbb{N}_0$, be a homogeneous harmonic polynomial of degree $n$. Then, $\tilde{o}_n^{(i,k)} H_n$ is a homogeneous harmonic tensor polynomial of degree $\deg^{(i,k)}(n)$, where […]
\[ \Delta T = 0 \quad \text{for } u > b_0, \tag{2} \]
\[ T = f \quad \text{for } u = b_0, \tag{3} \]
\[ T(u) = O\!\left(\frac{1}{u}\right) \quad \text{for } u \to \infty, \tag{4} \]
where $f$ is assumed to be a known square-integrable function, $f \in L^2$. The asymptotic condition in Eq. (4) means that the harmonic function $T$ approaches zero at infinity. The solution of the Laplace equation (2) can be written in terms of ellipsoidal harmonics as follows (Heiskanen and Moritz 1967; Moon and Spencer 1961; Thong and Grafarend 1989):
\[ T(u, \Omega) = \sum_{j=0}^{\infty} \sum_{m=-j}^{j} T_{jm}\, \frac{Q_{jm}(i\,u/E)}{Q_{jm}(i\,b_0/E)}\, Y_{jm}(\Omega), \tag{5} \]
where $Q_{jm}(i\,u/E)$ are the Legendre functions of the second kind, $Y_{jm}(\Omega)$ are complex spherical harmonics of degree $j$ and order $m$, and $T_{jm}$ are coefficients to be determined from the boundary condition of Eq. (3). Substituting Eq. (5) into Eq. (3), and expanding $f(\Omega)$ in a series of spherical harmonics,
\[ f(\Omega) = \sum_{j=0}^{\infty} \sum_{m=-j}^{j} \int_{\Omega_0} f(\Omega')\, Y_{jm}^{*}(\Omega')\, d\Omega'\; Y_{jm}(\Omega), \tag{6} \]
E.W. Grafarend et al.
where $\Omega_0$ is the full solid angle and $d\Omega = \sin\vartheta\, d\vartheta\, d\lambda$, and comparing the coefficients at spherical harmonics $Y_{jm}(\Omega)$ in the result, one gets
\[ T_{jm} = \int_{\Omega_0} f(\Omega')\, Y_{jm}^{*}(\Omega')\, d\Omega' \tag{7} \]
for $j = 0, 1, \dots$, and $m = -j, \dots, j$, where $Y_{jm}^{*}$ is the complex conjugate of the spherical harmonic $Y_{jm}$. Furthermore, substituting coefficients $T_{jm}$ into Eq. (5) and interchanging the order of summation over $j$ and $m$ and integration over $\Omega'$ due to the uniform convergence of the series expansion given by Eq. (5), the solution to the ellipsoidal Dirichlet boundary-value problem, Eqs. (2)–(4), formally reads
\[ T(u, \Omega) = \int_{\Omega_0} f(\Omega') \sum_{j=0}^{\infty} \sum_{m=-j}^{j} \frac{Q_{jm}(i\,u/E)}{Q_{jm}(i\,b_0/E)}\, Y_{jm}^{*}(\Omega')\, Y_{jm}(\Omega)\, d\Omega'. \tag{8} \]
From a practical point of view, the spectral form of Eq. (8) of the solution to the Dirichlet problem given by Eqs. (2)–(4) may often become inconvenient, since the construction of the functions $Q_{jm}(z)$ and their summation up to high degrees and orders ($j \approx 10^4$ to $10^5$) is time consuming and numerically unstable (Sona 1995). Moreover, in the case that the level ellipsoid $u = b_0$ deviates from a sphere by only a tiny amount, which is the case for the Earth, the solution of the problem should be close to the solution to the same problem but formulated on a sphere. One should thus attempt to rewrite $T(u, \Omega)$ as a sum of the well-known Poisson integral (Kellogg 1929, Sect. IX.4), which solves the Dirichlet problem on a sphere, plus the corrections due to the ellipticity of the boundary. An evident advantage of such a decomposition is that existing theories as well as numerical codes for solving the Dirichlet problem on a sphere can simply be corrected for the ellipticity of the boundary.
2.2 Power-Series Representation of the Integral Kernel
Thong (1993) and Martinec and Grafarend (1997) showed that the Legendre function of the second kind, $Q_{jm}(i\,u/E)$, can be developed in an infinite power series of the first eccentricity $e$:
\[ Q_{jm}\!\left(i\,\frac{u}{E}\right) = (-1)^{m-(j+1)/2}\, \frac{(j+m)!}{(2j+1)!!}\, e^{j+1} \sum_{k=0}^{\infty} a_{jmk}\, e^{2k}, \tag{9} \]
where coefficients $a_{jmk}$ can, for instance, be defined by the recurrence relation as
\[ a_{jm0} = 1, \tag{10} \]
Spacetime Modeling of the Earth’s Gravity Field by Ellipsoidal Harmonics
\[ a_{jmk} = \frac{(j + 2k - 1)^2 - m^2}{2k\,(2j + 2k + 1)}\, a_{jm,k-1} \quad \text{for } k \ge 1. \tag{11} \]
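The recurrence of Eqs. (10) and (11) translates directly into code. The following minimal sketch uses exact rational arithmetic for the coefficients; the function names and the sample values of $j$, $m$, and $e$ are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def a_coeffs(j, m, kmax):
    """Return [a_jm0, ..., a_jm,kmax] from Eqs. (10) and (11) as exact fractions."""
    a = [Fraction(1)]                                   # Eq. (10)
    for k in range(1, kmax + 1):
        factor = Fraction((j + 2 * k - 1) ** 2 - m ** 2,
                          2 * k * (2 * j + 2 * k + 1))
        a.append(factor * a[-1])                        # Eq. (11)
    return a

def series_sum(j, m, e, kmax=200):
    """Partial sum of sum_k a_jmk e^(2k), the series factor in Eq. (9)."""
    total, power = 0.0, 1.0
    for a_k in a_coeffs(j, m, kmax):
        total += float(a_k) * power
        power *= e * e
    return total

print(a_coeffs(2, 0, 3))          # first coefficients for j = 2, m = 0
print(series_sum(2, 0, 0.0818))   # series factor for an Earth-like eccentricity
```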
Throughout this chapter, it is assumed that the eccentricity $e_0$ of the reference ellipsoid $u = b_0$,
\[ e_0 = \frac{E}{\sqrt{b_0^2 + E^2}}, \tag{12} \]
is less than 1. Then, for points $(u, \Omega)$ being outside or on the reference ellipsoid, i.e., when $u \ge b_0$, the series in Eq. (9) is convergent. By Eq. (9), the ratio of the Legendre functions of the second kind in Eq. (8) reads
\[ \frac{Q_{jm}(i\,u/E)}{Q_{jm}(i\,b_0/E)} = \left(\frac{e}{e_0}\right)^{j+1} \frac{\sum_{k=0}^{\infty} a_{jmk}\, e^{2k}}{\sum_{k=0}^{\infty} a_{jmk}\, e_0^{2k}}. \tag{13} \]
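Dividing the power series in Eq. (13) term by term, one can write
\[ \frac{\sum_{k=0}^{\infty} a_{jmk}\, e^{2k}}{\sum_{k=0}^{\infty} a_{jmk}\, e_0^{2k}} = 1 + \sum_{k=1}^{\infty} b_{jmk}, \tag{14} \]
where the explicit forms of the first few constituents read
\[ b_{jm1} = a_{jm1}\,(e^2 - e_0^2), \tag{15} \]
\[ b_{jm2} = a_{jm2}\,(e^4 - e_0^4) - a_{jm1}^2\, e_0^2\,(e^2 - e_0^2), \tag{16} \]
\[ b_{jm3} = a_{jm3}\,(e^6 - e_0^6) - a_{jm2}\, a_{jm1}\, e_0^2\,(e^4 + e^2 e_0^2 - 2 e_0^4) + a_{jm1}^3\, e_0^4\,(e^2 - e_0^2). \tag{17} \]
Generally,
\[ b_{jmk} = O\!\left(e^{2r} e_0^{2s}\right), \quad r + s = k. \tag{18} \]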
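As a plausibility check of Eqs. (14)–(17), the quotient of the two series can be compared numerically with $1 + b_{jm1} + b_{jm2} + b_{jm3}$. The sketch below is illustrative only; the degree, order, and the two eccentricities are arbitrary test values (assumptions), chosen with $e < e_0$ as for a point outside the reference ellipsoid.

```python
def a_coeffs(j, m, kmax):
    """a_jm0 .. a_jm,kmax from the recurrence of Eqs. (10)-(11), as floats."""
    a = [1.0]
    for k in range(1, kmax + 1):
        a.append(((j + 2 * k - 1) ** 2 - m ** 2)
                 / (2 * k * (2 * j + 2 * k + 1)) * a[-1])
    return a

def series_ratio(j, m, e, e0, kmax=80):
    """Left-hand side of Eq. (14): sum a e^(2k) / sum a e0^(2k), truncated at kmax."""
    a = a_coeffs(j, m, kmax)
    num = sum(a_k * e ** (2 * k) for k, a_k in enumerate(a))
    den = sum(a_k * e0 ** (2 * k) for k, a_k in enumerate(a))
    return num / den

def b_terms(j, m, e, e0):
    """b_jm1, b_jm2, b_jm3 as given in Eqs. (15)-(17)."""
    _, a1, a2, a3 = a_coeffs(j, m, 3)
    x, y = e * e, e0 * e0
    b1 = a1 * (x - y)
    b2 = a2 * (x**2 - y**2) - a1**2 * y * (x - y)
    b3 = (a3 * (x**3 - y**3)
          - a2 * a1 * y * (x**2 + x * y - 2 * y**2)
          + a1**3 * y**2 * (x - y))
    return b1, b2, b3

# Arbitrary test values (assumed for illustration)
j, m, e0, e = 2, 0, 0.2, 0.19
lhs = series_ratio(j, m, e, e0) - 1.0
b1, b2, b3 = b_terms(j, m, e, e0)
print(lhs, b1, b1 + b2, b1 + b2 + b3)
```

Each additional constituent brings the partial sum closer to the exact quotient, in line with the order estimate (18).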
To get an analytical expression for the $k$th term of the series in Eq. (14), some cumbersome algebraic manipulations have to be performed. To avoid them, we confine ourselves to the case where the computation point ranges in a limited layer above the reference ellipsoid (for instance, the topographical layer), namely, $b_0 < u < b_0 + 9{,}000$ m, which includes all the actual topographical masses of the Earth. For this restricted case, which is, however, often considered when geodetic boundary-value problems are solved, express the first eccentricity $e$ of the computation point by means of the first eccentricity $e_0$ of the reference ellipsoid and a quantity $\varepsilon$, $\varepsilon > 0$, as
\[ e = e_0\,(1 - \varepsilon). \tag{19} \]
Assuming $b_0 < u < b_0 + 9{,}000$ m means that $\varepsilon < 1.4 \times 10^{-3}$, and one can put approximately
\[ e^{2k} \doteq e_0^{2k}\,(1 - 2k\varepsilon), \tag{20} \]
\[ b_{jm1} \doteq -2\varepsilon\, e_0^2\, a_{jm1}. \tag{21} \]
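This allows one to write
\[ \frac{\sum_{k=0}^{\infty} a_{jmk}\, e^{2k}}{\sum_{k=0}^{\infty} a_{jmk}\, e_0^{2k}} = 1 - 2\varepsilon\, \frac{\sum_{k=1}^{\infty} k\, a_{jmk}\, e_0^{2k}}{\sum_{k=0}^{\infty} a_{jmk}\, e_0^{2k}}. \tag{22} \]
Expand the fraction on the right-hand side of the last equation (divided by $a_{jm1}$) into an infinite power series in $e_0^2$:
\[ \frac{1}{a_{jm1}}\, \frac{\sum_{k=1}^{\infty} k\, a_{jmk}\, e_0^{2k}}{\sum_{k=0}^{\infty} a_{jmk}\, e_0^{2k}} = \sum_{k=1}^{\infty} \beta_{jmk}\, e_0^{2k}. \tag{23} \]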
To find the coefficients $\beta_{jmk}$, rewrite the last equation in the form
\[ \frac{1}{a_{jm1}} \sum_{k=1}^{\infty} k\, a_{jmk}\, e_0^{2k} = \sum_{k=0}^{\infty} a_{jmk}\, e_0^{2k} \sum_{l=1}^{\infty} \beta_{jml}\, e_0^{2l}. \tag{24} \]
Since both the series on the right-hand side are absolutely convergent, their product may be rearranged as
\[ \sum_{k=0}^{\infty} a_{jmk}\, e_0^{2k} \sum_{l=1}^{\infty} \beta_{jml}\, e_0^{2l} = \sum_{k=1}^{\infty} \sum_{l=1}^{k} \beta_{jml}\, a_{jm,k-l}\, e_0^{2k}. \tag{25} \]
Substituting Eq. (25) into Eq. (24) and equating coefficients at $e_0^{2k}$ on both sides of Eq. (24), one obtains
\[ k\, \frac{a_{jmk}}{a_{jm1}} = \sum_{l=1}^{k} \beta_{jml}\, a_{jm,k-l}, \quad k = 1, 2, \dots, \tag{26} \]
which yields the recurrence relation for $\beta_{jmk}$:
\[ \beta_{jmk} = k\, \frac{a_{jmk}}{a_{jm1}} - \sum_{l=1}^{k-1} \beta_{jml}\, a_{jm,k-l}, \quad k = 2, 3, \dots, \tag{27} \]
with the starting value
\[ \beta_{jm1} = 1. \tag{28} \]
With the recurrence relation in Eq. (27), one may easily construct the higher coefficients $\beta_{jmk}$:
\[ \beta_{jm2} = 2\, \frac{a_{jm2}}{a_{jm1}} - a_{jm1}, \tag{29} \]
\[ \beta_{jm3} = 3\, \frac{a_{jm3}}{a_{jm1}} - 3\, a_{jm2} + a_{jm1}^2. \tag{30} \]
This process may be continued indefinitely. Important properties of the coefficients $\beta_{jmk}$ are
\[ 1 = \beta_{jm1} > \beta_{jm2} > \beta_{jm3} > \dots > 0, \tag{31} \]
\[ \beta_{jjk} > \beta_{j,j-1,k} > \dots > \beta_{j0k}. \tag{32} \]
Figure 1 demonstrates these properties for $j = 30$.
Fig. 1 Coefficients $\beta_{jmk}$ for $j = 30$, $m = 0, 20, 30$, and $k = 1, \dots, 100$
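The recurrence (27) with the starting value (28) can be evaluated exactly with rational arithmetic, which avoids the cancellation that floating-point numbers would suffer for large $j$. The following sketch is an illustration under that design choice; the function names are hypothetical, and only the first few $k$ are printed for the degree and orders used in Fig. 1.

```python
from fractions import Fraction

def a_coeffs(j, m, kmax):
    """a_jm0 .. a_jm,kmax from Eqs. (10)-(11), as exact fractions."""
    a = [Fraction(1)]
    for k in range(1, kmax + 1):
        a.append(Fraction((j + 2 * k - 1) ** 2 - m ** 2,
                          2 * k * (2 * j + 2 * k + 1)) * a[-1])
    return a

def beta_coeffs(j, m, kmax):
    """Return a list beta with beta[k] = beta_jmk for k = 1..kmax (beta[0] is unused)."""
    a = a_coeffs(j, m, kmax)
    beta = [None, Fraction(1)]                          # Eq. (28)
    for k in range(2, kmax + 1):
        conv = sum(beta[l] * a[k - l] for l in range(1, k))
        beta.append(k * a[k] / a[1] - conv)             # Eq. (27)
    return beta

for m in (0, 20, 30):
    beta = beta_coeffs(30, m, 5)
    print(m, [round(float(b), 4) for b in beta[1:]])
```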
Finally, substituting the series Eq. (23) into Eq. (22) and using Eqs. (21) and (28) for $b_{jm1}$ and $\beta_{jm1}$, respectively, yields
\[ \frac{\sum_{k=0}^{\infty} a_{jmk}\, e^{2k}}{\sum_{k=0}^{\infty} a_{jmk}\, e_0^{2k}} = 1 + b_{jm1} \left( 1 + \sum_{k=2}^{\infty} \beta_{jmk}\, e_0^{2k-2} \right). \tag{33} \]
2.3 The Approximation of $O(e_0^2)$
The harmonic upward or downward continuation of the potential or the gravitation between the Geoid and the Earth's surface is an example of the practical application of the boundary-value problem Eqs. (2)–(4) (Engels et al. 1993; Martinec 1996; Vaníček et al. 1996). To make the theory as simple as possible but still matching the requirements on Geoid height accuracy, one should keep throughout the following derivations the terms of the order of the squared first eccentricity $e_0^2$ of the Earth's level ellipsoid and neglect the terms of higher powers of $e_0^2$. This approximation is justifiable because the error introduced by this approximation is at most $1.5 \times 10^{-5}$, which then causes an error of at most 2 mm in the Geoidal heights. Keeping in mind the inequalities in Eq. (31), and assuming that $e_0^2 \ll 1$, the magnitude of the last term in Eq. (33) is of the order of $e_0^2$,
\[ \sum_{k=2}^{\infty} \beta_{jmk}\, e_0^{2k-2} \approx e_0^2. \tag{34} \]
Hence, Eq. (13) becomes
\[ \frac{Q_{jm}(i\,u/E)}{Q_{jm}(i\,b_0/E)} = \left(\frac{e}{e_0}\right)^{j+1} \left[ 1 + a_{jm1}\,(e^2 - e_0^2)\left(1 + O(e_0^2)\right) \right]. \tag{35} \]
Substituting Eq. (35) into Eq. (8), evaluating $a_{jm1}$ according to Eq. (11) for $k = 1$, and bearing in mind the classical Laplace addition theorem for spherical harmonics (for instance, Varshalovich et al. 1989, p. 164),
\[ P_j(\cos\psi) = \frac{4\pi}{2j+1} \sum_{m=-j}^{j} Y_{jm}(\Omega)\, Y_{jm}^{*}(\Omega'), \tag{36} \]
where $P_j(\cos\psi)$ is the Legendre polynomial of degree $j$ and $\psi$ is the angular distance between directions $\Omega$ and $\Omega'$, one gets
\[ T(u, \Omega) = \frac{1}{4\pi} \int_{\Omega_0} f(\Omega') \left[ K^{sph}(t, \cos\psi) - e_0^2\, k^{ell}(t, \Omega, \Omega')\left(1 + O(e_0^2)\right) \right] d\Omega', \tag{37} \]
where
\[ t = \frac{e}{e_0}. \tag{38} \]
$K^{sph}(t, \cos\psi)$ is the spherical Poisson kernel (Heiskanen and Moritz 1967; Kellogg 1929; Pick et al. 1973),
\[ K^{sph}(t, \cos\psi) = \sum_{j=0}^{\infty} (2j+1)\, t^{j+1}\, P_j(\cos\psi) = \frac{t\,(1 - t^2)}{g^3} \tag{39} \]
with
\[ g \equiv g(t, \cos\psi) = \sqrt{1 + t^2 - 2t\cos\psi}, \tag{40} \]
and $k^{ell}(t, \Omega, \Omega')$ stands for
\[ k^{ell}(t, \Omega, \Omega') = 4\pi\,(1 - t^2) \sum_{j=0}^{\infty} \sum_{m=-j}^{j} t^{j+1}\, \frac{(j+1)^2 - m^2}{2\,(2j+3)}\, Y_{jm}(\Omega)\, Y_{jm}^{*}(\Omega'). \tag{41} \]
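The equality of the series and the closed form of the spherical Poisson kernel in Eq. (39) can be checked numerically. The sketch below evaluates the Legendre polynomials by the standard three-term recurrence; the truncation degree and the sample arguments are assumptions made for the illustration.

```python
import math

def poisson_kernel_series(t, cos_psi, jmax=400):
    """Partial sum of sum_j (2j+1) t^(j+1) P_j(cos psi), the series in Eq. (39)."""
    p_prev, p_curr = 1.0, cos_psi        # P_0 and P_1
    total = t * p_prev                   # j = 0 term
    power = t * t                        # t^(j+1) for j = 1
    for j in range(1, jmax + 1):
        total += (2 * j + 1) * power * p_curr
        # Bonnet recurrence: (j+1) P_{j+1} = (2j+1) x P_j - j P_{j-1}
        p_next = ((2 * j + 1) * cos_psi * p_curr - j * p_prev) / (j + 1)
        p_prev, p_curr = p_curr, p_next
        power *= t
    return total

def poisson_kernel_closed(t, cos_psi):
    """Closed form t (1 - t^2) / g^3 with g from Eq. (40)."""
    g = math.sqrt(1.0 + t * t - 2.0 * t * cos_psi)
    return t * (1.0 - t * t) / g**3

t, cos_psi = 0.9, 0.3                    # sample values with t < 1
print(poisson_kernel_series(t, cos_psi), poisson_kernel_closed(t, cos_psi))
```

The closer $t$ is to 1, the more terms the series needs, which reflects the slow convergence of the spectral form near the boundary.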
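Moreover, it holds that ($t < 1$)
\[ |k^{ell}(t, \Omega, \Omega')| \le k^{ell}(t, \Omega, \Omega) \le k^{ell}(t, \Omega, \Omega)\big|_{\vartheta = 0} = \frac{1 - t^2}{2} \sum_{j=0}^{\infty} t^{j+1}\, (j+1)^2\, \frac{2j+1}{2j+3} \]
[…]
\[ \Delta T = 0 \quad \text{for } u > b_0, \tag{80} \]
\[ \frac{\partial T}{\partial u} + \frac{2}{u}\, T = -f \quad \text{for } u = b_0, \tag{81} \]
\[ T(u) = \frac{c}{u} + O\!\left(\frac{1}{u^3}\right), \quad u \to \infty, \tag{82} \]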
where $f$ is assumed to be a known square-integrable function, i.e., $f \in L^2$, and $c$ is a constant. A limiting case $E = 0$, in which the reference ellipsoid of revolution $u = b_0$ reduces to a sphere $x^2 + y^2 + z^2 = b_0^2$ and the boundary-value problem Eqs. (80)–(82) reduces to the spherical Stokes problem (Heiskanen and Moritz 1967, Sect. 2.16), will also be considered. To guarantee the existence of a solution in this particular case, the first-degree harmonics of $f(\Omega)$ have to be removed by means of the postulate (Holota 1995)
\[ \int_{\Omega_0} f(\Omega)\, Y_{1m}^{*}(\Omega)\, d\Omega = 0, \quad m = -1, 0, 1. \tag{83} \]
Here, $Y_{1m}(\Omega)$ are complex spherical harmonics of the first degree and order $m$, the asterisk denotes a complex conjugate, $\Omega_0$ is the full solid angle, and $d\Omega = \sin\vartheta\, d\vartheta\, d\lambda$. Throughout the chapter it is assumed that conditions (83) are satisfied. Moreover, to guarantee the uniqueness of the solution, the first-degree harmonics have to be removed from the potential as the asymptotic condition (82) indicates. The question of the uniqueness of the problem (80)–(82) for a general case $E \ne 0$ will be examined later in Sect. 3.5. The problem (80)–(82) will be called the ellipsoidal Stokes boundary-value problem since it generalizes the traditional Stokes boundary-value problem formulated on a sphere. It approximates the boundary-value problem for Geoid determination (Martinec and Vaníček 1996) or the Molodensky scalar boundary-value problem (Heck 1991, Sect. 6.3) by maintaining the two largest terms in boundary condition (81) and omitting two small ellipsoidal correction terms. (The effect of the ellipsoidal correction terms on the solution of the ellipsoidal Stokes boundary-value problem will be treated in a separate chapter.) Alternatively, the boundary-value problem for Geoid determination may be formulated in geodetic coordinates (Otero 1995), which results in a boundary condition of a form slightly different from Eq. (81). However, ellipsoidal coordinates and formulation (80)–(82) will be used since the Laplace operator is separable in ellipsoidal coordinates (Moon and Spencer 1961), which will substantially help in finding a solution to the problem. The solution of the Laplace equation (80) can be written in terms of ellipsoidal harmonics as follows (Moon and Spencer 1961, p. 32; Heiskanen and Moritz 1967, Sect. 1.20; Thong and Grafarend 1989, p. 302):
\[ T(u, \Omega) = \sum_{j=0}^{\infty} \sum_{m=-j}^{j} T_{jm}\, \frac{Q_{jm}(i\,u/E)}{Q_{jm}(i\,b_0/E)}\, Y_{jm}(\Omega), \tag{84} \]
where $Q_{jm}(i\,u/E)$ are the Legendre functions of the second kind and $T_{jm}$ are coefficients to be determined from boundary condition (81). Equation A.11, derived in Appendix A (see Thong and Grafarend 1989), shows that
\[ Q_{jm}\!\left(i\,\frac{u}{E}\right) = O\!\left(\frac{1}{u^{j+1}}\right), \quad u \to \infty. \tag{85} \]
To satisfy asymptotic condition (82), the term with $j = 1$ must be eliminated from the sum over $j$ in Eq. (84), i.e.,
\[ T(u, \Omega) = \sum_{\substack{j=0 \\ j \ne 1}}^{\infty} \sum_{m=-j}^{j} T_{jm}\, \frac{Q_{jm}(i\,u/E)}{Q_{jm}(i\,b_0/E)}\, Y_{jm}(\Omega). \tag{86} \]
Substituting expansion (86) into boundary condition (81) yields
\[ \sum_{\substack{j=0 \\ j \ne 1}}^{\infty} \sum_{m=-j}^{j} \frac{1}{Q_{jm}(i\,b_0/E)} \left[ \frac{dQ_{jm}(i\,u/E)}{du}\bigg|_{u=b_0} + \frac{2}{b_0}\, Q_{jm}\!\left(i\,\frac{b_0}{E}\right) \right] T_{jm}\, Y_{jm}(\Omega) = -f(\Omega). \tag{87} \]
Moreover, expanding function $f$ in a series of spherical harmonics
\[ f(\Omega) = \sum_{j=0}^{\infty} \sum_{m=-j}^{j} \int_{\Omega_0} f(\Omega')\, Y_{jm}^{*}(\Omega')\, d\Omega'\; Y_{jm}(\Omega), \tag{88} \]
where the first-degree spherical harmonics of $f(\Omega)$ are equal to zero by assumption (83), substituting expansion (88) into Eq. (87), and comparing the coefficients at spherical harmonics $Y_{jm}(\Omega)$ in the result yields
\[ T_{jm} = -\, \frac{\int_{\Omega_0} f(\Omega')\, Y_{jm}^{*}(\Omega')\, d\Omega'}{\dfrac{1}{Q_{jm}(i\,b_0/E)} \left[ \dfrac{dQ_{jm}(i\,u/E)}{du}\bigg|_{u=b_0} + \dfrac{2}{b_0}\, Q_{jm}\!\left(i\,\dfrac{b_0}{E}\right) \right]} \tag{89} \]
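for $j = 0, 2, \dots$, and $m = -j, \dots, j$. Substituting coefficients $T_{jm}$ into Eq. (86) and changing the order of summation over $j$ and $m$ and of integration over $\Omega'$, due to the uniform convergence of the series, one obtains
\[ T(u, \Omega) = \int_{\Omega_0} f(\Omega') \sum_{\substack{j=0 \\ j \ne 1}}^{\infty} \sum_{m=-j}^{j} \alpha_{jm}(u)\, Y_{jm}^{*}(\Omega')\, Y_{jm}(\Omega)\, d\Omega', \tag{90} \]
where functions $\alpha_{jm}(u)$ have been introduced,
\[ \alpha_{jm}(u) = -\, \frac{Q_{jm}(i\,u/E)}{\dfrac{dQ_{jm}(i\,u/E)}{du}\bigg|_{u=b_0} + \dfrac{2}{b_0}\, Q_{jm}\!\left(i\,\dfrac{b_0}{E}\right)}, \tag{91} \]
to abbreviate notations.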
3.2 The Zero-Degree Harmonic of T
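The zero-degree harmonic of potential $T(u, \Omega)$ is to be calculated. One first has
\[ \alpha_{00}(u) = -\, \frac{Q_{00}(i\,u/E)}{\dfrac{dQ_{00}(i\,u/E)}{du}\bigg|_{u=b_0} + \dfrac{2}{b_0}\, Q_{00}\!\left(i\,\dfrac{b_0}{E}\right)}, \tag{92} \]
where $Q_{00}(i\,u/E)$ can be expressed as (Arfken 1968, Eq. 12.222)
\[ Q_{00}\!\left(i\,\frac{u}{E}\right) = -i \arctan\frac{E}{u} \tag{93} \]
with $i = \sqrt{-1}$. Taking the derivative of the last equation with respect to $u$, one gets
\[ \frac{dQ_{00}(i\,u/E)}{du} = \frac{iE}{u^2 + E^2}. \tag{94} \]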
By substituting Eqs. (93) and (94) into Eq. (92),
\[ \alpha_{00}(u) = \frac{\arctan\dfrac{E}{u}}{\dfrac{E}{b_0^2 + E^2} - \dfrac{2}{b_0} \arctan\dfrac{E}{b_0}}. \tag{95} \]
Since $\arctan x \doteq x$ for $x \ll 1$, one can see that
\[ \alpha_{00}(u) \doteq \frac{c}{u} \quad \text{for } u \to \infty \tag{96} \]
with
\[ c = \frac{E}{\dfrac{E}{b_0^2 + E^2} - \dfrac{2}{b_0} \arctan\dfrac{E}{b_0}}. \tag{97} \]
3.3 Solution on the Reference Ellipsoid of Revolution
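Solution (90) can now be expressed in the way indicated above as
\[ T(u, \Omega) = \frac{\alpha_{00}(u)}{4\pi} \int_{\Omega_0} f(\Omega')\, d\Omega' + \int_{\Omega_0} f(\Omega') \sum_{j=2}^{\infty} \sum_{m=-j}^{j} \alpha_{jm}(u)\, Y_{jm}^{*}(\Omega')\, Y_{jm}(\Omega)\, d\Omega'. \tag{98} \]
In particular, the interest is in finding potential $T(u, \Omega)$ on the reference ellipsoid of revolution $u = b_0$, i.e., function $T(b_0, \Omega)$. In this case, the general formulae (91) and (98) reduce to
\[ T(b_0, \Omega) = \frac{\alpha_{00}(b_0)}{4\pi} \int_{\Omega_0} f(\Omega')\, d\Omega' + \int_{\Omega_0} f(\Omega') \sum_{j=2}^{\infty} \sum_{m=-j}^{j} \alpha_{jm}(b_0)\, Y_{jm}^{*}(\Omega')\, Y_{jm}(\Omega)\, d\Omega', \tag{99} \]
where $\alpha_{00}(b_0)$ is given by Eq. (95) for $u = b_0$ and the other $\alpha_{jm}(b_0)$ read
\[ \alpha_{jm}(b_0) = -\, \frac{Q_{jm}(i\,b_0/E)}{\dfrac{dQ_{jm}(i\,u/E)}{du}\bigg|_{u=b_0} + \dfrac{2}{b_0}\, Q_{jm}\!\left(i\,\dfrac{b_0}{E}\right)}. \tag{100} \]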
From the practical point of view, the spectral form Eq. (99) of the solution of the Stokes boundary-value problem Eqs. (80)–(82) is often inconvenient, since constructing the spectral components of $f(\Omega)$ and summing them up to high degrees and orders may become time consuming and numerically unstable. Moreover, for the case in which the reference ellipsoid of revolution $u = b_0$ deviates only slightly from a sphere, which is the case of the Earth, the solution to the problem should be close to the solution for the same problem but formulated for a sphere. It is thus attempted to rewrite $T(b_0, \Omega)$ as a sum of the well-known Stokes integral plus the corrections due to the ellipticity of the boundary. An evident advantage of such a decomposition is that existing theories as well as numerical codes for Geoidal height computations can simply be corrected for the ellipticity of the Geoid. To make the theory as simple as possible, but still maintain the requirements for Geoidal height accuracy throughout the following derivations, one shall retain the terms of the order of the squared first eccentricity $e_0^2$ of an ellipsoid of revolution and neglect the terms of higher power of $e_0^2$. This approximation is justifiable, because the error introduced by this approximation is $1.5 \times 10^{-5}$ at the most, which then is expected to cause an error of no more than 2 mm in the Geoid heights.
3.4 The Derivative of the Legendre Function of the Second Kind
We now look for the derivative of the Legendre function $Q_{jm}(i\,u/E)$ with respect to variable $u$. Equation A.11, derived in Appendix A (see Thong and Grafarend 1989), shows that $Q_{jm}(i\,u/E)$ can be expressed as an infinite power series of the first eccentricity $e$,
\[ e = \frac{E}{\sqrt{u^2 + E^2}}, \tag{101} \]
in the form
\[ Q_{jm}\!\left(i\,\frac{u}{E}\right) = (-1)^{m-(j+1)/2}\, \frac{(j+m)!}{(2j+1)!!}\, e^{j+1} \sum_{k=0}^{\infty} \alpha_{jmk}\, e^{2k}, \tag{102} \]
where coefficients $\alpha_{jmk}$ can, for instance, be defined by recurrence relations (A.14). In particular, one shall need
\[ \alpha_{jm0} = 1 \tag{103} \]
and
\[ \alpha_{jm1} = \frac{(j+1)^2 - m^2}{2\,(2j+3)}, \qquad \alpha_{jm2} = \frac{(j+3)^2 - m^2}{4\,(2j+5)}\, \alpha_{jm1}. \tag{104} \]
Throughout this chapter it is assumed that the eccentricity $e_0$ of the reference ellipsoid of revolution $u = b_0$,
\[ e_0 = \frac{E}{\sqrt{b_0^2 + E^2}}, \tag{105} \]
is less than 1. For points $(u, \Omega)$ outside or on the reference ellipsoid of revolution, i.e., when $u \ge b_0$, series Eq. (102) then converges. One can take the derivative of this series with respect to $u$ and change the order of differentiation and summation since the resulting series Eq. (106) is uniformly convergent. Consequently,
\[ \frac{dQ_{jm}(i\,u/E)}{du} = (-1)^{m-(j+1)/2}\, \frac{(j+m)!}{(2j+1)!!} \sum_{k=0}^{\infty} (2k + j + 1)\, \alpha_{jmk}\, e^{2k+j}\, \frac{de}{du}. \tag{106} \]
The derivative of eccentricity $e$ with respect to $u$ can be easily obtained from Eq. (101):
\[ \frac{de}{du} = -(1 - e^2)\, \frac{e}{u}. \tag{107} \]
Substituting Eq. (107) into Eq. (106) yields
\[ \frac{dQ_{jm}(i\,u/E)}{du} = (-1)^{m-(j+1)/2}\, \frac{(j+m)!}{(2j+1)!!}\, (1 - e^2)\, \frac{e^{j+1}}{u} \sum_{k=0}^{\infty} (-2k - j - 1)\, \alpha_{jmk}\, e^{2k}. \tag{108} \]
3.5 The Uniqueness of the Solution
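Now, one is ready to express the sum in the denominator of Eq. (100):
\[ \frac{dQ_{jm}(i\,u/E)}{du}\bigg|_{u=b_0} + \frac{2}{b_0}\, Q_{jm}\!\left(i\,\frac{b_0}{E}\right) = (-1)^{m-(j+1)/2}\, \frac{(j+m)!}{(2j+1)!!}\, \frac{e_0^{j+1}}{b_0} \sum_{k=0}^{\infty} \alpha_{jmk} \left[ (1 - e_0^2)(-2k - j - 1) + 2 \right] e_0^{2k}. \tag{109} \]
Substituting Eqs. (102) and (109) into Eq. (100) yields
\[ \alpha_{jm}(b_0) = b_0\, \frac{\sum_{k=0}^{\infty} \alpha_{jmk}\, e_0^{2k}}{\sum_{k=0}^{\infty} \alpha_{jmk} \left[ (1 - e_0^2)(2k + j + 1) - 2 \right] e_0^{2k}}. \tag{110} \]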
The last formula enables one to examine the uniqueness of the solution of boundary-value problem Eqs. (80)–(82). Let the denominator of expression Eq. (110) for $\alpha_{jm}$ be denoted by $d_{jm}(e_0^2)$,
\[ d_{jm}(e_0^2) = \sum_{k=0}^{\infty} \alpha_{jmk} \left[ (1 - e_0^2)(2k + j + 1) - 2 \right] e_0^{2k}, \tag{111} \]
$j = 2, \dots$; $m = 0, 1, \dots, j$, and investigate the dependence of $d_{jm}(e_0^2)$ on $e_0^2$. Numerical examinations have resulted in Fig. 5 showing the behavior of $d_{jm}(e_0^2)$ for three values of $e_0^2$ as functions of angular degree $j$ and order $m$, $j = 2, \dots, 8$; $m = 0, 2, \dots, j$; the combined index is $jm := j(j+1)/2 + m + 1$. If $e_0^2 < 0.42303$, then all coefficients $d_{jm}(e_0^2)$ are positive, which means that the solution of boundary-value problem Eqs. (80)–(82) is unique. When $e_0^2 \doteq 0.42303$, $d_{22}(e_0^2) = 0$, and the solution of the problem is not unique. If the size of $e_0^2$ is increased further, see, for instance, the curve denoted by triangles in Fig. 5 for $e_0^2 = 0.7$, other $d_{jm}(e_0^2)$ may then vanish and problem Eqs. (80)–(82) has a nonunique solution. One can conclude that the uniqueness of the solution can only be guaranteed if the square of the first eccentricity is less than 0.42303. Fortunately, this condition is satisfied for the Earth since $e_0^2 = 0.006\,694\,380$ (Moritz 1980) for the ellipsoid best fitting the Earth's figure (the corresponding $d_{jm}(e_0^2)$ are plotted in Fig. 5 as black dots).
Fig. 5 Functions $d_{jm}(e_0^2)$ for various $e_0^2$ as functions of degree $j$ and order $m$; the combined index is $jm = j(j+1)/2 + m + 1$
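The behavior of $d_{jm}(e_0^2)$ described above can be reproduced with a short numerical experiment. The sketch below is an illustration, not the authors' code: it assumes that the coefficients $\alpha_{jmk}$ follow the same recurrence as in Eq. (11), consistent with Eq. (104), truncates the series of Eq. (111) at a fixed number of terms, and locates the zero of $d_{22}$ by bisection on an assumed bracket.

```python
def alpha_coeffs(j, m, kmax):
    """alpha_jm0 .. alpha_jm,kmax via the recurrence of Eq. (11) (assumed to agree with A.14)."""
    a = [1.0]
    for k in range(1, kmax + 1):
        a.append(((j + 2 * k - 1) ** 2 - m ** 2)
                 / (2 * k * (2 * j + 2 * k + 1)) * a[-1])
    return a

def d_jm(j, m, e0_sq, kmax=400):
    """Truncated series of Eq. (111) evaluated at e0^2 = e0_sq (requires e0_sq < 1)."""
    total, power = 0.0, 1.0
    for k, a_k in enumerate(alpha_coeffs(j, m, kmax)):
        total += a_k * ((1.0 - e0_sq) * (2 * k + j + 1) - 2.0) * power
        power *= e0_sq
    return total

def first_zero(j, m, lo=0.3, hi=0.5, tol=1e-10):
    """Bisection for a zero of d_jm in [lo, hi]; assumes exactly one sign change there."""
    f_lo = d_jm(j, m, lo)
    if f_lo * d_jm(j, m, hi) > 0:
        raise ValueError("no sign change in the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = d_jm(j, m, mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

print("zero of d_22 at e0^2 =", first_zero(2, 2))
print("d_22 for the Earth   =", d_jm(2, 2, 0.006694380))
```

In the spherical limit $e_0^2 = 0$ the series reduces to $d_{jm} = j - 1$, which recovers the factor $b_0/(j-1)$ in Eq. (110).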
3.6 The Approximation up to $O(e_0^2)$
Arrange formula Eq. (110) into a form that is more suitable for highlighting the Stokes contribution. After some cumbersome but straightforward algebra, one arrives at
\[ \alpha_{jm}(b_0) = \frac{b_0}{j-1} \left[ 1 - \frac{\sum_{k=1}^{\infty} \left[ \dfrac{2k}{j-1}\, \alpha_{jmk} - \left(1 + \dfrac{2k}{j-1}\right) \alpha_{jm,k-1} \right] e_0^{2k}}{1 + \sum_{k=1}^{\infty} \left(1 + \dfrac{2k}{j-1}\right) \left(\alpha_{jmk} - \alpha_{jm,k-1}\right) e_0^{2k}} \right], \tag{112} \]
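where $\alpha_{jm0}$ has been substituted from Eq. (103). The ratio of the two power series in Eq. (112) can further be expanded as a power series in $e_0^2$. The explicit forms of the first two terms of such a series are as follows:
\[ \frac{\sum_{k=1}^{\infty} \left[ \dfrac{2k}{j-1}\, \alpha_{jmk} - \left(1 + \dfrac{2k}{j-1}\right) \alpha_{jm,k-1} \right] e_0^{2k}}{1 + \sum_{k=1}^{\infty} \left(1 + \dfrac{2k}{j-1}\right) \left(\alpha_{jmk} - \alpha_{jm,k-1}\right) e_0^{2k}} = d_{jm1}\, e_0^2 + d_{jm2}\, e_0^4 + R, \tag{113} \]
where
\[ d_{jm1} = \frac{2}{j-1}\, \alpha_{jm1} - \frac{j+1}{j-1}, \tag{114} \]
\[ d_{jm2} = \frac{2}{(j-1)^2} \left[ 2(j-1)\, \alpha_{jm2} - (j+1)\, \alpha_{jm1}^2 + (j+3)\, \alpha_{jm1} \right] - \left(\frac{j+1}{j-1}\right)^2 \tag{115} \]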
and $R$ is the residual of the series expansion. It shall now be attempted to estimate the sizes of particular terms on the right-hand side of Eq. (113). Substituting for $\alpha_{jm1}$ and $\alpha_{jm2}$ from Eq. (104) into Eqs. (114) and (115) and after some more algebra, one gets
\[ d_{jm1} = -\, \frac{j^2 + 3j + 2 + m^2}{(j-1)(2j+3)} \tag{116} \]
and
\[ d_{jm2} = \frac{10j^3 + 56j^2 + 100j + 58 + 2m^2(3j+4)}{(j-1)^2 (2j+3)(2j+5)}\, \alpha_{jm1} - \left(\frac{j+1}{j-1}\right)^2. \tag{117} \]
Hence,
\[ |d_{jm1}| \le \frac{2j^2 + 3j + 2}{(j-1)(2j+3)} = 1 + \frac{1}{j-1} + \frac{2}{(j-1)(2j+3)} \le \frac{16}{7} \tag{118} \]
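The step from Eq. (114) to Eq. (116) and the bound (118) can be confirmed with exact rational arithmetic. The following sketch is only a check of the algebra; the function names are illustrative, and the closed form is taken with the negative sign that follows from inserting $\alpha_{jm1}$ of Eq. (104) into Eq. (114).

```python
from fractions import Fraction

def alpha1(j, m):
    """alpha_jm1 from Eq. (104)."""
    return Fraction((j + 1) ** 2 - m ** 2, 2 * (2 * j + 3))

def d1_from_definition(j, m):
    """d_jm1 as defined in Eq. (114)."""
    return Fraction(2, j - 1) * alpha1(j, m) - Fraction(j + 1, j - 1)

def d1_closed_form(j, m):
    """d_jm1 in the closed form of Eq. (116)."""
    return -Fraction(j * j + 3 * j + 2 + m * m, (j - 1) * (2 * j + 3))

worst = max(abs(d1_closed_form(j, m))
            for j in range(2, 200) for m in range(0, j + 1))
print("largest |d_jm1| found:", worst)   # attained at j = m = 2
```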
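for $j \ge 2$ and $|m| \le j$. One continues by estimating the maximum size of the first constituent of the coefficients $d_{jm2}$ for $j \ge 2$ and $|m| \le j$: […]
\[ […] \left[ \frac{1}{2\varepsilon^2} \left( \varepsilon^2 - (X^2 + Y^2 + Z^2) + \sqrt{(X^2 + Y^2 + Z^2 - \varepsilon^2)^2 + 4\varepsilon^2 Z^2} \right) \right]^{1/2}, \tag{220} \]
\[ u = \left[ \frac{1}{2} \left( X^2 + Y^2 + Z^2 - \varepsilon^2 + \sqrt{(X^2 + Y^2 + Z^2 - \varepsilon^2)^2 + 4\varepsilon^2 Z^2} \right) \right]^{1/2}. \tag{221, 222} \]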
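(iii) Jacobi matrix
\[ J = \begin{bmatrix} D_\lambda X & D_\varphi X & D_u X \\ D_\lambda Y & D_\varphi Y & D_u Y \\ D_\lambda Z & D_\varphi Z & D_u Z \end{bmatrix}, \tag{223} \]
\[ J = \begin{bmatrix} -\sqrt{u^2 + \varepsilon^2}\, \cos\varphi \sin\lambda & -\sqrt{u^2 + \varepsilon^2}\, \sin\varphi \cos\lambda & \dfrac{u}{\sqrt{u^2 + \varepsilon^2}}\, \cos\varphi \cos\lambda \\ \sqrt{u^2 + \varepsilon^2}\, \cos\varphi \cos\lambda & -\sqrt{u^2 + \varepsilon^2}\, \sin\varphi \sin\lambda & \dfrac{u}{\sqrt{u^2 + \varepsilon^2}}\, \cos\varphi \sin\lambda \\ 0 & u \cos\varphi & \sin\varphi \end{bmatrix}. \tag{224} \]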
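(iv) Metric
\[ dS^2 = [\, d\lambda \;\; d\varphi \;\; du \,]\; J^T J \begin{bmatrix} d\lambda \\ d\varphi \\ du \end{bmatrix}, \tag{225} \]
\[ G = J^T J, \tag{226} \]
\[ G = \begin{bmatrix} (u^2 + \varepsilon^2) \cos^2\varphi & 0 & 0 \\ 0 & u^2 + \varepsilon^2 \sin^2\varphi & 0 \\ 0 & 0 & (u^2 + \varepsilon^2 \sin^2\varphi)/(u^2 + \varepsilon^2) \end{bmatrix}, \tag{227} \]
\[ dS^2 = (u^2 + \varepsilon^2) \cos^2\varphi\, d\lambda^2 + (u^2 + \varepsilon^2 \sin^2\varphi)\, d\varphi^2 + \frac{u^2 + \varepsilon^2 \sin^2\varphi}{u^2 + \varepsilon^2}\, du^2. \tag{228} \]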
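That $G = J^T J$ is diagonal with the entries of Eq. (227) can be checked numerically. The sketch below builds the Jacobi matrix from Eq. (224) and compares $J^T J$ with the closed form; the coordinate values are dimensionless test values chosen for the illustration, not geodetic data.

```python
import math

def jacobian(lam, phi, u, eps):
    """Jacobi matrix of Eq. (224) as a list of rows (derivatives of X, Y, Z)."""
    s = math.sqrt(u * u + eps * eps)
    cl, sl = math.cos(lam), math.sin(lam)
    cp, sp = math.cos(phi), math.sin(phi)
    return [
        [-s * cp * sl, -s * sp * cl, u / s * cp * cl],
        [s * cp * cl, -s * sp * sl, u / s * cp * sl],
        [0.0, u * cp, sp],
    ]

def metric(lam, phi, u, eps):
    """G = J^T J, Eq. (226)."""
    J = jacobian(lam, phi, u, eps)
    return [[sum(J[r][a] * J[r][b] for r in range(3)) for b in range(3)]
            for a in range(3)]

def metric_closed_form(phi, u, eps):
    """Diagonal entries (g_lambda_lambda, g_phi_phi, g_uu) of Eq. (227)."""
    s2 = u * u + eps * eps
    w2 = u * u + eps * eps * math.sin(phi) ** 2
    return (s2 * math.cos(phi) ** 2, w2, w2 / s2)

# Dimensionless test values (assumed for illustration)
lam, phi, u, eps = 0.7, 0.4, 2.0, 0.5
G = metric(lam, phi, u, eps)
for row in G:
    print([round(v, 12) for v in row])
print(metric_closed_form(phi, u, eps))
```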
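The derivatives $\{D_\lambda x, D_\varphi x, D_u x\}$ are presented in order to construct, in Eqs. (229) and (230), a local orthonormal frame of reference, namely, $E_\lambda = D_\lambda x / \|D_\lambda x\|$, $E_\varphi = D_\varphi x / \|D_\varphi x\|$, $E_u = D_u x / \|D_u x\|$, in short $\{E_\lambda, E_\varphi, E_u \mid P\}$, also called {ellipsoidal east, ellipsoidal north, ellipsoidal vertical} at $P$. The local orthonormal frame of reference $\{E_\lambda, E_\varphi, E_u \mid P\}$ at $P$ is related to the global orthonormal frame of reference $\{E_1, E_2, E_3 \mid O\}$ at $O$ by the orthonormal matrix $T_1$. Thanks to the transformation Eq. (165), $\{e_1, e_2, e_3\} \to \{e_\lambda, e_\varphi, e_u\}$, the orthonormal matrix $T_2(\lambda_\gamma, \varphi_\gamma)$, Eq. (233), has already been established. Such a relation will again be used to transform the ellipsoidal orthonormal frame of reference $\{E_\lambda, E_\varphi, E_u \mid P\}$ to the reference "east, north, vertical" frame $\{e_\lambda, e_\varphi, e_u \mid P\}$. Indeed, $[E_\lambda, E_\varphi, E_u]^T = T_1 [e_1, e_2, e_3]^T = T_1 T_2^T [e_\lambda, e_\varphi, e_u]^T$ or $[e_\lambda, e_\varphi, e_u]^T = T_2 T_1^T [E_\lambda, E_\varphi, E_u]^T$ is the compound transformation $T$ of Eqs. (231)–(235). As soon as one develops $T$ close to the identity, namely, by means of the decomposition (Eqs. (236)–(239)) as well as of a special case (Eq. (240)) of the binomial series, $(\varepsilon/u) < 1$, one gains the elements of the matrix $T$ (case study $t_{23}$). The matrix $T$ is finally decomposed into the unit matrix $I_3$ and the incremental antisymmetric matrix $\delta A$, Eqs. (248) and (249).
Construction of the local reference frame $\{E_\lambda, E_\varphi, E_u \mid P\}$ from the Jacobi ellipsoidal coordinates of point $P$ on the topography:
\[ E_\lambda = \frac{D_\lambda X}{\|D_\lambda X\|} = \left( E_1 \frac{\partial x_1}{\partial\lambda} + E_2 \frac{\partial x_2}{\partial\lambda} + E_3 \frac{\partial x_3}{\partial\lambda} \right) \left[ \left(\frac{\partial x_1}{\partial\lambda}\right)^2 + \left(\frac{\partial x_2}{\partial\lambda}\right)^2 + \left(\frac{\partial x_3}{\partial\lambda}\right)^2 \right]^{-1/2}, \]
\[ E_\varphi = \frac{D_\varphi X}{\|D_\varphi X\|} = \left( E_1 \frac{\partial x_1}{\partial\varphi} + E_2 \frac{\partial x_2}{\partial\varphi} + E_3 \frac{\partial x_3}{\partial\varphi} \right) \left[ \left(\frac{\partial x_1}{\partial\varphi}\right)^2 + \left(\frac{\partial x_2}{\partial\varphi}\right)^2 + \left(\frac{\partial x_3}{\partial\varphi}\right)^2 \right]^{-1/2}, \]
\[ E_u = \frac{D_u X}{\|D_u X\|} = \left( E_1 \frac{\partial x_1}{\partial u} + E_2 \frac{\partial x_2}{\partial u} + E_3 \frac{\partial x_3}{\partial u} \right) \left[ \left(\frac{\partial x_1}{\partial u}\right)^2 + \left(\frac{\partial x_2}{\partial u}\right)^2 + \left(\frac{\partial x_3}{\partial u}\right)^2 \right]^{-1/2}, \]
\[ \begin{bmatrix} E_\lambda \\ E_\varphi \\ E_u \end{bmatrix} = T_1 \begin{bmatrix} E_1 \\ E_2 \\ E_3 \end{bmatrix}, \tag{229} \]
\[ T_1 = \begin{bmatrix} -\sin\lambda & \cos\lambda & 0 \\ -\dfrac{\sqrt{u^2 + \varepsilon^2}}{\sqrt{u^2 + \varepsilon^2 \sin^2\varphi}}\, \sin\varphi \cos\lambda & -\dfrac{\sqrt{u^2 + \varepsilon^2}}{\sqrt{u^2 + \varepsilon^2 \sin^2\varphi}}\, \sin\varphi \sin\lambda & \dfrac{u}{\sqrt{u^2 + \varepsilon^2 \sin^2\varphi}}\, \cos\varphi \\ \dfrac{u}{\sqrt{u^2 + \varepsilon^2 \sin^2\varphi}}\, \cos\varphi \cos\lambda & \dfrac{u}{\sqrt{u^2 + \varepsilon^2 \sin^2\varphi}}\, \cos\varphi \sin\lambda & \dfrac{\sqrt{u^2 + \varepsilon^2}}{\sqrt{u^2 + \varepsilon^2 \sin^2\varphi}}\, \sin\varphi \end{bmatrix}. \tag{230} \]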
2 3 2 3 3 e ” E E 4 e' 5 D T 2 T T1 4 E' 5 D T 4 E' 5 ” Eu Eu e”
(231)
: T 1 D T 1 .ƒ; '; u/;
(232)
: T 2 D T 2 .” ; '” / D R E ” C ; '” ; 0 ; 2 2
(233)
subject to
T WD T 2 T T1 D T .ƒ; '; uI ” ; '” /; T 1 ; T 2 2 SO.3/;
(234)
Spacetime Modeling of the Earth’s Gravity Field by Ellipsoidal Harmonics
2
p u2 C "2 sin. ” / sin ' p u2 C "2 sin2 ' p u2 C "2 cos. ” / sin '” C p u2 C "2 sin2 '
u sin. ” / cos ' p u2 C "2 sin2 '
435
3
6cos. ” / 7 6 7 6 7 6 7 6 sin. ” / 7 / cos ' sin ' u cos. ” ” 7 6 sin '” C p 6 7 2 6 7 u2 C "2 sin ' 6 7 p 6 7 2 2 6 u C " sin ' cos '” 7 u cos ' cos '” 6 7 p p C C T D6 7: 6 7 u2 C " sin2 ' u2 C " sin2 ' 6 7 p 6 7 6 sin. / u2 C "2 cos. ” / sin ' cos '” 7 / cos ' cos ' u cos. ” ” 6 cos ' p p C C7 6 7 2 2 2 2 2 2 u C " sin ' u C " sin ' 6 7 6 7 p 6 7 6 u2 C "2 sin ' sin '” 7 u cos ' sin '” 4 5 C p Cp u2 C "2 sin2 ' u2 C "2 sin2 ' (235)
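Additive decomposition:
\[ \lambda = \lambda_\gamma + \delta\lambda \iff \lambda_\gamma = \lambda - \delta\lambda, \tag{236} \]
\[ \varphi = \varphi_\gamma + \delta\varphi \iff \varphi_\gamma = \varphi - \delta\varphi, \tag{237} \]
\[ \sin(\varphi - \delta\varphi) \doteq \sin\varphi - \cos\varphi\, \delta\varphi, \tag{238} \]
\[ \cos(\varphi - \delta\varphi) \doteq \cos\varphi + \sin\varphi\, \delta\varphi. \tag{239} \]
Special case of binomial series:
\[ (1 + x)^{-1/2} = 1 - \frac{1}{2} x + \frac{3}{8} x^2 - \frac{15}{48} x^3 + O(4). \tag{240} \]
Condition: $u > \varepsilon = \sqrt{a^2 - b^2}$,
\[ \frac{u}{\sqrt{u^2 + \varepsilon^2 \sin^2\varphi}} = \left( 1 + \frac{\varepsilon^2 \sin^2\varphi}{u^2} \right)^{-1/2} = 1 - \frac{1}{2} \left( \frac{\varepsilon \sin\varphi}{u} \right)^2 + \frac{3}{8} \left( \frac{\varepsilon \sin\varphi}{u} \right)^4 + O(6), \tag{241} \]
\[ \frac{\sqrt{u^2 + \varepsilon^2}}{\sqrt{u^2 + \varepsilon^2 \sin^2\varphi}} = \left( 1 - \frac{\varepsilon^2 \cos^2\varphi}{u^2 + \varepsilon^2} \right)^{-1/2} = 1 + \frac{1}{2}\, \frac{\varepsilon^2 \cos^2\varphi}{u^2 + \varepsilon^2} + \frac{3}{8} \left( \frac{\varepsilon^2 \cos^2\varphi}{u^2 + \varepsilon^2} \right)^2 + O(6). \tag{242} \]
Case study:
\[ t_{23} = \left[ 1 - \frac{\varepsilon^2 \sin^2\varphi}{2u^2} + O(4) \right] \left( -\sin\varphi \cos\varphi + \cos^2\varphi\, \delta\varphi \right) + \left[ 1 + \frac{\varepsilon^2 \cos^2\varphi}{2(u^2 + \varepsilon^2)} + O(4) \right] \left( \sin\varphi \cos\varphi + \sin^2\varphi\, \delta\varphi \right), \tag{243} \]
\[ t_{23} \doteq \delta\varphi + \frac{\varepsilon^2 \sin\varphi \cos\varphi}{2u^2 (u^2 + \varepsilon^2)} \left[ (u^2 + \varepsilon^2) \sin^2\varphi + u^2 \cos^2\varphi - (u^2 + \varepsilon^2) \sin\varphi \cos\varphi\, \delta\varphi + u^2 \sin\varphi \cos\varphi\, \delta\varphi \right], \tag{244} \]
\[ t_{23} \doteq \delta\varphi + \frac{\varepsilon^2 \sin\varphi \cos\varphi}{2u^2} \left[ \sin^2\varphi + \frac{u^2}{u^2 + \varepsilon^2} \cos^2\varphi - \sin\varphi \cos\varphi\, \delta\varphi + \frac{u^2}{u^2 + \varepsilon^2} \sin\varphi \cos\varphi\, \delta\varphi \right], \tag{245} \]
\[ t_{23} \doteq \delta\varphi + \frac{\varepsilon^2}{4u^2} \sin 2\varphi. \tag{246} \]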
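Linearized basis transformation from geometry space to gravity space:
\[ \begin{bmatrix} e_{\lambda_\gamma} \\ e_{\varphi_\gamma} \\ e_\gamma \end{bmatrix} = \begin{bmatrix} 1 & -\sin\varphi\, \delta\lambda & \cos\varphi\, \delta\lambda \\ \sin\varphi\, \delta\lambda & 1 & \delta\varphi + \dfrac{\varepsilon^2}{4u^2} \sin 2\varphi \\ -\cos\varphi\, \delta\lambda & -\delta\varphi - \dfrac{\varepsilon^2}{4u^2} \sin 2\varphi & 1 \end{bmatrix} \begin{bmatrix} E_\lambda \\ E_\varphi \\ E_u \end{bmatrix} \tag{247} \]
subject to
\[ T = I_3 + \delta A, \tag{248} \]
\[ \delta A = \begin{bmatrix} 0 & -\sin\varphi\, \delta\lambda & \cos\varphi\, \delta\lambda \\ \sin\varphi\, \delta\lambda & 0 & \delta\varphi + \dfrac{\varepsilon^2}{4u^2} \sin 2\varphi \\ -\cos\varphi\, \delta\lambda & -\delta\varphi - \dfrac{\varepsilon^2}{4u^2} \sin 2\varphi & 0 \end{bmatrix}. \tag{249} \]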
Based upon the linearized version of the transformation {E ƒ ; E ' ; E u g ! fe ƒ ; e ' ; e u }, Eqs. (231)–(249), one can build up the representation of the incremental gravity vector ı, Eq. (176), in terms of [e ƒ ; e ' ; e u ]: Eqs. (250)– (253) illustrate the third representation of the incremental gravity vector -ı, now in the basis [E ƒ ; E ' ; E u ]. The second basic result has to be interpreted as follows. The third representation of the incremental gravity vector ı contains horizontal components as well as a vertical component which are all functions of ( ; ; ı ). For instance, the east component Eƒ is a function of (first order) and of ( ; ı ) (second order). Or the vertical component Eu is a function of ı (first order) and
Spacetime Modeling of the Earth’s Gravity Field by Ellipsoidal Harmonics
of $(\eta, \xi)$ (second order). If one concentrates on first-order terms only, Eqs. (252) and (253) prove the identity of the first and third definition of vertical deflections.
Vertical deflections with respect to a basis in geometry space: Jacobi ellipsoidal coordinates $(\Lambda, \varphi, u)$.
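\[ -\delta\boldsymbol{\gamma} = e_\Lambda\,\gamma\eta + e_\varphi\,\gamma\xi + e_u\,\delta\gamma, \tag{250} \]
\[ -\delta\boldsymbol{\gamma} \doteq [E_\Lambda, E_\varphi, E_u] \begin{bmatrix} \gamma\eta + \sin\varphi\,\delta\Lambda\,\gamma\xi - \cos\varphi\,\delta\Lambda\,\delta\gamma \\ -\sin\varphi\,\delta\Lambda\,\gamma\eta + \gamma\xi - \left(\delta\varphi + \dfrac{\varepsilon^2}{4u^2}\sin 2\varphi\right)\delta\gamma \\ \cos\varphi\,\delta\Lambda\,\gamma\eta + \left(\delta\varphi + \dfrac{\varepsilon^2}{4u^2}\sin 2\varphi\right)\gamma\xi + \delta\gamma \end{bmatrix}. \tag{251} \]
Vertical deflections:
\[ \eta := \cos\varphi_\gamma\,\delta\Lambda_\gamma \doteq \cos\varphi\,\delta\Lambda, \tag{252} \]
\[ \xi := \delta\varphi_\gamma \doteq \delta\varphi + \frac{\varepsilon^2}{4u^2}\sin 2\varphi. \tag{253} \]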
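The potential theory of the horizontal and vertical components of the gravity field is reviewed in Eqs. (254)–(262). First, the reference gravity vector $\boldsymbol{\gamma}$ as the gradient of the gravity potential $w$ and the incremental gravity vector $\delta\boldsymbol{\gamma}$ as the gradient of the incremental gravity potential $\delta w$, also called disturbing potential, are presented, newly represented in the Jacobi ellipsoidal frame of reference $\{E_\Lambda, E_\varphi, E_u \mid P\}$. The elements $(g_{\Lambda\Lambda}, g_{\varphi\varphi}, g_{uu})$ of the matrix of the metric G have to be implemented. Second, the updated highlight is the first-order potential representation of $(\eta, \xi, \delta\gamma)$, Eqs. (257)–(259), as functionals of type $(D_\Lambda\delta w, D_\varphi\delta w, D_u\delta w)$.
Potential theory of horizontal and vertical components of the gravity field:
\[ \boldsymbol{\gamma} = \operatorname{grad} w = E_\Lambda\frac{1}{\sqrt{g_{\Lambda\Lambda}}}D_\Lambda w + E_\varphi\frac{1}{\sqrt{g_{\varphi\varphi}}}D_\varphi w + E_u\frac{1}{\sqrt{g_{uu}}}D_u w, \tag{254} \]
\[ \delta\boldsymbol{\gamma} = \operatorname{grad}\delta w = E_\Lambda\frac{1}{\sqrt{g_{\Lambda\Lambda}}}D_\Lambda\delta w + E_\varphi\frac{1}{\sqrt{g_{\varphi\varphi}}}D_\varphi\delta w + E_u\frac{1}{\sqrt{g_{uu}}}D_u\delta w. \tag{255} \]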
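Functionals of the disturbing potential $\delta w$:
\[ \begin{bmatrix} (1/\sqrt{g_{\Lambda\Lambda}})\,D_\Lambda\delta w \\ (1/\sqrt{g_{\varphi\varphi}})\,D_\varphi\delta w \\ (1/\sqrt{g_{uu}})\,D_u\delta w \end{bmatrix} = -\mathrm{T}\begin{bmatrix} \gamma\eta \\ \gamma\xi \\ \delta\gamma \end{bmatrix}, \tag{256} \]
\[ \eta = -\frac{1}{\gamma}\left(\frac{1}{\sqrt{g_{\Lambda\Lambda}}}D_\Lambda\delta w + \sin\varphi\,\delta\Lambda\frac{1}{\sqrt{g_{\varphi\varphi}}}D_\varphi\delta w - \cos\varphi\,\delta\Lambda\frac{1}{\sqrt{g_{uu}}}D_u\delta w\right), \tag{257} \]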
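\[ \xi = -\frac{1}{\gamma}\left(-\sin\varphi\,\delta\Lambda\frac{1}{\sqrt{g_{\Lambda\Lambda}}}D_\Lambda\delta w + \frac{1}{\sqrt{g_{\varphi\varphi}}}D_\varphi\delta w - \left(\delta\varphi + \frac{\varepsilon^2}{4u^2}\sin 2\varphi\right)\frac{1}{\sqrt{g_{uu}}}D_u\delta w\right), \tag{258} \]
\[ \delta\gamma = -\left(\cos\varphi\,\delta\Lambda\frac{1}{\sqrt{g_{\Lambda\Lambda}}}D_\Lambda\delta w + \left(\delta\varphi + \frac{\varepsilon^2}{4u^2}\sin 2\varphi\right)\frac{1}{\sqrt{g_{\varphi\varphi}}}D_\varphi\delta w + \frac{1}{\sqrt{g_{uu}}}D_u\delta w\right), \tag{259} \]
if
\[ \frac{\sin\varphi\,\delta\Lambda}{\sqrt{g_{\Lambda\Lambda}}}D_\Lambda\delta w \ll 1, \quad \frac{\cos\varphi\,\delta\Lambda}{\sqrt{g_{\Lambda\Lambda}}}D_\Lambda\delta w \ll 1, \quad \frac{\delta\varphi + (\varepsilon^2/4u^2)\sin 2\varphi}{\sqrt{g_{\varphi\varphi}}}D_\varphi\delta w \ll 1, \]
\[ \frac{\sin\varphi\,\delta\Lambda}{\sqrt{g_{\varphi\varphi}}}D_\varphi\delta w \ll 1, \quad \frac{\cos\varphi\,\delta\Lambda}{\sqrt{g_{uu}}}D_u\delta w \ll 1, \quad \frac{\delta\varphi + (\varepsilon^2/4u^2)\sin 2\varphi}{\sqrt{g_{uu}}}D_u\delta w \ll 1, \]
then
\[ \eta \doteq -\frac{1}{\gamma}\,\frac{1}{\sqrt{u^2 + \varepsilon^2}\cos\varphi}\,D_\Lambda\delta w, \tag{260} \]
\[ \xi \doteq -\frac{1}{\gamma}\,\frac{1}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}}\,D_\varphi\delta w, \tag{261} \]
\[ \delta\gamma \doteq -\sqrt{\frac{u^2 + \varepsilon^2}{u^2 + \varepsilon^2\sin^2\varphi}}\,D_u\delta w. \tag{262} \]
6
Potential Theory of Horizontal and Vertical Components of the Gravity Field: Gravity Disturbance and Vertical Deflections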
In this section the gravitational disturbing potential is represented in terms of Jacobi ellipsoidal harmonics. As soon as one refers to a normal potential of Somigliana-Pizzetti type, the ellipsoidal harmonics of degree/order (0,0), (1,0), (1,−1), (1,1), and (2,0) are eliminated from the gravitational disturbing potential. In order to present the potential theory of the horizontal and vertical components of the gravity field, namely, ellipsoidal vertical deflections and ellipsoidal gravity disturbance, one has to decide on the proper choice of the ellipsoidal potential field of reference $w(\Lambda, \varphi, u)$ and on the related ellipsoidal incremental potential field $\delta w(\Lambda, \varphi, u)$, also called "disturbing potential."
6.1
Ellipsoidal Reference Potential of Type Somigliana-Pizzetti
Three proposals have been made for an ellipsoidal potential field of reference. The first choice is the zero-degree term $[\operatorname{arccot}(u/\varepsilon)]\,(GM/\varepsilon)$ of the external ellipsoidal harmonic expansion. Indeed, it would correspond to the zero-order term $GM/r$ of the external spherical harmonic expansion. As proven in Grafarend and Ardalan (1999), the equipotential surface $w = [\operatorname{arccot}(u/\varepsilon)]\,(GM/\varepsilon) = \text{constant}$, where GM is the geocentric gravitational constant, is an ellipsoid of revolution $u = \text{constant}$. Unfortunately, such an equipotential reference surface does not include the rotation of the Earth, namely, its centrifugal potential $\Omega^2\varepsilon^2[1 + (u/\varepsilon)^2]/3 - \Omega^2\varepsilon^2[1 + (u/\varepsilon)^2]\,P_2(\sin\varphi)/3$. Accordingly, a second choice has been proposed, namely, $[\operatorname{arccot}(u/\varepsilon)]\,GM/\varepsilon + \Omega^2(u^2 + \varepsilon^2)\cos^2\varphi/2$. In this approach the zero-degree term of the gravitational potential and the centrifugal potential are superimposed. Unfortunately, the level surface $[\operatorname{arccot}(u/\varepsilon)]\,GM/\varepsilon + (1 - P_2(\sin\varphi))\,\Omega^2\varepsilon^2[1 + (u/\varepsilon)^2]/3 = \text{constant}$ is not an ellipsoid of revolution. It is for this reason that the third choice has been adopted: superimpose the gravitational potential, which is externally expanded in ellipsoidal harmonics, and the centrifugal potential, represented also in ellipsoidal base functions (the centrifugal potential is not a harmonic function), and postulate an equipotential reference surface to be a level ellipsoid. Such a level ellipsoid should be an ellipsoid of revolution. Such an ellipsoidal reference field has been developed by Pizzetti (1894) and Somigliana (1929) and is properly called the Somigliana-Pizzetti reference potential. The Euclidean length of its gradient is referred to as the International Gravity Formula, which has been developed to the sub-nano Gal level by Ardalan and Grafarend (2001).
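Here, the recommendation of the International Association of Geodesy (Moritz 1984) is followed, namely, to use the Somigliana-Pizzetti potential of a level ellipsoid as the reference potential, summarized in Eqs. (263)–(267).
Reference gravity potential field of type Somigliana-Pizzetti. Reference level ellipsoid, semimajor axis a, semiminor axis b, absolute eccentricity $\varepsilon = \sqrt{a^2 - b^2}$:
\[ \mathbb{E}^2_{a,b} := \{\mathbf{X} \in \mathbb{R}^3 \mid (X^2 + Y^2)/a^2 + Z^2/b^2 = 1\}. \tag{263} \]
The first version of the reference potential field:
\[ w(\varphi, u) = \frac{GM}{\varepsilon}\operatorname{arccot}\frac{u}{\varepsilon} + \frac{1}{6}\Omega^2a^2\,\frac{\left[3\left(\frac{u}{\varepsilon}\right)^2 + 1\right]\operatorname{arccot}\frac{u}{\varepsilon} - 3\frac{u}{\varepsilon}}{\left[3\left(\frac{b}{\varepsilon}\right)^2 + 1\right]\operatorname{arccot}\frac{b}{\varepsilon} - 3\frac{b}{\varepsilon}}\,(3\sin^2\varphi - 1) + \frac{1}{2}\Omega^2(u^2 + \varepsilon^2)\cos^2\varphi. \tag{264} \]
Input: 4 parameters: $GM$, $\Omega$, $a$, and $\varepsilon = \sqrt{a^2 - b^2}$ or $b$.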
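A quick way to see that Eq. (264) defines a level ellipsoid is to evaluate it on u = b for several latitudes. The sketch below does this in plain Python. The numerical values of GM and Ω are assumptions (commonly quoted geodetic constants, not taken from this section); a and b are the values listed in Table 2, and ε is computed from them rather than copied.

```python
import math

GM = 3.986004418e14      # m^3/s^2, assumed value
OMEGA = 7.292115e-5      # rad/s, assumed value
A = 6378136.59           # m, semimajor axis (Table 2)
B = 6356751.0            # m, semiminor axis (Table 2)
EPS = math.sqrt(A * A - B * B)

def arccot(x):
    """arccot for positive arguments."""
    return math.atan2(1.0, x)

def q(x):
    """Bracket appearing in Eq. (264): (3x^2 + 1) arccot(x) - 3x."""
    return (3.0 * x * x + 1.0) * arccot(x) - 3.0 * x

def w_ref(phi, u):
    """Somigliana-Pizzetti reference potential, first version, Eq. (264)."""
    grav = GM / EPS * arccot(u / EPS)
    quad = OMEGA**2 * A**2 / 6.0 * q(u / EPS) / q(B / EPS) * (3.0 * math.sin(phi)**2 - 1.0)
    cent = 0.5 * OMEGA**2 * (u * u + EPS * EPS) * math.cos(phi)**2
    return grav + quad + cent

# Value implied by the (0,0) condition: W0 = (GM/eps) arccot(b/eps) + Omega^2 a^2 / 3
W0 = GM / EPS * arccot(B / EPS) + OMEGA**2 * A**2 / 3.0

if __name__ == "__main__":
    for deg in (0, 30, 60, 90):
        print(deg, w_ref(math.radians(deg), B) - W0)
```

The differences printed are at rounding level, which reflects that on u = b the latitude-dependent terms of Eq. (264) cancel.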
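Legendre polynomials of the first and second kind:
\[ \operatorname{arccot}(u/\varepsilon) = Q_{00}(u/\varepsilon), \qquad \frac{\left[3\left(\frac{u}{\varepsilon}\right)^2 + 1\right]\operatorname{arccot}\frac{u}{\varepsilon} - 3\frac{u}{\varepsilon}}{\left[3\left(\frac{b}{\varepsilon}\right)^2 + 1\right]\operatorname{arccot}\frac{b}{\varepsilon} - 3\frac{b}{\varepsilon}} = \frac{Q_{20}(u/\varepsilon)}{Q_{20}(b/\varepsilon)}, \tag{265} \]
\[ 3\sin^2\varphi - 1 = \frac{2}{\sqrt{5}}\,P_{20}(\sin\varphi). \tag{266} \]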
The second version of the reference potential field:
\[ w(\varphi, u) = \frac{GM}{\varepsilon}Q_{00}(u/\varepsilon) + \frac{\sqrt{5}}{15}\Omega^2a^2\,\frac{Q_{20}(u/\varepsilon)}{Q_{20}(b/\varepsilon)}\,P_{20}(\sin\varphi) + \frac{1}{2}\Omega^2(u^2 + \varepsilon^2)\cos^2\varphi. \tag{267} \]
Constraints: reference gravity potential field of type Somigliana-Pizzetti. Conditions for the ellipsoidal terms of degree/order (0,0) and (2,0). The first version:
\[ (0,0):\quad u_{00} + \frac{1}{3}\Omega^2a^2 = W_0, \tag{268} \]
\[ (2,0):\quad u_{20} - \frac{\sqrt{5}}{15}\Omega^2a^2 = 0. \tag{269} \]
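The first condition in multipole expansion:
\[ u_{00} = \frac{GM}{\varepsilon}\operatorname{arccot}\frac{b}{\varepsilon} = \frac{GM}{\sqrt{a^2 - b^2}}\operatorname{arccot}\frac{b}{\sqrt{a^2 - b^2}}, \tag{270} \]
\[ u_{00} = \frac{GM}{\varepsilon}\,Q_{00}\!\left(\frac{b}{\varepsilon}\right). \tag{271} \]
Corollary:
\[ \frac{GM}{\sqrt{a^2 - b^2}}\operatorname{arccot}\frac{b}{\sqrt{a^2 - b^2}} + \frac{1}{3}\Omega^2a^2 - W_0 = 0. \tag{272} \]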
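The second condition in multipole expansion:
\[ u_{20} = \frac{G}{\varepsilon}\,\frac{1}{2}\left[\left(3\left(\frac{b}{\varepsilon}\right)^2 + 1\right)\operatorname{arccot}\frac{b}{\varepsilon} - 3\frac{b}{\varepsilon}\right]J_{20}, \tag{273} \]
\[ u_{20} = \frac{G}{\varepsilon}\,Q_{20}\!\left(\frac{b}{\varepsilon}\right)J_{20}. \tag{274} \]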
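Ellipsoidal multipole of degree/order (2,0):
\[ J_{20} = \int_0^{2\pi} d\Lambda' \int_{-\pi/2}^{\pi/2} d\varphi'\cos\varphi' \int_0^{u'(\Lambda',\varphi')} du'\,(u'^2 + \varepsilon^2\sin^2\varphi')\,\frac{1}{2}\left[3\left(\frac{u'}{\varepsilon}\right)^2 + 1\right]\frac{\sqrt{5}}{2}(3\sin^2\varphi' - 1)\,\rho(\Lambda', \varphi', u'). \tag{275} \]
Functional of the mass density field $\rho(\Lambda', \varphi', u')$:
\[ J_{20} = \int_0^{2\pi} d\Lambda' \int_{-\pi/2}^{\pi/2} d\varphi'\cos\varphi' \int_0^{u'(\Lambda',\varphi')} du'\,(u'^2 + \varepsilon^2\sin^2\varphi')\,P^{*}_{20}\!\left(\frac{u'}{\varepsilon}\right)P_{20}(\sin\varphi')\,\rho(\Lambda', \varphi', u'). \tag{276} \]
Cartesian multipole of degree two versus ellipsoidal multipole of degree/order (2,0):
\[ J_{11} = A, \quad J_{22} = B, \quad J_{33} = C, \tag{277} \]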
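\[ J_{pq} = \int dw_3\,(\|\mathbf{X}\|^2\delta_{pq} - X_pX_q)\,\rho(X, Y, Z) \quad \forall\,p, q \in \{1, 2, 3\}, \tag{278} \]
\[ J_{11} = \int dw_3\,(Y^2 + Z^2)\,\rho(X, Y, Z), \qquad J_{22} = \int dw_3\,(X^2 + Z^2)\,\rho(X, Y, Z), \tag{279} \]
\[ J_{33} = \int dw_3\,(X^2 + Y^2)\,\rho(X, Y, Z), \qquad J_{20} = \frac{3}{2}\,\frac{\sqrt{5}}{\varepsilon^2}\left(\frac{A + B}{2} - C + \frac{1}{3}M\varepsilon^2\right). \tag{280} \]
The second condition in Cartesian multipole expansion:
\[ u_{20} = \frac{3\sqrt{5}}{8}\,\frac{1}{\varepsilon^3}\left[\left(3\left(\frac{b}{\varepsilon}\right)^2 + 1\right)\operatorname{arccot}\frac{b}{\varepsilon} - 3\frac{b}{\varepsilon}\right]\left[G(A + B - 2C) + \frac{2}{3}GM\varepsilon^2\right], \tag{281} \]
\[ u_{20} = \frac{3\sqrt{5}}{4}\,\frac{1}{\varepsilon^3}\,Q_{20}\!\left(\frac{b}{\varepsilon}\right)\left[G(A + B - 2C) + \frac{2}{3}GM\varepsilon^2\right]. \tag{282} \]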
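Corollary:
\[ \left[\frac{1}{4}\,\frac{GM}{\sqrt{a^2 - b^2}} + \frac{3}{8}\,\frac{G(A + B - 2C)}{(a^2 - b^2)^{3/2}}\right]\left[\left(\frac{3b^2}{a^2 - b^2} + 1\right)\operatorname{arccot}\frac{b}{\sqrt{a^2 - b^2}} - \frac{3b}{\sqrt{a^2 - b^2}}\right] - \frac{1}{15}\Omega^2a^2 = 0. \tag{283} \]
Corollary (second ellipsoidal condition):
\[ u_{20} = \frac{GM}{\varepsilon}\operatorname{arccot}\!\left(\frac{b}{\varepsilon}\right)C_{20} \tag{284} \]
is a transformation of the dimensionless ellipsoidal coefficient of degree/order (2,0), $C_{20}$, to the nondimensionless ellipsoidal coefficient of degree/order (2,0), $u_{20}$, and then the second condition in the ellipsoidal harmonic coefficient $C_{20}$ is
\[ \frac{GM}{\sqrt{a^2 - b^2}}\operatorname{arccot}\!\left(\frac{b}{\sqrt{a^2 - b^2}}\right)C_{20} - \frac{1}{3\sqrt{5}}\Omega^2a^2 = 0. \tag{285} \]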
Lemma (World Geodetic Datum, Grafarend and Ardalan 1999). If the parameters $\{W_0, GM, C_{20}, \Omega\}$ are given, the Newton iteration of the two nonlinear condition equations is contractive and leads to $a = a(W_0, GM, C_{20}, \Omega)$, $b = b(W_0, GM, C_{20}, \Omega)$, and $\varepsilon = \varepsilon(W_0, GM, C_{20}, \Omega)$.
6.2
Ellipsoidal Reference Gravity Intensity of Type Somigliana-Pizzetti
Vertical deflections $\{\eta(\Lambda, \varphi, u), \xi(\Lambda, \varphi, u)\}$, defined as the longitudinal and lateral derivatives of the disturbing potential, are normalized by means of the reference gravity intensity $\gamma(\Lambda, \varphi, u) := \|\operatorname{grad} w(\Lambda, \varphi, u)\|$ (Eqs. (254)–(262), (285), and (286)). Here the aim is to compute the modulus of reference gravity with respect to the ellipsoidal reference potential of type Somigliana-Pizzetti. The detailed computation of $\|\operatorname{grad} w(\Lambda, \varphi, u)\|$ is presented by means of Eqs. (285)–(291), two lemmas, and two corollaries. Since the reference potential of type Somigliana-Pizzetti $w(\varphi, u)$ depends only on spheroidal latitude $\varphi$ and spheroidal height $u$, the modulus of the reference gravity vector, Eq. (286), is a nonlinear operator based upon the lateral derivative $D_\varphi w$ and the vertical derivative $D_u w$. As soon as one departs from the standard representation of the gradient operator in orthogonal coordinates, namely, $\operatorname{grad} w = e_\Lambda(g_{\Lambda\Lambda})^{-1/2}D_\Lambda w + e_\varphi(g_{\varphi\varphi})^{-1/2}D_\varphi w + e_u(g_{uu})^{-1/2}D_u w$, one arrives at the standard form of $\|\operatorname{grad} w\|$ of type Eq. (286). Here, for the near-field computation, one shall assume $x = (u^2 + \varepsilon^2)^{-1}(D_\varphi w/D_u w)^2$. Accordingly, by means of Eq. (287),
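$\sqrt{1 + x}$ is expanded in binomial series, which leads to the first-order approximation of $\gamma = \|\operatorname{grad} w\|$ by Eq. (288). Obviously up to O(2), it is sufficient to compute the vertical derivative $|D_u w|$. An explicit version of $D_u w$ is given by Eqs. (291) and (292), in the form of ellipsoidal base functions by Eq. (293).
Reference gravity intensity of type Somigliana-Pizzetti, International Gravity Formula:
\[ \gamma = \|\operatorname{grad} w(\varphi, u)\| = \sqrt{\langle\operatorname{grad} w(\varphi, u) \mid \operatorname{grad} w(\varphi, u)\rangle} \]
\[ = \sqrt{(u^2 + \varepsilon^2\sin^2\varphi)^{-1}(D_\varphi w)^2 + (u^2 + \varepsilon^2)(u^2 + \varepsilon^2\sin^2\varphi)^{-1}(D_u w)^2} \]
\[ = \sqrt{(u^2 + \varepsilon^2)/(u^2 + \varepsilon^2\sin^2\varphi)}\,\sqrt{(D_u w)^2 + (u^2 + \varepsilon^2)^{-1}(D_\varphi w)^2}, \tag{286} \]
\[ \gamma = \sqrt{(u^2 + \varepsilon^2)/(u^2 + \varepsilon^2\sin^2\varphi)}\;|D_u w|\,\sqrt{1 + (u^2 + \varepsilon^2)^{-1}(D_\varphi w/D_u w)^2}, \tag{287} \]
\[ x := (u^2 + \varepsilon^2)^{-1}(D_\varphi w/D_u w)^2, \qquad \sqrt{1 + x} = 1 + \tfrac{1}{2}x - \tfrac{1}{8}x^2 + O(3) \quad \forall\,|x| < 1. \]
If $x = (u^2 + \varepsilon^2)^{-1}(D_\varphi w/D_u w)^2 < 1$, then
\[ \gamma = \sqrt{(u^2 + \varepsilon^2)/(u^2 + \varepsilon^2\sin^2\varphi)}\;|D_u w| + O(2). \tag{288} \]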
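The first version of $D_u w$ reads
\[ D_u w = -\frac{GM}{u^2 + \varepsilon^2} + \frac{1}{6}\Omega^2a^2\,\frac{\dfrac{6u}{\varepsilon^2}\operatorname{arccot}\dfrac{u}{\varepsilon} - \left[3\left(\dfrac{u}{\varepsilon}\right)^2 + 1\right]\dfrac{\varepsilon}{u^2 + \varepsilon^2} - \dfrac{3}{\varepsilon}}{\left[3\left(\dfrac{b}{\varepsilon}\right)^2 + 1\right]\operatorname{arccot}\dfrac{b}{\varepsilon} - 3\dfrac{b}{\varepsilon}}\,(3\sin^2\varphi - 1) + \Omega^2u\cos^2\varphi, \tag{289} \]
\[ D_u w = -\frac{GM}{u^2 + \varepsilon^2} - \frac{1}{6}\Omega^2a^2\,\frac{1}{u^2 + \varepsilon^2}\,\frac{6\left[\left(\frac{u}{\varepsilon}\right)^2 + 1\right]u\operatorname{arccot}\frac{u}{\varepsilon} - 2\varepsilon\left[3\left(\frac{u}{\varepsilon}\right)^2 + 2\right]}{\left[3\left(\frac{b}{\varepsilon}\right)^2 + 1\right]\operatorname{arccot}\frac{b}{\varepsilon} - 3\frac{b}{\varepsilon}} \]
\[ \qquad + \frac{1}{2}\Omega^2a^2\,\frac{\sin^2\varphi}{u^2 + \varepsilon^2}\,\frac{6\left[\left(\frac{u}{\varepsilon}\right)^2 + 1\right]u\operatorname{arccot}\frac{u}{\varepsilon} - 2\varepsilon\left[3\left(\frac{u}{\varepsilon}\right)^2 + 2\right]}{\left[3\left(\frac{b}{\varepsilon}\right)^2 + 1\right]\operatorname{arccot}\frac{b}{\varepsilon} - 3\frac{b}{\varepsilon}} + \Omega^2u\cos^2\varphi. \tag{290} \]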
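The second version of $D_u w$ reads
\[ D_u w = \frac{GM}{\varepsilon}\,[Q_{00}(u/\varepsilon)]' + \frac{\sqrt{5}}{15}\Omega^2a^2\,P_{20}(\sin\varphi)\,\frac{[Q_{20}(u/\varepsilon)]'}{Q_{20}(b/\varepsilon)} + \Omega^2u\cos^2\varphi. \tag{291} \]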
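A more useful closed-form representation of the reference gravity intensity of type Somigliana-Pizzetti will be given by two lemmas and two corollaries. First, if one collects the coefficients of $\{1, \cos^2\varphi, \sin^2\varphi\}$ by $\{\gamma_0, \gamma_c, \gamma_s\}$, one is led to the representation of $\gamma$ by Eqs. (292)–(295), expressed in Lemma 1. Second, if one takes advantage of the (a, b) representation of $\sqrt{u^2 + \varepsilon^2}/\sqrt{u^2 + \varepsilon^2\sin^2\varphi}$ and decomposes $\gamma_0$ according to $\gamma_0(\sin^2\varphi + \cos^2\varphi)$, one is led to the alternative elegant representation of $\gamma$ by Eqs. (296)–(303), presented in Lemma 2.
Lemma 1. Reference gravity intensity of type Somigliana-Pizzetti, International Gravity Formula: If $x = (u^2 + \varepsilon^2)^{-1}(D_\varphi w/D_u w)^2 \ll 1$ holds, where $w = w(\varphi, u)$ is the reference potential field of type Somigliana-Pizzetti, then its gravity field intensity $\gamma(\varphi, u)$ can be represented up to the order O(2) by
\[ \gamma = \sqrt{(u^2 + \varepsilon^2)/(u^2 + \varepsilon^2\sin^2\varphi)}\;|\gamma_0 + \gamma_c\cos^2\varphi + \gamma_s\sin^2\varphi| \tag{292} \]
subject to
\[ \gamma_0 = -\frac{GM}{a^2 + (u + b)(u - b)} - \frac{1}{3}\Omega^2a^2\,\frac{1}{a^2 + (u + b)(u - b)}\,\frac{3((u/\varepsilon)^2 + 1)\,u\operatorname{arccot}(u/\varepsilon) - \varepsilon(3(u/\varepsilon)^2 + 2)}{(3(b/\varepsilon)^2 + 1)\operatorname{arccot}(b/\varepsilon) - 3(b/\varepsilon)}, \tag{293} \]
\[ \gamma_c = \Omega^2u, \tag{294} \]
\[ \gamma_s = \Omega^2a^2\,\frac{1}{a^2 + (u + b)(u - b)}\,\frac{3((u/\varepsilon)^2 + 1)\,u\operatorname{arccot}(u/\varepsilon) - \varepsilon(3(u/\varepsilon)^2 + 2)}{(3(b/\varepsilon)^2 + 1)\operatorname{arccot}(b/\varepsilon) - 3(b/\varepsilon)}. \tag{295} \]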
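Lemma 1 can be exercised numerically. The following sketch evaluates Eqs. (292)–(295) on the reference ellipsoid and cross-checks the closed form against a central finite difference of the potential of Eq. (264). GM and Ω are assumed values (commonly quoted constants, not given in this section), a and b are taken from Table 2, and all function names are illustrative.

```python
import math

GM = 3.986004418e14      # m^3/s^2, assumed value
OMEGA = 7.292115e-5      # rad/s, assumed value
A = 6378136.59           # m (Table 2)
B = 6356751.0            # m (Table 2)
EPS = math.sqrt(A * A - B * B)

def arccot(x):
    return math.atan2(1.0, x)

def q(x):
    return (3.0 * x * x + 1.0) * arccot(x) - 3.0 * x

def w_ref(phi, u):
    """Reference potential, Eq. (264), used here only for the cross-check."""
    return (GM / EPS * arccot(u / EPS)
            + OMEGA**2 * A**2 / 6.0 * q(u / EPS) / q(B / EPS) * (3.0 * math.sin(phi)**2 - 1.0)
            + 0.5 * OMEGA**2 * (u * u + EPS * EPS) * math.cos(phi)**2)

def gamma_ref(phi, u):
    """Reference gravity intensity up to O(2), Eqs. (292)-(295)."""
    d = A * A + (u + B) * (u - B)                 # equals u^2 + eps^2
    x = u / EPS
    ratio = (3.0 * (x * x + 1.0) * u * arccot(x) - EPS * (3.0 * x * x + 2.0)) / q(B / EPS)
    g0 = -GM / d - OMEGA**2 * A**2 / (3.0 * d) * ratio
    gc = OMEGA**2 * u
    gs = OMEGA**2 * A**2 / d * ratio
    pref = math.sqrt(d / (u * u + EPS * EPS * math.sin(phi)**2))
    return pref * abs(g0 + gc * math.cos(phi)**2 + gs * math.sin(phi)**2)

def gamma_fd(phi, u, h=100.0):
    """Same quantity from a central difference of w in u, valid where D_phi w = 0."""
    du_w = (w_ref(phi, u + h) - w_ref(phi, u - h)) / (2.0 * h)
    pref = math.sqrt((u * u + EPS * EPS) / (u * u + EPS * EPS * math.sin(phi)**2))
    return pref * abs(du_w)

if __name__ == "__main__":
    print("equator:", gamma_ref(0.0, B), "pole:", gamma_ref(math.pi / 2, B))
```

At the equator and at the poles the lateral derivative vanishes by symmetry, so the finite-difference value and the closed form should agree closely there; at other latitudes they differ by the neglected O(2) term.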
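Lemma 2. Reference gravity intensity of type Somigliana-Pizzetti, International Gravity Formula: If $x = (u^2 + \varepsilon^2)^{-1}(D_\varphi w/D_u w)^2$ …
\[ D_\varphi P^{*}_{kl}(\sin\varphi) = \begin{cases} \sqrt{2(2k + 1)\dfrac{(k - l)!}{(k + l)!}}\;D_\varphi P_{kl}(\sin\varphi) & \text{for } l \neq 0, \\[2ex] \sqrt{2k + 1}\;D_\varphi P_{kl}(\sin\varphi) & \text{for } l = 0, \end{cases} \tag{380} \]
\[ D_\varphi P_{kl}(\sin\varphi) = P_{k,l+1}(\sin\varphi) - l\tan\varphi\,P_{kl}(\sin\varphi), \tag{381} \]
\[ D_u q_{kl} = \frac{1}{Q_{kl}(b/\varepsilon)}\,D_u Q_{kl}(u/\varepsilon). \tag{382} \]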
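The recursion in Eq. (381) is easy to verify for degree k = 2, where the associated Legendre functions have short closed forms. The sketch below assumes the unnormalized functions without the Condon-Shortley phase, that is, P_20 = (3 sin²φ − 1)/2, P_21 = 3 sin φ cos φ, P_22 = 3 cos²φ; this sign convention is an assumption, since the section does not state it.

```python
import math

def p2(l, phi):
    """Unnormalized associated Legendre functions of degree 2 at sin(phi),
    assumed without the Condon-Shortley phase; P_2l = 0 for l > 2."""
    s, c = math.sin(phi), math.cos(phi)
    if l == 0:
        return 0.5 * (3.0 * s * s - 1.0)
    if l == 1:
        return 3.0 * s * c
    if l == 2:
        return 3.0 * c * c
    return 0.0

def dphi_p2_recursion(l, phi):
    """Latitude derivative from Eq. (381)."""
    return p2(l + 1, phi) - l * math.tan(phi) * p2(l, phi)

def dphi_p2_numeric(l, phi, h=1e-6):
    """Central finite difference in phi for comparison."""
    return (p2(l, phi + h) - p2(l, phi - h)) / (2.0 * h)

if __name__ == "__main__":
    for l in (0, 1, 2):
        print(l, dphi_p2_recursion(l, 0.3), dphi_p2_numeric(l, 0.3))
```

The recursion avoids numerical differentiation in the synthesis formulas, at the price of the tan φ factor, which needs care close to the poles.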
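Finally, by means of Eqs. (383)–(386), one can derive an explicit representation of vertical deflections, in particular Eq. (384) for East $\eta(\Lambda, \varphi, u)$, Eq. (385) for North $\xi(\Lambda, \varphi, u)$, and Eq. (386) for the gravity disturbance $\delta\gamma(\Lambda, \varphi, u)$.
Synthesis of windowed vertical deflections and windowed gravity disturbance explicitly in terms of ellipsoidal harmonics:
\[ \delta\boldsymbol{\gamma} = \operatorname{grad}\delta w = -(e_\Lambda\,\gamma\eta + e_\varphi\,\gamma\xi + e_u\,\delta\gamma) + O(2), \tag{383} \]
\[ \eta(\Lambda, \varphi, u) = -\frac{1}{\gamma(\varphi, u)}\,\frac{GM}{\varepsilon}\,\frac{\operatorname{arccot}(b/\varepsilon)}{\sqrt{u^2 + \varepsilon^2}\cos\varphi} \sum_{k=2}^{K}\;\sum_{\substack{l=0 \\ (k,l) \neq (2,0)}}^{k} q_{kl}(u)\,P_{kl}(\sin\varphi)\,(-l\,c_{kl}\sin l\Lambda + l\,s_{kl}\cos l\Lambda), \tag{384} \]
\[ \xi(\Lambda, \varphi, u) = -\frac{1}{\gamma(\varphi, u)}\,\frac{GM}{\varepsilon}\,\frac{\operatorname{arccot}(b/\varepsilon)}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}} \sum_{k=2}^{K}\;\sum_{\substack{l=0 \\ (k,l) \neq (2,0)}}^{k} q_{kl}(u)\,[D_\varphi P_{kl}(\sin\varphi)]\,(c_{kl}\cos l\Lambda + s_{kl}\sin l\Lambda), \tag{385} \]
\[ \delta\gamma(\Lambda, \varphi, u) = -\frac{GM}{\varepsilon}\operatorname{arccot}(b/\varepsilon)\,\frac{\sqrt{u^2 + \varepsilon^2}}{\sqrt{u^2 + \varepsilon^2\sin^2\varphi}} \sum_{k=2}^{K}\;\sum_{\substack{l=0 \\ (k,l) \neq (2,0)}}^{k} [D_u q_{kl}(u)]\,P_{kl}(\sin\varphi)\,(c_{kl}\cos l\Lambda + s_{kl}\sin l\Lambda). \tag{386} \]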
8
Case Studies
For an ellipsoidal harmonic Gravity Earth Model (SEGEN: http://www.uni-stuttgart.de/gi/research/paper/coefficients/coefficients.zip) up to degree/order 360/360, global maps of ellipsoidal vertical deflections and ellipsoidal gravity disturbances are computed in a properly chosen equiareal ellipsoidal map projection; these maps convey a great amount of geophysical information. The potential theory of the incremental gravity vector $\delta\boldsymbol{\gamma}$ in the operational Jacobi ellipsoidal frame of reference $\{E_\Lambda, E_\varphi, E_u \mid P\}$ at a point P has been outlined. The incremental potential, also called disturbing potential $\delta w$, has been defined with respect to the Somigliana-Pizzetti reference potential $w(\varphi, u)$, which includes the ellipsoidal harmonic coefficients up to degree/order (2,0) and the centrifugal potential. Note that the Somigliana-Pizzetti reference potential $w(\varphi, u)$ is not a harmonic function, a result caused by the centrifugal potential. Accordingly, the incremental potential $\delta w$ is a harmonic function apart from the topographic
Table 2 Form parameters of the ellipsoid $\mathbb{E}^2_{A_1,A_1,A_2}$ (tide-free system)
$A_1 = a$ = 6,378,136.59 m
$A_2 = b$ = 6,356,751 m
$\varepsilon = \sqrt{a^2 - b^2}$ = 521,853.56 m
masses on top of the Somigliana-Pizzetti level ellipsoid, also called International Reference Ellipsoid $\mathbb{E}^2_{A_1,A_1,A_2}$. The ellipsoidal topographic potential, namely, the ellipsoidal terrain effect, as well as topographic gravity will be introduced in a subsequent contribution. Here, the global and regional computations of vertical deflections with respect to the Somigliana-Pizzetti modulus of gravity (closed form) and of the gravity disturbance are presented. We begin with a review of the input data: Table 2 summarizes the data of the International Reference Ellipsoid $\mathbb{E}^2_{A_1,A_1,A_2}$, also called World Geodetic Datum 2000 (Grafarend and Ardalan 1999; Groten 2000). The ellipsoidal harmonic coefficients $\{C_{kl}, S_{kl}\}$ are given at http://www.uni-stuttgart.de/gi/research/paper/coefficients/coefficients.zip (for the derivation of those ellipsoidal harmonic coefficients, see Ardalan and Grafarend 2000). A sample set of degree/order terms up to (360/360) is given in Table 3. Based on those input data, Eqs. (383)–(386), and Table 2, the following have been computed:
Plate I: West-east component of the deflection of the vertical on the International Reference Ellipsoid, Mollweide projection
Plate II: West-east component of the deflection of the vertical on the International Reference Ellipsoid, polar regions, equal-area azimuthal projection of $\mathbb{E}^2_{A_1,A_1,A_2}$
Plate III: South-north component of the deflection of the vertical on the International Reference Ellipsoid, Mollweide projection
Plate IV: South-north component of the deflection of the vertical on the International Reference Ellipsoid, polar regions, equal-area azimuthal projection of $\mathbb{E}^2_{A_1,A_1,A_2}$
Plate V: Gravity disturbance $\delta\gamma$ on the International Reference Ellipsoid, Mollweide projection
The seismotectonic structure of the Earth is strongly visible in all plates, but a detailed analysis is left to a subsequent contribution.
For a study of the Mollweide projection of the biaxial ellipsoid, refer to Grafarend and Heidenreich (1995).
9
Curvilinear Datum Transformations
The conformal group $C_7(3)$, also called similarity transformation, over $\mathbb{R}^3$ is introduced here. The seven parameters which characterize $C_7(3)$ are (i) translation (three parameters: $t_x, t_y, t_z$), (ii) rotation (three parameters: $SO(3) := \{\mathrm{R} \in \mathbb{R}^{3\times 3} \mid \mathrm{R}^{\mathrm{T}}\mathrm{R} = \mathrm{I}_3, \det\mathrm{R} = +1\}$, for instance, Cardan angles $(\alpha, \beta, \gamma)$, $\mathrm{R}(\alpha, \beta, \gamma) = \mathrm{R}_1(\alpha)\mathrm{R}_2(\beta)\mathrm{R}_3(\gamma)$), and (iii) dilatation (one parameter, also called scale: 1 + s). Under the action of the conformal group (similarity transformation T),
Table 3 Ellipsoidal harmonic coefficients of SEGEN (Special Ellipsoid Gravity Earth Normal) tide-free system (Ardalan and Grafarend 2000; Ekman 1996)
k     l     C_kl                   S_kl
0     0     +1.00000000E+00        +0.00000000E+00
1     0     +0.00000000E+00        +0.00000000E+00
1     1     +0.00000000E+00        +0.00000000E+00
2     0     +5.13772373137E-04     +0.00000000E+00
2     1     -2.02347635955E-10     +1.27304012031E-09
2     2     +2.43914352398E-06     -1.40016683654E-06
3     0     +9.53080597633E-07     +0.00000000E+00
3     1     -2.02347635955E-10     +1.27304012031E-09
3     2     +9.04627768605E-07     -6.19025944205E-07
3     3     +7.21072657057E-07     +1.41435626958E-06
4     0     -2.50440406743E-07     +0.00000000E+00
4     1     -5.36322406987E-07     -4.73435295583E-07
4     2     +3.57427905814E-07     +6.58806099520E-07
4     3     +9.90771803829E-07     -2.00928369177E-07
4     4     -1.88560802735E-07     +3.08853169333E-07
5     0     +7.41875911881E-08     +0.00000000E+00
5     1     -6.21023518983E-08     -9.44155466167E-08
5     2     +6.56698810466E-07     -3.26265030571E-07
5     3     -4.49691388810E-07     -2.10406407888E-07
5     4     -2.95301647654E-07     +4.96658876769E-08
5     5     +1.74971983203E-07     -6.69384278219E-07
10    10    +1.00538634409E-07     -2.40148449520E-08
20    20    +4.01448327968E-09     -1.20450644785E-08
36    36    +4.60146465720E-09     -5.94245336314E-09
60    60    +4.23068069789E-09     +3.92983780545E-10
120   120   -4.56798788660E-10     -1.59135018852E-09
180   180   -4.06572704272E-10     -5.87726119822E-10
240   240   -2.30780589856E-10     -4.60857985599E-11
300   300   -5.02336888312E-11     -1.01275530680E-10
360   360   -4.47516389678E-25     -8.30224945525E-11
angles and distance ratios in a Euclidean $\mathbb{R}^3$ are left equivariant (invariant). The standard representation of $\mathrm{T} \in C_7(3)$ is given by Eqs. (387)–(389). {x, y, z} is a set of coordinates which locates a point in a three-dimensional Weitzenböck space $\mathbb{W}^3_l$ called left. Likewise, {X, Y, Z} is a set of coordinates locating a homologous point in a three-dimensional Weitzenböck space $\mathbb{W}^3_r$ called right. A Weitzenböck space $\mathbb{W}^3$ is a three-dimensional space in which angles or Euclidean distance ratios are the structure elements. For instance, if the scalar product of two relative placement vectors $X_2 - X_1$ and $X_3 - X_1$ is denoted by the bracket $\langle X_2 - X_1 \mid X_3 - X_1\rangle$, the cosine of the angle
$\Psi_X = \angle(X_2 - X_1, X_3 - X_1)$ can be represented by $\cos\Psi_X = \langle X_2 - X_1 \mid X_3 - X_1\rangle / (\|X_2 - X_1\|_2\,\|X_3 - X_1\|_2)$. $\|X_2 - X_1\|_2$ and $\|X_3 - X_1\|_2$ denote the Euclidean lengths of the relative placement vectors $X_2 - X_1$ and $X_3 - X_1$, respectively. The transformation T: $\mathbb{W}^3_l \leftrightarrow \mathbb{W}^3_r$ leaves angles ("space angles") and distance ratios equivariant, namely, $\cos\Psi_X = \cos\psi_x$ and $\|X_2 - X_1\|_2 / \|X_3 - X_1\|_2 = \|x_2 - x_1\|_2 / \|x_3 - x_1\|_2$, a property also called invariance under the similarity transformation.
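The invariance of angles and distance ratios can be illustrated with a small numerical experiment. The sketch below assumes the common seven-parameter form X = (1 + s) R(α, β, γ) x + t with R = R1(α) R2(β) R3(γ); since Eqs. (387)–(389) are not reproduced here, this form, the sign convention of the elementary rotations, and all names are assumptions made for illustration.

```python
import math

def rot1(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def rot2(b):
    c, s = math.cos(b), math.sin(b)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def rot3(g):
    c, s = math.cos(g), math.sin(g)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def similarity(x, t, angles, s):
    """Assumed 7-parameter similarity transformation X = (1 + s) R x + t."""
    r = matmul(matmul(rot1(angles[0]), rot2(angles[1])), rot3(angles[2]))
    return [(1.0 + s) * sum(r[i][j] * x[j] for j in range(3)) + t[i] for i in range(3)]

def diff(p, q):
    return [p[i] - q[i] for i in range(3)]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def cos_angle(p1, p2, p3):
    """Cosine of the angle at p1 between p2 - p1 and p3 - p1."""
    a, b = diff(p2, p1), diff(p3, p1)
    return sum(a[i] * b[i] for i in range(3)) / (norm(a) * norm(b))

def distance_ratio(p1, p2, p3):
    return norm(diff(p2, p1)) / norm(diff(p3, p1))

if __name__ == "__main__":
    pts = [[1.0, 2.0, 3.0], [4.0, -1.0, 0.5], [-2.0, 0.0, 5.0]]
    params = ([100.0, -50.0, 25.0], (0.01, -0.02, 0.03), 1e-5)
    img = [similarity(p, *params) for p in pts]
    print(cos_angle(*pts), cos_angle(*img))
    print(distance_ratio(*pts), distance_ratio(*img))
```

Absolute distances are scaled by 1 + s, which is why only their ratios, and not the distances themselves, are preserved.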
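\[ \ldots\;\overline{m}_\nu(\alpha')\,\mathrm{d}\alpha', \tag{135} \]
where $\alpha_{\nu 0} > 0$. In view of Eqs. 131, 134, and 135, Eq. 132 takes the form
\[ s\tilde{m}_\nu \simeq m_{\nu 0} - (m_{\nu 0} - m_{\nu\infty})\,\frac{\alpha_{\nu 0}}{s}, \tag{136} \]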
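which is correct to the first order in $\alpha_{\nu 0}/s$. Using the abbreviations
\[ \kappa_e = m_{10}, \tag{137} \]
\[ \kappa'_e = (m_{10} - m_{1\infty})\,\alpha_{10}, \tag{138} \]
\[ \mu_e = m_{20}, \tag{139} \]
\[ \mu'_e = (m_{20} - m_{2\infty})\,\alpha_{20}, \tag{140} \]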
Gravitational Viscoelastodynamics
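we finally obtain
\[ s\tilde{m}_1 \simeq \kappa_e - \frac{\kappa'_e}{s}, \tag{141} \]
\[ s\tilde{m}_2 \simeq \mu_e - \frac{\mu'_e}{s}, \tag{142} \]
which are asymptotically correct for large s. From the properties of $m_\nu(t - t')$ specified above, it follows that the values of $\kappa_e$, $\kappa'_e$, $\mu_e$, and $\mu'_e$ are nonnegative and continuously differentiable for $X_i \in \mathcal{X}_- \cup \mathcal{X}_+$, with jump discontinuities admitted for $X_i \in \partial\mathcal{X}$.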
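Small-s Asymptotes
For sufficiently small s, the first integral on the right-hand side of Eq. 133 may be neglected. As a result of the convergence of $\int_0^{s_0}\overline{m}_\nu(\alpha')/\alpha'\,\mathrm{d}\alpha'$, we arrive at the asymptotic approximation
\[ \int_0^\infty \frac{s\,\overline{m}_\nu(\alpha')}{s + \alpha'}\,\mathrm{d}\alpha' \simeq s\int_0^\infty \frac{\overline{m}_\nu(\alpha')}{\alpha'}\,\mathrm{d}\alpha'. \tag{143} \]
Applying the mean-value theorem of integral calculus, we now have
\[ \int_0^\infty \frac{\overline{m}_\nu(\alpha')}{\alpha'}\,\mathrm{d}\alpha' = \frac{1}{\alpha_{\nu\infty}}\int_0^\infty \overline{m}_\nu(\alpha')\,\mathrm{d}\alpha', \tag{144} \]
where $\alpha_{\nu\infty} > 0$. From Eqs. 131, 143, and 144, Eq. 132 becomes
\[ s\tilde{m}_\nu \simeq m_{\nu\infty} + (m_{\nu 0} - m_{\nu\infty})\,\frac{s}{\alpha_{\nu\infty}}, \tag{145} \]
which is correct to the first order in $s/\alpha_{\nu 0}$. We simplify this by means of
\[ \kappa_h = m_{1\infty}, \tag{146} \]
\[ \kappa'_h = \frac{m_{10} - m_{1\infty}}{\alpha_{1\infty}}, \tag{147} \]
\[ \mu_h = m_{2\infty}, \tag{148} \]
\[ \mu'_h = \frac{m_{20} - m_{2\infty}}{\alpha_{2\infty}}. \tag{149} \]
Since $\mu_h = 0$ by Eqs. 129 and 148, we obtain
\[ s\tilde{m}_1 \simeq \kappa_h + \kappa'_h s, \tag{150} \]
\[ s\tilde{m}_2 \simeq \mu'_h s, \tag{151} \]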
D. Wolf
which are asymptotically correct for small s. From the properties of $m_\nu(t - t')$ described above, it follows that the values of $\kappa_h$, $\kappa'_h$, and $\mu'_h$ are nonnegative and continuously differentiable for $X_i \in \mathcal{X}_- \cup \mathcal{X}_+$, with jump discontinuities admitted for $X_i \in \partial\mathcal{X}$.
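The two asymptotes can be compared with an exact expression in the simplest special case. The sketch below assumes a relaxation function with a single inverse relaxation time α, m(t) = m_∞ + (m_0 − m_∞) exp(−αt), for which s m̃(s) = m_∞ + (m_0 − m_∞) s/(s + α) and α_ν0 = α_ν∞ = α. This single-mode (Maxwell-type) choice and the numerical values are illustrative assumptions, not the general spectrum treated in the text.

```python
def s_mtilde_exact(s, m0, minf, alpha):
    """s times the Laplace transform of m(t) = minf + (m0 - minf) * exp(-alpha * t)."""
    return minf + (m0 - minf) * s / (s + alpha)

def s_mtilde_large_s(s, m0, minf, alpha):
    """Large-s asymptote with the structure of Eq. (136)."""
    return m0 - (m0 - minf) * alpha / s

def s_mtilde_small_s(s, m0, minf, alpha):
    """Small-s asymptote with the structure of Eq. (145)."""
    return minf + (m0 - minf) * s / alpha

if __name__ == "__main__":
    m0, minf, alpha = 1.0e11, 0.0, 1.0e-10   # Pa, Pa, 1/s (illustrative shear values)
    for s in (1.0e-6, 1.0e-14):
        print(s, s_mtilde_exact(s, m0, minf, alpha),
              s_mtilde_large_s(s, m0, minf, alpha),
              s_mtilde_small_s(s, m0, minf, alpha))
```

With m_∞ = 0 the small-s limit reduces to a pure viscosity term proportional to s, which mirrors Eq. (151), while the large-s limit recovers the elastic modulus of Eq. (142).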
4.3
Asymptotic Incremental Field Equations and Interface Conditions
Small-t Asymptotes: Field Theory of GED
By the generalized initial-value theorem for Laplace transforms (Appendix 1: Laplace Transform), the small-t asymptote to Eq. 88 corresponds to the large-s asymptote to Eq. 125. However, for s sufficiently large, $s\tilde{m}_1$ and $s\tilde{m}_2$ may be approximated by Eqs. 141 and 142, respectively, whose substitution into Eq. 125 provides
\[ \tilde{t}^{(\delta)}_{ij} = \delta_{ij}\left(\kappa_e - \tfrac{2}{3}\mu_e\right)\tilde{u}_{k,k} + \mu_e(\tilde{u}_{i,j} + \tilde{u}_{j,i}) - \delta_{ij}\left(\kappa'_e - \tfrac{2}{3}\mu'_e\right)\frac{\tilde{u}_{k,k}}{s} - \mu'_e\,\frac{\tilde{u}_{i,j} + \tilde{u}_{j,i}}{s}. \tag{152} \]
In view of Eqs. 208, 210, and 214 (Appendix 1: Laplace Transform), inverse Laplace transformation from the s domain to the t domain gives
\[ t^{(\delta)}_{ij} = \delta_{ij}\left(\kappa_e - \tfrac{2}{3}\mu_e\right)u_{k,k} + \mu_e(u_{i,j} + u_{j,i}) - \delta_{ij}\left(\kappa'_e - \tfrac{2}{3}\mu'_e\right)\int_0^t u_{k,k}(t')\,\mathrm{d}t' - \mu'_e\int_0^t [u_{i,j}(t') + u_{j,i}(t')]\,\mathrm{d}t', \tag{153} \]
with $\kappa_e$ called elastic bulk modulus, $\mu_e$ elastic shear modulus, $\kappa'_e$ anelastic bulk modulus, and $\mu'_e$ anelastic shear modulus. Equation 153 is to be complemented by the remaining incremental field equations, Eqs. 86–87, and the associated incremental interface conditions, Eqs. 92–95. Together, they constitute the material-local form of the small-t asymptotes to the incremental field equations of GVED in terms of $g^{(\Delta)}_i$, $t^{(\delta)}_{ij}$, $u_i$, and $\phi^{(\Delta)}$. We refer to the equations also as generalized incremental field equations and interface conditions of GED. If the integrals in Eq. 153 are neglected, it simplifies to the incremental constitutive equation of elasticity. In this case, the small-t asymptotes to the incremental field equations and interface conditions of viscoelastodynamics agree with the ordinary incremental field equations and interface conditions of GED (e.g., Love 1911; Dahlen 1974; Grafarend 1982).
Large-t Asymptotes: Field Theory of GVD
By the generalized final-value theorem for Laplace transforms (Appendix 1: Laplace Transform), the large-t asymptote to Eq. 100 corresponds to the small-s asymptote to Eq. 126. However, with s sufficiently small, $s\tilde{m}_1$ and $s\tilde{m}_2$ may be replaced by Eqs. 150 and 151, respectively, whose substitution into Eq. 126 leads to
\[ \tilde{t}^{(\Delta)}_{ij} = \delta_{ij}\left(p^{(0)}_{,k}\tilde{u}_k + \kappa_h\tilde{u}_{k,k}\right) + \delta_{ij}\left(\kappa'_h - \tfrac{2}{3}\mu'_h\right)s\,\tilde{u}_{k,k} + \mu'_h\,s\,(\tilde{u}_{i,j} + \tilde{u}_{j,i}). \tag{154} \]
Considering Eqs. 208, 209, 214 (Appendix 1: Laplace Transform), and $u^{(0)}_i = 0$, inverse Laplace transformation from the s domain to the t domain results in
\[ t^{(\Delta)}_{ij} = \delta_{ij}\left(p^{(0)}_{,k}u_k + \kappa_h u_{k,k}\right) + \delta_{ij}\left(\kappa'_h - \tfrac{2}{3}\mu'_h\right)\mathrm{d}_t u_{k,k} + \mu'_h\,\mathrm{d}_t(u_{i,j} + u_{j,i}), \tag{155} \]
with $\kappa_h$ referred to as hydrostatic bulk modulus, $\kappa'_h$ as viscous bulk modulus (bulk viscosity), and $\mu'_h$ as viscous shear modulus (shear viscosity). We reduce the large-t asymptote to an expression for $\varpi^{(\Delta)}$ by recalling that, for a fluid not necessarily in hydrostatic equilibrium, $\varpi^{(\Delta)}$ is related to the other state variables by the same function that relates $p^{(\Delta)} = -t^{(\Delta)}_{ii}/3$ to these variables in the case of hydrostatic equilibrium (e.g., Malvern 1969). Putting $\mathrm{d}_t = 0$ in Eq. 155 thus yields
\[ \varpi^{(\Delta)} = -p^{(0)}_{,i}u_i - \kappa_h u_{i,i}. \tag{156} \]
To replace this by a more familiar expression, we compare Eqs. 122, 123, and 156, giving
\[ \left(\frac{\partial\xi}{\partial\rho}\right)^{(0)} = \frac{\kappa_h}{\rho^{(0)}}, \tag{157} \]
and put
\[ \left(\frac{\partial\xi}{\partial\lambda}\right)^{(0)} = \frac{\kappa_l}{\lambda^{(0)}}, \tag{158} \]
\[ \left(\frac{\partial\xi}{\partial\varphi}\right)^{(0)} = \frac{\kappa_v}{\varphi^{(0)}}, \tag{159} \]
where $\kappa_l$ and $\kappa_v$ are the compositional and entropic moduli, respectively. Upon substitution of Eqs. 157–159, Eq. 124 takes the form
\[ \varpi^{(\Delta)} = \frac{\kappa_h}{\rho^{(0)}}\rho^{(\Delta)} - \frac{\kappa_l}{\lambda^{(0)}}\lambda^{(0)}_{,i}u_i - \frac{\kappa_v}{\varphi^{(0)}}\varphi^{(0)}_{,i}u_i, \tag{160} \]
which is the incremental state equation of a fluid whose total state equation is given by Eq. 112. Considering Eqs. 122, 155, 156, and 160 and the assumption of isocompositional and isentropic perturbations, $\lambda^{(\delta)} = \varphi^{(\delta)} = 0$, the local form of the incremental field equations of GVED, Eqs. 97–100, reduces to
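\[ t^{(\Delta)}_{ij,j} + g^{(0)}_i\rho^{(\Delta)} + \rho^{(0)}g^{(\Delta)}_i + 2\rho^{(0)}\epsilon_{ijk}\Omega_k\,\mathrm{d}_t u_j = \rho^{(0)}\,\mathrm{d}^2_t u_i, \tag{161} \]
\[ g^{(\Delta)}_i = (\phi^{(\Delta)} + \psi^{(\Delta)})_{,i}, \tag{162} \]
\[ \phi^{(\Delta)}_{,ii} = 4\pi G(\rho^{(0)}u_i)_{,i}, \tag{163} \]
\[ t^{(\Delta)}_{ij} = -\delta_{ij}\varpi^{(\Delta)} + \delta_{ij}\left(\kappa'_h - \tfrac{2}{3}\mu'_h\right)\mathrm{d}_t u_{k,k} + \mu'_h\,\mathrm{d}_t(u_{i,j} + u_{j,i}), \tag{164} \]
\[ \varpi^{(\Delta)} = \frac{\kappa_h}{\rho^{(0)}}\rho^{(\Delta)} + \frac{\kappa_l}{\lambda^{(0)}}\lambda^{(\Delta)} + \frac{\kappa_v}{\varphi^{(0)}}\varphi^{(\Delta)}, \tag{165} \]
\[ \rho^{(\Delta)} = -(\rho^{(0)}u_i)_{,i}, \tag{166} \]
\[ \lambda^{(\Delta)} = -\lambda^{(0)}_{,i}u_i, \tag{167} \]
\[ \varphi^{(\Delta)} = -\varphi^{(0)}_{,i}u_i. \tag{168} \]
These equations are completed by the associated incremental interface conditions, Eqs. 101–104. Together, they constitute the local form of the large-t asymptotes to the incremental equations of GVED in terms of $g^{(\Delta)}_i$, $t^{(\Delta)}_{ij}$, $u_i$, $\lambda^{(\Delta)}$, $\varpi^{(\Delta)}$, $\rho^{(\Delta)}$, $\phi^{(\Delta)}$, and $\varphi^{(\Delta)}$; in particular, they agree with the incremental field equations and interface conditions of GVD (e.g., Backus 1967; Jarvis and McKenzie 1980).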
5
Approximate Incremental Field Theories
This section is concerned with simplified field theories. We suppose that the fluid is isocompositional and isentropic in each of the domains $\mathcal{X}_-$ and $\mathcal{X}_+$:
\[ \lambda^{(0)}_{,i} = \varphi^{(0)}_{,i} = 0, \tag{169} \]
that rotational and tidal effects are negligible:
\[ \Omega_i = 0, \tag{170} \]
\[ \varsigma = \psi = 0, \tag{171} \]
that the approximation of quasi-static perturbations applies:
\[ \mathrm{d}^2_t u_i = 0, \tag{172} \]
and that the bulk relaxation is negligible:
\[ m_1(t - t') = \kappa_h. \tag{173} \]
We note that in the absence of tidal forces, the perturbations are now solely due to the incremental interface-mass density, $\sigma$. The field theories of GHS and GVED satisfying Eqs. 169–173 are referred to as approximate incremental field theories. Upon introducing further restrictions, the analysis is divided into the case of local incompressibility (Sect. 5.1), which accounts for an initial density gradient due to self-compression, and the case of material incompressibility (Sect. 5.2), where the initial state is also taken as incompressible.
5.1
Local Incompressibility
Equations for the Initial Fields
Upon consideration of Eqs. 169–171 and elimination of $g^{(0)}_i$, the initial field equations of GHS, Eqs. 48–52, simplify to
\[ -p^{(0)}_{,i} + \rho^{(0)}\phi^{(0)}_{,i} = 0, \tag{174} \]
\[ \phi^{(0)}_{,ii} = -4\pi G\rho^{(0)}, \tag{175} \]
\[ p^{(0)} = \xi_b(\rho^{(0)}), \tag{176} \]
with $\xi_b$ the barotropic state function. The initial interface conditions, Eqs. 53–57, continue to apply. Solutions to the approximate equations for the initial fields can be shown to exist for the level surfaces of $p^{(0)}$, $\rho^{(0)}$, and $\phi^{(0)}$ being concentric spheres, coaxial cylinders, or parallel planes (e.g., Batchelor 1967). To eliminate $p^{(0)}$, consider the gradient of Eq. 176:
\[ p^{(0)}_{,i} = \left(\frac{\mathrm{d}\xi_b}{\mathrm{d}\rho}\right)^{(0)}\rho^{(0)}_{,i}, \tag{177} \]
where $(\mathrm{d}\xi_b/\mathrm{d}\rho)^{(0)}$ must be constant on the level surfaces. Comparing Eq. 174 with Eq. 177 then yields
\[ \left(\frac{\mathrm{d}\xi_b}{\mathrm{d}\rho}\right)^{(0)}\rho^{(0)}_{,i} = \rho^{(0)}\phi^{(0)}_{,i}, \tag{178} \]
which is the Williamson-Adams equation (e.g., Williamson and Adams 1923; Bullen 1975). With $(\mathrm{d}\xi_b/\mathrm{d}\rho)^{(0)}$ prescribed, Eqs. 175 and 178 are to be solved for $\rho^{(0)}$ and $\phi^{(0)}$.
Equations for the Incremental Fields: Local Form
Using Eq. 173 and
\[ m_2(t - t') = \mu(t - t'), \tag{179} \]
substitution of Eq. 106 into Eq. 100 leads to
\[ t^{(\Delta)}_{ij} = \delta_{ij}\left(p^{(0)}_{,k}u_k + \kappa_h u_{k,k}\right) - \tfrac{2}{3}\delta_{ij}\int_0^t \mu(t - t')\,\mathrm{d}_{t'}[u_{k,k}(t')]\,\mathrm{d}t' + \int_0^t \mu(t - t')\,\mathrm{d}_{t'}[u_{i,j}(t') + u_{j,i}(t')]\,\mathrm{d}t'. \tag{180} \]
Since, by setting $\mathrm{d}_t = 0$ in Eq. 180, we find
\[ \varpi^{(\Delta)} = -p^{(0)}_{,i}u_i - \kappa_h u_{i,i} \tag{181} \]
and, by comparing Eqs. 122, 123, and 181 and observing $\mathrm{d}\xi_b/\mathrm{d}\rho = \partial\xi/\partial\rho$, we obtain
\[ \left(\frac{\mathrm{d}\xi_b}{\mathrm{d}\rho}\right)^{(0)} = \frac{\kappa_h}{\rho^{(0)}}, \tag{182} \]
it follows from Eqs. 177, 181, and 182 that
\[ \varpi^{(\Delta)} = -\frac{\kappa_h}{\rho^{(0)}}\,(\rho^{(0)}u_i)_{,i}. \tag{183} \]
Note that, with $p^{(\Delta)} = -t^{(\Delta)}_{ii}/3$ by definition, Eqs. 180 and 181 yield
\[ \varpi^{(\Delta)} = p^{(\Delta)}, \tag{184} \]
which will henceforth be implied. Suppose now that Eq. 183 can be replaced by the simultaneous conditions
\[ \kappa_h \to \infty, \tag{185} \]
\[ (\rho^{(0)}u_i)_{,i} \to 0, \tag{186} \]
\[ p^{(\Delta)} = \text{finite}. \tag{187} \]
The significance of Eq. 186 becomes evident if we note that, by Eqs. 120 and 122, the condition $(\rho^{(0)}u_i)_{,i} = 0$ is equivalent to the condition $\rho^{(\delta)} = \rho^{(0)}_{,i}u_i$ or $\rho^{(\Delta)} = 0$. Equation 186 thus states that the compressibility of a displaced particle is constrained to the extent that the material incremental density "follows" the prescribed initial density gradient so that the local incremental density vanishes. For this reason, we refer to Eq. 186 as local incremental incompressibility condition.
Taking into account Eqs. 170–172, 178, 180, 182, and 186 and eliminating $g^{(\Delta)}_i$, the local form of the incremental field equations and interface conditions of GVED, Eqs. 97–104, reduces to
\[ t^{(\Delta)}_{ij,j} + \rho^{(0)}\phi^{(\Delta)}_{,i} = 0, \tag{188} \]
\[ \phi^{(\Delta)}_{,ii} = 0, \tag{189} \]
\[ t^{(\Delta)}_{ij} = -\delta_{ij}p^{(\Delta)} + \delta_{ij}\frac{2\rho^{(0)}}{3\kappa_h}\phi^{(0)}_{,k}\int_0^t \mu(t - t')\,\mathrm{d}_{t'}[u_k(t')]\,\mathrm{d}t' + \int_0^t \mu(t - t')\,\mathrm{d}_{t'}[u_{i,j}(t') + u_{j,i}(t')]\,\mathrm{d}t', \tag{190} \]
\[ u_{i,i} = -\frac{\rho^{(0)}}{\kappa_h}\phi^{(0)}_{,i}u_i, \tag{191} \]
\[ [u_i]^+_- = 0, \tag{192} \]
\[ [\phi^{(\Delta)}]^+_- = 0, \tag{193} \]
\[ \left[n^{(0)}_i\left(\phi^{(\Delta)}_{,i} - 4\pi G\rho^{(0)}u_i\right)\right]^+_- = -4\pi G\sigma, \tag{194} \]
\[ \left[n^{(0)}_j\left(t^{(\Delta)}_{ij} - \delta_{ij}\rho^{(0)}\phi^{(0)}_{,k}u_k\right)\right]^+_- = -\gamma^{(0)}n^{(0)}_i\sigma. \tag{195} \]
The approximate incremental equations are to be solved for $p^{(\Delta)}$, $t^{(\Delta)}_{ij}$, $u_i$, and $\phi^{(\Delta)}$, where $\rho^{(0)}$ and $\phi^{(0)}$ must satisfy the appropriate initial field equations and interface conditions. We observe that the hydrostatic bulk modulus, $\kappa_h$, remains finite in Eqs. 190 and 191. This is because it enters into these equations as a consequence of substituting Eq. 182 into Eq. 178 for the initial fields, for which the approximation given by Eq. 185 does not apply. If $\rho^{(0)}$ is prescribed and $\phi^{(\Delta)}$ is neglected, the mechanical and gravitational effects decouple. In this case, solutions for the initial state are readily found. The decoupled incremental equations were integrated for a Newton-viscous spherical Earth model (Li and Yuen 1987; Wu and Yuen 1991) and for a Maxwell-viscoelastic planar Earth model (Wolf and Kaufmann 2000). The solution to the coupled incremental equations for a Maxwell-viscoelastic spherical Earth model has been derived by Martinec et al. (2001).
5.2
Material Incompressibility
We proceed using the supposition that the material is incompressible. As a result, the initial state is incompressible, whence $\rho^{(0)}$ = constant replaces Eq. 176 and $\kappa_h \to \infty$ applies also in Eqs. 190 and 191, the latter reducing to the conventional material incremental incompressibility condition.
Equations for the Initial Fields
With these additional restrictions, the approximate initial field equations of GHS for local incompressibility, Eqs. 174–176, further simplify to those applying to material incompressibility:
\[ -p^{(0)}_{,i} + \rho^{(0)}\phi^{(0)}_{,i} = 0, \tag{196} \]
\[ \phi^{(0)}_{,ii} = -4\pi G\rho^{(0)}, \tag{197} \]
\[ \rho^{(0)} = \text{constant}. \tag{198} \]
However, the initial interface conditions, Eqs. 53–57, continue to apply.
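Equations for the Incremental Fields: Local Form
Owing to the additional assumptions, the approximate incremental field equations and interface conditions of GVED for local incompressibility, Eqs. 188–195, further reduce to the conventional form valid for material incompressibility:
\[ t^{(\Delta)}_{ij,j} + \rho^{(0)}\phi^{(\Delta)}_{,i} = 0, \tag{199} \]
\[ \phi^{(\Delta)}_{,ii} = 0, \tag{200} \]
\[ t^{(\Delta)}_{ij} = -\delta_{ij}p^{(\Delta)} + \int_0^t \mu(t - t')\,\mathrm{d}_{t'}[u_{i,j}(t') + u_{j,i}(t')]\,\mathrm{d}t', \tag{201} \]
\[ u_{i,i} = 0, \tag{202} \]
\[ [u_i]^+_- = 0, \tag{203} \]
\[ [\phi^{(\Delta)}]^+_- = 0, \tag{204} \]
\[ \left[n^{(0)}_i\left(\phi^{(\Delta)}_{,i} - 4\pi G\rho^{(0)}u_i\right)\right]^+_- = -4\pi G\sigma, \tag{205} \]
\[ \left[n^{(0)}_j\left(t^{(\Delta)}_{ij} - \delta_{ij}\rho^{(0)}\phi^{(0)}_{,k}u_k\right)\right]^+_- = -\gamma^{(0)}n^{(0)}_i\sigma. \tag{206} \]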
Deductions of analytical solutions to these equations for one- or two-layer Earth models are given elsewhere (e.g., Wolf 1984, 1994; Amelung and Wolf 1994; Rümpker and Wolf 1996; Wu and Ni 1996). In addition, a number of analytical
or semi-analytical solutions for multilayer Earth models have been obtained (e.g., Sabadini et al. 1982; Wu and Peltier 1982; Wolf 1985d; Wu 1990; Spada et al. 1992; Vermeersen and Sabadini 1997; Martinec and Wolf 1998; Wieczerkowski 1999). An instructive solution to a simplified form of the above equations has been derived by Wolf (1991b). All these solutions refer to lateral homogeneity. Recently, the derivation of solutions for laterally heterogeneous Earth models has also received attention. Whereas Kaufmann and Wolf (1999) and Tromp and Mitrovica (1999a,b, 2000) limited the theory to small perturbations of the parameters in the lateral direction, D’Agostino et al. (1997), Martinec (1998, 2000), and Martinec and Wolf (1999) developed solution techniques valid for arbitrarily large perturbations.
6 Summary
The results of this review can be summarized as follows:
1. We have defined the Lagrangian and Eulerian representations of arbitrary fields and provided expressions for the relationship between the fields and their gradients in these kinematic representations. In correspondence with the Lagrangian and Eulerian representations, we have also defined the material and local increments of the fields. Using the relation between the kinematic representations, this has allowed us to establish the material and local forms of the fundamental perturbation equation.
2. Postulating only the differential form of the fundamental principles of continuum mechanics and potential theory in the Lagrangian representation, we have then presented a concise derivation of the material, material-local, and local forms of the incremental field equations and interface conditions of GVED. These equations describe infinitesimal, gravitational-viscoelastic perturbations of compositionally and entropically stratified, compressible, rotating fluids initially in hydrostatic equilibrium.
3. Following this, we have obtained, as the short-time asymptotes to the incremental field equations and interface conditions of GVED, a system of equations referred to as generalized incremental field equations and interface conditions of GED. The long-time asymptotes agree with the incremental field equations and interface conditions of GVD. In particular, we have shown that the incremental thermodynamic pressure entering into the long-time asymptote to the incremental constitutive equation of viscoelasticity satisfies the incremental state equation appropriate to viscous fluids.
4. Finally, we have adopted several simplifying assumptions and developed approximate field theories applying to gravitational-viscoelastic perturbations of isocompositional, isentropic, and compressible or incompressible fluid domains.
Appendix 1: Laplace Transform
The Laplace transform, $\mathcal{L}[f(t)]$, of a function, $f(t)$, is defined by
$$\mathcal{L}[f(t)] = \int_0^\infty f(t)\, e^{-st}\, dt, \quad s \in \mathcal{S}, \tag{207}$$
where $s$ is the inverse Laplace time and $\mathcal{S}$ is the complex $s$ domain (e.g., LePage 1980). We assume here that $f(t)$ is continuous for all $t \in \mathcal{T}$ and of exponential order as $t \to \infty$, which are sufficient conditions for the convergence of the Laplace integral in Eq. 207 for $\operatorname{Re} s$ larger than some value, $s_R$. Defining $\mathcal{L}[f(t)] = \tilde{f}(s)$ and assuming the same properties for $g(t)$, elementary consequences are then
$$\mathcal{L}[a f(t) + b g(t)] = a \tilde{f}(s) + b \tilde{g}(s), \quad a, b = \text{constant}, \tag{208}$$
$$\mathcal{L}[d_t f(t)] = s \tilde{f}(s) - f(0), \tag{209}$$
$$\mathcal{L}\!\left[\int_0^t f(t')\, dt'\right] = \frac{\tilde{f}(s)}{s}, \tag{210}$$
$$\mathcal{L}\!\left[\int_0^t f(t - t')\, g(t')\, dt'\right] = \tilde{f}(s)\, \tilde{g}(s), \tag{211}$$
$$\mathcal{L}[1] = \frac{1}{s}, \tag{212}$$
$$\mathcal{L}[e^{-s_0 t}] = \frac{1}{s + s_0}, \quad s_0 = \text{constant}. \tag{213}$$
If $\mathcal{L}[f(t)]$ is the forward Laplace transform of $f(t)$, then $f(t)$ is called the inverse Laplace transform of $\mathcal{L}[f(t)]$. This is written as $\mathcal{L}^{-1}\{\mathcal{L}[f(t)]\} = f(t)$. Since $\mathcal{L}[f(t)] = \tilde{f}(s)$, it follows that
$$\mathcal{L}^{-1}[\tilde{f}(s)] = f(t), \quad t \in \mathcal{T}, \tag{214}$$
which admits the immediate inversion of the forward transforms listed above.
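The transform pairs of Eqs. 210–213 can be checked numerically. The following is a minimal sketch, not part of the original derivation: it approximates the Laplace integral of Eq. 207 for a real, positive $s$ by a truncated trapezoidal rule. The function name `laplace_numeric`, the truncation time `t_max`, the step count `n`, and the sample values of `s` and `s0` are illustrative assumptions.

```python
import math

def laplace_numeric(f, s, t_max=60.0, n=200_000):
    """Approximate the Laplace integral of Eq. 207 for real s > 0.

    Composite trapezoidal rule on [0, t_max]; the tail beyond t_max is
    neglected, which is acceptable when exp(-s * t_max) is tiny.
    """
    h = t_max / n
    total = 0.5 * (f(0.0) + f(t_max) * math.exp(-s * t_max))
    for k in range(1, n):
        t = k * h
        total += f(t) * math.exp(-s * t)
    return total * h

s, s0 = 1.5, 0.7  # illustrative sample values

# Eq. 212: L[1] = 1/s
one = laplace_numeric(lambda t: 1.0, s)

# Eq. 213: L[exp(-s0 t)] = 1/(s + s0)
decay = laplace_numeric(lambda t: math.exp(-s0 * t), s)

# Eqs. 210 and 211: the convolution of exp(-s0 t) with 1 equals the time
# integral of exp(-s0 t'), i.e. (1 - exp(-s0 t))/s0, and its transform
# should equal the product 1/(s (s + s0)).
conv = laplace_numeric(lambda t: (1.0 - math.exp(-s0 * t)) / s0, s)

print(one, 1.0 / s)
print(decay, 1.0 / (s + s0))
print(conv, 1.0 / (s * (s + s0)))
```

Each printed pair should agree to within the quadrature error, which for these settings is far below the size of the values themselves.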
Generalized Initial- and Final-Value Theorems
Some useful consequences of Eqs. 207 and 214 are the generalized initial- and final-value theorems. Assuming that the appropriate limits exist, the first theorem states that an asymptotic approximation, $p(t)$, to $f(t)$ for small $t$ corresponds to an asymptotic approximation, $\tilde{p}(s)$, to $\tilde{f}(s)$ for large $s$. Similarly, according to the second theorem, an asymptotic approximation, $q(t)$, to $f(t)$ for large $t$ corresponds to an asymptotic approximation, $\tilde{q}(s)$, to $\tilde{f}(s)$ for small $s$.
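As an illustration of both theorems, consider the following worked example. The test function $f(t) = 1 - e^{-s_0 t}$ with $s_0 > 0$ is an assumption chosen for simplicity and does not appear in the text above; only Eqs. 208, 210, 212, and 213 are used.

```latex
% Forward transform from Eqs. 208, 212 and 213:
\tilde{f}(s) = \frac{1}{s} - \frac{1}{s + s_0} = \frac{s_0}{s\,(s + s_0)} .

% Small t: f(t) \approx p(t) = s_0 t. Applying Eq. 210 to Eq. 212 gives
% L[t] = 1/s^2, hence
\tilde{p}(s) = \frac{s_0}{s^2},
\qquad
\tilde{f}(s) = \frac{s_0}{s^2}\,\frac{1}{1 + s_0/s} \approx \tilde{p}(s)
\quad \text{for large } s .

% Large t: f(t) \approx q(t) = 1. By Eq. 212,
\tilde{q}(s) = \frac{1}{s},
\qquad
\tilde{f}(s) = \frac{1}{s}\,\frac{1}{1 + s/s_0} \approx \tilde{q}(s)
\quad \text{for small } s .
```

In this example, the small-$t$ behavior of $f(t)$ is recovered from the large-$s$ behavior of $\tilde{f}(s)$ and vice versa, as the two theorems state.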
Appendix 2: List of Important Symbols

| Symbol | Name | First reference |
|---|---|---|
| $d^2 r_i$ | Differential area at $r_i$ | Sect. 3.1 |
| $d_t^n$ | $n$th-order material-derivative operator with respect to $t$ | Sect. 3.1 |
| $\tilde{f}$ | Laplace transform of $f$ | Sect. 4 |
| $F_{ij\ldots}$ | Eulerian representation of Cartesian tensor field | Sect. 2.1 |
| $f_{ij\ldots}$ | Lagrangian representation of Cartesian tensor field | Sect. 2.1 |
| $F_{ij\ldots,k}$ | Gradient of $F_{ij\ldots}$ with respect to $r_k$ | Sect. 2.1 |
| $f_{ij\ldots,k}$ | Gradient of $f_{ij\ldots}$ with respect to $X_k$ | Sect. 2.1 |
| $f^{(\Delta)}_{ij\ldots}$ | Local increment of $f_{ij\ldots}$ | Sect. 2.2 |
| $f^{(\delta)}_{ij\ldots}$ | Material increment of $f_{ij\ldots}$ | Sect. 2.2 |
| $f^{(0)}_{ij\ldots}$ | Initial value of $f_{ij\ldots}$ | Sect. 2.2 |
| $[f_{ij\ldots}]^+_-$ | Increase of $f_{ij\ldots}$ across $\partial\mathcal{R}$ in direction of $n_i$ | Sect. 2.3 |
| $G$ | Newton's gravitational constant | Sect. 3.1 |
| $g_i$ | Gravity force per unit mass | Sect. 3.1 |
| $i, j, \ldots$ | Index subscripts of Cartesian tensor | Sect. 2 |
| $j$ | Jacobian determinant | Sect. 3.1 |
| $l$ | Compositional modulus | Section "Large-t Asymptotes: Field Theory of GVD" |
| $m$ | Relaxation function | Section "Constitutive Equation" |
| $m$ | Relaxation spectrum | Sect. 4.1 |
| $m_1$ | Bulk-relaxation function | Section "Constitutive Equation" |
| $m_2$ | Shear-relaxation function | Section "Constitutive Equation" |
| $m_0$ | Small-$t$ limit of relaxation function | Sect. 4.1 |
| $m_\infty$ | Large-$t$ limit of relaxation function | Sect. 4.1 |
| $m_{ijk\ell}$ | Anisotropic relaxation function | Section "Constitutive Equation" |
| $n_i$ | Outward unit normal with respect to $\partial\mathcal{R}$ | Sect. 2.3 |
| $p$ | Mechanical pressure | Sect. 3.2 |
| $r_i$ | Position of place, current position of particle | Sect. 2.1 |
| $s$ | Inverse Laplace time | Sect. 4 |
| $t$ | Current time | Sect. 2.1 |
| $t'$ | Excitation time | Sect. 3.1 |
| $t_{ij}$ | Cauchy stress | Sect. 3.1 |
| $u_i$ | Displacement | Sect. 2.1 |
| $v$ | Entropic modulus | Section "Large-t Asymptotes: Field Theory of GVD" |
| $X_i$ | Initial position of material point | Sect. 2.1 |
| $\alpha'$ | Inverse spectral time | Sect. 4.1 |
| $\gamma$ | Magnitude of $g_i^{(0)}$ on $\partial\mathcal{R}^{(0)}$ | Section "Material Form" |
| $\delta_{ij}$ | Kronecker symbol | Sect. 2 |
| $\varepsilon_{ijk}$ | Levi-Civita symbol | Sect. 2 |
| $\kappa_e$ | Elastic bulk modulus | Section "Large-s Asymptotes" |
| $\kappa'_e$ | Anelastic bulk modulus | Section "Large-s Asymptotes" |
| $\kappa_h$ | Hydrostatic bulk modulus | Section "Small-s Asymptotes" |
| $\kappa'_h$ | Viscous bulk modulus | Section "Small-s Asymptotes" |
| $\lambda$ | Composition | Sect. 3.2 |
| $\mu$ | Shear-relaxation function | Section "Equations for the Incremental Fields: Local Form" |
| $\mu_e$ | Elastic shear modulus | Section "Large-s Asymptotes" |
| $\mu'_e$ | Anelastic shear modulus | Section "Large-s Asymptotes" |
| $\mu'_h$ | Viscous shear modulus | Section "Small-s Asymptotes" |
| $\xi$ | State function | Sect. 3.2 |
| $\xi_b$ | Barotropic state function | Sect. 5.1 |
| $\varpi$ | Thermodynamic pressure | Sect. 3.4 |
| $\rho$ | Volume-mass density | Sect. 3.1 |
| $\sigma$ | (Incremental) interface-mass density | Sect. 3.1 |
| $\tau_{ij}$ | Piola-Kirchhoff stress | Sect. 3.1 |
| $\phi$ | Gravitational potential | Sect. 3.1 |
| $\varphi$ | Entropy density | Sect. 3.2 |
| | Centrifugal potential | Sect. 3.1 |
| | Tidal potential | Sect. 3.1 |
| $\Omega_i$ | Angular velocity | Sect. 3.1 |
| $\mathcal{E}$ | Euclidian space domain | Sect. 2.1 |
| $\mathcal{L}$ | Laplace transformation functional | Sect. 6 |
| $\mathcal{L}^{-1}$ | Inverse Laplace transformation functional | Sect. 6 |
| $\mathcal{M}_{ij}$ | Anisotropic relaxation functional | Sect. 3.1 |
| $\mathcal{R}_-$ | Internal $r_i$ domain | Sect. 2.1 |
| $\mathcal{R}_+$ | External $r_i$ domain | Sect. 2.1 |
| $\mathcal{S}$ | $s$ domain | Sect. 4 |
| $\mathcal{T}$ | $t$ domain | Sect. 2.1 |
| $\mathcal{X}_-$ | Internal $X_i$ domain | Sect. 2.1 |
| $\mathcal{X}_+$ | External $X_i$ domain | Sect. 2.1 |
| $\partial\mathcal{R}$ | Interface between $\mathcal{R}_-$ and $\mathcal{R}_+$ | Sect. 2.1 |
| $\partial\mathcal{X}$ | Interface between $\mathcal{X}_-$ and $\mathcal{X}_+$ | Sect. 2.1 |
References Amelung F, Wolf D (1994) Viscoelastic perturbations of the earth: significance of the incremental gravitational force in models of glacial isostasy. Geophys J Int 117:864–879 Backus GE (1967) Converting vector and tensor equations to scalar equations in spherical coordinates. Geophys J R Astron Soc 13:71–101 Batchelor GK (1967) An introduction to fluid dynamics. Cambridge University Press, Cambridge Biot MA (1959) The influence of gravity on the folding of a layered viscoelastic medium under compression. J Franklin Inst 267:211–228 Biot MA (1965) Mechanics of incremental deformations. Wiley, New York Bullen KE (1975) The Earth’s density. Chapman and Hall, London Cathles LM (1975) The viscosity of the Earth’s mantle. Princeton University Press, Princeton Chandrasekhar S (1961) Hydrodynamic and hydromagnetic stability. Clarendon Press, Oxford Christensen RM (1982) Theory of viscoelasticity, 2nd edn. Academic, New York Corrieu V, Thoraval C, Ricard Y (1995) Mantle dynamics and geoid green functions. Geophys J Int 120:516–532 D’Agostino G, Spada G, Sabadini R (1997) Postglacial rebound and lateral viscosity variations: a semi-analytical approach based on a spherical model with Maxwell rheology. Geophys J Int 129:F9–F13 Dahlen FA (1972) Elastic dislocation theory for a self-gravitating elastic configuration with an initial static stress field. Geophys J R Astron Soc 28:357–383 Dahlen FA (1973) Elastic dislocation theory for a self-gravitating elastic configuration with an initial static stress field II: energy release. Geophys J R Astron Soc 31:469–484 Dahlen FA (1974) On the static deformation of an earth model with a fluid core. Geophys J R Astron Soc 36:461–485 Dahlen FA, Tromp J (1998) Theoretical global seismology. Princeton University Press, Princeton Darwin GH (1879) On the bodily tides of viscous and semi-elastic spheroids, and on the ocean tides upon a yielding nucleus. 
Philos Trans R Soc Lond Part 1 170:1–35 Dehant V, Wahr JM (1991) The response of a compressible, non-homogeneous earth to internal loading: theory. J Geomagn Geoelectr 43:157–178 Eringen AC (1989) Mechanics of continua, 2nd edn. R. E. Krieger, Malabar Golden JM, Graham GAC (1988) Boundary value problems in linear viscoelasticity. Springer, Berlin Grafarend EW (1982) Six lectures on geodesy and global geodynamics. Mitt Geodät Inst Tech Univ Graz 41:531–685 Hanyk L, Yuen DA, Matyska C (1996) Initial-value and modal approaches for transient viscoelastic responses with complex viscosity profiles. Geophys J Int 127:348–362 Hanyk L, Matyska C, Yuen DA (1999) Secular gravitational instability of a compressible viscoelastic sphere. Geophys Res Lett 26:557–560 Haskell NA (1935) The motion of a viscous fluid under a surface load. Physics 6:265–269 Haskell NA (1936) The motion of a viscous fluid under a surface load, 2. Physics 7:56–61 Jarvis GT, McKenzie DP (1980) Convection in a compressible fluid with infinite Prandtl number. J Fluid Mech 96:515–583 Johnston P, Lambeck K, Wolf D (1997) Material versus isobaric internal boundaries in the earth and their influence on postglacial rebound. Geophys J Int 129:252–268 Kaufmann G, Wolf D (1999) Effects of lateral viscosity variations on postglacial rebound: an analytical approach. Geophys J Int 137:489–500 Krauss W (1973) Methods and results of theoretical oceanography, vol. 1: dynamics of the homogeneous and the quasihomogeneous ocean. Bornträger, Berlin LePage WR (1980) Complex variables and the Laplace transform for engineers. Dover, New York
Li G, Yuen DA (1987) Viscous relaxation of a compressible spherical shell. Geophys Res Lett 14:1227–1230 Love AEH (1911) Some problems of geodynamics. Cambridge University Press, Cambridge Malvern LE (1969) Introduction to the mechanics of a continuous medium. Prentice-Hall, Englewood Cliffs Martinec Z (1999) Spectral, initial value approach for viscoelastic relaxation of a spherical earth with three-dimensional viscosity-I. Theory. Geophys J Int 137:469–488 Martinec Z (2000) Spectral-finite element approach to three-dimensional viscoelastic relaxation in a spherical earth. Geophys J Int 142:117–141 Martinec Z, Wolf D (1998) Explicit form of the propagator matrix for a multi-layered, incompressible viscoelastic sphere. Scientific technical report GFZ Potsdam, STR98/08, p 13 Martinec Z, Wolf D (1999) Gravitational-viscoelastic relaxation of eccentrically nested spheres. Geophys J Int 138:45–66 Martinec Z, Thoma M, Wolf D (2001) Material versus local incompressibility and its influence on glacial-isostatic adjustment. Geophys J Int 144:136–156 Mitrovica JX, Davis JL, Shapiro II (1994) A spectral formalism for computing three-dimensional deformation due to surface loads 1. Theory. J Geophys Res 99:7057–7073 O’Connell RJ (1971) Pleistocene glaciation and the viscosity of the lower mantle. Geophys J R Astron Soc 23:299–327 Panasyuk SV, Hager BH, Forte AM (1996) Understanding the effects of mantle compressibility on geoid kernels. Geophys J Int 124:121–133 Parsons BE (1972) Changes in the Earth’s shape. Ph.D. thesis, Cambridge University, Cambridge Peltier WR (1974) The impulse response of a Maxwell earth. Rev Geophys Space Phys 12: 649–669 Peltier WR (ed) (1989) Mantle convection, plate tectonics and geodynamics. Gordon and Breach, New York Ramsey AS (1981) Newtonian attraction. Cambridge University Press, Cambridge Rayleigh L (1906) On the dilatational stability of the earth. 
Proc R Soc Lond Ser A 77:486–499 Rümpker G, Wolf D (1996) Viscoelastic relaxation of a Burgers half space: implications for the interpretation of the Fennoscandian uplift. Geophys J Int 124:541–555 Sabadini R, Yuen DA, Boschi E (1982) Polar wandering and the forced responses of a rotating, multilayered, viscoelastic planet. J Geophys Res 87:2885–2903 Spada G, Sabadini R, Yuen DA, Ricard Y (1992) Effects on post-glacial rebound from the hard rheology in the transition zone. Geophys J Int 109:683–700 Tromp J, Mitrovica JX (1999a) Surface loading of a viscoelastic earth-I. General theory. Geophys J Int 137:847–855 Tromp J, Mitrovica JX (1999b) Surface loading of a viscoelastic earth-II. Spherical models. Geophys J Int 137:856–872 Tromp J, Mitrovica JX (2000) Surface loading of a viscoelastic planet-III. Aspherical models. Geophys J Int 140:425–441 Vermeersen LLA, Mitrovica JX (2000) Gravitational stability of spherical self-gravitating relaxation models. Geophys J Int 142:351–360 Vermeersen LLA, Sabadini R (1997) A new class of stratified viscoelastic models by analytical techniques. Geophys J Int 129:531–570 Vermeersen LLA, Vlaar NJ (1991) The gravito-elastodynamics of a pre-stressed elastic earth. Geophys J Int 104:555–563 Vermeersen LLA, Sabadini R, Spada G (1996) Compressible rotational deformation. Geophys J Int 126:735–761 Wieczerkowski K (1999) Gravito-Viskoelastodynamik für verallgemeinerte Rheologien mit Anwendungen auf den Jupitermond Io und die Erde. Publ Deutsch Geod Komm Ser C 515:130 Williamson ED, Adams LH (1923) Density distribution in the earth. J Wash Acad Sci 13:413–428 Wolf D (1984) The relaxation of spherical and flat Maxwell earth models and effects due to the presence of the lithosphere. J Geophys 56:24–33 Wolf D (1985a) Thick-plate flexure re-examined. Geophys J R Astron Soc 80:265–273
Wolf D (1985b) On Boussinesq’s problem for Maxwell continua subject to an external gravity field. Geophys J R Astron Soc 80:275–279 Wolf D (1985c) The normal modes of a uniform, compressible Maxwell half-space. J Geophys 56:100–105 Wolf D (1985d) The normal modes of a layered, incompressible Maxwell half-space. J Geophys 57:106–117 Wolf D (1991a) Viscoelastodynamics of a stratified, compressible planet: incremental field equations and short- and long-time asymptotes. Geophys J Int 104:401–417 Wolf D (1991b) Boussinesq’s problem of viscoelasticity. Terra Nova 3:401–407 Wolf D (1994) Lamé’s problem of gravitational viscoelasticity: the isochemical, incompressible planet. Geophys J Int 116:321–348 Wolf D (1997) Gravitational viscoelastodynamics for a hydrostatic planet. Publ Deutsch Geod Komm Ser C 452:96 Wolf D, Kaufmann G (2000) Effects due to compressional and compositional density stratification on load-induced Maxwell-viscoelastic perturbations. Geophys J Int 140:51–62 Wu P (1990) Deformation of internal boundaries in a viscoelastic earth and topographic coupling between the mantle and the core. Geophys J Int 101:213–231 Wu P (1992) Viscoelastic versus viscous deformation and the advection of prestress. Geophys J Int 108:136–142 Wu P, Ni Z (1996) Some analytical solutions for the viscoelastic gravitational relaxation of a twolayer non-self-gravitating incompressible spherical earth. Geophys J Int 126:413–436 Wu P, Peltier WR (1982) Viscous gravitational relaxation. Geophys J R Astron Soc 70:435–485 Wu J, Yuen DA (1991) Post-glacial relaxation of a viscously stratified compressible mantle. Geophys J Int 104:331–349
Elastic and Viscoelastic Response of the Lithosphere to Surface Loading
Volker Klemann, Maik Thomas, and Harald Schuh
Contents
1 Loading Response and Their Observation
2 General Response of the Earth to Surface Loads
2.1 Rheological Stratification of the Solid Earth
2.2 Elastic Behavior
2.3 Anelasticity
2.4 Viscous Behavior
2.5 Viscoelasticity
3 Fundamental Results
3.1 Field Equations for a Spherical Planet
3.2 The Elastic Problem
3.3 The Viscoelastic Problem
4 Future Directions
5 Conclusion
References
V. Klemann
Geodesy and Remote Sensing, Helmholtz Centre Potsdam, GFZ German Research Centre for Geosciences, Potsdam, Germany
M. Thomas
Geodesy and Remote Sensing, Helmholtz Centre Potsdam, GFZ German Research Centre for Geosciences, Potsdam, Germany
Institute of Meteorology, Freie Universität Berlin, Berlin, Germany
H. Schuh
Geodesy and Remote Sensing, Helmholtz Centre Potsdam, GFZ German Research Centre for Geosciences, Potsdam, Germany
TU Berlin, Berlin, Germany
© Springer-Verlag Berlin Heidelberg 2015
W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_90
Abstract
This chapter presents the main rheological features of the lithosphere and of the upper mantle with respect to deformations of the solid Earth in response to time-varying surface loading. Rheological aspects as well as numerical modeling are discussed in view of the general strategy applied in geophysics, which considers perturbations of a self-gravitating, non-rotating, linearly viscoelastic, isostatically pre-stressed Earth (SNRVEI) in a spherical geometry. Based on this model, limits of present modeling and future directions are discussed.
Dedicated to Prof. Dr. Detlef Wolf, who passed away unexpectedly in 2013.
1 Loading Response and Their Observation
Understanding the deformational response of the solid Earth to surface loads is important in several respects: for the interpretation of geodetic observations (see Kusche 2010, Vol. 1), for the influence on surface processes like ocean dynamics (see Kuhlmann et al. (2014), this volume) and ice dynamics (e.g., van den Berg et al. 2006, 2008; Gomez et al. 2012), and for the inference of the Earth's material parameters (e.g., Cathles 1975). The loading processes can be categorized firstly by mass distribution, i.e., static loads distributed at the Earth's surface or representing the uncompensated topography and crustal structure mainly due to sedimentary and tectonic processes, and secondly by mass transport, i.e., the redistribution in time of mass, mainly water and air, at the Earth's surface. The former category will not be addressed in this text, as the rheological features of the solid Earth are not relevant and the response reduces to the isostatic equilibrium state of a flexed plate floating on the Earth's mantle, which is considered as an inviscid fluid (for details, see Watts 2001). The mathematical problem therein is the estimation of the flexural rigidity of the lithosphere under the dual statistical–mechanistic viewpoint of known topography and gravity (Simons and Olhede 2013), which goes beyond local isostatic compensation models like those of Airy and Pratt. Of more interest are temporal changes in mass distribution, which appear as time-variable signals in geodetic observations. In particular, the increasing number of satellite-based Earth observation systems of altimetry and InSAR, the scientific use of GNSS systems like GPS, GLONASS, or the upcoming GALILEO system, but also the successful gravity missions CHAMP, GRACE, and GOCE during the last decades have fostered the interpretation and understanding of the different loading processes.
GRACE is one of the prominent examples where the analysis of mass transport processes resulted in a closer cooperation of disciplines like oceanography, hydrology, geodesy, but also geodynamics in order to reach a better understanding of the Earth's response to loading (e.g., Kusche et al. 2012). Although the satellite-based observation systems are superior to terrestrial surveying, e.g., due to global coverage, consistency of data, and better reproducibility and repeatability (Blewitt 2009), surface observation techniques like GNSS, VLBI (Schuh and Behrend 2012), surface gravimetry (Lambert et al. 2006), and tide gauges (Holgate et al. 2013) remain irreplaceable due to the length of their observation time series and sometimes also their higher accuracy and higher spatial resolution. The response of the Earth itself is a matter of the time scale and spatial scale on which the process acts. Based on these aspects, we will structure the following overview with respect to the scales of the respective processes. Surface mass redistributions which are relevant for geodetic observables are tidal loading, mass redistribution in the global water cycle, co- and postseismic deformations, and glacial isostatic adjustment (GIA), to name the most prominent examples. In order to allow a consistent coupling between different disciplines, the main geometry considered will be that of a sphere (see Freeden and Schreiner 2010, Vol. 2).
2 General Response of the Earth to Surface Loads
The response of the solid Earth to surface loads can be categorized according to the extension, duration, and strength of the loading force. The spatial scale, often parameterized by the wavelength of the excitation, determines the penetration depth of the deformation. Local loads with an extension of several kilometers only deform the Earth's crust or lithosphere, whereas loads exceeding hundreds of km induce a deformation field that reaches into the Earth's mantle or, in the case of global loads of some thousand km, encompasses the whole Earth body. With respect to the duration of the loading, the Earth-material response follows different intrinsic deformation mechanisms. Instantaneous or short-period processes result in an elastic to anelastic deformation, where already for periods of some minutes, anelastic behavior has to be considered. With increasing duration of the process, the ductile behavior has to be taken into account. Viscoelasticity as the adequate continuum mechanical model holds in principle for time scales from decades to about one hundred thousand years and describes the transient deformation and viscous flow of Earth material toward the isostatic state. In this state, which is considered for processes on geological time scales, the loading force is in equilibrium with the flexure of the lithosphere considered as an elastic plate, the buoyancy of the underlying mantle, and, on global scales, the convective forces of the Earth's mantle flow. If the acting forces exceed the strength of the material, the Earth material deforms in a brittle or plastic manner. For periodic loading processes, this limit is usually not reached. That brittle failure in the upper crust is nevertheless observed in the form of postglacial faults results from the superimposed tectonic stress state.
It is assumed that the Earth's upper crust is in a state of frictional equilibrium, with active faults limiting its strength, consistent with Mohr–Coulomb theory and with Byerlee's law (Byerlee 1978) with friction coefficients f of 0.6–1.0 as derived from laboratory experiments (e.g., Bürgmann and Dresen 2008). These forces keep the stress state of the brittle crust near its fault stability margin. In consequence, small induced stresses may destabilize the material and lead to rupture (Johnston 1989; Wu and Hasegawa 1996).
2.1 Rheological Stratification of the Solid Earth
The spherical composition of the Earth is crucial for understanding its deformational behavior. From the surface to the center, the Earth consists of a solid lithosphere, a solid mantle, a fluid outer core, and a solid inner core. The lithosphere is split into the Earth’s crust and mantle lithosphere, where the upper crust is understood as elastic to brittle and the lower crust together with the mantle lithosphere as elastic, ductile, or plastic (Bürgmann and Dresen 2008). The Earth mantle below the lithosphere is considered to behave elastically for loading processes not exceeding a few years, but viscously for longtime processes like mantle convection or GIA. The fluid outer core then encompasses a highly ductile solid inner core.
2.2 Elastic Behavior
The simplest description of a solid, which is usually applied to describe the elastic deformations of the Earth, is that of a linear, elastic, and isotropic continuum according to Hooke's law. That means that, in the balance of momentum, the incremental strain, $\boldsymbol{\varepsilon} = \tfrac{1}{2}(\operatorname{grad} \mathbf{u} + \operatorname{grad}^T \mathbf{u})$, and the Cauchy stress, $\boldsymbol{\sigma}$, which describes the surface force, $\mathbf{f}$, acting on the surface with normal $\mathbf{n}$ of a specific volume element, $\mathbf{f} = \boldsymbol{\sigma}\,\mathbf{n}$, are expressed by symmetric tensors of second order and are related by
$$\boldsymbol{\sigma} = \mathbf{M}\, \boldsymbol{\varepsilon}. \tag{1}$$
Assuming isotropy, the fourth-order elasticity tensor, $\mathbf{M}$, is completely described by two linearly independent constants, the elastic moduli:
$$\boldsymbol{\sigma} = \lambda\, (\operatorname{tr} \boldsymbol{\varepsilon})\, \mathbf{I} + 2 \mu\, \boldsymbol{\varepsilon}. \tag{2}$$
In addition to this expression with the two Lamé parameters, $\lambda$ and $\mu$, other pairs of moduli are used which are motivated by the experimental setup for their technical determination but in general describe the compressional and shear behavior of the material. From continuum mechanical considerations, the different pairs are uniquely related to each other by algebraic operations. For a more mathematical discussion of elasticity, see Marsden and Hughes (1983).
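The algebraic relations between pairs of moduli can be sketched as follows. This is a minimal illustration assuming the standard isotropic form of Hooke's law, $\boldsymbol{\sigma} = \lambda (\operatorname{tr} \boldsymbol{\varepsilon}) \mathbf{I} + 2\mu \boldsymbol{\varepsilon}$; the function names and the numerical values of the Lamé parameters are hypothetical and chosen only for demonstration.

```python
def bulk_modulus(lam, mu):
    """Bulk modulus K = lambda + 2 mu / 3."""
    return lam + 2.0 * mu / 3.0

def youngs_modulus(lam, mu):
    """Young's modulus E = mu (3 lambda + 2 mu) / (lambda + mu)."""
    return mu * (3.0 * lam + 2.0 * mu) / (lam + mu)

def poisson_ratio(lam, mu):
    """Poisson's ratio nu = lambda / (2 (lambda + mu))."""
    return lam / (2.0 * (lam + mu))

def hooke_stress(strain, lam, mu):
    """Isotropic Hooke's law for a 3x3 strain tensor given as nested lists."""
    trace = sum(strain[i][i] for i in range(3))
    return [
        [lam * trace * (1.0 if i == j else 0.0) + 2.0 * mu * strain[i][j]
         for j in range(3)]
        for i in range(3)
    ]

# Illustrative values only (a Poisson solid with lambda = mu), in Pa.
lam = mu = 60.0e9

K = bulk_modulus(lam, mu)
E = youngs_modulus(lam, mu)
nu = poisson_ratio(lam, mu)

# Uniaxial strain of 1e-6 in the x direction.
strain = [[1.0e-6, 0.0, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]
stress = hooke_stress(strain, lam, mu)

print(K, E, nu)
print(stress[0][0], stress[1][1])
```

For uniaxial strain, the axial stress scales with $\lambda + 2\mu$ and the lateral stress with $\lambda$, which shows how the compressional and shear parts of the response enter through the two moduli.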
2.3 Anelasticity
Deviations from elastic behavior appear due to the fabric of the crystalline material of the Earth's interior. Already at short time scales, such as those of the free oscillations of the Earth's body, which are excited by large earthquakes and which show periods ranging from minutes to about 1 h, anelasticity is observed in the dissipation of the seismic wave energy and in dispersion, i.e., the frequency dependence of the seismic velocities (e.g., Lapwood and Usami 1981). Models to describe this frequency dependence are based on distributions of Debye peaks and use the observation that the damping of the oscillations shows only a weak dependency on the frequency. This results in the empirical frequency band models with constant quality factor $Q$, $Q^{-1}(\omega) = \text{const.}$ (Kanamori and Anderson 1977), or the $Q$–$\alpha$ model, $Q^{-1}(\omega) \propto \omega^{-\alpha}$, where $\alpha$ lies between 0.25 and 0.5 (Anderson and Minster 1979). These absorption-band models are applicable to describe the damping in a specific frequency band, but cannot be extended to long-time processes. Specifically, it is not possible to describe the change from elastic to viscous behavior with such models.
2.4 Viscous Behavior
From laboratory experiments as well as physical principles, the long-time material deformations of lithospheric and mantle material are governed by the rheological mechanisms of the respective material. Considering that the strain rate results from the cumulative response of the heterogeneous rocks to the applied stress,
$$\dot{\mathbf{e}} = \sum_k \dot{\mathbf{e}}_k = \left( \sum_k \frac{1}{2 \eta_k} \right) \mathbf{t}, \tag{3}$$
the strain rates due to the different processes add up, and so do the fluidities, i.e., the inverses of the respective viscosities, $\eta_k$, of the acting processes. In consequence, those processes which result in the smallest viscosities will dominate the response. For rheological laws, the strain rate and stress reduce to their deviatoric or shear components, $[\mathbf{e}, \mathbf{t}] = [\boldsymbol{\varepsilon}, \boldsymbol{\sigma}] - \tfrac{1}{3} \operatorname{tr}[\boldsymbol{\varepsilon}, \boldsymbol{\sigma}]\, \mathbf{I}$, as the rheology, in general, is a process that minimizes the material's shear energy. Furthermore, anisotropies of the viscosity are usually neglected, although they can be considered in this formulation. Generally, the rheological laws for the Earth material are described by
$$\dot{\mathbf{e}} = A\, t^{\,n-1}\, d^{-m}\, f_{\mathrm{H_2O}}^{\,r}\, \exp\!\left( -\frac{Q + pV}{RT} \right) \mathbf{t}. \tag{4}$$
When considering the second invariant of the deviatoric stress r t D
1 t W tT 2
(5)
to represent the effective stress, the colon denotes the double scalar product, the form of this material law remains isotropic also when the material law is non-linear with (power-law) stress exponents, n, usually considered between 1 and 4 (e.g., Ranalli 1987). The material law depends on a number of parameters that have to
666
V. Klemann et al.
be determined from experimental setups. There, $A$ is a general material parameter. Furthermore, the material behavior depends on the grain size, $d$, with the grain-size exponent, $m$, being 2 or 3 depending on the mechanism, and on the water content parameterized by the water fugacity, $f_{\mathrm{H_2O}}$, with the fugacity exponent, $r$. As a temperature-activated process, the viscosity inside the Earth is governed by temperature, $T$, and pressure, $p$, according to the Arrhenius law with the universal gas constant, $R$, and the material-dependent quantities activation energy, $Q$, and activation volume, $V$. The trade-off between the increases of temperature and pressure with depth specifies the vertical viscosity profile in the upper mantle. The viscosity reaches a minimum in the asthenosphere (below the lithosphere), as the temperature gradient drops due to the change from conductive heat transport in the lithosphere to convective heat transport in the mantle, whereas the pressure further increases lithostatically.

The parameters which specify the material’s rheological behavior are usually determined by laboratory experiments. Due to experimental limitations, the applied strain rates are usually between $10^{-4}$ and $10^{-6}\ \mathrm{s^{-1}}$, while those appearing in the Earth are between $10^{-9}$ and $10^{-13}\ \mathrm{s^{-1}}$, and the time scales on which the experiments proceed are much shorter: seconds against years to millennia. In consequence, the experimental data have to be extrapolated by orders of magnitude to the dimensions on which the geoprocesses act. This holds particularly for loading processes, for which the strain rates are lower still than the tectonic strain rates given above and only reach $10^{-14}\ \mathrm{s^{-1}}$. One further drawback is that the actual grain size, $d$, of mantle material is not known. From an experimental point of view, the acting defects in a rock can be reduced to the processes of two mechanisms (e.g., van den Berg et al. 1993):

$$\dot{\boldsymbol e}_{\mathrm{diff}} + \dot{\boldsymbol e}_{\mathrm{disl}} = \frac{1}{2}\left(\frac{1}{\eta_{\mathrm{diff}}} + \frac{1}{\eta_{\mathrm{disl}}}\right)\boldsymbol t, \tag{6}$$
where the diffusion creep $\dot{\boldsymbol e}_{\mathrm{diff}}$ results from the mobility of point defects in the crystal structure with a stress exponent of $n = 1$, whereas the dislocation creep $\dot{\boldsymbol e}_{\mathrm{disl}}$ follows a power law with $n > 1$ and is due to surface or line defects of the crystal structure (e.g., Karato 2008). In consequence, diffusion creep dominates at low stresses, whereas dislocation creep dominates at high stresses. Considering (4), in the presence of a tectonic stress field, a loading process inducing significantly lower stresses will follow the effective viscosity governed by the tectonic stress field (Gasperini et al. 1992, 2004). In contrast, Schmeling (1987) superimposed the flow fields due to GIA and convection in the upper mantle and proposed an anisotropic viscosity law depending on the direction of the two flow fields.
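As a rough illustration of how the two creep mechanisms in (6) compete, the following Python sketch evaluates the effective viscosity obtained from 1/eta_eff = 1/eta_diff + 1/eta_disl as a function of stress. The stress-dependent form assumed here for the dislocation viscosity, eta_disl = 1/(2 A t^(n-1)), and all parameter values are illustrative assumptions and are not taken from the cited studies.

```python
def dislocation_viscosity(stress, A, n):
    """Viscosity of power-law (dislocation) creep.

    Assumed form (hypothetical): eta_disl = 1 / (2 * A * stress**(n - 1)), n > 1.
    """
    return 1.0 / (2.0 * A * stress ** (n - 1))


def effective_viscosity(stress, eta_diff, A, n):
    """Combine both mechanisms as in (6): 1/eta_eff = 1/eta_diff + 1/eta_disl."""
    eta_disl = dislocation_viscosity(stress, A, n)
    return 1.0 / (1.0 / eta_diff + 1.0 / eta_disl)


def transition_stress(eta_diff, A, n):
    """Stress at which eta_disl equals eta_diff (both mechanisms contribute equally)."""
    return (1.0 / (2.0 * A * eta_diff)) ** (1.0 / (n - 1))


if __name__ == "__main__":
    eta_diff = 1.0e21  # Pa s, hypothetical diffusion-creep viscosity
    n = 3.0            # hypothetical stress exponent
    A = 5.0e-34        # Pa^-n s^-1, hypothetical prefactor
    print("transition stress [Pa]:", transition_stress(eta_diff, A, n))
    for stress in (1.0e4, 1.0e6, 1.0e8):  # Pa
        eta = effective_viscosity(stress, eta_diff, A, n)
        print(f"stress = {stress:.0e} Pa -> eta_eff = {eta:.3e} Pa s")
```

With these placeholder values, the effective viscosity stays close to the diffusion-creep value at low stress and drops by orders of magnitude once the stress exceeds the transition stress, which is the behavior described in the text.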
2.5 Viscoelasticity
For processes like postseismic deformations or GIA, i.e., processes that proceed on time scales of decades to millennia, the transition from elastic to viscous behavior has to be investigated to describe the deformation toward an isostatic equilibrium. The usual way to do this is to apply a viscoelastic description (see Wolf 2010, Vol.
1). For the Earth, usually, the Maxwell viscoelasticity

$$\dot{\boldsymbol e} = \frac{1}{2\eta}\,\boldsymbol t + \frac{1}{2\mu}\,\dot{\boldsymbol t} \tag{7}$$
is considered (Peltier 1976; Wu and Peltier 1982), which describes the linear creep of the material according to a Newtonian fluid when applying a constant stress, whereas its inverse relation describes the elastic stress due to a constant strain and its relaxation, which decays with the Maxwell time $\tau_M = \eta/\mu$. The typical Maxwell time for the Earth’s mantle is 218 years and follows from the average mantle viscosity of $10^{21}$ Pa s (Haskell 1935) and an average shear modulus of $145 \times 10^{9}\ \mathrm{N/m^2}$. Higher-order rheologies, such as the Burgers body (Rümpker and Wolf 1996), are discussed in the literature in order to analyze transient behavior on shorter time scales. A way to consider a composite rheology like in (6) is to introduce an effective viscosity which replaces the dynamical viscosity, $\eta$, used to parameterize the Maxwell body (Gasperini et al. 1992):

$$\dot{\boldsymbol e} = \left(\frac{1}{2\eta_{\mathrm{diff}}} + A\,t^{\,n-1}\right)\boldsymbol t + \frac{1}{2\mu}\,\dot{\boldsymbol t} \tag{8}$$
This kind of rheology is discussed repeatedly in the literature (Wu and Wang 2008; van der Wal et al. 2010, 2013) but requires a solution in the time domain, either by using a finite element code (Gasperini et al. 1992; Wu 2002) or by solving the field equation as an initial value problem (Martinec 1999, 2000), where lateral variations in the viscosity can be considered (Klemann et al. 2007, 2008). Usually, for GIA, the linear Maxwell rheology is applied, which allows the time-domain problem to be transformed into the Laplace domain (see Wolf 2010, Vol. 1).
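The Maxwell time quoted in this section can be checked with a few lines of Python. The sketch below simply evaluates the ratio of viscosity to shear modulus for the two values given in the text; the conversion to years assumes a Julian year of 365.25 days, which is a choice made here and not stated in the chapter.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year (assumed convention)


def maxwell_time(viscosity, shear_modulus):
    """Maxwell relaxation time in seconds: viscosity [Pa s] / shear modulus [Pa]."""
    return viscosity / shear_modulus


if __name__ == "__main__":
    eta = 1.0e21   # Pa s, average mantle viscosity quoted in the text
    mu = 145.0e9   # Pa, average shear modulus quoted in the text
    tau_years = maxwell_time(eta, mu) / SECONDS_PER_YEAR
    print(f"Maxwell time: {tau_years:.1f} years")
```

The result is about 218 years, in line with the value stated above. Processes much faster than this respond essentially elastically, and much slower ones essentially viscously.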
3 Fundamental Results
On the basis of a continuum mechanical approach, the momentum equations are usually formulated in the Lagrangian domain, for the elastic as well as for the viscoelastic problem.
3.1 Field Equations for a Spherical Planet
The reaction of the solid Earth to surface mass redistribution is generally described by a self-gravitating, nonrotating, viscoelastic continuum which is isostatically prestressed (SNRVEI), where the inertia term, $\rho_0\,\mathrm{d}^2\boldsymbol u/\mathrm{d}t^2$ (Kusche 2010, Eq. 17), is neglected in the momentum equation:
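$$\operatorname{div}\boldsymbol\tau - \rho_0\,\operatorname{grad}\delta V + \operatorname{div}(\rho_0\,\boldsymbol u)\,\operatorname{grad}V_0 - \operatorname{grad}(\rho_0\,\boldsymbol u\cdot\operatorname{grad}V_0) = \boldsymbol 0 \quad \text{in } B. \tag{9}$$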
In addition, the Poisson equation

$$\nabla^2\,\delta V + 4\pi G\,\operatorname{div}(\rho_0\,\boldsymbol u) = 0 \quad \text{in } B \tag{10}$$
relates deformation and gravity perturbation (e.g., Martinec 2000). Here, $\boldsymbol\tau$ is the stress tensor, which is related to the deformation according to the above-discussed material laws; $\boldsymbol u$ is the displacement; $\delta V$ is the sum of the perturbation of the initial gravitational potential, $V_0$, of the body, $B$, due to the deformation process and of the potential of the externally applied load; $G$ is the Newton gravitational constant; and $\rho_0$ is the unperturbed mass density. In general, the equations are solved for a spherically symmetric geometry, meaning that all material parameters are only functions of the radial distance, $r$. In the case of the density and elasticity, this is mandatory, as otherwise the hydrostatically prestressed reference state is not spherical anymore. If a stratified body is considered, we have to introduce interface conditions for the displacement and the normal component of the stress to be continuous and similarly for the potential and the gravity (for details, see Dahlen 1974). The boundary conditions at the Earth’s surface are

$$
\left.
\begin{aligned}
\boldsymbol e_r\cdot\boldsymbol\tau\cdot\boldsymbol e_r &= -g_0(a)\,\sigma \\
\boldsymbol\tau\cdot\boldsymbol e_r - (\boldsymbol e_r\cdot\boldsymbol\tau\cdot\boldsymbol e_r)\,\boldsymbol e_r &= \boldsymbol 0 \\
[\delta V]^{+}_{-} &= 0 \\
[\operatorname{grad}\delta V]^{+}_{-}\cdot\boldsymbol e_r + 4\pi G\,\rho_0\,(\boldsymbol u\cdot\boldsymbol e_r)^{-} &= -4\pi G\,\sigma
\end{aligned}
\right\}
\quad \text{on } \partial B, \tag{11}
$$
where $\sigma$ represents the excitation considered as a surface mass density applied at the Earth’s surface, $r = a$; the minus superscript denotes the value inside the body; and the brackets denote the difference across the boundary. The choice of a spherical symmetry with respect to the elastic parameters allows the problem to be split into a radially dependent part, $r$, and a laterally dependent part, $\Omega = (\vartheta, \varphi)$, represented by spherical harmonics. The displacement field then reads
$$\boldsymbol u(r, \Omega, t) = a \sum_{l=-1}^{1}\,\sum_{n=0}^{\infty}\,\sum_{m=-n}^{n} U^{l}_{nm}(r, t)\,\boldsymbol Y^{l}_{nm}(\Omega), \tag{12}$$
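where the vector spherical harmonics of degree $n$ and order $m$

$$\boldsymbol Y^{-1}_{nm} := Y_{nm}\,\boldsymbol e_r, \qquad \boldsymbol Y^{+1}_{nm} := \operatorname{grad}_\Omega Y_{nm} \tag{13}$$

describe the spheroidal part, the vector spherical harmonics

$$\boldsymbol Y^{0}_{nm} := \boldsymbol e_r \times \operatorname{grad}_\Omega Y_{nm} \tag{14}$$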
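describe the toroidal part of the displacement field, $Y_{nm}$ are the scalar spherical harmonics, and $\operatorname{grad}_\Omega := \operatorname{grad} - \boldsymbol e_r\,\partial_r$. Accordingly, the gravitational potential is represented by

$$\delta V(r, \Omega, t) = \frac{GM}{a} \sum_{n=0}^{\infty}\,\sum_{m=-n}^{n} \Phi_{nm}(r, t)\,Y_{nm}(\Omega). \tag{15}$$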
The relation between stress, $\boldsymbol\tau$, and deformation, $\operatorname{grad}\boldsymbol u$, is specified by the material law. The toroidal part is only excited if the boundary conditions contain a toroidal component (Martinec 2007) or if the material parameters become laterally variable (Klemann et al. 2008). The displacement field at the surface is therefore represented by two radial functions, $U^{\pm 1}_{nm}(r, t)$, in the spectral domain.
3.2 The Elastic Problem
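For elastic deformations, solutions of Eqs. (2) and (9)–(11) can be expressed by three sets of load Love numbers (LLNs) as discussed in Kusche (2010), where we identify

$$U^{-1}_{nm} = \frac{4\pi a^2}{M}\,\frac{h'_n}{2n+1}\,\sigma_{nm}, \qquad U^{+1}_{nm} = \frac{4\pi a^2}{M}\,\frac{l'_n}{2n+1}\,\sigma_{nm}, \qquad \Phi_{nm} = \frac{4\pi a^2}{M}\,\frac{1 + k'_n}{2n+1}\,\sigma_{nm}, \tag{16}$$

and the load coefficients

$$\sigma_{nm} = \frac{1}{4\pi} \int_{4\pi} \sigma(\Omega)\,Y_{nm}(\Omega)\,\mathrm{d}\Omega. \tag{17}$$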
These load Love numbers are usually listed in the literature for different Earth models (radial functions of density, incompressibility, and shear modulus); for example, Farrell (1972) presented the LLNs for the Gutenberg–Bullen model. Usually, the LLNs given for solid Earth corrections are calculated for the PREM model (Dziewonski and Anderson 1981). Wang et al. (2012) compared the LLNs of different global Earth models and showed that deviations become evident for $n > 100$ due to differences in the asthenosphere, which is the main difference between PREM and the ak135 model (Kennett et al. 1995) (see Fig. 1). The consideration of a different crust (Laske et al. 2012) shows that this thin top layer dominates the response for $n > 1{,}000$, where the relative difference between the two Earth models with the same crust is negligible (dashed lines), but is of influence already at lower degrees, where it masks the differences between the two Earth models PREM and ak135. For local processes, this demands the consideration of a regionally adjusted lithosphere structure.

Fig. 1 Relative deviations of the elastic LLN $h'_n$ from PREM for ak135 (red, solid) and if the crust of PREM (black dashed) or ak135 (red dashed) is replaced by a structure considering hard sediments in the model CRUST 2.0 (after Wang et al. 2012). Horizontal axis: Legendre degree, $n$; vertical axis: relative deviation of $h'_n$
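To make the use of the load Love numbers in (16) more tangible, the following Python sketch evaluates the three spectral coefficients for a single degree and a single load coefficient. The Earth radius and mass, the function name, and the Love-number values in the example are assumptions chosen for illustration; they are rounded placeholders and not the values of any particular Earth model.

```python
import math

EARTH_RADIUS = 6.371e6  # m, assumed mean radius a
EARTH_MASS = 5.972e24   # kg, assumed mass M


def spectral_response(n, sigma_nm, h_n, l_n, k_n, a=EARTH_RADIUS, M=EARTH_MASS):
    """Evaluate (16) for degree n and load coefficient sigma_nm [kg/m^2].

    h_n, l_n, k_n are the load Love numbers of degree n.
    Returns the dimensionless coefficients of radial displacement,
    tangential displacement, and potential perturbation.
    """
    factor = 4.0 * math.pi * a ** 2 / (M * (2 * n + 1)) * sigma_nm
    return {
        "U_radial": factor * h_n,        # coefficient of Y^{-1}_{nm}
        "U_tangential": factor * l_n,    # coefficient of Y^{+1}_{nm}
        "Phi": factor * (1.0 + k_n),     # coefficient of the potential
    }


if __name__ == "__main__":
    # Hypothetical degree-2 Love numbers and a load coefficient of 1000 kg/m^2
    # (roughly one metre of water); placeholders only.
    result = spectral_response(n=2, sigma_nm=1000.0, h_n=-1.0, l_n=0.03, k_n=-0.3)
    # Following (12), the displacement coefficient in metres is a * U.
    print("radial displacement coefficient [m]:", EARTH_RADIUS * result["U_radial"])
    print("tangential displacement coefficient [m]:", EARTH_RADIUS * result["U_tangential"])
    print("potential coefficient [-]:", result["Phi"])
```

The sign of the radial coefficient follows the sign of the Love number: a negative value under a positive load corresponds to subsidence beneath the load.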
3.3 The Viscoelastic Problem
For the linear viscoelastic problem, the time dependence of the material law (7) replaces Hooke’s law. Then, the response of a point source with a Heaviside loading history, $h(t) = 0$ for $t <$ […] $> 0$, the density stratification is unstable, as a downward displaced incremental volume element becomes denser than the surrounding material due to hydrostatic compression. In the spectral representation of a viscoelastic continuum, this results in positive modes, $s_i > 0$, which Plag and Jüttner (1995) discussed as Rayleigh–Taylor modes and which lead to instability (21). Furthermore, denumerable sets of modes inside bounded intervals, $s_i \in [s_{i1}, s_{i2}]$ for $i = 1, \ldots, \infty$, appear on the positive and on the negative $s$-axis (Vermeersen et al. 1996; Hanyk et al. 1999; Cambiotti and Sabadini 2010). Cambiotti et al. (2013) showed that such sets for $s < 0$ also appear for […] $< 0$, when no instability appears, and are related to the Longman paradox discussed for a compressible core (Longman 1963). These compressional modes can overlap the modes describing the rebound process (e.g., Han and Wahr 1995), which makes it difficult to identify the relevant modes in the spectrum. To circumvent this problem, alternative methods to calculate the inverse Laplace transformation were suggested, such as integrating numerically along appropriate paths (e.g., Tanaka et al. 2006), applying the Post–Widder algorithm, which holds only for stable stratifications (Spada and Boschi 2006), or solving the problem directly in the time domain as an initial value problem (Martinec 1999). Compressible Earth structures that follow the AWC (e.g., Martinec et al. 2001; Wolf and Li 2002) are not widely accepted due to their deviation from seismically inferred density models, but also because Vermeersen and Mitrovica (2000) could
show that the instability modes have characteristic times of $1/s > 10^{7}$ years and so are far beyond the time scales of the loading processes considered in this theory. The second way to circumvent the problem is to consider an incompressible stratification, which is stable by definition as long as no negative density contrast appears. This is a manageable strategy when investigating a relaxation process like GIA, as the viscoelastic response of an incompressible Earth structure to GIA is comparable to that of a compressible one if the flexure of the lithosphere is properly adjusted (e.g., Tanaka et al. 2011). When the elastic loading response is of interest, a compressible model has to be considered. As stated at the beginning of this section, the presented calculus cannot be applied when considering lateral variations in the Earth structure, as the response functions will then depend on the location of the loading. For a power-law rheology, the same holds due to the nonlinear material law, as the effective viscosity depends on the perturbed stress state.
4 Future Directions
There are two arguments for going beyond the usual SNRVEI models. The first argument is that the Earth structure required to obtain a realistic response function, e.g., for GIA, is not compatible with structural models based on other observations like seismic tomography or geodynamical constraints. The second argument is that the increasing accuracy of Earth observation systems demands better constrained Earth models, and the observations can no longer be interpreted under the assumption of spherical symmetry. More advanced models have repeatedly been discussed during the last decades. With increasing computer power, such models are likely to complement the SNRVEI models, though not to replace them completely because they lack linearity, especially for the inference of horizontal motions, which are more sensitive to near-surface structures, or when considering loading processes in the vicinity of strong lateral contrasts of lithospheric structure.
5 Conclusion
Modeling the solid Earth response to loading processes is mandatory when interpreting geodetic observables. The response is usually expressed by load Love numbers for elastic deformations and by linear response functions representing the viscoelastic behavior of the material. These representations are restricted to spherically stratified Earth models with a linear rheology and are mainly applied to the process of GIA. For local to regional loading processes in continental regions, this kind of model is applicable, as the structural features can be adjusted to the respective region.
References
Anderson DL, Minster B (1979) The frequency dependence of Q in the earth and implications for mantle rheology and Chandler Wobble. Geophys J R Astr Soc 58:431–440. doi:10.1111/j.1365-246X.1979.tb01033.x
Blewitt G (2009) GPS and space-based geodetic methods. In: Herring T (ed) Geodesy: treatise on geophysics. Elsevier, Amsterdam, pp 351–390. doi:10.1016/B978-044452748-6.00058-4
Bürgmann R, Dresen G (2008) Rheology of the lower crust and upper mantle: evidence from rock mechanics, geodesy, and field observations. Ann Rev Earth Planet Sci 36:531–567. doi:10.1146/annurev.earth.36.031207.124326
Byerlee J (1978) Friction of rocks. Pure Appl Geophys 116:615–626. doi:10.1007/BF00876528
Cambiotti G, Sabadini R (2010) The compressional and compositional stratifications in Maxwell earth models: the gravitational overturning and the long-period tangential flux. Geophys J Int 180:475–500. doi:10.1111/j.1365-246X.2009.04434.x
Cambiotti G, Klemann V, Sabadini R (2013) Compressible viscoelastodynamics of a spherical body at long time scales and its isostatic equilibrium. Geophys J Int 193:1071–1082. doi:10.1093/gji/ggt026
Cathles LM (1975) The viscosity of the Earth’s mantle. Princeton University Press, Princeton, p 386
Dahlen FA (1974) On the static deformations of an earth model with a fluid core. Geophys J R Astr Soc 36:461–485. doi:10.1111/j.1365-246X.1974.tb03649.x
Dziewonski AM, Anderson DL (1981) Preliminary reference earth model. Phys Earth Planet Inter 25:297–356. doi:10.1016/0031-9201(81)90046-7
Farrell WE (1972) Deformation of the earth by surface loads. Rev Geophys 10:761–797. doi:10.1029/RG010i003p00761
Freeden W, Schreiner M (2010) Special functions in mathematical geosciences: an attempt at a categorization. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics. Springer, Berlin/Heidelberg, pp 925–949. doi:10.1007/978-3-642-01546-5_8
Gasperini P, Yuen DA, Sabadini R (1992) Postglacial rebound with a non-Newtonian upper mantle and a Newtonian lower mantle rheology. Geophys Res Lett 19:1711–1714. doi:10.1029/92GL01456
Gasperini P, Da Forno G, Boschi E (2004) Linear or non-linear rheology in the Earth’s mantle: the prevalence of power-law creep in the postglacial isostatic readjustment of Laurentia. Geophys J Int 157:1297–1302. doi:10.1111/j.1365-246X.2004.02319.x
Gomez N, Pollard D, Mitrovica JX, Huybers P, Clark PU (2012) Evolution of a coupled marine ice sheet–sea level model. J Geophys Res 117. doi:10.1029/2011JF002128
Gurtin ME, Sternberg E (1962) On the linear theory of viscoelasticity. Arch Ration Mech Anal 11:291–356
Han D, Wahr J (1995) The viscoelastic relaxation of a realistically stratified earth, and a further analysis of postglacial rebound. Geophys J Int 120:287–311. doi:10.1111/j.1365-246X.1995.tb01819.x
Hanyk L, Matyska C, Yuen DA (1999) Secular gravitational instability of a compressible viscoelastic sphere. Geophys Res Lett 26:557–560. doi:10.1029/1999GL900024
Haskell NA (1935) The motion of a viscous fluid under a surface load. Physics 6:265–369. doi:10.1063/1.1745329
Holgate SJ, Matthews A, Woodworth PL, Rickards LJ, Tamisiea ME, Bradshaw E, Foden PR, Gordon KM, Jevrejeva S, Pugh J (2013) New data systems and products at the permanent service for mean sea level. J Coast Res 288:493–504. doi:10.2112/JCOASTRES-D-12-00175.1
Johnston AC (1989) The effect of large ice sheets on earthquake genesis. In: Gregersen S, Basham PW (eds) Earthquakes at North-Atlantic passive margins: neotectonics and postglacial rebound. Kluwer Academic, Dordrecht, pp 581–599
Kanamori H, Anderson DL (1977) Importance of physical dispersion in surface wave and free oscillation problems: review. Rev Geophys Space Phys 15:105–112. doi:10.1029/RG015i001p00105
Karato S (2008) Deformation of Earth materials: an introduction to the rheology of the solid Earth. Cambridge University Press, Cambridge, p 463
Kennett BLN, Engdahl ER, Buland R (1995) Constraints on seismic velocities in the earth from traveltimes. Geophys J Int 122:108–124. doi:10.1111/j.1365-246X.1995.tb03540.x
Klemann V, Ivins E, Martinec Z, Wolf D (2007) Models of active glacial isostasy roofing warm subduction: case of the South Patagonian Ice Field. J Geophys Res 112:B09405. doi:10.1029/2006JB004818
Klemann V, Martinec Z, Ivins ER (2008) Glacial isostasy and plate motions. J Geodyn 46:95–103. doi:10.1016/j.jog.2008.04.005
Kuhlmann J, Thomas M, Schuh H (2014) Self-attraction and loading of oceanic masses. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics. Springer, Berlin/Heidelberg
Kusche J (2010) Time-variable gravity field and global deformation of the Earth. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics. Springer, Berlin/Heidelberg, pp 253–268. doi:10.1007/978-3-642-01546-5_8
Kusche J, Klemann V, Bosch W (2012) Mass distribution and mass transport in the Earth system. J Geodyn 59–60:1–8. doi:10.1016/j.jog.2012.03.003
Lambert A, Courtier N, James T (2006) Long-term monitoring by absolute gravimetry: tides to postglacial rebound. J Geodyn 41:307–317. doi:10.1016/j.jog.2005.08.032
Lapwood ER, Usami T (1981) Free oscillations of the Earth. Cambridge University Press, Cambridge, p 243
Laske G, Masters G, Reif C (2012) A new global crustal model at 2 × 2 degrees. http://igppweb.ucsd.edu/~gabi/crust2.html
Longman IM (1963) A Green’s function for determining the deformation of the earth under surface mass loads – 2. Computations and numerical results. J Geophys Res 68:485–496. doi:10.1029/JZ068i002p00485
Marsden JE, Hughes TJR (1983) Mathematical foundations of elasticity. Dover, New York, p 555
Martinec Z (1999) Spectral, initial value approach for viscoelastic relaxation of a spherical earth with a three-dimensional viscosity – I. Theory. Geophys J Int 137:469–488. doi:10.1046/j.1365-246X.1999.00803.x
Martinec Z (2000) Spectral–finite element approach for three-dimensional viscoelastic relaxation in a spherical earth. Geophys J Int 142:117–141. doi:10.1046/j.1365-246x.2000.00138.x
Martinec Z (2007) Propagator-matrix technique for the viscoelastic response of a multi-layered sphere to surface toroidal traction. Pure Appl Geophys 164:663–681. doi:10.1007/s00024-007-0188-5
Martinec Z, Thoma M, Wolf D (2001) Material versus local incompressibility and its influence on glacial-isostatic adjustment. Geophys J Int 144:136–156. doi:10.1046/j.1365-246x.2001.00324.x
Peltier WR (1976) Glacial–isostatic adjustment – II. The inverse problem. Geophys J R Astr Soc 46:669–705. doi:10.1111/j.1365-246X.1976.tb01253.x
Peltier WR (2004) Global glacial isostasy and the surface of the ice-age earth: the ICE5G (VM2) model and GRACE. Ann Rev Earth Planet Sci 32:111–149. doi:10.1146/annurev.earth.32.082503.144359
Plag H-P, Jüttner H-U (1995) Rayleigh-Taylor instabilities of a self-gravitating earth. J Geodyn 20:267–288. doi:10.1016/0264-3707(95)00008-W
Ranalli G (1987) Rheology of the Earth, deformation and flow processes in geophysics and geodynamics. Allen & Unwin, Boston, p 366
Rümpker G, Wolf D (1996) Viscoelastic relaxation of a Burgers half-space: implications for the interpretation of the Fennoscandian uplift. Geophys J Int 124:541–555. doi:10.1111/j.1365-246X.1996.tb07036.x
Sabadini R, Vermeersen B (2004) Global dynamics of the Earth – applications of normal mode relaxation theory to solid-Earth geophysics. Kluwer Academic, Dordrecht, p 329
Schmeling H (1987) On the interaction between small- and large-scale convection and postglacial rebound flow in a power-law mantle. Earth Planet Sci Lett 84:254–262. doi:10.1016/0012-821X(87)90090-2
Schuh H, Behrend D (2012) VLBI: a fascinating technique for geodesy and astrometry. J Geodyn 61:68–80. doi:10.1016/j.jog.2012.07.007
Simons FJ, Olhede SC (2013) Maximum-likelihood estimation of lithospheric flexural rigidity, initial-loading fraction and load correlation, under isotropy. Geophys J Int 193:1300–1342. doi:10.1093/gji/ggt056
Spada G, Boschi L (2006) Using the Post–Widder formula to compute the Earth’s viscoelastic Love numbers. Geophys J Int 166:309–321. doi:10.1111/j.1365-246X.2006.02995.x
Spada G, Barletta VR, Klemann V, Riva REM, Martinec Z, Gasperini P, Lund B, Wolf D, Vermeersen LLA, King MA (2011) A benchmark study for glacial isostatic adjustment codes. Geophys J Int 185:106–132. doi:10.1111/j.1365-246X.2011.04952.x
Stacey FD (1969) Physics of the Earth. Wiley, New York, p 414
Tanaka Y, Okuno J, Okubo S (2006) A new method for the computation of global viscoelastic postseismic deformation in a realistic earth model (I) – vertical displacement and gravity variation. Geophys J Int 164:273–289. doi:10.1111/j.1365-246X.2005.02821.x
Tanaka Y, Klemann V, Martinec Z, Riva REM (2011) Spectral-finite element approach to viscoelastic relaxation in a spherical compressible earth: application to GIA modelling. Geophys J Int 184:220–234. doi:10.1111/j.1365-246X.2010.04854.x
Tromp J, Mitrovica JX (1999) Surface loading of a viscoelastic earth – I. General theory. Geophys J Int 137:847–855. doi:10.1046/j.1365-246x.1999.00838.x
van den Berg AP, van Keken PE, Yuen DA (1993) The effects of a composite non-Newtonian and Newtonian rheology on mantle convection. Geophys J Int 115:62–78. doi:10.1111/j.1365-246X.1993.tb05588.x
van den Berg J, van de Wal R, Oerlemans J (2006) Recovering lateral variations in lithospheric strength from bedrock motion data using a coupled ice sheet–lithosphere model. J Geophys Res 111:B05409. doi:10.1029/2005JB003790
van den Berg J, van de Wal R, Milne GA, Oerlemans J (2008) Effect of isostasy on dynamical ice sheet modelling: a case study for Eurasia. J Geophys Res 113:B05412. doi:10.1029/2007JB004994
van der Wal W, Wu P, Wang H, Sideris MG (2010) Sea levels and uplift rate from composite rheology in glacial isostatic adjustment modeling. J Geodyn 50:38–48. doi:10.1016/j.jog.2010.01.006
van der Wal W, Barnhoorn A, Stocchi P, Gradmann S, Wu P, Drury M, Vermeersen B (2013) Glacial isostatic adjustment model with composite 3-d earth rheology for Fennoscandia. Geophys J Int 194:61–77. doi:10.1093/gji/ggt099
Vermeersen LLA, Mitrovica JX (2000) Gravitational stability of spherical self-gravitating relaxation models. Geophys J Int 142:351–360. doi:10.1046/j.1365-246x.2000.00159.x
Vermeersen LLA, Sabadini R, Spada G (1996) Compressible rotational deformation. Geophys J Int 126:735–761. doi:10.1111/j.1365-246X.1996.tb04700.x
Wang H, Xiang L, Jia L, Jiang L, Wang Z, Hu B, Gao P (2012) Load Love numbers and Green’s functions for elastic earth models PREM, iasp91, ak135, and modified models with refined crustal structure from Crust 2.0. Comput Geosci 49:190–199. doi:10.1016/j.cageo.2012.06.022
Watts AB (2001) Isostasy and flexure of the lithosphere. Cambridge University Press, Cambridge, p 458
Wolf D (2010) Gravitational viscoelastodynamics. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics. Springer, Berlin/Heidelberg, pp 304–330. doi:10.1007/978-3-642-01546-5_10
Wolf D, Kaufmann G (2000) Effects due to compressional and compositional density stratification on load-induced Maxwell-viscoelastic perturbations. Geophys J Int 140:51–62. doi:10.1046/j.1365-246x.2000.00984.x
Wolf D, Li G (2002) Compressible viscoelastic earth models based on Darwin’s law. In: Mitrovica JX, Vermeersen LLA (eds) Glacial isostatic adjustment and the Earth system: sea-level, crustal deformation, gravity and rotation. American Geophysical Union, Washington, DC, pp 275–292
Wu P (2002) Effects of nonlinear rheology on degree 2 harmonic deformation in a spherical self-gravitating earth. Geophys Res Lett 29. doi:10.1029/2001GL014109
Wu P, Hasegawa HS (1996) Induced stresses and fault potential in eastern Canada due to a disc load: a preliminary analysis. Geophys J Int 125:415–430. doi:10.1111/j.1365-246X.1996.tb00008.x
Wu P, Peltier WR (1982) Viscous gravitational relaxation. Geophys J R Astr Soc 70:435–485. doi:10.1111/j.1365-246X.1982.tb04976.x
Wu P, Wang H (2008) Postglacial isostatic adjustment in a self-gravitating spherical Earth with power-law rheology. J Geodyn 46:118–130. doi:10.1016/j.jog.2008.03.008
Multiscale Model Reduction with Generalized Multiscale Finite Element Methods in Geomathematics
Yalchin Efendiev and Michael Presho
Contents
1 Introduction
2 Preliminaries
2.1 Some Selected Applications
2.2 Coarse Mesh Description
3 GMsFEM Framework
3.1 Local Basis Functions
3.2 Global Coupling
4 GMsFEM for Applications
4.1 Single-Phase Compressible Flow
4.2 Wave Equation
4.3 Nonlinear Flows
4.4 Multiphase Flow and Transport
5 Conclusions
References
Abstract
In this chapter, we discuss multiscale model reduction using Generalized Multiscale Finite Element Methods (GMsFEM) in a number of geomathematical applications. GMsFEM has been recently introduced (Efendiev et al. 2012) and applied to various problems. In the current chapter, we consider some of these applications and outline the basic methodological concepts.

Y. Efendiev: Department of Mathematics, Institute for Scientific Computation (ISC), Texas A&M University, College Station, TX, USA; Numerical Porous Media SRI Center, King Abdullah University of Science and Technology (KAUST), Makkah Province, Kingdom of Saudi Arabia
M. Presho: The Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin, Austin, TX, USA
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_68
1 Introduction
Many geomathematical problems involve media or processes that contain multiple scales and physical properties that vary over orders of magnitude and exhibit uncertainties. As an example, solutions to fluid flow problems in heterogeneous porous media require large-scale computations to understand the complex physics and chemistry occurring in the subsurface. These models, henceforth called fine-grid models, often consist of over $10^6$–$10^7$ grid cells. Therefore, the ability to coarsen these highly resolved models to levels of detail appropriate for simulations, optimization, and uncertainty quantification, while maintaining the integrity of the model for its fast simulation, is clearly needed.

Similarly, geological complexity makes seismic imaging computationally challenging for geomathematical applications. Traditional methods share a common potential weakness in that the computational cost will be prohibitively expensive for large geological models and large three-dimensional models. Moreover, current seismic exploration tends to investigate finer and finer details of target reservoirs and to characterize smaller and smaller geological heterogeneities that are possibly smaller than the wavelength of the seismic wavefield. In these cases, one needs a much finer grid discretization of the model, and since the computational time and memory cost are directly proportional to the number of grid points in finite difference, finite element, or spectral element methods, the computational cost will increase accordingly. In full wave inversion, for example, where one needs to run the forward modeling numerous times, the cost is even higher.

These and many other applications require sophisticated model reduction techniques that can represent important features of the solution. There are a variety of model reduction techniques that include homogenization, upscaling, perturbation approaches, and multiscale methods. Upscaling techniques (see, e.g., Chen et al. 2003; Chen and Durlofsky 2007) have been commonly used in subsurface applications and involve a reformulation of the global problem on a coarse grid, which yields the so-called upscaled equations. The upscaled equations contain effective media properties. The calculations of these effective properties typically involve solving local problems in representative volumes or in coarse-grid blocks and extracting these properties via volume averaging. Though effective in many cases, these approaches do not systematically approximate the fine-grid solution.

In this chapter, we discuss some recent approaches introduced in the context of Multiscale Finite Element Methods (see Hou and Wu 1997; Hughes et al. 1998; Aarnes and Hou 2002; Arbogast 2002; Jenny et al. 2003; Aarnes 2004; Efendiev et al. 2004; Aarnes et al. 2006; Efendiev et al. 2006; Aarnes et al. 2007; Owhadi and Zhang 2007; Aarnes and Efendiev 2008; Efendiev and Hou 2009; Efendiev et al. 2011) that can systematically and effectively enrich the solution space locally on a coarse-grid level. The main idea of these multiscale methods is to construct
an approximation space for the solution on each coarse (computational) grid. For this purpose, we use the recently introduced Generalized Multiscale Finite Element Method in this chapter (Efendiev et al. 2012). The Generalized Multiscale Finite Element Method (GMsFEM) is a flexible framework that generalizes the Multiscale Finite Element Method (MsFEM) by systematically enriching the coarse spaces and taking into account small-scale information and complex input spaces. This approach, as in many multiscale model reduction techniques, divides the computation into two stages: offline and online. In the offline stage, a small-dimensional space is constructed that can be efficiently used in the online stage to construct multiscale basis functions. These multiscale basis functions can be reused for any input parameter to solve the problem on a coarse grid. The main idea behind the construction of offline and online spaces is the selection of local spectral problems and the selection of the snapshot space. In Efendiev et al. (2012), we propose several general strategies for the effective enrichment of coarse spaces. In this chapter, our focus is to demonstrate a number of applications of GMsFEM. The chapter is organized as follows. In Sect. 2, we discuss the underlying equations, four exemplary application problems, and describe the coarse (and fine) mesh discretization. In Sect. 3, we describe the basic ingredients of GMsFEM. In particular, we discuss the snapshot space, offline and online spaces, and their construction. In Sect. 4, we offer a set of representative numerical results for the applications described in Sect. 2. We finish the chapter by offering some concluding remarks in Sect. 5.
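The upscaling idea mentioned in this introduction can be made concrete in the simplest possible setting. For a one-dimensional layered medium, the effective permeability is known in closed form: flow across the layers sees the thickness-weighted harmonic mean, and flow along the layers sees the thickness-weighted arithmetic mean. The Python sketch below, with hypothetical function names and values, only illustrates these two classical averages; it is not part of GMsFEM and not the procedure of the cited upscaling papers.

```python
def effective_permeability_series(kappas, thicknesses):
    """Flow perpendicular to the layers: thickness-weighted harmonic mean."""
    total = sum(thicknesses)
    return total / sum(h / k for h, k in zip(thicknesses, kappas))


def effective_permeability_parallel(kappas, thicknesses):
    """Flow parallel to the layers: thickness-weighted arithmetic mean."""
    total = sum(thicknesses)
    return sum(h * k for h, k in zip(thicknesses, kappas)) / total


if __name__ == "__main__":
    kappas = [1.0, 1000.0]      # hypothetical high-contrast permeabilities
    thicknesses = [0.5, 0.5]    # equal layer thicknesses
    print("across layers:", effective_permeability_series(kappas, thicknesses))
    print("along layers: ", effective_permeability_parallel(kappas, thicknesses))
```

For a contrast of 1,000, the two averages differ by more than two orders of magnitude, which hints at why a single effective coefficient per coarse block can miss important features of a high-contrast medium.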
2 Preliminaries
2.1 Some Selected Applications
In this subsection we offer a general overview of selected classes of subsurface flow problems that have been successfully treated with the Generalized Multiscale Finite Element Method (GMsFEM). More detailed discussions of the respective solution procedures are deferred until later sections; the preliminary descriptions given here serve as further motivation for the multiscale methods presented throughout the chapter.
Single-Phase Flow
As an initial application for the practical use of GMsFEM, we may consider an incompressible, steady-state flow of a single fluid phase. In this context, an elliptic equation of the form
$$-\operatorname{div}\big(\kappa(x)\,\nabla p\big) = q \tag{1}$$
is used to model the pressure (as denoted by $p(x)$) within a heterogeneous reservoir constrained to a region $D$. We note that all applications in this chapter involve a
Y. Efendiev and M. Presho
permeability coefficient $\kappa(x)$ that is assumed to exhibit high-contrast behavior (i.e., $\kappa_{\max}/\kappa_{\min}$ is assumed to be very large). The term $q(x)$ denotes any external sources. For a single-phase, compressible flow, we consider a parabolic equation of the form
$$\frac{\partial p}{\partial t} - \operatorname{div}\big(\kappa(x)\,\nabla p\big) = q, \tag{2}$$
where $p(x,t)$ denotes the time-varying pressure within a specified domain $D$.
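To make the role of the high-contrast coefficient in Eq. (1) concrete, the following minimal sketch solves a one-dimensional analogue, $-(\kappa p')' = q$ with zero Dirichlet data, using standard linear finite elements on a fine grid. The one-dimensional setting, the piecewise-constant coefficient, the constant source, and the name `solve_elliptic_1d` are illustrative assumptions and are not taken from the chapter.

```python
import numpy as np

def solve_elliptic_1d(kappa_cells, q, length=1.0):
    """Linear FEM for -(kappa p')' = q on (0, length) with p = 0 at both ends.

    kappa_cells holds one piecewise-constant permeability value per fine cell
    and q is a constant source. Returns nodal pressures, boundary nodes included.
    """
    kappa_cells = np.asarray(kappa_cells, dtype=float)
    n = kappa_cells.size                  # number of fine cells
    h = length / n                        # uniform cell width
    A = np.zeros((n + 1, n + 1))          # global stiffness matrix
    Q = np.full(n + 1, q * h)             # load vector (exact for interior nodes)
    for e in range(n):                    # element-by-element assembly
        A[e:e + 2, e:e + 2] += kappa_cells[e] / h * np.array([[1.0, -1.0],
                                                              [-1.0, 1.0]])
    interior = np.arange(1, n)            # zero Dirichlet data: drop boundary nodes
    p = np.zeros(n + 1)
    p[interior] = np.linalg.solve(A[np.ix_(interior, interior)], Q[interior])
    return p

# Hypothetical high-contrast example: a channel with kappa = 1e4 in a unit background.
kappa = np.ones(100)
kappa[40:60] = 1.0e4
p_high_contrast = solve_elliptic_1d(kappa, q=1.0)
```

In this sketch the pressure is nearly constant across the high-permeability channel, which is the kind of fine-scale feature that a coarse space has to capture.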
Wave Equation
We also consider the second-order acoustic wave equation in the form
$$\frac{\partial^2 p}{\partial t^2} - \nabla\cdot\big(\kappa\,\rho^{-1}\nabla p\big) = q, \tag{3}$$
where $p = p(x,t)$ is the pressure wavefield, $\kappa = \kappa(x)$ is the bulk modulus of the medium, which may vary greatly below the dominant seismic wavelength, $\rho = \rho(x)$ is the density of the medium, and $q = q(x,t)$ is the external force term. We will assume constant $\rho$ and normalize the density with it, so that (3) may be written as
$$\frac{\partial^2 p}{\partial t^2} - \nabla\cdot\big(c^2\,\nabla p\big) = q, \tag{4}$$
where $c^2 = \kappa/\rho$.
Nonlinear Flows
In some situations, the pressure within a domain may involve a nonlinearity (Efendiev et al. 2004). To model an incompressible, nonlinear, steady-state flow of a single fluid phase, we consider a general equation of the form
$$-\operatorname{div}\big(\kappa(x,p,\nabla p)\,\nabla p\big) = q, \tag{5}$$
where $p(x)$ denotes the pressure within a subsurface domain $D$, $\kappa$ is a high-contrast permeability coefficient, and $q(x)$ is an external source term.
Multiphase Flow and Transport
Multiphase fluid flow is another area where GMsFEM may be used as an effective solution technique. For this selected application we consider a heterogeneous oil reservoir which is confined to a global domain $D$. We consider an immiscible, incompressible two-phase system containing water and oil (denoted by the respective subscripts $w$ and $o$). We also assume a gravity-free environment and that the pore space is fully saturated. Then, a statement of conservation of mass combined with Darcy's law allows us to write the governing equations of the flow as
$$\operatorname{div}(v) = q, \quad \text{where } v = -\lambda(S)\,\kappa(x)\,\nabla p, \tag{6}$$
and
$$\frac{\partial S}{\partial t} + \operatorname{div}\big(f(S)\,v\big) = q_w, \tag{7}$$
where $p(x,t)$ denotes the pressure, $v(x,t)$ is the Darcy velocity, $S(x,t)$ is the water saturation, and $\kappa(x)$ is the high-contrast permeability coefficient. The total mobility $\lambda(S)$ and the flux function $f(S)$ are, respectively, given by
$$\lambda(S) = \frac{\kappa_{rw}(S)}{\mu_w} + \frac{\kappa_{ro}(S)}{\mu_o} \quad \text{and} \quad f(S) = \frac{\kappa_{rw}(S)/\mu_w}{\lambda(S)}, \tag{8}$$
where $\kappa_{rj}$ ($j = w, o$) is the relative permeability of the phase $j$ and $\mu_j$ ($j = w, o$) are the respective fluid viscosities.
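The two functions in Eq. (8) are simple to evaluate once relative permeability curves are fixed. The sketch below uses the quadratic curves and the viscosities $\mu_w = 1$, $\mu_o = 5$ that are quoted later in Sect. 4.4; the function names are our own illustrative choices.

```python
def total_mobility(S, mu_w=1.0, mu_o=5.0):
    """Total mobility lambda(S) for quadratic relative permeabilities."""
    k_rw = S ** 2                 # water relative permeability
    k_ro = (1.0 - S) ** 2         # oil relative permeability
    return k_rw / mu_w + k_ro / mu_o

def fractional_flow(S, mu_w=1.0, mu_o=5.0):
    """Water flux function f(S) = (k_rw / mu_w) / lambda(S)."""
    return (S ** 2 / mu_w) / total_mobility(S, mu_w, mu_o)
```

With these choices $f(0) = 0$ and $f(1) = 1$, and $f$ is S-shaped in between, which is what produces sharp saturation fronts in the transport equation (7).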
2.2 Coarse Mesh Description
To describe the general solution framework for the model equations in this chapter, we first introduce the notion of fine and coarse grids. We let $\mathcal{T}^H$ be a usual conforming partition of the computational domain $D$ into finite elements (triangles, quadrilaterals, tetrahedra, etc.). We refer to this partition as the coarse grid and assume that each coarse subregion is partitioned into a connected union of fine-grid blocks. The fine-grid partition will be denoted by $\mathcal{T}^h$. We use $\{x_i\}_{i=1}^{N_v}$ (where $N_v$ is the number of coarse nodes) to denote the vertices of the coarse mesh $\mathcal{T}^H$ and define the neighborhood of the node $x_i$ by
$$\omega_i = \bigcup \{ K_j \in \mathcal{T}^H ;\ x_i \in \overline{K}_j \}. \tag{9}$$
Fig. 1 Schematic of a coarse element and coarse neighborhood
See Fig. 1 for an illustration of neighborhoods and elements subordinated to the coarse discretization. We emphasize the use of $\omega_i$ to denote a coarse neighborhood and $K$ to denote a coarse element. To keep the notation simple, throughout, we use the same symbols to denote continuous fields and their discrete approximations.
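The definition in Eq. (9) can be sketched for a structured coarse grid of rectangles as follows. The row-wise node and element numbering and the name `coarse_neighborhoods` are assumptions made only for this illustration.

```python
def coarse_neighborhoods(nx, ny):
    """Map each coarse node to the coarse elements whose closure contains it.

    The coarse grid is an nx-by-ny array of rectangles. Node (i, j) has index
    j * (nx + 1) + i and element (ex, ey) has index ey * nx + ex.
    """
    omega = {j * (nx + 1) + i: [] for j in range(ny + 1) for i in range(nx + 1)}
    for ey in range(ny):
        for ex in range(nx):
            elem = ey * nx + ex
            # visit the four corner nodes of element (ex, ey)
            for j in (ey, ey + 1):
                for i in (ex, ex + 1):
                    omega[j * (nx + 1) + i].append(elem)
    return omega

omega = coarse_neighborhoods(2, 2)
```

On this 2 x 2 example an interior node belongs to four coarse elements, an edge node to two, and a corner node to one.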
3 GMsFEM Framework
In this section we describe the details associated with the implementation of an offline-online procedure for constructing GMsFEM basis functions. We note that this procedure is applicable to the general case in which the coefficient of a system (such as those in Sect. 2.1) depends on a parameter $\mu$. That is, we assume, in general, that $\kappa = \kappa(x;\mu)$ for the model problems that we consider. A general outline for the procedure is offered below.
1. Offline computations:
– 1.0. Coarse grid generation.
– 1.1. Construction of snapshot space that will be used to compute an offline space.
– 1.2. Construction of a small-dimensional offline space by performing dimension reduction in the space of global snapshots.
2. Online computations:
– 2.1. For each input parameter, compute multiscale basis functions.
– 2.2. Solution of a coarse-grid problem for any force term and boundary condition.
– 2.3. Iterative solvers, if needed.
3.1 Local Basis Functions
In the offline computation, we first construct a snapshot space $V_{\text{snap}}^{\tau}$, corresponding to either the continuous Galerkin (CG) or discontinuous Galerkin (DG) formulation. Construction of the snapshot space involves solving the local problems for various choices of input parameters on a specified coarse subdomain $\tau$, where $\tau$ denotes coarse neighborhood-based computations for a CG formulation and coarse element-based computations for a DG formulation (Efendiev et al. 2014a). See Fig. 1 for an illustration of the respective coarse subdomains. For brevity of notation we now omit the superscript $\tau$ for eigenvalue problems, yet it is assumed throughout this section that the offline and online space computations are localized to respective coarse subdomains. The snapshots can be generated in various ways. One option is to use all fine-grid functions within a coarse block. One can also use harmonic extension or oversampling techniques (e.g., Efendiev et al. 2014b). Without loss of generality, we denote the snapshots by $\psi_l^{\text{snap}}$ and the space of snapshots by
$$V_{\text{snap}} = \operatorname{span}\{ \psi_l^{\text{snap}} : 1 \le l \le L_i \},$$
where $L_i$ denotes the number of snapshots in the domain $\omega_i$ (or $\tau$). To ensure adequate accuracy, the number of small eigenvalues to be used in the construction should be at least the number of high-contrast inclusions on the coarse subdomain (Efendiev et al. 2011). We reorder the snapshot functions using a single index to create the matrix
$$R_{\text{snap}} = \big[ \psi_1^{\text{snap}}, \ldots, \psi_{M_{\text{snap}}}^{\text{snap}} \big],$$
where $M_{\text{snap}}$ denotes the total number of functions to keep in the snapshot matrix construction. In our simulations, we use fine-grid vectors as the space of snapshots.
In order to construct the offline space $V_{\text{off}}$, we perform a dimension reduction of the space of snapshots using an auxiliary spectral decomposition. The main objective is to use the offline space to efficiently (and accurately) construct a set of multiscale basis functions for each $\mu$ value in the online stage. More precisely, we seek a subspace of the snapshot space such that it can approximate any element of the snapshot space in the appropriate sense defined via auxiliary bilinear forms. At the offline stage the bilinear forms are chosen to be parameter independent, such that there is no need to reconstruct the offline space for each $\mu$ value. The analysis in Efendiev et al. (2011) motivates the following eigenvalue problem in the space of snapshots:
$$A^{\text{off}} \Psi_k^{\text{off}} = \lambda_k^{\text{off}} S^{\text{off}} \Psi_k^{\text{off}}, \tag{10}$$
where
$$A^{\text{off}} = [a_{mn}^{\text{off}}] = \int_{\tau} \kappa(x,\bar{\mu})\, \nabla\psi_m^{\text{snap}} \cdot \nabla\psi_n^{\text{snap}} = R_{\text{snap}}^T A R_{\text{snap}}$$
and
$$S^{\text{off}} = [s_{mn}^{\text{off}}] = \int_{\tau} \tilde{\kappa}(x,\bar{\mu})\, \psi_m^{\text{snap}} \psi_n^{\text{snap}} = R_{\text{snap}}^T S R_{\text{snap}},$$
where $\kappa(x,\bar{\mu})$ and $\tilde{\kappa}(x,\bar{\mu})$ are domain-based averaged coefficients with $\bar{\mu}$ chosen as the average of preselected $\mu_i$'s, and the form for $\tilde{\kappa}$ will be offered later (cf. (13)). We note that $A$ and $S$ denote analogous fine-scale matrices that use fine-grid basis functions. To generate the offline space, we then choose the smallest $M_{\text{off}}$ eigenvalues from Eq. (10) and form the corresponding eigenvectors in the space of snapshots by setting $\psi_k^{\text{off}} = \sum_j \Psi_{kj}^{\text{off}} \psi_j^{\text{snap}}$ (for $k = 1, \ldots, M_{\text{off}}$), where $\Psi_{kj}^{\text{off}}$ are the coordinates of the vector $\Psi_k^{\text{off}}$. We then create the offline matrix
$$R_{\text{off}} = \big[ \psi_1^{\text{off}}, \ldots, \psi_{M_{\text{off}}}^{\text{off}} \big]$$
to be used in the online space construction.
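The offline construction around Eq. (10) can be sketched with dense linear algebra on a single coarse subdomain. The following is only an illustration under stated assumptions: a one-dimensional subdomain, a Neumann stiffness matrix, a lumped mass matrix weighted by $\kappa$ in place of $\tilde{\kappa}$, and all fine-grid vectors as snapshots. The name `offline_basis` is hypothetical.

```python
import numpy as np

def offline_basis(A, S, R_snap, m_off):
    """Keep the m_off snapshot combinations with the smallest generalized eigenvalues.

    A, S   : fine-scale stiffness and weighted mass matrices on one coarse subdomain
    R_snap : matrix whose columns are the snapshot vectors
    Returns the selected eigenvalues (ascending) and the offline matrix R_off.
    """
    A_off = R_snap.T @ A @ R_snap
    S_off = R_snap.T @ S @ R_snap
    # Reduce A_off Psi = lambda S_off Psi to a standard symmetric problem via
    # the Cholesky factorization S_off = L L^T (S_off must be positive definite).
    L = np.linalg.cholesky(S_off)
    L_inv = np.linalg.inv(L)
    lam, Y = np.linalg.eigh(L_inv @ A_off @ L_inv.T)    # ascending eigenvalues
    Psi = L_inv.T @ Y                                   # back-transformed eigenvectors
    return lam[:m_off], R_snap @ Psi[:, :m_off]

# Illustrative 1D subdomain with one high-contrast inclusion.
n = 20
h = 1.0 / n
kappa = np.ones(n)
kappa[8:12] = 1.0e4
A = np.zeros((n + 1, n + 1))
S = np.zeros((n + 1, n + 1))
for e in range(n):
    A[e:e + 2, e:e + 2] += kappa[e] / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
    S[e:e + 2, e:e + 2] += kappa[e] * h / 2.0 * np.eye(2)   # lumped, kappa-weighted
R_snap = np.eye(n + 1)                                      # all fine-grid vectors
lam, R_off = offline_basis(A, S, R_snap, m_off=3)
```

Because no boundary condition is imposed on the local stiffness matrix in this sketch, the smallest eigenvalue is zero and its eigenvector is constant; the remaining small eigenvalues are the ones associated with the inclusion.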
For a given input parameter (in the parameter-dependent case), we next construct the associated online coarse space $V_{\text{on}}(\mu)$ for each $\mu$ value on each coarse subdomain. In principle, we want this to be a small-dimensional subspace of the offline space for computational efficiency. The online coarse space will be used within the finite element framework to solve the original global problem, where a continuous or discontinuous Galerkin coupling of the multiscale basis functions is used to compute the global solution. In particular, we seek a subspace of the offline space such that it can approximate any element of the offline space in an appropriate sense. We note that at the online stage, the bilinear forms are chosen to be parameter dependent. Similar analysis (see Efendiev et al. 2011) motivates the following eigenvalue problem in the offline space:
$$A^{\text{on}}(\mu) \Psi_k^{\text{on}} = \lambda_k^{\text{on}} S^{\text{on}}(\mu) \Psi_k^{\text{on}}, \tag{11}$$
where
$$A^{\text{on}}(\mu) = [a^{\text{on}}(\mu)_{mn}] = \int_{\tau} \kappa(x;\mu)\, \nabla\psi_m^{\text{off}} \cdot \nabla\psi_n^{\text{off}} = R_{\text{off}}^T A(\mu) R_{\text{off}},$$
$$S^{\text{on}}(\mu) = [s^{\text{on}}(\mu)_{mn}] = \int_{\tau} \tilde{\kappa}(x;\mu)\, \psi_m^{\text{off}} \psi_n^{\text{off}} = R_{\text{off}}^T S(\mu) R_{\text{off}},$$
and $\kappa(x;\mu)$ and $\tilde{\kappa}(x;\mu)$ are now parameter dependent. To generate the online space, we then choose the smallest $M_{\text{on}}$ eigenvalues from Eq. (11) and form the corresponding eigenvectors in the offline space by setting $\psi_k^{\text{on}} = \sum_j \Psi_{kj}^{\text{on}} \psi_j^{\text{off}}$ (for $k = 1, \ldots, M_{\text{on}}$), where $\Psi_{kj}^{\text{on}}$ are the coordinates of the vector $\Psi_k^{\text{on}}$. We note that in the case when the coefficient is independent of the parameter, then $V_{\text{on}} = V_{\text{off}}$. In other words, the online space discussion is limited to the case where the coefficient is parameter dependent.
3.2 Global Coupling
Throughout the remainder of this section, we consider the single-phase, incompressible flow model from Eq. (1) as a representative example of the model reduction process. However, we emphasize that the use of GMsFEM extends to the variety of applications introduced in Sect. 2.1. More details pertaining to each additional application will be offered in Sect. 4.
Continuous Galerkin Coupling
In this subsection we aim to create an appropriate solution space and variational formulation that is suitable for a continuous Galerkin approximation. We begin with an initial coarse space $V^{\text{init}}(\mu) = \operatorname{span}\{\chi_i\}_{i=1}^{N_v}$, where the $\chi_i$ are the standard multiscale partition of unity functions defined by
$$-\operatorname{div}\big(\kappa(x;\mu)\,\nabla\chi_i\big) = 0 \ \text{ in } K \in \omega_i, \qquad \chi_i = g_i \ \text{ on } \partial K, \tag{12}$$
for all $K \in \omega_i$, where $g_i$ is assumed to be linear. Referring back to, e.g., Eq. (10), we note that the summed, pointwise energy $\tilde{\kappa}$ required for the eigenvalue problems will be defined as
$$\tilde{\kappa} = \kappa \sum_{i=1}^{N_v} H^2 |\nabla\chi_i|^2, \tag{13}$$
where $H$ denotes the coarse mesh size. We then multiply the partition of unity functions by the eigenfunctions in the online space $V_{\text{on}}^{\omega_i}$ to construct the resulting basis functions
$$\psi_{i,k}^{\text{CG}} = \chi_i\, \psi_k^{\omega_i,\text{on}} \quad \text{for } 1 \le i \le N_v \text{ and } 1 \le k \le M_{\text{on}}^{\omega_i}, \tag{14}$$
where $M_{\text{on}}^{\omega_i}$ denotes the number of online eigenvectors that are chosen for each coarse node $i$. We note that the construction in Eq. (14) yields inherently continuous basis functions due to the multiplication of online eigenvectors with the initial (continuous) partition of unity. This convention is not necessary for the discontinuous Galerkin global coupling and is a focal point of contrast between the respective methods. However, with the continuous basis functions in place, we define the continuous Galerkin spectral multiscale space as
$$V_{\text{on}}^{\text{CG}}(\mu) = \operatorname{span}\big\{ \psi_{i,k}^{\text{CG}} : 1 \le i \le N_v \text{ and } 1 \le k \le M_{\text{on}}^{\omega_i} \big\}. \tag{15}$$
Using a single index notation, we may write $V_{\text{on}}^{\text{CG}}(\mu) = \operatorname{span}\{\psi_i^{\text{CG}}\}_{i=1}^{N_c}$, where $N_c$ denotes the total number of basis functions that are used in the coarse space construction. We also construct an operator matrix $R^T = \big[\psi_1^{\text{CG}}, \ldots, \psi_{N_c}^{\text{CG}}\big]$ (where $\psi_i^{\text{CG}}$ are used to denote the nodal values of each basis function defined on the fine grid), for use in this and later sections. Using this space to solve the single-phase incompressible flow equation in (1), we search for $p_{ms}^{\text{CG}}(x;\mu) = \sum_i c_i \psi_i^{\text{CG}}(x;\mu) \in V_{\text{on}}^{\text{CG}}$ such that
$$a^{\text{CG}}\big(p_{ms}^{\text{CG}}, v; \mu\big) = (q, v) \quad \text{for all } v \in V_{\text{on}}^{\text{CG}}, \tag{16}$$
where $a^{\text{CG}}(p, v; \mu) = \int_D \kappa(x;\mu)\, \nabla p \cdot \nabla v \, dx$ and $(q, v) = \int_D q v \, dx$. We note that the variational form in (16) yields the following linear algebraic system
$$A_c P_c^{\text{CG}} = Q_c, \tag{17}$$
where $P_c^{\text{CG}}$ denotes the nodal values of the discrete CG solution, and
$$A_c(\mu) = [a_{IJ}] = \int_D \kappa(x;\mu)\, \nabla\psi_I^{\text{CG}} \cdot \nabla\psi_J^{\text{CG}} \, dx \quad \text{and} \quad Q_c = [q_I] = \int_D q\, \psi_I^{\text{CG}} \, dx.$$
Using the operator matrix $R^T$, we may write $A_c(\mu) = R A(\mu) R^T$ and $Q_c = R Q$, where $A(\mu)$ and $Q$ are the standard fine-scale stiffness matrix and forcing vector corresponding to the form in Eq. (16). We also note that the operator matrix may be analogously used in order to project coarse-scale solutions onto the fine grid.
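The algebra behind Eq. (17), namely $A_c = R A R^T$, $Q_c = R Q$, and the prolongation $R^T P_c$, can be sketched in a few lines. The small tridiagonal matrix and the two-row operator matrix below are placeholders chosen only to exercise the projection; they are not multiscale basis functions.

```python
import numpy as np

def coarse_solve(A, Q, R):
    """Solve the projected system (R A R^T) P_c = R Q and prolong with R^T.

    Rows of R hold the fine-grid nodal values of the coarse basis functions.
    Returns the coarse coefficients and their representation on the fine grid.
    """
    A_c = R @ A @ R.T
    Q_c = R @ Q
    P_c = np.linalg.solve(A_c, Q_c)
    return P_c, R.T @ P_c

# Placeholder data: a symmetric positive definite matrix and two basis vectors.
n_fine = 6
A = 2.0 * np.eye(n_fine) - np.eye(n_fine, k=1) - np.eye(n_fine, k=-1)
R = np.vstack([np.ones(n_fine), np.linspace(0.0, 1.0, n_fine)])
P_exact = 2.0 * R[0] - 3.0 * R[1]          # a fine vector lying in the coarse space
P_c, P_fine = coarse_solve(A, A @ P_exact, R)
```

When the fine-scale solution lies in the span of the rows of $R$, as in this example, the Galerkin projection reproduces it exactly; otherwise it returns the best approximation in the energy norm induced by $A$.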
Discontinuous Galerkin Coupling
One can also use the discontinuous Galerkin (DG) approach (see also Arnold et al. 2001; Dryja 2003; Rivière 2008) to couple multiscale basis functions. This may avoid the use of the partition of unity functions; however, a global formulation needs to be chosen carefully. We have been investigating the use of DG coupling, and the detailed results will be presented elsewhere; see Efendiev et al. (2014). Here, we would like to briefly mention a general global coupling that can be used. The global formulation is given by
$$a^{\text{DG}}(p, v; \mu) = q(v) \quad \text{for all } v = \{ v_K \in V_K \}, \tag{18}$$
where
$$a^{\text{DG}}(p, v; \mu) = \sum_K a_K^{\text{DG}}(p, v; \mu) \quad \text{and} \quad q(v) = \sum_K \int_K q\, v_K \, dx, \tag{19}$$
for all $p = \{p_K\}$, $v = \{v_K\}$ with $K$ being a coarse element depicted in Fig. 1. Each local bilinear form $a_K^{\text{DG}}$ is given as a sum of three bilinear forms:
$$a_K^{\text{DG}}(p, v; \mu) := a_K(p, v; \mu) + r_K(p, v; \mu) + s_K(p, v; \mu), \tag{20}$$
where $a_K$ is the bilinear form,
$$a_K(p, v; \mu) := \int_K \kappa_K(x;\mu)\, \nabla p_K \cdot \nabla v_K \, dx, \tag{21}$$
where $\kappa_K(x;\mu)$ is the restriction of $\kappa(x;\mu)$ in $K$; $r_K$ is the symmetric bilinear form,
$$r_K(p, v; \mu) := \sum_{E \subset \partial K} \frac{1}{l_E} \int_E \tilde{\kappa}_E(x;\mu) \left( \frac{\partial p_K}{\partial n_K} (v_{K'} - v_K) + \frac{\partial v_K}{\partial n_K} (p_{K'} - p_K) \right) ds,$$
where $\tilde{\kappa}_E(x;\mu)$ is the harmonic average of $\kappa(x;\mu)$ along the edge $E$, $l_E = 1$ if $E$ is on the boundary of the macrodomain, and $l_E = 2$ if $E$ is an inner edge of the macrodomain. Here, $K'$ and $K$ are two coarse-grid elements sharing the common edge $E$; and $s_K$ is the penalty bilinear form,
$$s_K(p, v; \mu) := \sum_{E \subset \partial K} \frac{1}{l_E} \frac{1}{h_E}\, \delta_E \int_E \tilde{\kappa}_E(x;\mu)\, (p_{K'} - p_K)(v_{K'} - v_K) \, ds. \tag{22}$$
Here $h_E$ is the harmonic average of the lengths of the edges $E$ and $E'$, and $\delta_E$ is a positive penalty parameter that needs to be selected; its choice affects the performance of GMsFEM. Because the DG formulation is inherently nonconforming, the partition of unity functions used in Eq. (14) are not needed when constructing the basis functions. Similarly, we can obtain the discontinuous Galerkin spectral multiscale space as
$$V_{\text{on}}^{\text{DG}}(\mu) = \operatorname{span}\big\{ \psi_k^{\text{DG}} : 1 \le k \le M_{\text{on}}^{K} \big\}, \tag{23}$$
for every coarse element $K$. Using the same process as in the continuous Galerkin formulation, we can obtain an operator matrix constructed from the basis functions of $V_{\text{on}}^{\text{DG}}(\mu)$. For consistency of notation, we denote the matrix as $R$, with $R^T = \big[\psi_1^{\text{DG}}, \ldots, \psi_{N_c}^{\text{DG}}\big]$. Recall that $N_c$ denotes the total number of coarse basis functions. Solving the single-phase model in Eq. (1) in the coarse space $V_{\text{on}}^{\text{DG}}(\mu)$ using the DG formulation described in (18) is equivalent to seeking $p_{ms}^{\text{DG}}(x;\mu) = \sum_i c_i \psi_i^{\text{DG}}(x;\mu) \in V_{\text{on}}^{\text{DG}}$ such that
$$a^{\text{DG}}\big(p_{ms}^{\text{DG}}, v; \mu\big) = q(v) \quad \text{for all } v \in V_{\text{on}}^{\text{DG}}, \tag{24}$$
where $a^{\text{DG}}(p, v; \mu)$ and $q(v)$ are defined in Eq. (19). Similar to the CG case, we can obtain a coarse linear algebraic system
$$A_c P_c^{\text{DG}} = Q_c, \tag{25}$$
where $P_c^{\text{DG}}$ denotes the discrete coarse DG solution, and
$$A_c(\mu) = R A(\mu) R^T \quad \text{and} \quad Q_c = R Q,$$
where $A(\mu)$ and $Q$ are the standard, fine-scale stiffness matrix and forcing vector corresponding to the form in Eq. (19). After solving the coarse system, we can use the operator matrix $R$ to obtain the fine-scale solution in the form of $R^T P_c^{\text{DG}}$.
4 GMsFEM for Applications
4.1 Single-Phase Compressible Flow
To illustrate the effectiveness of GMsFEM we first offer a set of results associated with solving the single-phase, compressible flow model in Eq. (2). For the comparisons we solve the problem on a global domain $D = [0,1] \times [0,1]$, use a forcing term of $q = 10$, and the high-contrast permeability field from Fig. 2. Additionally, we consider zero Dirichlet boundary conditions and an initial condition $p(x,0) = \sin(2\pi x)\sin(2\pi y)$. We use a $100 \times 100$ fine mesh which yields a system of size $N_f = 10{,}201$, and we consider a variety of coarse space dimensions to compute the GMsFEM solutions. We emphasize that by doing so, we solve reduced-order systems which are significantly smaller than the fully resolved counterpart.
To solve Eq. (2) using the finite element method (FEM), we search for $p_h(t) \in V^h = \operatorname{span}\{\phi_i\}_{i=1}^{N_f}$, where $\phi_i$ are the standard bilinear finite element basis functions defined on $\mathcal{T}^h$ and $N_f$ denotes the number of nodes on the fine grid. After multiplying the equation by appropriate test functions and integrating over the domain $D$, we obtain the following set of ordinary differential equations corresponding to the model equation given in (2):
$$M \frac{dP}{dt} + A P = Q, \tag{26}$$
where $P = [p_i(t)]$ denotes the time-dependent nodal pressure values, $M$ is the mass matrix given by $M = [m_{ij}] = \int_D \phi_i \phi_j$, $A$ is the stiffness matrix given by $A = [a_{ij}] = \int_D \kappa\, \nabla\phi_i \cdot \nabla\phi_j$, and $Q$ is the forcing vector given by $Q = [q_i] = \int_D q\, \phi_i$. Using the implicit backward Euler scheme for the time marching process yields
$$P^{n+1} = (M + \Delta t A)^{-1} M P^n + (M + \Delta t A)^{-1} \Delta t\, Q, \tag{27}$$
Fig. 2 The high-contrast coefficient used for the compressible single-phase and two-phase examples
where $n$ denotes the time stepping index and $\Delta t$ is the time step. The fine discretization yields large matrices of size $N_f \times N_f$ which may become prohibitively expensive to handle numerically. Using a procedure analogous to that in Sect. 3.2, we may express the coarse-scale analogue of (27) as
$$P_c^{n+1} = (M_c + \Delta t A_c)^{-1} M_c P_c^n + (M_c + \Delta t A_c)^{-1} \Delta t\, Q_c, \tag{28}$$
where $M_c = R M R^T$, $A_c = R A R^T$, and $Q_c = R Q$, and we recall that $R$ is the operator matrix built from the final space of enriched multiscale basis functions. The resulting coarse matrices in (28) are of size $N_c \times N_c$, where $N_c$ is significantly smaller than $N_f$. Thus, the coarse system in (28) offers a suitable local model reduction of the fine system in (27). For more details regarding the GMsFEM solution procedure for this type of problem, we refer the interested reader to Ghommem et al. (2013).
See Fig. 3 for an illustration of fine and reduced-order solutions advancing in time. While the solution profiles for the case when $N_c = 202$ show some slight differences, we emphasize that the solution profiles for the case when $N_c = 526$ are nearly indistinguishable from the fine-scale reference fields. Thus, we see that the addition of more basis functions noticeably improves the solution accuracy. For a more rigorous comparison, we also offer relative $L^2$ error quantities $E(t) = \|p(t) - p_c(t)\| / \|p(t)\| \times 100\,\%$ in Fig. 4. As expected, we see that as the dimension of the coarse space increases, the respective errors decrease.
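The time marching in Eqs. (27) and (28) has the same structure on the fine and coarse levels, so a single routine can serve both. The sketch below, with hypothetical names `backward_euler` and `relative_error_percent`, assumes dense matrices and a time-independent forcing vector.

```python
import numpy as np

def backward_euler(M, A, Q, P0, dt, n_steps):
    """Advance M dP/dt + A P = Q with the implicit (backward) Euler scheme."""
    lhs = M + dt * A
    P = np.array(P0, dtype=float)
    history = [P.copy()]
    for _ in range(n_steps):
        # (M + dt A) P^{n+1} = M P^n + dt Q
        P = np.linalg.solve(lhs, M @ P + dt * Q)
        history.append(P.copy())
    return np.array(history)

def relative_error_percent(p, p_c):
    """Relative error E = ||p - p_c|| / ||p|| * 100 in the discrete 2-norm."""
    p = np.asarray(p, dtype=float)
    p_c = np.asarray(p_c, dtype=float)
    return 100.0 * np.linalg.norm(p - p_c) / np.linalg.norm(p)
```

Passing $M_c$, $A_c$, $Q_c$ instead of $M$, $A$, $Q$ gives the reduced system; the coarse iterates would then be prolonged with $R^T$ before being compared with the fine-scale reference.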
4.2 Wave Equation
In solving the wave equation from Eq. (4), we use zero Dirichlet boundary conditions and the initial condition $p(x,0) = 0$. To solve the equation using FEM, we search for $p_h(t) \in V^h$, multiply the original equation by appropriate test functions, and integrate over the domain $D$ to arrive at the following system of equations:
$$M \frac{d^2 P}{dt^2} + A P = Q, \tag{29}$$
where $P = [p_i(t)]$ is the nodal pressure wavefield and $M$, $A$, and $Q$ are analogously defined quantities from Eq. (26). Using the central finite difference method for time
Fig. 3 Single-phase pressure profiles advancing in time for increasing coarse space dimensions
Fig. 4 Errors between reference (fine) pressure solutions and solutions obtained from a variety of coarse space dimensions
stepping and applying an explicit scheme, we obtain a fully discrete system of the form
$$M \frac{P^{n+1} - 2P^n + P^{n-1}}{\Delta t^2} + A P^n = Q^n, \tag{30}$$
where $n$ denotes the time stepping index. We note that square matrices of this system will have dimension $N_f \times N_f$, where $N_f$ is the dimension of the fine-scale discretization space. Using an analogous procedure from Sect. 4.1, we use the operator matrix $R$ to construct the coarse-scale system
$$M_c \frac{P_c^{n+1} - 2P_c^n + P_c^{n-1}}{\Delta t^2} + A_c P_c^n = Q_c^n, \tag{31}$$
where $P_c$ is the pressure wavefield in the coarse space, $M_c = R M R^T$, $A_c = R A R^T$, and $Q_c = R Q$. The resulting system is of size $N_c$ where $N_c \ll N_f$, thus offering an efficient reduced-order alternative to the fully resolved system. For more details regarding the GMsFEM solution procedure for this type of problem, we refer the interested reader to Fu et al. (2013).
We test our method using the model shown in Fig. 5. The model has size 1 km in both the $x$ and $z$ directions, and for the fine-scale discretization, we use a $500 \times 500$ mesh. This yields a fully resolved system of size $N_f = 251{,}001$. The source is a Ricker wavelet with central frequency 18.0 Hz, located at $(0.5, 0.1)$ km of the domain. There are three curved reflectors in this model, and between the three reflectors there are many randomly distributed inclusions with low velocities. In Fig. 6 we offer a set of wavefield snapshots at time $t = 0.3$ s. Fig. 6a shows
Fig. 5 The high-contrast coefficient used for the wave equation example
Fig. 6 Wavefield solution snapshots from fully resolved and coarse spaces. (a) FEM ($N_f = 251{,}001$). (b) GMsFEM ($N_c = 2{,}980$). (c) GMsFEM ($N_c = 5{,}860$)
the fine-scale reference wavefield, and Fig. 6b, c show the wavefields corresponding to different coarse approximations. Both solutions were computed using a $25 \times 25$ coarse mesh, where two different levels of space enrichment were used. In particular, the solution in Fig. 6b corresponds to a coarse system of size $N_c = 2{,}980$, and the solution in Fig. 6c corresponds to a larger system of size $N_c = 5{,}860$ (both of which are significantly smaller than $N_f$ in this example). In addition to the illustrative validation of the solution quality, we mention that the respective $L^2$ errors are 12.76 and 7.6 % for these cases. As such, we see that GMsFEM can accurately recover both the primary wavefront and the wavefield scattered by the random inclusions.
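Solving Eq. (30) or (31) for the newest time level gives $P^{n+1} = 2P^n - P^{n-1} + \Delta t^2 M^{-1}(Q^n - A P^n)$, which is what the following sketch implements. The routine name `leapfrog`, the dense solve with $M$, and the delay parameter `t0` of the Ricker wavelet are assumptions of this illustration; the chapter only states the central frequency of the source.

```python
import numpy as np

def ricker(t, f0=18.0, t0=0.1):
    """Ricker wavelet with central frequency f0 (Hz), delayed by t0 (assumed)."""
    a = (np.pi * f0 * (t - t0)) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def leapfrog(M, A, Q_of_t, P_prev, P_curr, dt, n_steps):
    """Explicit central-difference stepping for M d2P/dt2 + A P = Q(t).

    P_prev and P_curr are the states at t = 0 and t = dt; Q_of_t(t) returns
    the load vector at time t. Returns the state at t = (n_steps + 1) * dt.
    """
    t = dt
    for _ in range(n_steps):
        rhs = Q_of_t(t) - A @ P_curr
        P_next = 2.0 * P_curr - P_prev + dt ** 2 * np.linalg.solve(M, rhs)
        P_prev, P_curr = P_curr, P_next
        t += dt
    return P_curr
```

As with any explicit central-difference scheme, the time step must respect a stability limit set by the largest eigenvalue of $M^{-1}A$; the reduced matrices $M_c$ and $A_c$ can be passed in exactly the same way.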
4.3 Nonlinear Flows
In this subsection, we consider nonlinear, single-phase flow governed by an equation of the form
$$-\operatorname{div}\big(\kappa(x;p)\,\nabla p\big) = q \quad \text{in } D, \tag{32}$$
Fig. 7 The high-contrast coefficient used for the single-phase, nonlinear example
where $p = 0$ on $\partial D$. We use a source term $q = 0.1$ and solve the problem on the unit two-dimensional domain $D = [0,1] \times [0,1]$. We take $\kappa(x;p) = e^{\kappa(x) p(x)}$ in the following numerical test, where $\kappa(x)$ is shown in Fig. 7. In order to solve Eq. (32), we will consider a Picard iteration
$$-\operatorname{div}\big(\kappa(x; p^n(x))\,\nabla p^{n+1}(x)\big) = q, \tag{33}$$
where superscripts involving $n$ denote respective iteration levels. In our simulations, we partition the original domain using a coarse mesh of size $H = 1/10$ and use a fine mesh composed of uniform triangular elements of mesh size $h = 1/100$. We take the initial guess $p^0 = 0$ and terminate the iterative loop when $\|A_c(P_c^{n+1}) P_c^{n+1} - Q_c\| \le \delta \|Q_c\|$, where $\delta$ is the tolerance for the iteration and we select $\delta = 10^{-3}$. For this set of examples, we use a discontinuous Galerkin formulation where $A_c$ and $Q_c$ correspond to the linear system resulting from the global formulation in Sect. 3.2. In particular, we solve the problem as follows:
$$A_c(P_c^n)\, P_c^{n+1} = Q_c \quad \text{for } n = 0, 1, \ldots. \tag{34}$$
For all cases in this subsection, the global iteration resulting from the linearization converges in 4 or 5 iterations. For more details regarding the GMsFEM solution procedure for this type of problem, we refer the interested reader to Efendiev et al. (2014a). The simulation results are presented in Table 1. The first column shows the dimension of the online space, the second column gives the corresponding eigenvalue of the first eigenfunction discarded from the online space, and the next two columns give the interior energy relative error $(E_{\text{int}})$ and the
Table 1 DG relative errors corresponding to the permeability field in Fig. 7; snapshot space uses $l_{\text{init}}^{K}$ eigenfunctions

dim($V_{\text{on}}^{\text{DG}}$)   First discarded eigenvalue   GMsFEM relative error (%)            Online-offline relative error (%)
                                                                $E_{\text{int}}$    $E_{\partial}$   $E_{\text{int}}$    $E_{\partial}$
235                                $3.97 \times 10^{-4}$        52.99               10.46            47.83               9.26
375                                $2.13 \times 10^{-4}$        35.68               6.18             21.24               4.19
440                                $1.11 \times 10^{-4}$        35.45               6.09             20.72               4.12
627                                $1.06 \times 10^{-4}$        32.54               5.44             15.19               3.15
844                                $6.08 \times 10^{-5}$        32.22               5.15             14.97               3.13

Fig. 8 Comparison of fine and coarse DG solutions corresponding to Fig. 7. Panels: DG Fine Solution; DG MsFEM: dim($V_{\text{on}}$) = 235; DG GMsFEM: dim($V_{\text{on}}$) = 844
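boundary energy relative error $(E_{\partial})$ between the fine-scale solution and DG GMsFEM solution. We follow the definition of $E_{\text{int}}$ and $E_{\partial}$ as in Efendiev et al. (2014):
$$T = \sum_{i=1}^{N} \big\| \kappa^{1/2} \nabla p_i \big\|_{L^2(\Omega_i)}^2 + \sum_{i=1}^{N} \sum_{E_{ij} \subset \partial\Omega_i} \frac{1}{l_{ij}} \frac{1}{h_{ij}} \int_{E_{ij}} \kappa_{ij} (p_i - p_j)^2,$$
$$E_{\text{int}} = \left( \sum_{i=1}^{N} \big\| \kappa^{1/2} \nabla e_i \big\|_{L^2(\Omega_i)}^2 \Big/ T \right)^{1/2}, \tag{35}$$
$$E_{\partial} = \left( \sum_{i=1}^{N} \sum_{E_{ij} \subset \partial\Omega_i} \frac{1}{l_{ij}} \frac{1}{h_{ij}} \int_{E_{ij}} \kappa_{ij} (e_i - e_j)^2 \Big/ T \right)^{1/2}. \tag{36}$$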
Here, $e = p - p_{ms}$ and $e_i = e|_{E_{ij}}$. The errors between the offline and online solutions are offered in the final two columns. We note that as the dimension of the online space increases, the relative errors decrease accordingly. As shown in Table 1, the interior relative energy errors decrease from 52.99 to 32.22 % and boundary relative energy errors decrease from 10.46 to 5.15 % when the dimension of the online space increases from 235 to 844. In Fig. 8, we also plot the fine and the coarse solutions with the smallest and largest dimension of the online spaces. We note that the fine solution and the coarse solution
corresponding to the smallest online space show some slight differences. However, the discrepancies diminish when the coarse solution is computed within the largest online space.
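The Picard linearization in Eqs. (33) and (34), together with the relative residual stopping test and the initial guess $p^0 = 0$, can be sketched on a one-dimensional fine grid. This is an illustration only: it uses plain linear finite elements rather than the coarse DG system of the chapter, a hypothetical exponent field `alpha_cells` in place of the coefficient from Fig. 7, and our own function names.

```python
import numpy as np

def stiffness_1d(kappa_cells, h):
    """Interior-node stiffness matrix for -(kappa p')' with zero Dirichlet data."""
    n = kappa_cells.size
    A = np.zeros((n + 1, n + 1))
    for e in range(n):
        A[e:e + 2, e:e + 2] += kappa_cells[e] / h * np.array([[1.0, -1.0],
                                                              [-1.0, 1.0]])
    return A[1:-1, 1:-1]

def picard(alpha_cells, q=0.1, delta=1e-3, max_iter=50):
    """Picard iteration A(P^n) P^{n+1} = Q for -(exp(alpha p) p')' = q on (0, 1)."""
    n = alpha_cells.size
    h = 1.0 / n
    Q = np.full(n - 1, q * h)
    P = np.zeros(n - 1)                               # initial guess p^0 = 0

    def assemble(P_int):
        p_full = np.concatenate(([0.0], P_int, [0.0]))
        p_mid = 0.5 * (p_full[:-1] + p_full[1:])      # cell-centered pressure
        return stiffness_1d(np.exp(alpha_cells * p_mid), h)

    for it in range(1, max_iter + 1):
        P = np.linalg.solve(assemble(P), Q)           # coefficient lagged at P^n
        residual = np.linalg.norm(assemble(P) @ P - Q)
        if residual <= delta * np.linalg.norm(Q):
            return P, it
    raise RuntimeError("Picard iteration did not converge")

P_nl, n_iter = picard(np.full(50, 40.0))
```

Each pass solves a linear problem with the coefficient frozen at the previous iterate, so in a GMsFEM setting only the assembly of $A_c(P_c^n)$ changes from one iteration to the next.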
4.4 Multiphase Flow and Transport
To solve the two-phase flow problem from Eqs. (6) and (7), we use quadratic relative permeability curves $\kappa_{rw} = S^2$ and $\kappa_{ro} = (1 - S)^2$, along with $\mu_w = 1$ and $\mu_o = 5$ for the fluid viscosities. For the initial condition, we set the value at the left edge as $S = 1$ and assume that $S(x,0) = 0$ elsewhere. For the pressure equation we assume that $p_L = 1$ and $p_R = 0$ at the left and right boundaries, respectively, and use no flow (zero Neumann) boundary conditions on the top and bottom boundaries. In solving the two-phase model, we note that a standard IMPES (Implicit Pressure Explicit Saturation) solution procedure is used (see Bush et al. 2014). As such, the bulk of the computational cost is devoted to inverting the linear systems associated with the pressure equation
$$-\operatorname{div}\big(\kappa(x)\,\lambda(S)\,\nabla p\big) = q. \tag{37}$$
In addition, we emphasize that the pressure equation must be repeatedly solved due to the need to update the saturation-dependent mobility coefficient. As similarly mentioned in previous subsections, (37) yields linear systems of the form
$$A(S) P = Q,$$
with a fine-scale dimension of $N_f$. In order to construct suitable reduced-order approximations, we may instead solve
$$A_c(S) P_c = Q_c,$$
where $A_c(S) = R A(S) R^T$ and $Q_c = R Q$ are coarsened using an analogous operator matrix $R$. However, in this context, we also require that the resulting fluxes given by $v = -\lambda(S)\,\kappa(x)\,\nabla p$ satisfy a local conservation property. The latter is a delicate topic that will not be directly addressed in this chapter; however, we refer the interested reader to Bush et al. (2014) for a complete description of the post-processing procedure. We use an explicit, non-oscillatory upwinding scheme in order to solve the saturation equation in Eq. (7). For more details regarding the GMsFEM solution procedure for this type of problem, we refer the interested reader to Bush et al. (2014).
Results for application of the method to two-phase flow are shown in Fig. 9 where we use the permeability field in Fig. 2. The illustration shows the reference saturation at three separate time levels, along with saturation profiles corresponding to enriched pressure spaces constructed on a $10 \times 10$ coarse mesh. In particular, we
Fig. 9 Saturation profiles advancing in time for a variety of coarse space dimensions
consider a fine system of dimension $N_f = 10{,}201$ and coarse systems of dimension $N_c = 121$, $202$, and $364$. In Fig. 10 we also plot relative $L^2$ errors corresponding to each configuration for a representative length of time. Figures 9 and 10 both show that an increase in the coarse space dimension yields a pronounced error decline in the respective saturation profiles. Additionally, we observe from Fig. 9 that the solutions corresponding to the case when $N_c = 364$ are nearly indistinguishable from the reference saturation solutions associated with pressure systems of size $N_f = 10{,}201$.
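The explicit saturation update of the IMPES procedure can be illustrated in one dimension with a first-order upwind scheme, which is the simplest non-oscillatory choice. The sketch assumes a constant positive Darcy velocity (in IMPES it would come from the latest pressure solve), no source term, the quadratic relative permeabilities and viscosities quoted above, and hypothetical function names.

```python
def fractional_flow(S, mu_w=1.0, mu_o=5.0):
    """Water fractional flow for quadratic relative permeabilities."""
    water = S ** 2 / mu_w
    oil = (1.0 - S) ** 2 / mu_o
    return water / (water + oil)

def upwind_step(S, v, dt, dx, S_in=1.0):
    """One explicit first-order upwind update of S_t + (f(S) v)_x = 0 for v > 0."""
    # flux[i] enters cell i through its left face, flux[i + 1] leaves it
    flux = [v * fractional_flow(S_in)] + [v * fractional_flow(s) for s in S]
    return [s - dt / dx * (flux[i + 1] - flux[i]) for i, s in enumerate(S)]

# Water injected at the left edge of an initially oil-filled column.
n_cells = 50
dx = 1.0 / n_cells
v = 1.0
dt = 0.2 * dx                    # time step chosen to satisfy the CFL restriction
S = [0.0] * n_cells
for _ in range(50):
    S = upwind_step(S, v, dt, dx)
```

With this time step the update is monotone, so the computed saturation stays between 0 and 1 and decreases away from the inlet; the amount of water in the column equals the injected volume until the front reaches the outlet.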
Fig. 10 Errors between reference (fine) saturation solutions and solutions obtained from a variety of coarse space dimensions
5 Conclusions
In this chapter, we discuss multiscale model reduction through the use of the Generalized Multiscale Finite Element Method (GMsFEM). We outline the basic concepts associated with the systematic enrichment of coarse solution spaces and describe the offline-online procedure that is used in the construction of multiscale basis functions. We then apply GMsFEM to a number of geomathematical applications and illustrate its effectiveness by presenting a number of representative numerical examples. For further details regarding each application, we direct the interested reader to pertinent references.
Acknowledgements We would like to thank Ms. Guanglian Li and Mr. Shubin Fu for their assistance in compiling some results. YE's work is partially supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Award Number DE-FG02-13ER26165. MP's work is partially supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Award Number DE-SC0009286 as part of the DiaMonD Multifaceted Mathematics Integrated Capability Center.
References Aarnes JE (2004) On the use of a mixed multiscale finite element method for greater flexibility and increased speed or improved accuracy in reservoir simulation. SIAM J Multiscale Model Simul 2:421–439 Aarnes JE, Efendiev Y (2008) Mixed multiscale finite element for stochastic porous media flows. SIAM J Sci Comput 30(5):2319–2339
700
Y. Efendiev and M. Presho
Aarnes JE, Hou T (2002) Multiscale domain decomposition methods for elliptic problems with high aspect ratios. Acta Math Appl Sin Engl Ser 18:63–76
Aarnes JE, Krogstad S, Lie K-A (2006) A hierarchical multiscale method for two-phase flow based upon mixed finite elements and nonuniform grids. SIAM J Multiscale Model Simul 5(2):337–363
Aarnes JE, Efendiev Y, Jiang L (2008) Analysis of multiscale finite element methods using global information for two-phase flow simulations. SIAM J Multiscale Model Simul 7:2177–2193
Arbogast T (2002) Implementation of a locally conservative numerical subgrid upscaling scheme for two-phase Darcy flow. Comput Geosci 6:453–481
Arnold DN, Brezzi F, Cockburn B, Marini LD (2001) Unified analysis of discontinuous Galerkin methods for elliptic problems. SIAM J Numer Anal 39:1749–1779
Bush L, Ginting V, Presho M (2014) Application of a conservative, generalized multiscale finite element method to flow models. J Comput Appl Math 260:395–409
Chen Y, Durlofsky L (2007) An ensemble level upscaling approach for efficient estimation of fine-scale production statistics using coarse-scale simulations. In: SPE paper 106086, SPE reservoir simulation symposium, Houston, 26–28 Feb 2007
Chen Y, Durlofsky LJ, Gerritsen M, Wen XH (2003) A coupled local-global upscaling approach for simulating flow in highly heterogeneous formations. Adv Water Resour 26:1041–1060
Dryja M (2003) On discontinuous Galerkin methods for elliptic problems with discontinuous coefficients. Comput Methods Appl Math 3:76–85
Efendiev Y, Hou T (2009) Multiscale finite element methods. Theory and applications. Springer, New York
Efendiev Y, Hou T, Ginting V (2004) Multiscale finite element methods for nonlinear problems and their applications. Commun Math Sci 2:553–589
Efendiev Y, Ginting V, Hou T, Ewing R (2006) Accurate multiscale finite element methods for two-phase flow simulations. J Comput Phys 220(1):155–174
Efendiev Y, Galvis J, Wu XH (2011) Multiscale finite element methods for high-contrast problems using local spectral basis functions. J Comput Phys 230(4):937–955
Efendiev Y, Galvis J, Lazarov R, Willems J (2012) Robust domain decomposition preconditioners for abstract symmetric positive definite bilinear forms. ESAIM: M2AN 46:1175–1199
Efendiev Y, Galvis J, Thomines F (2012) A systematic coarse-scale model reduction technique for parameter-dependent flows in highly heterogeneous media and its applications. SIAM J Multiscale Model Simul 10(4):1317–1343
Efendiev Y, Galvis J, Hou T (2013) Generalized multiscale finite element methods. J Comput Phys 251:116–135
Efendiev Y, Galvis J, Lazarov R, Moon M, Sarkis M (2014) Generalized multiscale finite element method. Symmetric interior penalty coupling. J Comput Phys 255:1–15
Efendiev Y, Galvis J, Li G, Presho M (2014a) Generalized multiscale finite element methods. Nonlinear elliptic equations. Commun Comput Phys 15:733–755
Efendiev Y, Galvis J, Li G, Presho M (2014b) Generalized multiscale finite element methods. Oversampling strategies. Int J Multiscale Com 12:465–484
Fu S, Efendiev Y, Gao K, Gibson R Jr (2013) Multiscale modeling of acoustic wave propagation in 2D heterogeneous media using local spectral basis functions. SEG Technical Program Expanded Abstracts 2013:3553–3558. doi:10.1190/segam2013-1184.1
Ghommem M, Presho M, Calo V, Efendiev Y (2013) Mode decomposition methods for flows in high-contrast porous media. Global-local approach. J Comput Phys 253:226–238
Hou TY, Wu XH (1997) A multiscale finite element method for elliptic problems in composite materials and porous media. J Comput Phys 134:169–189
Hughes T, Feijoo G, Mazzei L, Quincy J (1998) The variational multiscale method – a paradigm for computational mechanics. Comput Methods Appl Mech Eng 166:3–24
Jenny P, Lee SH, Tchelepi H (2003) Multi-scale finite volume method for elliptic problems in subsurface flow simulation. J Comput Phys 187:47–67
Owhadi H, Zhang L (2007) Metric-based upscaling. Commun Pure Appl Math 60:675–723
Rivière B (2008) Discontinuous Galerkin methods for solving elliptic and parabolic equations. Frontiers in applied mathematics, vol 35. Society for Industrial and Applied Mathematics (SIAM), Philadelphia
Efficient Modeling of Flow and Transport in Porous Media Using Multi-physics and Multi-scale Approaches Rainer Helmig, Bernd Flemisch, Markus Wolff, and Benjamin Faigle
Contents
1 Introduction
2 State of the Art
2.1 Definition of Scales
2.2 Upscaling and Multi-scale Methods
2.3 Multi-physics Methods
3 Mathematical Models for Flow and Transport Processes in Porous Media
3.1 Preliminaries
3.2 Multiphase Flow
3.3 Decoupled Formulations
3.4 Non-isothermal Flow
4 Numerical Solution Approaches
4.1 Solution of the Fully Coupled Equations
4.2 Solution of the Decoupled Equations
5 Application of Multi-physics and Multi-scale Methods
5.1 A Multi-physics Example
5.2 A Multi-scale Example
6 Conclusion
References
Abstract
Flow and transport processes in porous media including multiple fluid phases are the governing processes in a large variety of geological and technical systems. In general, these systems include processes of different complexity occurring in different parts of the domain of interest. The different processes mostly also take place on different spatial and temporal scales. It is extremely challenging to model such systems in an adequate way accounting for the spatially varying and scale-dependent character of these processes. In this work, we give a brief overview of existing upscaling, multi-scale, and multi-physics methods, and we present mathematical models and model formulations for multiphase flow in porous media including compositional and non-isothermal flow. Finally, we show simulation results for two-phase flow using a multi-physics and a multi-scale method.
R. Helmig • B. Flemisch • M. Wolff • B. Faigle Department of Hydromechanics and Modeling of Hydrosystems, Institute of Hydraulic Engineering, University of Stuttgart, Stuttgart, Germany e-mail: [email protected] © Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_15
1 Introduction
In a hydrological, technical, or biological system, various processes occur in different parts of the general modeling domain. These processes must be considered on different space and time scales, and they require different model concepts and data. Highly complex processes may take place in one part of the system, necessitating a fine spatial and temporal resolution, while in other parts of the system, physically simpler processes take place, allowing an examination on coarser scales. For an overview and categorization of concepts for temporal and spatial model coupling, we refer to Helmig et al. (2012). Figure 1 shows a sketch of an exemplary porous media system including various important length scales and different types of physical processes.
Fig. 1 A general physical system where different processes occur in different parts of a domain and on different scales
The heterogeneous structure of porous media depends strongly on the spatial scale (see, e.g., Niessner and Helmig 2007). The traditional approach resolves the underlying structure on one scale, which has to be fine enough if an accurate description is desired. Multi-scale algorithms regard scales separately. The connection between two scales is made by up- and downscaling approaches. Figure 1 visualizes this by way of example: local heterogeneous information is integrated into the global flow problem via upscaling techniques. Much research has been done to upscale either the pressure or the saturation equation in two-phase flow or to include the different scales directly in the numerical scheme by using multi-scale finite volumes or elements; see, e.g., Durlofsky (1991), Hou and Wu (1997), Renard and de Marsily (1997), Efendiev et al. (2000), Chen and Hou (2002), Efendiev and Durlofsky (2002), Chen et al. (2003), Jenny et al. (2003), Weinan et al. (2003), and Weinan and Engquist (2003b).
In contrast to the a priori decision about model complexity made in traditional methods, multi-physics approaches allow different model concepts to be applied in different subdomains. In the upper half of Fig. 1, the red plume in the middle of the domain can be simulated by accounting for a multitude of physical processes, whereas the surroundings are approximated by a simpler model abstraction. In this respect, research has advanced in the context of domain decomposition techniques (see, e.g., Wheeler et al. 1999; Yotov 2002) and in the context of mortar finite element techniques that allow multi-physics as well as multi-numerics coupling (see, e.g., Peszynska et al. 2002). The advantage of multi-scale multi-physics algorithms is, on the one hand, that the appropriate model can be applied for each specific scale or physical process. On the other hand, they save computing time or make possible the computation of very complex and large systems that could otherwise not be simulated numerically, at least not at that level of detail.
In the following, as an example application, the storage of carbon dioxide (CO2) in a deep geological formation will be studied, and multi-scale as well as multi-physics aspects in space and time will be identified. Please note that multi-scale and multi-physics aspects are relevant in a large number of additional applications, not only in geological systems but also in biological (e.g., treatment of brain tumors) and technical (e.g., processes in polymer electrolyte membrane fuel cells) systems.
Thus, multi-scale multi-physics techniques developed for geological applications can be transferred to a broad range of other problems.
Concerning CO2 storage, several storage options are commonly considered; they are shown in Fig. 2, which is taken from IPCC (2005). According to that figure, possible storage options are depleted oil and gas reservoirs, the use of CO2 in the petroleum industry to enhance oil and gas recovery, or, in a similar spirit, the injection of CO2 to improve methane production. In addition, deep saline formations, either onshore or offshore, represent possible storage sites.
Fig. 2 Carbon dioxide storage scenarios from IPCC special report on carbon capture and storage, IPCC (2005)
When injecting carbon dioxide, processes take place on highly different spatial and temporal scales. Concerning spatial scales, the processes in the vicinity of the CO2 plume are very complex, including phase change, chemical reactions, etc. Usually, however, the interest lies in the effect of the CO2 injection on larger domains, especially if it is to be investigated whether CO2 is able to migrate to the surface. In the vicinity of the CO2 plume, processes of much higher complexity and much higher fine-scale dependence occur than in the remaining part of the domain. This aspect determines both the spatial multi-scale and the spatial multi-physics character of this application: around the CO2 plume, processes have to be resolved on a fine spatial scale in order to be appropriately accounted for. In the rest of the domain of interest, processes may be resolved on a coarser spatial scale. Additionally, the processes occurring in the plume zone and in the non-plume zone are different: while complex two-phase multicomponent processes including reactions need to be considered near the plume, a single-phase system may be sufficient in other parts of the domain.
With respect to temporal scales, we consider Fig. 3, which is again taken from IPCC (2005). In the early time period, i.e., a few years after the CO2 injection has ceased, the movement of the CO2 is determined by advection-dominated multiphase flow (viscous, buoyant, and capillary effects are relevant). In a later time period, when the CO2 has reached residual saturation everywhere, dissolution and diffusion processes are most decisive for the migration of the carbon dioxide. Eventually, in the very long time range of thousands of years, it is to be expected that the CO2 will be bound by chemical reactions.
This paper is structured as follows: In Sect. 2, we define the relevant scales considered in this work and give an overview of multi-scale and of multi-physics techniques. Next, in Sect. 3, the mathematical model for flow and transport in porous media is described including non-isothermal flow and different mathematical formulations. In Sect. 4, the numerical solution procedures for both decoupled and coupled model formulations are explained. In Sect. 5, we present two different applications of multi-physics and multi-scale algorithms. Finally, we conclude in Sect. 6.
Fig. 3 Time scales of carbon dioxide sequestration, IPCC (2005). Plot of trapping contribution (0 to 100 %) against time after stop of CO2 injection (1 to 10,000 years), showing structural and stratigraphic trapping, residual trapping, solubility trapping, and mineral trapping, with an arrow indicating increasing storage security. Dominating processes named along the time axis: multiphase behavior, advection-dominated (viscous, buoyant, capillary); phase transfer processes, dissolution and diffusion; geochemical reactions.
2 State of the Art
We give a brief introduction to existing multi-scale methods and to methods for scale transfer. First, general definitions of the important scales are given (Sect. 2.1) to point out which scales are considered in the following sections. Afterward, we give a very general overview of basic approaches for upscaling and different kinds of multi-scale methods (Sect. 2.2) and a short introduction to multi-physics methods (Sect. 2.3).
2.1 Definition of Scales
In order to design an appropriate modeling strategy for particular problems, it is important to consider the spatial and temporal scales involved, and how the physical processes and parameters of the system relate to these scales. A careful definition of relevant length scales can clarify any investigation of scale considerations, although such definitions are a matter of choice and modeling approach (Hristopulos and Christakos 1997). In general, we define the following length scales of concern: the molecular length scale, which is of the order of the size of a molecule; the microscale, or the minimum continuum length scale on which individual molecular interactions can be neglected in favor of an ensemble average of molecular collisions; the local scale, which is the minimum continuum length scale at which the microscale description of fluid movement through individual pores can be neglected in favor of averaging the fluid movement over a representative elementary volume (REV) (therefore, this scale is also called the REV scale); the mesoscale, which is a scale on which local scale properties vary distinctly and markedly; and the megascale or field scale. Measurements or observations can yield representative information across this entire range of scales, depending on the aspect of the system observed and the nature of the instrument used to make the observation. For this reason, we do not specifically define a measurement scale.
Figure 4 graphically depicts the range of spatial scales of concern in a typical porous medium system. It illustrates two important aspects of these natural systems: several orders of magnitude in potentially relevant length scales exist, and heterogeneity occurs across the entire range of relevant scales. A similar range of temporal scales exists as well, from the picoseconds over which a chemical reaction can occur on a molecular length scale to the centuries or millennia of concern in the long-term storage of greenhouse gases or atomic waste.
Fig. 4 Different scales for flow in porous media. Labels in the figure: field scale, meso scale, local scale, micro scale; length marks km, m, mm, μm; geological structures, block heterogeneities, intrinsic heterogeneities, single pores, boundary layer; minimum continuum length scale.
When looking at the REV scale, we average over both fluid-phase properties and solid-phase properties. In Fig. 5, we schematically show the averaging behavior using the example of porosity. When averaging over a representative elementary volume (REV), we assume that the averaged property $P$ does not oscillate significantly. In Fig. 5 this is the case in the range from $r_{\min}$ to $r_{\max}$, so an arbitrarily shaped volume $V$ with an inscribed sphere of radius $r_{\min}$ and a circumscribed sphere of radius $r_{\max}$ can be chosen as REV. Accordingly, we do not assume any heterogeneities on the REV scale. For our model, we assume that the effects of the sub-REV scale heterogeneities are taken into account by effective parameters. The scales of interest in this work are the mesoscale (which we also call fine scale) and the megascale (for us, the coarse scale).
Fig. 5 Different scales for flow in porous media (schematically for Fig. 4). Plot of a physical property $P$ (e.g., porosity) against the averaging radius $r$, with the REV length scale marked between $r_{\min}$ and $r_{\max}$; the radius axis runs from μm to km and carries the labels boundary layer, single pores, intrinsic heterogeneities, block heterogeneities, and geological structures.
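The averaging behavior sketched in Fig. 5 can be illustrated with a small numerical experiment. The following is only a minimal sketch under stated assumptions: the medium is a hypothetical, spatially uncorrelated binary pore map with a target porosity of 0.3, and the function name `averaged_porosity` is introduced here for illustration. Because this synthetic medium has no larger-scale heterogeneity, it only reproduces the lower end of the REV range (the oscillations below $r_{\min}$), not the drift beyond $r_{\max}$.

```python
import numpy as np


def averaged_porosity(pore_mask, center, radius):
    """Fraction of pore cells inside a disc of the given radius (in cells)."""
    ny, nx = pore_mask.shape
    yy, xx = np.ogrid[:ny, :nx]
    inside = (yy - center[0]) ** 2 + (xx - center[1]) ** 2 <= radius ** 2
    return pore_mask[inside].mean()


rng = np.random.default_rng(0)
# Hypothetical medium: every cell is pore space with probability 0.3.
pore_mask = rng.random((400, 400)) < 0.3

center = (200, 200)
radii = [1, 2, 5, 10, 20, 50, 100, 150]
phi = [averaged_porosity(pore_mask, center, r) for r in radii]

for r, p in zip(radii, phi):
    print(f"r = {r:4d} cells  ->  averaged porosity = {p:.3f}")
```

For small radii the averaged value fluctuates strongly from one radius to the next; once the averaging disc contains enough cells, it settles near the prescribed value, which is the plateau that the REV concept relies on.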
2.2 Upscaling and Multi-scale Methods
In multi-scale modeling, more than one scale is involved in the modeling, as the name implies. In general, each pair of scales is coupled in a bidirectional way, where the coarser scale contains the finer scale. This means that upscaling and downscaling methods have to be provided. Upscaling is the transition from the finer to the coarser scale, and downscaling is the reverse. Both kinds of operators are generally needed. Only special applications with weak coupling between the scales allow for a monodirectional coupling and thus only upscaling or only downscaling operators. Classical upscaling strategies comprise the method of asymptotic expansions (homogenization) and volume averaging. Usually, the fine-scale information that is lost due to averaging is accounted for by effective parameters in the upscaled equations. For downscaling, the typical methodology is to specify boundary conditions at the boundaries of a coarse-grid block and solve a fine-grid problem in the respective domain. The boundary conditions are obtained either directly from the coarse-scale problem or coarse-scale results are rescaled to fine-scale properties using fine-scale material parameters. In the latter case, fine-grid boundary conditions can be specified along the boundaries of the downscaling domain. In the following, we provide a brief overview of common upscaling techniques and of multi-scale methods.
Upscaling Methods
Volume Averaging/Homogenization Methods
Coarse-scale equations may be derived from known fine-scale equations by applying volume averaging or homogenization methods. Applications of these methods to porous media flow can be found in, for example, Quintard and Whitaker (1988), Sáez et al. (1989), Gray et al. (1993), Whitaker (1998), Efendiev et al. (2000), Panfilov (2000), and Efendiev and Durlofsky (2002). Depending on the assumptions, different kinds of new coarse-scale parameters or functions occur in the upscaled equations, accounting for fine-scale fluctuations that are lost due to the averaging. The problem is to find upscaled or averaged effective model parameters or functions which describe the physical large-scale behavior properly. It is a common approach to assume the fine-scale equations to be valid also on the coarse scale. In this case, application of volume averaging or homogenization methods can give information about the underlying and simplifying assumptions.
Numerical Upscaling
Assuming the coarse-scale equations to be known, the problem is to find upscaled or averaged effective model parameters or functions which describe the physical large-scale behavior properly. A flexible tool for calculating effective coefficients is provided by numerical upscaling techniques, where representative fine-scale problems are solved to approximate the coarse-scale parameters. Generally, two types of methods can be distinguished: global methods and local methods. Local methods choose subdomains of a size much smaller than the global scale (e.g., the size of one coarse-grid block). Examples of local upscaling techniques for single-phase parameters like permeabilities or transmissibilities or two-phase parameters like phase permeabilities can be found in, for example, Durlofsky (1991), Pickup and Sorbie (1996), Wallstrom et al. (2002a,b), and Efendiev and Durlofsky (2004). Global methods choose subdomains of the size of the model domain. Examples of such methods are the pseudo function approaches (see, e.g., Kyte and Berry 1975; Stone 1991; Barker and Thibeau 1997; Darman et al. 2002).
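To make the notion of an effective coarse-block parameter concrete, the sketch below treats the simplest case in which local upscaling has a closed-form answer: single-phase flow through a coarse block made of perfectly homogeneous layers. For flow along the layers the effective permeability is the thickness-weighted arithmetic mean, and for flow across the layers it is the thickness-weighted harmonic mean. This is an illustration under these assumptions only, not one of the methods cited above; the function name `upscale_layered` and the layer values are hypothetical.

```python
import numpy as np


def upscale_layered(k, h):
    """Effective permeability of a coarse block made of homogeneous layers.

    k : layer permeabilities [m^2], h : layer thicknesses [m] (same length).
    Returns (k_parallel, k_series):
      k_parallel - flow along the layers, thickness-weighted arithmetic mean
      k_series   - flow across the layers, thickness-weighted harmonic mean
    """
    k = np.asarray(k, dtype=float)
    h = np.asarray(h, dtype=float)
    k_parallel = np.sum(k * h) / np.sum(h)
    k_series = np.sum(h) / np.sum(h / k)
    return k_parallel, k_series


# Hypothetical block: a permeable layer and a low-permeability layer, each 1 m thick.
k_par, k_ser = upscale_layered([1e-12, 1e-14], [1.0, 1.0])
print(f"along layers : {k_par:.3e} m^2")
print(f"across layers: {k_ser:.3e} m^2")
```

The two results differ by more than an order of magnitude for the same block, which shows why an upscaled permeability is in general direction dependent and why, for less regular heterogeneity, representative fine-scale problems have to be solved numerically.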
Multi-scale Methods
Homogeneous Multi-scale Methods
Homogeneous multi-scale methods inherently give approximate solutions on the microscale. They consist of the traditional numerical approaches to deal with multi-scale problems, like multigrid methods (Bramble 1993; Briggs et al. 2000; Stüben 2001; Trottenberg et al. 2001), multi-resolution wavelet methods (Cattani and Laserra 2003; Jang et al. 2004; He and Han 2008; Urban 2009), multipole techniques (Giraud et al. 2006; Of 2007; Tornberg and Greengard 2008; Yao et al. 2008), or adaptive mesh refinement (Ainsworth and Oden 2000; Babuska and Strouboulis 2001; Müller 2003). Due to the usually enormous number of degrees of freedom on this scale, this direct numerical solution of real-world multiple-scale problems is impossible to realize even with modern supercomputers.
Heterogeneous Multi-scale Methods
The heterogeneous multi-scale method (HMM) (Weinan et al. 2007) proposes general principles for developing accurate numerical schemes for multi-scale problems while keeping costs down. It was first introduced in Weinan and Engquist (2003a) and clearly described in Weinan and Engquist (2003b). The general goal of the HMM, as in other multi-scale type methods, is to capture the macroscopic behavior of multi-scale solutions without resolving all the fine details of the problem. The HMM does this by selectively incorporating the microscale data when needed and exploiting the characteristics of each particular problem.
Variational Multi-scale Method
Hughes (1995) and Hughes et al. (1998) present the variational multi-scale method, which serves as a general framework for constructing multi-scale methods. An important part of the method is to split the function space into a coarse part, which captures the low frequencies, and a fine part, which captures the high frequencies. An approximation of the fine-scale solution is computed and used to modify the coarse-scale equations. In recent years, there have been several works on convection–diffusion problems using the variational multi-scale framework (see, e.g., Codina 2001; Hauke and García-Olivares 2001; Juanes 2005), and it has also been applied as a framework for multi-scale simulation of multiphase flow through porous media; see, e.g., Juanes (2005), Juanes and Dub (2008), Kees et al. (2008), Nordbotten (2009), and Calo et al. (2011).
Multi-scale Finite Volume Method
The underlying idea is to construct transmissibilities that capture the local properties of the differential operator. This leads to a multipoint discretization scheme for the finite volume solution algorithm. The transmissibilities can be computed locally, and therefore this step is well suited for massively parallel computers. Furthermore, a conservative fine-scale velocity field can be constructed from the coarse-scale pressure solution. Over recent years, the method has been extended to deal with increasingly complex equations (Jenny et al. 2006; Lunati and Jenny 2006, 2007, 2008; Hajibeygi et al. 2008; Lee et al. 2008, 2009).
Multi-scale Finite Element Method
Another multi-scale method, the multi-scale finite element method, was presented in 1997 (Hou and Wu 1997). Its theoretical foundation is homogenization theory. The main idea is to solve local fine-scale problems numerically and to use these local solutions to modify the coarse-scale basis functions. There has been a lot of work on this method over the last decade; see, e.g., Chen and Hou (2003), Aarnes et al. (2006, 2008), Arbogast et al. (2007), Efendiev and Hou (2007), Kim et al. (2007), and Kippe et al. (2008).
Multi-scale Methods and Domain Decomposition
By comparing the formulations, Nordbotten and Bjørstad (2008) observe that the multi-scale finite volume method is a special case of a nonoverlapping domain decomposition preconditioner. They go on to suggest how the more general framework of domain decomposition methods can be applied in the multi-scale context to obtain improved multi-scale estimates. Further work on multi-scale modeling of flow through porous media using a domain decomposition preconditioner can be found in Galvis and Efendiev (2010) and Sandvin et al. (2011).
Vertical Equilibrium Methods
Vertical equilibrium methods are special kinds of multi-scale methods. An upscaled model is derived through vertical integration of the three-dimensional governing equations for two-phase flow under the assumptions of vertical equilibrium, complete gravity segregation, and a sharp interface between the two phases. The resulting model is a two-dimensional model for flow in the lateral directions only (vertical flow is zero). The underlying assumptions are sufficiently justified in many CO2 sequestration scenarios, which currently are the main application areas of vertical equilibrium models. Formulations with (Gasda et al. 2009) and without (Gasda et al. 2011) upscaling of convective mixing exist.
Adaptive Upscaling Methods
Local numerical upscaling methods (see section “Upscaling Methods”) can be extended to (adaptive) local–global methods. In this case, boundary conditions for the solution of the local fine-scale flow problems are determined from the global coarse-scale solution via a downscaling step. Depending on the type of model, the physical regime, and the required accuracy, the effective parameters may have to be recalculated each time the global solution changes considerably (see, e.g., Chen et al. 2003, 2013; Chen and Durlofsky 2006; Chen and Li 2009). Combining numerical upscaling and downscaling, (adaptive) local–global methods can also be viewed as multi-scale methods.
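The idea behind the multi-scale finite element method, solving a local fine-scale problem to obtain a modified coarse-scale basis function, can be sketched in one space dimension, where the local problem $\mathrm{d}/\mathrm{d}x\,(k\,\mathrm{d}u/\mathrm{d}x) = 0$ with $u = 0$ and $u = 1$ at the two coarse nodes has a closed-form solution for piecewise-constant $k$. The following is a minimal sketch under these assumptions; the function name `msfem_basis_1d` and the permeability values are hypothetical, and in more than one dimension the choice of boundary conditions for the local problems is an additional modeling decision that this example does not address.

```python
import numpy as np


def msfem_basis_1d(k_cells, h_cells):
    """Nodal values of the rising multi-scale basis function on one coarse element.

    Solves d/dx(k du/dx) = 0 with u = 0 at the left and u = 1 at the right
    coarse node, for piecewise-constant k on the fine cells. The flux
    k du/dx is constant, so u grows in each cell in proportion to h / k.
    """
    resistance = np.asarray(h_cells, dtype=float) / np.asarray(k_cells, dtype=float)
    u = np.concatenate(([0.0], np.cumsum(resistance)))
    return u / u[-1]


# Hypothetical coarse element with four fine cells; the third one is a thin barrier.
k = np.array([1.0, 1.0, 0.01, 1.0])
h = np.full(4, 0.25)
phi = msfem_basis_1d(k, h)
print(np.round(phi, 4))

# For homogeneous k the standard linear hat function is recovered.
print(msfem_basis_1d(np.ones(4), h))
```

Almost the entire rise of the basis function happens across the low-permeability cell, so the coarse-scale basis carries the fine-scale information that a linear hat function would miss.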
2.3 Multi-physics Methods
In general, the term multi-physics is used whenever processes which are described by different sets of equations interact and thus are coupled within one global model. The coupling mechanisms can in general be divided into volume (or vertical) coupling and surface (or horizontal) coupling. In this sense, the multi-scale approaches introduced before could be interpreted as vertical coupling approaches. Moreover, a large variety of multi-continua models exist. Here, the model domain is physically the same for the different sets of equations, and the exchange is usually performed by means of source and sink terms. Within the context of porous media, the most well-known multi-continua models include the double porosity models (Arbogast 1989; Ryzhik 2007) and the MINC method (Pruess 1985; Smith and Seth 1999). In contrast to that, horizontal coupling approaches divide the model domain into subdomains sharing common interfaces. The coupling is achieved by enforcing appropriate interface conditions. In physical terms, these interface conditions should state thermodynamic equilibrium (mechanical, thermal, and chemical equilibrium), while in mathematical terms, they often correspond to the continuity of the employed primal and dual variables, like pressure and normal velocity. Examples of surface coupling are discrete fracture approaches (Dietrich et al. 2005) or the coupling of porous media flow and free flow domains (Beavers and Joseph 1967; Discacciati et al. 2002; Layton et al. 2003; Girault and Rivière 2009; Jäger and Mikelić 2009). While these two examples couple different types of flow regimes, we will in our study concentrate on the coupling of different processes inside one porous media domain. A good overview of such multi-physics methods can be found in Wheeler and Peszynska (2002). In Albon et al. (1999), the authors present an interface concept to couple two-phase flow processes in different types of porous media. The coupling of different models for one-, two-, or three-phase flow incorporating an iterative nonlinear solver to ensure the interface coupling conditions was presented in Peszynska et al. (2000).
3 Mathematical Models for Flow and Transport Processes in Porous Media
We introduce the balance equations of flow and transport processes in porous media by means of an REV concept, i.e., on our fine scale. These equations may be upscaled in a subsequent step using one of the techniques of section “Upscaling Methods” or used in a multi-scale technique of section “Multi-scale Methods.” After establishing the necessary physical background, the equations for isothermal multiphase flow processes are derived, both for the case of immiscible fluids and for miscible fluids. Furthermore, we give an introduction to different decoupled formulations of the balance equations paving the way for specialized solution schemes discussed in the following section. Finally, an extension to non-isothermal processes is provided.
3.1 Preliminaries
After stating the basic definitions of phases and components, the essential fluid and matrix parameters are introduced. Parameters and constitutive relations describing fluid–matrix interactions are discussed, and some common laws for fluid-phase equilibria are reviewed.
Fig. 6 Contact angle between a wetting and a non-wetting fluid (labels in the figure: non-wetting phase, θ > 90°; wetting phase, θ < 90°)
Basic Definitions
Phases
If two or more fluids fill a volume (e.g., the pore volume) and are immiscible and separated by a sharp interface, each fluid is called a phase of the multiphase system. Formally, the solid matrix can also be considered as a phase. If the solubility effects are not negligible, the fluid system has to be considered as a compositional multiphase system. A pair of two different fluid phases can be divided into a wetting and a non-wetting phase. Here, the important property is the contact angle between the fluid–fluid interface and the solid surface (Fig. 6). If the contact angle is acute, the phase has a higher affinity to the solid and is therefore called wetting, whereas the other phase is called non-wetting.
Components
A phase usually consists of several components, which can either be pure chemical substances or consist of several substances which form a unit with constant physical properties, such as air. Thus, it depends on the model problem which substances or mixtures of substances are considered as a component. The choice of the components is essential, as balance equations for compositional flow systems are in general formulated with respect to components.
Fluid Parameters Compositions and Concentrations The composition of a phase ˛ is described by fractions of the different components contained in the phase. Mass fractions X˛ give the ratio of the mass mass of one component to the total mass of phase ˛, m X˛ D P : m
(1)
From this definition, it is obvious that the mass fractions sum up to unity for each phase,
Efficient Modeling of Flow and Transport in Porous Media Using Multi. . .
X
X˛ D 1:
715
(2)
A concept which is widely used in chemistry and thermodynamics is mole fractions which phasewise relate the number of molecules of one component to the total number of components. Mole fractions are commonly denoted by lower case letters and can be calculated from mass fractions via the molar mass M by X =M x˛ D P ˛ X˛ =M
(3)
Both mole fractions and mass fractions are dimensionless quantities. Concentration is the mass of a component per volume of the phase and is thus obtained by multiplying the mass fraction of the component by the density of the phase, $C_\alpha^\kappa = \varrho_\alpha X_\alpha^\kappa$, which yields the SI unit $\mathrm{kg/m^3}$.
Density
The density $\varrho$ relates the mass $m$ of an amount of a substance to the volume $V$ which is occupied by it:
$$\varrho = \frac{m}{V}. \tag{4}$$
The corresponding unit is $\mathrm{kg/m^3}$. For a fluid phase $\alpha$, it generally depends on the phase pressure $p_\alpha$ and temperature $T$, as well as on the composition $x_\alpha^\kappa$ of the phase:
$$\varrho_\alpha = \varrho_\alpha(p_\alpha, T, x_\alpha^\kappa). \tag{5}$$
Since the compressibility of the solid matrix as well as its temperature dependence can be neglected for many applications, one can often assign a constant density to solids. For liquid phases, the dependence of density on the pressure is usually very low and the contribution by dissolved components is not significant. Thus, the density can be assumed to depend only on temperature, $\varrho_\alpha = \varrho_\alpha(T)$. For isothermal systems, the temperature is constant in time and thus the density of the liquid phase is also constant in time. The density of gases is highly dependent on temperature as well as on pressure.
Viscosity
Viscosity is a measure for the resistance of a fluid to deformation under shear stress. For Newtonian fluids, the fluid shear stress $\tau$ is proportional to the temporal deformation of an angle $\gamma$, namely, $\tau = \mu\, \partial\gamma/\partial t$. The proportionality factor $\mu$ is called dynamic viscosity, with the SI unit $\mathrm{(N\,s)/m^2 = kg/(m\,s)}$. In general, the viscosity of liquid phases is primarily determined by their composition and by temperature. With increasing temperature, the viscosity of liquids decreases. In contrast, the viscosity of gases increases with increasing temperature (see, e.g., Atkins 1994).
R. Helmig et al.
Matrix Parameters
Porosity
A porous medium consists of a solid matrix and the pores. The dimensionless ratio of the pore space within the REV to the total volume of the REV is defined as porosity $\phi$:
$$\phi = \frac{\text{volume of pore space within the REV}}{\text{total volume of the REV}}. \tag{6}$$
If the solid matrix is assumed to be rigid, the porosity is constant and independent of temperature, pressure, or other variables.
Intrinsic Permeability
The intrinsic permeability characterizes the inverse of the resistance of the porous matrix to flow through that matrix. Depending on the matrix type, the permeability may have different values for different flow directions, which in general yields a tensor $\mathbf{K}$ with the unit $\mathrm{m^2}$.
Parameters Describing Fluid–Matrix Interaction
Saturation
The pore space is divided and filled by the different phases. In the macroscopic approach, this is expressed by the saturation $S_\alpha$ of each phase $\alpha$. This dimensionless number is defined as the ratio of the volume of phase $\alpha$ within the REV to the volume of the pore space within the REV:
$$S_\alpha = \frac{\text{volume of phase } \alpha \text{ within the REV}}{\text{volume of the pore space within the REV}}. \tag{7}$$
Assuming that the pore space of the REV is completely filled by the fluid phases $\alpha$, the sum of the phase saturations must be equal to one:
$$\sum_\alpha S_\alpha = 1. \tag{8}$$
If no phase transition occurs, the saturations change due to displacement of one phase by another phase. However, a phase can in general not be fully displaced by another; a certain saturation is held back, which is called residual saturation. For a wetting phase, a residual saturation occurs if parts of the displaced wetting phase are held back in the finer pore channels during the drainage process (see Fig. 7, left-hand side). On the other hand, a residual saturation for the non-wetting phase may occur if bubbles of the displaced non-wetting phase are trapped by the surrounding wetting phase during the imbibition process (see Fig. 7, right-hand side). Therefore, a residual saturation may depend on the pore geometry, the heterogeneity, and the displacement process, but also on the number of drainage and imbibition cycles.
Fig. 7 Residual saturations of the wetting and non-wetting phase, respectively
If the saturation $S_\alpha$ of a phase is smaller than its residual saturation, the relative permeability (section “Relative Permeability”) of phase $\alpha$ is equal to zero, which means that no flux of that phase can take place. This implies that a flux can only occur if the saturation of a phase $\alpha$ lies between the residual saturation and unity ($S_{r\alpha} \le S_\alpha \le 1$). With the residual saturation, an effective saturation for a two-phase system can be defined in the following way:
$$S_e = \frac{S_w - S_{rw}}{1 - S_{rw}}, \qquad S_{rw} \le S_w \le 1. \tag{9}$$
Alternatively, in many models the following definition is used:
$$S_e = \frac{S_w - S_{rw}}{1 - S_{rw} - S_{rn}}, \qquad S_{rw} \le S_w \le 1 - S_{rn}. \tag{10}$$
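Both definitions of the effective saturation can be covered by one small helper, sketched here under the assumption that the residual saturations are given constants. The function name and argument names are hypothetical.

```python
def effective_saturation(s_w, s_rw, s_rn=0.0):
    """Effective saturation after Eq. (10); with s_rn = 0 it reduces to Eq. (9)."""
    if not s_rw <= s_w <= 1.0 - s_rn:
        raise ValueError("S_w must lie between S_rw and 1 - S_rn")
    return (s_w - s_rw) / (1.0 - s_rw - s_rn)


# Assumed example values: S_w = 0.5, S_rw = 0.2, S_rn = 0.1
s_e_eq9 = effective_saturation(0.5, 0.2)        # definition (9)
s_e_eq10 = effective_saturation(0.5, 0.2, 0.1)  # definition (10)
```

For the same wetting-phase saturation, definition (10) returns the larger value, which is one reason why the definition used for parameter fitting has to be kept in later applications.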
Which definition has to be used depends on the way the capillary pressure and the relative permeability curves are obtained, as explained below. Further considerations on the use of effective saturations are made in Helmig (1997).
Capillarity
Due to interfacial tension, forces occur at the interface of two phases. This effect is caused by interactions of the fluids on the molecular scale. Therefore, the interface between a wetting and a non-wetting phase is curved, and the equilibrium at the interface leads to a pressure difference between the phases called capillary pressure $p_c$:
$$p_c = p_n - p_w, \tag{11}$$
where $p_n$ is the non-wetting-phase pressure and $p_w$ the wetting-phase pressure. In a macroscopic consideration, an increase of the non-wetting-phase saturation leads to a decrease of the wetting-phase saturation and, according to microscopic considerations, to the retreat of the wetting fluid to smaller pores. It is common to regard the macroscopic capillary pressure as a function of the saturation,
$$p_c = p_c(S_w), \tag{12}$$
the so-called capillary pressure–saturation relation. The simplest way to define a capillary pressure–saturation function is a linear approach:
$$p_c(S_e(S_w)) = p_{c,\max}\,\bigl(1 - S_e(S_w)\bigr). \tag{13}$$
The most common $p_c$-$S_w$ relations for a two-phase system are those of Brooks and Corey and of van Genuchten. In the Brooks–Corey model,
$$p_c(S_e(S_w)) = p_d\, S_e(S_w)^{-1/\lambda}, \qquad p_c \ge p_d, \tag{14}$$
the capillary pressure is a function of the effective saturation $S_e$. The entry pressure $p_d$ represents the minimum pressure needed for a non-wetting fluid to enter a porous medium initially saturated by a wetting fluid. The parameter $\lambda$ is called pore-size distribution index and usually lies between 0.2 and 3.0. A very large $\lambda$ describes a single-grain-size material, while a very small $\lambda$ indicates a highly nonuniform material. The parameters of the Brooks–Corey relation are determined by fitting to experimental data. The effective saturation definition used in this parameter fitting is also the one to choose for later application of the respective capillary pressure or relative permeability function.
Relative Permeability
Flow in porous media is strongly influenced by the interaction between the fluid phase and the solid phase. If more than one fluid phase fills the pore space, the presence of one phase also disturbs the flow behavior of another phase. Therefore, the relative permeability $k_{r\alpha}$, which can be considered a scaling factor, is included in the permeability concept. Considering a system with two fluid phases, the space available for one of the fluids depends on the amount of the second fluid within the system. The wetting phase, for example, has to flow around those parts of the porous medium occupied by non-wetting fluid or has to displace the non-wetting fluid to find new flow paths. In a macroscopic view, this means that the cross-sectional area available for the flow of a phase depends on its saturation. If the disturbance of the flow of one phase is only due to the restriction of available pore volume caused by the presence of the other fluid, a linear correlation for the relative permeability can be applied,
$$k_{rw}(S_e(S_w)) = S_e(S_w), \tag{15}$$
$$k_{rn}(S_e(S_w)) = 1 - S_e(S_w). \tag{16}$$
This formulation also implies that the relative permeability becomes zero if the residual saturation, representing the amount of immobile fluid, is reached.
In reality, one phase usually influences the flow of another phase not only by the restriction in available volume but also by additional interactions between the fluids. If capillary effects occur, the wetting phase, for example, fills the smaller pores if the saturation is small. This means that for an increasing saturation of the wetting phase, the relative permeability $k_{rw}$ has to increase slowly while the saturations are still small and has to increase fast once the saturations become higher, since then the wetting phase begins to fill the larger pores. For the non-wetting phase, the opposite is the case. With increasing saturation, the larger pores are filled first, causing a faster rise of $k_{rn}$. At higher saturations the smaller pores become filled, which slows down the increase of the relative permeability. Therefore, correlations for the relative permeabilities can be defined using the known capillary pressure–saturation relationships (for details, see Helmig 1997). Besides capillary pressure effects, other effects might also occur. As an example, the Brooks–Corey model is defined as
$$k_{rw}(S_e(S_w)) = S_e(S_w)^{\frac{2+3\lambda}{\lambda}}, \tag{17}$$
$$k_{rn}(S_e(S_w)) = \bigl(1 - S_e(S_w)\bigr)^2 \left[1 - S_e(S_w)^{\frac{2+\lambda}{\lambda}}\right], \tag{18}$$
where $\lambda$ is the empirical constant from the Brooks–Corey $p_c(S)$ relationship (Eq. 14). In contrast to the linear relationship, these relative permeabilities do not sum up to unity. This is caused by the effects described before and means that one phase is slowed down more strongly by the other phase than it would be by the restriction of the volume available for the flow alone.
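The Brooks–Corey relations (14), (17), and (18) translate directly into code. The following is a minimal sketch; the function names are hypothetical and the parameter values ($p_d = 1000$ Pa, $\lambda = 2$, $S_e = 0.5$) are assumed purely for illustration.

```python
def pc_brooks_corey(s_e, p_d, lam):
    """Capillary pressure, Eq. (14): p_c = p_d * S_e**(-1/lambda)."""
    return p_d * s_e ** (-1.0 / lam)


def krw_brooks_corey(s_e, lam):
    """Wetting-phase relative permeability, Eq. (17)."""
    return s_e ** ((2.0 + 3.0 * lam) / lam)


def krn_brooks_corey(s_e, lam):
    """Non-wetting-phase relative permeability, Eq. (18)."""
    return (1.0 - s_e) ** 2 * (1.0 - s_e ** ((2.0 + lam) / lam))


# Assumed example values
p_c = pc_brooks_corey(0.5, p_d=1000.0, lam=2.0)
k_rw = krw_brooks_corey(0.5, lam=2.0)
k_rn = krn_brooks_corey(0.5, lam=2.0)
```

For these values, the two relative permeabilities add up to well below one, which illustrates the remark that the Brooks–Corey permeabilities, unlike the linear ones in (15) and (16), do not sum to unity.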
Extended Darcy’s Law
In a macroscopic treatment of porous media (here, on our fine scale), Darcy’s law, which was originally obtained experimentally for single-phase flow, can be used to calculate averaged velocities using the permeability. For multiphase systems, the extended Darcy’s law incorporating relative permeabilities is formulated for each phase (for details, see Scheidegger 1974; Helmig 1997):
$$\mathbf{v}_\alpha = \frac{k_{r\alpha}}{\mu_\alpha}\,\mathbf{K}\,\bigl(-\nabla p_\alpha + \varrho_\alpha \mathbf{g}\bigr), \tag{19}$$
where $k_{r\alpha}$ is the relative permeability dependent on saturation, $\mathbf{K}$ the intrinsic permeability dependent on the porous medium, $\mu_\alpha$ the dynamic fluid viscosity, $p_\alpha$ the phase pressure, and $\varrho_\alpha$ the phase density, while $\mathbf{g}$ is the gravity vector. The mobility of a phase is defined as $\lambda_\alpha = k_{r\alpha}/\mu_\alpha$. Note that Eq. (19) uses the pressure in phase $\alpha$, which is important since the pressures of different phases can differ due to capillarity. The product of the relative and the intrinsic permeability, $k_{r\alpha}\mathbf{K}$, is often called total permeability $\mathbf{K}_t$ or effective permeability $\mathbf{K}_e$.
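As a quick plausibility check of the sign convention in Eq. (19), the following sketch evaluates the law along a vertical axis for a scalar permeability. The function name, the upward-pointing $z$ axis, and all numerical values are assumptions made for this illustration only.

```python
GRAVITY_Z = -9.81  # m/s^2, gravity component along an upward-pointing z axis


def darcy_velocity_z(k_r, mu, K, dp_dz, rho):
    """Vertical Darcy velocity of one phase, Eq. (19) restricted to the z axis."""
    mobility = k_r / mu  # lambda_alpha = k_r_alpha / mu_alpha
    return mobility * K * (-dp_dz + rho * GRAVITY_Z)


RHO_W = 1000.0  # kg/m^3, assumed water density
# Hydrostatic pressure gradient: the phase is at rest.
v_rest = darcy_velocity_z(1.0, 1.0e-3, 1.0e-12, RHO_W * GRAVITY_Z, RHO_W)
# Pressure falls faster with height than hydrostatic: upward flow.
v_up = darcy_velocity_z(1.0, 1.0e-3, 1.0e-12, -2.0e4, RHO_W)
```

With a hydrostatic pressure gradient the two driving terms cancel, so the velocity vanishes, as expected for a fluid at rest.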
Laws for Fluid-Phase Equilibria
We give a short summary of common physical relationships that govern the equilibrium state between fluid phases and thus the mass transfer processes, i.e., the exchange of components between phases. While a variety of other relationships can be found in the literature, only Dalton’s law, Raoult’s law, and Henry’s law are treated here.
Dalton’s Law
Dalton’s law states that the total pressure of a gas mixture equals the sum of the pressures of the gases that make up the mixture, namely,
$$p_g = \sum_\kappa p_g^\kappa, \tag{20}$$
where $p_g^\kappa$ is the pressure of a single component $\kappa$, the partial pressure, which is by definition the product of the mole fraction of the respective component in the gas phase and the total pressure of the gas phase, i.e.,
$$p_g^\kappa = x_g^\kappa\, p_g. \tag{21}$$
Raoult’s Law
Raoult’s law describes the lowering of the vapor pressure of a pure substance in a solution. It relates the vapor pressure of components to the composition of the solution under the simplifying assumption of an ideal solution. The relationship can be derived from the equality of fugacities; see Prausnitz et al. (1967). According to Raoult’s law, the vapor pressure of component $\kappa$ in a solution is equal to the vapor pressure of the pure substance times the mole fraction of component $\kappa$ in phase $\alpha$:
$$p_g^\kappa = x_\alpha^\kappa\, p_{\mathrm{vap}}^\kappa. \tag{22}$$
Here, $p_{\mathrm{vap}}^\kappa$ denotes the vapor pressure of the pure component $\kappa$, which is generally a function of temperature.
Henry’s Law
Henry’s law is valid for ideally diluted solutions and ideal gases. It is especially used for the calculation of the solution of gaseous components in liquids. Considering a system with a gaseous component $\kappa$, a linear relationship between the mole fraction $x_\alpha^\kappa$ of component $\kappa$ in the liquid phase and the partial pressure $p_g^\kappa$ of $\kappa$ in the gas phase is obtained:
$$x_\alpha^\kappa = H_\alpha^\kappa\, p_g^\kappa. \tag{23}$$
Fig. 8 Applicability of Henry’s law and Raoult’s law for a binary gas–liquid system (after Lüdecke and Lüdecke (2000)). Horizontal axis: moles of component 1 divided by the total number of moles in the system, from 0 to 1; the ranges of Henry’s law and Raoult’s law are marked.
The parameter $H_\alpha^\kappa$ denotes the Henry coefficient of component $\kappa$ in phase $\alpha$, which is dependent on temperature, $H_\alpha^\kappa = H_\alpha^\kappa(T)$. Figure 8 shows the range of applicability of both Henry’s law and Raoult’s law for a binary system, where component 1 is a component forming a liquid phase, e.g., water, and component 2 is a component forming a gaseous phase, e.g., air. One can see that for low mole fractions of component 2 in the system (small amounts of dissolved air in the liquid phase), Henry’s law can be applied, whereas for mole fractions of component 1 close to 1 (small amounts of vapor in the gas phase), Raoult’s law is the appropriate description. In general, the solvent follows Raoult’s law as it is present in excess, whereas the dissolved substance follows Henry’s law as it is highly diluted.
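To illustrate how the three laws combine, consider a two-phase water–air system in which Raoult’s law is applied to water and Henry’s law to air, each together with Dalton’s law. The sketch below solves the resulting two linear relations for the phase compositions. The function name is hypothetical, and the gas pressure, the vapor pressure (roughly that of water near 20 °C), and the Henry coefficient are assumed, order-of-magnitude values.

```python
def water_air_equilibrium(p_g, p_vap_w, henry_a):
    """Equilibrium mole fractions of a two-phase water-air system.

    Raoult (22) with Dalton (21) for water:  x_g_w * p_g = x_l_w * p_vap_w
    Henry  (23) with Dalton (21) for air:    x_l_a = henry_a * x_g_a * p_g
    closed by x_l_w + x_l_a = 1 and x_g_w + x_g_a = 1.
    """
    # Eliminating x_g_w from the two relations gives a linear equation for x_l_a.
    x_l_a = henry_a * (p_g - p_vap_w) / (1.0 - henry_a * p_vap_w)
    x_l_w = 1.0 - x_l_a
    x_g_w = x_l_w * p_vap_w / p_g
    x_g_a = 1.0 - x_g_w
    return x_l_w, x_l_a, x_g_w, x_g_a


P_G = 1.0e5        # Pa, assumed gas-phase pressure
P_VAP_W = 2339.0   # Pa, approximate vapor pressure of water near 20 degrees C
HENRY_A = 1.5e-10  # 1/Pa, assumed Henry coefficient of air in water
x_l_w, x_l_a, x_g_w, x_g_a = water_air_equilibrium(P_G, P_VAP_W, HENRY_A)
```

The result reflects the situation sketched in Fig. 8: the liquid phase is almost pure water with a trace of dissolved air, and the gas phase is mostly air with a few percent of vapor.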
The Reynolds Transport Theorem
A common way to derive balance equations in fluid dynamics is to use the Reynolds transport theorem (e.g., Helmig 1997; White 2003), named after the British scientist Osborne Reynolds. Let $E$ be an arbitrary property of the fluid (e.g., mass, energy, momentum) that can be obtained by the integration of a scalar field $e$ over a moving control volume $G$:
$$E = \int_G e \, dG. \tag{24}$$
The Reynolds transport theorem states that the temporal derivative of the property in a control volume moving with the fluid can be related to local changes of the scalar field by
$$\frac{dE}{dt} = \frac{d}{dt}\int_G e \, dG = \int_G \left[\frac{\partial e}{\partial t} + \nabla\cdot(e\,\mathbf{v})\right] dG. \tag{25}$$
For a general balance equation, we require conservation of the property $E$. Thus, the property can only change due to sinks and sources, diffusion, or dissipation:
$$\frac{dE}{dt} = \int_G \left[\frac{\partial e}{\partial t} + \nabla\cdot(e\,\mathbf{v})\right] dG = \int_G \bigl(q^e - \nabla\cdot\mathbf{w}\bigr)\, dG, \tag{26}$$
where $\mathbf{w}$ is the diffusive flux of $e$ and $q^e$ is the source per unit volume.
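Since the control volume $G$ in (26) is arbitrary, the integrands on both sides have to agree pointwise, provided the fields are sufficiently smooth. This localization step, which is used whenever an integral balance is rewritten in differential form, reads:

```latex
\int_G \left[ \frac{\partial e}{\partial t} + \nabla\cdot(e\,\mathbf{v})
      - q^e + \nabla\cdot\mathbf{w} \right] dG = 0
\quad \text{for all } G
\qquad\Longrightarrow\qquad
\frac{\partial e}{\partial t} + \nabla\cdot(e\,\mathbf{v})
      = q^e - \nabla\cdot\mathbf{w}.
```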
3.2 Multiphase Flow
We derive the mass balance equations for the fully immiscible and the compositional case. Both derivations are based on the general balance equation (26) and the insertion of extended Darcy’s law (19) for the involved velocities.
The Immiscible Case
According to the specifications provided in Sect. 3.1, the mass of a phase $\alpha$ inside a control volume $G$ can be expressed by
$$m_\alpha = \int_G \phi\, S_\alpha\, \varrho_\alpha \, dG. \tag{27}$$
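Under the assumption that the phases are immiscible, the total mass $m_\alpha$ of each phase is conserved, i.e., $dm_\alpha/dt = 0$ in the absence of external sources. Using the general balance equation (26), this mass conservation can be rewritten as
$$\int_G \left[\frac{\partial(\phi\,\varrho_\alpha S_\alpha)}{\partial t} + \nabla\cdot(\phi\, S_\alpha \varrho_\alpha \mathbf{v}_{a\alpha})\right] dG = \int_G \varrho_\alpha q_\alpha \, dG. \tag{28}$$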
We emphasize that the diffusive flux is assumed to be zero, since for the motion of phases on an REV scale, no diffusion or dispersion processes are considered. Coarse-scale approaches may, however, include additional dispersive terms to counterbalance the loss of fine-scale information on the coarse scale. The control volume considered for the transport theorem moves with $\mathbf{v}_{a\alpha}$, which is related to the Darcy velocity $\mathbf{v}_\alpha$ by
$$\mathbf{v}_\alpha = \phi\, S_\alpha \mathbf{v}_{a\alpha}. \tag{29}$$
Inserting (29) into (28) yields the integral balance equation for a single phase $\alpha$ in a multiphase system:
$$\int_G \left[\frac{\partial(\phi\,\varrho_\alpha S_\alpha)}{\partial t} + \nabla\cdot(\varrho_\alpha \mathbf{v}_\alpha)\right] dG = \int_G \varrho_\alpha q_\alpha \, dG. \tag{30}$$
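Rewriting this equation in differential form and inserting the extended Darcy’s law (19) yields a system of $n_\alpha$ partial differential equations (with $n_\alpha$ the number of phases):
$$\frac{\partial(\phi\,\varrho_\alpha S_\alpha)}{\partial t} = -\nabla\cdot\bigl(\varrho_\alpha \lambda_\alpha \mathbf{K}\,(-\nabla p_\alpha + \varrho_\alpha \mathbf{g})\bigr) + \varrho_\alpha q_\alpha. \tag{31}$$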
Under isothermal conditions, the system (31) of $n_\alpha$ partial differential equations is already closed. In particular, the parameters $\phi$, $\mathbf{K}$, and $\mathbf{g}$ are intrinsic, and $q_\alpha$ are given source terms. The densities $\varrho_\alpha$ are functions of pressure and the known temperature only, and the mobilities $\lambda_\alpha$ depend only on the phase saturations. The remaining $n_\alpha$ constitutive relations for the $2n_\alpha$ unknowns $S_\alpha, p_\alpha$ are the closure relation (8) and the $n_\alpha - 1$ capillary pressure–saturation relationships (12).
The Compositional Case
We now allow each phase to be made up of different components, which can also be partially dissolved in the other phases. Inserting the total concentration per component,
$$C^\kappa = \phi \sum_\alpha \varrho_\alpha S_\alpha X_\alpha^\kappa, \tag{32}$$
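into the general balance equation (26) and applying the same considerations on the velocities as in section “The Immiscible Case” yields
$$\int_G \left[\frac{\partial C^\kappa}{\partial t} + \nabla\cdot\sum_\alpha \varrho_\alpha X_\alpha^\kappa \mathbf{v}_\alpha\right] dG = \int_G q^\kappa \, dG. \tag{33}$$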
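Rewriting this in differential form and inserting the extended Darcy’s law (19), we obtain a set of $n_\kappa$ partial differential equations (with $n_\kappa$ the number of components):
$$\frac{\partial C^\kappa}{\partial t} = -\nabla\cdot\sum_\alpha \varrho_\alpha X_\alpha^\kappa \lambda_\alpha \mathbf{K}\,(-\nabla p_\alpha + \varrho_\alpha \mathbf{g}) + q^\kappa. \tag{34}$$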
We remark that the immiscible case (31) can be easily derived from (34) as a special case. In particular, full immiscibility can be equally expressed as $X_\alpha^\kappa$ being known and constant with respect to space and time. By regrouping and renaming the components with respect to the fixed phase compositions where necessary, we can furthermore state that each component $\kappa$ is associated with a distinct phase $\alpha$ and that $X_\alpha^\kappa = 1$ holds for this particular phase, whereas it equals zero for all other phases. This directly leads to (31). In general, we are left with $n_\kappa$ partial differential equations (34) for the $2n_\alpha + n_\kappa n_\alpha$ unknowns $p_\alpha, S_\alpha, X_\alpha^\kappa$. Considering (8) and (12) as in the immiscible case, and additionally the closure relations (2), yields $2n_\alpha$ constraints. The remaining $n_\kappa(n_\alpha - 1)$ constraints have to be carefully chosen from the laws for fluid-phase equilibria; see section “Laws for Fluid-Phase Equilibria.”
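The bookkeeping of unknowns and equations in the compositional case can be checked with a few lines of code. The helper below is a hypothetical illustration, not part of any simulator; it simply reproduces the count given in the text.

```python
def count_closure(n_phases, n_components):
    """Count unknowns and relations of the compositional system (34)."""
    unknowns = 2 * n_phases + n_components * n_phases  # p_alpha, S_alpha, X_alpha^kappa
    pdes = n_components                                # one balance (34) per component
    algebraic = 1 + (n_phases - 1) + n_phases          # Eq. (8), Eqs. (12), Eqs. (2)
    remaining = unknowns - pdes - algebraic            # to be taken from phase equilibria
    return unknowns, pdes, algebraic, remaining


# Two phases (water, gas) and two components (water, air)
unknowns, pdes, algebraic, remaining = count_closure(2, 2)
```

For two phases and two components, two equilibrium relations remain to be chosen, for instance Raoult’s law for the solvent and Henry’s law for the dissolved gas.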
3.3 Decoupled Formulations
It is often advantageous to reformulate the mass balance equations (31) or (34) into one elliptic or parabolic equation for the pressure and one or more hyperbolic–parabolic transport equations for the saturations or concentrations, respectively. In particular, this reformulation makes it possible to employ multi-scale or discretization approaches that are especially developed for and suited to the corresponding type of equation and to combine them in various ways. Furthermore, a sequential or iterative solution procedure reduces the number of unknowns in each solution step. In the following, we introduce these decoupled formulations for the immiscible and for the compositional case.
The Immiscible Case
The reformulation of the multiphase mass balance equations (31) into one pressure equation and one or more saturation equations was first derived in Chavent (1976), where the author simply called it a new formulation for two-phase flow in porous media. Due to the introduction of the idea of fractional flows, this formulation is usually called the fractional flow formulation.
Pressure Equation
A pressure equation can be derived by summation of the phase mass balance equations. After some reformulation, a general pressure equation can be written as follows:
$$\sum_\alpha S_\alpha \frac{\partial \phi}{\partial t} + \nabla\cdot\mathbf{v}_t + \sum_\alpha \frac{1}{\varrho_\alpha}\left(\phi\, S_\alpha \frac{\partial \varrho_\alpha}{\partial t} + \mathbf{v}_\alpha\cdot\nabla\varrho_\alpha\right) - \sum_\alpha q_\alpha = 0, \tag{35}$$
with the following definition of a total velocity:
$$\mathbf{v}_t = \sum_\alpha \mathbf{v}_\alpha. \tag{36}$$
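The reformulation leading to the pressure equation (35) can be spelled out as follows. This is a sketch of the intermediate steps, assuming sufficiently smooth fields: each phase balance in differential form is divided by its density, the product rule is applied, and the result is summed over all phases using the closure relation (8).

```latex
\begin{align*}
0 &= \frac{1}{\varrho_\alpha}\left[\frac{\partial(\phi\,\varrho_\alpha S_\alpha)}{\partial t}
     + \nabla\cdot(\varrho_\alpha \mathbf{v}_\alpha) - \varrho_\alpha q_\alpha\right] \\
  &= S_\alpha \frac{\partial \phi}{\partial t} + \phi\,\frac{\partial S_\alpha}{\partial t}
     + \frac{\phi\, S_\alpha}{\varrho_\alpha}\frac{\partial \varrho_\alpha}{\partial t}
     + \nabla\cdot\mathbf{v}_\alpha
     + \frac{1}{\varrho_\alpha}\,\mathbf{v}_\alpha\cdot\nabla\varrho_\alpha - q_\alpha .
\end{align*}
% Summation over alpha: by (8), \sum_\alpha \partial S_\alpha / \partial t = 0,
% and \sum_\alpha \nabla\cdot\mathbf{v}_\alpha = \nabla\cdot\mathbf{v}_t.
\begin{equation*}
\sum_\alpha S_\alpha \frac{\partial \phi}{\partial t} + \nabla\cdot\mathbf{v}_t
 + \sum_\alpha \frac{1}{\varrho_\alpha}\left(\phi\, S_\alpha \frac{\partial \varrho_\alpha}{\partial t}
 + \mathbf{v}_\alpha\cdot\nabla\varrho_\alpha\right) - \sum_\alpha q_\alpha = 0 .
\end{equation*}
```

Because the saturations sum to one, the first term is simply $\partial\phi/\partial t$; for incompressible fluids and a rigid matrix, the equation reduces to $\nabla\cdot\mathbf{v}_t = \sum_\alpha q_\alpha$.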
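Inserting the extended Darcy’s law (19) into (36) yields
$$\mathbf{v}_t = -\lambda_t \mathbf{K} \left[\sum_\alpha f_\alpha \nabla p_\alpha - \sum_\alpha f_\alpha \varrho_\alpha \mathbf{g}\right], \tag{37}$$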
where $f_\alpha = \lambda_\alpha/\lambda_t$ is the fractional flow function of phase $\alpha$ and $\lambda_t = \sum_\alpha \lambda_\alpha$ is the total mobility. In the following, different possibilities to reformulate this general pressure equation for two-phase flow are shown. There also exist fractional flow approaches for three phases, which are not further considered here. For details, we refer to, for example, Suk and Yeh (2008).
Global Pressure Formulation for Two-Phase Flow
Defining a global pressure $p$ such that $\nabla p = \sum_\alpha f_\alpha \nabla p_\alpha$ (see below), Eq. (37) can be rewritten as a function of $p$:
$$\mathbf{v}_t = -\lambda_t \mathbf{K} \left[\nabla p - \sum_\alpha f_\alpha \varrho_\alpha \mathbf{g}\right]. \tag{38}$$
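Inserting (38) into (35) yields the pressure equation related to the global pressure. For a domain $\Omega$ with boundary $\Gamma = \Gamma_D \cup \Gamma_N$, where $\Gamma_D$ denotes a Dirichlet and $\Gamma_N$ a Neumann boundary, the boundary conditions are
$$p = p_D \ \text{ on } \Gamma_D \qquad \text{and} \qquad \mathbf{v}_t\cdot\mathbf{n} = q_N \ \text{ on } \Gamma_N. \tag{39}$$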
This means that a global pressure has to be found on a Dirichlet boundary, which can lead to problems, as the global pressure is not a physical variable and thus cannot be measured directly. Following Chavent and Jaffré (1986), the global pressure is defined by
$$p = \frac{1}{2}(p_w + p_n) - \int_{S_c}^{S_w} \left(f_w(S_w) - \frac{1}{2}\right) \frac{dp_c}{dS_w}(S_w)\, dS_w, \tag{40}$$
where $S_c$ is the saturation satisfying $p_c(S_c) = 0$. This definition ensures that the global pressure is a smooth function and thus is easier to handle from a numerical point of view. However, as shown, e.g., in Binning and Celia (1999), an iterative solution technique is required for more complex (realistic) conditions, where a phase pressure might be known at a boundary. It also becomes clear that $p = p_w = p_n$ if the capillary pressure between the phases is neglected.
Phase Pressure Formulation for Two-Phase Flow
A pressure equation can also be derived using a phase pressure, which is a physically meaningful parameter in a multiphase system. Investigations of a phase pressure fractional flow formulation can, for example, be found in Chen et al. (2006), and a formulation including phase potentials has been used in Hoteit and Firoozabadi (2008). Exploiting Eq. (11), Eq. (37) can be rewritten in terms of one phase pressure. This yields a total velocity in terms of the wetting-phase pressure as
$$\mathbf{v}_t = -\lambda_t \mathbf{K} \left[\nabla p_w + f_n \nabla p_c - \sum_\alpha f_\alpha \varrho_\alpha \mathbf{g}\right], \tag{41}$$
and in terms of a non-wetting-phase pressure as
$$\mathbf{v}_t = -\lambda_t \mathbf{K} \left[\nabla p_n - f_w \nabla p_c - \sum_\alpha f_\alpha \varrho_\alpha \mathbf{g}\right]. \tag{42}$$
Substituting $\mathbf{v}_t$ in the general pressure equation (35) by Eq. (41) or (42) yields the pressure equations as a function of a phase pressure. In analogy to the global pressure formulation, the following boundary conditions can be defined:
$$p_w = p_D \ \text{ on } \Gamma_D \quad \text{or} \quad p_n = p_D \ \text{ on } \Gamma_D \qquad \text{and} \qquad \mathbf{v}_t\cdot\mathbf{n} = q_N \ \text{ on } \Gamma_N. \tag{43}$$
It is important to point out that we now have a physically meaningful variable, the phase pressure, instead of the global pressure. Boundary conditions at Dirichlet boundaries can therefore be defined directly if a phase pressure at a boundary is known.
Saturation Equation
We derive the transport equation for the saturation depending on whether a global or a phase pressure formulation is used. In the first case, a possibly degenerate parabolic–hyperbolic equation is derived, which is quite weakly coupled to the pressure equation. In the second case, a purely hyperbolic equation is obtained with a stronger coupling to the corresponding pressure equation.
Global Pressure Formulation for Two-Phase Flow
In the case of a global pressure formulation, a transport equation for saturation related to the total velocity $\mathbf{v}_t$ has to be derived from the general multiphase mass balance equations (31). With the definition of the capillary pressure (11), the extended Darcy’s law (19) can be formulated for a wetting and a non-wetting phase as
$$\mathbf{v}_w = -\lambda_w \mathbf{K}\,(\nabla p_w - \varrho_w \mathbf{g}) \tag{44}$$
and
$$\mathbf{v}_n = -\lambda_n \mathbf{K}\,(\nabla p_w + \nabla p_c - \varrho_n \mathbf{g}). \tag{45}$$
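Solving (45) for $\mathbf{K}\nabla p_w$ and inserting it into Eq. (44) yields
$$\mathbf{v}_w = \frac{\lambda_w}{\lambda_n}\,\mathbf{v}_n + \lambda_w \mathbf{K}\,\bigl[\nabla p_c + (\varrho_w - \varrho_n)\,\mathbf{g}\bigr]. \tag{46}$$
With $\mathbf{v}_n = \mathbf{v}_t - \mathbf{v}_w$, (46) can be reformulated as the fractional flow equation for $\mathbf{v}_w$:
$$\mathbf{v}_w = \frac{\lambda_w}{\lambda_w + \lambda_n}\,\mathbf{v}_t + \frac{\lambda_w \lambda_n}{\lambda_w + \lambda_n}\,\mathbf{K}\,\bigl[\nabla p_c + (\varrho_w - \varrho_n)\,\mathbf{g}\bigr], \tag{47}$$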
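which can be further inserted into the wetting-phase mass balance equation, leading to a transport equation for the wetting-phase saturation related to $\mathbf{v}_t$:
$$\frac{\partial(\phi\,\varrho_w S_w)}{\partial t} + \nabla\cdot\bigl[\varrho_w \bigl(f_w \mathbf{v}_t + f_w \lambda_n \mathbf{K}\,(\nabla p_c + (\varrho_w - \varrho_n)\,\mathbf{g})\bigr)\bigr] - \varrho_w q_w = 0. \tag{48}$$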
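Some terms of (48) can be reformulated in dependence on the saturation (for details, see Helmig 1997). For incompressible fluids and a porosity which does not change in time, the saturation equation of a two-phase system can then be formulated showing the typical character of a transport equation as
$$\phi\,\frac{\partial S_w}{\partial t} + \left[\mathbf{v}_t \frac{df_w}{dS_w} + \frac{d(f_w \lambda_n)}{dS_w}\,\mathbf{K}\,(\varrho_w - \varrho_n)\,\mathbf{g}\right]\cdot\nabla S_w + \nabla\cdot\left[\bar{\lambda}\,\mathbf{K}\,\frac{dp_c}{dS_w}\,\nabla S_w\right] - q_w + f_w q_t = 0, \tag{49}$$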
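where $q_t = q_w + q_n$ and $\bar{\lambda} = f_w \lambda_n = \lambda_w \lambda_n/\lambda_t$ is the mobility factor of the capillary term in (48). Similarly, an equation for the non-wetting-phase saturation can be derived, which can be written in its final form as
$$\frac{\partial(\phi\,\varrho_n S_n)}{\partial t} + \nabla\cdot\bigl[\varrho_n \bigl(f_n \mathbf{v}_t - f_n \lambda_w \mathbf{K}\,(\nabla p_c - (\varrho_n - \varrho_w)\,\mathbf{g})\bigr)\bigr] - \varrho_n q_n = 0. \tag{50}$$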
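The fractional flow splitting can be verified numerically in one space dimension: the wetting-phase velocity follows from Eq. (47) and the non-wetting-phase velocity from the flux term in Eq. (50), and both have to add up to the total velocity. The sketch below uses a scalar permeability; the function name and all parameter values are assumptions chosen for illustration.

```python
def phase_velocities(v_t, lam_w, lam_n, K, dpc_dx, rho_w, rho_n, g_x=0.0):
    """Split a 1D total velocity into phase velocities (fractional flow form)."""
    lam_t = lam_w + lam_n
    f_w = lam_w / lam_t
    f_n = lam_n / lam_t
    coupling = f_w * lam_n * K  # identical to f_n * lam_w * K
    v_w = f_w * v_t + coupling * (dpc_dx + (rho_w - rho_n) * g_x)  # Eq. (47)
    v_n = f_n * v_t - coupling * (dpc_dx - (rho_n - rho_w) * g_x)  # flux in Eq. (50)
    return v_w, v_n


# Assumed values: mobilities in 1/(Pa s), K in m^2, gradients in Pa/m
V_T = 1.0e-5
v_w, v_n = phase_velocities(V_T, lam_w=500.0, lam_n=2000.0, K=1.0e-12,
                            dpc_dx=-1.0e4, rho_w=1000.0, rho_n=800.0, g_x=-9.81)
# Without capillary pressure gradient and gravity, v_w = f_w * v_t.
v_w_adv, v_n_adv = phase_velocities(V_T, 500.0, 2000.0, 1.0e-12, 0.0, 1000.0, 800.0)
```

In the first call, capillarity and gravity are strong enough to drive the wetting phase against the direction of the total flow, so the two phases move countercurrently.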
Phase Pressure Formulation for Two-Phase Flow
For the phase pressure formulation, the saturation can be calculated directly from the mass balance equations (31), leading to
$$\frac{\partial(\phi\,\varrho_w S_w)}{\partial t} + \nabla\cdot(\varrho_w \mathbf{v}_w) = \varrho_w q_w \tag{51}$$
for the wetting phase of a two-phase system and to
$$\frac{\partial(\phi\,\varrho_n S_n)}{\partial t} + \nabla\cdot(\varrho_n \mathbf{v}_n) = \varrho_n q_n, \tag{52}$$
where the phase velocities can be calculated using the extended Darcy’s law. A nice feature of the global pressure formulation is that the two equations (pressure equation and saturation equation) are only weakly coupled through the presence of the total mobility and the fractional flow functions in the pressure equation. These depend on the relative permeabilities of the phases and thus on the saturation. This also holds for the phase pressure formulation. However, in that formulation the coupling is strengthened again due to the additional capillary pressure term in the pressure equation.
The Compositional Case
Similarly to the fractional flow formulations for immiscible multiphase flow, decoupled formulations for compositional flow have been developed. However, dissolution and phase changes of components affect the volume of mixtures, compromising the assumption of a divergence-free total velocity field. In Ács et al. (1985) and Trangenstein and Bell (1989), a pressure equation for compositional flow in porous media based on volume conservation is presented. The derivation as done in van Odyck et al. (2008) is based on the constraint that the pore space always has to be filled by some fluid, i.e., the volume of the fluid mixture $v_t = \sum_\alpha v_\alpha$ $[\mathrm{m^3/m^3}]$ has to equal the pore volume:
$$v_t = \phi. \tag{53}$$
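To capture transient processes, both sides of the volume balance are approximated in time by a Taylor series:
$$v_t(t) + \Delta t\,\frac{\partial v_t}{\partial t} + \mathcal{O}(\Delta t^2) = \phi(t) + \Delta t\,\frac{\partial \phi}{\partial t} + \mathcal{O}(\Delta t^2). \tag{54}$$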
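In the isothermal case, the fluid volume changes in time if there are variations of pressure or a change of mass. If the latter is expressed in terms of the total concentration $C^\kappa$ as introduced in Eq. (32), we get
$$\frac{\partial v_t}{\partial t} = \frac{\partial v_t}{\partial p}\frac{\partial p}{\partial t} + \sum_\kappa \frac{\partial v_t}{\partial C^\kappa}\frac{\partial C^\kappa}{\partial t}. \tag{55}$$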
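Inserting Eq. (55) into (54), neglecting the higher-order terms, and reordering under the assumption of an incompressible porous medium ($\partial\phi/\partial t = 0$) yields
$$\frac{\partial v_t}{\partial p}\frac{\partial p}{\partial t} + \sum_\kappa \frac{\partial v_t}{\partial C^\kappa}\frac{\partial C^\kappa}{\partial t} = \frac{\phi - v_t}{\Delta t}. \tag{56}$$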
For the change of total concentration in time, we insert the compositional conservation equation (34). The term on the right-hand side of Eq. (56) arises if non-iterated secondary variables such as density lead to minor violations of Eq. (53). Details on the treatment of this volume-error term can be found in Fritz et al. (2012) and Pau et al. (2012).
3.4 Non-isothermal Flow
The consideration of non-isothermal flow processes involves an additional conserved quantity: energy. This is expressed as the internal energy inside a unit volume, which consists of the internal energies of the matrix and the fluids:
$$U = \int_G \left[\phi \sum_\alpha (\varrho_\alpha S_\alpha u_\alpha) + (1 - \phi)\,\varrho_s c_s T\right] dG, \tag{57}$$
where the internal energy is assumed to be a linear function of temperature $T$ above a reference point. Here, $c_s$ denotes the heat capacity of the rock and $u_\alpha$ is the specific internal energy of phase $\alpha$. The internal energy in a system is increased by heat fluxes into the system and by mechanical work done on the system:
$$\frac{dU}{dt} = \frac{dQ}{dt} + \frac{dW}{dt}. \tag{58}$$
Heat flows over the control volume boundaries by conduction, which is a linear function of the temperature gradient and occurs in the direction of falling temperatures:
$$\frac{dQ}{dt} = \oint_\Gamma \mathbf{n}\cdot(\lambda_s \nabla T)\, d\Gamma = \int_G \nabla\cdot(\lambda_s \nabla T)\, dG. \tag{59}$$
The mechanical work done by the system (and therefore decreasing its energy) is volume-changing work. It is done when fluids flow over the control volume boundaries against a pressure $p$:
$$\frac{dW}{dt} = -\oint_\Gamma p\,(\mathbf{n}\cdot\mathbf{v})\, d\Gamma = -\int_G \nabla\cdot(p\,\mathbf{v})\, dG. \tag{60}$$
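The left-hand side of Eq. (58) can be expressed by the Reynolds transport theorem, where the velocity of the solid phase equals zero:
$$\frac{dU}{dt} = \int_G \left[\frac{\partial}{\partial t}\Bigl(\phi \sum_\alpha \varrho_\alpha S_\alpha u_\alpha\Bigr) + \frac{\partial}{\partial t}\bigl((1 - \phi)\,\varrho_s c_s T\bigr)\right] dG + \oint_\Gamma \mathbf{n}\cdot\sum_\alpha (\varrho_\alpha u_\alpha \mathbf{v}_\alpha)\, d\Gamma. \tag{61}$$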
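Using the definition of specific enthalpy $h = u + p/\varrho$, the second term on the right-hand side of Eq. (61) and the right-hand side of Eq. (60) can be combined, and Eq. (58) can be rewritten as
$$\int_G \left[\frac{\partial}{\partial t}\Bigl(\phi \sum_\alpha \varrho_\alpha S_\alpha u_\alpha\Bigr) + \frac{\partial}{\partial t}\bigl((1 - \phi)\,\varrho_s c_s T\bigr)\right] dG = \oint_\Gamma \mathbf{n}\cdot(\lambda_s \nabla T)\, d\Gamma - \oint_\Gamma \mathbf{n}\cdot\sum_\alpha (\varrho_\alpha h_\alpha \mathbf{v}_\alpha)\, d\Gamma \tag{62}$$
or in differential form
$$\frac{\partial}{\partial t}\Bigl(\phi \sum_\alpha \varrho_\alpha S_\alpha u_\alpha\Bigr) + \frac{\partial}{\partial t}\bigl((1 - \phi)\,\varrho_s c_s T\bigr) = \nabla\cdot(\lambda_s \nabla T) - \nabla\cdot\sum_\alpha (\varrho_\alpha h_\alpha \mathbf{v}_\alpha). \tag{63}$$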
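The step that introduces the enthalpy can be made explicit. The following lines are a sketch that assumes the volume-changing work of Eq. (60) is evaluated phase by phase, i.e., with $p_\alpha \mathbf{v}_\alpha$ summed over the phases; this per-phase reading is an assumption of the illustration.

```latex
\begin{align*}
-\oint_\Gamma \mathbf{n}\cdot\sum_\alpha (\varrho_\alpha u_\alpha \mathbf{v}_\alpha)\, d\Gamma
-\oint_\Gamma \mathbf{n}\cdot\sum_\alpha (p_\alpha \mathbf{v}_\alpha)\, d\Gamma
 &= -\oint_\Gamma \mathbf{n}\cdot\sum_\alpha \varrho_\alpha
     \left(u_\alpha + \frac{p_\alpha}{\varrho_\alpha}\right)\mathbf{v}_\alpha \, d\Gamma \\
 &= -\oint_\Gamma \mathbf{n}\cdot\sum_\alpha (\varrho_\alpha h_\alpha \mathbf{v}_\alpha)\, d\Gamma .
\end{align*}
```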
4 Numerical Solution Approaches
After establishing the continuous models, it remains to choose numerical discretization and solution schemes. We will only very briefly address this question for the fully coupled balance equations (31) or (34) in Sect. 4.1; our main emphasis here is on the decoupled systems, which will be treated in Sect. 4.2.
4.1 Solution of the Fully Coupled Equations
One possibility to calculate multiphase flow is to directly solve the system of equations given by the balances (31) or (34). These mass balance equations are usually nonlinear and strongly coupled. Thus, we also call this the fully coupled multiphase flow formulation. After space discretization, the system of equations one has to solve can be written as
$$\frac{\partial}{\partial t} M(u) + A(u) = R(u), \tag{64}$$
where $M$ consists of the accumulation terms, $A$ includes the internal flux terms, and $R$ is the right-hand side vector, which comprises Neumann boundary flux terms as well as source or sink terms. An implicit time discretization is applied to (64), which results in a fully implicit formulation. Usually, the implicit Euler method is chosen. All resulting equations have to be solved simultaneously due to the strong coupling. Therefore, a linearization technique has to be applied. The most common solution method is the Newton–Raphson algorithm (Aziz and Wong 1989; Dennis and Schnabel 1996). Advantages of the fully coupled formulation and the implicit method, respectively, are that it includes the whole range of physical effects (capillarity, gravity, ...) without additional effort, that it is quite stable, and that it is usually not very sensitive to the choice of the time-step size. The disadvantage is that a global system of equations, which is twice as large as for a single-phase pressure equation if two-phase flow is calculated (and even larger in the non-isothermal case or when more phases are included), has to be solved several times during each time step, depending on the number of iterations the linearization algorithm needs to converge.
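The combination of implicit Euler time stepping and Newton–Raphson linearization can be illustrated on a scalar analogue of (64). The sketch below is a toy problem, not a multiphase simulator: the function name, the choices $M(u) = u$, $A(u) = u^3$, a constant $R = 1$, and the time-step size are all assumptions made for this illustration.

```python
def implicit_euler_newton(u_old, dt, M, dM, A, dA, R, tol=1e-12, max_iter=50):
    """One implicit Euler step for d/dt M(u) + A(u) = R, solved by Newton-Raphson."""
    u = u_old  # initial guess: solution of the previous time level
    for _ in range(max_iter):
        residual = (M(u) - M(u_old)) / dt + A(u) - R
        if abs(residual) < tol:
            return u
        jacobian = dM(u) / dt + dA(u)  # derivative of the residual with respect to u
        u -= residual / jacobian
    raise RuntimeError("Newton iteration did not converge")


# Scalar toy problem du/dt + u**3 = 1 with steady state u = 1
u = 0.0
for _ in range(100):
    u = implicit_euler_newton(u, dt=0.5,
                              M=lambda v: v, dM=lambda v: 1.0,
                              A=lambda v: v ** 3, dA=lambda v: 3.0 * v * v,
                              R=1.0)
```

In a real simulator, `u` is the vector of all primary variables, the Jacobian is a large sparse matrix, and the division is replaced by a linear solve in every Newton iteration, which is the cost referred to in the text.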
4.2 Solution of the Decoupled Equations
As before, we split our considerations into the immiscible and the miscible case.
The Immiscible Case As their name implies, decoupled formulations decouple the system of equations of a multiphase flow formulation to some extent. In the immiscible case, the result is an equation for pressure and additional transport equations for one saturation (see section “The Immiscible Case”) in the case of two-phase flow or several saturations if more phases are considered. The new equations are still weakly coupled due to the saturation-dependent parameters like relative permeabilities or capillary pressure in the pressure equation and the pressure-dependent parameters like density and viscosity in the saturation transport equation. Nevertheless, in many cases it is possible to solve this system of equations sequentially. Numerically, this is usually done by using an IMPES scheme (IMplicit Pressure – Explicit Saturation), which was first introduced in Sheldon and Cardwell (1959) and Stone and Garder (1961). The pressure equation is solved first implicitly. From the resulting pressure field, the velocity field can be calculated and inserted into the saturation equation which is then solved explicitly. One major advantage of the decoupled formulation is that it allows for different discretization of the different equations. For the pressure equation, it is of utmost importance that its solution admits the calculation of a locally conservative velocity field. There are various discretization methods meeting this requirement, like finite volumes with two-point or multipoint flux approximation (Aavatsmark 2002; Eigestad and Klausen 2005; Klausen and Winther 2006; Cao et al. 2008; Aavatsmark et al. 2008), mixed finite elements (Brezzi and Fortin 1991; Allen et al. 1992; Srinivas et al. 1992; Huang 2000; Mazzia and Putti 2006), or mimetic finite differences (Shashkov 1996; Hyman et al. 2002; Berndt et al. 2005; Brezzi et al. 2005a,b). 
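The IMPES idea can be sketched for a strongly simplified setting: one space dimension, incompressible fluids, no gravity, no capillary pressure, zero residual saturations, and the linear relative permeabilities (15) and (16). The pressure equation is discretized with cell-centered finite volumes and solved implicitly; the saturation is then advanced explicitly with upwind fluxes. All names, parameter values, the arithmetic averaging of the total mobility at cell faces, the assumption of flow from left to right, and the simple time-step rule are assumptions of this illustration.

```python
MU_W, MU_N = 1.0e-3, 2.0e-3  # Pa s, assumed viscosities
K, PHI = 1.0e-12, 0.2        # m^2 and porosity, assumed


def lam_w(s):
    return s / MU_W          # linear k_rw, Eq. (15)


def lam_n(s):
    return (1.0 - s) / MU_N  # linear k_rn, Eq. (16)


def f_w(s):
    return lam_w(s) / (lam_w(s) + lam_n(s))


def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are not used."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def impes_step(s, dx, p_left, p_right, s_inflow, cfl):
    """One IMPES step; assumes p_left > p_right, i.e., flow from left to right."""
    n = len(s)
    lam_t = [lam_w(v) + lam_n(v) for v in s]
    # Face transmissibilities; boundary faces lie half a cell from the cell center.
    T = [0.0] * (n + 1)
    T[0] = 2.0 * K * (lam_w(s_inflow) + lam_n(s_inflow)) / dx
    T[n] = 2.0 * K * lam_t[-1] / dx
    for i in range(1, n):
        T[i] = K * 0.5 * (lam_t[i - 1] + lam_t[i]) / dx
    # Implicit pressure step: the net total flux of every cell vanishes.
    lower = [0.0] + [-T[i] for i in range(1, n)]
    upper = [-T[i + 1] for i in range(n - 1)] + [0.0]
    diag = [T[i] + T[i + 1] for i in range(n)]
    rhs = [0.0] * n
    rhs[0] += T[0] * p_left
    rhs[-1] += T[n] * p_right
    p = thomas(lower, diag, upper, rhs)
    # Total flux across every face, positive from left to right.
    q = [T[0] * (p_left - p[0])]
    q += [T[i] * (p[i - 1] - p[i]) for i in range(1, n)]
    q.append(T[n] * (p[-1] - p_right))
    # Explicit saturation step with upwind fractional flow.
    dt = cfl * PHI * dx / max(abs(v) for v in q)
    s_up = [s_inflow] + list(s)  # s_up[i] is the upstream saturation of face i
    s_new = [s[i] - dt / (PHI * dx)
             * (f_w(s_up[i + 1]) * q[i + 1] - f_w(s_up[i]) * q[i])
             for i in range(n)]
    return s_new, p, q, dt


n_cells, dx = 20, 0.05
s = [0.0] * n_cells          # initially filled with the non-wetting phase
for _ in range(20):
    s, p, q, dt = impes_step(s, dx, p_left=2.0e5, p_right=1.0e5,
                             s_inflow=1.0, cfl=0.4)
```

Each step requires only one linear solve with as many unknowns as there are cells, whereas the explicit saturation update is cheap but restricts the admissible time-step size.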
Moreover, it is also possible to use discretizations with nonconservative standard velocity fields and employ a post-processing step to reconstruct a locally conservative scheme. This has been investigated for discontinuous Galerkin methods in Bastian and Rivière (2003), while for standard Lagrangian finite elements, it is possible to calculate equilibrated fluxes known from a posteriori error estimation (Ainsworth and Oden 2000). Similarly, there exists a variety of discretization methods for the solution of the transport equation(s), ranging from standard upwind finite volume approaches (Eymard et al. 2000; LeVeque 2002), over higher-order discontinuous Galerkin methods with slope limiter (Cockburn et al. 1989; Cockburn and Shu 1989; Hoteit
et al. 2004; Ghostine et al. 2009), the modified method of characteristics (Ewing et al. 1984; Dawson et al. 1989; Douglas et al. 1999; Chen et al. 2002), and the Eulerian–Lagrangian localized adjoint method (Russell 1990; Herrera et al. 1993; Ewing and Wang 1994, 1996; Wang et al. 2002), up to streamline methods (Matringe et al. 2006; Juanes and Lie 2008; Oladyshkin et al. 2008). The IMPES scheme can be very efficient, since a system of equations with only $n$ unknowns, where $n$ is the number of degrees of freedom for the discretization of the pressure equation, has to be solved only once in the pressure step. In comparison, several solutions of a system of equations with $m$ unknowns have to be calculated in the fully coupled scheme, where $m$ is usually at least twice as large as $n$. However, there are strong restrictions with respect to the choice of the time-step size. Stability analyses of the IMPES scheme can be found in, for example, Aziz and Settari (1979), Russell (1989), and Coats (2003a,b). The scheme is conditionally stable if, for each grid cell $i$,
$$\frac{\Delta t \, F_i}{\phi_i V_i} \le 1 = \mathrm{CFL}, \tag{65}$$
where CFL is the CFL number, $\Delta t$ is the time-step size, $V_i$ is the volume of cell $i$, and $\phi_i$ is its porosity. The CFL volume flux $F_i$ of cell $i$ can be defined as
$$F_i = \max\left(F_{i,\mathrm{in}}, F_{i,\mathrm{out}}\right), \tag{66}$$
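where $F_{i,\mathrm{in}}$ is the sum of all CFL volume fluxes $q_j$ that enter cell $i$ through its faces $j$ and $F_{i,\mathrm{out}}$ is the sum of the fluxes that leave it:
$$F_{i,\mathrm{in}} = \sum_j q_{j,\mathrm{in}}, \qquad F_{i,\mathrm{out}} = \sum_j q_{j,\mathrm{out}}. \tag{67}$$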
Equation (65) states that the volume obtained by multiplying $F_i$ by $\Delta t$ must not exceed the pore volume of cell $i$. In the one-dimensional case, this condition is the well-known Courant–Friedrichs–Lewy (CFL) condition:
$$\frac{\Delta t \, v_i}{\Delta x} \le 1 = \mathrm{CFL}. \tag{68}$$
From Eq. (65) it follows that a stable time step can be estimated as
$$\Delta t = \mathrm{CFL} \, \min_i \frac{\phi_i V_i}{F_i}, \qquad \mathrm{CFL} \le 1. \tag{69}$$
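As a rough illustration of Eqs. (66), (67), and (69), the following Python sketch estimates a stable time step from signed face fluxes on an arbitrary finite-volume grid. The data layout (a `face_cells` array in which -1 marks the domain boundary) and the function names are assumptions made for this example; they are not taken from the cited references or from any particular simulator.

```python
import numpy as np


def cfl_volume_flux(face_cells, face_flux, n_cells):
    """Return F_i = max(F_in, F_out) for every cell, cf. Eqs. (66)-(67).

    face_cells : (n_faces, 2) integer array; a positive flux points from
                 cell [0] to cell [1]; -1 marks the domain boundary.
    face_flux  : (n_faces,) signed volumetric flux through each face.
    """
    f_in = np.zeros(n_cells)
    f_out = np.zeros(n_cells)
    for (a, b), q in zip(face_cells, face_flux):
        src, dst = (a, b) if q >= 0.0 else (b, a)
        if src >= 0:
            f_out[src] += abs(q)  # flux leaving the upstream cell
        if dst >= 0:
            f_in[dst] += abs(q)   # flux entering the downstream cell
    return np.maximum(f_in, f_out)


def stable_time_step(porosity, volume, flux, cfl=1.0):
    """Eq. (69): dt = CFL * min_i(phi_i V_i / F_i); cells without flux are skipped."""
    if not 0.0 < cfl <= 1.0:
        raise ValueError("CFL must lie in (0, 1]")
    active = flux > 0.0
    if not np.any(active):
        return np.inf
    return cfl * np.min(porosity[active] * volume[active] / flux[active])


# Usage: three cells in a row with a uniform through-flow.
face_cells = np.array([[-1, 0], [0, 1], [1, 2], [2, -1]])
face_flux = np.array([2.0, 2.0, 2.0, 2.0])
F = cfl_volume_flux(face_cells, face_flux, n_cells=3)
dt = stable_time_step(np.full(3, 0.2), np.array([10.0, 5.0, 10.0]), F)
```

In this usage example the smallest cell limits the time step, which is the behavior that Eq. (69) expresses through the minimum over all cells.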
There exist various approaches for the estimation of $F_i$, which differ for different kinds of transport equations and for varying complexity of the physical processes. One straightforward approach for calculating $F_i$ is to define the fluxes $q_j$ to be the face fluxes:
$$q_j = \sum_\alpha \left| \mathbf{v}_{\alpha j} \cdot \mathbf{n}_j A_j \right|, \tag{70}$$
where $\mathbf{n}_j$ and $A_j$ are the face normal and the face area, respectively. It is important to note, however, that for $\mathrm{CFL} = 1$ this approach guarantees stability of the IMPES scheme but leads to a physically correct solution only if a linear hyperbolic transport equation is solved. In this case, the displacement front is a shock moving with constant velocity. If the transport equation is nonlinear hyperbolic or parabolic, the front velocity varies with saturation and diffusive transport may occur. In this case, the CFL number has to be less than one. To avoid such a heuristic choice of the CFL number, stability analyses provide theoretically based estimates for $F_i$. If capillary pressure and gravity are neglected, $F_{i,\mathrm{in}}$ and $F_{i,\mathrm{out}}$ can be estimated as derived in Aziz and Settari (1979) or Coats (2003b) using
$$q_j = \left| \left( \frac{d f_w}{d S_w} \right)_j \sum_\alpha \mathbf{v}_{\alpha j} \cdot \mathbf{n}_j A_j \right|. \tag{71}$$
In this approach, the derivative $d f_w / d S_w$ of the wetting-phase fractional-flow function accounts for the nonlinearity of the movement of the fluid fronts. Thus, the CFL number can be chosen as $\mathrm{CFL} = 1$. For the general case, including capillary pressure and gravity, an IMPES stability criterion is derived in Coats (2003a), which also accounts for countercurrent flow. It is originally derived for three-phase flow but can be reduced to the simpler case of two-phase flow. The volume fluxes $F_{i,\mathrm{in}}$ and $F_{i,\mathrm{out}}$ are then estimated from Eq. (67) by replacing $q_j$ with
\begin{align}
q_j = T_j \Bigl[ & f_{nj} \left( \frac{d \lambda_w}{d S_w} \right)_j \left| \Delta \Phi_w \right|_j - f_{wj} \left( \frac{d \lambda_n}{d S_w} \right)_j \left| \Delta \Phi_n \right|_j \tag{72} \\
& - f_{wj} \lambda_{nj} \left( \frac{d p_c}{d S_w} \right)_j \Bigr]. \tag{73}
\end{align}
The efficiency of the explicit scheme mainly depends on the time-step size. If the nonlinear coupling between pressure and transport equation or the physically correct approximation of the transport requires small time-step sizes, implicit schemes may become more efficient. A comparison between an IMPES scheme and a fully implicit scheme can, for example, be found in Sleep and Sykes (1993).
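The sequence described above (implicit pressure solve, flux reconstruction, explicit saturation update, time step limited by a CFL criterion) can be sketched for a one-dimensional, incompressible two-phase problem without gravity and capillary pressure. Everything specific in this sketch is an assumption made for illustration: the quadratic relative permeabilities, the viscosity values, the arithmetic face mobilities, the fixed-pressure boundaries, and the use of a global bound on $|df_w/dS_w|$ in the spirit of Eq. (71). It is not the implementation used by the authors.

```python
import numpy as np

MU_W, MU_N = 1.0e-3, 5.0e-3  # phase viscosities [kg/(m s)], illustrative values


def mobilities(sw):
    """Phase mobilities for quadratic relative permeabilities (assumed)."""
    return sw**2 / MU_W, (1.0 - sw)**2 / MU_N


def frac_flow(sw):
    """Wetting-phase fractional flow without gravity and capillary pressure."""
    lam_w, lam_n = mobilities(sw)
    return lam_w / (lam_w + lam_n)


# Global bound on |df_w/dS_w|, evaluated numerically on a fine sample.
_S = np.linspace(0.0, 1.0, 2001)
DFW_MAX = np.max(np.abs(np.gradient(frac_flow(_S), _S)))


def impes_step(sw, perm, phi, dx, p_left, p_right, cfl=0.9):
    """One IMPES step on a uniform 1D grid; requires p_left > p_right."""
    n = sw.size
    lam_w, lam_n = mobilities(sw)
    lam_t = lam_w + lam_n
    # Implicit pressure step: two-point flux approximation.
    t_int = perm / dx * 0.5 * (lam_t[:-1] + lam_t[1:])  # interior faces
    t_l = 2.0 * perm / dx / MU_W       # inflow face, water at S_w = 1
    t_r = 2.0 * perm / dx * lam_t[-1]  # outflow face
    a = np.zeros((n, n))
    b = np.zeros(n)
    for i in range(n - 1):
        a[i, i] += t_int[i]
        a[i, i + 1] -= t_int[i]
        a[i + 1, i + 1] += t_int[i]
        a[i + 1, i] -= t_int[i]
    a[0, 0] += t_l
    b[0] += t_l * p_left
    a[-1, -1] += t_r
    b[-1] += t_r * p_right
    p = np.linalg.solve(a, b)
    # Total flux per unit cross-section at the n + 1 faces (positive to the right).
    q = np.empty(n + 1)
    q[0] = t_l * (p_left - p[0])
    q[1:-1] = t_int * (p[:-1] - p[1:])
    q[-1] = t_r * (p[-1] - p_right)
    # Time step from Eq. (69) with V_i = dx and F_i = max|df_w/dS_w| * |q|.
    dt = cfl * phi * dx / (DFW_MAX * np.max(np.abs(q)))
    # Explicit saturation step with upwind fractional flow (flow to the right).
    f_face = np.concatenate(([1.0], frac_flow(sw)))
    sw_new = sw + dt / (phi * dx) * (q[:-1] * f_face[:-1] - q[1:] * f_face[1:])
    return sw_new, dt, q


# Usage: water displaces the non-wetting phase from left to right.
sw = np.zeros(20)
injected = 0.0
for _ in range(15):
    sw, dt, q = impes_step(sw, perm=1e-12, phi=0.2, dx=1.0,
                           p_left=2.0e5, p_right=1.0e5)
    injected += dt * q[0]
```

Only one linear system of size `n` is solved per step, while the saturation update is a vector operation. The price is the time-step restriction, which here is tied to the steepest slope of the fractional-flow function.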
The Compositional Case The decoupled compositional multiphase flow equations derived in section “The Compositional Case” can be solved sequentially according to the IMPES scheme, where it is commonly referred to as IMPEC scheme, since concentrations are considered in this case. First, the pressure equation (56) is solved implicitly to obtain a pressure field and fluid-phase velocities which are used to explicitly solve
the transport equation (34). After each of these pressure-transport sequences, the distribution of total concentrations is known. For the next sequence, phase saturations and component mass fractions are needed. These are obtained by performing a phase equilibrium calculation, the so-called flash calculation (see below), which forms the last step of the IMPEC scheme.
Flash Calculations After the evaluation of the transport equation, the total concentrations $C^\kappa$ at each cell or node are known. From these, an overall mass fraction (or feed mass fraction) $z^\kappa = C^\kappa / \sum_\kappa C^\kappa$ of each component inside the mixture is calculated. We are interested in the phase distribution and the phase composition, as these quantities appear in the balance equations. To obtain these, we first introduce the equilibrium ratios
$$K_\alpha^\kappa = \frac{X_\alpha^\kappa}{X_r^\kappa}, \tag{74}$$
which relate the mass fraction of each component in each phase $\alpha$ to its mass fraction in a reference phase $r$, so that $K_r^\kappa$ always equals unity. The equilibrium ratios can be obtained by using the laws for fluid-phase equilibria in section “Laws for Fluid-Phase Equilibria,” as described in Niessner and Helmig (2007), or by incorporating a thermodynamic equation of state (Aziz and Wong 1989; Nghiem and Li 1984; Michelsen and Mollerup 2007). In the former case, the equilibrium ratios are independent of the composition and are therefore constant for constant pressure and temperature. The feed mass fraction $z^\kappa$ can be related to the phase mass fractions $X_\alpha^\kappa$ via the phase mass fraction $\nu_\alpha = C_\alpha / \sum_\alpha C_\alpha$ by
$$z^\kappa = \sum_\alpha \nu_\alpha X_\alpha^\kappa. \tag{75}$$
Combining and rearranging (74) and (75) yields
$$X_r^\kappa = \frac{z^\kappa}{\sum_{\alpha \ne r} K_\alpha^\kappa \nu_\alpha + \nu_r}, \tag{76}$$
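and some more steps, which are described in detail in Aziz and Wong (1989), yield a set of $n_\alpha - 1$ equations, known as the Rachford–Rice equation,
$$\sum_\kappa \frac{z^\kappa \left( K_\alpha^\kappa - 1 \right)}{1 + \sum_{\beta \ne r} \left( K_\beta^\kappa - 1 \right) \nu_\beta} = 0 \qquad \forall \alpha \ne r, \tag{77}$$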
which generally has to be solved iteratively for the phase mass fractions $\nu_\alpha$. The Rachford–Rice equation can be solved analytically only if the number of phases equals the number of components. Once the phase mass fractions are known, the mass fractions of the components inside the reference phase can be calculated by (76) and then lead to the mass fractions inside the other phases via (74). Other flash calculation approaches, which use so-called reduced equation algorithms based on modified forms of the equations given here, are presented in Wang and Barker (1995).
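For two phases, Eq. (77) reduces to a single scalar equation for the mass fraction $\nu$ of the non-reference phase, with $\nu_r = 1 - \nu$. The Python sketch below solves it by bisection and then evaluates Eqs. (76) and (74). It assumes constant, positive equilibrium ratios (the composition-independent case mentioned above); the function name, the tolerance, and the choice of bisection instead of Newton's method are choices made for this example only.

```python
import numpy as np


def rachford_rice(z, k, tol=1e-12, max_iter=200):
    """Two-phase flash with constant equilibrium ratios.

    z : feed mass fractions z^kappa (summing to one)
    k : equilibrium ratios K^kappa of the non-reference phase (all > 0)
    Returns (nu, x_ref, x_alpha): mass fraction of the non-reference phase
    and the component mass fractions in the reference and the other phase.
    """
    z = np.asarray(z, dtype=float)
    k = np.asarray(k, dtype=float)

    def g(nu):
        # Left-hand side of Eq. (77) for a single non-reference phase.
        return np.sum(z * (k - 1.0) / (1.0 + (k - 1.0) * nu))

    # g decreases monotonically on [0, 1]; no sign change means one phase.
    if g(0.0) <= 0.0:
        nu = 0.0          # only the reference phase is present
    elif g(1.0) >= 0.0:
        nu = 1.0          # only the non-reference phase is present
    else:
        lo, hi = 0.0, 1.0
        for _ in range(max_iter):
            nu = 0.5 * (lo + hi)
            if g(nu) > 0.0:
                lo = nu
            else:
                hi = nu
            if hi - lo < tol:
                break
    x_ref = z / (1.0 + (k - 1.0) * nu)   # Eq. (76) with nu_r = 1 - nu
    x_alpha = k * x_ref                   # Eq. (74)
    return nu, x_ref, x_alpha


# Usage: a two-component mixture that splits into two phases.
nu, x_ref, x_alpha = rachford_rice([0.5, 0.5], [2.0, 0.5])
```

In the single-phase branches, the composition returned for the absent phase is only the formal value $K^\kappa X_r^\kappa$ and need not sum to one.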
5
Application of Multi-physics and Multi-scale Methods
In the following, we present our approach to multi-scale and multi-physics modeling, which differs slightly from the methods mentioned before, by means of two examples. The common basis for both kinds of methods is the use of an h-adaptive grid. Such a grid can be combined with numerical upscaling methods to obtain a multi-scale method with regard to scale-dependent heterogeneous parameters, or it allows the adaptive local treatment of domains of different physical regimes in the sense of a multi-physics approach. This comparatively simple approach follows the idea of developing a tool that is, above all, simple and flexible enough to be easily applied and adapted to complex real-life applications. All of the methods and examples presented in the following have been implemented in the open-source porous media simulator DuMux (Flemisch et al. 2011).
5.1
A Multi-physics Example
In this section, we introduce a method to couple compositional two-phase flow with single-phase compositional flow as proposed in Fritz et al. (2012). The advantage of this coupling is that for single-phase compositional flow, a simpler pressure equation can be used. Moreover, for single-phase flow, the evaluation of flash calculations can be avoided. The latter becomes ever more interesting when these evaluations require lots of computational power such as in many reservoir engineering problems, where flash calculations may occupy up to 70 % of the total CPU time of a model (see Stenby and Wang 1993).
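Single-Phase Transport We want to take a closer look at the equations for the miscible case and assume that only one phase is present. Inserting the definition of the total concentration (32) into the component mass balance equation (34) for one phase $\alpha$ and applying the product rule yields
$$\phi \varrho_\alpha X_\alpha^\kappa \frac{\partial S_\alpha}{\partial t} + S_\alpha \varrho_\alpha X_\alpha^\kappa \frac{\partial \phi}{\partial t} + \phi S_\alpha X_\alpha^\kappa \frac{\partial \varrho_\alpha}{\partial t} + \phi S_\alpha \varrho_\alpha \frac{\partial X_\alpha^\kappa}{\partial t} = -\nabla \cdot \left( \mathbf{v}_\alpha \varrho_\alpha X_\alpha^\kappa \right) + q^\kappa. \tag{78}$$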
The first term on the left-hand side equals zero since only one phase is present and thus the saturation always equals unity. Further, we assume that phase and matrix compressibilities are of low importance and can be neglected. Then the second and
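third terms on the left-hand side vanish as well. A volumetric source $q_\alpha$ of the present phase with composition $X_Q^\kappa$ is introduced. Using the definition $q^\kappa = q_\alpha \varrho_\alpha X_Q^\kappa$, where $X_Q^\kappa$ is the mass fraction of component $\kappa$ in the source flow, we replace the mass source of component $\kappa$ and write
$$\frac{\partial C^\kappa}{\partial t} = \phi \varrho_\alpha \frac{\partial X_\alpha^\kappa}{\partial t} = -\varrho_\alpha \nabla \cdot \left( \mathbf{v}_\alpha X_\alpha^\kappa \right) + q_\alpha \varrho_\alpha X_Q^\kappa. \tag{79}$$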
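This is the mass balance equation for compositional transport in a single phase. The same assumptions can be applied to the compositional pressure equation. For incompressible flow, the derivatives of volume with respect to mass equal the reciprocal of the phase density, i.e., $\partial v_t / \partial C^\kappa = 1/\varrho_\alpha$. Inserting this identity into Eq. (56), setting the derivative of volume with respect to pressure to zero (incompressible flow), and applying the product rule to the divergence term, we get
$$\sum_\kappa \frac{1}{\varrho_\alpha} \left( X_\alpha^\kappa \varrho_\alpha \nabla \cdot \mathbf{v}_\alpha + \varrho_\alpha \mathbf{v}_\alpha \cdot \nabla X_\alpha^\kappa + X_\alpha^\kappa \mathbf{v}_\alpha \cdot \nabla \varrho_\alpha \right) = \sum_\kappa \frac{1}{\varrho_\alpha} q^\kappa. \tag{80}$$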
By definition, the mass fractions $X_\alpha^\kappa$ inside one phase sum to unity; thus, the second term in parentheses vanishes when summed over all components. Furthermore, the gradient of the density equals zero due to the incompressibility. Again introducing the volumetric source term as above yields the pressure equation for incompressible single-phase compositional flow, which can also be derived from Eq. (35):
$$\nabla \cdot \mathbf{v}_\alpha = q_\alpha. \tag{81}$$
We note here that dropping the assumption of incompressibility only slightly complicates Eq. (81), which still provides an efficiency potential compared to Eq. (56) (Fritz et al. 2012).
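The step from Eq. (80) to Eq. (81) can be written out term by term. The following is a sketch under the assumptions stated in the text (incompressible phase, mass fractions summing to unity, and $q^\kappa = q_\alpha \varrho_\alpha X_Q^\kappa$); that the source composition $X_Q^\kappa$ also sums to unity is an additional assumption made here.

```latex
\begin{align*}
\sum_\kappa \frac{1}{\varrho_\alpha}
  \left( X_\alpha^\kappa \varrho_\alpha \nabla \cdot \mathbf{v}_\alpha
       + \varrho_\alpha \mathbf{v}_\alpha \cdot \nabla X_\alpha^\kappa
       + X_\alpha^\kappa \mathbf{v}_\alpha \cdot \nabla \varrho_\alpha \right)
&= \nabla \cdot \mathbf{v}_\alpha \sum_\kappa X_\alpha^\kappa
 + \mathbf{v}_\alpha \cdot \nabla \Bigl( \sum_\kappa X_\alpha^\kappa \Bigr)
 + \frac{\mathbf{v}_\alpha \cdot \nabla \varrho_\alpha}{\varrho_\alpha}
   \sum_\kappa X_\alpha^\kappa \\
&= \nabla \cdot \mathbf{v}_\alpha + 0 + 0, \\
\sum_\kappa \frac{q^\kappa}{\varrho_\alpha}
&= q_\alpha \sum_\kappa X_Q^\kappa = q_\alpha .
\end{align*}
```

The second term vanishes because the gradient of the constant $\sum_\kappa X_\alpha^\kappa = 1$ is zero, and the third because $\nabla \varrho_\alpha = 0$ for an incompressible phase.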
Model Coupling Consider a spatial integration of Eq. (56) as is done for any numerical discretization. Then the unit of the terms is easily discovered to be volume over time. This basically reveals the physical aspect of the equation, namely, the conservation of total fluid volume as also described in section “The Compositional Case.” The same consideration holds for Eq. (81), which makes sense since it is derived from the compositional pressure equation. The associated transport equations also have a common physical relevance: the conservation of mass. The clear relation of the multiphase compositional and single-phase transport model and the possibility to use the same primary variables (as indicated in Eq. (79)) open the way to couple both models. This makes it possible to fit the model to the actual problem and use a sophisticated and accurate model in a subdomain of special interest, whereas a simpler model can be used in the rest of the domain. Consider, for example, a large hydro-system which is fully saturated with water except at a
spill of nonaqueous phase liquid (NAPL) as displayed in Fig. 9. The components of the NAPL are soluble in water and contaminate it. To model the dissolution of the components in the water, only a small area around the NAPL spill has to be discretized by a compositional two-phase model. In contrast, the spreading of the contaminants in the larger part of the domain can be simulated using a single-phase transport model. The advantage of coupling the two models in this domain is that the costly equilibrium calculations and the evaluation of volume derivatives can be avoided in large parts of it.
Fig. 9 Multi-physics problem example taken from Fritz et al. (2012). Figure labels: $v_w$; $S_w = 1$, $S_n = 0$; $S_n \ne 0$
Practical Implementation The practical implementation of the multi-physics scheme proposed in the preceding sections is done by exploiting the similarity of the equations. Since both pressure equations have the same dimensions and the same unknowns, both can be written into one system of equations. The entries in the stiffness matrix and right-hand side vector are evaluated using either Eq. (56) if the control volume is situated inside the subdomain or Eq. (81) in the other parts of the domain. Equation (56) can also be set up properly at the internal boundary of the subdomain, since all coefficients can be determined. The coefficients concerning the phase which is not present outside the subdomain just have to be set to zero for the outer elements which makes all terms concerning this phase vanish at the boundary. Also the transport equation is well defined at the boundary. Since only one phase is present outside the subdomain, the mobility coefficient in Eq. (34) will equal zero for all other phases and the multiphase compositional mass balance will boil down to Eq. (79) at the interface. Implicitly the coupling conditions are already contained in the presented scheme: first, mass fluxes have to be continuous across the subdomain boundary and second, phase velocities have to be continuous across the subdomain boundary. Since only one phase is present outside the subdomain, it is obvious that the second condition requires that only this one phase may flow across the subdomain boundary. Another effect that has to be considered is demixing. Solubilities usually depend on pressure. If a phase is fully saturated with a certain component and then moves further downstream, where the pressure is lower, the solubility decreases and demixing occurs. If the solubility is exceeded outside the subdomain, this effect is not represented. These two considerations show the crucial importance of an adequate choice of the subdomain. 
On the one hand, we want the subdomain to be as small as possible to obtain an economic model; on the other hand, it must be chosen large enough to prevent errors. Especially in large and heterogeneous simulation
problems, a proper subdomain can hardly be determined in advance, so an automatic adaptation is sought. As the most logical scheme, we propose to assign all cells with more than one phase and, since demixing occurs predominantly there, all directly adjacent cells to the subdomain. At the end of each time step, this choice is checked: superfluous cells are removed and necessary cells are added. This rather simple decomposition can only be expected to be successful in the case of an explicit solution of the transport equations. In particular, the fulfillment of the CFL condition guarantees that no modeling error will occur, since information is transported at most one cell further in one time step. An example of the described subdomain adaptivity is given in Fig. 10, which displays nearly residual air that moves slightly upward before it dissolves in water. The subdomain is marked by black squares. The upper row shows the initial subdomain and its expansion due to demixing after 100 time steps. The lower row shows that cells are also removed from the subdomain and that the subdomain moves and finally vanishes with the air phase.
Fig. 10 Dissolution of residual air in water. In black squares, adaptive subdomain at initial conditions, after 100, 200, and 300 time steps, respectively (taken from Fritz et al. (2012))
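The adaptation rule described above (all cells with more than one phase plus their direct neighbors) can be sketched on a structured two-dimensional grid as follows. The function name, the tolerance `eps`, and the restriction to face neighbors are assumptions made for this example; the actual implementation may differ, for instance on unstructured grids.

```python
import numpy as np


def select_subdomain(s_n, eps=1e-10):
    """Return a boolean mask of the cells assigned to the two-phase subdomain.

    s_n : 2D array with the non-wetting phase saturation of each cell.
    A cell holds more than one phase if eps < s_n < 1 - eps; all cells
    sharing a face with such a cell are added as well.
    """
    two_phase = (s_n > eps) & (s_n < 1.0 - eps)
    mask = two_phase.copy()
    mask[1:, :] |= two_phase[:-1, :]   # neighbor in +row direction
    mask[:-1, :] |= two_phase[1:, :]   # neighbor in -row direction
    mask[:, 1:] |= two_phase[:, :-1]   # neighbor in +column direction
    mask[:, :-1] |= two_phase[:, 1:]   # neighbor in -column direction
    return mask


# Usage: a single two-phase cell in the middle of a 5 x 5 grid.
s_n = np.zeros((5, 5))
s_n[2, 2] = 0.3
mask = select_subdomain(s_n)
```

Because the mask is rebuilt from the current saturation field, calling the function at the end of each time step both adds newly required cells and drops superfluous ones.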
Application Example To demonstrate the performance of the multi-physics approach and to compare it to the full compositional two-phase model on a real-life problem, we chose a benchmark problem for carbon dioxide sequestration as proposed in Class et al. (2009). Carbon dioxide is injected at a depth of 2,960–3,010 m into a saline aquifer. The simulation covers a time period of 50 years, and injection is stopped after 25 years. The given spatial discretization comprises 54,756 control volumes. Figure 11 shows the results of the simulation. The results of the multi-physics approach show good agreement with those of the model at full complexity; the integral indicators proposed in the benchmark agree as well. In fact, observable differences are largely caused by truncation errors (volume error) in the post-injection phase, and not by the multi-physics framework. After 25 years, the complex subdomain covers roughly 3 % of all cells, expanding to roughly 4.5 % after 50 years.
Fig. 11 Results for the Johansen formation benchmark after 50 years. Left: full compositional two-phase model. Right: multi-physics approach, depicting only the most complex subdomain. Color bars: wetting saturation (both panels) and permeability
5.2
A Multi-scale Example
As a multi-scale example, flow through a layer of the three-dimensional geological model (model 2) of the SPE 10 benchmark study (Christie and Blunt 2001) is simulated. The porosity and permeability fields of layer 15 (top formation) are shown in Fig. 12a, b. Capillary pressure and gravity are neglected and the fluids are assumed to be incompressible. The fine-scale relative permeabilities are calculated using quadratic laws
$$k_{rw} = S_w^2, \tag{82}$$
$$k_{rn} = (1 - S_w)^2, \tag{83}$$
and the fluid viscosities are $\mu_w = 10^{-3}$ kg/(m s) and $\mu_n = 5 \times 10^{-3}$ kg/(m s). As shown in Fig. 12c, the domain is initially saturated by a non-wetting phase (e.g., oil) and water infiltrates from the southern domain boundary driven by a pressure gradient in y-direction. At the remaining sides, no-flow boundary conditions are applied. On the fine scale, the domain is discretized by a grid of 128 × 256 cells and on the coarse scale by a grid of 4 × 8 cells, leading to a hierarchic refinement factor of 5, i.e., to five refinement levels between the coarse and the fine grid ($4 \cdot 2^5 = 128$, $8 \cdot 2^5 = 256$).
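For the quadratic laws (82) and (83) and the viscosities given above, the wetting-phase fractional-flow function takes a closed form. The sketch below assumes the usual definition $f_w = \lambda_w / (\lambda_w + \lambda_n)$ with mobilities $\lambda_\alpha = k_{r\alpha} / \mu_\alpha$, which applies when capillary pressure and gravity are neglected as in this example; the abbreviation $M$ for the viscosity ratio is introduced here for convenience.

```latex
\begin{align*}
M &= \frac{\mu_n}{\mu_w} = \frac{5 \times 10^{-3}}{10^{-3}} = 5, \\
f_w(S_w) &= \frac{S_w^2 / \mu_w}{S_w^2 / \mu_w + (1 - S_w)^2 / \mu_n}
          = \frac{M S_w^2}{M S_w^2 + (1 - S_w)^2}, \\
\frac{d f_w}{d S_w} &= \frac{2 M S_w (1 - S_w)}{\left( M S_w^2 + (1 - S_w)^2 \right)^2}.
\end{align*}
```

The derivative vanishes at $S_w = 0$ and $S_w = 1$ and peaks in between, so the front velocity varies with saturation; this is the nonlinearity that Eq. (71) accounts for in the time-step estimate.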
The Multi-scale Approach The multi-scale model is described in detail in Wolff et al. (2013b). It solves a decoupled system of equations for incompressible isothermal two-phase flow
like the one introduced in section “The Immiscible Case.” Effective coarse-scale permeabilities are calculated for the coarsest grid level (4 × 8 cells) from the fine-scale distribution (Fig. 12) by applying the method of Wen et al. (2003), which enables the construction of full permeability tensors. For the solution of the local fine-scale problems, the effective flux boundary conditions derived in Wallstrom et al. (2002a,b) are used. The multi-scale concept is straightforward: we combine an h-adaptive grid method with the numerical upscaling approach, where the adaptive grid can be interpreted as a kind of global downscaling. Wherever possible, the grid consists of level-zero cells (coarse scale); otherwise, it can be refined up to the highest level (fine scale). Only one level difference is allowed between neighboring cells. This leads to a transfer region if there is more than one level difference between the fine scale and the coarse scale. At the highest level, the fine-scale parameters can be used directly. At all other levels, we use the upscaled parameters calculated for the zeroth grid level; intermediate-scale parameters are not upscaled separately (for details, see Wolff et al. 2013b). The grid is adapted using a nonconforming refinement strategy with hanging nodes. For a correct approximation of fluxes at hanging nodes, a multipoint flux approximation (MPFA) method is used (Aavatsmark et al. 2008; Faigle et al. 2013; Wolff et al. 2013a).
Fig. 12 Porosity (a) and permeability (b) distribution according to layer 15 of the SPE10 benchmark model 2 (top formation) and problem setup (c)
Numerical Results Some results of the simulation of the previously described problem setup are shown in Figs. 13 and 14. The simulation is stopped approximately when the northern domain boundary is reached by the invading fluid. For both the distribution of the saturation and of the total velocity magnitude, the multi-scale result (b) agrees very well with the fine-scale reference solution (a). It can be observed that a preferred flow path exists which leads to the development of a large-scale finger. The preferred
flow paths are very well accounted for by the grid adaptation scheme, as can be seen in Fig. 14c, where the corresponding grid at the end of the simulation is plotted. For a sufficiently accurate approximation, it is important to resolve the large-scale flow paths as well as the local front propagation. In this example, adaptation indicators based on local saturation gradients and on the total velocity are applied. For a more detailed discussion of the grid adaptation, we refer to Wolff et al. (2013b). The grid (Fig. 14c) shows that the multi-scale method reduces the number of degrees of freedom, and hence increases the efficiency, significantly.
Fig. 13 Saturation distribution of the fine-scale simulation (a) and the multi-scale simulation (b) at $t = 1.5 \times 10^8$ s
Fig. 14 Total velocity distribution of the fine-scale simulation (a) and the multi-scale simulation (b) at $t = 1.5 \times 10^8$ s and the corresponding adapted grid (c)
6
Conclusion
In this work, we gave an overview of multi-scale and multi-physics methods for flow and transport processes in porous media. To this end, we defined relevant scales and reviewed existing multi-scale and multi-physics methods. We introduced
the mathematical model for compositional non-isothermal multiphase flow and transport in porous media and discussed possible mathematical reformulations. Based on that, decoupled and coupled numerical solution strategies were discussed. Finally, application examples of multi-physics and multi-scale models were given in order to illustrate the effectiveness and applicability of these algorithms. Future work needs to include more complex processes in the multi-scale and multi-physics algorithms in order to allow for the modeling of highly complex real-life systems. Also, the development of upscaling techniques and the upscaling of the complex equations are crucial issues. Coupling techniques need to be improved to allow for a physically based coupling of different multi-physics domains and for the coupling across scales. Numerical methods need to be improved in order to allow for moving meshes if the multi-physics domains move during a simulation, and multi-scale multi-physics algorithms need to allow for the application of different numerical schemes for the solution of different physical processes (multi-numerics). In conclusion, the development and application of multi-scale multi-physics techniques make it possible to model highly complex physical problems in large domains that could otherwise not be solved numerically.
References
Aarnes JE, Krogstad S, Lie K-A (2006) A hierarchical multiscale method for two-phase flow based upon mixed finite elements and nonuniform coarse grids. Multiscale Model Simul 5(2):337–363
Aarnes JE, Krogstad S, Lie K-A (2008) Multiscale mixed/mimetic methods on corner-point grids. Comput Geosci 12(3):297–315
Aavatsmark I (2002) An introduction to multipoint flux approximations for quadrilateral grids. Comput Geosci 6(3–4):405–432. Locally conservative numerical methods for flow in porous media
Aavatsmark I, Eigestad GT, Mallison BT, Nordbotten JM (2008) A compact multipoint flux approximation method with improved robustness. Numer Methods Partial Differ Equ 24(5):1329–1360
Ács G, Doleschall S, Farkas E (1985) General purpose compositional model. Soc Pet Eng J 25:543–553
Ainsworth M, Oden JT (2000) A posteriori error estimation in finite element analysis. Pure and applied mathematics (New York). Wiley-Interscience [Wiley], New York
Albon C, Jaffre J, Roberts J, Wang X, Serres C (1999) Domain decompositioning for some transition problems in flow in porous media. In: Chen Z, Ewing R, Shi Z-C (eds) Numerical treatment of multiphase flow in porous media. Lecture notes in physics. Springer, Berlin/Heidelberg
Allen MB, Ewing RE, Lu P (1992) Well-conditioned iterative schemes for mixed finite-element models of porous-media flows. SIAM J Sci Stat Comput 13(3):794–814
Arbogast T (1989) Analysis of the simulation of single phase flow through a naturally fractured reservoir. SIAM J Numer Anal 26(1):12–29
Arbogast T, Pencheva G, Wheeler MF, Yotov I (2007) A multiscale mortar mixed finite element method. Multiscale Model Simul 6(1):319–346
Atkins P (1994) Physical chemistry, 5th edn. Oxford University Press, Oxford/New York
Aziz K, Settari A (1979) Petroleum reservoir simulation. Elsevier Applied Science, London
Aziz K, Wong T (1989) Considerations in the development of multipurpose reservoir simulation models. In: Proceedings first and second international forum on reservoir simulation, Alpbach. Steiner, P., pp 77–208
Babuska I, Strouboulis T (2001) The finite element method and its reliability. Numerical mathematics and scientific computation. The Clarendon Press/Oxford University Press, New York
Barker J, Thibeau S (1997) A critical review of the use of pseudorelative permeabilities for upscaling. SPE Reserv Eng 12(2):138–143
Bastian P, Rivière B (2003) Superconvergence and H(div) projection for discontinuous Galerkin methods. Int J Numer Methods Fluids 42(10):1043–1057
Beavers GS, Joseph DD (1967) Boundary conditions at a naturally permeable wall. J Fluid Mech 30:197–207
Berndt M, Lipnikov K, Shashkov M, Wheeler MF, Yotov I (2005) Superconvergence of the velocity in mimetic finite difference methods on quadrilaterals. SIAM J Numer Anal 43(4):1728–1749
Binning P, Celia MA (1999) Practical implementation of the fractional flow approach to multi-phase flow simulation. Adv Water Resour 22(5):461–478
Bramble JH (1993) Multigrid methods. Volume 294 of Pitman research notes in mathematics series. Longman Scientific & Technical, Harlow
Brezzi F, Fortin M (1991) Mixed and hybrid finite element methods. Volume 15 of Springer series in computational Mathematics. Springer, New York
Brezzi F, Lipnikov K, Shashkov M (2005a) Convergence of the mimetic finite difference method for diffusion problems on polyhedral meshes. SIAM J Numer Anal 43(5):1872–1896
Brezzi F, Lipnikov K, Simoncini V (2005b) A family of mimetic finite difference methods on polygonal and polyhedral meshes. Math Models Methods Appl Sci 15(10):1533–1551
Briggs WL, Henson VE, McCormick SF (2000) A multigrid tutorial, 2nd edn. Society for Industrial and Applied Mathematics (SIAM), Philadelphia
Calo V, Efendiev Y, Galvis J (2011) A note on variational multiscale methods for high-contrast heterogeneous porous media flows with rough source terms. Adv Water Resour 34(9):1177–1185
Cao Y, Helmig R, Wohlmuth B (2008) The influence of the boundary discretization on the multipoint flux approximation L-method. In: Finite volumes for complex applications V. ISTE, London, pp 257–263
Cattani C, Laserra E (2003) Wavelets in the transport theory of heterogeneous reacting solutes. Int J Fluid Mech Res 30(2):147–152
Chavent G (1976) A new formulation of diphasic incompressible flows in porous media. Number 503 in Lecture notes in mechanics. Springer, Berlin, pp 258–270
Chavent G, Jaffré J (1986) Mathematical models and finite elements for reservoir simulation. North-Holland, Amsterdam
Chen Y, Durlofsky LJ (2006) Adaptive local-global upscaling for general flow scenarios in heterogeneous formations. Transp Porous Media 62(2):157–185
Chen Y, Durlofsky LJ, Gerritsen M, Wen XH (2003) A coupled local-global upscaling approach for simulating flow in highly heterogeneous formations. Adv Water Resour 26(10):1041–1060
Chen Y, Li Y (2009) Local-global two-phase upscaling of flow and transport in heterogeneous formations. Multiscale Model Simul 8:125–153
Chen Y, Li Y, Efendiev Y (2013) Time-of-flight (TOF)-based two-phase upscaling for subsurface flow and transport. Adv Water Resour 54:119–132
Chen Z, Ewing RE, Jiang Q, Spagnuolo AM (2002) Degenerate two-phase incompressible flow. V. Characteristic finite element methods. J Numer Math 10(2):87–107
Chen Z, Hou TY (2002) A mixed multiscale finite element method for elliptic problems with oscillating coefficients. Math Comput 72(242):541–576
Chen Z, Hou TY (2003) A mixed multiscale finite element method for elliptic problems with oscillating coefficients. Math Comput 72(242):541–576
Chen Z, Huan G, Ma Y (2006) Computational methods for multiphase flows in porous media. Computational science & engineering. Society for Industrial and Applied Mathematics, Philadelphia
Christie MA, Blunt MJ (2001) Tenth SPE comparative solution project: a comparison of upscaling techniques. SPE Reserv Eval Eng 4:308–317
Class H, Ebigbo A, Helmig R, Dahle H, Nordbotten J, Celia M, Audigane P, Darcis M, Ennis-King J, Fan Y, Flemisch B, Gasda S, Jin M, Krug S, Labregere D, Beni A, Pawar R, Sbai A, Thomas S, Trenty L, Wei L (2009) A benchmark study on problems related to CO2 storage in geologic formations. Comput Geosci 13(4):409–434
Coats KH (2003a) IMPES stability: selection of stable timesteps. SPE J 8:181–187
Coats KH (2003b) IMPES stability: the CFL limit. SPE J 8:291–297
Cockburn B, Lin SY, Shu C-W (1989) TVB Runge-Kutta local projection discontinuous Galerkin finite element method for conservation laws. III. One-dimensional systems. J Comput Phys 84(1):90–113
Cockburn B, Shu C-W (1989) TVB Runge-Kutta local projection discontinuous Galerkin finite element method for conservation laws. II. General framework. Math Comput 52(186):411–435
Codina R (2001) A stabilized finite element method for generalized stationary incompressible flows. Comput Methods Appl Mech Eng 190(20–21):2681–2706
Darman NH, Pickup GE, Sorbie KS (2002) A comparison of two-phase dynamic upscaling methods based on fluid potentials. Comput Geosci 6(1):5–27
Dawson CN, Russell TF, Wheeler MF (1989) Some improved error estimates for the modified method of characteristics. SIAM J Numer Anal 26(6):1487–1512
Dennis JE Jr, Schnabel RB (1996) Numerical methods for unconstrained optimization and nonlinear equations. Volume 16 of Classics in applied mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia
Dietrich P, Helmig R, Sauter M, Hötzl H, Köngeter J, Teutsch G (eds) (2005) Flow and transport in fractured porous media. Springer, Berlin/New York
Discacciati M, Miglio E, Quarteroni A (2002) Mathematical and numerical models for coupling surface and groundwater flows. Appl Numer Math 43:57–74
Douglas J Jr, Huang C-S, Pereira F (1999) The modified method of characteristics with adjusted advection. Numer Math 83(3):353–369
Durlofsky LJ (1991) Numerical calculation of equivalent grid block permeability tensors for heterogeneous porous media. Water Resour Res 27(5):699–708
Efendiev Y, Durlofsky LJ (2002) Numerical modeling of subgrid heterogeneity in two phase flow simulations. Water Resour Res 38(8)
Efendiev Y, Durlofsky LJ (2004) Accurate subgrid models for two-phase flow in heterogeneous reservoirs. SPE J 9:219–226
Efendiev Y, Durlofsky LJ, Lee SH (2000) Modeling of subgrid effects in coarse-scale simulations of transport in heterogeneous porous media. Water Resour Res 36(8):2031–2041
Efendiev Y, Hou T (2007) Multiscale finite element methods for porous media flows and their applications. Appl Numer Math 57(5–7):577–596
Eigestad GT, Klausen RA (2005) On the convergence of the multi-point flux approximation O-method: numerical experiments for discontinuous permeability. Numer Methods Partial Differ Equ 21(6):1079–1098
Ewing RE, Russell TF, Wheeler MF (1984) Convergence analysis of an approximation of miscible displacement in porous media by mixed finite elements and a modified method of characteristics. Comput Methods Appl Mech Eng 47(1–2):73–92
Ewing RE, Wang H (1994) Eulerian-Lagrangian localized adjoint methods for variable-coefficient advective-diffusive-reactive equations in groundwater contaminant transport. In: Advances in optimization and numerical analysis (Oaxaca, 1992). Volume 275 of Mathematics and its applications. Kluwer Academic, Dordrecht, pp 185–205
Ewing RE, Wang H (1996) An optimal-order estimate for Eulerian-Lagrangian localized adjoint methods for variable-coefficient advection-reaction problems. SIAM J Numer Anal 33(1):318–348
Efficient Modeling of Flow and Transport in Porous Media Using Multi. . .
745
Eymard R, Gallouët T, Herbin R (2000) Finite volume methods. In: Handbook of numerical analysis, vol. VII. North-Holland, Amsterdam, pp 713–1020 Faigle B, Helmig R, Aavatsmark I, Flemisch B (2013) Efficient multi-physics modelling with adaptive grid-refinement using a MPFA method. Comput Geosci (submitted) Flemisch B, Darcis M, Erbertseder K, Faigle B, Lauser A, Mosthaf K, Müthing S, Nuske P, Tatomir A, Wolff M, Helmig R (2011) DUMUX: DUNE for multi-{phase, component, scale, physics,. . . } flow and transport in porous media. Adv Water Resour 34(9):1102–1112 Fritz J, Flemisch B, Helmig R (2012) Decoupled and multiphysics models for non-isothermal compositional two-phase flow in porous media. Int J Numer Anal Model 9(1):17–28 Galvis J, Efendiev Y (2010) Domain decomposition preconditioners for multiscale flows in high contrast media. Multiscale Model Simul 8(4):1461–1483 Gasda S, Nordbotten J, Celia M (2009) Vertical equilibrium with sub-scale analytical methods for geological CO2 sequestration. Comput Geosci 13(4):469–481 Gasda S, Nordbotten J, Celia M (2011) Vertically-averaged approaches to CO2 injection with solubility trapping. Water Resour Res 47:W05528 Ghostine R, Kesserwani G, Mosé R, Vazquez J, Ghenaim A (2009) An improvement of classical slope limiters for high-order discontinuous Galerkin method. Int J Numer Methods Fluids 59(4):423–442 Giraud L, Langou J, Sylvand G (2006) On the parallel solution of large industrial wave propagation problems. J Comput Acoust 14(1):83–111 Girault V, Rivière B (2009) Dg approximation of coupled navier-stokes and darcy equations by beaver-joseph-saffman interface condition. SIAM J Numer Anal 47:2052–2089 Gray GW, Leijnse A, Kolar RL, Blain CA (1993) Mathematical tools for changing scale in the analysis of physical systems, 1st edn. CRC, Boca Raton Hajibeygi H, Bonfigli G, Hesse MA, Jenny P (2008) Iterative multiscale finite-volume method. 
J Comput Phys 227(19):8604–8621 Hauke G, García-Olivares A (2001) Variational subgrid scale formulations for the advectiondiffusion-reaction equation. Comput Methods Appl Mech Eng 190(51–52):6847–6865 He Y, Han B (2008) A wavelet finite-difference method for numerical simulation of wave propagation in fluid-saturated porous media. Appl Math Mech (English Ed.) 29(11):1495–1504 Helmig R (1997) Multiphase flow and transport processes in the subsurface. Springer, Berlin/New York Helmig R, Flemisch B, Wolff M, Ebigbo A, Class H (2012) Model coupling for multiphase flow in porous media. Adv Water Resour 51:52–66 Herrera I, Ewing RE, Celia MA, Russell TF (1993) Eulerian-Lagrangian localized adjoint method: the theoretical framework. Numer Methods Partial Differ Equ 9(4):431–457 Hoteit H, Ackerer P, Mosé R, Erhel J, Philippe B (2004) New two-dimensional slope limiters for discontinuous Galerkin methods on arbitrary meshes. Int J Numer Methods Eng 61(14): 2566–2593 Hoteit H, Firoozabadi A (2008) Numerical modeling of two-phase flow in heterogeneous permeable media with different capillarity pressures. Adv Water Resour 31(1):56–73 Hou TY, Wu X-H (1997) A multiscale finite element method for elliptic problems in composite materials and porous media. J Comput Phys 134(1):169–189 Hristopulos D, Christakos G (1997) An analysis of hydraulic conductivity upscaling. Nonlinear Anal 30(8):4979–4984 Huang C-S (2000) Convergence analysis of a mass-conserving approximation of immiscible displacement in porous media by mixed finite elements and a modified method of characteristics with adjusted advection. Comput Geosci 4(2):165–184 Hughes TJR (1995) Multiscale phenomena: Green’s functions, the Dirichlet-to-Neumann formulation, subgrid scale models, bubbles and the origins of stabilized methods. Comput Methods Appl Mech Eng 127(1–4):387–401 Hughes, TJR, Feijóo GR, Mazzei L, Quincy J-B (1998) The variational multiscale method – a paradigm for computational mechanics. 
Comput Methods Appl Mech Eng 166(1–2):3–24
746
R. Helmig et al.
Hyman J, Morel J, Shashkov M, Steinberg S (2002) Mimetic finite difference methods for diffusion equations. Comput Geosci 6(3–4):333–352 Locally conservative numerical methods for flow in porous media IPCC (2005) Carbon dioxide capture and storage. Special report of the intergovernmental panel on climate change. Cambridge University Press, Cambridge Jäger W, Mikeli´c A (2009) Modeling effective interface laws for transport phenomena between an unconfined fluid and a porous medium using homogenization. Transp Porous Media 78:489–508 Jang G-W, Kim JE, Kim YY (2004) Multiscale Galerkin method using interpolation wavelets for two-dimensional elliptic problems in general domains. Int J Numer Methods Eng 59(2):225–253 Jenny P, Lee SH, Tchelepi H (2003) Multi-scale finite-volume method for elliptic problems in subsurface flow simulations. J Comput Phys 187:47–67 Jenny P, Lee SH, Tchelepi HA (2006) Adaptive fully implicit multi-scale finite-volume method for multi-phase flow and transport in heterogeneous porous media. J Comput Phys 217(2):627–641 Juanes R (2005) A variational multiscale finite element method for multiphase flow in porous media. Finite Elem Anal Des 41(7–8):763–777 Juanes R, Dub F-X (2008) A locally conservative variational multiscale method for the simulation of porous media flow with multiscale source terms. Comput Geosci 12:273–295 Juanes R, Lie K-A (2008) Numerical modeling of multiphase first-contact miscible flows. II. Front-tracking/streamline simulation. Transp Porous Media 72(1):97–120 Kees CE, Farthing M, Dawson CN (2008) Locally conservative, stabilized finite element methods for variably saturated flow. Computer Methods Appl Mech Eng 197(51–52):4610–4625 Kim M-Y, Park E-J, Thomas SG, Wheeler MF (2007) A multiscale mortar mixed finite element method for slightly compressible flows in porous media. J Korean Math Soc 44(5):1103–1119 Kippe V, Aarnes JE, Lie K-A (2008) A comparison of multiscale methods for elliptic problems in porous media flow. 
Comput Geosci 12(3):377–398 Klausen RA, Winther R (2006) Robust convergence of multi point flux approximation on rough grids. Numer Math 104(3):317–337 Kyte JR, Berry DW (1975) New pseudo functions to control numerical dispersion. SPE J 15(4):269–276 Layton WJ, Schieweck F, Yotov I (2003) Coupling fluid flow with porous media flow. SIAM J Numer Anal 40:2195–2218 Lee SH, Wolfsteiner C, Tchelepi HA (2008) Multiscale finite-volume formulation of multiphase flow in porous media: black oil formulation of compressible, three-phase flow with gravity. Comput Geosci 12(3):351–366 Lee SH, Zhou H, Tchelepi HA (2009) Adaptive multiscale finite-volume method for nonlinear multiphase transport in heterogeneous formations. J Comput Phys 228:9036–9058 LeVeque RJ (2002) Finite volume methods for hyperbolic problems. Cambridge texts in applied mathematics. Cambridge University Press, Cambridge Lüdecke C, Lüdecke D (2000) Thermodynamik Springer, Berlin Lunati I, Jenny P (2006) Multiscale finite-volume method for compressible multiphase flow in porous media. J Comput Phys 216(2):616–636 Lunati I, Jenny P (2007) Treating highly anisotropic subsurface flow with the multiscale finite-volume method. Multiscale Model Simul 6(1):308–318 Lunati I, Jenny P (2008) Multiscale finite-volume method for density-driven flow in porous media. Comput Geosci 12(3):337–350 Matringe SF, Juanes R, Tchelepi HA (2006) Robust streamline tracing for the simulation of porous media flow on general triangular and quadrilateral grids. J Comput Phys 219(2):992–1012 Mazzia A, Putti M (2006) Three-dimensional mixed finite element-finite volume approach for the solution of density-dependent flow in porous media. J Comput Appl Math 185(2):347–359 Michelsen M, Mollerup J (2007) Thermodynamic models: fundamentals & computational aspects. Tie-Line Publications, Holte
Efficient Modeling of Flow and Transport in Porous Media Using Multi. . .
747
Müller S (2003) Adaptive multiscale schemes for conservation laws. Volume 27 of Lecture notes in computational science and engineering. Springer, Berlin Nghiem L, Li Y-K (1984) Computation of multiphase equilibrium phenomena with an equation of state. Fluid Phase Equilibria 17(1):77–95 Niessner J, Helmig R (2007) Multi-scale modeling of three-phase–three-component processes in heterogeneous porous media. Adv Water Resour 30(11):2309–2325 Nordbotten J (2009) Adaptive variational multiscale methods for multiphase flow in porous media. Multiscale Model Simul 7(3):1455 Nordbotten JM, Bjørstad PE (2008) On the relationship between the multiscale finite-volume method and domain decomposition preconditioners. Comput Geosci 12(3):367–376 Of G (2007) Fast multipole methods and applications. In: Boundary element analysis. Volume 29 of Lecture notes in applied and computational mechanics. Springer, Berlin, pp 135–160 Oladyshkin S, Royer J-J, Panfilov M (2008) Effective solution through the streamline technique and HT-splitting for the 3D dynamic analysis of the compositional flows in oil reservoirs. Transp. Porous Media 74(3):311–329 Panfilov M (2000) Macroscale models of flow through highly heterogeneous porous media. Kluwer Academic, Dordrecht/Boston Pau GSH, Bell JB, Almgren AS, Fagnan KM, Lijewski MJ (2012) An adaptive mesh refinement algorithm for compressible two-phase flow in porous media. Comput Geosci 16(3):577–592 Peszynska M, Lu Q, Wheeler M (2000) Multiphysics coupling of codes. In: Computational methods in water resources. A. A. Balkema, Rotterdam/Brookfield, pp 175–182 Peszynska M, Wheeler MF, Yotov I (2002) Mortar upscaling for multiphase flow in porous media. Comput Geosci 6:73–100 Pickup GE, Sorbie KS (1996) The scaleup of two-phase flow in porous media using phase permeability tensors. SPE J 1:369–382 Prausnitz JM, Lichtenthaler RN, Azevedo EG (1967) Molecular thermodynamics of fluid-phase equilibria. 
Prentice-Hall Pruess K (1985) A practical method for modeling fluid and heat flow in fractured porous media. SPE J 25(1):14–26 Quintard M, Whitaker S (1988) Two-phase flow in heterogeneous porous media: the method of large-scale averaging. Transp Porous Media 3(4):357–413 Renard P, de Marsily G (1997) Calculating effective permeability: a review. Adv Water Resour 20:253–278 Russell T (1989) Stability analysis and switching criteria for adaptive implicit methods based on the CFL condition. In: Proceedings of SPE symposium on reservoir simulation, Dallas. Society of Petroleum Engineers, pp 97–107 Russell TF (1990) Eulerian-Lagrangian localized adjoint methods for advection-dominated problems. In: Numerical analysis 1989 (Dundee, 1989). Volume 228 of Pitman research notes in mathematics series. Longman Scientific & Technical, Harlow, pp 206–228 Ryzhik V (2007) Spreading of a NAPL lens in a double-porosity medium. Comput Geosci 11(1): 1–8 Sáez AE, Otero CJ, Rusinek I (1989) The effective homogeneous behavior of heterogeneous porous media. Transp Porous Media 4(3):213–238 Sandvin A, Nordbotten JM, Aavatsmark I (2011) Multiscale mass conservative domain decomposition preconditioners for elliptic problems on irregular grids. Comput Geosci 15(3):587–602 Scheidegger A (1974) The physics of flow through porous media, 3rd edn University of Toronto Press, Toronto/Buffalo Shashkov M (1996) Conservative finite-difference methods on general grids. Symbolic and numeric computation series. CRC, Boca Raton Sheldon JW, Cardwell WT (1959) One-dimensional, incompressible, noncapillary, two-phase fluid in a porous medium. Pet Trans AIME 216:290–296 Sleep BE, Sykes JF (1993) Compositional simulation of groundwater contamination by organiccompounds. 1. Model development and verification. Water Resour Res 29(6):1697–1708
748
R. Helmig et al.
Smith EH, Seth MS (1999) Efficient solution for matrix-fracture flow with multiple interacting continua. Int J Numer Anal Methods Geomech 23(5):427–438 Srinivas C, Ramaswamy B, Wheeler MF (1992) Mixed finite element methods for flow through unsaturated porous media. In: Computational methods in water resources, IX, vol 1 (Denver, 1992). Computational Mechanics, Southampton, pp 239–246 Stenby E, Wang P (1993) Noniterative phase equilibrium calculation in compositional reservoir simulation. SPE 26641 Stone HL (1991) Rigorous black oil pseudo functions. In: SPE symposium on reservoir simulation, Anaheim, 17–20 Feb 1991 Stone HL, Garder AO Jr (1961) Analysis of gas-cap or dissolved-gas reservoirs. Pet Trans AIME 222:92–104 Stüben K (2001) A review of algebraic multigrid. J Comput Appl Math 128(1–2):281–309 Numerical analysis 2000, vol VII, Partial differential equations Suk H, Yeh G-T (2008) Multiphase flow modeling with general boundary conditions and automatic phase-configuration changes using a fractional-flow approach. Comput Geosci 12(4):541–571 Tornberg A-K, Greengard L (2008) A fast multipole method for the three-dimensional Stokes equations. J Comput Phys 227(3):1613–1619 Trangenstein J, Bell J (1989) Mathematical structure of compositional reservoir simulation. SIAM J Sci Stat Comput 10(5):817–845 Trottenberg U, Oosterlee CW, Schüller A (2001) Multigrid Academic, San Diego Urban K (2009) Wavelet methods for elliptic partial differential equations. Numerical mathematics and scientific computation. Oxford University Press, Oxford van Odyck DEA, Bell JB, Monmont F, Nikiforakis N (2008) The mathematical structure of multiphase thermal models of flow in porous media. Proc R Soc A Math Phys Eng Sci 465(2102):523–549 Wallstrom TC, Christie MA, Durlofsky LJ, Sharp DH (2002a) Effective flux boundary conditions for upscaling porous media equations. 
Transp Porous Media 46(2):139–153 Wallstrom TC, Hou S, Christie MA, Durlofsky LJ, Sharp DH, Zou Q (2002b) Application of effective flux boundary conditions to two-phase upscaling in porous media. Transp Porous Media 46(2):155–178 Wang H, Liang D, Ewing RE, Lyons SL, Qin G (2002) An ELLAM approximation for highly compressible multicomponent flows in porous media. Comput Geosci 6(3–4):227–251. Locally conservative numerical methods for flow in porous media Wang P, Barker J (1995) Comparison of flash calculations in compositional reservoir simulation. SPE 30787 Weinan E, Engquist B (2003a) The heterogeneous multiscale methods. Commun Math Sci 1(1):87–132 Weinan E, Engquist B (2003b) Multiscale modeling and computation. Not Am Math Soc 50(9):1062–1070 Weinan E, Engquist B, Huang Z (2003) Heterogeneous multiscale method: a general methodology for multiscale modeling. Phys Rev 67 Weinan E, Engquist B, Li X, Ren W, Vanden-Eijnden E (2007) Heterogeneous multiscale methods: a review. Commun. Comput. Phys. 2(3):367–450 Wen XH, Durlofsky LJ, Edwards MG (2003) Use of border regions for improved permeability upscaling. Math Geol 35(5):521–547 Wheeler M, Arbogast T, Bryant S, Eaton J, Lu Q, Peszynska M, Yotov I (1999) A parallel multiblock/multidomain approach to reservoir simulation. In: Fifteenth SPE symposium on reservoir simulation, Houston. Society of Petroleum Engineers, pp 51–62. SPE 51884 Wheeler MF, Peszyska M (2002) Computational engineering and science methodologies for modeling and simulation of subsurface applications. Adv Water Resour 25(812):1147–1173 Whitaker S (1998) The method of volume averaging. Kluwer Academic, Norwell White F (2003) Fluid mechanics. McGraw-Hill, Boston Wolff M, Cao Y, Flemisch B, Helmig R, Wohlmuth B (2013a) Multi-point flux approximation L-method in 3D: numerical convergence and application to two-phase flow through porous
Efficient Modeling of Flow and Transport in Porous Media Using Multi. . .
749
media. In: Bastian P, Kraus J, Scheichl R, Wheeler M (eds) Simulation of flow in porous media – applications in energy and environment. De Gruyter, Berlin Wolff M, Flemisch B, Helmig R (2013b) An adaptive multi-scale approach for modeling two-phase flow in porous media including capillary pressure. Water Resour Res (submitted) Yao Z-H, Wang H-T, Wang P-B, Lei T (2008) Investigations on fast multipole BEM in solid mechanics. J Univ Sci Technol China 38(1):1–17 Yotov I (2002) Advanced techniques and algorithms for reservoir simulation IV. Multiblock solvers and preconditioners. In: Chadam J, Cunningham A, Ewing RE, Ortoleva P Wheeler MF (eds) IMA volumes in mathematics and its applications. Volume 131: resource recovery, confinement, and remediation of environmental hazards. Springer
Convection Structures of Binary Fluid Mixtures in Porous Media
Matthias Augustin, Rudolf Umla, and Manfred Lücke
Contents
1 Introduction
2 Foundations
  2.1 System and Basic Equations
  2.2 Numerical Method
  2.3 Conductive Ground State
3 Structures for Positive Separation Ratios
  3.1 Bifurcation Behavior of Roll, Crossroll, and Square Convection
  3.2 Structure of the Fields for Roll and Square Convection
  3.3 Stability Results
4 Structures for Negative Separation Ratios
  4.1 Bifurcation Behavior
  4.2 Structure of the Fields for TW Convection
  4.3 Lateral Currents
5 Conclusion
References
M. Augustin, Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
R. Umla, BP Exploration Operating Company Limited, Sunbury on Thames, UK
M. Lücke, Institut für Theoretische Physik, Universität des Saarlandes, Saarbrücken, Germany
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_75
Abstract
The study of convection patterns of binary mixtures in a porous medium plays an important role in modeling geothermal reservoirs as well as in many other industrial applications. A global Galerkin method makes it possible to determine various convection structures numerically in an efficient way. The aim of this chapter is to describe the structural properties of these flow patterns, their bifurcation behavior, and their stability against infinitesimal perturbations. The Soret effect, i.e., the generation of concentration gradients by temperature gradients, is taken into account and leads to several patterns with distinct features. We focus on those patterns that are of primary importance near the onset of convection; these include roll, crossroll, and square convection as well as traveling waves of convection rolls.
1 Introduction
Modeling fluid flow through a porous medium is of importance for many industrial applications such as absorption and adsorption processes, heat storage, nuclear reactors, and spacecraft thermal management systems (Saravanan and Kandaswamy 2003). In the context of geosciences, two prominent examples are the exploration of oil and gas fields (Pruess 1990; O'Sullivan et al. 2001; Thomas 2007) and the modeling of geothermal reservoirs (see, e.g., the corresponding chapter of this book). Especially for the latter, it is important to obtain information about convection patterns and the associated changes in temperature in order to predict production rates of geothermal facilities. Often, it is necessary to consider not a simple fluid but a mixture of two fluids (Vafai 2005; Nield and Bejan 2006). In such a system, the generation of concentration gradients by temperature gradients, known as the Soret effect, gives rise to a variety of different convection patterns with distinct properties (Platten 2006). A positive Soret effect tends to enhance convection as the heavier component is transported toward lower temperatures, whereas a negative Soret effect works in the opposite direction.

In order to investigate convection patterns of binary fluid flow in porous media, we choose the widely studied setup of the Rayleigh-Bénard system. It consists of a set of parallel plates which are held at constant but different temperatures, with the lower one being warmer than the upper one. We consider the gap between the plates to be filled by a porous medium which is saturated by a binary fluid mixture. Such a setup was already investigated theoretically by Horton and Rogers (1945) as well as Lapwood (1948), who considered the stability of the conductive ground state. Convection in the form of roll structures was considered by Straus (1974) and De La Torre Juárez and Busse (1995). Experiments on this modified Rayleigh-Bénard system were performed, e.g., by Shattuck et al. (1997) and Howle et al. (1997), who observed such roll patterns. All of the aforementioned articles deal with one-component fluids. For binary mixtures, numerical investigations regarding the ground state are due to Sovran et al. (2001), Charrier-Mojtabi et al. (2007), and Elhajjar et al. (2008). In each of these articles, binary fluids with negative and positive Soret effect are considered. Moreover, for binary fluids with a negative Soret effect, ground state stability and bifurcation scenarios were investigated by Brand and Steinberg (1983) and Knobloch (1986), respectively.

This chapter aims at giving an overview of some spatially extended convection patterns of binary fluid mixtures with positive and negative Soret effect in the Rayleigh-Bénard system with a porous medium. This setup may be seen as a simple model for an aquifer or for magma flow. This review is based on our results published in Augustin et al. (2010) and Umla et al. (2010, 2011). We begin in Sect. 2 by introducing the governing equations of the system and the Galerkin method which we used to model convection. Moreover, this section briefly discusses the stability threshold of the conductive ground state. Mixtures with positive Soret effect are studied in Sect. 3. Here, we investigate the behavior of temperature, concentration, and fluid velocity for roll and square convection as well as the stability of these two kinds of convection against infinitesimal perturbations by means of a linear stability analysis. Section 4 deals with traveling waves, which can be found in fluids with a negative separation ratio. We conclude with a summary of our results in Sect. 5.
2 Foundations

2.1 System and Basic Equations
For the convenience of the reader, we characterize here the system and the equations (Augustin et al. 2010; Umla et al. 2010, 2011) which are the point of departure for our considerations. The Rayleigh-Bénard system consists of two highly conducting, rigid, impermeable parallel plates with distance $d$ which are infinitely extended in the other two dimensions. A homogeneous gravitational field $\mathbf{g} = -g\,\mathbf{e}_z$ acts perpendicular to the plates. The temperature of the plates is fixed to be

$$T = T_0 \pm \frac{\Delta T}{2} \quad \text{at} \quad z = \mp\frac{d}{2} \tag{1}$$

with $T_0$ being the mean temperature of the system and $\Delta T > 0$. Moreover, the gap between the plates contains a porous medium, and the pores of this medium are filled with a binary fluid mixture. The porous medium is considered to be isotropic and locally in thermal equilibrium with the fluid mixture.

The state of the system can be described by the fields of temperature $T$, fluid velocity $\mathbf{v}$, mass concentration $C$ of the lighter component, total fluid mass density $\rho = \rho_1 + \rho_2$, and fluid pressure $P$. Under the assumption that the Oberbeck-Boussinesq approximation is applicable, we treat the dynamic viscosity, the thermal expansion coefficients, the heat capacities, and the other transport coefficients and parameters in Table 1 as constants. They are taken at the thermodynamic values of the spatial averages $T_0$, $C_0$, $P_0$. Radiation effects and heat generation due to friction are neglected. However, the Soret effect, being regarded as non-negligible, is taken into account. This effect describes how temperature gradients drive concentration gradients.

Table 1 Parameters entering Eqs. (2)
$c_a$    Correction factor
$\rho_0$    Mean density of the fluid
$\eta$    Dynamic viscosity
$K$    Permeability of the porous medium
$\beta_T$ / $\beta_C$    Thermal/solutal expansion coefficient of the fluid
$g$    Gravitational acceleration
$C_f$ / $C_s$ / $C_{\mathrm{tot}}$    Heat capacity per unit volume of fluid/solid/total medium
$\lambda_f$ / $\lambda_s$ / $\lambda_{\mathrm{tot}}$    Thermal conductivity of fluid/solid/total medium
$\phi$    Porosity of the medium
$D_f$ / $D_{\mathrm{tot}}$    Concentration diffusivity of the fluid/total medium
$k_T$    Thermodiffusion ratio

As governing equations of the system, we get (Nield and Bejan 2006)

$$\nabla\cdot\mathbf{v} = 0, \tag{2a}$$

$$c_a\rho_0\,\partial_t\mathbf{v} = -\nabla P - \frac{\eta}{K}\,\mathbf{v} + \rho_0\left[1 + \beta_T T + \beta_C C\right] g\,\mathbf{e}_z, \tag{2b}$$

$$C_{\mathrm{tot}}\,\partial_t T + C_f\,(\mathbf{v}\cdot\nabla)T = \lambda_{\mathrm{tot}}\nabla^2 T, \tag{2c}$$

$$\phi\,\partial_t C + (\mathbf{v}\cdot\nabla)C = D_{\mathrm{tot}}\nabla^2 C + D_{\mathrm{tot}}\frac{k_T}{T_0}\nabla^2 T. \tag{2d}$$
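The meanings of the parameters in these equations are given in Table 1. The total heat capacity and thermal conductivity are given by $C_{\mathrm{tot}} = \phi C_f + (1-\phi)C_s$ and $\lambda_{\mathrm{tot}} = \phi\lambda_f + (1-\phi)\lambda_s$, respectively, and $D_{\mathrm{tot}} = \phi D_f$. Equation (2b) contains a correction factor $c_a$ in front of the time derivative term to make it consistent with experimental results (see Nield and Bejan 2006). The correction factor is a scalar as the porous medium is isotropic and homogeneous.

It is convenient to modify Eqs. (2) in two ways. First, we nondimensionalize by scaling lengths by $d$, times by $d^2 C_{\mathrm{tot}}/\lambda_{\mathrm{tot}}$, temperatures by $\eta\lambda_{\mathrm{tot}}/(C_f K d\,\beta_T g\rho_0)$, concentrations by $\eta\lambda_{\mathrm{tot}}/(C_f K d\,\beta_C g\rho_0)$, and pressures by $\eta\lambda_{\mathrm{tot}}/(C_f K)$. Secondly, instead of the velocity, temperature, concentration, and pressure themselves, we use the deviations of these quantities from the conductive state: $\mathbf{u} = u\,\mathbf{e}_x + v\,\mathbf{e}_y + w\,\mathbf{e}_z$ (velocity), $\theta$ (temperature), $c$ (concentration), and $p$ (pressure). With these modifications, we get

$$\nabla\cdot\mathbf{u} = 0, \tag{3a}$$

$$\gamma_a\,\partial_t\mathbf{u} = -\nabla p - \mathbf{u} + (\theta + c)\,\mathbf{e}_z, \tag{3b}$$

$$\partial_t\theta + (\mathbf{u}\cdot\nabla)\theta = R\,w + \nabla^2\theta, \tag{3c}$$

$$\phi^*\partial_t c + (\mathbf{u}\cdot\nabla)c = R\,\psi\,w + L\left(\nabla^2 c - \psi\nabla^2\theta\right). \tag{3d}$$

Table 2 Dimensionless parameters entering Eqs. (3)
$\gamma_a = \dfrac{K\lambda_{\mathrm{tot}}\rho_0 c_a}{\eta\, d^2 C_{\mathrm{tot}}}$    Correction factor
$R = \dfrac{\rho_0 g\beta_T K C_f\, d\,\Delta T}{\eta\,\lambda_{\mathrm{tot}}}$    Rayleigh-Darcy number
$\phi^* = \dfrac{\phi\, C_f}{C_{\mathrm{tot}}}$    Normalized porosity
$L = \dfrac{D_{\mathrm{tot}} C_f}{\lambda_{\mathrm{tot}}}$    Lewis number
$\psi = -\dfrac{\beta_C k_T}{\beta_T T_0}$    Separation ratio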
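The step from the dimensional parameters of Table 1 to the dimensionless groups of Table 2 can be sketched in a few lines. This is only an illustration under stated assumptions: the class `Medium`, the function `dimensionless_parameters`, and all numerical values below are hypothetical and not taken from the chapter; only the formulas follow Table 2 and the definitions of $C_{\mathrm{tot}}$, $\lambda_{\mathrm{tot}}$, and $D_{\mathrm{tot}}$.

```python
from dataclasses import dataclass


@dataclass
class Medium:
    """Dimensional parameters of Table 1 in consistent units (hypothetical container)."""
    c_a: float     # correction factor
    rho0: float    # mean density of the fluid
    eta: float     # dynamic viscosity
    K: float       # permeability of the porous medium
    beta_T: float  # thermal expansion coefficient
    beta_C: float  # solutal expansion coefficient
    g: float       # gravitational acceleration
    C_f: float     # heat capacity per unit volume of the fluid
    C_s: float     # heat capacity per unit volume of the solid
    lam_f: float   # thermal conductivity of the fluid
    lam_s: float   # thermal conductivity of the solid
    phi: float     # porosity
    D_f: float     # concentration diffusivity of the fluid
    k_T: float     # thermodiffusion ratio
    T0: float      # mean temperature


def dimensionless_parameters(m: Medium, d: float, delta_T: float) -> dict:
    """Evaluate the groups of Table 2 for plate distance d and temperature difference delta_T."""
    C_tot = m.phi * m.C_f + (1.0 - m.phi) * m.C_s        # total heat capacity
    lam_tot = m.phi * m.lam_f + (1.0 - m.phi) * m.lam_s  # total thermal conductivity
    D_tot = m.phi * m.D_f                                # total concentration diffusivity
    return {
        "gamma_a": m.K * lam_tot * m.rho0 * m.c_a / (m.eta * d**2 * C_tot),
        "R": m.rho0 * m.g * m.beta_T * m.K * m.C_f * d * delta_T / (m.eta * lam_tot),
        "phi_star": m.phi * m.C_f / C_tot,
        "L": D_tot * m.C_f / lam_tot,
        "psi": -m.beta_C * m.k_T / (m.beta_T * m.T0),
    }


# Purely illustrative numbers, not measured material data.
example_medium = Medium(
    c_a=1.0, rho0=1000.0, eta=1e-3, K=1e-10, beta_T=2e-4, beta_C=0.1, g=9.81,
    C_f=4.2e6, C_s=2.0e6, lam_f=0.6, lam_s=2.0, phi=0.3, D_f=1e-9, k_T=-0.5, T0=300.0,
)
params = dimensionless_parameters(example_medium, d=1.0, delta_T=10.0)
```

Two properties are easy to read off from such a helper: $R$ is linear in $\Delta T$, so the thermal driving can be varied without touching the other groups, and a negative $k_T$ with positive expansion coefficients gives $\psi > 0$.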
The dimensionless parameters entering into (3) are given in Table 2. As can be seen from its definition, the normalized porosity $\phi^*$ takes values in $[0, 1]$. The value of the correction factor $\gamma_a$ is usually very small (Nield and Bejan 2006). Therefore, the time derivative $\gamma_a\partial_t\mathbf{u}$ is often neglected in (3b), as, e.g., in Sovran et al. (2001) and Elhajjar et al. (2008). However, there is some evidence that this term should not be neglected in a stability analysis (Vadasz and Olek 1999, 2000). For our purposes, we fixed $\gamma_a$ at the reasonable value of 0.0001. It turned out that there are no significant changes for smaller values of the correction factor. Since both $\gamma_a$ and $\phi^*$ appear only in front of time derivative terms, they influence only time-dependent phenomena. Please note that the Lewis number $L$, which compares time scales of concentration and heat diffusion, is the inverse of the Lewis number $Le$ as defined, e.g., in Nield and Bejan (2006). Thermal driving is measured by the Rayleigh-Darcy number $R$ and the strength of the Soret effect by the separation ratio $\psi$.

Boundary conditions to (3) stem from the assumption that the plates are no-slip, impermeable, and ideal heat conductors, leading to

$$w = \theta = \partial_z(c - \psi\theta) = 0 \quad \text{at} \quad z = \pm 0.5. \tag{4}$$

The impermeability requirement couples the gradients of $c$ and $\theta$, which is mathematically inconvenient. Therefore, one commonly introduces instead of $c$ itself the combined field

$$\zeta = c - \psi\theta \tag{5}$$

with the boundary condition $\partial_z\zeta = 0$ at $z = \pm 0.5$. Another simplification is to write the velocity $\mathbf{u}$ as

$$\mathbf{u} = \nabla\times\nabla\times(\Phi\,\mathbf{e}_z) + \nabla\times(\Psi\,\mathbf{e}_z) + \mathbf{U}(z,t). \tag{6}$$
Here $\Phi$ and $\Psi$ are scalar potentials and $\mathbf{U}(z,t) = \langle\mathbf{u}(x,y,z,t)\rangle_{x,y}$ is a possible horizontal mean flow, where $\langle\ldots\rangle_{x,y}$ denotes a lateral average over $x$ and $y$. It is known that, under certain conditions, such a mean flow can be generated in the Rayleigh-Bénard system without a porous medium without imposing a pressure gradient (Linz et al. 1988; Barten et al. 1989, 1995). If we assume that such a mean flow may exist in our system and take the lateral average of (3b), we obtain

$$\gamma_a\,\partial_t\mathbf{U} = -\mathbf{U}. \tag{7}$$
According to this equation, every mean flow decays exponentially (with a large damping rate $1/\gamma_a$) since (3b) does not contain nonlinear driving terms which could sustain a mean flow. As we are not interested in transients in this work, we set $\mathbf{U} = 0$ from the start. To derive equations for $\Psi$ and $\Phi$, we apply the curl operator to (3b) once and twice, respectively, to obtain the equations

$$\gamma_a\,\partial_t\nabla^2\Delta_{xy}\Phi = -\nabla^2\Delta_{xy}\Phi - \Delta_{xy}(\theta + c), \tag{8a}$$

$$\gamma_a\,\partial_t\Delta_{xy}\Psi = -\Delta_{xy}\Psi \tag{8b}$$
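with $\Delta_{xy} = \partial_x^2 + \partial_y^2$. Using the representation (6) of the velocity $\mathbf{u}$, the pressure $p$ no longer appears explicitly in the transport equations. Analogously to the dynamics of a mean flow, $\Psi$ is also rapidly damped away. Thus we assume $\Psi = 0$. The final set of transport equations for this work reads

$$\gamma_a\,\partial_t\nabla^2\Delta_{xy}\Phi = -\nabla^2\Delta_{xy}\Phi - \Delta_{xy}(\theta + c), \tag{9a}$$

$$\partial_t\theta + (\mathbf{u}\cdot\nabla)\theta = R\,w + \nabla^2\theta, \tag{9b}$$

$$\phi^*\partial_t\zeta + (\mathbf{u}\cdot\nabla)\zeta + \psi(1-\phi^*)(\mathbf{u}\cdot\nabla)\theta = -\psi(1-\phi^*)\,R\,\Delta_{xy}\Phi + L\nabla^2\zeta - \phi^*\psi\nabla^2\theta, \tag{9c}$$

together with the boundary conditions

$$0 = \Phi = \theta = \partial_z\zeta \quad \text{at} \quad z = \pm 0.5. \tag{10}$$

2.2 Numerical Method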
To determine the solutions of (9) that describe convection, we used a Galerkin method. The general ansatz we chose is given by

$$X(x,y,z,t) = \sum_{l}^{N_l}\sum_{m}^{N_m}\sum_{n}^{N_n}\left[X_{lmn}\,e^{i(lkx + mky - \omega t)} + \text{c.c.}\right] f_n(z) \tag{11}$$

where $X$ is one of the fields $\Phi$, $\theta$, or $\zeta$, and $+\,\text{c.c.}$ means that we add the complex conjugate of the summand immediately before the plus sign. Furthermore, $k$ is the lateral wavenumber and $\omega$ is a frequency dependent on $l$ and $m$. The functions $f_n(z)$ come from an orthogonal system of trigonometric functions that satisfy the boundary conditions, i.e.,

$$\Phi,\ \theta:\quad f_n(z) = \begin{cases}\sqrt{2}\cos(n\pi z) & \text{if } n = 1, 3, 5, \ldots\\ \sqrt{2}\sin(n\pi z) & \text{if } n = 2, 4, 6, \ldots\end{cases} \tag{12}$$

and […] $N$ and all modes with $(l + n) > N/2$ in $\Phi$. We choose $N = 24$ for positive separation ratios and $N = 32$ for negative separation ratios unless stated otherwise.
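As a small plausibility check of the vertical basis in Eq. (12), the following sketch verifies numerically that the functions vanish at the plates $z = \pm 1/2$ and are orthonormal on $[-1/2, 1/2]$. It only illustrates the properties a Galerkin projection relies on; it is not the authors' implementation, and the function names `f` and `inner` as well as the midpoint quadrature are assumptions made for the example.

```python
import math


def f(n: int, z: float) -> float:
    """Vertical basis function of Eq. (12) for the fields Phi and theta (n >= 1)."""
    if n % 2 == 1:
        return math.sqrt(2.0) * math.cos(n * math.pi * z)  # odd n: cosine modes
    return math.sqrt(2.0) * math.sin(n * math.pi * z)      # even n: sine modes


def inner(n: int, m: int, steps: int = 2000) -> float:
    """Midpoint-rule approximation of the integral of f_n * f_m over [-1/2, 1/2]."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        z = -0.5 + (j + 0.5) * h  # midpoint of the j-th subinterval
        total += f(n, z) * f(m, z)
    return h * total


# Gram matrix of the first four modes; it should be close to the identity matrix.
gram = [[inner(n, m) for m in range(1, 5)] for n in range(1, 5)]
```

Because the basis is orthonormal, projecting Eqs. (9) onto $f_n$ isolates one vertical mode per equation in the linear terms, which is what makes a global Galerkin expansion of this type efficient.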
All of the before mentioned structures may be observed experimentally for parameter combinations for which they are stable against small perturbations. In order to determine the stability domain of the various convection structures in parameter space, we used different test perturbations. For roll structures, we used the perturbation ansatz
ıX .x; y; z; t/ D
Nl X
Nn X
ıXln e st Ci Œ.d lk/xCby fn .z/;
(21)
lDNl nDNn
with perturbation parameters $s$, $d$, and $b$, which is truncated consistently with the truncation of the ansatz for the structures themselves (Huke et al. 2000). The symmetries of the rolls reduce the numerical work, since only stability against perturbations with $b > 0$ and $d \in [0, k/2)$ has to be tested. To decrease the computational cost further for $b = 0$, we distinguished even and odd perturbations on the basis of their symmetry with respect to $x$. Also, for $d = 0$, perturbations can be classified according to whether they satisfy the mirror-glide symmetry (15) or not. This classification may be transferred to the case $d \neq 0$, although the interpretation in terms of the mirror-glide symmetry is lost. As an aside, we mention that the algorithms that we used for the stability analysis turned out to be unstable in the case of negative separation ratios, and hence no stability analysis was possible for $\psi < 0$. For the stability analysis of the three-dimensional convection structures of squares and crossrolls at $\psi > 0$, the reduced perturbation ansatz

$$\delta X(x,y,z,t) = \sum_{l=-N_l}^{N_l} \sum_{m=-N_m}^{N_m} \sum_{n=-N_n}^{N_n} \delta X_{lmn}\, e^{st + i[lkx + mky]} f_n(z) \tag{22}$$
had to be used due to limited computational power and memory capacity. Again, perturbations can be distinguished by their behavior under the mirror-glide operation. With respect to their lateral symmetry, the perturbations fall into three classes: even in $x$ and $y$, odd in $x$ and $y$, and even in $x$ and odd in $y$ (or vice versa). For squares and perturbations with the same symmetries in $x$ and $y$, perturbations were additionally classified as symmetric or antisymmetric under an exchange of the two lateral coordinates. For further details on the symmetries and numerical methods, see Huke et al. (2000, 2007).
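The role of the growth rate $s$ in the ansatz (21) and (22) can be summarized in a few lines. The sketch below is a generic illustration and not the algorithm used by the authors: given the eigenvalues $s$ of some linearized problem, a structure counts as stable if no real part is positive, and an instability is called stationary or oscillatory depending on whether the fastest-growing eigenvalue has a vanishing imaginary part. The function name and the tolerance are assumptions.

```python
import numpy as np

def classify_stability(eigenvalues, tol=1e-10):
    """Classify a state from the growth rates s of perturbations ~ exp(s t)."""
    s = np.asarray(eigenvalues, dtype=complex)
    leading = s[np.argmax(s.real)]        # fastest-growing perturbation
    if leading.real <= tol:
        return "stable"                   # marginal cases are counted as stable
    if abs(leading.imag) <= tol:
        return "stationary instability"
    return "oscillatory instability"

print(classify_stability([-1.0, -0.2 + 3.0j, -0.2 - 3.0j]))
print(classify_stability([0.3, -0.5]))
print(classify_stability([0.1 + 2.0j, 0.1 - 2.0j, -1.0]))
```

In a stability analysis of the kind described above, such a test would be repeated for every admissible pair of the perturbation parameters $d$ and $b$ and for every symmetry class.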
2.3 Conductive Ground State
For a small temperature gradient between the plates, the fluid will be at rest, i.e., $\mathbf{v} = 0$. This is called the ground state of the system. Here, we briefly describe this solution of Eqs. (2) in terms of the original, unreduced fields entering into Eqs. (2).
As heat is transported merely by diffusion and the temperature difference between the plates is $\Delta T$, the temperature varies linearly between the plates according to

$$T_{\mathrm{cond}}(z) = T_0 - \frac{\Delta T}{d}\, z. \tag{23}$$

Via the Soret effect, this temperature gradient causes the appearance of a linear concentration field

$$C_{\mathrm{cond}}(z) = C_0 + \frac{k_T}{T_0} \frac{\Delta T}{d}\, z. \tag{24}$$

Both fields contribute to the pressure between the plates, given by

$$p_{\mathrm{cond}}(z) = p_0 + \rho g \left[ \beta_T \left( T_0 - \frac{\Delta T}{2d}\, z \right) + \beta_C \left( C_0 + \frac{k_T \Delta T}{2 d T_0}\, z \right) \right] z. \tag{25}$$
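The linear profiles of Eqs. (23) and (24) are easy to tabulate. The following is a minimal sketch, assuming that $z$ is measured from midheight so that the plates lie at $z = \pm d/2$; the function name and all numerical values are illustrative and not taken from the chapter.

```python
import numpy as np

def conductive_state(z, T0, C0, delta_T, d, k_T):
    """Linear temperature and concentration profiles of the conductive state.

    z is measured from midheight, so the plates are at z = -d/2 and z = +d/2.
    """
    z = np.asarray(z, dtype=float)
    T = T0 - delta_T / d * z                 # Eq. (23)
    C = C0 + k_T / T0 * delta_T / d * z      # Eq. (24)
    return T, C

# Illustrative numbers only (hypothetical, in arbitrary consistent units).
z = np.array([-0.5, 0.0, 0.5])
T, C = conductive_state(z, T0=300.0, C0=0.1, delta_T=2.0, d=1.0, k_T=0.3)
print(T)  # the lower plate (z = -d/2) is the warmer one
print(C)
```

With a positive $k_T$ the concentration gradient has the opposite sign to the temperature gradient, which is the situation described by Eq. (24).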
In the case of a positive Soret effect, the lighter component moves toward the lower plate, which increases the destabilizing effect of the temperature difference on the ground state. In the case of a negative Soret effect, the lighter component moves toward the upper plate, and the concentration gradient counteracts the destabilizing temperature gradient. Figure 1 shows the critical Rayleigh-Darcy number $R_c$ above which convective perturbations can grow and the corresponding critical wavenumber $k_c$ of the perturbations as a function of the separation ratio $\psi \geq 0$ for various $L$. The destabilizing effect of the Soret coupling becomes apparent in the form of a decreasing $R_c$ from the value $4\pi^2$ of a pure fluid, $\psi = 0$, with increasing $\psi$. The stability threshold is stationary. Hence linear perturbations of the conductive ground state grow in a non-oscillatory manner beyond $R_c$. Furthermore, stationary, nonlinear convective roll and square solutions with wavenumber $k_c$ bifurcate out of the ground state at $R_c$. For a more detailed discussion of the ground state and its stability, see Sovran et al. (2001), Charrier-Mojtabi et al. (2007), and Elhajjar et al. (2008).

For negative separation ratios, the stationary stability threshold diverges with decreasing $\psi$. How fast this happens depends on the Lewis number $L$ (Sovran et al. 2001; Charrier-Mojtabi et al. 2007; Elhajjar et al. 2008). Instead, the ground state is bounded by an oscillatory stability threshold. Figure 2 shows the corresponding critical Rayleigh-Darcy number $R_c$, critical wavenumber $k_c$, and Hopf frequency $\omega_c$ as a function of $\psi$ for different values of the normalized porosity and a fixed Lewis number $L = 0.01$. Since the stationary stability boundary diverges for $\psi \lesssim -0.01$, it is not depicted. $R_c$ and $\omega_c$ increase with decreasing $\psi$ and decreasing normalized porosity, whereas $k_c$ is nearly constant with respect to $\psi$ for large normalized porosities. Small values of the normalized porosity yield a decreasing critical wavelength for weak Soret coupling and an increasing critical wavelength for strong Soret coupling. A more detailed analysis can be found in Sovran et al. (2001), Charrier-Mojtabi et al. (2007), and Elhajjar et al. (2008).
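The pure-fluid reference values used in this section, a threshold of $4\pi^2$ at the wavenumber $\pi$, can be checked against the classical neutral curve of the Horton-Rogers-Lapwood problem, $R(k) = (k^2 + \pi^2)^2 / k^2$. This formula is not stated in the chapter; it is the standard result for a pure fluid in a porous layer between impermeable, perfectly conducting plates and is used here only as a plausibility check. The function name is hypothetical.

```python
import numpy as np

def neutral_curve_pure_fluid(k):
    """Marginal Rayleigh-Darcy number of the lowest vertical mode (pure fluid)."""
    k = np.asarray(k, dtype=float)
    return (k**2 + np.pi**2) ** 2 / k**2

k = np.linspace(1.0, 6.0, 500001)
R = neutral_curve_pure_fluid(k)
i_min = np.argmin(R)
k_c, R_c = k[i_min], R[i_min]
print(k_c, R_c)  # close to pi and 4 * pi**2
```

The thresholds of the binary mixture in Figs. 1 and 2 depend in addition on the separation ratio and the Lewis number and are not reproduced by this simple curve.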
Fig. 1 Critical Rayleigh-Darcy numbers $R_c$ (top) for onset of stationary roll and square convection and corresponding critical wavenumbers $k_c$ (bottom) as functions of the separation ratio $\psi$ for different Lewis numbers $L$

Fig. 2 Critical Rayleigh-Darcy number $R_c$ (top) for onset of oscillatory TW convection, the wavenumber $k_c$ (middle), and Hopf frequency $\omega_c$ (bottom) as function of separation ratio $\psi$ for different normalized porosities and fixed Lewis number $L = 0.01$
3 Structures for Positive Separation Ratios

3.1 Bifurcation Behavior of Roll, Crossroll, and Square Convection
The typical bifurcation scenario of stationary rolls, crossrolls, and squares at positive separation ratios is shown in Fig. 3 by means of the leading amplitudes $\Phi_{101}$ and $\Phi_{110}$ of the velocity potential (6) that enters into the balance Eqs. (9). As an example, the parameters are chosen to be $\psi = 0.4$, $L = 0.1$, and $k = \pi$. Here, rolls are represented by the Ansatz (18). Hence, one of the amplitudes $\Phi_{101}$ and $\Phi_{110}$ vanishes, whereas the other one takes the same value as the amplitude $\Phi_{11}$ of Ansatz (16). We denote the latter by $\Phi_R$; it is depicted by the dashed line. For squares, we have $\Phi_{101} = \Phi_{110}$, and we denote both by $\Phi_S$, which is depicted by the solid line. Both rolls and squares emerge when the Rayleigh-Darcy number exceeds the critical value $R_c \approx 4.8$. This value for the stationary bifurcation threshold is much smaller than the corresponding quantity for a pure fluid ($\psi = 0$), $R_c^0 = 4\pi^2$. With increasing thermal driving, the leading amplitudes of both structures increase monotonically. The ratio of these amplitudes is approximately $\Phi_R/\Phi_S \approx \sqrt{2}$, i.e., $\Phi_{101}^2 + \Phi_{110}^2$ is approximately the same for both structures. This relation is almost exact in the so-called Soret regime of small thermal driving, but the ratio increases slightly for larger Rayleigh-Darcy numbers. Crossrolls exist only in a finite $R$-interval and connect the branch of squares, from which they bifurcate forward at $R \approx 34.05$, with the branch of rolls. As a consequence, their leading amplitudes $\Phi_{101}$ and $\Phi_{110}$, depicted by the dotted lines
Fig. 3 Bifurcation diagram of rolls (dashed line), squares (solid line), and crossrolls (dotted lines) for $\psi = 0.4$, $L = 0.1$, $k = \pi$. The leading amplitudes of the velocity potential $\Phi$ entering into the representation (6) of the velocity field are plotted versus the Rayleigh-Darcy number $R$
in Fig. 3, differ from each other: one grows with increasing $R$ to take the value of the corresponding amplitude for rolls while the other decreases and finally vanishes. For a fixed wavenumber, the bifurcation scenario described above is valid as long as the Soret effect is sufficiently strong and the concentration diffuses slowly enough. However, the $R$-interval in which crossrolls exist shrinks with decreasing $\psi$ and growing $L$ until crossrolls no longer exist. Quantitative results are given in Sect. 3.3.
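The amplitude ratio of about $\sqrt{2}$ between rolls and squares follows directly from the statement that the sum of the squared leading amplitudes is approximately the same for both structures. A short worked step, using only the amplitudes $\Phi_R$ and $\Phi_S$ introduced above:

```latex
\begin{align*}
\text{rolls:}   \quad & \Phi_{101}^2 + \Phi_{110}^2 = \Phi_R^2 + 0 = \Phi_R^2, \\
\text{squares:} \quad & \Phi_{101}^2 + \Phi_{110}^2 = \Phi_S^2 + \Phi_S^2 = 2\,\Phi_S^2, \\
\text{equal sums:} \quad & \Phi_R^2 \approx 2\,\Phi_S^2
  \;\Longrightarrow\; \frac{\Phi_R}{\Phi_S} \approx \sqrt{2}.
\end{align*}
```

Crossrolls interpolate between these two cases, with one amplitude growing toward the roll value while the other decays to zero.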
3.2 Structure of the Fields for Roll and Square Convection
To illustrate the structure of the fields, we again fix the wavenumber at $k = \pi$, as the behavior is quite similar for a wide range of wavenumbers. As parameters, we choose $L = 0.01$, $\psi = 0.3$, and a normalized porosity of 1, resulting in a critical thermal driving of $R \approx 0.39$. The chosen parameters are realistic for ethanol-water mixtures (Jung 2005). The three columns of Fig. 4 show from top to bottom the temperature field, concentration distribution, streamlines, and lateral profiles of $w$, $T$, and $C$ at midheight $z = 0$ over one periodicity length in $x$. In the left column, the thermal driving is weak with $R = 10$ such that convection is mainly driven by the Soret effect. Consequently, the temperature field deviates
[Figure 4: columns R = 10, R = 50, R = 100; rows (top to bottom) temperature, concentration, streamlines, lateral profiles of w, T, and C (10C for R = 50 and R = 100); axes x and z]
Fig. 4 Structure of roll convection. Shown are from top to bottom the fields of temperature and concentration, the streamlines, and the lateral profiles of the vertical velocity $w$, of the temperature $T$, and of the concentration $C$ (or $10C$, respectively) at midheight, $z = 0$. Parameters are $\psi = 0.3$, $L = 0.01$, $k = \pi$
only marginally from its linear profile in the ground state. On the other hand, the concentration field develops a plumelike structure because concentration differences that arise as a consequence of advective transport are not reduced efficiently by the comparatively weak diffusion ($L \ll 1$). This anharmonicity of the concentration field can also be seen in its lateral profile, and it is transferred to the velocity field, as can be seen in the lateral profile of $w$ as well as in the streamlines (Umla et al. 2011).

The value $R = 50$, which we chose for the middle column of Fig. 4, belongs to the so-called Rayleigh regime. Here, the thermal driving is strong enough to generate convection even without a (positive) Soret effect, leading to a more distinct modulation of the temperature field. However, the lateral profile of the temperature indicates that it is still harmonic. This transfers to the velocity field, as can be seen from the almost harmonic streamlines. Thus, one concludes that the anharmonicity in the streamlines for $R = 10$ is indeed induced by the Soret effect. By contrast, for $R = 50$, the lateral profile of the concentration field is almost constant with a dip at $x = 0$ and peaks at the boundary (note the magnification factor of 10 for the lateral $C$ profile). This is due to good mixing in the bulk as a consequence of the higher fluid velocity and the development of boundary layers at the plates and between the rolls.

In the right column of Fig. 4, the Rayleigh-Darcy number increases further to $R = 100$. Now the temperature field shows plumelike structures similar to the concentration field for $R = 10$ as a result of nonlinear advection of the temperature field. This again yields an anharmonic structure of the streamlines. The boundary layers in the concentration field become thinner.
Figure 5 gives a comparison of square (left) and roll (right) convection in terms of the temperature fields (top two rows) and concentration fields (bottom two rows) in the vertical cross section $y = 1$ for parameters $L = 0.01$, $\psi = 0.3$, and $k = \pi$. The upper row of each subfigure depicts the situation for $R = 20$, the lower row for $R = 40$. As can be seen, the temperature fields of both structures are barely distinguishable for the two values of thermal driving. For $R = 20$, the temperature is still rather similar to the ground state profile, whereas for $R = 40$, deviations become visible. On the other hand, the concentration fields of rolls and squares clearly look different. In the middle of the depicted cross section, at $x = 1$, the boundary layer in the downflow region for squares is broader. Boundary layers at the margins of the cross section are only present for rolls, for which upflow streams are located there, whereas for squares upflow is found at $(x, y) = (0, 0)$. This can be seen in Fig. 6, which shows from top to bottom isolines of the vertical velocity component $w$, of the temperature field, and of the concentration field in a horizontal cross section at $z = 0$. The Rayleigh-Darcy numbers and the other parameters are the same as in Fig. 5. According to the $w$-isolines, the fluid is convected from the lower to the upper plate across the corners of the cross section depicted. Then, the fluid streams near the upper plate to the middle of the depicted area and flows down to the lower plate again. As before, the similarity in the anharmonic behavior of concentration isolines and $w$-isolines for weak thermal
[Figure 5: columns Squares (left) and Rolls (right); temperature (top two rows) and concentration (bottom two rows), each for R = 20 and R = 40; axes x and z]
Fig. 5 Structure of square and roll convection in a vertical cross section covering one wavelength. Shown are gray-scale representations with isolines of temperature and of concentration in squares at $y = 1$ (left column) and of rolls (right column). Parameters are $R = 20$ (top stripe) and $R = 40$ (bottom stripe) of each subgraph. Furthermore, $L = 0.01$, $\psi = 0.3$, and $k = \pi$
driving is obvious, as well as the smoothing out of the $w$-isolines with increasing thermal driving as buoyancy is mainly generated by temperature differences. Thus, for $R = 40$, the $w$-isolines look more similar to the temperature isolines than to the concentration isolines.
3.3 Stability Results
Figure 7 gives an overview of the stability regimes of the different structures and the corresponding stability boundaries in the $L$-$R$ plane. Inside the regimes marked with R, S, or CR in this phase diagram, stationary rolls, squares, or crossrolls are stable. None of these structures is stable in the regimes denoted by OCR or OS. The parameters are chosen to be $\psi = 0.4$, a normalized porosity of 1, and $k = \pi$. Note that only perturbations with the same lateral periodicity as the underlying pattern are taken into account, see (22). For $R$ values above the thin solid line in Fig. 7, the ground state becomes unstable against stationary perturbations with the lateral wavenumber $k = \pi$.
[Figure 6: columns R = 20 (left) and R = 40 (right); rows (top to bottom) isolines of the w-field, temperature, concentration; axes x and y]
Fig. 6 Structure of square convection in a horizontal cross section at midheight, $z = 0$, covering one wavelength. Shown are isolines of $w$ (top) and gray-scale representations with isolines of temperature (middle) and of concentration (bottom). Parameters are $R = 20$ (left column) and $R = 40$ (right column). Furthermore, $L = 0.01$, $\psi = 0.3$, and $k = \pi$
The resulting pattern is a roll state when the Lewis number is larger than $L \approx 0.55$. There, the dashed line branches from the thin line representing the stability boundary of the ground state in Fig. 7. Below the dashed line, rolls become unstable against stationary perturbations and are replaced by crossrolls. In the corresponding bifurcation diagram, this transition corresponds to the point where the crossroll branch merges into the roll branch (compare with Fig. 3). Since crossrolls are even in the $x$- as well as in the $y$-direction and since they have the same periodicity in both directions,
Fig. 7 Phase diagram for $\psi = 0.4$, $k = \pi$, and a normalized porosity of 1. In the regimes denoted by G, R, S, and CR, the ground state, rolls, squares, and crossrolls are stable, respectively. In the area marked by OS, squares are destabilized by oscillatory perturbations with square symmetry. In the area marked by OCR, we expect oscillatory crossrolls
the same symmetries are found for the perturbations which trigger the replacement of rolls by crossrolls. For $L < 0.55$, rolls become unstable at the onset, giving rise to stable squares.

The stability regime of squares above onset is restricted by three different stability boundaries. The first one, denoted by the thick solid line, is related to stationary perturbations which tend to replace squares by crossrolls when the thermal driving becomes strong enough. This transition occurs at the Rayleigh-Darcy number where the crossroll branch bifurcates forward out of the square branch, see the bifurcation diagram in Fig. 3. The perturbations related to this transition are even in the $x$- and $y$-direction and have the same periodicity in both directions for the same reasons as stated above for the transition between CR and R. Since squares are symmetric under reflection at the lateral main diagonal but crossrolls are not, the perturbations have to break this symmetry, i.e., they are antisymmetric under this reflection.

The second stability boundary of squares is illustrated by the lower dotted line in Fig. 7. It occurs at small $L$ only (here for $L < 0.015$), where it precedes the stability boundary described above. The corresponding perturbations have the same symmetries as the stationary perturbations which tend to replace squares by crossrolls, but here they are oscillatory. We suppose that the perturbations give rise to the growth of the so-called oscillatory crossrolls (OCR, see also Umla et al. 2011).

The third stability boundary of squares, given by the dash-dotted line, destabilizes squares in the area denoted by OS in Fig. 7. The instability is generated by perturbations which are oscillatory but have the same spatial symmetries as the square pattern itself. Therefore, it has been named oscillatory square (OS) instability in Umla et al. (2011).

For crossrolls, we detected only one relevant instability mechanism, displayed by the upper dotted line in Fig. 7. By contrast, the dashed and the thick solid lines, which mainly restrict the regime of crossrolls, are not stability boundaries for crossrolls but indicate the finite $R$-interval in which the crossroll state exists: above the dashed line crossrolls merge into rolls and below the thick solid line into squares. The instability related to the dotted line occurs for small $L$ (here for $L < 0.015$)
Fig. 8 Stability boundaries in the $k$-$R$ plane for a normalized porosity of 1 and several $L$-$\psi$ combinations. The line style is as in Fig. 7
when the thermal driving becomes weak enough. This mechanism should replace crossrolls by oscillatory crossrolls, which are expected to exist in the small area denoted by OCR (Umla et al. 2011). The corresponding perturbations are oscillatory and show the same spatial symmetries as crossrolls.

The wavenumber dependence of the stability boundaries is depicted in Fig. 8, where the $k$-$R$ plane is shown for a normalized porosity of 1 and several $L$-$\psi$ combinations. The stability boundaries are denoted by the same line types as in the phase diagram, Fig. 7. The sequence of convection structures can be illustrated by means of the plot for the parameter combination $\psi = 0.4$ and $L = 0.01$ in the lower left corner of Fig. 8. When the thermal driving increases at fixed $k$, say, $k = \pi$, the ground state becomes unstable against squares above the thin solid line. Then, squares are destabilized by the OS perturbations above the lower dash-dotted line until they become stable again beyond the upper dash-dotted line. When $R$ grows above the dotted line, squares are presumably replaced by oscillatory crossrolls, which in turn are replaced by stationary crossrolls for slightly higher Rayleigh-Darcy numbers. Note that only the lower boundary of the OCR regime is displayed in the plot since the upper one lies very close ($\Delta R < 0.25$) and thus cannot be resolved sufficiently. Finally, stationary crossrolls are transformed into rolls beyond the dashed line. According to Fig. 8, this
sequence of structures remains unchanged for a wide range of wavenumbers, i.e., $2.55 \leq k \leq 4.44$. For wavenumbers outside this range, however, the OCR regime vanishes. Moreover, the thick solid line kinks downward and the dashed line kinks upward at the small-$k$ side, thus leaving a regime which has not been tested for convection patterns. In addition, the plot for $\psi = 0.4$ and $L = 0.01$ shows the following: when the wavenumber changes from $k = \pi$ to larger $k$, the onset of convection and with it the stability regimes of rolls, squares, and crossrolls as well as the OS regime are shifted toward higher $R$. The $R$-interval in which crossrolls are stable increases slightly with $k$, whereas the OS regime shrinks. According to the other plots in Fig. 8, these changes with $k$ hold true for a wide range of $L$-$\psi$ combinations. Moreover, Fig. 8 shows that the stability regimes of squares and crossrolls expand when $\psi$ is increased, i.e., when the Soret coupling becomes stronger. In other words, a strong Soret effect allows 3D patterns to exist in a large $R$-interval. When the separation ratio $\psi$, on the contrary, is decreased, the $R$-$k$ regime of squares shrinks until squares no longer exist as a stable convection pattern. Then the crossrolls that transfer the stability from squares to rolls no longer exist either. A more detailed stability analysis of three-dimensional convective structures that discusses in particular the parameter dependence of the OS and OCR regime is given in Umla et al. (2011).

The stability regime of rolls is shown in Fig. 9 for various values of $\psi$ and $L$. In the region of the $k$-$R$ plane labeled with S, rolls are stable to infinitesimal perturbations. In this case, perturbations with a lateral periodicity different from that of the underlying roll pattern are considered as well. The region of stable rolls is mainly restricted by the crossroll (CR) boundary given by the solid line.
Rolls with either too large or too small wavenumbers are destabilized by the CR instability, which causes the growth of rolls perpendicular to the existing pattern. The CR perturbations are even in the $x$-direction and possess the same periodicity in the $x$-direction as the original rolls, so that we find the CR instability for $d = 0$. In addition, the region of stable rolls is in general restricted on the small-$k$ side by the dash-dotted line belonging to the zigzag (ZZ) boundary. If the rolls are zigzag unstable, a new set of rolls with a larger wavenumber begins to grow. The ZZ perturbations satisfy the mirror-glide symmetry (15) and are odd in the $x$-direction. They have the same periodicity in the $x$-direction as the existing pattern, and hence we find the ZZ instability again for $d = 0$. For large $\psi$ and small $L$, the ZZ boundary no longer intersects the CR-stable region as the CR boundary shifts to larger wavenumbers. Consequently, in this case, the region of stable rolls is limited exclusively by the CR instability mechanism. For small $L$, a part of the CR boundary turns oscillatory at large $R$. The new oscillatory part has been named oscillatory crossroll (OCR) boundary in Umla et al. (2010) and is marked by the dashed-double-dotted line in Fig. 10, where $L = 0.01$, $\psi = 0.01$. As a time-dependent phenomenon, the OCR boundary can depend on the normalized porosity and on the time scale of the velocity field. With decreasing normalized porosity, the OCR boundary moves toward the CR boundary, and the $R$-values at which the boundary changes from stationary to oscillatory increase until the boundary is stationary at all $R$ for sufficiently small normalized porosity. When the time scale of the velocity field is varied
Fig. 9 Complete stability balloons of roll convection in the $k$-$R$ plane for several combinations of $L$ and $\psi$. The CR boundary is denoted by the solid line, the ZZ boundary by the dash-dotted one, and the stability boundary of the conductive ground state by the dashed one. The letter S marks the region of stable rolls
between $10^{-4}$ and 0, the stability boundaries change negligibly. The minor influence of this time scale can be explained as follows: in nondimensional units, the diffusion times of the temperature and the concentration field are 1 and $1/L = 100$, whereas the time scale of the velocity field is smaller by a factor of at least $10^4$ and thus can be neglected.
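The separation of time scales invoked here is a matter of simple arithmetic. The sketch below uses the values quoted in the text (Lewis number 0.01 and an upper value of $10^{-4}$ for the time scale of the velocity field); the function name is hypothetical.

```python
def time_scale_ratios(lewis_number, velocity_time_scale):
    """Ratios of the diffusion times to the time scale of the velocity field.

    Times are in units of the thermal diffusion time, so the temperature field
    relaxes on a time 1 and the concentration field on a time 1 / L.
    """
    t_temperature = 1.0
    t_concentration = 1.0 / lewis_number
    return (t_temperature / velocity_time_scale,
            t_concentration / velocity_time_scale)

ratio_T, ratio_C = time_scale_ratios(lewis_number=0.01, velocity_time_scale=1.0e-4)
print(ratio_T, ratio_C)  # about 1e4 and 1e6
```

On the slow time scales that govern the temperature and concentration fields, the velocity field therefore adjusts almost instantaneously.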
4 Structures for Negative Separation Ratios
As mentioned before, there are two structures which we investigated for negative separation ratios: stationary rolls and traveling waves (TWs). To focus on their main features, we fix the parameters to be $L = 0.01$, $\psi = -0.4$, and a normalized porosity of 1. As can be seen from Fig. 2, a wavenumber of $k = \pi$ is then close to the critical wavenumber.
Fig. 10 Complete stability balloon of roll convection in the $k$-$R$ plane for $L = 0.01$ and $\psi = 0.01$. The curve style is the same as in Fig. 9, except that the new OCR boundary is denoted by the dashed-double-dotted line

Fig. 11 Bifurcation behavior of TW (solid) and roll (dashed) solutions, depicted by means of leading amplitude $\Phi_{11}$ and TW frequency $\omega$ versus the Rayleigh-Darcy number $R$. The forward bifurcating roll branch for $\psi = 0$ in a pure fluid (pf) is shown by the dotted line. Parameters are $L = 0.01$, $\psi = -0.4$, $k = \pi$, and a normalized porosity of 1
For a thorough discussion on changes in the behavior of the structures for different parameters, we refer the reader to Augustin et al. (2010).
4.1 Bifurcation Behavior
Figure 11 illustrates by means of the leading amplitude $\Phi_{11}$ and the Hopf frequency $\omega$ how the structures under consideration bifurcate out of the ground state. We included here the leading amplitude $\Phi_{11}$ of rolls for a pure fluid ($\psi = 0$), for which no TWs can be found. The pure fluid case was already discussed by Straus (1974) and De La Torre Juárez and Busse (1995).
Let us first consider the solution of stationary rolls (dashed line in Fig. 11). While rolls bifurcate forward for the pure fluid, the bifurcation for negative Soret coupling is backward. The roll bifurcation branch is no longer connected to the ground state, as can be concluded from the divergence of its stationary stability threshold (Sovran et al. 2001; Charrier-Mojtabi et al. 2007; Elhajjar et al. 2008). Following the lower roll solution branch with decreasing Rayleigh-Darcy number $R$, the amplitude $\Phi_{11}$ does not increase significantly. This changes when the saddle point is reached and the direction of the bifurcation branch changes. From here on, $\Phi_{11}$ increases on the upper solution branch with increasing $R$ in a way comparable to the behavior of the pure fluid rolls (dotted line in Fig. 11).

The behavior of the TW branch is more complicated. Like the roll branch, it bifurcates backward out of the ground state, but it is still connected to the ground state. At the bifurcation point, a Hopf frequency $\omega \approx 16$ can be found. Another difference to the roll branch is that the TW branch contains three saddle points at $R \approx 62.11$, $R \approx 62.44$, and $R \approx 56$. Thus, it can be divided into two backward running parts ($R \approx 66.34$ to $R \approx 62.11$ and $R \approx 62.44$ to $R \approx 56$) with unstable TWs and two forward running parts ($R \approx 62.11$ to $R \approx 62.44$ and $R \approx 56$ to $R \approx 67$). At the end of the second forward running part, the TW merges with the rolls such that the nonlinear TW frequency $\omega$ vanishes there, at $R \approx 67$. For $R \approx 62$, we expect bistability of slow-traveling and fast-traveling waves in analogy to the Rayleigh-Bénard system without porous medium (Hollinger et al. 1997). The characterization of the stable TW structures as fast and slow is due to the behavior of the frequency, which is proportional to the phase velocity and still large for the first forward running part of the bifurcation branch. It turns out that this similarity is characteristic for TW structures for a variety of parameters (Augustin et al. 2010).
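The bookkeeping of backward and forward running parts of the TW branch can be made explicit with a few lines of code. The sketch below uses the approximate $R$ values quoted in the text; it reads the first value as the bifurcation point out of the ground state and the last one as the merging point with the roll branch. The function names are hypothetical, and the code only sorts the quoted turning points; it does not compute the branch.

```python
# Turning points of the TW branch as quoted in the text (approximate R values).
branch_points = [66.34, 62.11, 62.44, 56.0, 67.0]

def branch_segments(points):
    """Split a solution branch into forward and backward running parts."""
    segments = []
    for r_start, r_end in zip(points[:-1], points[1:]):
        direction = "forward" if r_end > r_start else "backward"
        segments.append((r_start, r_end, direction))
    return segments

def forward_states_at(points, R):
    """Number of forward running parts that exist at thermal driving R."""
    return sum(
        1 for a, b, direction in branch_segments(points)
        if direction == "forward" and min(a, b) <= R <= max(a, b)
    )

segments = branch_segments(branch_points)
for r_start, r_end, direction in segments:
    print(f"{r_start:6.2f} -> {r_end:6.2f}: {direction}")
print(forward_states_at(branch_points, 62.3))  # two forward parts coexist here
```

Two coexisting forward running parts in the narrow window around $R \approx 62$ correspond to the expected bistability of fast and slow TWs.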
4.2 Structure of the Fields for TW Convection
The structure of the fields for roll convection at negative $\psi$ looks rather similar to the one already discussed in Sect. 3.2 for positive separation ratios. Therefore, we only discuss temperature, concentration, and velocity for TW convection. For better orientation with regard to the bifurcation branch, Fig. 12 shows how the leading vertical velocity mode $w_{11} = k^2 \Phi_{11}$ and the phase velocity $v_p = \omega/k$ vary with $R$. The four points labeled a–d mark the TW solutions for which the fields are shown in the corresponding subfigures a–d of Fig. 13. These subfigures show from top to bottom the temperature $T$, the concentration $C$, and streamlines of the velocity field in a comoving reference frame. The lowest plot of each subfigure contains the lateral wave profiles of $T$, $C$, and $w$ at midheight, $z = 0$. As can be seen from the lateral profiles in Fig. 13a, there is a phase difference between vertical velocity, temperature, and concentration. This phase difference diminishes as we move along the bifurcation branch until it vanishes when the TW branch merges into the roll branch at $R \approx 67$. The lateral profile of the temperature is almost harmonic for all $R$, but the lateral profile of the vertical velocity differs from a harmonic one at the maxima and minima in Fig. 13c, d. The behavior of the
Fig. 12 Leading amplitude $w_{11} = k^2 \Phi_{11}$ (solid) of the vertical velocity field and phase velocity $v_p = \omega/k$ (dashed) of the TWs displayed in the bifurcation diagram of Fig. 11. Points a–d mark TW states for which the fields and lateral profiles are shown in Fig. 13. Parameters are $L = 0.01$, $\psi = -0.4$, $k = \pi$, and a normalized porosity of 1
concentration profile is more complex as one follows the TW solution branch: it is harmonic in the states a and b and becomes trapezoidal in c, and in state d it is mainly constant apart from peaks where the fluid flows downward and dips where it flows upward. Note that the concentration profiles in Fig. 13c, d are multiplied by a factor 10 for better visibility.

The first row in each of the Fig. 13a–d shows the temperature distribution. Starting from the linear ground state profile, it becomes more and more modulated. But, although the thermal driving in Fig. 13d is quite strong at $R = 67$, the temperature field can still be well described by taking into account only a few modes. The concentration field is shown in the second row of each subfigure in Fig. 13. The concentration isolines look similar to the streamlines of the TW structure that are shown in the third row of Fig. 13a–d. This similarity can be explained by the smallness of the Lewis number $L$ (Lücke et al. 1992; Barten et al. 1995) and the fact that there is approximately no exchange of concentration between regions of closed and regions of open streamlines. The behavior of the streamlines can be modeled quite well by a single-mode representation of the velocity field. In a frame that comoves with the phase velocity $v_p$ of the TW, this truncation amounts for $k = \pi$ to

$$\tilde{u} = w_{11} \sin(\pi x) \sin(\pi z) - v_p, \tag{26}$$

$$\tilde{v} = 0, \tag{27}$$

$$\tilde{w} = w_{11} \cos(\pi x) \cos(\pi z). \tag{28}$$
Here the tilde identifies the velocity field in the comoving frame. This simple model may already explain the occurrence of closed streamlines as soon as the velocity amplitude exceeds the phase velocity. For more details on this, see Augustin et al. (2010). Moreover, for the concentration field $\tilde{C}$ that is stationary in the comoving frame we have
[Figure 13: subfigures a, b, c, d; rows (top to bottom) temperature, concentration, streamlines, lateral profiles; axes x and z]
Fig. 13 Structure of TW convection. Shown are in each subfigure (top to bottom) temperature, concentration, streamlines of the velocity in a comoving frame, and lateral profiles of $w$ (solid line), $T$ (dashed line), and $C$ (or $10C$, respectively, dotted line) at $z = 0$. Subfigures a–d correspond to the TW states that are marked by a–d in Fig. 12. Parameters are $L = 0.01$, $\psi = -0.4$, $k = \pi$, and a normalized porosity of 1
$$\tilde{\mathbf{u}} \cdot \tilde{\nabla} \tilde{C} = L \tilde{\nabla}^2 \left( \tilde{C} - \psi \tilde{T} \right). \tag{29}$$
This shows that $(\tilde{\nabla} \tilde{C}) \cdot \tilde{\mathbf{u}}$ is almost zero for small $L$ such that the isolines of the concentration field are almost identical with the streamlines of the velocity field $\tilde{\mathbf{u}}$. For a more detailed discussion on the relationship between the streamlines and concentration isolines, see Augustin et al. (2010).
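The statement that closed streamlines appear once the velocity amplitude exceeds the phase velocity can be checked directly for the single-mode model of Eqs. (26)-(28). The sketch below assumes the arguments $\pi x$ and $\pi z$ for $k = \pi$, positive $w_{11}$ and non-negative $v_p$, and plates at $z = \pm 1/2$; the stream function and all function names are introduced here for illustration only.

```python
import numpy as np

def comoving_velocity(x, z, w11, v_p):
    """Single-mode velocity field of Eqs. (26)-(28) for k = pi."""
    u = w11 * np.sin(np.pi * x) * np.sin(np.pi * z) - v_p
    w = w11 * np.cos(np.pi * x) * np.cos(np.pi * z)
    return u, w

def stream_function(x, z, w11, v_p):
    """Stream function S with u = dS/dz and w = -dS/dx."""
    return -(w11 / np.pi) * np.sin(np.pi * x) * np.cos(np.pi * z) - v_p * z

def interior_stagnation_point(w11, v_p):
    """Stagnation point on the line x = 1/2, or None if there is none.

    On x = 1/2 the vertical velocity vanishes, and u = 0 requires
    sin(pi z) = v_p / w11, which has a solution with |z| < 1/2 only if w11 > v_p.
    """
    if w11 <= v_p:
        return None
    return 0.5, np.arcsin(v_p / w11) / np.pi

point = interior_stagnation_point(w11=20.0, v_p=10.0)
print(point)                                    # stagnation point at z = 1/6
print(comoving_velocity(*point, 20.0, 10.0))    # both components vanish there
print(interior_stagnation_point(5.0, 10.0))     # None: only open streamlines
```

At such a stagnation point the velocity gradient is purely rotational in this model, so the point is a center around which streamlines close; for $v_p = 0$ it reduces to the center of a stationary roll at midheight.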
4.3 Lateral Currents
Since the TWs lack the mirror symmetry of stationary rolls, lateral heat and concentration currents may arise, although no mean flow exists. The lateral average of the lateral heat current density $\langle j_{T,x} \rangle_x$ is given by

$$\langle j_{T,x} \rangle_x = \langle u T + \partial_x T \rangle_x. \tag{30}$$

With $u$ and $T$ given by

$$u = \partial_z \partial_x \Phi(x,z), \tag{31}$$

$$T = T_0 - R z + \theta(x,z), \tag{32}$$

we get together with Ansatz (19)
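$$\langle j_{T,x} \rangle_x = \sum_{m=1}^{N} \sum_{n=1}^{N_\Phi} \sum_{l=1}^{M_\Phi} 4mk \left[ \mathrm{Im}(\theta_{mn})\, \mathrm{Re}(\Phi_{ml}) - \mathrm{Re}(\theta_{mn})\, \mathrm{Im}(\Phi_{ml}) \right] f_n(z)\, \partial_z f_l(z). \tag{33}$$

Here, Im denotes the imaginary part of a quantity. Obviously, $\langle j_{T,x} \rangle_x$ vanishes if all lateral modes are in phase. The same holds true for the lateral concentration current $\langle j_{C,x} \rangle_x$, which is built analogously from $\Phi$ and $c$. Figure 14 shows the laterally averaged lateral currents of heat, $\langle j_{T,x} \rangle_x$, and of concentration, $\langle j_{C,x} \rangle_x$, respectively, for the TW states a–d of Fig. 12. Both currents satisfy the symmetry

$$\langle j_x \rangle_x (z) = -\langle j_x \rangle_x (-z) \tag{34}$$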
which follows from Eq. (33) and from the mirror-glide symmetry (14). Consequently, the total averages $\langle j_{T,x} \rangle_{x,z}$ and $\langle j_{C,x} \rangle_{x,z}$ vanish in accordance with the vanishing of any mean flow. In the upper half of the system, the heat current is parallel to the direction of propagation of the TW, whereas the concentration current is antiparallel. These relations are reversed in the lower half of the system. The heat current $\langle j_{T,x} \rangle_x$ always vanishes at $z = \pm 0.5$ as a consequence of the boundary condition $\theta = 0$. However, the concentration current $\langle j_{C,x} \rangle_x$ does not
776
M. Augustin et al.
Fig. 14 Vertical profiles of averaged lateral currents sustained by TWs. Full and dashed lines show the ˛ of concentration, ˛ ˝ ˝currents jC;x x , and of heat, jT;x x , respectively. The subfigures a–d correspond to the TW states that are identified by the points a–d in Fig. 12. Parameters are L D 0:01, D 0:4, k D , and
D 1
vanish in general at the plates because of the different boundary conditions (4) for the concentration field c. Following the TW branch from its bifurcation out of the conductive state up to R , the amplitudes of hjT;x ix and of hjC;x ix grow at first with growing amplitude of ˆ, , and modes. But then – i.e., beyond point c in Fig. 12 – the amplitudes of the laterally averaged currents decrease again, and the currents vanish as the TW branch transitions at R into the SOC branch.
5 Conclusion
The goal of this article is to provide the reader with an overview of some convection patterns of binary fluid mixtures in a Rayleigh-Bénard system with a porous medium, which may be regarded as a simple model system for real-world applications such as geothermal reservoirs, water flow in aquifers, or magma flow deep inside the Earth. We considered four kinds of convection structures: rolls, squares, and crossrolls in the case of a positive Soret effect as well as rolls and traveling waves in the case of a negative Soret effect. All structures have been determined numerically with a Galerkin method. Moreover, in the case of positive separation ratios, we carried out a full linear stability analysis for rolls and a restricted linear stability analysis for squares.

Rolls are the primary convection structure for a positive Soret effect. Their temperature and concentration fields behave rather similarly to those of rolls in a Rayleigh-Bénard system without a porous medium, whereas the streamlines show a previously unknown anharmonicity. This is due to the fact that anharmonicities of the fields that mainly drive convection (the concentration field in the Soret regime and the temperature field in the Rayleigh regime) are not smoothed by a diffusion term in the velocity equation but transferred directly by a relaxation term. The same results hold true for square convection.

The stability region of rolls in parameter space is mainly restricted by crossroll and zigzag perturbations. For a sufficiently strong Soret effect, the crossroll boundary undergoes two significant changes. Firstly, it is detached from the ground state, giving rise to other stable forms of convection, i.e., squares. Secondly, for small values of the Lewis number, a part of the crossroll boundary changes its behavior in that the corresponding perturbations become oscillatory.

If both roll and square states exist, they bifurcate simultaneously forward out of the ground state. For a sufficiently strong Soret effect, these convection states are connected via a crossroll branch that bifurcates forward out of the square state for sufficiently large $R$. This bifurcation behavior already gives the main limitation to the stability region of squares. For small Lewis numbers and large separation ratios, the crossroll boundary may be partially oscillatory. Moreover, for small Lewis numbers, we found a stability threshold corresponding to oscillatory perturbations with square symmetry.

For a strong negative Soret coupling, the bifurcation branch of rolls is no longer connected to the ground state, going backward without increasing much, passing a saddle point, and increasing as it turns forward again. The bifurcation branch of traveling waves is still connected to the ground state, bifurcating out of it backward and passing three saddle points. This gives rise to the possible coexistence of slow- and fast-traveling waves. From the last saddle point, the TW branch bifurcates forward and finally merges with the roll branch. Whereas the structure of the temperature field of TW patterns is rather similar to the temperature field of rolls, the streamlines and concentration field show significant differences.
The behavior of the streamlines can be understood by a simple single-mode model which already explains the occurrence of closed streamlines as soon as the velocity amplitude exceeds the phase velocity. The behavior of the concentration field is strongly coupled to the behavior of the streamlines due to the smallness of the Lewis number. Another difference between TW and roll states is the existence of nonvanishing lateral heat and concentration currents in the former. However, due to the fact that the linearity of Darcy’s law prevents the appearance of a mean flow, the total averages of both currents vanish, too.
References

Augustin M, Umla R, Huke B, Lücke M (2010) Stationary and oscillatory convection of binary fluids in a porous medium. Phys Rev E 82:056303
Barten W, Lücke M, Hort W, Kamps M (1989) Fully developed travelling-wave convection in binary fluid mixtures. Phys Rev Lett 63:376–379
Barten W, Lücke M, Kamps M, Schmitz R (1995) Convection in binary fluid mixtures. I. Extended traveling-wave and stationary states. Phys Rev E 51:5636–5661
Brand H, Steinberg V (1983) Nonlinear effects in the convective instability of a binary mixture in a porous medium near threshold. Phys Lett A 93:333–336
Charrier-Mojtabi M-C, Elhajjar B, Mojtabi A (2007) Analytical and numerical stability analysis of Soret-driven convection in a horizontal porous layer. Phys Fluids 19:124104
De La Torre Juárez M, Busse FH (1995) Stability of two-dimensional convection in a fluid-saturated porous medium. J Fluid Mech 292:305–323
Elhajjar B, Charrier-Mojtabi M-C, Mojtabi A (2008) Separation of a binary fluid mixture in a porous horizontal cavity. Phys Rev E 77:026310
Hollinger S, Lücke M (1998) Influence of the Soret effect on convection of binary fluids. Phys Rev E 57:4238–4249
Hollinger S, Büchel P, Lücke M (1997) Bistability of slow and fast traveling waves in fluid mixtures. Phys Rev Lett 78:235–238
Horton C, Rogers F (1945) Convection currents in a porous medium. J Appl Phys 16:367–370
Howle LE, Behringer RP, Georgiadis JG (1997) Convection and flow in porous media. Part 2. Visualization by shadowgraph. J Fluid Mech 332:247–262
Huke B, Lücke M, Büchel P, Jung C (2000) Stability boundaries of roll and square convection in binary fluid mixtures with positive separation ratio. J Fluid Mech 408:121–147
Huke B, Pleiner H, Lücke M (2007) Convection patterns in colloidal solutions. Phys Rev E 75:036203
Jung D (2005) Ausgedehnte und lokalisierte Konvektion in zweikomponentigen Flüssigkeiten: Wellen, Fronten und Pulse. PhD thesis, Universität des Saarlandes
Knobloch E (1986) Oscillatory convection in binary mixtures. Phys Rev A 34:1538–1549
Lapwood E (1948) Convection of a fluid in a porous medium. Proc Camb Philos Soc 44:508–521
Linz SJ, Lücke M, Müller H, Niederländer J (1988) Convection in binary fluid mixtures: traveling waves and lateral currents. Phys Rev A 38:5727–5741
Lücke M, Barten W, Kamps M (1992) Convection in binary mixtures: the role of the concentration field. Physica D 61:183–196
Müller HW, Lücke M (1988) Competition between roll and square convection patterns in binary mixtures. Phys Rev A 38:2965–2974
Nield DA, Bejan A (eds) (2006) Convection in porous media. Springer, New York
O’Sullivan MJ, Pruess K, Lippmann MJ (2001) State of the art of geothermal reservoir simulation. Geothermics 30:395–429
Platten JK (2006) The Soret effect: a review of recent experimental results. J Appl Mech 73:5–15
Pruess K (1990) Modeling of geothermal reservoirs: fundamental processes, computer simulation and field applications. Geothermics 19:3–15
Saravanan S, Kandaswamy P (2003) Convection currents in a porous layer with a gravity gradient. Heat Mass Trans 39:693–699
Shattuck MD, Behringer RP, Johnson GA, Georgiadis JG (1997) Convection and flow in porous media. Part 1. Visualization by magnetic resonance imaging. J Fluid Mech 332:215–245
Sovran O, Charrier-Mojtabi M-C, Mojtabi A (2001) Naissance de la convection thermo-solutale en couche poreuse infinie avec effet Soret: onset on Soret-driven convection in an infinite porous layer. C R Acad Sci Ser IIb Mec 329:287–293
Straus JM (1974) Large amplitude convection in porous media. J Fluid Mech 64:51–63
Thomas O (2007) Reservoir analysis based on compositional gradients. PhD thesis, Stanford University
Umla R, Augustin M, Huke B, Lücke M (2010) Roll convection of binary fluid mixtures in porous media. J Fluid Mech 649:165–186
Umla R, Augustin M, Huke B, Lücke M (2011) Three-dimensional convection of binary mixtures in porous media. Phys Rev E 84:056326
Vadasz P, Olek S (1999) Weak turbulence and chaos for low Prandtl number gravity driven convection in porous media. Transp Porous Media 37:69–91
Vadasz P, Olek S (2000) Route to chaos for moderate Prandtl number convection in a porous layer heated from below. Transp Porous Media 41:211–239
Vafai K (ed) (2005) Handbook of porous media. Springer, New York
Veronis G (1966) Large-amplitude Bénard convection. J Fluid Mech 26:49–68
Numerical Dynamo Simulations: From Basic Concepts to Realistic Models
Johannes Wicht, Stephan Stellmach, and Helmut Harder

Contents
1 Introduction
2 Mathematical Formulation
2.1 Basic Equations
2.2 Boundary Conditions
2.3 Numerical Methods
3 Numerical Dynamo Solutions
3.1 Force Balances
3.2 Dynamo Regimes
3.3 Scaling Laws
3.4 Double Diffusive Approach
3.5 Dynamo Mechanism
3.6 Is There a Distinct Low Ekman Number Regime?
4 Comparison with the Geomagnetic Field
5 Conclusion
References
Abstract
Recent years have witnessed an impressive growth in the number and quality of numerical dynamo simulations. The numerical models successfully describe many aspects of the geomagnetic field and also set out to explain the various fields of other planets. The success is somewhat surprising since numerical limitations force dynamo modelers to run their models at unrealistic parameters. In particular the Ekman number, a measure for the relative importance of viscous to Coriolis forces, is many orders of magnitude too large: Earth’s Ekman number is $E = 10^{-15}$, while even today’s most advanced numerical simulations have to content themselves with $E = 10^{-6}$. After giving a brief introduction to the basics of modern dynamo simulations, we discuss the fundamental force balances and address the question of how well the modern models reproduce the geomagnetic field. First-level properties like the dipole dominance, realistic Elsasser and magnetic Reynolds numbers, and an Earth-like reversal behavior are already captured by larger Ekman number simulations around $E = 10^{-3}$. However, low Ekman numbers are required for modeling torsional oscillations, which are thought to be an important part of the decadal geomagnetic field variations. Moreover, only low Ekman number models seem to retain the huge dipole dominance of the geomagnetic field once the Rayleigh number has been increased to values where field reversals happen. These cases also seem to resemble the low-latitude field found at Earth’s core-mantle boundary more closely than larger Ekman number cases.

J. Wicht
Max-Planck-Institut für Sonnensystemforschung, Katlenburg-Lindau, Germany

S. Stellmach, H. Harder
Institut für Geophysik, Westfälische Wilhelms-Universität, Münster, Germany

© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_16
1 Introduction
Joseph Larmor was the first to suggest that the magnetic fields of Earth and Sun are maintained by a dynamo mechanism. The underlying physical principle can be illustrated with the homopolar or disc dynamo shown in Fig. 1. A metal disc rotates with angular speed $\omega$ in an initial magnetic field $\mathbf{B}_0$. The motion induces an electromotive force that drives an electric current flowing from the rotation axis to the outer rim of the disc. Sliding contacts close an electric circuit through a coil which produces a magnetic field $\mathbf{B}$. This may subsequently replace $\mathbf{B}_0$ provided the rotation rate exceeds a critical value $\omega_c$ that allows the dynamo to overcome the inherent ohmic losses. The dynamo is then called self-excited. Lorentz forces associated with the induced current brake the rotation and limit the magnetic field strength to a value that depends on the torque maintaining the rotation (Roberts 2007).

Fig. 1 The homopolar or disc dynamo illustrates the dynamo process (labels: $\omega$, $\mathbf{B}_0$, metal disc, rotation axis, coil, sliding contacts)

The disc dynamo, like any electrical generator, works because of the suitable arrangement of the electrical conductors. In Earth, however, the dynamo process operates in the liquid outer part of Earth’s iron core, a homogeneously conducting spherical shell. For these homogeneous dynamos to function, the motion of the liquid dynamo medium and the magnetic field itself must possess a certain complexity. Unraveling the required complexity kept dynamo theoreticians busy for some decades and is still a matter of research. Cowling’s theorem, for example, states that a perfectly axisymmetric magnetic field cannot be maintained by a dynamo process (Cowling 1957; Ivers and James 1984). The same is true for a purely toroidal field (Kaiser et al. 1994), a field whose field lines never leave the dynamo region. Whether or not certain flow configurations would provide dynamo action was often explored in a kinematic dynamo setup where the flow field is prescribed and steady in time. The magnetic field generation is formulated by the induction equation which can be derived from the pre-Maxwell equations (Roberts 2007):

$$ \frac{\partial \mathbf{B}}{\partial t} = \mathrm{Rm} \, \nabla \times (\mathbf{U} \times \mathbf{B}) + \nabla^2 \mathbf{B} . \tag{1} $$
We have chosen a nondimensional form here that uses a length scale $\ell$ (for example, the shell thickness) and the magnetic diffusion time $\tau_\lambda = \ell^2 / \lambda$ as the time scale. Here, $\lambda = 1/(\mu \sigma)$ is the magnetic diffusivity with $\mu$ the magnetic permeability (of vacuum) and $\sigma$ the electrical conductivity. Rm is the magnetic Reynolds number

$$ \mathrm{Rm} = \frac{u \ell}{\lambda} \tag{2} $$
that depends on the typical flow amplitude $u$. It measures the ratio of magnetic field production to diffusion via ohmic decay and has to exceed a certain critical value $\mathrm{Rm}_c$ for dynamo action to arise. The induction equation (1) establishes an eigenvalue problem with eigensolutions that depend exponentially on time, $\mathbf{B} = \mathbf{B}_0 \exp(s t)$, where $s$ denotes the eigenvalue. The dynamo is said to work when at least one solution has a positive growth rate, i.e., an eigenvalue with a positive real part. Several simple parameterized flows have been found to support dynamo action. Two-dimensional flows that depend on only two Cartesian coordinates are particularly simple examples. The flow

$$ \mathbf{U} = \sin y \, \hat{\mathbf{x}} + \sin x \, \hat{\mathbf{y}} + (\cos x - \cos y) \, \hat{\mathbf{z}} \tag{3} $$
suggested by Roberts (1972) inspired the design of the successful dynamo experiment in Karlsruhe, Germany (Stieglitz and Müller 2001). Here, $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ are the three Cartesian unit vectors. A consistently correlated helicity $H = \mathbf{U} \cdot (\nabla \times \mathbf{U})$ has been shown to be important for providing large-scale dynamo action, and the above flow fulfills this criterion with a helicity that never changes sign.

The convective motions in the dynamo regions of stars and planets are driven by temperature differences. In terrestrial planets with a freezing inner core, compositional differences provide an additional driving force: Only a small fraction of the lighter constituents (sulfur, oxygen, silicon) that are mixed into the liquid iron/nickel phase is built into the solid inner-core matrix. The remaining material is expelled at the growing inner-core front and leads to an unstable compositional density gradient. Because of the highly supercritical Rayleigh numbers in planetary and stellar dynamo regions, the motions are likely turbulent and rather small scale. The feedback of the magnetic field on the flow via the Lorentz force limits the magnetic field growth and thereby determines the magnetic field strength.

Typical state-of-the-art dynamo codes are self-consistent and simultaneously solve for the evolution of magnetic field, flow, pressure, temperature, and possibly composition in a rotating spherical shell. The principal setup is shown in Fig. 2. Early self-consistent models demonstrated the general validity of the concept (Zhang and Busse 1988). The publications by Kageyama and Sato (1995) and Glatzmaier and Roberts (1995), however, are regarded as the first milestones of modern dynamo modeling. Since then, several comparable modern dynamo models have been developed; Christensen and Wicht (2007), Wicht et al. (2009), Wicht and Christensen (2010), and Jones et al. (2011) provide recent overviews. These models are capable of reproducing the strength and large-scale geometry of the geomagnetic field and also help to understand several smaller-scale features. Many aspects of the geomagnetic field variability are also convincingly captured.
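The helicity statement made above for the Roberts flow of Eq. (3) can be checked numerically. The following is a minimal sketch, not part of the original chapter: it assumes the flow in the form $\mathbf{U} = (\sin y, \sin x, \cos x - \cos y)$, evaluates the curl by central differences, and samples $H = \mathbf{U} \cdot (\nabla \times \mathbf{U})$ on one periodic cell. The function names and the grid size are illustrative choices.

```python
import math

def roberts_flow(x, y):
    """Flow of Eq. (3); it depends on x and y only."""
    return (math.sin(y), math.sin(x), math.cos(x) - math.cos(y))

def curl(x, y, h=1e-5):
    """Curl of the flow by central differences (all z-derivatives vanish)."""
    xp, xm = roberts_flow(x + h, y), roberts_flow(x - h, y)
    yp, ym = roberts_flow(x, y + h), roberts_flow(x, y - h)
    duz_dy = (yp[2] - ym[2]) / (2 * h)
    duz_dx = (xp[2] - xm[2]) / (2 * h)
    duy_dx = (xp[1] - xm[1]) / (2 * h)
    dux_dy = (yp[0] - ym[0]) / (2 * h)
    return (duz_dy, -duz_dx, duy_dx - dux_dy)

def helicity(x, y):
    """Helicity H = U . (curl U) at the point (x, y)."""
    u = roberts_flow(x, y)
    w = curl(x, y)
    return sum(a * b for a, b in zip(u, w))

# Sample one periodic cell [0, 2*pi) x [0, 2*pi).
n = 24
points = [2 * math.pi * i / n for i in range(n)]
values = [helicity(x, y) for x in points for y in points]
min_helicity = min(values)
max_helicity = max(values)
print(min_helicity, max_helicity)
```

For this particular flow the curl equals the flow itself, so that $H = |\mathbf{U}|^2 = 2 - 2 \cos x \cos y$, which is non-negative everywhere and vanishes only at the stagnation points.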
Several dynamo models show dipole excursions and reversals very similar to those documented for Earth (Glatzmaier and Roberts 1995; Glatzmaier and Coe 2007; Wicht et al. 2009). Advanced models even show torsional oscillations thought to be an important component of the decadal geomagnetic field variation (Zatman and Bloxham 1997; Wicht et al. 2009). In a more recent development, numerical dynamo simulations have also ventured to explain the distinctively different magnetic fields of other planets in our solar system (Stanley and Bloxham 2004; Stanley et al. 2005; Takahashi and Matsushima 2006; Christensen 2006; Wicht et al. 2007; Stanley and Glatzmaier 2010).

Fig. 2 Principal dynamo setup (labels: rotation axis $\Omega$; tangent cylinder; outer boundary ($r = r_o$): electrically insulating, heat sink; fluid volume: electrically conducting, heat source, compositional sink; inner boundary ($r = r_i$): electrically conducting, heat source, compositional source)

The success of modern dynamo simulations seems surprising since the numerical limitations force dynamo modelers to run their codes at parameters that are far from realistic values. For example, the fluid viscosity is generally many orders of magnitude too large in order to damp the small-scale turbulent structures that cannot be resolved numerically. This raises the question of whether the numerical dynamos really operate in the same regime as planetary dynamos or reproduce the observed properties for the wrong reasons (Glatzmaier 2002). While the numerical simulations certainly do a bad job in modeling the very small-scale flow dynamics, they may nevertheless capture the larger-scale dynamo process correctly. This view is supported by recent analyses which show that the numerical results can be extrapolated to the field strength of several planets (Christensen and Aubert 2006) and even fast-rotating stars (Christensen et al. 2009) with rather simple scaling laws. The quest for reaching more realistic parameters nevertheless remains and is approached with new numerical methods that we will discuss in Sect. 2.3. Other efforts for improving the realism concern, for example, the treatment of thermal and/or compositional convection including appropriate boundary conditions. We will briefly touch on these points in Sect. 3. This chapter concentrates on the question of to which degree recent numerical dynamo simulations succeed in modeling Earth’s magnetic field and which parameter combination helps to reach this goal.
Our good knowledge of the geomagnetic field makes it an ideal test case for numerical models. We furthermore speculate which changes in the solutions can be expected should the increasing computing power allow the models to be run at more realistic parameters in the future. The chapter is organized as follows. The mathematical formulation of the dynamo problem is outlined in Sect. 2.1. In Sect. 2.3, we provide an overview of the diverse numerical methods employed in dynamo simulations. Section 3 reviews the results from recent simulations and addresses the question of how well the solutions describe Earth’s magnetic field. A conclusion in Sect. 5 closes the chapter.
2 Mathematical Formulation

2.1 Basic Equations
Self-consistent dynamo models simultaneously solve for convection and magnetic field generation in a viscous electrically conducting fluid. The convection is driven by temperature differences, and compositional differences also contribute when a freezing inner core is modeled. Convection and magnetic field are treated as small disturbances around a reference state which is typically assumed to be nonmagnetic, hydrostatic, adiabatic, and compositionally homogeneous. Most dynamo simulations of terrestrial planets neglect density and temperature variations of the reference state in the so-called Boussinesq approximation. In Earth’s outer core, both quantities increase by about 25 %, and similar values can be expected for the other terrestrial planets. These values indicate that the Boussinesq approximation may at least serve as a first approximation that offers considerable simplifications of the problem: Viscous heating and ohmic heating drop out, and only the density variations due to temperature and composition fluctuations in the buoyancy term are retained (Braginsky and Roberts 1995). The following equations comprise the mathematical formulation of the dynamo problem in the Boussinesq approximation: the Navier-Stokes equation

$$ \frac{d\mathbf{U}}{dt} = -\nabla P - 2 \hat{\mathbf{z}} \times \mathbf{U} + \mathrm{Ra}^\star \frac{r}{r_o} C \, \hat{\mathbf{r}} + (\nabla \times \mathbf{B}) \times \mathbf{B} + E \nabla^2 \mathbf{U} , \tag{4} $$

the induction equation

$$ \frac{\partial \mathbf{B}}{\partial t} = \nabla \times (\mathbf{U} \times \mathbf{B}) + \frac{E}{\mathrm{Pm}} \nabla^2 \mathbf{B} , \tag{5} $$

the codensity evolution equation

$$ \frac{dC}{dt} = \frac{E}{\mathrm{Pr}} \nabla^2 C + q , \tag{6} $$

the flow continuity equation

$$ \nabla \cdot \mathbf{U} = 0 , \tag{7} $$

and the magnetic continuity equation

$$ \nabla \cdot \mathbf{B} = 0 . \tag{8} $$

Here, $d/dt$ stands for the substantial time derivative $\partial / \partial t + \mathbf{U} \cdot \nabla$, $\mathbf{U}$ is the convective flow, $\mathbf{B}$ is the magnetic field, $P$ is a modified pressure that also contains centrifugal effects, and $C$ is the codensity explained below. The equations are given in a nondimensional form that uses the shell thickness $\ell = r_o - r_i$ as a length scale, the rotation period $\Omega^{-1}$ as a time scale, the codensity difference $\Delta c$ across the shell as the codensity scale, and $(\bar{\rho} \mu)^{1/2} \Omega \ell$ as the magnetic scale. Here, $\bar{\rho}$ is the reference state density. The radii of inner and outer boundary are denoted by $r_i$ and $r_o$, respectively.
The model is controlled by five dimensionless parameters: the Ekman number

$$ E = \frac{\nu}{\Omega \ell^2} , \tag{9} $$

the modified Rayleigh number

$$ \mathrm{Ra}^\star = \frac{\bar{g}_o \Delta c}{\Omega^2 \ell} , \tag{10} $$

the Prandtl number

$$ \mathrm{Pr} = \frac{\nu}{\kappa} , \tag{11} $$

the magnetic Prandtl number

$$ \mathrm{Pm} = \frac{\nu}{\lambda} , \tag{12} $$

and the aspect ratio

$$ \eta = r_i / r_o . \tag{13} $$

The modified Rayleigh number $\mathrm{Ra}^\star$ is connected to the more classical Rayleigh number

$$ \mathrm{Ra} = \frac{\bar{g}_o \Delta c \, \ell^3}{\kappa \nu} \tag{14} $$

via $\mathrm{Ra}^\star = \mathrm{Ra} \, E^2 \, \mathrm{Pr}^{-1}$. The five dimensionless parameters combine eight physical properties of which the kinematic viscosity $\nu$, the thermal and/or compositional diffusivity $\kappa$, and the outer boundary reference gravity $\bar{g}_o$ have not been introduced so far.

The above equations employ a simplification that has been adopted by several dynamo models for terrestrial planets: The density variations due to the superadiabatic temperature (only this component contributes to convection) and due to deviations $\varsigma$ from the homogeneous reference state composition $\bar{\varsigma}$ are combined in the codensity $c$:

$$ c = \alpha T + \gamma \varsigma , \tag{15} $$

where $\alpha$ and $\gamma$ are the thermal and compositional expansivity, respectively. For a simplified binary model with heavy constituents (iron, nickel) of total mass $m_H$ and light constituents (sulfur, oxygen, carbon) of total mass $m_L$, the reference state composition is $\bar{\varsigma} = m_L / (m_L + m_H)$. The compositional expansivity is then given by $\gamma = \bar{\rho} (\rho_H - \rho_L) / (\rho_H \rho_L)$, where $\rho_H$ and $\rho_L$ are the mean densities of heavy and light elements in the liquid core, respectively. In Eq. (15) $c$, $T$, and $\varsigma$ refer to dimensional values.

Describing the evolution of temperature and composition by the combined Eq. (6) for the dimensionless codensity $C = c / \Delta c$ assumes that both quantities have similar diffusivities. This seems like a daunting simplification since the chemical diffusivity may be three orders of magnitude smaller than its thermal counterpart (Braginsky and Roberts 1995). However, the approach is often justified with the argument that the small-scale turbulent mixing, which cannot be resolved by the numerical codes, may result in larger effective turbulent diffusivities that are of comparable magnitude (Braginsky and Roberts 1995). This has the additional consequence that the “turbulent” Prandtl number and magnetic Prandtl number would become of order one (Braginsky and Roberts 1995). Some studies suggest that the differences in diffusivities may have interesting implications, and we will briefly discuss this issue in Sect. 3.4.

For numerical and theoretical purposes, it is convenient to represent flow field and magnetic field by a poloidal and a toroidal contribution, respectively:

$$ \mathbf{U} = \nabla \times \nabla \times (\hat{\mathbf{r}} v) + \nabla \times (\hat{\mathbf{r}} w) , \qquad \mathbf{B} = \nabla \times \nabla \times (\hat{\mathbf{r}} g) + \nabla \times (\hat{\mathbf{r}} h) . \tag{16} $$
v and g are the poloidal scalar fields, w and h are the toroidal counterparts. Toroidal fields have no radial components and are therefore restricted to spherical shells. The toroidal magnetic field of a planet never leaves the dynamo region and cannot be measured on the planet’s surface. When using the ansatz (16), the continuity equations for flow (7) and magnetic field (8) are fulfilled automatically, and the number of unknown fields reduces from eight – the three components of U and of B, pressure P , and codensity C – to six. The equations solved for the six unknown fields are typically the radial components of the Navier-Stokes and the induction equation (4) and (5), respectively, the radial components of the curl of these two equations, the horizontal divergence of the Navier-Stokes equation, and the codensity evolution equation (6) (Christensen and Wicht 2007).
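The control parameters of Eqs. (9)–(14) and the relation between the two Rayleigh numbers lend themselves to a short numerical check. The sketch below is an illustration only: the function names are hypothetical, and the property values are arbitrary placeholders chosen for easy arithmetic, not Earth values and not taken from this chapter.

```python
import math

def ekman(nu, omega, ell):
    """Ekman number E = nu / (Omega * ell^2), Eq. (9)."""
    return nu / (omega * ell**2)

def modified_rayleigh(g_o, delta_c, omega, ell):
    """Modified Rayleigh number Ra* = g_o * dc / (Omega^2 * ell), Eq. (10)."""
    return g_o * delta_c / (omega**2 * ell)

def prandtl(nu, kappa):
    """Prandtl number Pr = nu / kappa, Eq. (11)."""
    return nu / kappa

def magnetic_prandtl(nu, lam):
    """Magnetic Prandtl number Pm = nu / lambda, Eq. (12)."""
    return nu / lam

def aspect_ratio(r_i, r_o):
    """Aspect ratio r_i / r_o, Eq. (13)."""
    return r_i / r_o

def rayleigh(g_o, delta_c, ell, kappa, nu):
    """Classical Rayleigh number Ra = g_o * dc * ell^3 / (kappa * nu), Eq. (14)."""
    return g_o * delta_c * ell**3 / (kappa * nu)

# Placeholder values in arbitrary consistent units (assumptions for illustration).
nu, kappa, lam = 1.0e-2, 1.0e-2, 2.0e-3   # viscosity and the two diffusivities
omega = 1.0                                # rotation rate
g_o, delta_c = 10.0, 1.0e-3                # outer-boundary gravity, codensity contrast
r_i, r_o = 0.35, 1.35                      # inner and outer radius
ell = r_o - r_i                            # shell thickness used as length scale

E = ekman(nu, omega, ell)
Ra_star = modified_rayleigh(g_o, delta_c, omega, ell)
Pr = prandtl(nu, kappa)
Pm = magnetic_prandtl(nu, lam)
eta = aspect_ratio(r_i, r_o)
Ra = rayleigh(g_o, delta_c, ell, kappa, nu)

# The two Rayleigh numbers are linked by Ra* = Ra E^2 / Pr.
relation_holds = math.isclose(Ra_star, Ra * E**2 / Pr, rel_tol=1e-12)
print(E, Ra_star, Pr, Pm, eta, Ra, relation_holds)
```

Writing the definitions as separate functions makes the cancellation behind $\mathrm{Ra}^\star = \mathrm{Ra} \, E^2 \, \mathrm{Pr}^{-1}$ explicit: the diffusivities $\kappa$ and $\nu$ drop out, leaving only gravity, rotation rate, and shell thickness.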
2.2 Boundary Conditions
The differential equations for the dynamo problem must be supplemented with appropriate boundary conditions. For the flow $\mathbf{U}$, either rigid (no-slip) or free slip boundary conditions are used. In both cases, the radial flow component is forced to vanish at both boundaries. The horizontal flow components must match the rotation of the boundary for rigid boundary conditions. In the free slip case, the horizontal components of the viscous stresses are forced to zero instead (Christensen and Wicht 2007). While no-slip flow conditions seem most appropriate for the dynamo regions of terrestrial planets, some authors nevertheless use free slip boundaries to avoid the thicker Ekman boundary layers that develop at the unrealistically large Ekman numbers used in the numerical simulations (Kuang and Bloxham 1997; Busse and Simitev 2005b).
Most dynamo models assume a purely thermal driving and either employ fixed temperature or fixed heat flux boundary conditions, mostly for convenience. The latter translates to a fixed radial temperature gradient and requires a modification of the Rayleigh number (10) where $\Delta c$ now stands for the imposed gradient times the length scale $\ell$. For terrestrial planets, the much slower evolving mantle controls how much heat is allowed to leave the core, so that a heat flux condition is appropriate. Lateral variations in the thermal structure of the lower mantle will translate into an inhomogeneous core-mantle boundary heat flux (Aubert et al. 2008a) (see section “Persistent Features and Mantle Influence”). Since the light core elements cannot penetrate the mantle at any significant rate, a vanishing flux is the boundary condition of choice for the compositional component (Kutzner and Christensen 2000).

The conditions at the boundary to a growing inner core are somewhat more involved. The heat flux originating from the latent heat of the phase transition is proportional to the mass undergoing the phase transition and thus to the compositional flux of light elements. The flux itself depends on the local cooling rate $dT/dt$, which is determined by the convective dynamics and changes with time (Braginsky and Roberts 1995). The resulting conditions have so far only been implemented by Glatzmaier and Roberts (1996), while all other models use either a fixed codensity or codensity gradient at $r = r_i$. The possible consequences of thermal boundary conditions are further discussed in section “The Impact of Lorentz Forces and Buoyancy Boundary Conditions on the Flow Scales in Numerical Simulations.”

Since the conductivity of the rocky mantle in terrestrial planets is orders of magnitude lower than that of the core, the magnetic field can be assumed to match a potential field at the interface $r = r_o$.
This matching condition can be formulated as a magnetic boundary condition for the individual spherical harmonic field contributions (Christensen and Wicht 2007). The same applies at the boundary to an electrically insulating inner core assumed in some dynamo simulations for simplicity. A simplified induction equation (5) must be solved for the magnetic field in a conducting inner core, which has to match the outer core field at $r_i$. We refer to Roberts (2007) and Christensen and Wicht (2007) for a more detailed discussion of the magnetic boundary conditions.
2.3 Numerical Methods
Up to now the vast majority of dynamo simulations rely on pseudo-spectral codes. A spectral approach had first been applied to the dynamo problem by Bullard and Gellman (1954) and gained more widespread popularity with the pioneering work of Glatzmaier (1984). Since then the method has been optimized, and several variants are in use today. Here, we can only sketch the basic ideas of the approach and refer for more details to the original work (Glatzmaier 1984) or to the recent review by Christensen and Wicht (2007).

The core of the pseudo-spectral codes is the dual representation of the variables in grid space as well as in spectral space. The individual computational steps are performed in whichever representation they are most efficient. Partial derivatives are usually calculated in spectral space, whereas the nonlinear terms are calculated in grid space. In the lateral directions (colatitude $\Theta$ and longitude $\Phi$), spherical harmonics $Y_{lm}$ of degree $l$ and order $m$ are the obvious choice as base functions of the spectral expansion. Since spherical harmonics are eigenfunctions of the lateral Laplace operator, the calculation of the diffusion terms in the governing equations is particularly simple. Mostly Chebychev polynomials $C_n$ are chosen as radial representations, where $n$ is the degree of the polynomial. Chebychev polynomials allow for fast Fourier transforms and offer the additional advantage of an increased grid resolution near the inner and outer boundaries, where thin boundary layers may develop.

Time stepping is performed in spherical harmonic-radial space $(l, m, r)$ using a mixed implicit/explicit scheme that handles the nonlinear terms as well as the Coriolis force in, for example, an explicit Adams-Bashforth manner. The resulting scheme decouples all spherical harmonic modes $(l, m)$, which allows a fast calculation. A Crank-Nicolson implicit scheme is often used to complete the time stepping for the remaining terms. It is also possible to include the Coriolis force in the implicit discretization, but this would destroy the decoupling in the latitudinal index $l$.

The nonlinear terms are always calculated in local grid space. This requires a transform from a spectral $(l, m, r)$ to a grid $(\Theta, \Phi, r)$ representation and back. Fast Fourier transforms are applicable in radius and longitude; however, in colatitude $\Theta$, the best way to calculate the Legendre transform is by Gauss integration. These Gauss-Legendre transforms are usually the most time-consuming parts of the spectral transform method and evolve into a severe bottleneck in high-resolution cases.
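The Legendre transform by Gauss integration mentioned above can be illustrated with a small sketch. It is restricted, as a simplifying assumption, to axisymmetric fields (order $m = 0$), so that only Legendre polynomials $P_l(\cos \Theta)$ are needed. Production codes use associated Legendre functions for every order $m$ and optimized libraries. All function names here are hypothetical.

```python
import math

def legendre(l, x):
    """Return P_l(x) and its derivative using the three-term recurrence."""
    if l == 0:
        return 1.0, 0.0
    p_prev, p = 1.0, x
    for k in range(2, l + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    dp = l * (x * p - p_prev) / (x * x - 1.0)   # valid for |x| < 1
    return p, dp

def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess
        for _ in range(100):                            # Newton iteration
            p, dp = legendre(n, x)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        _, dp = legendre(n, x)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def legendre_analysis(f_grid, nodes, weights, l_max):
    """Grid -> spectral: a_l = (2l+1)/2 * sum_j w_j f(x_j) P_l(x_j)."""
    coeffs = []
    for l in range(l_max + 1):
        s = sum(w * f * legendre(l, x)[0]
                for x, w, f in zip(nodes, weights, f_grid))
        coeffs.append(0.5 * (2 * l + 1) * s)
    return coeffs

def legendre_synthesis(coeffs, nodes):
    """Spectral -> grid: f(x_j) = sum_l a_l P_l(x_j)."""
    return [sum(a * legendre(l, x)[0] for l, a in enumerate(coeffs))
            for x in nodes]

# Round trip: spectral coefficients -> grid values -> spectral coefficients.
coeffs_in = [1.5, 0.0, -2.0, 0.0, 0.0, 0.5]    # arbitrary test spectrum, l_max = 5
nodes, weights = gauss_legendre(8)              # x_j = cos(Theta_j)
f_grid = legendre_synthesis(coeffs_in, nodes)
coeffs_out = legendre_analysis(f_grid, nodes, weights, l_max=5)
print(coeffs_out)
```

An $n$-point Gauss rule integrates polynomials up to degree $2n - 1$ exactly, so eight nodes suffice for the products of degree at most ten that occur here. Each transform costs a sum over all nodes for every degree $l$, which indicates why this step becomes expensive at high resolution.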
As stated above, the continuity equations $\nabla \cdot \mathbf{U} = 0$ and $\nabla \cdot \mathbf{B} = 0$ are fulfilled identically by the introduction of poloidal and toroidal scalar fields (16), which is a very convenient feature of the method. Higher-order lateral derivatives of the fields (6th order) are required, but this creates no serious problem since the lateral Laplace operator is easily and accurately determined in spectral space. More problematic is the implementation of parallel computing. Since derivatives are calculated in spectral space and the spectral base functions (spherical harmonics and Chebychev polynomials) are globally defined, global inter-processor communication is needed. In fact, derivatives are determined in spectral space by recurrence relations (Glatzmaier 1984) which can be regarded as spectral next-neighbor calculations. However, in order to calculate a single function value at a given point in grid space, the complete spectrum must be available for the particular process. Therefore the calculation of the nonlinear terms during one time step requires the exchange of the complete spectral dataset between the parallel processes, i.e., global inter-processor communication. This is a serious obstacle for the efficient use of parallel computing with spectral methods. If only a few parallel processes are required, the solution is a one-dimensional partition of the grid space, for example, in radial or longitudinal direction. This is particularly easy if a shared memory machine is used. For massively parallel computing using several thousands of processes, the solution is
much more cumbersome. However, as demonstrated by Clune et al. (1999), efficient parallel computing is also possible for spectral methods, although the implementation is rather complex. Some approaches replace the Chebychev polynomials in radius by a finite difference approximation (Dormy et al. 1998; Kuang and Bloxham 1999). This adds some flexibility in the radial discretization at the expense of accuracy. For these reasons, the interest in applying local methods to dynamo problems has increased considerably in recent years. Kageyama et al. (1993) and Kageyama and Sato (1997) are early examples of a compressible finite difference approach. Local approaches, like finite differences, finite volumes, or finite elements, are easier to apply to a compressible flow since in this case the pressure is determined more directly by an equation of state. However, since an incompressible flow is a good approximation for the liquid iron cores of the terrestrial planets, it does not seem appropriate to complicate the physics of the problem further by compressibility. In sharp contrast to spectral methods, poloidal and toroidal potentials (16) are not popular in three-dimensional flow simulations with local methods, because they would require 6th-order derivatives. Instead it is common practice to retain the primitive variables velocity and pressure. In recent years, several local primitive variable approaches have been developed for dynamo simulations. A common feature is that the continuity equation is satisfied iteratively by solving a Poisson-type equation for the pressure or pressure correction, although the detailed strategies differ. Usually this correction iteration is by far the most computationally intensive part of local approaches. In other aspects, for example, grid structure, the applied methods are extremely diverse. For example, Kageyama and Yoshida (2005) use a yin-yang grid approach with two overlapping longitude-latitude grids.
For this finite difference method, the authors report simulations with up to nearly $10^9$ grid points and up to 4,096 parallel processes on the Earth Simulator System. Matsui and Buffett (2005) apply a finite element approach with hexahedral elements based on the parallel GeoFEM thermal-hydraulic subsystem developed for the Japanese Earth Simulator project. Harder and Hansen (2005) use a collocated finite volume method on a cube-sphere grid, i.e., a grid which is projected from the six surfaces of an inscribed cube to the unit sphere. Hejda and Reshetnyak (2003, 2004) also utilize a finite volume method but with a staggered arrangement of variables and a single longitude-latitude grid. Chan et al. (2006) describe both a finite element approach based on an icosahedral triangulation of a spherical surface and a finite difference approach based on a staggered longitude-latitude grid. The method of Fournier et al. (2005) uses a mixture of spectral and local methods: In azimuthal direction, a Fourier expansion is used, whereas in the meridional plane, a spectral element approximation is applied, based on macroelements with polynomials of high order. Wicht et al. (2009) have compiled published benchmark data of the various local methods and compared them with the dynamo benchmark of Christensen et al. (2001), which was established solely by spectral methods. Since the benchmark cases were defined at a very moderate Ekman number of $E = 10^{-3}$, the solutions are
very smooth and can be approximated with high accuracy by a few eigenfunctions. In this regime fully spectral methods are much more efficient than local methods. Therefore it is not surprising that of the mentioned local methods, only the spectral element approach by Fournier et al. (2005) could compete with the accuracy of fully spectral methods. Presumably the situation would be less clear at lower Ekman numbers, where solutions are less diffusion dominated and the solution spectra are flatter. As already stated, the motivation for the application of local approaches to the dynamo problem is the attempt to better utilize massively parallel computation. Since simulations at low Ekman number, $E < 10^{-5}$, are desirable for a better representation of the force balance in the Earth’s core, and such simulations need as much computational power as possible, the approach seems natural. However, although local methods are better suited for parallel computing than spectral methods, local approaches have some other disadvantages which may degrade the performance. For example, in local methods it is much more expensive to fulfill the continuity equations. Another obstacle is the implementation of the magnetic boundary condition: Usually an insulating exterior is assumed in dynamo simulations. However, this cannot be implemented as a local boundary condition. The easiest way is to match a series expansion of spherical harmonics at the boundary, which is easily done with a spectral method. This is also possible with a local approach but requires a transform of the solution at the boundary to spectral space. The amount of extra work depends mainly on the chosen grid structure. An alternative would be to extend the computational domain and also solve for the external field. A more elegant way is to calculate the external field by a boundary element technique (Isakov et al. 2004).
The simplest, but somewhat nonphysical approach would be to replace the insulating condition by a local approximation, i.e., apply the so-called quasi-vacuum condition which corresponds to vanishing tangential field components at the boundary (Kageyama and Sato 1997; Harder and Hansen 2005). Up to now, it is not clear which is the most suitable alternative. Due to these limitations, local methods are often still outperformed by spectral approaches even on thousands of processors. Spectral codes therefore remain the method of choice for most dynamo simulations and have exclusively been used for the examples presented in this chapter. The situation may change in the future when more demanding simulations at low Ekman number require even higher processor numbers, or for more specialized cases (lateral variations of material parameters, topographic distortion of the boundaries, etc.) not amenable to spectral methods.
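The cube-sphere grid mentioned above for the finite volume method of Harder and Hansen (2005) is described as a grid projected from the six surfaces of an inscribed cube to the unit sphere. The following sketch shows one simple way to realize such a projection for a single face. It assumes an equidistant grid on the cube face and a purely radial projection; the function name, its arguments, and the choice of grid spacing are illustrative assumptions and not a description of the cited implementation.

```python
import math


def cube_face_to_sphere(n, face_axis=2, sign=1.0):
    """Project an (n+1) x (n+1) grid on one cube face onto the unit sphere.

    face_axis selects the coordinate that is held fixed at +1 or -1 (sign)
    on the cube face; the other two coordinates run from -1 to 1.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    points = []
    for i in range(n + 1):
        for j in range(n + 1):
            a = -1.0 + 2.0 * i / n
            b = -1.0 + 2.0 * j / n
            p = [a, b]
            p.insert(face_axis, sign)                    # point on the cube face
            norm = math.sqrt(sum(c * c for c in p))
            points.append(tuple(c / norm for c in p))    # radial projection
    return points


if __name__ == "__main__":
    for point in cube_face_to_sphere(2):
        print(point)
```

Repeating the call for the three axes and both signs covers the whole sphere with six patches that avoid the pole singularity of a single longitude-latitude grid.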
3
Numerical Dynamo Solutions
3.1
Force Balances
The fast rotation rate of planets guarantees that the Coriolis force enters the leading order force balance and plays a major role in the dynamo dynamics. Force balances are thus typically discussed in terms of dimensionless numbers that measure the
ratio of a specific force to the Coriolis force. The Ekman number (9), for example, provides a measure for the ratio of viscous to Coriolis forces. Two additional nondimensional numbers quantify the relative importance of inertial effects and magnetic forces, respectively: the Rossby number

$$\mathrm{Ro} = \frac{u}{\Omega \ell} \tag{17}$$

and the Elsasser number

$$\Lambda = \frac{b^2}{\mu \bar{\rho} \lambda \Omega}. \tag{18}$$

Here, $u$ and $b$ refer to typical dimensional flow and magnetic field amplitudes. The Rossby number Ro is identical to the nondimensional flow amplitude in the scaling chosen here. The magnetic Reynolds number $\mathrm{Rm} = u \ell / \lambda$ has already been introduced in Sect. 1 as an important measure for the relative importance of magnetic induction to diffusion. Another useful measure is the Alfvén Mach number

$$M_A = \frac{u (\mu \bar{\rho})^{1/2}}{b}, \tag{19}$$

the ratio of the flow velocity to the typical velocity of Alfvén waves. Note that Ro, Rm, $\Lambda$, and $M_A$ are not input parameters for the dynamo simulations but output or diagnostic “parameters” that characterize the solution, namely, the typical (rms) flow amplitude $u$ and the typical (rms) magnetic field amplitude $b$. Both the Rossby number and the magnetic Reynolds number are related to the flow amplitude: $\mathrm{Ro} = \mathrm{Rm}\, E / \mathrm{Pm}$. The Alfvén Mach number can be written in terms of the magnetic Reynolds number and the Elsasser number: $M_A = \mathrm{Rm}\, E^{1/2} / (\Lambda\, \mathrm{Pm})^{1/2}$. Christensen and Aubert (2006) note that the Rossby number may not provide a good estimate for the impact of the nonlinear inertial or advection term in the Navier-Stokes equation (4). The nonlinearity can give rise to cascades that interchange energy between flows of different length scales. The turbulent transport of energy to increasingly smaller scales in the Kolmogorov cascade is only one example. Another coupling between different length scales is provided by the so-called Reynolds stresses, which result from a statistically persistent correlation between different small-scale flow components. The fierce zonal winds observed on Jupiter and Saturn are a prominent example of their potential power (Christensen 2002; Heimpel et al. 2005; Gastine and Wicht 2012) and demonstrate how small-scale contributions can effectively feed large-scale flows. Christensen and Aubert (2006) suggest incorporating the length scale introduced by the $\nabla$ operator into a refined estimate for the relative importance of the nonlinear inertial forces, which they call the local Rossby number
$$\mathrm{Ro}_\ell = \frac{u}{\Omega \ell_u}. \tag{20}$$

The length scale $\ell_u$ represents a weighted average based on the spherical harmonic decomposition of the flow:

$$\ell_u = \ell \pi / \langle l \rangle \tag{21}$$

with

$$\langle l \rangle = \frac{\sum_l l\, E_k(l)}{\sum_l E_k(l)}, \tag{22}$$

where $E_k(l)$ is the kinetic energy carried by all modes with spherical harmonic degree $l$.

Table 1 List of parameters and properties for some of the dynamo simulations presented here and for Earth. For model E6, not all the values have been stored during the calculations. Models BM II and E6 have a fourfold azimuthal symmetry so that the dipole tilt is always zero. While this reflects the true solution of the BM II case, the symmetry has been imposed in model E6 to save computing time. All models employ rigid flow and fixed temperature boundary conditions. The exception is model E4R106, which mimics purely compositional convection by using a zero flux condition at the outer boundary and a fixed composition condition at the inner boundary. The Prandtl number is unity in all cases and the aspect ratio is fixed to an Earth-like value of 0.35. Some of the cases have been studied elsewhere (Christensen et al. 2001; Wicht et al. 2009; Wicht and Christensen 2010; Wicht and Tilgner 2010). The values provided in these previous publications may differ because of modified definitions. The model names first indicate the negative power of the Ekman number and then the supercriticality $\mathrm{Ra}/\mathrm{Ra}_c$. Earth’s core Rayleigh number is hard to determine but thought to be large (Gubbins 2001); Earth’s local Rossby number has been estimated by Christensen and Aubert (2006); flow amplitudes that enter Ro and Rm are based on secular variation models and match with the scaling law by Christensen and Tilgner (2004). The values $d$, $e$, and $a$ are measures to quantify the dipolarity, equatorial symmetry, and axial symmetry of the field at the outer boundary (see Eqs. (27), (28), and (29) in section “Dipole Properties and Magnetic Field Symmetries”). $\Theta$ refers to the time-averaged dipole tilt

| Model | $E$ | Ra | $\mathrm{Ra}/\mathrm{Ra}_c$ | Pm | Ro | $\mathrm{Ro}_\ell$ | Rm | $\Lambda$ | $M_A$ | $d$ | $\Theta$ | $e$ | $a$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BM II | $10^{-3}$ | 0.011 | 2 | 5 | $9 \times 10^{-3}$ | 0.02 | 46 | 8 | 0.2 | 0.83 | 0 | 0 | 0.79 |
| E3R9 | $10^{-3}$ | 0.05 | 9 | 10 | $4 \times 10^{-2}$ | 0.12 | 434 | 19 | 0.99 | 0.31 | 17.6 | 0.58 | 0.89 |
| E4R106 | $3 \times 10^{-4}$ | 1.800 | 106 | 3 | $4 \times 10^{-2}$ | 0.13 | 408 | 5 | 1.82 | 0.41 | 15.1 | 0.68 | 0.94 |
| E5R18 | $3 \times 10^{-5}$ | 0.090 | 18 | 1 | $10^{-2}$ | 0.09 | 356 | 7 | 0.74 | 0.87 | 2.9 | 0.57 | 0.81 |
| E5R36 | $3 \times 10^{-5}$ | 0.108 | 36 | 1 | $10^{-2}$ | 0.10 | 405 | 8 | 0.78 | 0.86 | 2.2 | 0.57 | 0.77 |
| E5R43 | $3 \times 10^{-5}$ | 0.135 | 43 | 1 | $2 \times 10^{-2}$ | 0.18 | 607 | 4 | 1.66 | 0.20 | 37.7 | 0.74 | 0.94 |
| E6 | $3 \times 10^{-6}$ | 0.009 | 22 | 0.5 | $2 \times 10^{-3}$ | 0.02 | 261 | 3 | 0.40 | 0.81 | 0 | | |
| Earth | $10^{-15}$ | Large | Large | $3 \times 10^{-7}$ | $2 \times 10^{-6}$ | 0.09 | 500 | 1 | 0.03 | 0.79 | 9.7 | 0.74 | 0.94 |

Table 1 lists the input and diagnostic parameters introduced above for Earth. The numbers highlight three characteristics for the force balance in planetary dynamo
regions: at $E = 10^{-15}$, viscous effects are likely negligible on the larger length scales of interest in the dynamo process, and inertial forces should play a minor role at $\mathrm{Ro} = 2 \times 10^{-6}$. The associated dynamical consequences can be discussed in terms of a bifurcation scenario that starts with the onset of convective motions at a critical Rayleigh number $\mathrm{Ra}_c$. Dynamo action sets in at higher Rayleigh numbers where the larger flow amplitude pushes the magnetic Reynolds number beyond its critical value. The small Ekman number of planetary dynamos suggests that viscous effects are negligible. Close to $\mathrm{Ra}_c$, the Coriolis force is then predominantly balanced by pressure gradients in the so-called geostrophic regime. The Taylor-Proudman theorem states that the respective flow assumes a two-dimensional configuration, minimizing variations in the direction $\hat{z}$ along the rotation axis: $\partial \mathbf{U} / \partial z \to 0$ for $E \to 0$. Buoyancy, however, must be reinstated to allow for convective motions that necessarily involve a radial and thus non-geostrophic component. To facilitate this, viscous effects balance those Coriolis force contributions which cannot be balanced by pressure gradients (Zhang and Schubert 2000). At lower Ekman numbers, the azimuthal flow length scale decreases so that viscous effects can still provide this necessary balance: The azimuthal wave number $m_c$ at onset of convection scales like $E^{-1/3}$. The impeding effect of the two-dimensionality causes the “classical” critical Rayleigh number to rise with decreasing $E$: $\mathrm{Ra}_c \sim E^{-4/3}$. Inertial effects can contribute to balancing the Coriolis force at larger Rayleigh numbers and thus reduce the two-dimensionality of the flow. Figure 3 shows the principal convective motions and the effect of decreasing the Ekman number and increasing the Rayleigh number. The convective columns are illustrated with isosurfaces of the z-component of vorticity in the upper panels.
Red isosurfaces depict cyclones that rotate in the same direction as the overall prograde rotation $\Omega$, and blue isosurfaces show anticyclones that rotate in the opposite sense. These columns are restricted to a region outside the so-called tangent cylinder that touches the inner-core boundary (see Fig. 2). Inside the tangent cylinder, buoyancy is primarily directed along the axis of rotation so that the Taylor-Proudman theorem is even more restricting here. The convective motions therefore start at higher Rayleigh numbers here, a few times $\mathrm{Ra}_c$, and are plume-like rather than column-like. The meridional circulations in the north/south plane involve flow components along the direction of the rotation axis which violate the Taylor-Proudman theorem but are unavoidable in convection. The lower panels of Fig. 3 illustrate the meridional circulation with isosurfaces of the flow z-component, which is directed equatorward in the (red) cyclones and poleward in the (blue) anticyclones. Inside the tangent cylinder, the meridional circulation is mostly outward close to the pole and inward close to the tangent cylinder.

Decreasing the Ekman number thus has two principal effects: first, an increase of the rotational constraint promoting a more ordered geostrophic structure and second, a decrease of the length scales. The dynamics of the smaller flow structures also requires smaller numerical time steps, and both effects significantly increase the numerical costs. For example, the time step required for the simulation at $E = 3 \times 10^{-5}$ shown in Fig. 3 is three orders of magnitude smaller than the time step required at $E = 10^{-3}$. Reaching smaller Ekman numbers is therefore numerically challenging, and the available computing power limits the attainable values.

Fig. 3 Flow in nonmagnetic convection simulations at different parameter combinations (panels: $E = 10^{-3}$, $\mathrm{Ra} = 2\,\mathrm{Ra}_c$; $E = 10^{-3}$, $\mathrm{Ra} = 8.1\,\mathrm{Ra}_c$; $E = 3 \times 10^{-5}$, $\mathrm{Ra} = 1.8\,\mathrm{Ra}_c$). The top row shows isosurfaces of the z-component of the vorticity $\nabla \times \mathbf{U}$, the bottom row shows isosurfaces of the z-component of the flow. The Prandtl number is unity in all cases

When the Rayleigh number is increased beyond a certain critical value (see the discussion in section “Subcritical Dynamos and the Nature of the Dynamo Bifurcation”), dynamo action sets in and Lorentz forces can contribute to balancing the Coriolis force and further release the two-dimensionality. On theoretical grounds, the Lorentz force is thought to enter the leading order force balance in order to saturate magnetic field growth, and this seems to be confirmed by the geomagnetic value of $\Lambda$ (see Table 1). The magnetostrophic balance assumed to rule planetary dynamo dynamics therefore involves Coriolis forces, pressure gradients, buoyancy, and Lorentz forces and is thought to be characterized by an Elsasser number of order one. We will further discuss the influence of Lorentz forces on the flow dynamics in Sect. 3.6. An Ekman number of about $E = 10^{-6}$ seems to be the current limit for numerical dynamo simulations (Kageyama et al. 2008; Miyagoshi et al. 2010, 2011). Increasing the Rayleigh number in order to retain dynamo action and to yield a more realistic structure and time dependence further decreases both length scales and time scales, as we will discuss in Sect. 3. The parameters for a few representative dynamo simulations (simple, advanced, and high end) are listed along with the geophysical values in Table 1. Simple dynamos are characterized by moderate Ekman numbers, $E = 10^{-3}$ or even larger, which yield a large-scale solution that can be computed with modest numerical efforts. These dynamos
therefore lend themselves to studying long-time behavior like field reversals, to exploring the dependencies on parameters other than $E$, and to unraveling the 3D dynamo mechanism. The simple drifting behavior found at low Rayleigh numbers made the case BM II listed in Table 1 an ideal candidate for a numerical benchmark (Christensen et al. 2001). Model E6 represents the high end of the spectrum at $E = 3 \times 10^{-6}$. Its solution is very small scale and complex and exhibits a chaotic time dependence. The dynamos at $E = 3 \times 10^{-5}$ listed in Table 1 are typical advanced models that can be simulated on today’s midrange parallel computing systems on a more or less regular basis. Figure 4 shows the rms force balance for four different dynamo models, three of which are listed in Table 1. The individual contributions have been normalized with the rms Coriolis force. In all simulations, Coriolis force and pressure gradient clearly dominate, which explains the high degree of geostrophy characterizing typical dynamo solutions. Lorentz force contribution L and buoyancy B are somewhat weaker, and it remains debatable whether they should be regarded as part of the first-order force balance (Christensen et al. 1999; Soderlund et al. 2012). Acceleration A and inertia I, or more precisely momentum advection, form a lower-order balance at small Rayleigh numbers. When Ra increases, their relative contribution grows and reaches a level of around 0.1 at the transition to the multipolar regime, as correctly predicted by the local Rossby number. The local Rossby number thus seems to offer a fair estimate for the true rms force balance. Soderlund et al. (2012) remark that this transition roughly coincides with the point where the inertial contribution exceeds the viscous contribution V, as is confirmed by Fig. 4 (compare the dipolar model E5R18 with the multipolar model E5R43). Once in the multipolar regime, inertia has nearly reached the level of the Lorentz force.
It then becomes dubious to consider the latter as part of a first-order (magnetostrophic) force balance but not the former. The viscous contribution $V$ is surprisingly high in all the models. A comparison of the three cases at $E = 3 \times 10^{-5}$ shows that $V$ increases with Rayleigh number because the length scale of the solutions decreases. Since viscous effects are strongly dominated by the smallest length scales, the Ekman number cannot directly be used to estimate the true rms contribution. A better estimate would include a typical viscous length scale $\ell_V$ via $V \approx E / \ell_V^2$. Using $V \approx 0.1$, a typical value for the larger Rayleigh number solutions at $E = 3 \times 10^{-5}$, suggests $\ell_V = (E/V)^{1/2} \approx 0.02$, which is an order of magnitude smaller than the mean flow scale $\ell_u$. This suggests that the rms value $V$ overestimates the effect of viscosity on the larger typical length scales relevant for magnetic field generation. Figure 4 also illustrates that the Elsasser number tends to overestimate the relative importance of the Lorentz force. For all the cases examined here, $\Lambda$ exceeds one, while L clearly remains below unity. In analogy to the local Rossby number, Soderlund et al. (2012) therefore suggest a modified dynamic Elsasser number that provides a better estimate. The classical Elsasser number assumes that the electrical current density can be approximated by $\sigma u b$ based on Ohm’s law. The dynamic Elsasser number $\Lambda_d$ allows a more direct estimate based on the ratio of the Lorentz and Coriolis terms in the Navier-Stokes equation:
$$\Lambda_d = \frac{b^2}{2 \mu \bar{\rho} \Omega u \ell_b} = \frac{\Lambda}{\mathrm{Rm}} \frac{\ell}{\ell_b}. \tag{23}$$

Fig. 4 Force balance in the Navier-Stokes equation for four different dynamo models ($E = 10^{-3}$, $\mathrm{Ra}/\mathrm{Ra}_c = 2$, BM II; $E = 3 \times 10^{-5}$, $\mathrm{Ra}/\mathrm{Ra}_c = 9$; $E = 3 \times 10^{-5}$, $\mathrm{Ra}/\mathrm{Ra}_c = 18$, E5R18; $E = 3 \times 10^{-5}$, $\mathrm{Ra}/\mathrm{Ra}_c = 43$, E5R43). Shown are rms force contributions due to total acceleration A, viscosity V, inertia (or momentum advection) I, Coriolis force C, Lorentz force L, pressure gradient P, and buoyancy B. All contributions have been normalized with the Coriolis contribution. See Table 1 for more information on the different dynamo models
The typical magnetic length scale $\ell_b$ can be calculated similarly to $\ell_u$ when replacing the kinetic with the magnetic energy (see Soderlund et al. (2012) for a precise definition). A comparison of the parameters for the most advanced model E6 in Table 1 with Earth values shows that (1) the Ekman number is still about nine orders of magnitude too large, (2) the Rossby number is three orders of magnitude too large, (3) the magnetic Prandtl number is six orders of magnitude too large, and (4) the Alfvén Mach number is about one order of magnitude too large. On the other hand, (5) the magnetic Reynolds number reaches realistic values and (6) the Elsasser number is also about right. In terms of the dynamics, this means that (1) viscous effects are much too large, which is necessary to suppress unresolvable small-scale flow features; (2) inertial effects are overrepresented; (3) magnetic diffusion is much too low compared to viscous diffusion; and (4) the numerical dynamos are too inefficient and produce weaker magnetic fields for a given flow amplitude than Earth. On the positive side, (5) the ratio of magnetic field production to diffusion is realistic and (6) the relative (rms) impact of the Lorentz forces on the flow seems correctly modeled. The role of the local Rossby number will further be discussed in the following section.
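The relations $\mathrm{Ro} = \mathrm{Rm}\,E/\mathrm{Pm}$ and $M_A = \mathrm{Rm}\,E^{1/2}/(\Lambda\,\mathrm{Pm})^{1/2}$ quoted in this section offer a quick consistency check on the diagnostic values in Table 1, and the estimate $\ell_V = (E/V)^{1/2}$ can be evaluated in the same way. The sketch below recomputes Ro and $M_A$ from the tabulated $E$, Pm, Rm, and $\Lambda$. The dictionary layout and function names are illustrative choices, and since the table entries are rounded, only approximate agreement should be expected.

```python
import math

# (E, Pm, Rm, Lambda, tabulated Ro, tabulated M_A), rounded as in Table 1
TABLE_1 = {
    "BM II":  (1e-3, 5.0, 46.0, 8.0, 9e-3, 0.2),
    "E3R9":   (1e-3, 10.0, 434.0, 19.0, 4e-2, 0.99),
    "E4R106": (3e-4, 3.0, 408.0, 5.0, 4e-2, 1.82),
    "E5R18":  (3e-5, 1.0, 356.0, 7.0, 1e-2, 0.74),
    "E5R36":  (3e-5, 1.0, 405.0, 8.0, 1e-2, 0.78),
    "E5R43":  (3e-5, 1.0, 607.0, 4.0, 2e-2, 1.66),
    "E6":     (3e-6, 0.5, 261.0, 3.0, 2e-3, 0.40),
    "Earth":  (1e-15, 3e-7, 500.0, 1.0, 2e-6, 0.03),
}


def rossby_from_rm(ekman, pm, rm):
    """Ro = Rm E / Pm."""
    return rm * ekman / pm


def alfven_mach_from_rm(ekman, pm, rm, elsasser):
    """M_A = Rm E^(1/2) / (Lambda Pm)^(1/2)."""
    return rm * math.sqrt(ekman / (elsasser * pm))


def viscous_length(ekman, viscous_fraction):
    """l_V = (E / V)^(1/2), with V the rms viscous force relative to Coriolis."""
    return math.sqrt(ekman / viscous_fraction)


if __name__ == "__main__":
    for name, (ekman, pm, rm, lam, ro_tab, ma_tab) in TABLE_1.items():
        ro = rossby_from_rm(ekman, pm, rm)
        ma = alfven_mach_from_rm(ekman, pm, rm, lam)
        print(f"{name:7s} Ro = {ro:.2e} (table {ro_tab:.0e})  "
              f"M_A = {ma:.2f} (table {ma_tab})")
    print("l_V for E = 3e-5, V = 0.1:", viscous_length(3e-5, 0.1))
```

Because Ro and $M_A$ follow from Rm, $\Lambda$, and the input parameters, they carry no independent information; listing them mainly makes the comparison with Earth more direct.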
3.2
Dynamo Regimes
One possible strategy for choosing the parameters for a numerical dynamo model is to fix the Prandtl number to $\mathrm{Pr} = 1$, appropriate for thermal convection, and to choose $E$ as small as the available numerical computing power permits. The Rayleigh number and the magnetic Prandtl number Pm are then varied until the critical magnetic Reynolds number is exceeded and dynamo action starts. Note, however, that some authors have chosen a different approach in selecting their parameters (Glatzmaier 2002).

Fig. 5 Regime diagrams that illustrate the transition from stable dipole-dominated dynamos (regime D) to constantly reversing multipolar dynamos (regime M). Both panels span $\mathrm{Ra}/\mathrm{Ra}_c$ and Pm; panel (a) shows Rm and panel (b) shows $\Lambda$. Models showing Earth-like rare reversals can be found at the transition in regime E. Gray symbols mark nonmagnetic convective solutions (regime C). Different symbols code the time dependence: squares = drifting, upward pointed triangles = oscillatory, circles = chaotic, diamonds = Earth-like rarely reversing, downward pointed triangles = constantly reversing. The Ekman number is $E = 10^{-3}$ and the Prandtl number is $\mathrm{Pr} = 1$ in all cases

Figure 5 shows the dependence of Rm on Ra and Pm at $E = 10^{-3}$ and $\mathrm{Pr} = 1$. Rigid flow and fixed temperature boundary conditions have been assumed. Larger Ra values yield larger flow amplitudes $u$, while larger Pm values are synonymous with lower magnetic diffusivities $\lambda$ (or larger electrical conductivity $\sigma$). The increase of either input parameter thus leads to larger magnetic Reynolds numbers $\mathrm{Rm} = u \ell / \lambda$ and finally promotes the onset of dynamo action. The minimal critical magnetic Reynolds number is about 50 here, a value typical for spherical dynamo models (Christensen and Aubert 2006). Figure 5 demonstrates that the increase of Rm and $\Lambda$ with Pm depends on the Rayleigh number, which is an expression of changes in the dynamo mechanism and efficiency. Generally, the increase of Rm with Pm is slower than linear due to the growing field strength and thus intensified back reaction of the Lorentz force on the flow. Figure 5 highlights the four main dynamical regimes that have been identified in several extensive parameter space studies (Kutzner and Christensen 2002; Christensen and Aubert 2006; Amit and Olson 2008; Takahashi et al. 2008a; Wicht et al. 2009). C denotes the purely convective regime, and D is the regime where dipole-dominated magnetic fields are obtained. When $u$ becomes large at values of $\mathrm{Ra}/\mathrm{Ra}_c$ between 8 and 9, the dynamo changes its character and seems to
become less efficient. The dipole component loses its special role, which leads to a multipolar field configuration (hence regime M for multipolar dynamo regime) and an overall weaker field as indicated by the smaller Elsasser number (see Fig. 5b). In addition, the critical magnetic Reynolds number increases. The dipole polarity, which remains stable in regime D, frequently changes in regime M. Earth-like reversals, where the polarity stays constant over long periods and reversals are relatively rare and short events, happen at the transition between the regimes D and M and define regime E. It is somewhat difficult to come up with a clear and unambiguous definition of “Earth-like” reversals. We follow Wicht et al. (2009) here, demanding that the magnetic pole should not spend more than 10 % of the time in transitional positions farther away than 45° from either pole. Moreover, the dipole field should amount to at least 20 % of the total field strength at the outer boundary on time average (see Sect. 3). Figure 6 shows a time sequence for the Earth-like reversing model E3R9 (model T4 in Wicht et al. 2009) that exhibits ten reversals and several excursions where the magnetic pole also ventures farther away than 45° from the closest geographic pole but then returns. The left boundary of regime E is difficult to pin down since it is virtually impossible to prove numerically that a dynamo never reverses (Wicht et al. 2009).

Fig. 6 Reversal sequence in model E3R9 that typifies the behavior of dynamos in regime E (see Fig. 5). The top panel shows the dipole tilt P (in degrees), the bottom panel shows the magnetic dipole moment DM (in $10^{22}\,\mathrm{A\,m^2}$), rescaled by assuming Earth-like rotation rate, core density, and electrical core conductivity; time is given in Myr. Excursions where the magnetic pole ventures further away than 45° in latitude from the geographic pole but then returns are marked in light gray. See Wicht et al. (2009) for further explanation

The regime changes from D to E and further to M are attributed to the increased importance of the nonlinear inertial effects at larger Rayleigh numbers. They seem to have reached a critical strength compared to the ordering Coriolis force at the transition to regime M. Reversals typically set in at a critical local Rossby number around $\mathrm{Ro}_{\ell c} = 0.1$ (Christensen and Aubert 2006); the precise value depends on model details like the heating mode and the inner-core size (Aubert et al. 2009).
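The two criteria adopted above for “Earth-like” reversals translate directly into a test on time series. The following is a minimal sketch, assuming equally spaced samples of the dipole tilt in degrees (0 to 180) and of the relative dipole strength at the outer boundary. The function and argument names are hypothetical; the thresholds are the 10 %, 45°, and 20 % values quoted in the text. Note that a dynamo that never reverses also passes these two criteria, so a separate count of polarity changes is included.

```python
def transitional_fraction(tilt_deg):
    """Fraction of samples with the magnetic pole more than 45 degrees
    away from both geographic poles (tilt between 45 and 135 degrees)."""
    if not tilt_deg:
        raise ValueError("empty tilt series")
    transitional = sum(1 for t in tilt_deg if 45.0 < t < 135.0)
    return transitional / len(tilt_deg)


def meets_earth_like_criteria(tilt_deg, dipole_fraction,
                              max_transitional=0.10, min_dipole=0.20):
    """Check the two criteria: little time in transitional positions and
    a sufficiently strong time-averaged dipole contribution."""
    if not dipole_fraction:
        raise ValueError("empty dipole-strength series")
    mean_dipole = sum(dipole_fraction) / len(dipole_fraction)
    return (transitional_fraction(tilt_deg) <= max_transitional
            and mean_dipole >= min_dipole)


def count_polarity_changes(tilt_deg):
    """Count how often the magnetic pole crosses the equator (tilt of 90
    degrees); an excursion that crosses and returns counts twice."""
    polarity = [t < 90.0 for t in tilt_deg]
    return sum(1 for p, q in zip(polarity, polarity[1:]) if p != q)


if __name__ == "__main__":
    tilt = [5.0] * 95 + [90.0] * 5          # synthetic series, 5 % transitional
    dipole = [0.4] * 100
    print(meets_earth_like_criteria(tilt, dipole), count_polarity_changes(tilt))
```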
Once in regime E, the ever-present fluctuations in the inertial forces may suffice
to drive the dynamo into regime M for a short while and thus facilitate reversals (Kutzner and Christensen 2002). Figure 16, discussed in more detail below, shows the location of the different regimes with respect to the local Rossby number $\mathrm{Ro}_\ell$. As already discussed above, Soderlund et al. (2012) report that the transition to reversing dynamos happens when the rms inertial force exceeds the rms viscous force and thus gains more influence in their simulations. This is confirmed by the example shown in Fig. 4, but the significance remains unclear. The scenario outlined for $E = 10^{-3}$ above is repeated at lower Ekman numbers with a few changes. Most notably, the boundary towards the inertia-influenced dynamo regime M shifts towards more supercritical Rayleigh numbers. The critical relative strength of inertial effects is reached at larger flow amplitudes $u$ since the rotational constraint is stronger at lower $E$ values. The critical magnetic Reynolds number is therefore reached at lower magnetic diffusivities, which means that the minimum magnetic Prandtl number $\mathrm{Pm}_{\min}$ where dynamo action is still retained decreases (Christensen and Aubert 2006). While $\mathrm{Pm}_{\min}$ is about 4 at $E = 10^{-3}$ (compare Fig. 5), it decreases to roughly 0.1 at $E = 10^{-5}$ (Christensen and Aubert 2006). Christensen and Aubert (2006) suggest that $\mathrm{Pm}_{\min} = 450\, E^{0.75}$. An extrapolation for Earth’s Ekman number yields $\mathrm{Pm}_{\min} \approx 10^{-8}$, which is safely below the geophysical value of $\mathrm{Pm} = 3 \times 10^{-7}$. This predicts that a realistic magnetic Prandtl number can indeed be reached at realistic Ekman numbers. In a few instances, it has been reported that both dipole-dominated and multipolar solutions can be found at identical parameters (Christensen and Aubert 2006; Simitev and Busse 2009; Gastine et al. 2012). This phenomenon becomes more typical when stress-free rather than rigid outer boundary conditions are used, which allow stronger zonal winds to develop.
Two distinct solution branches then coexist for local Rossby numbers below the critical value for the transition to regime M: a branch with dipole-dominated stronger magnetic fields and weak zonal flows and a branch with multipolar weaker magnetic fields but strong zonal flows. Strong zonal flows and dipole-dominated fields seem mutually exclusive. Another example where a dipolar and a multipolar branch coexist is tied to subcritical dynamo action, further discussed in section “Subcritical Dynamos and the Nature of the Dynamo Bifurcation.”
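The empirical fit $\mathrm{Pm}_{\min} = 450\,E^{0.75}$ of Christensen and Aubert (2006) quoted in this section is easy to evaluate. The short sketch below does so for the Ekman numbers mentioned in the text; the function name and default arguments are illustrative. The fit is a power-law approximation, so its values need not coincide with the individual numbers read off the regime diagrams, and the extrapolation to Earth’s Ekman number lies far outside the simulated range.

```python
def pm_min(ekman, prefactor=450.0, exponent=0.75):
    """Minimum magnetic Prandtl number for dynamo action according to the
    power-law fit Pm_min = prefactor * E**exponent."""
    if ekman <= 0.0:
        raise ValueError("Ekman number must be positive")
    return prefactor * ekman ** exponent


if __name__ == "__main__":
    earth_pm = 3e-7  # geophysical magnetic Prandtl number quoted in the text
    for ekman in (1e-3, 1e-5, 3e-6, 1e-15):
        value = pm_min(ekman)
        print(f"E = {ekman:.0e}: Pm_min = {value:.2e}, "
              f"below Earth's Pm: {value < earth_pm}")
```

Only the last case, Earth’s Ekman number, yields a value below the geophysical Pm, which is the point made in the text: at the Ekman numbers accessible today, the magnetic Prandtl number has to remain orders of magnitude too large.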
3.3
Scaling Laws
There is no unique way to rescale the dimensionless solutions to the planetary situation. Take, for example, time. Several different time scales enter the problem: the rotation time $\tau_\Omega = \Omega^{-1}$, the fluid turnover time $\tau_u = \ell/u$, the magnetic diffusion time $\tau_\lambda = \ell^2/\lambda$, the viscous diffusion time $\tau_\nu = \ell^2/\nu$, and the thermal diffusion time $\tau_\kappa = \ell^2/\kappa$. The first three are more directly relevant for the dynamo process. The ratio of $\tau_\lambda$ and $\tau_u$ is the magnetic Reynolds number, and both are often used to rescale the simulations. Olson et al. (2012) demonstrate that using the turnover time yields a temporal Fourier power spectrum that is more similar to the geomagnetic one. Simulations with a realistic magnetic Reynolds number (around
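500) have the advantage that using either time scale provides equivalent results. The ratio of $\tau_\Omega$ and $\tau_\lambda$ is the magnetic Ekman number
$$E_\lambda = \frac{\tau_\Omega}{\tau_\lambda} = \frac{\lambda}{\Omega \ell^2} = E\,Pm^{-1}. \qquad (24)$$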
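The relations between these time scales can be checked with a few lines of code. This is a minimal sketch that assumes the definitions $E = \nu/(\Omega\ell^2)$ and $Pm = \nu/\lambda$; the input values are round, hypothetical numbers chosen only for illustration and are not taken from the chapter.

```python
def time_scales(omega, ell, u, lam, nu, kappa):
    """Return the five time scales discussed in the text (SI units)."""
    return {
        "rotation": 1.0 / omega,      # tau_Omega = 1 / Omega
        "turnover": ell / u,          # tau_u = ell / u
        "magnetic": ell**2 / lam,     # tau_lambda = ell^2 / lambda
        "viscous": ell**2 / nu,       # tau_nu = ell^2 / nu
        "thermal": ell**2 / kappa,    # tau_kappa = ell^2 / kappa
    }


# Hypothetical, order-of-magnitude inputs (assumptions for illustration).
params = dict(omega=7.3e-5, ell=2.0e6, u=5.0e-4, lam=1.0, nu=1.0e-6, kappa=5.0e-6)
tau = time_scales(**params)

rm = tau["magnetic"] / tau["turnover"]        # magnetic Reynolds number
e_lambda = tau["rotation"] / tau["magnetic"]  # magnetic Ekman number
ekman = params["nu"] / (params["omega"] * params["ell"] ** 2)
pm = params["nu"] / params["lam"]

print(f"Rm = {rm:.3g}, E_lambda = {e_lambda:.3g}, E / Pm = {ekman / pm:.3g}")
```

The last two printed numbers agree, which is the identity $E_\lambda = E\,Pm^{-1}$ stated in Eq. (24).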
Since $E$ is too large in the simulations, this is also true for $E_\lambda$. When using either $\tau_u$ or $\tau_\lambda$ for rescaling a dynamo simulation, this automatically means that the rotation rate is orders of magnitude too slow. Some authors argue that choosing larger $Pm$ values and thus smaller magnetic Ekman numbers may cure this problem (Christensen et al. 2010), and we come back to this issue in section “Dipole Properties and Magnetic Field Symmetries.” Dipole-dominated dynamos, even Earth-like reversing dynamos, are found for a wide range of Elsasser numbers (see Fig. 5). We have already remarked on the fact that the classical Elsasser number may not provide a good estimate of the true importance of the Lorentz force in dynamo simulations. The assumption that $\Lambda$ has to be of order one, however, is the basis for a classical conjecture that the magnetic field strength should be proportional to the square root of the planetary rotation rate since $b^2 = \Lambda\,\mu\lambda\,\bar{\rho}\,\Omega$. The dimensionless field strengths provided by dynamo simulations are therefore often rescaled by assuming Earth-like values for $\Omega$, core density $\bar{\rho}$, and electrical conductivity $\sigma = 1/(\mu\lambda)$ (see, e.g., Fig. 6). Dynamo simulations can be used to check and revise scaling laws, at least within the covered parameter regime. A large suite of dynamo models supports the law
$$b^2 \propto f_{Ohm}\,\bar{\rho}^{1/3}\,(F q_o)^{2/3}, \qquad (25)$$
where $f_{Ohm}$ is the fraction of the available power that is converted to magnetic energy and $F q_o$ is the total thermodynamically available power (Christensen 2010). The form factor $F$ subsumes the radial dependence of the convective vigor and is typically of order one; $q_o$ is the heat flux through the outer boundary. This scaling law successfully predicts the field strength not only for several planets in our solar system (Olson and Christensen 2006; Christensen and Aubert 2006; Yadav et al. 2013) but also for some fast-rotating stars whose dynamo zones may obey dynamics similar to those found in the planetary counterparts (Christensen et al. 2009). This suggests that all operate in a similar regime and thus implies that the dynamical differences play no major role, i.e., that viscous and inertial effects are already small enough in the simulations to capture at least the primary features of the geodynamo and that the ratio of viscous to ohmic dissipation is not essential. We refer to Christensen (2010) for a detailed comparison of the different scaling laws that have been proposed over time, not only for the magnetic field strength but also for the flow vigor and other dynamo properties. The power-dependent scaling law for the local Rossby number developed by Christensen and Aubert (2006) predicts the value of $Ro_\ell \approx 0.09$ for Earth that is listed in Table 1. This nicely agrees with the fact that the numerical simulations show Earth-like reversals in this parameter range. Inertial effects may thus play a much
larger role in the geodynamo than previously anticipated based on the small Rossby number (Wicht and Christensen 2010). Scaled to Earth values, the associated length scale $\ell_u$ (Eq. 21) corresponds to roughly 100 m. It is hard to conceive that such a small length scale should matter in the dynamo process, let alone influence its reversal behavior, but nonlinear interactions may play tricks here. The meaning of the local Rossby number and the related scaling laws needs to be explored further to understand their relevance. The fact that the power-dependent scaling law Eq. (25) is independent of the Ekman number seems good news for dynamo simulations. However, the numerical results still allow for a weak dependence on the magnetic Prandtl number (Christensen and Aubert 2006; Christensen 2010; Stelzer and Jackson 2013). This may amount to a significant change when extrapolating the simulation at $Pm \approx 0.1$ to the planetary value of $Pm = 3 \times 10^{-7}$. The much smaller viscous diffusion at $Pm \ll 1$ allows for significantly smaller scales in the flow than in the magnetic field. The smaller flow scales should thus have no direct impact on the dynamo process in planetary dynamo regions but may still play a role for the magnetic field generation in dynamo simulations. However, the nonlinearities couple flow and magnetic field of different scales and may lead to complicated interactions. Simulations at lower magnetic Prandtl numbers are required to clarify these issues but, as outlined above, also require lower Ekman numbers, which is numerically costly. The disparity between flow and magnetic field length scales further increases the numerical difficulties.
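The power-based scaling law of Eq. (25) is easiest to use in relative form, because the constant of proportionality then drops out. The sketch below assumes the exponents as reconstructed here ($b^2$ proportional to $f_{Ohm}$, $\bar{\rho}^{1/3}$, and $(F q_o)^{2/3}$); the function name and its arguments are hypothetical and serve only as an illustration.

```python
def field_strength_ratio(power_ratio, density_ratio=1.0, f_ohm_ratio=1.0):
    """Ratio b_2 / b_1 of field strengths for two dynamos obeying
    b^2 ~ f_ohm * rho^(1/3) * (F q_o)^(2/3).

    power_ratio   : (F q_o)_2 / (F q_o)_1
    density_ratio : rho_2 / rho_1
    f_ohm_ratio   : f_ohm_2 / f_ohm_1
    """
    b_squared_ratio = (
        f_ohm_ratio * density_ratio ** (1.0 / 3.0) * power_ratio ** (2.0 / 3.0)
    )
    return b_squared_ratio ** 0.5


# An eightfold increase in available power only doubles the field strength.
print(field_strength_ratio(8.0))
```

Since $b$ scales with the cube root of the available power, even large uncertainties in the heat flux translate into modest uncertainties in the predicted field strength.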
3.4
Double Diffusive Approach
Simulations that simultaneously model thermal and compositional convection and allow for significant differences in the diffusivities of both components are still rare. While for thermal convection a Prandtl number of one is realistic, the composition Prandtl number may be three orders of magnitude larger, when molecular diffusivities are considered. Several studies that model pure thermal convection but vary the Prandtl number indicate that inertial effects decrease when increasing the Prandtl number in convectively driven flows (Tilgner 1996; Breuer et al. 2002; Schmalzl et al. 2002) as well as in dynamos (Simitev and Busse 2005; Sreenivasan and Jones 2006; Wicht and Christensen 2010). Larger Prandtl numbers promote more confined thinner convective flows and lower overall flow amplitudes (Breuer et al. 2002; Schmalzl et al. 2002), and the latter is also responsible for the smaller inertial effects (Sreenivasan and Jones 2006). Higher magnetic Prandtl numbers are thus required to sustain dynamo action and keep Rm above the necessary critical value (Simitev and Busse 2005). Simitev and Busse (2005) moreover report that the dipole contribution becomes stronger at larger Prandtl numbers which is in line with our understanding that inertia is responsible for the transition from the dipolar to the multipolar regime in Fig. 5.
Breuer et al. (2010) present a double diffusive study at an Ekman number of $E = 10^{-3}$ where thermal and compositional evolution is modeled separately with two equations of the form (6). The thermal and compositional Prandtl numbers are 0.3 and 3.0, respectively. They report that a sizable contribution of compositional buoyancy, similar to what can be expected for Earth, promotes smaller-scale flows and more convective plumes inside the tangent cylinder. Other important flow properties like the zonal flow or helicity are also significantly affected. In a double diffusive study geared to model Mercury’s dynamo, Manglik et al. (2010) observe that the thinner compositional plumes tend to destabilize a thermally stably stratified layer attached to the outer boundary. This result may also apply to the Earth’s core where recent studies suggest the presence of a similar stratified layer (Pozzo et al. 2012; Gubbins and Davies 2013).
3.5
Dynamo Mechanism
The simple structure of large Ekman number and small Rayleigh number simulations makes it possible to analyze the underlying dynamo mechanism (Olson et al. 1999; Wicht and Aubert 2005; Aubert et al. 2008b). Figure 7 illustrates the mechanism at work in the benchmark II dynamo (Christensen et al. 2001) (BMII in Table 1). The solution obeys a fourfold azimuthal symmetry, and the dynamo process can be illustrated by concentrating on the action of one cyclone/anticyclone pair.
Fig. 7 Illustration of the dynamo mechanism in the benchmark II dynamo. Panel (a) shows isosurfaces of positive (red) and negative (blue) z-vorticity which illustrate the cyclonic and anticyclonic convective columns. Panel (b) shows isosurfaces of positive (red) and negative (blue) z-velocity, and panel (c) shows contours of the radial field at the outer boundary. Red (blue) indicates radially outward (inward) field. The magnetic field lines are colored accordingly, and their thickness is scaled with the local magnetic field strength
The mechanism is of the $\alpha^2$ type, a terminology that goes back to the mean field dynamo theory and refers to the fact that poloidal and toroidal magnetic fields are produced by the local action of small-scale flow; small scale refers to the individual convective columns here. The $\alpha\omega$-mechanism, where the toroidal field is created by shear in the zonal flow, offers an alternative that may operate in models where stress-free boundary conditions allow for stronger zonal flows to develop (Kuang and Bloxham 1997; Busse and Simitev 2005a). Anticyclones stretch north-/south-oriented field lines radially outward in the equatorial region. This produces a strong inverse radial field on either side of the equatorial plane which is still clearly visible at the outer boundary. Inverse refers to the radial direction opposing the dominant axial dipole. The field is then wrapped around the anticyclone and stretched poleward, resulting in a field parallel to the original axial dipole and thus reinforcing it. Advective transport from the anticyclones towards the cyclones and stretching by the meridional circulation down the axis of the cyclones closes the magnetic production cycle which maintains the field against ohmic decay. The converging flow into the anticyclones advectively concentrates the normal polarity field into strong flux lobes located at higher latitudes close to but outside the tangent cylinder. The dipole field is to a good degree established by these lobes. Meridional circulation inside the tangent cylinder is responsible for the characteristically weaker magnetic field closer to the poles since it transports field lines away from the rotation axis towards the tangent cylinder. The distinct magnetic features associated with the action of the prograde and retrograde rotating convective columns have been named magnetic cyclones and anticyclones by Aubert et al. (2008b).
An identification of the individual elements in the dynamo process becomes increasingly difficult at smaller Ekman numbers and larger Rayleigh numbers where the solutions are less symmetric, more small scale, and more strongly time dependent. Many typical magnetic structures nevertheless prevail over a wide range of parameters, which suggests that the underlying processes may still be similar (Aubert et al. 2008b).
3.6
Is There a Distinct Low Ekman Number Regime?
As outlined above, even the most advanced spherical dynamo models are restricted to Ekman numbers of $E = O(10^{-6})$ or larger, which is still nine orders of magnitude away from Earth’s value. Though the current dynamo models are rather successful in reproducing many features of observed planetary magnetic fields, and even though the changes with decreasing Ekman number seem modest in typical dynamo models, there is clearly the danger that more severe changes are encountered when future models succeed in reaching more realistic Ekman numbers. In this section, we review some questions which arise in the context of the low Ekman number regime and discuss results from recent simulations.
The Influence of the Magnetic Fields on Rapidly Rotating Convection
It has long been known that magnetic fields can have a strong impact on rotating convective flows at small Ekman number $E$. In particular, studies of so-called magnetoconvection, in which the field is externally imposed onto the convective flow, have suggested that the magnetic forces act to change the flow structure fundamentally in comparison to the small-scale motions typically encountered in the nonmagnetic case. The general effects are most easily discussed in terms of a simple Cartesian plane layer model studied already by Chandrasekhar (1961). This model extends the classical Rayleigh-Bénard configuration, a horizontal fluid layer heated from below, by adding the effects of vertical rotation and of a uniform, imposed vertical magnetic field. For simplicity, we assume fixed temperatures and vanishing shear stresses at the boundaries. The stability of the purely conductive solution against small perturbations can then be analyzed in a straightforward manner by a standard linear analysis. Figure 8a, taken from Hori et al. (2012), shows the critical Rayleigh number $Ra_c$ (see footnote 1) for the onset of convection as a function of the horizontal wave number $K_H$.
Fig. 8 Critical Rayleigh number vs. horizontal wave number KH in a plane layer with (a) fixed temperature boundary conditions and (b) fixed heat flux conditions (From Hori et al. 2012)
1 Note that $Ra$ is defined similarly to Eq. (14), with $\ell$ replaced by the layer depth. Furthermore, possible oscillatory modes are omitted from Fig. 8a for simplicity. See Hori and Wicht (2013) for details.
The behavior is illustrated for various Ekman numbers $E$ and for imposed fields of varying strengths, measured here by the Elsasser number $\Lambda_0$ defined in Eq. (18). The minimum of each curve signifies the Rayleigh number and horizontal wave number where the purely conductive state becomes unstable to convective motions. In the nonmagnetic case, $\Lambda_0 = 0$, the Rayleigh number required to destabilize the system strongly increases with decreasing $E$, proportional to $E^{-4/3}$ for small $E$, while the wave number of the first unstable mode increases like $E^{-1/3}$. These scalings agree with those found for the critical Rayleigh number and azimuthal wave number $m_c$ in spherical shells (Sect. 3.1). The behavior changes dramatically in the presence of a strong imposed field. The minima shift towards lower Rayleigh and wave numbers, revealing that a much weaker buoyancy forcing, $Ra = O(E^{-1})$, now suffices to destabilize the system. The first unstable mode is characterized by $O(1)$ flow scales, in sharp contrast to the $O(E^{1/3})$ scales encountered in the nonmagnetic case. As already explained in Sect. 3.1, in the absence of a magnetic field, viscous forces play an important role in balancing the components of the Coriolis force that cannot be balanced by pressure alone, which requires small $E^{1/3}$ length scales to be present in the flow. For sufficiently strong magnetic fields, the Lorentz force enters the force balance and can help to balance the Coriolis force. Ultimately, the need for small flow scales vanishes, thus reducing the viscous dissipation and in turn the critical Rayleigh number. Note that the drop in both the critical Rayleigh number from $O(E^{-4/3})$ to $O(E^{-1})$ and in the first unstable wave number from $O(E^{-1/3})$ to $O(1)$ increases with decreasing $E$, revealing that the effect of the imposed field successively becomes more pronounced as lower Ekman numbers are approached. As pointed out by Hori et al.
(2012), it is interesting to compare the above results with a situation where constant heat flux conditions are imposed at the boundaries (Fig. 8b). In broad terms, the behavior is similar, but the curves now flatten out for small wave numbers, revealing that large flow scales are excited more easily. This suggests that the effect of magnetic fields on the large scales in rotating convection may be more pronounced if flux conditions are applied. We will come back to this observation in section “The Impact of Lorentz Forces and Buoyancy Boundary Conditions on the Flow Scales in Numerical Simulations.” The simple plane layer model considered above is certainly oversimplified. Linear studies using spherical shell geometries and more general imposed large-scale magnetic field topologies in fact reveal the existence of more complicated convective modes (Fearn 1979; Sakuraba and Kono 2000; Zhang and Gubbins 2000a,b; Sakuraba 2002), but generally confirm the basic result that strong imposed fields tend to promote large-scale flows and decrease the critical Rayleigh number. The magnetic field typically begins to alter the flow structure for $\Lambda = O(E^{1/3})$, showing that at low Ekman number, even weak magnetic fields can be dynamically important. Much depends on the geometry of the imposed fields and the model details. We do not attempt to review the wealth of results obtained over the last decades, and instead refer to the reviews by Proctor (1994) and Jones (2007). We do point out however that a linear analysis by Sakuraba (2002) suggests that the choice of temperature boundary conditions strongly affects the magnetic modes also in
spherical shells, a finding recently confirmed in fully nonlinear magnetoconvection simulations by Hori and Wicht (2013). Care should be taken when applying magnetoconvection results, which are obtained for topologically simple, externally imposed fields, to fully nonlinear dynamos, in which more complicated magnetic fields arise that are dynamically coupled to the flow. If we assume that the magnetoconvection results are significant for dynamos, this would suggest a growing scale disparity between weakly and strongly magnetic flows as $E$ is reduced and ultimately lead to enormous scale differences between nonmagnetic and magnetic dynamics at Earth-like parameters. Typical estimates based on linear theory suggest that nonmagnetic columnar convection (as illustrated in Fig. 3b) would have azimuthal length scales ranging from about 30 m to one kilometer in the Earth’s core, depending on whether molecular or turbulent values of viscosity are used (Zhang and Gubbins 2000b). The planetary scale convection cells suggested by magnetoconvection studies for strong, Earth-like fields would be three to five orders of magnitude larger, again illustrating the possible key role of the magnetic field in the low $E$ regime. Interestingly, this effect would counteract the scale disparity due to the small magnetic Prandtl number that we discussed in Sect. 3.3. As noted above, the implications of magnetoconvection studies for rapidly rotating dynamos are not trivial. Apart from the general hope to make the models more realistic, this is one of the reasons for the ongoing efforts to push the numerical simulations towards more realistic Ekman numbers.
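The plane layer scalings discussed above can be reproduced with a short numerical sketch. It assumes the classical marginal stability relation for stationary onset with stress-free boundaries given by Chandrasekhar (1961), written in terms of the Taylor number $Ta = 4/E^2$ and the Chandrasekhar number $Q = \Lambda/E$. The Rayleigh number used here is the classical one based on the layer depth and may differ by factors of $E$ or $Pr$ from the definition in Eq. (14); oscillatory modes are ignored, and the function names are hypothetical.

```python
import math


def marginal_rayleigh(k, ekman, elsasser=0.0):
    """Rayleigh number at which a stationary mode with horizontal wave
    number k becomes unstable (stress-free plane layer, vertical rotation
    and vertical imposed field, after Chandrasekhar 1961)."""
    taylor = 4.0 / ekman**2        # Ta = (2 Omega d^2 / nu)^2
    chandra = elsasser / ekman     # Q = Lambda / E
    b = math.pi**2 + k**2
    mag = b**2 + math.pi**2 * chandra
    return (b / k**2) * (mag + math.pi**2 * taylor * b / mag)


def critical_values(ekman, elsasser=0.0, n=4000):
    """Minimize over a logarithmic grid of wave numbers from 0.1 to 1e4.
    Returns (critical Rayleigh number, critical wave number)."""
    ks = [10.0 ** (-1.0 + 5.0 * i / (n - 1)) for i in range(n)]
    return min((marginal_rayleigh(k, ekman, elsasser), k) for k in ks)


for ekman in (1e-4, 1e-6, 1e-8):
    ra0, k0 = critical_values(ekman)
    ra1, k1 = critical_values(ekman, elsasser=1.0)
    print(f"E = {ekman:.0e}: nonmagnetic Ra_c = {ra0:.2e}, k_c = {k0:.1f}; "
          f"Lambda = 1: Ra_c = {ra1:.2e}, k_c = {k1:.1f}")
```

In this sketch the nonmagnetic critical Rayleigh number and wave number follow the $E^{-4/3}$ and $E^{-1/3}$ trends, while an imposed field with $\Lambda = 1$ lowers the critical Rayleigh number and moves the first unstable mode to wave numbers of order one.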
The Impact of Lorentz Forces and Buoyancy Boundary Conditions on the Flow Scales in Numerical Simulations
The majority of dynamo simulations published so far have assumed fixed codensity values at the boundaries. Although some simulations have reached Ekman numbers down to $O(10^{-6})$, the effect of the magnetic field on the flow is typically modest. In a study at $E = 2 \times 10^{-6}$, Takahashi et al. (2008a) report an increase of $\ell_u$ by about 20 % as compared to the nonmagnetic case. Soderlund et al. (2012) also report small effects for Ekman numbers down to $E = 10^{-5}$. Sakuraba and Roberts (2009) were the first to notice that the situation can change dramatically if the buoyancy flux is prescribed at the boundaries and internal buoyancy sources are present. They compare two simulations at $E \approx 2 \times 10^{-6}$, shown in Fig. 9, one with a uniform temperature boundary condition applied at the core-mantle boundary and one with a fixed, uniform heat flux condition. Both simulations assume a homogeneous heat source throughout the core and also include a localized buoyancy source at the inner-core boundary to mimic the effects of inner-core growth. The choice of thermal boundary conditions at the core-mantle boundary obviously has a huge impact on the system dynamics. While fixed temperatures at the outer boundary lead to a flow field that is dominated by small-scale features, the case with a uniform heat flux boundary condition shows a much larger ability to create large-scale motions. In regions where the magnetic field is strong, the velocity field
Fig. 9 Snapshots of the velocity and of the magnetic field at $E \approx 2 \times 10^{-6}$ in the Sakuraba and Roberts (2009) model. Panels (a, c, e) and (g) show results obtained for a simulation with fixed heat flux boundary conditions on the outer shell, while a simulation with prescribed uniform surface temperature gives results as shown in (b, d, f) and (h). The radial magnetic field on the CMB is shown in (a) and (b). The remaining panels are velocity and magnetic field plots in the $z = 0.1\,r_o$ plane viewed from the north. The radial component of velocity (c, d), the azimuthal component of velocity (e, f), and the azimuthal magnetic field (g, h) are, respectively, shown. Color scales run from $-0.3$ to $0.3$ for $b_r$, from $-300$ to $300$ for $u_s$ and $u_\phi$, and from $-1.5$ to $1.5$ for $b_\phi$. Note that the velocity and magnetic field scales differ from the scaling used in the present paper; see Sakuraba and Roberts (2009) for details
has a large-scale structure, with a dominant azimuthal wave number $m \approx 6$. Small-scale turbulence is still observed in regions where the magnetic field is weak. Building on the study of Sakuraba and Roberts (2009), Hori and her coauthors investigated the role of buoyancy boundary conditions in a series of papers (Hori et al. 2010; Hori et al. 2012; Hori and Wicht 2013). Following the codensity
approach introduced in Sect. 2.1, they considered both fixed codensity values and fixed codensity flux at the boundaries and also varied the amount of internal codensity sources. The results confirm the linear magnetoconvection results shown in Fig. 8 that fixed-flux boundary conditions promote larger convective scales than fixed boundary values of codensity. This is true in the nonmagnetic case, but becomes much more pronounced if a dynamo generates a strong, dipolar field. The details seem to depend strongly on how the flow is driven. If internal buoyancy sources dominate, the condition on the core-mantle boundary largely controls the dynamics, just as in the case presented by Sakuraba and Roberts (2009). In cases mainly driven by buoyancy sources at the inner-core boundary, the conditions for codensity applied there have a stronger influence. Fixed codensity values at the boundaries lead to much less pronounced effects of the magnetic field on the flow. A surprising result of the study by Hori et al. (2012) is that significant effects are already observed at moderate Ekman numbers of $O(10^{-4})$.
Subcritical Dynamos and the Nature of the Dynamo Bifurcation
The simplest bifurcation structure that might be expected based on the ideas presented above is sketched in Fig. 10. The illustration is taken from Hori and Wicht (2013), but similar figures have been published before. Starting from a weak magnetic perturbation, the nonmagnetic flow becomes a kinematic dynamo at $Ra = Ra_d$ in a supercritical bifurcation when the critical magnetic Reynolds number for dynamo action, $Rm_d$, is reached. If the Rayleigh number exceeds $Ra_d$ only moderately, the dynamo may saturate on the so-called weak field branch where the magnetic field is too faint to substantially affect the flow. If $Ra$ is increased
Fig. 10 A sketch of the classical dynamo bifurcation scenario as motivated by magnetoconvection results (Slightly modified from Hori and Wicht 2013). Axes: $U$ or $Rm$ versus $Ra$, and $B^2$ or $\Lambda$ versus $Ra$. Marked values: $Rm_d$, $Rm_d^{mag}$, $Ra_c$, $Ra_c^{mag}$, $Ra_d$, and $Ra_d^{mag}$. Labeled features: weak field branch, strong field branch with $\Lambda = O(1)$, and subcritical dynamo
further, the magnetic field reaches an amplitude of $\Lambda = O(E^{1/3})$ where Lorentz forces become important in the force balance. It seems reasonable to expect that saturation does not occur before the Elsasser number $\Lambda$ becomes $O(1)$, where magnetoconvection suggests that the critical Rayleigh number reaches a minimum, viscous forces become negligible, Lorentz forces control the flow scales, and a magnetostrophic force balance is established. Once the system reaches this so-called strong field branch, it seems possible to reduce the Rayleigh number to values below $Ra_d$ without shutting down the dynamo. Such solutions are commonly called subcritical dynamos. Note that this term is sometimes also used in a more restrictive sense where it refers to hypothetical dynamos which remain stable even for Rayleigh numbers below the critical Rayleigh number $Ra_c$ for the onset of nonmagnetic convection. Indeed, the magnetoconvection results and Cartesian dynamo studies (St. Pierre 1993; Stellmach and Hansen 2004) suggest that such dynamos may exist, although they have not been observed in spherical shell simulations yet. The bifurcation behavior found in numerical simulations is certainly more complicated than the simple picture sketched above. That a strong magnetic field can help to maintain a dynamo for $Ra < Ra_d$ can already be inferred from the benchmark dynamo (case BMII in Table 1), which is only found if the calculation is started from a suitable and sufficiently strong initial field. Since the benchmark dynamo operates at $E = 10^{-3}$, the flow scales are not expected to differ much from the nonmagnetic solution, which is readily confirmed by comparing Figs. 3 and 7. Additional effects to those discussed above are likely to be at work. In addition to subcritical dynamos, numerical simulations have also demonstrated bistability, showing that more than one solution can be stable for the same set of control parameters.
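The dependence on the initial field that characterizes subcritical behavior can be illustrated with a toy amplitude equation. The sketch below is not a dynamo model: it integrates the textbook normal form $dB/dt = \mu B + B^3 - B^5$, where the hypothetical control parameter $\mu$ plays the role of the distance from the kinematic dynamo onset ($\mu = 0$ standing in for $Ra = Ra_d$). Unlike the scenario of Fig. 10, this toy has no weak field branch; all names are introduced here for illustration.

```python
import math


def saturate(mu, b0, dt=0.01, steps=20000):
    """Integrate dB/dt = mu*B + B**3 - B**5 with explicit Euler steps
    and return the final amplitude."""
    b = b0
    for _ in range(steps):
        b += dt * (mu * b + b**3 - b**5)
    return b


def strong_branch(mu):
    """Stable finite-amplitude solution, which exists for mu > -1/4."""
    return math.sqrt((1.0 + math.sqrt(1.0 + 4.0 * mu)) / 2.0)


mu = -0.1  # below onset: the field-free state B = 0 is linearly stable
print("weak seed   ->", saturate(mu, 1e-3))   # decays towards zero
print("strong seed ->", saturate(mu, 1.0))    # settles on the strong branch
print("analytic strong branch:", strong_branch(mu))
```

Below onset, only the run started from a strong field survives, which mirrors the observation that the benchmark dynamo is found only when the calculation is started from a sufficiently strong initial field.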
A recent study by Morin and Dormy (2009) has further shown the existence of isola-type dynamo bifurcations in certain regions of parameter space, in which stable magnetic and nonmagnetic states coexist for a finite range of Rayleigh numbers, with the dynamo branch suddenly breaking down when the Rayleigh number exceeds a certain value. Several mechanisms have been proposed to explain the origin of subcritical behavior in numerical dynamos. Two possible mechanisms have recently been identified by Sreenivasan and Jones (2011). They note that magnetic fields with dipolar symmetry tend to enhance the flow along the axis of the convection columns, resulting in an overall enhancement and a more coherent spatial distribution of the flow helicity (see Sect. 1). Since helicity, though not essential for dynamo action (Gilbert et al. 1988), generally helps to maintain large-scale magnetic fields, this might explain the occurrence of subcritical behavior. Interestingly, magnetic fields with quadrupolar symmetry have a much weaker effect on helicity, which may hint at an explanation for why so many dynamos produce dipolar fields for not too large Rayleigh numbers. A second, less robust mechanism described by Sreenivasan and Jones (2011) is the competition between dipolar fields and zonal flows most relevant for stress-free boundary conditions, as discussed in Sect. 3.2. The shear associated with strong zonal flows destroys convection and thus kinetic helicity in a large fraction of the spherical shell so that only relatively weak magnetic fields
Fig. 11 Elsasser number $\Lambda$ (a) and dominant modes $m$ in the kinetic energy spectra (b) versus $Ra/Ra_c$ for models driven by internal heating and a prescribed heat flux at the core-mantle boundary at $E = 10^{-4}$, $Pr = 1$, and $Pm = 3$. Green crosses represent nonmagnetic convection with failed kinematic dynamos, blue stars are dynamos grown from small seed fields, and purple squares represent dynamos starting from a strong dipolar field. In (a), the dotted lines represent the jump from a solution with weak field to a solution with strong field and the decay of the strong field, respectively. The dominant $m$ modes in (b) are defined as modes which contain more than 75 % of the energy of the peak mode. The first and the second peak modes are connected by thick and thin lines, respectively (From Hori and Wicht 2013). Axes: (a) $\Lambda$ versus $Ra/Ra_c$; (b) dominant $m_u$ versus $Ra/Ra_c$
are produced. A strong magnetic field, on the other hand, can eliminate these zonal flows via Lorentz forces, and the resulting stronger helicity in turn maintains the strong field. The results of Sreenivasan and Jones (2011) and Morin and Dormy (2009), obtained for fixed temperature boundary conditions and without strong internal heating, reveal only a small subcritical window, extending down to no more than about 80 % of $Ra_d$. A recent study by Hori and Wicht (2013) shows that the way convection is driven again has a key influence in this context. In models driven by internal sources and a prescribed, constant heat flux condition at the CMB, the subcritical range is shown to extend to 25 % of $Ra_d$ in some simulations, even at the moderate Ekman number $E = 10^{-4}$. Figure 11 shows an example from this study. Two dynamo branches are clearly evident in Fig. 11a: a branch of solutions with weak, multipolar magnetic fields grown from small initial seed fields and a branch of dynamos with much stronger, dipolar fields, which are obtained either at high Rayleigh numbers or by using a strong initial field. Both branches coexist in a finite range of Rayleigh numbers until the lower branch becomes unstable at $Ra \approx 13.8$ and the dynamos develop into strong dipolar solutions. A subcritical branch extending down to $Ra \approx 1.8\,Ra_c$ is also clearly visible. Figure 11b demonstrates that the strong dipolar fields cause much larger dominating flow scales than observed in either nonmagnetic or weak field solutions, in agreement with the expectations from magnetoconvection. The results found by Hori and Wicht (2013) largely resemble the scenario depicted in Fig. 10. The Elsasser number for the weak field cases is somewhat larger than expected, i.e., larger than the $\Lambda = O(E^{1/3})$ value where the magnetic field should become dynamically important. This may be due to the fact that multipolar fields have a weaker effect on the convection than well-organized dipolar fields.
Note that the magnetic Reynolds number for the strong field cases is lower than for nonmagnetic convection in all cases. The authors speculate that the existence of subcritical dynamos in these models is caused by the larger flow scales, which reduce ohmic dissipation and thus allow the dynamo to operate at lower magnetic Reynolds numbers. Further studies at lower Ekman number are needed to investigate whether the combination of internal heating and flux conditions at the outer boundary might finally allow for subcritical dynamos in the stronger sense, i.e., for $Ra < Ra_c$. Although the Earth’s dynamo today is likely in a highly supercritical state, the Martian dynamo may have evolved along a subcritical branch before it went extinct about 4 Gyr ago (Kuang et al. 2008). No inner core was present at that time, so that the ancient Mars dynamo was probably driven mainly by secular cooling through the mantle, i.e., precisely the scenario studied by Hori and Wicht (2013). It thus seems likely that subcriticality played a role in the abrupt collapse of the Martian magnetic field recorded in crustal magnetizations (Lillis et al. 2008).
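The criterion used in Fig. 11b to identify dominant modes, namely all modes that contain more than 75 % of the energy of the peak mode, is simple to state in code. The sketch below assumes the kinetic energy spectrum is available as a mapping from azimuthal wave number $m$ to energy; the function name and the sample spectrum are hypothetical.

```python
def dominant_modes(spectrum, threshold=0.75):
    """Return the wave numbers m whose energy exceeds `threshold` times
    the energy of the peak mode, sorted by m.

    spectrum : dict mapping azimuthal wave number m -> kinetic energy
    """
    peak = max(spectrum.values())
    return sorted(m for m, energy in spectrum.items() if energy > threshold * peak)


# Hypothetical spectrum for illustration only (arbitrary units).
example = {1: 0.20, 2: 1.00, 3: 0.80, 4: 0.50, 5: 0.76}
print(dominant_modes(example))
```

A spectrum with a single sharp peak yields one dominant mode, while a broad spectrum yields several, so the number of returned modes is itself a rough measure of how well defined the flow scale is.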
Transitions in Low Ekman Number Rapidly Rotating Convection
A phenomenon that has recently been observed in low Ekman number dynamo simulations is dual convection (Kageyama et al. 2008; Miyagoshi et al. 2010; Miyagoshi et al. 2011), illustrated in Fig. 12. Shown is the axial vorticity in the equatorial plane for varying Rayleigh numbers and angular velocities $\Omega$, corresponding to Ekman numbers between $O(10^{-2})$ and $O(10^{-6})$ in our definition. A regime transition to a state characterized by sheetlike plumes close to the tangent cylinder and a strong westward zonal flow is observed at low Ekman and high Rayleigh number (Fig. 12h, i). The simulations were carried out using no-slip conditions for the velocity field. A detailed analysis of one particular run from this dual convection regime reveals a local Rossby number $Ro_\ell \approx 4 \times 10^{-3}$, with a dipolar magnetic field, as expected (see section “The Influence of the Magnetic Fields on Rapidly Rotating Convection”). Miyagoshi et al. (2010) point out that a similar dual convection regime has been observed in experiments using water as a working fluid (Gillet et al. 2007), suggesting that the magnetic field is not essential. The simulation results shown in Fig. 12 have been obtained by solving the compressible form of the magnetohydrodynamic equations without employing the Boussinesq approximation described in Sect. 2.1. Furthermore, the authors assumed a gravity field which drops off like $r^{-2}$, in contrast to the linearly increasing gravity field that is usually considered. It is thus difficult to directly compare the results to other studies. Still, the simulations strongly suggest that important new dynamical effects come into play at low Ekman numbers. A further hint that rapidly rotating, nonmagnetic convection might still yield surprises comes from a promising approach employing simplified equations (Julien et al. 1998, 2012; Julien and Knobloch 1998; Sprague et al. 2006).
The authors point out that the extreme range of spatial and temporal scales that makes direct numerical simulations of rapidly rotating convection so difficult can be exploited to simplify the governing equations considerably. Instead of solving the full Boussinesq equations, they propose to use a set of reduced equations which can be shown to be asymptotically valid in the limit of small Rossby number. Numerical simulations using these reduced equations in Cartesian geometry (Sprague et al. 2006; Julien et al. 2012) reveal a surprisingly rich behavior. For planetary cores, a regime the authors identify as geostrophic turbulence may be of particular interest. In this regime, vertical coherence is lost, and for strongly driven flows, Julien et al. (2012) present evidence for an inverse cascade from small horizontal scales to large-scale, depth-independent horizontal flow. The authors speculate that as soon as energy is transferred to spatial scales where the latitudinal variation of the Coriolis force is felt, the large scales might organize into a zonal flow at the Rhines scale (see, e.g., Vallis 2006). Future studies are clearly needed in order to clarify these issues.
Fig. 12 Axial vorticity in the equatorial plane obtained in dynamo simulations with varying Rayleigh number Ra and rotation rate $\Omega$ by Miyagoshi et al. (2010). Panels (a) to (i) are arranged along axes labeled “Higher Ra” and “Higher $\Omega$”. The solid arrow in panel (i) indicates the radial range of sheet-plume convection, and the dotted arrow the range of westward zonal flow (From Miyagoshi et al. 2010)
4 Comparison with the Geomagnetic Field
A detailed comparison of numerical solutions with the geomagnetic field structure and dynamics can serve to judge whether a particular simulation provides a realistic geodynamo model. Global geomagnetic models represent the field in terms of spherical harmonics, which allows a downward continuation to the core-mantle boundary. Sufficient resolution in time and space and an even global coverage are key issues here. We refer to Hulot et al. (2010) for a more detailed review of the different geomagnetic field models. A comparison of simulation snapshots with geomagnetic models of certain epochs always bears the problem that both magnetic fields are highly variable in time. The conclusions may therefore depend on the selected epochs. Time-averaged fields offer additional insight, though certain smaller-scale features vanish in the averaging process. The relatively high resolution and data quality make the satellite-based modern field models from the past decades ideal candidates for a comparison. However, one should keep in mind that at least some of their features may not represent the “typical” geomagnetic field and that they do not embrace the full geomagnetic time variability. The attainable resolution for the dynamo field is limited by the fact that the crustal magnetic field cannot be separated from the core contribution. Even the modern satellite models that strive to represent the core-mantle boundary field are therefore only reliable for spherical harmonic degrees $l \le 14$, where the crustal contribution remains negligible. Satellite-based models reach back to 1980 and therefore encompass only a very small fraction of the inherent geomagnetic time scales. Historic field models cover the past 400 years and rely on geomagnetic observatory data and ship measurements among other sources. Naturally, their resolution and precision are inferior to those of the satellite models and degrade when going back in time. gufm1 (Jackson et al. 2000), the model that we will use for comparison in the following, provides spherical harmonic degrees up to $l = 14$ in 2.5 year intervals. Figure 13 compares the gufm1 core-mantle boundary (CMB) field model for the epoch 1990 with selected snapshots from models E5R36 and E6. The comparison with model E5R36 reveals striking similarities.
An imposed fourfold azimuthal symmetry has been used in model E6 to save computing time. This complicates the comparison, but the field seems to show too little structure at low latitudes and lacks the inverse field patches that seem to be typical for the historic geomagnetic field in this region. A more detailed discussion of the individual features follows below. Respective comparisons for numerical dynamos at different Ekman numbers can be found elsewhere (Christensen and Wicht 2007; Wicht et al. 2009; Wicht and Christensen 2010). The normalized magnetic energy spectrum at the core-mantle boundary,
$$\bar{E}_m(l, r = r_o) = \frac{\sum_{m=0}^{m=l} E_m(l, m, r = r_o)}{\sum_{m=0}^{m=1} E_m(l = 1, m, r = r_o)}, \tag{26}$$
provides information about the importance of the different spherical harmonic degrees in the core field. Here, $E_m(l, m, r)$ is the magnetic energy carried by the mode of spherical harmonic degree $l$ and order $m$ at radius $r$. Figure 14 compares the time-averaged gufm1 spectrum with time-averaged spectra from different simulations. When downward continuing the magnetic measurements to construct models of the core-mantle boundary field, damping of the harmonics beyond, say, degree $l = 12$ is required, and this clearly affects the higher harmonics in gufm1. The spectra for the non-dipolar contributions in the numerical simulations remain basically white for low to intermediate degrees, with some degrees, notably $l = 5$, sticking out. The relative importance of the dipole contribution grows with decreasing Ekman number as long as the solutions belong to regimes D or E, as we will further discuss below. The red curve in Fig. 14 is an example of a multipolar solution where the dipole has completely lost its prominence.
Fig. 13 Comparison of radial magnetic fields at the outer boundary of the dynamo region (core-mantle boundary). The top panel shows the gufm1 field for the year 1990; the other panels show snapshots from models E5R36 and E6, restricted to $l \le 14$ in the middle row and at full numerical resolution in the bottom row. The color scheme has been inverted for the numerical models to ease the comparison. Generally, the dynamo problem is invariant with respect to a sign change in the magnetic field. Panel titles: GUFM 1990; E5R36 filtered; E6 filtered; E5R36 full res.; E6 full res.
Archeomagnetic models include the magnetic information preserved in recent lava flows and sediments. The widely used CALS7K.2 model by Korte et al. (2005) and Korte and Constable (2005) is a degree $l = 10$ model that is thought to be reliable up to degree $l = 4$ with a time resolution of about a century and reaches back seven millennia. Newer models from the same family cover the last ten millennia (Korte et al. 2011). Figure 15 shows the respective time-averaged CALS7K.2 CMB field in comparison with time-averaged numerical solutions that we discuss further below. Paleomagnetic data reaching further back in time have neither the spatial nor the temporal resolution and precision to construct faithful models for specific epochs.
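The normalization of Eq. (26) amounts to simple bookkeeping, which the following minimal Python sketch illustrates. The dictionary layout `energy[l][m]` and all numerical values are hypothetical and serve only to show the procedure; they are not taken from gufm1 or from any of the simulations discussed here.

```python
def normalized_spectrum(energy):
    """Normalize the energy per degree l by the total dipole (l = 1) energy.

    energy[l][m] is assumed to hold the magnetic energy of the mode with
    degree l and order m (0 <= m <= l) at the outer boundary.
    """
    dipole_energy = sum(energy[1])  # sum over m = 0 and m = 1
    return {l: sum(modes) / dipole_energy for l, modes in energy.items()}


# Hypothetical mode energies in arbitrary units (illustration only).
energy = {
    1: [8.0, 2.0],
    2: [0.5, 0.25, 0.25],
    3: [0.25, 0.125, 0.0625, 0.0625],
}
spectrum = normalized_spectrum(energy)
for degree, value in sorted(spectrum.items()):
    print(degree, value)
```

By construction the dipole entry of the normalized spectrum equals one, so the remaining entries directly express the energy of each degree relative to the dipole.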
Fig. 14 Time-averaged normalized magnetic energy spectrum at the outer boundary for gufm1 (black), the $E = 3 \times 10^{-4}$ model E4R106 (green), and the $E = 3 \times 10^{-5}$ models E5R36 (blue) and E5R48 (red), which belong to regimes D and M, respectively. Colored bars in the width of the standard deviation indicate the time variability. Axes: normalized magnetic energy (logarithmic, $10^{-3}$ to $10^{1}$) versus spherical harmonic degree $l$ (0 to 14)
Statistical interpretations are thus the norm and provide time-averaged fields (TAF) and mean paleosecular variation, for example, in the form of VGP scatter (see below). In the following, we assess the similarity of the geomagnetic field and simulation results in the context of some key issues that have been discussed more extensively in recent years.
Dipole Properties and Magnetic Field Symmetries
In the linearized system describing the onset of convection, solutions with different equatorial symmetry and azimuthal symmetry decouple. In terms of spherical harmonics, azimuthal symmetries are related to different orders $m$, while equatorially symmetric and antisymmetric solutions are described by harmonics where the sum of degree $l$ and order $m$ is even or odd, respectively. Equatorially symmetric flows are preferred in rotation-dominated systems and are thus excited at lower Rayleigh numbers than antisymmetric contributions. The convective columns described above reach right through the shell and are evidently symmetric with respect to the equator. Equatorially symmetric flows support dynamos with either purely equatorially antisymmetric or purely symmetric magnetic fields, which in the dynamo context are called the dipolar and quadrupolar families, respectively (Hulot and Bouligand 2005). These names refer to the primary terms in the families, the axial dipole $(l = 1, m = 0)$ or the axial quadrupole $(l = 2, m = 0)$. Both are coupled by equatorially antisymmetric flows, which can transfer energy from one family to the other.
Fig. 15 Time-averaged fields for the CALS7K.2 archeomagnetic model by Korte and Constable (2005) and for the numerical models E5R18 (left) and E5R36 (right). Time averages over intervals corresponding to 500, 5,000, and 50,000 years are shown in the top, middle, and bottom panels, respectively
The dipolar family is clearly preferred in dynamo solutions at lower Rayleigh numbers, which correspond to regime D. However, a few exceptions have been reported for large Ekman numbers (Aubert and Wicht 2004) or stress-free outer boundaries. Dynamos with perfect azimuthal and equatorial symmetry are found close to the onset of convection where the underlying flow would still retain perfect symmetries in the nonmagnetic case. The respective models have been marked by squares in Fig. 5. These solutions obey a very simple time dependence: a drift of the whole pattern in azimuth. When the Rayleigh number and the magnetic Prandtl number are increased, the azimuthal symmetry is broken first and then the
equatorial symmetry. This goes along with a change in time dependence from drifting to chaotic, sometimes via an oscillatory behavior (Wicht 2002; Wicht and Christensen 2010) (see Figs. 5 and 16). We quantify the field geometry and symmetry with four time-averaged measures here. The dipole contribution is characterized by time averages of its relative importance at $r_o$, i.e., by the dipolarity
$$d = \frac{\left( \sum_{m=0}^{m=1} E_m(l = 1, m, r_o) \right)^{1/2}}{\left( \sum_{l=1}^{l=11} \sum_{m=0}^{m=l} E_m(l, m, r_o) \right)^{1/2}}, \tag{27}$$
and by the mean dipole tilt $\Theta$, the minimum angle between the magnetic dipole axis and the rotation axis. The symmetry properties of the non-dipolar harmonics are quantified by the relative strength of the equatorially symmetric contributions, i.e., by the equatorial symmetry measure
$$e = \frac{\left( \sum_{l=2}^{l=11} \sum_{m=0,\; l+m\ \mathrm{even}}^{m=l} E_m(l, m, r_o) \right)^{1/2}}{\left( \sum_{l=2}^{l=11} \sum_{m=0}^{m=l} E_m(l, m, r_o) \right)^{1/2}}, \tag{28}$$
and the relative strength of the non-axially symmetric contributions, i.e., by the axial symmetry measure
$$a = \frac{\left( \sum_{l=2}^{l=11} \sum_{m=1}^{m=l} E_m(l, m > 0, r_o) \right)^{1/2}}{\left( \sum_{l=2}^{l=11} \sum_{m=0}^{m=l} E_m(l, m, r_o) \right)^{1/2}}. \tag{29}$$
These four measures are restricted to degrees $1 < l \le 11$ to facilitate a comparison with the gufm1 model, which is heavily damped for higher degrees. The dipole contributions would dominate both measures $e$ and $a$ and are therefore excluded. Dynamos where all non-dipolar modes statistically contain the same energy yield $e_s = 0.73$ and $a_s = 0.93$. Figure 16 shows the dependence of the four time-averaged measures on the local Rossby number $Ro_\ell$ for several dynamo models. The models span Ekman numbers from $E = 10^{-3}$ to $E = 3 \times 10^{-5}$ and magnetic Prandtl numbers between $Pm = 1$ and $Pm = 20$ (see Table 1). Bottom-driven cases with constant codensity boundary conditions are considered as well as cases where the compositional flux is forced to zero at the outer boundary. The former models can be considered as purely thermally driven, while the latter mimic compositional driving. For each model, $Ro_\ell$ is varied by changing the Rayleigh number. The gufm1 model is represented by the gray blocks in Fig. 16. Their vertical extension corresponds to minimum and maximum values within the represented 400 year period, and the horizontal extension indicates uncertainties in $Ro_\ell$ (Christensen and Aubert 2006). Note that Earth’s local Rossby number is based on the scaling derived by Christensen and Aubert (2006) and not on direct measurements (see also the discussion in Sect. 3.2). Table 1 lists the four measures for selected cases. Figure 16 demonstrates that the dipolarity generally increases with decreasing Ekman number while the mean tilt becomes smaller. This can be attributed to the growing influence of rotation at smaller $E$ values; the associated increase in
“geostrophic” flow correlation promotes the production of axial dipole field. This is counteracted by the increased influence of inertial forces at larger Rayleigh numbers. The strong dipolarity of gufm1 and its sizable mean tilt around 10° can only be reached at lower Ekman numbers combined with larger Rayleigh numbers corresponding to $Ro_\ell$ values around 0.1. The comparison of the three different models at $E = 10^{-3}$ suggests that neither the magnetic Prandtl number ($Pm = 10$ and $Pm = 20$) nor the thermal outer boundary condition plays an important role (see also Wicht et al. 2009).
Fig. 16 Dipole properties and symmetry properties of non-dipolar contributions for several dynamo models and gufm1 (gray boxes). Different symbols code the time dependence as explained in the caption of Fig. 5. Panel (a) shows the time-averaged dipolarity $d$ and panel (b) shows the time-averaged dipole tilt. Panels (c) and (d) display the time-averaged equatorial symmetry measure $e$ and the time-averaged axial symmetry measure $a$, respectively. Blue and red, models from Fig. 5 with $Pm = 10$ and $Pm = 20$, respectively; yellow, identical parameters to the blue model but with chemical boundary conditions; green, $E = 3 \times 10^{-4}$, $Pm = 3$, chemical boundary conditions; black, $E = 3 \times 10^{-5}$, $Pm = 1$, fixed temperature conditions. The Prandtl number is unity in all cases. Colored bars in the width of the standard deviation indicate the time variability in panels (c) and (d). The standard deviation amounts to only a few percent in $d$ (panel a) and is of the same order as the tilt itself (panel b). All panels are plotted against the local Rossby number $Ro_\ell$ (0 to 0.25); the labels D, E, and M appear in each panel
The equatorial and axial symmetry measures shown in panels (c) and (d) of Fig. 16 provide a less conclusive picture. There is a trend that equatorially symmetric and non-axial contributions become more important with growing local Rossby number. Generally, dynamo simulations prefer equatorially antisymmetric and axisymmetric modes in regimes D and E since $a$ and $e$ lie below the statistical values $a_s$ and $e_s$ marked by thick horizontal lines in Fig. 16. Gufm1 also shows a tendency towards equatorially antisymmetric modes, which is even more pronounced in paleomagnetic data (Christensen et al. 2010). Figure 16 demonstrates that dynamo simulations are capable of reaching geodynamo-like equatorial and axial symmetries provided the Rayleigh number is large enough. The simulations at $E = 3 \times 10^{-5}$, however, show some exceptions. The lowest $Ro_\ell$ case is close to the onset of dynamo action and still shows perfect equatorial antisymmetry, $e = 0$. For such fields, the statistical $a$ value decreases to $a_s = 0.63$, which explains why the solution is also particularly axisymmetric. Why three cases in regime D at intermediate $Ro_\ell$ also show a distinct preference towards axisymmetric magnetic field configurations remains unknown. They seem incompatible with the gufm1 data in this respect. Larger Rayleigh numbers should help to bring the models in line with geomagnetic values.
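The measures of Eqs. (27) to (29) and the statistical reference values $e_s$ and $a_s$ can be checked with a short Python sketch. It assumes that the energy is stored as one entry per pair $(l, m)$ with $0 \le m \le l$; this storage convention and the function name are our own choices for illustration, not a description of how the values in the text were obtained. With every mode carrying the same energy, the sketch reproduces the quoted values 0.73 and 0.93 to two decimals.

```python
import math

L_MAX = 11  # truncation degree used in Eqs. (27)-(29)


def symmetry_measures(energy):
    """Return (d, e, a) for energy[l][m], with 1 <= l <= L_MAX and 0 <= m <= l."""
    degrees = range(2, L_MAX + 1)  # non-dipolar degrees used for e and a
    dipole = sum(energy[1][m] for m in range(2))
    total = sum(energy[l][m] for l in range(1, L_MAX + 1) for m in range(l + 1))
    non_dipole = sum(energy[l][m] for l in degrees for m in range(l + 1))
    # Equatorially symmetric modes have l + m even.
    eq_symmetric = sum(energy[l][m] for l in degrees
                       for m in range(l + 1) if (l + m) % 2 == 0)
    # Non-axisymmetric modes have m > 0.
    non_axial = sum(energy[l][m] for l in degrees for m in range(1, l + 1))
    d = math.sqrt(dipole / total)
    e = math.sqrt(eq_symmetric / non_dipole)
    a = math.sqrt(non_axial / non_dipole)
    return d, e, a


# Hypothetical test case: every mode (l, m) carries the same energy.
uniform = {l: [1.0] * (l + 1) for l in range(1, L_MAX + 1)}
d_s, e_s, a_s = symmetry_measures(uniform)
print(f"e_s = {e_s:.2f}, a_s = {a_s:.2f}")  # prints "e_s = 0.73, a_s = 0.93"
```

In this counting, 40 of the 75 non-dipolar modes are equatorially symmetric and 65 are non-axisymmetric, which gives $(40/75)^{1/2}$ and $(65/75)^{1/2}$.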
Note that the degree of dipole dominance and the mean tilt are somewhat better constrained by additional archeomagnetic and paleomagnetic information than the equatorial and axial symmetry (Hulot et al. 2010). In a similar study, Christensen et al. (2010) confirm that even models at relatively large Ekman numbers can be surprisingly Earth-like, provided the Rayleigh number is large enough to yield sufficient complexity. They report that the magnetic Ekman number has to be smaller than $10^{-4}$ and that the magnetic Reynolds number should reach at least about 200, with larger values being required at lower magnetic Ekman numbers. This implies that increasing the magnetic Prandtl number offers an alternative and numerically more affordable path towards more Earth-like models than decreasing the Ekman number. However, this path will never lead from nonreversing to reversing dynamos since the magnetic Prandtl number seems to have no influence on the time variability (see Fig. 5). The above analysis demonstrates that dipole tilt and equatorial and axial symmetry do not strongly constrain the dynamo models, which all more or less comply with geomagnetic values. When sufficient temporal complexity is required, however, Ekman numbers smaller than, say, $E = 10^{-4}$ are necessary to also keep the dipolarity on an Earth-like level.
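The role of the magnetic Prandtl number in this argument can be made explicit with a short worked relation. The sketch below assumes the common definitions $E = \nu / (\Omega d^2)$, $Pm = \nu / \eta$, $Re = U d / \nu$, and $Rm = U d / \eta$, with kinematic viscosity $\nu$, magnetic diffusivity $\eta$, shell thickness $d$, and a typical flow speed $U$. The symbols $E_\eta$ and $\eta$ are our notation and may differ from the one used elsewhere in this chapter.

```latex
E_\eta = \frac{\eta}{\Omega d^2}
       = \frac{\nu}{\Omega d^2}\,\frac{\eta}{\nu}
       = \frac{E}{Pm},
\qquad
Rm = \frac{U d}{\eta}
   = \frac{U d}{\nu}\,\frac{\nu}{\eta}
   = Re\,Pm .
% Example with parameters quoted in the text:
% E = 10^{-3} and Pm = 20 give E_\eta = 5 \times 10^{-5} < 10^{-4}.
```

Under these assumptions, and if the flow amplitude were unaffected, raising $Pm$ at fixed $E$ lowers the magnetic Ekman number and raises the magnetic Reynolds number at the same time.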
Persistent Features and Mantle Influence
Strong normal polarity flux concentrations at higher latitudes are a common feature in many dynamo simulations. Very similar flux lobes can be found in the historic geomagnetic field, two in the northern and two in the southern hemisphere, and seem to have changed only little over the last four centuries (Gubbins et al. 2007). The flux lobes are also present in archeomagnetic models spanning the last ten millennia but show some variability on this time scale (Korte and Holme 2010; Amit et al. 2010b, 2011; Korte et al. 2011). Even some paleomagnetic TAF models covering 5 Myr report persistent high-latitude flux lobes at similar locations (Gubbins and Kelly 1993; Johnson and Constable 1995; Kelly and Gubbins 1997; Carlut and Courtillot 1998; Johnson et al. 2003). Their position and symmetry with respect to the equator suggest that they are caused by the inflow into convective cyclones as discussed in Sect. 3.5. Figure 13 demonstrates that the seemingly large flux lobes in the filtered field version are the expression of much smaller field concentrations caused by convective features of similar scale. Figure 17 shows a close-up of the convection around the patches in the dynamo model E5R43. The small-scale correlation of cyclonic features with strong magnetic field patches is evident. As the Ekman number decreases, the azimuthal scale of the convective columns shrinks, while the scale perpendicular to the rotation axis is much less affected. The columns become increasingly sheetlike, which translates into thinner magnetic patches that are stretched in the latitudinal direction. In the filtered version, the finer scale is lost and the action of several convective structures appears as one larger magnetic flux lobe.
Fig. 17 Panel (a) illustrates the complex sheetlike cyclones (red) and anticyclones (blue) in model E5R43 with isosurfaces of the z-vorticity. Panels (b) and (c) zoom in on part of the northern hemisphere and illustrate how the normal polarity magnetic field is concentrated by the flow that converges into cyclones. The normal polarity field is radially outward in the northern hemisphere, indicated by red field lines and the red radial field contours in panel (b)
Since the azimuthal symmetry is not broken in the dynamo models presented above, longitudinal features average out over time. Figure 15 demonstrates that the historical time span is too short to yield this effect. Even time spans comparable to the period covered by archeomagnetic models may still retain azimuthal structures. The flux lobes still appear at similar locations as in the snapshots (see Fig. 13) or the historic averages. The solution finally becomes nearly axisymmetric when averaging over periods corresponding to 50,000 years. This suggests that the persistence over historic or archeomagnetic time scales is nothing special. The persistent azimuthal structures in some 5 Myr TAF models, however, can only be retained when the azimuthal symmetry is broken. The preferred theory is an influence of the thermal structure of the lower mantle on the dynamo process. It has already been mentioned above that the mantle determines the heat flow out of the dynamo region. Lateral temperature differences in the lower mantle translate into lateral variations in the CMB heat flux since the core-mantle boundary is isothermal in comparison. The flux is higher where the mantle is colder than average and vice versa. The respective pattern is typically deduced from seismic tomography models that are interpreted in terms of temperature differences, yielding the so-called tomographic heat flux models (Glatzmaier et al. 1999). Several authors have explored the potential influence of lateral CMB heat flux variations on the dynamo process (Glatzmaier et al. 1999; Olson and Christensen 2002; Christensen and Olson 2003; Amit and Olson 2006; Gubbins et al. 2007; Willis et al. 2007; Aubert et al. 2007, 2008a; Takahashi et al. 2008b; Sreenivasan 2009; Amit and Choblet 2009).
Many confirm that these variations indeed yield persistent azimuthal features at locations similar to those in the geomagnetic field. The time scale over which the lobes become quasi-stationary depends on the model parameters. Willis et al. (2007) find that rather low Rayleigh numbers and strong heat flux variations, of the same order as the spherically symmetric total heat flow, are required to lock the patches on a historic time scale. Amit et al. (2010a) report rather time-dependent flux lobe locations similar to those observed in the archeomagnetic data for models run at higher Rayleigh numbers. According to Olson and Christensen (2002), rather long averaging times on the order of 100 kyr may then be needed to clearly reveal the mantle-imposed heat flux structure. The inhomogeneous CMB heat flux may also help to explain some other interesting geophysical features. The non-axisymmetric structures in paleomagnetic TAF models remain debatable, but the axisymmetric properties are much better constrained. Typically, the axial quadrupole amounts to a few percent of the axial dipole, while the axial octupole is of similar order but of opposite sign (Johnson 2007). In dynamo simulations, the axial octupole contribution is typically too large. The axial quadrupole contribution, on the other hand, remains too low unless the equatorial symmetry is broken by an inhomogeneous CMB heat flux condition. The required degree of north/south heat flux asymmetry, however, is somewhat larger than what the typical “tomographic” models suggest (Olson and Christensen 2002). The direct translation of seismic velocity differences into temperature differences and subsequently heat flux patterns may oversimplify matters here since compositional variations may also have to be considered (Amit and Choblet 2009, 2012). In the historic magnetic field, the secular variation is clearly lower in the Pacific than in the Atlantic hemisphere (Jackson et al. 2000). Christensen and Olson (2003) demonstrate that using a tomographic CMB heat flux can promote such an asymmetry in the numerical simulations. It may also help to better model other features of the secular variation pattern at Earth’s CMB (Amit and Olson 2006; Aubert et al. 2007; Amit et al. 2008). Aubert et al. (2008a) invoke the inhomogeneous cooling of the inner core by a quasi-persistent cyclone reaching deep into the core to explain seismic inhomogeneities in the outer 100 km of the inner core. Note, however, that other authors promote an inhomogeneous inner-core growth caused by a specific mode of inner-core convection to explain this feature (Alboussière et al. 2010; Monnereau et al. 2010). The preference for magnetic poles to follow two distinct latitude bands during reversals (Gubbins and Love 1998) can also be explained with a tomographic CMB heat flux model (Coe et al. 2000; Kutzner and Christensen 2004). Finally, the CMB heat flux pattern has been shown to influence the reversal likelihood in dynamo simulations and may thus explain the observed changes in the geomagnetic reversal rate (Glatzmaier et al. 1999; Kutzner and Christensen 2004).
Inverse Magnetic Field Production and the Cause for Reversals
The production of strong inverse magnetic field on both sides of the equatorial plane is an inherent feature of the fundamental dynamo mechanism outlined in Sect. 3.5. The associated pairwise radial field patches in the outer boundary field are therefore found in many dynamo simulations. Wicht et al. (2009) and Takahashi et al. (2008a), however, report that the patches become less pronounced at lower Ekman numbers (see also Sakuraba and Roberts 2009). Model E6 in Fig. 13 and model E5R18 in Fig. 15 demonstrate that the patches retreat to a small band around the equator which becomes indiscernible in the filtered field. Normal polarity patches now rule at mid- to lower latitudes, but the region around the equator generally shows rather weak field of normal polarity, which is compatible with the strong dipole component in these models. More pronounced inverse patches reappear when the Rayleigh number is increased, as is demonstrated by E5R36 in Fig. 13. The low-latitude field remains only weakly inverse when averaged over long time spans (see Fig. 15). The time-averaged archeomagnetic model CALS7K.2 (see Fig. 15) points towards a weakly normal field, but the lack of resolution may be an issue here. The larger Rayleigh number in model E5R36 also promotes a stronger breaking of the equatorial symmetry, destroying the pairwise nature of the patches. This contributes to the convincing similarity between the E5R36 solution and the historic geomagnetic field where strong and equatorially asymmetric normal polarity patches seem typical at lower latitudes (Jackson 2003; Jackson and Finlay 2007) (see Fig. 13).
The weak magnetic field in the polar regions can give way to inverse field at larger Rayleigh numbers, a feature also found in the historic geomagnetic field. This inverse field is produced by plumelike convection that rises inside the tangent cylinder, and the typical associated magnetic structures have been named magnetic upwellings (MUs) by Aubert et al. (2008b). MUs may also rise at lower to midlatitudes when the equatorial symmetry is broken to a degree where one leg of the equatorial inverse field production clearly dominates. The symmetry breaking is essential since the inverse fields created on both sides of the equator would cancel otherwise. The MUs may trigger magnetic field reversals and excursions when producing enough inverse field to efficiently cancel the prevailing dipole field (Wicht and Olson 2004; Aubert et al. 2008b). This basically resets the polarity, and small field fluctuations then decide whether axial dipole field of one or the other direction is amplified after the MUs have ceased (Aubert et al. 2008b; Wicht et al. 2009). MUs are a common feature in larger Rayleigh number simulations and vary stochastically in strength, number, and duration. Wicht et al. (2009) suggest that particularly strong or long-lasting MUs are required to trigger reversals. Alternatively, several MUs may constructively team up. Both scenarios remain unlikely as long as the Rayleigh number is not too large, which explains why reversals are rare in regime E (see Figs. 5 and 16).
Time Variability
The internal geomagnetic field exhibits a very rich time variability, from short-term variations on the yearly time scale, the geomagnetic jerks, to variations in the reversal frequency on the order of several tens of million years (Hulot et al. 2010). Dynamo simulations are capable of replicating many aspects of the time variability (Olson et al. 2012), but the relative time scales of the different phenomena may differ from the geomagnetic situation depending on the model parameters. Christensen and Tilgner (2004) analyzed a suite of dynamo simulations to further elucidate how the typical secular variation time scale $\tau_l$ depends on the degree $l$ of the magnetic component. They find the inverse relationship
$$\tau_l = 5.2\, \tau_u\, l^{-1}, \tag{30}$$
which is compatible with the idea that the magnetic field is advected by the large-scale flow and agrees well with geomagnetic findings (Hongre et al. 1998; Olsen et al. 2006; Lhuillier et al. 2011). For Earth, $\tau_u$ amounts to approximately two centuries.
Torsional oscillations (TOs) are a specific form of short-term variations on the decadal time scale (Braginsky 1970). They concern the motion of so-called geostrophic cylinders that are aligned with the planetary rotation axis. TOs are essentially one-dimensional Alfvén waves that travel along the magnetic field lines connecting the cylinders. The field lines act like torsional springs that react to any relative acceleration of the cylinders with respect to each other. TOs have the correct time scale for explaining the decadal variations in Earth’s length of day, which are typically attributed to an exchange of angular momentum between Earth’s core and mantle (Jault et al. 1988; Jackson 1997; Jault 2003; Amit and Olson 2006). TOs have reportedly been identified in the geomagnetic secular variation signal (Zatman and Bloxham 1997; Gillet et al. 2010), are considered a possible origin for the geomagnetic jerks (Bloxham et al. 2002), and may be responsible for the length-of-day changes via coupling to Earth’s mantle (Gillet et al. 2010). The smallness of inertial and viscous forces in the magnetostrophic force balance is a prerequisite for torsional oscillations to become an important part of the short-term dynamics. Coriolis and pressure forces do not contribute to the integrated azimuthal force on geostrophic cylinders for geometrical reasons. This leaves the azimuthal Lorentz forces as the only constituent in the first-order balance. Taylor (1963) therefore conjectured that the dynamo would assume a configuration where the azimuthal Lorentz forces cancel along the cylinders until the integrated force can be balanced by viscous or inertial effects. In recognition of Taylor’s pioneering work, dynamos where the normalized integrated Lorentz force
$$T(s) = \frac{\int_{C(s)} \hat{\boldsymbol{\phi}} \cdot \left( (\nabla \times \mathbf{B}) \times \mathbf{B} \right) dF}{\int_{C(s)} \left| \hat{\boldsymbol{\phi}} \cdot \left( (\nabla \times \mathbf{B}) \times \mathbf{B} \right) \right| dF} \tag{31}$$
is small are said to obey a Taylor state. Here, C .s/ is the geostrophic cylinder of radius s and dF is a respective surface element. Torsional oscillations are faster disturbances that ride on the background Taylor state which is established on the turnover time scale. Wicht and Christensen (2010) find torsional oscillations in dynamo simulations for Ekman numbers of E D 3 105 or smaller and for relatively low Rayleigh numbers where inertial forces remain secondary. Figure 18 shows the traveling waves and the T .s/ in their model E6 at E D 3 106 where TOs can be identified most clearly (see Table 1 for further model parameters and properties). The Alfvén Mach number (see Table 1) provides the ratio of the flow time scale to the Alfvén time scale characteristic for TOs. Alfvén waves travel more than an order of magnitude faster than the typical convective flow speed in Earth’s core. In the low Ekman number case where TOs have actually been identified, they were only roughly twice as fast as the flow, and larger ratios can only be expected when the Ekman number is further decreased below E D 3 106 (Wicht and Christensen 2010). A recent analysis by Christensen et al. (2012) indicates that dynamo simulation may nevertheless be capable of recovering the decadal variations found in newest satellite base geomagnetic field models. This could mean that torsional oscillations are actually not an important part of the decadal variations. The slowest magnetic time scales are associated with field reversals and variations in the reversal rate (Olson et al. 2012). Geomagnetic reversals typically last some thousand years, while simulated reversals seem to take somewhat longer (Wicht 2005; Wicht et al. 2009). Figure 6 shows an example for a reversal sequence in a numerical simulation. The duration of geomagnetic and simulated reversals
Numerical Dynamo Simulations: From Basic Concepts to Realistic Models
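[Figure 18. Panel (a): cylinder radius $s$ (0.6 to 1.4) versus time (0 to 5, in units of $10^{-3}$). Panel (b): cylinder radius $s$ (0.6 to 1.4) versus time (in units of $10^{-3}$), with a logarithmic color scale from $10^{-2}$ to $10^{0}$.]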
Fig. 18 Panel (a) shows the speed of geostrophic cylinders outside the tangent cylinder for a selected time span in model E6 (Wicht and Christensen 2010). The time is given in units of the magnetic diffusion time here. Torsional oscillations travel from the inner-core boundary (bottom) towards the outer core boundary (top) with the predicted Alfvén velocity (white lines). The agreement provides an important clue that the propagating features are indeed torsional oscillations. Panel (b) shows the normalized integrated Lorentz force $T(s)$, which can reach values down to $10^{-2}$ where the Taylor state is assumed to a good degree. The Taylor state is broken during the torsional oscillations
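The normalized force $T(s)$ of Eq. (31), as plotted in panel (b), can be approximated numerically once the azimuthal Lorentz force has been sampled on a geostrophic cylinder. The following is a minimal sketch under stated assumptions, not the procedure used by Wicht and Christensen (2010): the function names `taylor_measure` and `cylinder_elements`, the midpoint-rule surface elements, and the synthetic test fields are all hypothetical illustrations.

```python
import numpy as np

def cylinder_elements(s, r_o, n_phi, n_z):
    """Midpoint-rule grid on the geostrophic cylinder of radius s.

    Assumption: the cylinder lies outside the tangent cylinder and spans
    the full height between its intersections with the outer boundary
    of radius r_o. Returns azimuth, height, and area dF = s dphi dz.
    """
    half_height = np.sqrt(r_o**2 - s**2)
    dz = 2.0 * half_height / n_z
    dphi = 2.0 * np.pi / n_phi
    z = -half_height + (np.arange(n_z) + 0.5) * dz
    phi = (np.arange(n_phi) + 0.5) * dphi
    phi_grid, z_grid = np.meshgrid(phi, z, indexing="ij")
    dF = np.full(phi_grid.shape, s * dphi * dz)
    return phi_grid, z_grid, dF

def taylor_measure(f_phi, dF):
    """Discrete version of Eq. (31).

    f_phi holds samples of the azimuthal Lorentz force component on the
    cylinder, dF the matching surface elements. The result is signed and
    lies in [-1, 1]; its magnitude is small in a Taylor state.
    """
    f_phi = np.asarray(f_phi, dtype=float)
    dF = np.asarray(dF, dtype=float)
    numerator = np.sum(f_phi * dF)
    denominator = np.sum(np.abs(f_phi) * dF)
    return numerator / denominator if denominator > 0.0 else 0.0

# Synthetic illustration (not simulation data)
phi_grid, z_grid, dF = cylinder_elements(s=1.0, r_o=1.5, n_phi=64, n_z=32)
t_cancelling = taylor_measure(np.sin(phi_grid), dF)      # force cancels along the cylinder
t_one_signed = taylor_measure(np.ones_like(phi_grid), dF)  # no cancellation at all
```

In this sketch a force distribution that cancels over the cylinder gives a value near zero, while a one-signed force gives a value of one, which mirrors the two limits between which the quantity in panel (b) varies.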
obeys a very similar latitudinal dependence (Clement 2004; Wicht 2005; Wicht et al. 2009) with shorter durations at the equator and a gradual increase towards the poles. During the last several Myr, the average reversal rate was about 4 per Myr, but there were also long periods in Earth’s history when the geodynamo stopped reversing. The most recent of these so-called superchrons happened in the Cretaceous and lasted for about 37 Myr. Whether the variations in reversal rate are caused by changes in Earth’s mantle (Glatzmaier et al. 1999; Constable 2000; Biggin et al. 2012) or are an expression of the internal stochastic nature of the dynamo process (Jonkers 2003; Ryan and Sarson 2007) is still debated. Generally, dynamo
J. Wicht et al.
simulations (Glatzmaier et al. 1999; Wicht et al. 2009) and even rather simple parameterized models seem capable of showing Earth-like variations in reversal frequency. A more thorough analysis of reversal properties and their stochastic nature in geodynamo simulations is still missing. Such an analysis seems impossible at lower Ekman numbers due to the excessive computing times required (Wicht et al. 2009). Paleomagnetists frequently interpret the local magnetic field provided by a paleomagnetic sample as being caused by a pure dipole contribution, which they call a virtual dipole. Consequently, the associated magnetic pole is referred to as a virtual geomagnetic pole (VGP). The scatter of the VGPs around their mean position is used to quantify the paleosecular variation in paleomagnetic field models. The scatter shows a typical latitudinal dependence, with low values around 12° at the equator rising to about 20° at the poles. Some dynamo simulations show a very similar variation (Kono and Roberts 2002; Wicht 2005; Christensen and Wicht 2007; Wicht et al. 2009). Christensen and Wicht (2007) demonstrate that the amplitude of the scatter depends on the Rayleigh number and seems somewhat high at Rayleigh numbers where field reversals happen. As for reversals, an analysis of the virtual dipole scatter is missing for lower Ekman number cases because of the long time spans required.
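The VGP construction described above can be sketched in a few lines. The example below assumes the standard geocentric-dipole relation $\tan I = 2 \cot p$ between inclination $I$ and magnetic colatitude $p$, and it measures scatter as the angular standard deviation about the mean pole with an $N-1$ normalization, which is one common convention rather than necessarily the one used in the studies cited. The function names `virtual_geomagnetic_pole` and `vgp_scatter` are hypothetical, and no polarity handling for reversed directions is included.

```python
import numpy as np

def virtual_geomagnetic_pole(site_lat, site_lon, dec, inc):
    """VGP latitude and longitude (degrees) for a local field direction.

    Inputs in degrees: site latitude, site longitude (positive east),
    declination, and inclination (positive downward).
    """
    lam_s, phi_s, D, I = np.radians([site_lat, site_lon, dec, inc])
    # Magnetic colatitude of the site from tan(I) = 2 cot(p), 0 <= p <= pi
    p = np.arctan2(2.0, np.tan(I))
    sin_lam_p = np.sin(lam_s) * np.cos(p) + np.cos(lam_s) * np.sin(p) * np.cos(D)
    lam_p = np.arcsin(np.clip(sin_lam_p, -1.0, 1.0))
    cos_lam_p = np.cos(lam_p)
    if cos_lam_p < 1e-12:
        beta = 0.0  # VGP sits on a geographic pole; its longitude is arbitrary
    else:
        beta = np.arcsin(np.clip(np.sin(p) * np.sin(D) / cos_lam_p, -1.0, 1.0))
    # Choose the longitude branch on the correct side of the site meridian
    if np.cos(p) >= np.sin(lam_s) * np.sin(lam_p):
        phi_p = phi_s + beta
    else:
        phi_p = phi_s + np.pi - beta
    return float(np.degrees(lam_p)), float(np.degrees(phi_p) % 360.0)

def vgp_scatter(lats, lons):
    """Angular standard deviation (degrees) of VGPs about their mean pole."""
    lat = np.radians(np.asarray(lats, dtype=float))
    lon = np.radians(np.asarray(lons, dtype=float))
    xyz = np.column_stack([np.cos(lat) * np.cos(lon),
                           np.cos(lat) * np.sin(lon),
                           np.sin(lat)])
    mean_pole = xyz.sum(axis=0)
    mean_pole /= np.linalg.norm(mean_pole)
    delta = np.arccos(np.clip(xyz @ mean_pole, -1.0, 1.0))
    return float(np.degrees(np.sqrt(np.sum(delta**2) / (len(delta) - 1))))

# A pure axial dipole seen from 45 degrees north maps back to the geographic pole
axial_lat, axial_lon = virtual_geomagnetic_pole(
    45.0, 10.0, 0.0, np.degrees(np.arctan(2.0)))
# A horizontal, eastward-pointing field at the equator puts the VGP on the equator
east_lat, east_lon = virtual_geomagnetic_pole(0.0, 0.0, 90.0, 0.0)
# Four synthetic VGPs, each 10 degrees away from the geographic pole
scatter = vgp_scatter([80.0] * 4, [0.0, 90.0, 180.0, 270.0])
```

Applied to a time series of simulated field directions at sites of different latitude, the same two steps would give the latitudinal dependence of the scatter that is compared with the paleomagnetic values quoted above.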
5 Conclusion
We have outlined the ingredients of modern numerical dynamo models that so successfully reproduce many features of the geomagnetic field. These models seem to correctly capture important aspects of the fundamental dynamics and very robustly produce dipole-dominated fields with Earth-like Elsasser numbers and magnetic Reynolds numbers over a wide range of parameters. Even Earth-like reversals are found when the Rayleigh number is chosen high enough to guarantee a sufficiently large impact of inertia. Simple scaling laws make it possible to directly connect simulations with Ekman numbers as large as $E = 10^{-3}$ to the geodynamo at $E = 10^{-15}$, to the dynamos of other planets in our solar system, and even to the dynamos in fast-rotating stars (Christensen and Aubert 2006; Olson and Christensen 2006; Takahashi et al. 2008a; Christensen et al. 2009). All this strongly suggests that the models get the basic dynamo process right. Lower Ekman number simulations certainly do a better job of capturing the small-scale turbulent flow in the dynamo regions. Is the associated excessive increase in numerical costs really warranted for modeling dynamo action? A closer comparison with the geomagnetic field suggests that some features indeed become more Earth-like. Reversing dynamos are inevitably not “dipolar” enough at Ekman numbers $E \ge 3 \times 10^{-4}$. At $E = 3 \times 10^{-5}$, however, the relative importance of the dipole contribution remains compatible with the historic magnetic field even at Rayleigh numbers where reversals are expected. Torsional oscillations, which may play an important role in the dynamics on decadal time scales, only start to become
significant at Ekman numbers $E \le 3 \times 10^{-5}$ (Wicht et al. 2009). Their time scale still remains too slow at $E = 3 \times 10^{-6}$, and a further decrease in Ekman number will likely improve matters here. Low Ekman number simulations therefore seem in order to faithfully model the dynamics on decadal time scales, where satellite data continue to provide the most reliable geomagnetic field models. On the other hand, even dynamo simulations around $E = 10^{-5}$ already show a clear decadal variation signal (Christensen et al. 2012). Recently, Aubert (2013) used a data assimilation approach to demonstrate that even simpler dynamos at larger Ekman numbers are generally capable of reproducing the geomagnetic secular variation signal from the satellite era. Several published low Ekman number simulations have rather simple magnetic fields with much smaller dipole tilts than typical for Earth, little field complexity in the equatorial region, and a field that is too symmetric. The time variability is likely not very Earth-like either. The reason is that the authors concentrated on decreasing the Ekman number while keeping the Rayleigh number moderate. Simulations with larger Rayleigh numbers that yield magnetic Reynolds numbers of at least a few hundred generally produce more Earth-like fields, even at larger Ekman numbers. The drastic regime change which marks the transition from the weak field branch to the strong field branch in Cartesian magnetoconvection problems has never been found in spherical dynamo simulations. Recent simulations have nevertheless shown that the flow length scale clearly increases in the presence of a strong dipolar magnetic field, in particular when internal heating plays an important role in driving the dynamo (Sakuraba and Roberts 2009; Hori et al. 2012). The effect may further increase at lower Ekman numbers. The reason for the non-dipolar field and particular convection pattern found by Kageyama et al. 
(2008) at $E = 10^{-6}$ remains to be clarified but may have to do with the different model setup. Dynamos at Ekman numbers $E \ge 10^{-5}$ will continue to be the focus of numerical dynamo simulations that aim at understanding the fundamental dynamics and mechanisms. For exploring the long-term behavior, relatively large Ekman numbers will remain a necessity. These should be supplemented by increased efforts to explore lower Ekman number models, which show interesting distinct features that remain little understood. The use of boundary conditions that implement the laterally varying heat fluxes at Earth’s core-mantle boundary is another example of how dynamo modelers try to improve their numerical codes. The heat flux can influence the reversal behavior (Glatzmaier et al. 1999), helps to model some aspects of the long-term geomagnetic field and its secular variation, and even has the potential to explain the seismic anisotropy of Earth’s inner core (Aubert et al. 2008a). Recently revised estimates of the thermal and magnetic diffusivities indicate that both may be around three times higher than previously anticipated (Pozzo et al. 2012). The higher thermal diffusivity means that more heat can be conducted down the adiabat and is not available for driving the dynamo. The heat flux through Earth’s core-mantle boundary may actually be subadiabatic today. Compositional
convection and radioactive heating based on potassium may come to the rescue here and will form a focus of future geodynamo research. Modern dynamo simulations have already proven useful for exploring the dynamics of planetary interiors. Model refinements and increasing numerical power will further increase their applicability in the future. Several space missions promise to map the interior magnetic fields of Earth (ESA’s Swarm mission), Mercury (NASA’s MESSENGER and ESA’s BepiColombo missions), and Jupiter (NASA’s Juno mission) with unprecedented precision. High-end numerical dynamo simulations will be indispensable for translating these measurements into interior properties and dynamical processes.

Acknowledgements Johannes Wicht thanks Uli Christensen for useful discussions and hints.
References
Alboussière T, Deguen R, Melzani M (2010) Melting-induced stratification above the Earth’s inner core due to convective translation. Nature 466:744–747
Amit H, Aubert J, Hulot G (2010a) Stationary, oscillating or drifting geomagnetic flux patches? J Geophys Res 115:B07108
Amit H, Aubert J, Hulot G, Olson P (2008) A simple model for mantle-driven flow at the top of Earth’s core. Earth Planets Space 60:845–854
Amit H, Choblet G (2009) Mantle-driven geodynamo features – effects of post-perovskite phase transition. Earth Planets Space 61:1255–1268
Amit H, Choblet G (2012) Mantle-driven geodynamo features – effects of compositional and narrow D″ anomalies. Phys Earth Planet Inter 190:34–43
Amit H, Korte M, Aubert J, Constable C, Hulot G (2011) The time-dependence of intense archeomagnetic flux patches. J Geophys Res 116(B15):B12106
Amit H, Leonhardt R, Wicht J (2010b) Polarity reversals from paleomagnetic observations and numerical dynamo simulations. Space Sci Rev 155:293–335
Amit H, Olson P (2006) Time-average and time-dependent parts of core flow. Phys Earth Planet Inter 155:120–139
Amit H, Olson P (2008) Geomagnetic dipole tilt changes induced by core flow. Phys Earth Planet Inter 166:226–238
Aubert J (2013) Flow throughout the Earth’s core inverted from geomagnetic observations and numerical dynamo models. Geophys J Int 192:537–556
Aubert J, Amit H, Hulot G (2007) Detecting thermal boundary control in surface flows from numerical dynamos. Phys Earth Planet Inter 160:143–156
Aubert J, Amit H, Hulot G, Olson P (2008a) Thermochemical flows couple the Earth’s inner core growth to mantle heterogeneity. Nature 454:758–761
Aubert J, Aurnou J, Wicht J (2008b) The magnetic structure of convection-driven numerical dynamos. Geophys J Int 172:945–956
Aubert J, Labrosse S, Poitou C (2009) Modelling the paleo-evolution of the geodynamo. Geophys J Int 179:1414–1429
Aubert J, Wicht J (2004) Axial versus equatorial dynamo models with implications for planetary magnetic fields. Earth Planet Sci Lett 221:409–419
Biggin AJ, Steinberger B, Aubert J et al (2012) Possible links between long-term geomagnetic variations and whole-mantle convection processes. Nat Geosci 5:674
Bloxham J, Zatman S, Dumberry M (2002) The origin of geomagnetic jerks. Nature 420:65–68
Braginsky S (1970) Torsional magnetohydrodynamic vibrations in the Earth’s core and variation in day length. Geomag Aeron 10:1–8
Braginsky S, Roberts P (1995) Equations governing convection in Earth’s core and the geodynamo. Geophys Astrophys Fluid Dyn 79:1–97
Breuer M, Manglik A, Wicht J et al (2010) Thermochemically driven convection in a rotating spherical shell. Geophys J Int 183:150–162
Breuer M, Wesseling S, Schmalzl J, Hansen U (2002) Effect of inertia in Rayleigh-Bénard convection. Phys Rev E 69:026320/1–10
Bullard EC, Gellman H (1954) Homogeneous dynamos and terrestrial magnetism. Proc R Soc Lond A 247:213–278
Busse FH, Simitev R (2005a) Convection in rotating spherical fluid shells and its dynamo states. In: Soward AM, Jones CA, Hughes DW, Weiss NO (eds) Fluid dynamics and dynamos in astrophysics and geophysics. CRC Press, Boca Raton, pp 359–392
Busse FH, Simitev R (2005b) Dynamos driven by convection in rotating spherical shells. Astron Nachr 326:231–240
Carlut J, Courtillot V (1998) How complex is the time-averaged geomagnetic field over the past 5 Myr? Geophys J Int 134:527–544
Chan K, Li L, Liao X (2006) Modelling the core convection using finite element and finite difference methods. Phys Earth Planet Inter 157:124–138
Chandrasekhar S (1961) Hydrodynamic and hydromagnetic stability. Clarendon Press, Oxford
Christensen UR (2002) Zonal flow driven by strongly supercritical convection in rotating spherical shells. J Fluid Mech 470:115–133
Christensen UR (2006) A deep dynamo generating Mercury’s magnetic field. Nature 444:1056–1058
Christensen UR (2010) Accepted for publication at Space Sci Rev
Christensen U, Aubert J (2006) Scaling properties of convection-driven dynamos in rotating spherical shells and applications to planetary magnetic fields. Geophys J Int 166:97–114
Christensen UR, Aubert J, Busse FH et al (2001) A numerical dynamo benchmark. Phys Earth Planet Inter 128:25–34
Christensen UR, Aubert J, Hulot G (2010) Conditions for Earth-like geodynamo models. Earth Planet Sci Lett 296:487–496
Christensen UR, Holzwarth V, Reiners A (2009) Energy flux determines magnetic field strength of planets and stars. Nature 457:167–169
Christensen U, Olson P (2003) Secular variation in numerical geodynamo models with lateral variations of boundary heat flow. Phys Earth Planet Inter 138:39–54
Christensen U, Olson P, Glatzmaier G (1999) Numerical modeling of the geodynamo: a systematic parameter study. Geophys J Int 138:393–409
Christensen U, Tilgner A (2004) Power requirement of the geodynamo from Ohmic losses in numerical and laboratory dynamos. Nature 429:169–171
Christensen U, Wicht J (2007) Numerical dynamo simulations. In: Olson P (ed) Core dynamics. Treatise on geophysics, vol 8. Elsevier, Amsterdam/Boston, pp 245–282
Christensen UR, Wardinski I, Lesur V (2012) Time scales of geomagnetic secular acceleration in satellite field models and geodynamo models. Geophys J Int 190:243–254
Clement B (2004) Dependency of the duration of geomagnetic polarity reversals on site latitude. Nature 428:637–640
Clune T, Eliott J, Miesch M, Toomre J, Glatzmaier G (1999) Computational aspects of a code to study rotating turbulent convection in spherical shells. Parallel Comput 25:361–380
Coe R, Hongre L, Glatzmaier A (2000) An examination of simulated geomagnetic reversals from a paleomagnetic perspective. Philos Trans R Soc Lond A 358:1141–1170
Constable C (2000) On the rate of occurrence of geomagnetic reversals. Phys Earth Planet Inter 118:181–193
Cowling T (1957) The dynamo maintenance of steady magnetic fields. Q J Mech Appl Math 10:129–136
Dormy E, Cardin P, Jault D (1998) MHD flow in a slightly differentially rotating spherical shell, with conducting inner core, in a dipolar magnetic field. Earth Planet Sci Lett 158:15–24
Fearn D (1979) Thermal and magnetic instabilities in a rapidly rotating fluid sphere. Geophys Astrophys Fluid Dyn 14:103–126
Fournier A, Bunge H-P, Hollerbach R, Vilotte J-P (2005) A Fourier-spectral element algorithm for thermal convection in rotating axisymmetric containers. J Comput Phys 204:462–489
Gastine T, Duarte L, Wicht J (2012) Dipolar versus multipolar dynamos: the influence of the background density stratification. Astron Astrophys 546:A19
Gastine T, Wicht J (2012) Effects of compressibility on driving zonal flow in gas giants. Icarus 219:428–442
Gilbert AD, Frisch U, Pouquet A (1988) Helicity is unnecessary for alpha effect dynamos, but it helps. Geophys Astrophys Fluid Dyn 42(1–2):151–161
Gillet N, Brito D, Jault D, Nataf H (2007) Experimental and numerical studies of convection in a rapidly rotating spherical shell. J Fluid Mech 580:83
Gillet N, Jault D, Canet E, Fournier A (2010) Fast torsional waves and strong magnetic fields within the Earth’s core. Nature 465:74–77
Glatzmaier G (1984) Numerical simulation of stellar convective dynamos. 1. The model and methods. J Comput Phys 55:461–484
Glatzmaier G (2002) Geodynamo simulations – how realistic are they? Annu Rev Earth Planet Sci 30:237–257
Glatzmaier G, Coe R (2007) Magnetic polarity reversals in the core. In: Olson P (ed) Core dynamics. Treatise on geophysics, vol 8. Elsevier, Amsterdam/Boston, pp 283–297
Glatzmaier G, Coe R, Hongre L, Roberts P (1999) The role of the Earth’s mantle in controlling the frequency of geomagnetic reversals. Nature 401:885–890
Glatzmaier G, Roberts P (1995) A three-dimensional convective dynamo solution with rotating and finitely conducting inner core and mantle. Phys Earth Planet Inter 91:63–75
Glatzmaier G, Roberts P (1996) An anelastic evolutionary geodynamo simulation driven by compositional and thermal convection. Physica D 97:81–94
Gubbins D (2001) The Rayleigh number for convection in the Earth’s core. Phys Earth Planet Inter 128:3–12
Gubbins D, Davies CJ (2013) The stratified layer at the core-mantle boundary caused by barodiffusion of oxygen, sulphur and silicon. Phys Earth Planet Inter 215:21–28
Gubbins D, Kelly P (1993) Persistent patterns in the geomagnetic field over the past 2.5 Ma. Nature 365:829–832
Gubbins D, Love J (1998) Preferred VGP paths during geomagnetic polarity reversals: symmetry considerations. Geophys Res Lett 25:1079–1082
Gubbins D, Willis AP, Sreenivasan B (2007) Correlation of Earth’s magnetic field with lower mantle thermal and seismic structure. Phys Earth Planet Inter 162:256–260
Harder H, Hansen U (2005) A finite-volume solution method for thermal convection and dynamo problems in spherical shells. Geophys J Int 161:522–532
Heimpel M, Aurnou J, Wicht J (2005) Simulation of equatorial and high-latitude jets on Jupiter in a deep convection model. Nature 438:193–196
Hejda P, Reshetnyak M (2003) Control volume method for the dynamo problem in the sphere with the free rotating inner core. Stud Geophys Geod 47:147–159
Hejda P, Reshetnyak M (2004) Control volume method for the thermal convection problem in a rotating spherical shell: test on the benchmark solution. Stud Geophys Geod 48:741–746
Hongre L, Hulot G, Khokholov A (1998) An analysis of the geomagnetic field over the past 2000 years. Phys Earth Planet Inter 106:311–335
Hori K, Wicht J (2013) Subcritical dynamos in the early Mars’ core: Implications for cessation of the past Martian dynamo. Phys Earth Planet Inter 219:21–33
Hori K, Wicht J, Christensen UR (2010) The effect of thermal boundary conditions on dynamos driven by internal heating. Phys Earth Planet Inter 182:85–97
Hori K, Wicht J, Christensen UR (2012) The influence of thermo-compositional boundary conditions on convection and dynamos in a rotating spherical shell. Phys Earth Planet Inter 196:32–48
Hulot G, Bouligand C (2005) Statistical paleomagnetic field modelling and symmetry considerations. Geophys J Int 161. doi:10.1111/j.1365
Hulot G, Finlay C, Constable C, Olsen N, Mandea M (2010) The magnetic field of planet Earth. Space Sci Rev. doi:10.1007/s11214-010-9644-0
Isakov A, Descombes S, Dormy E (2004) An integro-differential formulation of magnetic induction in bounded domains: boundary element-finite volume method. J Comput Phys 197:540–554
Ivers D, James R (1984) Axisymmetric antidynamo theorems in non-uniform compressible fluids. Philos Trans R Soc Lond A 312:179–218
Jackson A (1997) Time dependence of geostrophic core-surface motions. Phys Earth Planet Inter 103:293–311
Jackson A (2003) Intense equatorial flux spots on the surface of the Earth’s core. Nature 424:760–763
Jackson A, Finlay C (2007) Geomagnetic secular variation and applications to the core. In: Kono M (ed) Geomagnetism. Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 147–193
Jackson A, Jonkers A, Walker M (2000) Four centuries of geomagnetic secular variation from historical records. Philos Trans R Soc Lond A 358:957–990
Jault D (2003) Electromagnetic and topographic coupling, and LOD variations. In: Jones CA, Soward AM, Zhang K (eds) Earth’s core and lower mantle. Taylor & Francis, London/New York, pp 56–76
Jault D, Gire C, LeMouël J-L (1988) Westward drift, core motion and exchanges of angular momentum between core and mantle. Nature 333:353–356
Johnson C, Constable C (1995) Time averaged geomagnetic field as recorded by lava flows over the past 5 Myr. Geophys J Int 122:489–519
Johnson C, Constable C, Tauxe L (2003) Mapping long-term changes in Earth’s magnetic field. Science 300:2044–2045
Johnson CL, McFadden P (2007) Time-averaged field and paleosecular variation. In: Kono M (ed) Geomagnetism. Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 217–254
Jones C (2007) Thermal and compositional convection in the outer core. In: Olson P (ed) Core dynamics. Treatise on geophysics, vol 8. Elsevier, Amsterdam/Boston, pp 131–186
Jones CA, Boronski P, Brun AS et al (2011) Anelastic convection-driven dynamo benchmarks. Icarus 216:120–135
Jonkers A (2003) Long-range dependence in the Cenozoic reversal record. Phys Earth Planet Inter 135:253–266
Julien K, Knobloch E (1998) Strongly nonlinear convection cells in a rapidly rotating fluid layer: the tilted f-plane. J Fluid Mech 360:141–178
Julien K, Knobloch E, Werne J (1998) A new class of equations for rotationally constrained flows. Theor Comput Fluid Dyn 11(3–4):251–261
Julien K, Rubio A, Grooms I, Knobloch E (2012) Statistical and physical balances in low Rossby number Rayleigh–Bénard convection. Geophys Astrophys Fluid Dyn 106(4–5):392–428
Kageyama A, Miyagoshi T, Sato T (2008) Formation of current coils in geodynamo simulations. Nature 454:1106–1109
Kageyama A, Sato T (1995) Computer simulation of a magnetohydrodynamic dynamo. II. Phys Plasmas 2:1421–1431
Kageyama A, Sato T (1997) Generation mechanism of a dipole field by a magnetohydrodynamic dynamo. Phys Rev E 55:4617–4626
Kageyama A, Watanabe K, Sato T (1993) Simulation study of a magnetohydrodynamic dynamo: convection in a rotating shell. Phys Fluids B 24(8):2793–2806
Kageyama A, Yoshida M (2005) Geodynamo and mantle convection simulations on the Earth simulator using the yin-yang grid. J Phys Conf Ser 16:325–338
Kaiser R, Schmitt P, Busse F (1994) On the invisible dynamo. Geophys Astrophys Fluid Dyn 77:93–109
Kelly P, Gubbins D (1997) The geomagnetic field over the past 5 million years. Geophys J Int 128:315–330
Kono M, Roberts P (2002) Recent geodynamo simulations and observations of the geomagnetic field. Rev Geophys 40:1013. doi:10.1029/2000RG000102
Korte M, Constable C (2005) Continuous geomagnetic field models for the past 7 millennia: 2. CALS7K. Geochem Geophys Geosyst 6:Q02H16
Korte M, Constable C, Donadini F, Holme R (2011) Reconstructing the Holocene geomagnetic field. Earth Planet Sci Lett 312:497–505
Korte M, Genevey A, Constable C, Frank U, Schnepp E (2005) Continuous geomagnetic field models for the past 7 millennia: 1. A new global data compilation. Geochem Geophys Geosyst 6:Q02H15
Korte M, Holme R (2010) On the persistence of geomagnetic flux lobes in global Holocene field models. Phys Earth Planet Inter 182:179–186
Kuang W, Bloxham J (1997) An Earth-like numerical dynamo model. Nature 389:371–374
Kuang W, Bloxham J (1999) Numerical modeling of magnetohydrodynamic convection in a rapidly rotating spherical shell: weak and strong field dynamo action. J Comput Phys 153:51–81
Kuang W, Jiang W, Wang T (2008) Sudden termination of Martian dynamo? Implications from subcritical dynamo simulations. Geophys Res Lett 35(14):14,202
Kutzner C, Christensen U (2000) Effects of driving mechanisms in geodynamo models. Geophys Res Lett 27:29–32
Kutzner C, Christensen U (2002) From stable dipolar to reversing numerical dynamos. Phys Earth Planet Inter 131:29–45
Kutzner C, Christensen U (2004) Simulated geomagnetic reversals and preferred virtual geomagnetic pole paths. Geophys J Int 157:1105–1118
Lhuillier F, Fournier A, Hulot G, Aubert J (2011) The geomagnetic secular variation timescale in observations and numerical dynamo models. Geophys Res Lett 38:L09306
Lillis R, Frey H, Manga M (2008) Rapid decrease in Martian crustal magnetization in the Noachian era: implications for the dynamo and climate of early Mars. Geophys Res Lett 35(14):14,203
Manglik A, Wicht J, Christensen UR (2010) A dynamo model with double diffusive convection for Mercury’s core. Earth Planet Sci Lett 289:619–628
Matsui H, Buffett B (2005) Sub-grid scale model for convection-driven dynamos in a rotating plane layer. Phys Earth Planet Inter 153:74–82
Miyagoshi T, Kageyama A, Sato T (2010) Zonal flow formation in the Earth’s core. Nature 463(7282):793–796
Miyagoshi T, Kageyama A, Sato T (2011) Formation of sheet plumes, current coils, and helical magnetic fields in a spherical magnetohydrodynamic dynamo. Phys Plasmas 18:072901
Monnereau M, Calvet M, Margerin L, Souriau A (2010) Lopsided growth of Earth’s inner core. Science 328:1014
Morin V, Dormy E (2009) The dynamo bifurcation in rotating spherical shells. Int J Mod Phys B 23(28n29):5467–5482
Olsen N, Haagmans R, Sabaka TJ et al (2006) The Swarm End-to-End mission simulator study: a demonstration of separating the various contributions to Earth’s magnetic field using synthetic data. Earth Planets Space 58:359–370
Olson P, Christensen U (2002) The time-averaged magnetic field in numerical dynamos with nonuniform boundary heat flow. Geophys J Int 151:809–823
Olson P, Christensen U (2006) Dipole moment scaling for convection-driven planetary dynamos. Earth Planet Sci Lett 250:561–571
Olson P, Christensen UR, Driscoll PE (2012) From superchrons to secular variation: a broadband dynamo frequency spectrum for the geomagnetic dipole. Earth Planet Sci Lett 319–320:75–82
Olson P, Christensen U, Glatzmaier G (1999) Numerical modeling of the geodynamo: mechanism of field generation and equilibration. J Geophys Res 104:10383–10404
Pozzo M, Davies C, Gubbins D, Alfè D (2012) Thermal and electrical conductivity of iron at Earth’s core conditions. Nature 485:355–358
Proctor M (1994) Convection and magnetoconvection in a rapidly rotating sphere. In: Proctor MRE, Gilbert AD (eds) Lectures on solar and planetary dynamos, vol 1. Cambridge University Press, Cambridge/New York, p 97
Roberts P (1972) Kinematic dynamo models. Philos Trans R Soc Lond A 271:663–697
Roberts P (2007) Theory of the geodynamo. In: Olson P (ed) Core dynamics. Treatise on geophysics, vol 8. Elsevier, Amsterdam/Boston, pp 245–282
Ryan DA, Sarson GR (2007) Are geomagnetic field reversals controlled by turbulence within the Earth’s core? Geophys Res Lett 34:2307
Sakuraba A (2002) Linear magnetoconvection in rotating fluid spheres permeated by a uniform axial magnetic field. Geophys Astrophys Fluid Dyn 96:291–318
Sakuraba A, Kono M (2000) Effect of a uniform magnetic field on nonlinear magnetoconvection in a rotating fluid spherical shell. Geophys Astrophys Fluid Dyn 92:255–287
Sakuraba A, Roberts P (2009) Generation of a strong magnetic field using uniform heat flux at the surface of the core. Nat Geosci 2:802–805
Schmalzl J, Breuer M, Hansen U (2002) The influence of the Prandtl number on the style of vigorous thermal convection. Geophys Astrophys Fluid Dyn 96:381–403
Simitev R, Busse F (2005) Prandtl-number dependence of convection-driven dynamos in rotating spherical fluid shells. J Fluid Mech 532:365–388
Simitev RD, Busse FH (2009) Bistability and hysteresis of dipolar dynamos generated by turbulent convection in rotating spherical shells. Europhys Lett 85:19001
Soderlund KM, King E, Aurnou JM (2012) The influence of magnetic fields in planetary dynamo models. Earth Planet Sci Lett 333–334:9–20
Sprague M, Julien K, Knobloch E, Werne J (2006) Numerical simulation of an asymptotically reduced system for rotationally constrained convection. J Fluid Mech 551:141–174
Sreenivasan B (2009) On dynamo action produced by boundary thermal coupling. Phys Earth Planet Inter 177:130–138
Sreenivasan B, Jones CA (2006) The role of inertia in the evolution of spherical dynamos. Geophys J Int 164:467–476
Sreenivasan B, Jones CA (2011) Helicity generation and subcritical behaviour in rapidly rotating dynamos. J Fluid Mech 688:5–30
St Pierre M (1993) The strong field branch of the Childress-Soward dynamo. In: Proctor MRE et al (eds) Solar and planetary dynamos. Cambridge University Press, Cambridge, pp 329–337
Stanley S, Bloxham J (2004) Convective-region geometry as the cause of Uranus’ and Neptune’s unusual magnetic fields. Nature 428:151–153
Stanley S, Bloxham J, Hutchison W, Zuber M (2005) Thin shell dynamo models consistent with Mercury’s weak observed magnetic field. Earth Planet Sci Lett 234:341–353
Stanley S, Glatzmaier G (2010) Dynamo models for planets other than Earth. Space Sci Rev 152:617–649
Stellmach S, Hansen U (2004) Cartesian convection-driven dynamos at low Ekman number. Phys Rev E 70:056312
Stelzer Z, Jackson A (2013, in press) Extracting scaling laws from numerical dynamo models. Geophys J Int
Stieglitz R, Müller U (2001) Experimental demonstration of the homogeneous two-scale dynamo. Phys Fluids 1:561–564
Takahashi F, Matsushima M (2006) Dipolar and non-dipolar dynamos in a thin shell geometry with implications for the magnetic field of Mercury. Geophys Res Lett 33:L10202
Takahashi F, Matsushima M, Honkura Y (2008a) Scale variability in convection-driven MHD dynamos at low Ekman number. Phys Earth Planet Inter 167:168–178
Takahashi F, Tsunakawa H, Matsushima M, Mochizuki N, Honkura Y (2008b) Effects of thermally heterogeneous structure in the lowermost mantle on the geomagnetic field strength. Earth Planet Sci Lett 272:738–746
Taylor J (1963) The magneto-hydrodynamics of a rotating fluid and the Earth’s dynamo problem. Proc R Soc Lond A 274:274–283
Tilgner A (1996) High-Rayleigh-number convection in spherical shells. Phys Rev E 53:4847–4851
Vallis GK (2006) Atmospheric and oceanic fluid dynamics: fundamentals and large-scale circulation. Cambridge University Press, Cambridge
Wicht J (2002) Inner-core conductivity in numerical dynamo simulations. Phys Earth Planet Inter 132:281–302
Wicht J (2005) Palaeomagnetic interpretation of dynamo simulations. Geophys J Int 162:371–380
Wicht J, Aubert J (2005) Dynamos in action. GWDG-Bericht 68:49–66
Wicht J, Christensen UR (2010) Torsional oscillations in dynamo simulations. Geophys J Int 181:1367–1380
Wicht J, Mandea M, Takahashi F et al (2007) The origin of Mercury’s internal magnetic field. Space Sci Rev 132:261–290
Wicht J, Olson P (2004) A detailed study of the polarity reversal mechanism in a numerical dynamo model. Geochem Geophys Geosyst 5. doi:10.1029/2003GC000602
Wicht J, Stellmach S, Harder H (2009) Numerical models of the geodynamo: from fundamental Cartesian models to 3D simulations of field reversals. In: Glassmeier K, Soffel H, Negendank J (eds) Geomagnetic field variations – space-time structure, processes, and effects on system Earth. Springer monograph. Springer, Berlin/Heidelberg/New York, pp 107–158
Wicht J, Tilgner A (2010) Theory and modeling of planetary dynamos. Space Sci Rev 152:501–542
Willis AP, Sreenivasan B, Gubbins D (2007) Thermal core mantle interaction: exploring regimes for locked dynamo action. Phys Earth Planet Inter 165:83–92
Yadav RK, Gastine T, Christensen UR (2013) Scaling laws in spherical shell dynamos with free-slip boundaries. Icarus 225:185–193
Zatman S, Bloxham J (1997) Torsional oscillations and the magnetic field within the Earth’s core. Nature 388:760–761
Zhang K-K, Busse F (1988) Finite amplitude convection and magnetic field generation in a rotating spherical shell. Geophys Astrophys Fluid Dyn 44:33–53
Zhang K, Gubbins D (2000a) Is the geodynamo process intrinsically unstable? Geophys J Int 140:F1–F4
Zhang K, Gubbins D (2000b) Scale disparities and magnetohydrodynamics in the Earth’s core. Philos Trans R Soc Lond A 358:899–920
Zhang K, Schubert G (2000) Magnetohydrodynamics in rapidly rotating spherical systems. Annu Rev Fluid Mech 32:409–443
Mathematical Properties Relevant to Geomagnetic Field Modeling Terence J. Sabaka, Gauthier Hulot, and Nils Olsen
Contents
1 Introduction
2 Helmholtz’s Theorem and Maxwell’s Equations
3 Potential Fields
  3.1 Magnetic Fields in a Source-Free Shell
  3.2 Surface Spherical Harmonics
  3.3 Magnetic Fields from a Spherical Sheet Current
4 Non-potential Fields
  4.1 Helmholtz Representations and Vector Spherical Harmonics
  4.2 Mie Representation
  4.3 Relationship of B and J Mie Representations
  4.4 Magnetic Fields in a Current-Carrying Shell
  4.5 Thin-Shell Approximation
5 Spatial Power Spectra
6 Mathematical Uniqueness Issue
  6.1 Uniqueness of Magnetic Fields in a Source-Free Shell
  6.2 Uniqueness Issues Raised by Directional-Only Observations
  6.3 Uniqueness Issues Raised by Intensity-Only Observations
  6.4 Uniqueness of Magnetic Fields in a Shell Enclosing a Spherical Sheet Current
  6.5 Uniqueness of Magnetic Fields in a Current-Carrying Shell
7 Concluding Comments: From Theory to Practice
References

T.J. Sabaka
Planetary Geodynamics Laboratory, Code 698, NASA Goddard Space Flight Center, Greenbelt, MD, USA
e-mail: [email protected]
G. Hulot
Equipe de Géomagnétisme, Institut de Physique du Globe de Paris, Sorbonne Paris Cité, Université Paris Diderot, Paris, France
e-mail: [email protected]
N. Olsen
DTU Space, Technical University of Denmark, Kgs. Lyngby, Denmark
e-mail: [email protected]
© Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_17
Abstract
Geomagnetic field modeling consists of converting large numbers of magnetic observations into a linear combination of elementary mathematical functions that best describes those observations. The set of numerical coefficients defining this linear combination is what one refers to as a geomagnetic field model. Such models can be used to produce maps. More importantly, they form the basis for the geophysical interpretation of the geomagnetic field, because they make it possible to separate the fields produced by various sources and to extrapolate those fields to places where they cannot be directly measured. This chapter reviews the mathematical foundation of global (as opposed to regional) geomagnetic field modeling, with a focus on the spatial modeling of the field in spherical coordinates. Time can be dealt with as an independent variable and is not explicitly considered. The relevant elementary mathematical functions are introduced, their properties are reviewed, and it is explained how they can be used to describe the magnetic field in a source-free (such as the Earth’s neutral atmosphere) or source-dense (such as the ionosphere) environment. Completeness and uniqueness properties of those spatial mathematical representations are also discussed, especially with a view to providing a formal justification for the fact that geomagnetic field models can indeed be constructed from ground-based and satellite-borne observations, provided those reasonably approximate the ideal situation where relevant components of the field can be assumed perfectly known on spherical surfaces or shells at the time for which the model is to be recovered.
1 Introduction
The magnetic field measured at or near the Earth’s surface is the superposition of contributions from a variety of sources, as discussed in the accompanying chapter by Olsen et al. in this handbook (Part II, chapter Satellite-to-Satellite Tracking (Low-Low/High-Low SST)). Separating the various fields produced by these sources on the basis of magnetic field observations is a major scientific challenge, which requires the introduction of adequate mathematical representations of those fields (see, e.g., Hulot et al. 2007). Here a synopsis of general properties relevant to such mathematical representations is provided on a planetary scale, which is the scale mainly dealt with in this chapter (regional representations of the field will only briefly be discussed, and the reader is referred to, e.g., Purucker and Whaler (2007) and to classical texts such as Harrison (1987), Blakely (1995), and Langel and Hinze (1998)). Such representations are of significant utility since they encode the physics of the magnetic field and provide a means of deducing these fields from measurements via inverse theory. It should be noted that these representations are spatial in nature, i.e., they provide instantaneous descriptions of the fields. The physical cause of the time variability of such fields is a complicated subject and, in its true sense, relies on the electromagnetic dynamics of the environment. Induced magnetism, for instance, is a well-understood subject which is covered in several good texts (e.g., Merrill et al. 1998), as is dynamo theory (cf. chapter Unstructured Meshes in Large-Scale Ocean Modeling). However, while the physics is known, the incorporation of the dynamics into inverse problems is in its embryonic stages, especially for the Earth’s core dynamo, and entails the subject of data assimilation, which is beyond the scope of this discussion. In what follows, time is thus considered as an independent implicit variable.
This chapter is divided into five parts. First, the general equations governing the behavior of any magnetic field are briefly recalled. These naturally lead to the concepts of potential and nonpotential magnetic fields, the mathematical representation of which is discussed in the following two sections. Next, a short section introduces the useful concept of spatial power spectra. The last section finally deals with uniqueness issues raised by the limited availability of magnetic observations, a topic of paramount importance when defining observational and modeling strategies to recover complete mathematical representations of the field. The chapter concludes with a few words on the practical use of such mathematical representations, guiding the reader to further reading.
2 Helmholtz’s Theorem and Maxwell’s Equations
A convenient starting point for describing the spatial structure of any vector field is the Helmholtz theorem, which states that if the divergence and curl of a vector field are known in a particular volume, as well as its normal component over the boundary of that volume, then the vector field is uniquely determined (e.g., Backus et al. 1996). When measurements of the Earth’s magnetic field are taken, they usually reflect some aspect of the magnetic induction vector $\mathbf{B}$. Hence, statements about its divergence and curl will define many of its spatial properties. Maxwell’s equations then provide all the information necessary, sans boundary conditions, to describe the spatial behavior of both $\mathbf{B}$ and the electric field intensity $\mathbf{E}$. They apply to electromagnetic phenomena in media which are at rest with respect to the coordinate system used:
$$\nabla \cdot \mathbf{E} = \frac{\rho_t}{\varepsilon_0}, \tag{1}$$
$$\nabla \cdot \mathbf{B} = 0, \tag{2}$$
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \tag{3}$$
$$\nabla \times \mathbf{B} = \mu_0 \mathbf{J}. \tag{4}$$
In SI units, $\mathbf{E}$ is expressed in V/m and $\mathbf{B}$ is expressed in tesla, denoted T (T = V·s/m$^2$). The remaining quantities are the total electric charge density $\rho_t$ in C/m$^3$, the permittivity of free space $\varepsilon_0 = 8.85 \times 10^{-12}$ F/m, the permeability of free space $\mu_0 = 4\pi \times 10^{-7}$ H/m, and the total volume current density $\mathbf{J}$ in A/m$^2$. The total volume current density is actually the sum of volume densities from free charge currents $\mathbf{J}_f$, equivalent currents $\mathbf{J}_e$, and displacement currents $\partial \mathbf{D}/\partial t$ such that
$$\mathbf{J} = \mathbf{J}_f + \mathbf{J}_e + \frac{\partial \mathbf{D}}{\partial t}. \tag{5}$$
While $\mathbf{J}_f$ reflects steady free currents in nonmagnetic materials, $\mathbf{J}_e$ represents an equivalent current effect due to magnetized material whose local properties are described by the net dipole moment per unit volume $\mathbf{M}$ such that
$$\mathbf{J}_e = \nabla \times \mathbf{M}. \tag{6}$$
The displacement current density can be omitted if the time scales of interest are much longer than those required for light to traverse a typical length scale of interest (e.g., Backus et al. 1996). This criterion can be justified by the fact that, for instance, in a linear isotropic medium where $\mathbf{D} = \varepsilon \mathbf{E}$ ($\varepsilon$ is the permittivity of the medium) and $\mathbf{J}_f$ and $\mathbf{J}_e$ are absent, taking the curl of Eq. 4, substituting Eq. 3, and making use of Eq. 2 gives the homogeneous magnetic wave equation:
$$\frac{\partial^2 \mathbf{B}}{\partial t^2} = c_l^2 \nabla^2 \mathbf{B}, \tag{7}$$
where the wave propagates at the speed of light $c_l$. When the time scales of interest are longer than the time light needs to traverse the length scales of interest, the left-hand side of Eq. 7 can be neglected, which amounts to neglecting displacement currents in Eq. 4. Geomagnetism deals with frequencies up to only a few Hz and with length scales smaller than the radius of the Earth, which light traverses in about 20 ms. Neglecting the displacement current is thus appropriate for most purposes in geomagnetism. Ampere’s law is then given in a form which reflects the state of knowledge before Maxwell, thus yielding the pre-Maxwell equations, where
$$\nabla \times \mathbf{B} = \mu_0 \mathbf{J} \tag{8}$$
and
$$\mathbf{J} = \mathbf{J}_f + \mathbf{J}_e. \tag{9}$$
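The order-of-magnitude argument for dropping the displacement current can be checked with a few lines of arithmetic. The following is only a rough sketch: the mean Earth radius and the 5 Hz upper frequency are illustrative assumptions standing in for "the radius of the Earth" and "a few Hz" in the text, and the names are hypothetical.

```python
# Rough check of the quasi-static criterion: compare the time light needs to
# cross the largest length scale of interest with the shortest period of interest.
C_LIGHT = 299_792_458.0   # speed of light in vacuum, m/s
EARTH_RADIUS = 6.371e6    # assumed mean Earth radius, m
MAX_FREQUENCY = 5.0       # Hz, illustrative stand-in for "a few Hz"


def light_crossing_time(length_m: float) -> float:
    """Time (s) for light to traverse a distance given in metres."""
    return length_m / C_LIGHT


t_cross = light_crossing_time(EARTH_RADIUS)   # roughly 20 ms
shortest_period = 1.0 / MAX_FREQUENCY         # shortest time scale of interest, s
ratio = t_cross / shortest_period             # should be well below 1

print(f"light crossing time : {t_cross * 1e3:.1f} ms")
print(f"shortest period     : {shortest_period * 1e3:.1f} ms")
print(f"ratio               : {ratio:.3f}")
```

Because the neglected term in Eq. 7 involves two time derivatives against two space derivatives, its relative size scales roughly with the square of this ratio.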
What is clear from Maxwell’s equations is the absence of magnetic monopoles; magnetic field lines are closed loops. Indeed, because Eq. 2 is valid everywhere and under all circumstances, the magnetic field is solenoidal: its net flux through any closed surface must be zero. By contrast, Ampere’s law (Eq. 8) shows that the magnetic field is irrotational only in the absence of free currents and magnetized material. It is this presence or absence of $\mathbf{J}$ that naturally divides magnetic field representations into two classes: potential and nonpotential magnetic fields.
3 Potential Fields
3.1 Magnetic Fields in a Source-Free Shell
Let space be divided into three regions delineated by two shells of lower radius $a$ and higher radius $c$ so that regions I, II, and III are defined as $r \le a$, $a < r < c$, and $c \le r$. Imagine that current systems are confined to only regions I and III and that one wants to describe the field in the source-free region II where $\mathbf{J} = 0$, such as, for instance, the neutral atmosphere. From Ampere’s law, $\nabla \times \mathbf{B} = 0$ in region II. It is well known that the magnetic field is then conservative and can be expressed as the gradient of a scalar potential $V$ (Lorrain and Corson 1970; Backus et al. 1996; Jackson 1998):
$$\mathbf{B} = -\nabla V. \tag{10}$$
These types of fields are known as potential magnetic fields. In addition, from Eq. 2, $\mathbf{B}$ must still be solenoidal, which implies
$$\nabla^2 V = 0, \tag{11}$$
i.e., that the potential $V$ be harmonic. This is Laplace’s equation, which can be solved for $V$ in several different coordinate systems via separation of variables. The spherical system is most natural for the near-Earth environment, and its coordinates are $(r, \theta, \phi)$, where $r$ is the radial distance from the origin, $\theta$ is the polar angle measured from the north polar axis (colatitude), and $\phi$ is the azimuthal angle measured in the equatorial plane from a prime meridian (longitude). The Laplacian operator may be written in spherical coordinates as
$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2 \sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2 \sin^2\theta}\frac{\partial^2}{\partial\phi^2}. \tag{12}$$
Two linearly independent sets of solutions exist, depending on whether the source currents reside in region I, $V^i$, or in region III, $V^e$, i.e., are interior or exterior to the measurement shell, respectively. These solutions correspond to negative and positive powers of $r$, respectively, and are given by (e.g., Langel 1987; Backus et al. 1996)
$$V^i(r, \theta, \phi) = a \sum_{n=1}^{\infty} \left(\frac{a}{r}\right)^{n+1} \sum_{m=0}^{n} \left(g_n^m \cos m\phi + h_n^m \sin m\phi\right) P_n^m(\cos\theta), \tag{13}$$
$$V^e(r, \theta, \phi) = a \sum_{n=1}^{\infty} \left(\frac{r}{a}\right)^{n} \sum_{m=0}^{n} \left(q_n^m \cos m\phi + s_n^m \sin m\phi\right) P_n^m(\cos\theta), \tag{14}$$
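which leads to
$$\mathbf{B}(r, \theta, \phi) = \mathbf{B}^i(r, \theta, \phi) + \mathbf{B}^e(r, \theta, \phi), \tag{15}$$
with
$$\mathbf{B}^i(r, \theta, \phi) = \sum_{n=1}^{\infty} \left(\frac{a}{r}\right)^{n+2} \sum_{m=0}^{n} \left(g_n^m\, \boldsymbol{\Pi}_{ni}^{mc}(\theta, \phi) + h_n^m\, \boldsymbol{\Pi}_{ni}^{ms}(\theta, \phi)\right), \tag{16}$$
$$\mathbf{B}^e(r, \theta, \phi) = \sum_{n=1}^{\infty} \left(\frac{r}{a}\right)^{n-1} \sum_{m=0}^{n} \left(q_n^m\, \boldsymbol{\Pi}_{ne}^{mc}(\theta, \phi) + s_n^m\, \boldsymbol{\Pi}_{ne}^{ms}(\theta, \phi)\right), \tag{17}$$
where
$$\boldsymbol{\Pi}_{ni}^{mc}(\theta, \phi) = (n+1) P_n^m(\cos\theta) \cos m\phi\; \hat{\mathbf{r}} - \frac{dP_n^m(\cos\theta)}{d\theta} \cos m\phi\; \hat{\boldsymbol{\theta}} + \frac{m}{\sin\theta} P_n^m(\cos\theta) \sin m\phi\; \hat{\boldsymbol{\phi}}, \tag{18}$$
$$\boldsymbol{\Pi}_{ni}^{ms}(\theta, \phi) = (n+1) P_n^m(\cos\theta) \sin m\phi\; \hat{\mathbf{r}} - \frac{dP_n^m(\cos\theta)}{d\theta} \sin m\phi\; \hat{\boldsymbol{\theta}} - \frac{m}{\sin\theta} P_n^m(\cos\theta) \cos m\phi\; \hat{\boldsymbol{\phi}}, \tag{19}$$
$$\boldsymbol{\Pi}_{ne}^{mc}(\theta, \phi) = -n P_n^m(\cos\theta) \cos m\phi\; \hat{\mathbf{r}} - \frac{dP_n^m(\cos\theta)}{d\theta} \cos m\phi\; \hat{\boldsymbol{\theta}} + \frac{m}{\sin\theta} P_n^m(\cos\theta) \sin m\phi\; \hat{\boldsymbol{\phi}}, \tag{20}$$
$$\boldsymbol{\Pi}_{ne}^{ms}(\theta, \phi) = -n P_n^m(\cos\theta) \sin m\phi\; \hat{\mathbf{r}} - \frac{dP_n^m(\cos\theta)}{d\theta} \sin m\phi\; \hat{\boldsymbol{\theta}} - \frac{m}{\sin\theta} P_n^m(\cos\theta) \cos m\phi\; \hat{\boldsymbol{\phi}}, \tag{21}$$
and $(\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}, \hat{\boldsymbol{\phi}})$ are the unit vectors associated with the spherical coordinates $(r, \theta, \phi)$.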
In these equations, $a$ is the reference radius, which in this case corresponds to the lower shell; $P_n^m(\cos\theta)$ is the Schmidt quasi-normalized associated Legendre function of degree $n$ and order $m$ (both being integers); and $g_n^m$ and $h_n^m$ are the internal, and $q_n^m$ and $s_n^m$ the external, constants known as the Gauss coefficients. Scaling these equations by $a$ yields coefficients whose units are those of $\mathbf{B}$, i.e., magnetic induction, which in the near-Earth environment is usually expressed in nanoteslas (nT). Note that although they formally appear in Eqs. 13–17, the $h_n^0$ and $s_n^0$ Gauss coefficients are in fact not needed (because $\sin m\phi = 0$ when $m = 0$) and are therefore not defined.
One should also notice the omission of the $n = 0$ terms in both solutions $V^i$ and $V^e$ (and therefore $\mathbf{B}^i$ and $\mathbf{B}^e$). This is a consequence of the fact that for $V^i$, this term leads to an internal field $\mathbf{B}^i$ not satisfying magnetic monopole exclusion at the origin (required by Eq. 2), while for $V^e$, this term is just a constant (since $P_0^0(\cos\theta) = 1$, see below) which produces no field. It may thus be set to zero.
Finally, the reader should also be warned that the choice of the Schmidt quasi-normalization for the $P_n^m(\cos\theta)$ in Eqs. 13 and 14 is very specific to geomagnetism. It dates back to the early work of Schmidt (1935) and has since been adopted as the conventional norm to be used in geomagnetism, following a resolution of the International Association of Terrestrial Magnetism and Electricity (IATME) of the International Union of Geodesy and Geophysics (IUGG) (Goldie and Joyce 1940).
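To make the role of the Gauss coefficients concrete, the degree-1 (dipole) part of Eq. 16 can be evaluated directly from Eqs. 18 and 19 using $P_1^0 = \cos\theta$ and $P_1^1 = \sin\theta$. The sketch below is illustrative only: the function name is hypothetical, the coefficient values are round numbers chosen for demonstration rather than taken from any published model, and the reference radius of 6371.2 km is an assumption.

```python
import math


def internal_dipole_field(r, theta, phi, g10, g11, h11, a=6371.2):
    """Degree-1 part of Eq. 16: returns (B_r, B_theta, B_phi).

    r and a share the same length unit; theta (colatitude) and phi (longitude)
    are in radians; the result has the units of the Gauss coefficients (e.g., nT).
    """
    scale = (a / r) ** 3                       # (a/r)^(n+2) with n = 1
    st, ct = math.sin(theta), math.cos(theta)
    equatorial = g11 * math.cos(phi) + h11 * math.sin(phi)
    # Radial part: (n+1) P_n^m with P_1^0 = cos(theta), P_1^1 = sin(theta)
    b_r = 2.0 * scale * (g10 * ct + equatorial * st)
    # Theta part: -dP_n^m/dtheta with dP_1^0/dtheta = -sin, dP_1^1/dtheta = cos
    b_theta = -scale * (-g10 * st + equatorial * ct)
    # Phi part: (m/sin(theta)) P_1^1 = 1, with the signs of Eqs. 18 and 19
    b_phi = scale * (g11 * math.sin(phi) - h11 * math.cos(phi))
    return b_r, b_theta, b_phi


# Illustrative coefficients in nT (hypothetical round values, not a real model).
G10, G11, H11 = -30000.0, -1500.0, 5000.0

pole = internal_dipole_field(6371.2, 0.0, 0.0, G10, G11, H11)
equator = internal_dipole_field(6371.2, math.pi / 2, 0.0, G10, G11, H11)
print("north pole:", pole)
print("equator   :", equator)
```

With an axial coefficient of this sign, the radial component at the north geographic pole is negative (the field points downward) and has twice the magnitude of the axial-dipole contribution at the equator, as expected for a dipole.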
3.2 Surface Spherical Harmonics
The angular portions of the terms in Eqs. 13 and 14 are often denoted $Y_n^{m,c}(\theta, \phi) \equiv P_n^m(\cos\theta) \cos m\phi$ and $Y_n^{m,s}(\theta, \phi) \equiv P_n^m(\cos\theta) \sin m\phi$ and are known as Schmidt quasi-normalized real surface spherical harmonics. They are related to analogous complex functions $Y_{n,m}(\theta, \phi)$ known as complex surface spherical harmonics, which have been extensively studied in the mathematical and physical literature. For any pair of integers $(n, m)$ with $n \ge 0$ and $-n \le m \le n$, those $Y_{n,m}(\theta, \phi)$ are defined as complex functions of the form (e.g., Backus et al. 1996; Edmonds 1996)
$$Y_{n,m}(\theta, \phi) = (-1)^m \sqrt{(2n+1)\frac{(n-m)!}{(n+m)!}}\; P_{n,m}(\cos\theta)\, e^{im\phi}, \tag{22}$$
where the $P_{n,m}(\cos\theta)$ are again the associated Legendre functions of degree $n$ and order $m$, but now satisfying the much more common Ferrers normalization (Ferrers 1877), and defined by (with $\mu = \cos\theta$)
$$P_{n,m}(\mu) = \frac{1}{2^n n!} \left(1 - \mu^2\right)^{m/2} \frac{d^{\,n+m}}{d\mu^{\,n+m}} \left(\mu^2 - 1\right)^n. \tag{23}$$
Note that this definition holds for $-n \le m \le n$ and leads to the important property that
$$P_{n,-m}(\mu) = (-1)^m \frac{(n-m)!}{(n+m)!} P_{n,m}(\mu), \tag{24}$$
which then also implies that
$$Y_{n,-m}(\theta, \phi) = (-1)^m\, \overline{Y}_{n,m}(\theta, \phi), \tag{25}$$
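where the overbar denotes complex conjugation. The $Y_{n,m}(\theta, \phi)$ are eigenfunctions of the angular portion $\nabla_S^2$ of the Laplacian operator,
$$\nabla_S^2 \equiv \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}, \tag{26}$$
such that
$$\nabla_S^2 Y_{n,m}(\theta, \phi) = -n(n+1)\, Y_{n,m}(\theta, \phi). \tag{27}$$
They represent a complete, orthogonal set of complex functions on the unit sphere. The cumbersome prefactor to be found in Eq. 22 is chosen so that the inner products of these complex surface spherical harmonics over the sphere have the form
$$\langle Y_{n,m}, Y_{l,k} \rangle \equiv \frac{1}{4\pi} \int_0^{2\pi}\!\!\int_0^{\pi} \overline{Y}_{n,m}(\theta, \phi)\, Y_{l,k}(\theta, \phi) \sin\theta \, d\theta \, d\phi = \delta_{nl}\, \delta_{mk}, \tag{28}$$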
where $\delta_{ij}$ is the Kronecker delta. Those complex surface spherical harmonics are then said to be fully normalized. The reader should however be aware that not all authors introduce the $1/4\pi$ factor in the definition (Eq. 28) of the inner product, in which case fully normalized complex surface spherical harmonics are not exactly defined as in Eq. 22 (a $1/\sqrt{4\pi}$ factor then needs to be introduced). Here, the convention used in most previous books dealing with geomagnetism (such as, e.g., Merrill and McElhinny 1983; Langel 1987; Merrill et al. 1998, and especially Backus et al. 1996, where many useful mathematical properties satisfied by those functions can be found) is simply adopted. Similarly, the $(-1)^m$ factor in the definition of Eq. 22 is not always introduced. The Schmidt quasi-normalized $P_n^m(\cos\theta)$ introduced in Eqs. 13 and 14 are then related to the Ferrers normalized $P_{n,m}(\cos\theta)$ through
$$P_n^m(\cos\theta) = \begin{cases} P_{n,m}(\cos\theta) & \text{for } m = 0, \\[4pt] \sqrt{\dfrac{2(n-m)!}{(n+m)!}}\; P_{n,m}(\cos\theta) & \text{for } m > 0, \end{cases} \tag{29}$$
and the Schmidt quasi-normalized real surface spherical harmonics $Y_n^{m,c}(\theta, \phi)$ and $Y_n^{m,s}(\theta, \phi)$ are related to the fully normalized complex surface spherical harmonics $Y_{n,m}(\theta, \phi)$ through
$$Y_n^{m,c}(\theta, \phi) = (-1)^m \sqrt{\frac{2}{(2n+1)(1+\delta_{m0})}}\; \Re\left[Y_{n,m}(\theta, \phi)\right], \tag{30}$$
$$Y_n^{m,s}(\theta, \phi) = (-1)^m \sqrt{\frac{2}{2n+1}}\; \Im\left[Y_{n,m}(\theta, \phi)\right], \tag{31}$$
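for $n \ge 0$ and $0 \le m \le n$. Note that whereas the $P_{n,m}(\cos\theta)$ are defined for $n \ge 0$ and $-n \le m \le n$, which are all needed for the $Y_{n,m}(\theta, \phi)$ to form a complete orthogonal set of complex functions on the unit sphere, the $P_n^m(\cos\theta)$ are only used for $n \ge 0$ and $0 \le m \le n$, which is then enough for the $Y_n^{m,c}(\theta, \phi)$ and $Y_n^{m,s}(\theta, \phi)$ to form a complete, orthogonal set of real functions on the unit sphere. They satisfy
$$\left\langle Y_n^{m,c}, Y_l^{k,c} \right\rangle = \left\langle Y_n^{m,s}, Y_l^{k,s} \right\rangle = \frac{1}{2n+1}\, \delta_{nl}\, \delta_{mk}, \tag{32}$$
and
$$\left\langle Y_n^{m,c}, Y_l^{k,s} \right\rangle = 0. \tag{33}$$
Finally, because of Eqs. 30 and 31, they too are eigenfunctions of $\nabla_S^2$ and satisfy
$$\nabla_S^2 Y_n^{m,(c,s)}(\theta, \phi) = -n(n+1)\, Y_n^{m,(c,s)}(\theta, \phi), \tag{34}$$
where $Y_n^{m,(c,s)}(\theta, \phi)$ stands for either $Y_n^{m,c}(\theta, \phi)$ or $Y_n^{m,s}(\theta, \phi)$.
The following recursion relationships allow the Schmidt quasi-normalized series to be generated for a given value of $\theta$ (Langel 1987):
$$P_n^n(\cos\theta) = \sqrt{\frac{2n-1}{2n}}\, \sin\theta\; P_{n-1}^{n-1}(\cos\theta), \qquad n > 1, \tag{35}$$
$$P_n^m(\cos\theta) = \frac{2n-1}{\sqrt{n^2 - m^2}}\, \cos\theta\; P_{n-1}^m(\cos\theta) - \sqrt{\frac{(n-1)^2 - m^2}{n^2 - m^2}}\; P_{n-2}^m(\cos\theta), \qquad n > m \ge 0. \tag{36}$$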
Fig. 1 Schmidt quasi-normalized associated Legendre functions $P_6^m(\cos\theta)$, $m = 0, \ldots, 6$, as a function of $\theta$ (horizontal axis: $\theta$ from 0° to 180°; vertical axis: −1 to 1)
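The first few terms of the series are
$$P_0^0(\cos\theta) = 1,$$
$$P_1^0(\cos\theta) = \cos\theta, \qquad P_1^1(\cos\theta) = \sin\theta,$$
$$P_2^0(\cos\theta) = \tfrac{1}{2}\left(3\cos^2\theta - 1\right), \qquad P_2^1(\cos\theta) = \sqrt{3}\,\cos\theta \sin\theta, \qquad P_2^2(\cos\theta) = \tfrac{\sqrt{3}}{2}\sin^2\theta,$$
$$P_3^0(\cos\theta) = \tfrac{1}{2}\left(5\cos^3\theta - 3\cos\theta\right), \qquad P_3^1(\cos\theta) = \tfrac{\sqrt{3}}{2\sqrt{2}}\sin\theta\left(5\cos^2\theta - 1\right),$$
$$P_3^2(\cos\theta) = \tfrac{\sqrt{15}}{2}\sin^2\theta \cos\theta, \qquad P_3^3(\cos\theta) = \tfrac{\sqrt{5}}{2\sqrt{2}}\sin^3\theta.$$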
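The recursions of Eqs. 35 and 36 translate almost directly into code. The following is a minimal sketch, not a production implementation: the function names are hypothetical, the recursion is started from $P_0^0 = 1$ and $P_1^1 = \sin\theta$ (since Eq. 35 only applies for $n > 1$), and a simple midpoint rule is used to check the normalization implied by Eq. 32.

```python
import math


def schmidt_legendre(nmax, theta):
    """Return P[n][m] = Schmidt quasi-normalized P_n^m(cos theta), 0 <= m <= n <= nmax."""
    ct, st = math.cos(theta), math.sin(theta)
    P = [[0.0] * (nmax + 1) for _ in range(nmax + 1)]
    P[0][0] = 1.0
    if nmax >= 1:
        P[1][1] = st  # starting value; Eq. 35 is only valid for n > 1
    # Sectorial terms, Eq. 35
    for n in range(2, nmax + 1):
        P[n][n] = math.sqrt((2 * n - 1) / (2 * n)) * st * P[n - 1][n - 1]
    # Remaining terms, Eq. 36, climbing in degree n for each fixed order m
    for m in range(nmax + 1):
        for n in range(m + 1, nmax + 1):
            lower = P[n - 2][m] if n - 2 >= m else 0.0  # P_{n-2}^m vanishes for n-2 < m
            P[n][m] = ((2 * n - 1) * ct * P[n - 1][m]
                       - math.sqrt((n - 1) ** 2 - m ** 2) * lower) / math.sqrt(n * n - m * m)
    return P


def theta_norm(n, m, steps=2000):
    """Midpoint-rule estimate of the integral of (P_n^m)^2 sin(theta) over [0, pi]."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        total += schmidt_legendre(n, theta)[n][m] ** 2 * math.sin(theta) * h
    return total


theta0 = 0.7  # arbitrary test colatitude in radians
P = schmidt_legendre(3, theta0)
closed_form_p31 = math.sqrt(3) / (2 * math.sqrt(2)) * math.sin(theta0) * (5 * math.cos(theta0) ** 2 - 1)
print("P_3^1 recursion  :", P[3][1])
print("P_3^1 closed form:", closed_form_p31)
print("norm check n=6, m=3:", theta_norm(6, 3), "expected", 4 / 13)
```

For $m > 0$ the longitude integral of $\cos^2 m\phi$ contributes a factor $\pi$, so Eq. 32 is equivalent to the colatitude integral being $4/(2n+1)$; for $m = 0$ it is $2/(2n+1)$.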
Plots of the $P_6^m(\cos\theta)$ functions are shown in Fig. 1 as a function of $\theta$. Real surface spherical harmonic functions $Y_n^{m,c}(\theta, \phi)$ and $Y_n^{m,s}(\theta, \phi)$ have $n - m$ zeros in the interval $[0, \pi]$ along meridians and $2m$ zeros along lines of latitude. When $m = 0$, only the $Y_n^{0,c}(\theta, \phi)$ exist, which are then denoted $Y_n^0(\theta, \phi)$. They exhibit annulae of constant sign in longitude and are referred to as zonal harmonics. When $n = m$, there are lines of constant sign in latitude, and these are referred to as sectorial harmonics. The general cases are termed tesseral harmonics. Figure 2 illustrates examples of each from the $n = 6$ family. Finally, note that since $Y_{0,0} = P_{0,0} = P_0^0 = Y_0^0 = 1$, using either Eq. 28 or Eqs. 32–33 for $n = m = 0$ leads to the important additional property that
$$\frac{1}{4\pi} \int_0^{2\pi}\!\!\int_0^{\pi} Y_{n,m}(\theta, \phi) \sin\theta \, d\theta \, d\phi = \frac{1}{4\pi} \int_0^{2\pi}\!\!\int_0^{\pi} Y_n^m(\theta, \phi) \sin\theta \, d\theta \, d\phi = \delta_{n0}\, \delta_{m0}. \tag{37}$$
Fig. 2 Color representations (red positive, blue negative) of real surface harmonics $Y_6^0$, $Y_6^3$, and $Y_6^6$ (Note that $Y_n^{m,c}$ and $Y_n^{m,s}$ are identical to within a $\pi/2m$ longitudinal phase shift)
For many more useful properties satisfied by associated Legendre functions and real or complex surface spherical harmonics, the reader is referred to, e.g., Langel (1987), Backus et al. (1996), and Dahlen and Tromp (1998). But beware of conventions and normalizations!
3.3 Magnetic Fields from a Spherical Sheet Current
Let space again be divided into three regions, but in a different way. Introduce a single shell of radius $b$ (where $a < b < c$) so that regions I, II, and III are now defined as $r < b$, $r = b$, and $b < r$. This time imagine that the current system is confined to only region II (the spherical shell surface) and that one wants to describe the field produced by those sources in source-free regions I and III, where $\mathbf{J} = 0$. Here, sources previously assumed to lie either below $r = a$ or above $r = c$ are thus ignored, and only the description of a field produced by a spherical sheet current is considered. This is very applicable to the Earth environment, where currents associated with the ionospheric dynamo reside in the E-region and peak near 115 km (cf. chapter Satellite-to-Satellite Tracking (Low-Low/High-Low SST)). In region III those sources are seen as internal, producing a field with a potential of the form of Eq. 13:
$$V^i(r, \theta, \phi) = a \sum_{n=1}^{\infty} \left(\frac{a}{r}\right)^{n+1} \sum_{m=0}^{n} \left(a_n^m \cos m\phi + b_n^m \sin m\phi\right) P_n^m(\cos\theta), \tag{38}$$
where $a_n^m$ and $b_n^m$ are now the constants. In region I, by contrast, those sources are seen as external, producing a field with a magnetic potential of the form of Eq. 14. However, because $\mathbf{B}$ is solenoidal, its radial component must be continuous across the sheet so that
$$\left.\frac{\partial V^i}{\partial r}\right|_{r=b} = \left.\frac{\partial V^e}{\partial r}\right|_{r=b}. \tag{39}$$
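Unlike the independent internal and external magnetic fields found in a shell sandwiched between source-bearing regions, the fields in regions I and III here are not independent. They are now coupled through Eq. 39. This also means that the expansion coefficients for $V^e$ are now related to the $a_n^m$ and $b_n^m$ of $V^i$ (Granzow 1983; Sabaka et al. 2002):
$$V^e(r, \theta, \phi) = -a \sum_{n=1}^{\infty} \frac{n+1}{n} \left(\frac{a}{b}\right)^{2n+1} \left(\frac{r}{a}\right)^{n} \sum_{m=0}^{n} \left(a_n^m \cos m\phi + b_n^m \sin m\phi\right) P_n^m(\cos\theta). \tag{40}$$
Although the radial component of the field is the same just above and below the sheet current, the horizontal components are not. However, they are in the same vertical plane perpendicular to the sheet, and if Ampere’s circuital law (the integral form of Ampere’s law) is applied to the area containing the sheet, then the surface current density is seen to be (e.g., Granzow 1983)
$$\mathbf{J}_s = \frac{1}{\mu_0}\, \hat{\mathbf{r}} \times \left(\mathbf{B}^i - \mathbf{B}^e\right), \tag{41}$$
with $\mathbf{B}^i$ (derived from $V^i$) and $\mathbf{B}^e$ (derived from $V^e$) evaluated at $r = b$, i.e., just above and just below the sheet, respectively. In component form this is
$$\begin{pmatrix} J_{s\theta} \\[6pt] J_{s\phi} \end{pmatrix} = \begin{pmatrix} \dfrac{1}{\mu_0} \displaystyle\sum_{n=1}^{\infty} \frac{2n+1}{n} \left(\frac{a}{b}\right)^{n+2} \sum_{m=1}^{n} \frac{m}{\sin\theta} \left(-a_n^m \sin m\phi + b_n^m \cos m\phi\right) P_n^m(\cos\theta) \\[10pt] -\dfrac{1}{\mu_0} \displaystyle\sum_{n=1}^{\infty} \frac{2n+1}{n} \left(\frac{a}{b}\right)^{n+2} \sum_{m=0}^{n} \left(a_n^m \cos m\phi + b_n^m \sin m\phi\right) \frac{dP_n^m(\cos\theta)}{d\theta} \end{pmatrix}. \tag{42}$$
Finally, note that $\mathbf{J}_s$ can also be written in the often used form
$$\mathbf{J}_s = -\hat{\mathbf{r}} \times \nabla\Psi, \tag{43}$$
where $\Psi$ is known as the sheet current function. The SI unit for surface current density is A/m, while that of the sheet current function $\Psi$ is A.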
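The coupling between the potentials on either side of the sheet can be verified degree by degree. The sketch below assumes the forms of Eqs. 38 and 40, with the angular factor common to both sides omitted. It checks that the radial derivatives match at $r = b$ (Eq. 39) and evaluates the jump in the horizontal field, taken as above minus below, whose magnitude is the $\frac{2n+1}{n}(a/b)^{n+2}$ factor of Eq. 42. The function names, the reference radius of 6371.2 km, and the reading of "115 km" as an altitude above that radius are assumptions made for illustration.

```python
import math

A_REF = 6371.2            # assumed reference radius a, km
B_SHEET = A_REF + 115.0   # assumed sheet radius b, km (115 km altitude)


def external_coefficient_factor(n, a, b):
    """Factor linking the V^e coefficients to those of V^i (Eq. 40)."""
    return -(n + 1) / n * (a / b) ** (2 * n + 1)


def radial_derivative_factors(n, a, b):
    """Per-coefficient radial derivatives of V^i and V^e at r = b."""
    dvi_dr = -(n + 1) * (a / b) ** (n + 2)                                   # from Eq. 38
    dve_dr = n * (b / a) ** (n - 1) * external_coefficient_factor(n, a, b)  # from Eq. 40
    return dvi_dr, dve_dr


def horizontal_jump_factor(n, a, b):
    """Horizontal field just above minus just below the sheet.

    Uses B_H = -(1/r) grad_S V, per coefficient, with the common angular
    vector grad_S Y omitted.
    """
    above = -(a / b) ** (n + 2)
    below = -external_coefficient_factor(n, a, b) * (b / a) ** (n - 1)
    return above - below


for n in range(1, 6):
    dvi, dve = radial_derivative_factors(n, A_REF, B_SHEET)
    jump = horizontal_jump_factor(n, A_REF, B_SHEET)
    print(f"n={n}: dVi/dr={dvi:.6f}, dVe/dr={dve:.6f}, horizontal jump={jump:.6f}")
```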
4 Non-potential Fields
In the previous section, mathematical representations were developed for magnetic fields in regions where $\mathbf{J} = 0$. Satellite, surface, and near-surface surveys used in near-Earth magnetic field modeling do not sample the source regions of the core, crust, magnetosphere, or typically the ionospheric E-region. The representations developed so far can therefore be used to describe the fields produced by those sources in regions where observations are made. However, there are additional currents which couple the magnetosphere with the ionosphere in the F-region, where satellite measurements are commonly made (cf. chapter Satellite-to-Satellite Tracking (Low-Low/High-Low SST)). These additional currents need to be considered. In this section, the condition $\nabla \times \mathbf{B} = 0$ is therefore relaxed. This leads to field forms which are described by two scalar potentials rather than one. While it is true that if $\nabla \cdot \mathbf{B} = 0$, then there exists a vector potential, $\mathbf{A}$, such that $\nabla \times \mathbf{A} = \mathbf{B}$, the magnetic fields of this section, which cannot be written in the form of the gradient of a scalar potential (as in Eq. 10), will be referred to as nonpotential fields.
4.1 Helmholtz Representations and Vector Spherical Harmonics
To begin with, consider a very general vector field $\mathbf{F}$. Recall that the Helmholtz theorem then states that if the divergence and curl of $\mathbf{F}$ are known in a particular volume, as well as its normal component over the boundary of that volume, then $\mathbf{F}$ is uniquely determined. This is also true if $\mathbf{F}$ decays to zero as $r \to \infty$ (for a rigorous proof and statement, see, e.g., Backus et al. 1996; Blakely 1995). In addition, $\mathbf{F}$ can then always be written in the form
$$\mathbf{F} = \nabla S + \nabla \times \mathbf{A}, \tag{44}$$
where $S$ and $\mathbf{A}$ are, however, not uniquely determined. In fact, this degree of freedom further makes it possible to choose $\mathbf{A}$ of the form $\mathbf{A} = T\mathbf{r} + \nabla \times P\mathbf{r}$, so that $\mathbf{F}$ can also be written as (e.g., Stern 1976)
$$\mathbf{F} = \nabla S + \nabla \times T\mathbf{r} + \nabla \times \nabla \times P\mathbf{r}. \tag{45}$$
This representation has the advantage that if $\mathbf{F}$ satisfies the vector Helmholtz equation
$$\nabla^2 \mathbf{F} + k^2 \mathbf{F} = 0, \tag{46}$$
then the three scalar potentials $S$, $T$, and $P$ also satisfy the associated scalar Helmholtz equation:
$$\nabla^2 Q + k^2 Q = 0, \tag{47}$$
where $Q$ is either $S$, $T$, or $P$. The solution to this scalar Helmholtz equation can be achieved through separation of variables in spherical coordinates. Expanding the angular dependence of $Q$ in terms of real surface spherical harmonics leads to
$$Q(r, \theta, \phi) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} \left(Q_n^{m,c}(r)\, Y_n^{m,c}(\theta, \phi) + Q_n^{m,s}(r)\, Y_n^{m,s}(\theta, \phi)\right), \tag{48}$$
and Eq. 47 then implies
$$\left[\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr} - \frac{n(n+1)}{r^2} + k^2\right] Q_n^{m,(c,s)}(r) = 0, \tag{49}$$
which depends only on $n$. This may be further transformed into Bessel’s equation such that the solution of Eq. 49 can be written in the form
$$Q_n^{m,(c,s)}(r) = C_n^{m,(c,s)}\, j_n(kr) + D_n^{m,(c,s)}\, n_n(kr), \tag{50}$$
where $j_n(kr)$ and $n_n(kr)$ are spherical Bessel functions of the first and second kind, respectively (Abramowitz and Stegun 1964), and $C_n^{m,(c,s)}$ and $D_n^{m,(c,s)}$ are constants. If $k \to 0$, then it can be shown that, up to normalization, $j_n(kr)$ and $n_n(kr)$ approach the $r^n$ external and $r^{-(n+1)}$ internal potential forms, respectively (Granzow 1983). This is consistent with the fact that, as seen in the previous section, if the field $\mathbf{F}$ is potential, Eq. 45 reduces to Eq. 10 and if the corresponding potential is harmonic, Eq. 47 reduces to Eq. 11, in which case it can be expanded in the form of Eqs. 13 and 14.
More generally, if the vector field $\mathbf{F}$ is to be defined within the spherical shell $a < r < c$, the general solutions (Eq. 50) of Eq. 49 for nonzero $k$ can be used for the purpose of writing expansions of $S$, $T$, $P$, and therefore $\mathbf{F}$, in terms of elementary fields. One can then indeed take advantage of the fact that any well-behaved function defined on $(a, c)$ can always be expanded in terms of a sum (over $i$, or an integral if $c \to \infty$) of spherical Bessel functions of the first kind $j_n(k_i r)$, where $k_i = x_i/c$ and the $x_i$ are the positive roots of $j_n(x) = 0$ (Granzow 1983; Watson 1966). It thus follows that in the spherical shell of interest, any scalar function $Q(r, \theta, \phi)$ (and, in particular, $S$, $T$, and $P$) can be written in the form of the following expansion:
$$Q(r, \theta, \phi) = \sum_{i=1}^{\infty} \sum_{n=0}^{\infty} j_n(k_i r) \sum_{m=0}^{n} \left(Q_{n,i}^{m,c}\, Y_n^{m,c}(\theta, \phi) + Q_{n,i}^{m,s}\, Y_n^{m,s}(\theta, \phi)\right). \tag{51}$$
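The small-argument behavior of the radial functions in Eq. 50, and the roots $x_i$ that fix the wavenumbers $k_i = x_i/c$, can be illustrated for degree $n = 1$ using the standard closed forms of the spherical Bessel functions. This is only a sketch: the function names, the bracket used for the root search, and the outer radius $c$ are illustrative assumptions.

```python
import math


def j1(x):
    """Spherical Bessel function of the first kind, degree 1 (closed form)."""
    return math.sin(x) / x ** 2 - math.cos(x) / x


def n1(x):
    """Spherical Bessel function of the second kind, degree 1 (closed form)."""
    return -math.cos(x) / x ** 2 - math.sin(x) / x


def bisect_root(f, lo, hi, tol=1e-12):
    """Bisection on a bracket [lo, hi] over which f changes sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Small-argument limits for n = 1: j_1(x) ~ x/3 (the r^n form) and
# n_1(x) ~ -1/x^2 (the r^-(n+1) form).
x_small = 0.05
ratio_j = j1(x_small) / (x_small / 3.0)
ratio_n = n1(x_small) / (-1.0 / x_small ** 2)
print("j_1(x) / (x/3)    =", ratio_j)
print("n_1(x) / (-1/x^2) =", ratio_n)

# First positive root x_1 of j_1, which sets k_1 = x_1 / c.
x1 = bisect_root(j1, 4.0, 5.0)
C_OUTER = 6771.2  # hypothetical outer shell radius c, km
k1 = x1 / C_OUTER
print("x_1 =", x1, " k_1 =", k1, "per km")
```

Both ratios approach 1 as the argument shrinks, which is the sense in which the Helmholtz solutions reduce to the harmonic $r^n$ and $r^{-(n+1)}$ forms when $k \to 0$.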
Using such expansions for $S$, $T$, and $P$ in Eq. 45 then provides an expansion of any vector field $\mathbf{F}$ within the spherical shell $a < r < c$ in terms of elementary vector fields, where advantage can be taken of the fact that each $j_n(k_i r)\, Y_n^{m,c}(\theta, \phi)$ and $j_n(k_i r)\, Y_n^{m,s}(\theta, \phi)$ will satisfy Eq. 47, with $k = k_i$. More details about how this can be applied to describe any nonpotential magnetic field can be found in Granzow (1983).
Rather than using Eq. 45, one may also use the alternative Helmholtz representation (e.g., Backus et al. 1996; Dahlen and Tromp 1998):
$$\mathbf{F} = U\hat{\mathbf{r}} + \nabla_S V - \hat{\mathbf{r}} \times \nabla_S W, \tag{52}$$
where the angular portion
$$\nabla_S = r\nabla - \mathbf{r}\,\frac{\partial}{\partial r} \tag{53}$$
of the $\nabla$ operator has been introduced (note that $\nabla_S \cdot \nabla_S = \nabla_S^2$ as defined by Eq. 26). Equation 52 amounts to decomposing $\mathbf{F}$ in terms of a purely radial vector field $U\hat{\mathbf{r}}$ and a purely tangent (to the sphere) vector field $\nabla_S V - \hat{\mathbf{r}} \times \nabla_S W$. Still considering $\mathbf{F}$ to be a general (well-behaved) vector field defined within the spherical shell $a < r < c$, this representation can be shown to be unique, provided one requires that for any value of $r$ within the shell, the average values of $V$ and $W$ over the sphere of radius $r$ (denoted $S(r)$) vanish:
$$\langle V \rangle_{S(r)} = \langle W \rangle_{S(r)} = 0. \tag{54}$$
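Of course, each scalar function $U$, $V$, $W$ can then also be expanded in terms of real surface spherical harmonics in the form
$$U(r, \theta, \phi) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} \left(U_n^{m,c}(r)\, Y_n^{m,c}(\theta, \phi) + U_n^{m,s}(r)\, Y_n^{m,s}(\theta, \phi)\right), \tag{55}$$
and similarly for $V$ and $W$ (for which the $n = 0$ term must however be set to zero because of Eq. 54). This then has the advantage that in implementing Eq. 55 and the equivalent expansions for $V$ and $W$ in Eq. 52, the radial dependence of each $U_n^{m,c}(r)$, $U_n^{m,s}(r)$, etc., is not affected by the $\nabla_S$ operator. This leads to
$$\begin{aligned} \mathbf{F}(r, \theta, \phi) = {} & \sum_{n=0}^{\infty} \sum_{m=0}^{n} \left(U_n^{m,c}(r)\, \mathbf{P}_n^{m,c}(\theta, \phi) + U_n^{m,s}(r)\, \mathbf{P}_n^{m,s}(\theta, \phi)\right) \\ & + \sum_{n=1}^{\infty} \sum_{m=0}^{n} \left(V_n^{m,c}(r)\, \mathbf{B}_n^{m,c}(\theta, \phi) + V_n^{m,s}(r)\, \mathbf{B}_n^{m,s}(\theta, \phi)\right) \\ & + \sum_{n=1}^{\infty} \sum_{m=0}^{n} \left(W_n^{m,c}(r)\, \mathbf{C}_n^{m,c}(\theta, \phi) + W_n^{m,s}(r)\, \mathbf{C}_n^{m,s}(\theta, \phi)\right), \end{aligned} \tag{56}$$
where the various $U_n^{m,(c,s)}(r)$, $V_n^{m,(c,s)}(r)$, and $W_n^{m,(c,s)}(r)$ functions can independently be expanded with the help of any relevant representation, and the following elementary vector functions have been introduced:
$$\mathbf{P}_n^{m,(c,s)}(\theta, \phi) = Y_n^{m,(c,s)}(\theta, \phi)\, \hat{\mathbf{r}}, \tag{57}$$
$$\mathbf{B}_n^{m,(c,s)}(\theta, \phi) = \nabla_S Y_n^{m,(c,s)}(\theta, \phi), \tag{58}$$
$$\mathbf{C}_n^{m,(c,s)}(\theta, \phi) = -\hat{\mathbf{r}} \times \nabla_S Y_n^{m,(c,s)}(\theta, \phi) = \nabla \times Y_n^{m,(c,s)}(\theta, \phi)\, \mathbf{r}. \tag{59}$$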
These are known as real vector spherical harmonics (see, e.g., Dahlen and Tromp 1998). Just like the surface spherical harmonics from which they derive, those vectors can also be introduced in complex form and with various norms (see, e.g., Morse and Feshbach 1953; Stern 1976; Granzow 1983; Jackson 1998; Dahlen and Tromp 1998). Here, for simplicity, real quantities and Schmidt quasi-normalizations are considered. Introducing the following inner product for two real vector fields $\mathbf{K}$ and $\mathbf{L}$ defined on the unit sphere:
$$\langle \mathbf{K}, \mathbf{L} \rangle \equiv \frac{1}{4\pi} \int_0^{2\pi}\!\!\int_0^{\pi} \mathbf{K}(\theta, \phi) \cdot \mathbf{L}(\theta, \phi) \sin\theta \, d\theta \, d\phi, \tag{60}$$
it can be shown that (see, e.g., Dahlen and Tromp 1998)
$$\left\langle \mathbf{K}_n^{m,(c,s)}, \mathbf{L}_l^{k,(c,s)} \right\rangle = 0, \tag{61}$$
whenever $\mathbf{K}_n^{m,(c,s)}$ and $\mathbf{L}_l^{k,(c,s)}$ are not strictly identical real vector spherical harmonics as defined by Eqs. 57–59. For each $\mathbf{P}_n^{m,(c,s)}$, $\mathbf{B}_n^{m,(c,s)}$, and $\mathbf{C}_n^{m,(c,s)}$, it can further be shown that (see again, e.g., Dahlen and Tromp 1998, but beware the choice of inner product (Eq. 60) and normalization (Eq. 32))
$$\left\langle \mathbf{P}_n^{m,c}, \mathbf{P}_n^{m,c} \right\rangle = \left\langle \mathbf{P}_n^{m,s}, \mathbf{P}_n^{m,s} \right\rangle = \frac{1}{2n+1}, \tag{62}$$
and
$$\left\langle \mathbf{B}_n^{m,c}, \mathbf{B}_n^{m,c} \right\rangle = \left\langle \mathbf{B}_n^{m,s}, \mathbf{B}_n^{m,s} \right\rangle = \left\langle \mathbf{C}_n^{m,c}, \mathbf{C}_n^{m,c} \right\rangle = \left\langle \mathbf{C}_n^{m,s}, \mathbf{C}_n^{m,s} \right\rangle = \frac{n(n+1)}{2n+1}. \tag{63}$$
Note the discrepancy by a factor of $n(n+1)$ between Eqs. 62 and 63, which can be avoided by introducing an additional factor $(n(n+1))^{-1/2}$ in the right-hand side of the definitions Eqs. 58 and 59 (as is most often done in the literature). Those real vector spherical harmonics are thus mutually orthogonal within each family and between families with respect to this inner product. They provide a convenient general basis for expanding any vector field $\mathbf{F}$ in the spherical shell $a < r < c$, as is made explicit by Eq. 56. Of course, many other bases can be built by making appropriate linear combinations of the $\mathbf{P}_n^{m,(c,s)}$, $\mathbf{B}_n^{m,(c,s)}$, and $\mathbf{C}_n^{m,(c,s)}$. Such linear combinations naturally arise when, for instance, implementing expansions of the type Eq. 51 for $S$, $T$, and $P$ in Eq. 45. Also, it should be noted that other linear combinations of the kind have already been encountered when potential vector fields were considered (as described in Sect. 3). This indeed led to Eqs. 16 and 17, where the elementary vector functions $\boldsymbol{\Pi}_{ni}^{m,(c,s)}(\theta,\varphi)$ and $\boldsymbol{\Pi}_{ne}^{m,(c,s)}(\theta,\varphi)$ naturally arose. These can now be rewritten in the form

$$\boldsymbol{\Pi}_{ni}^{m,(c,s)} = (n+1)\,\mathbf{P}_n^{m,(c,s)} - \mathbf{B}_n^{m,(c,s)}, \tag{64}$$

$$\boldsymbol{\Pi}_{ne}^{m,(c,s)} = -n\,\mathbf{P}_n^{m,(c,s)} - \mathbf{B}_n^{m,(c,s)}. \tag{65}$$

Together with the $\mathbf{C}_n^{m,(c,s)}$, they again form a general basis of mutually orthogonal vector fields (i.e., satisfying Eq. 61, as one can easily check). They also satisfy

$$\langle \boldsymbol{\Pi}_{ni}^{m,c},\boldsymbol{\Pi}_{ni}^{m,c}\rangle = \langle \boldsymbol{\Pi}_{ni}^{m,s},\boldsymbol{\Pi}_{ni}^{m,s}\rangle = n+1, \tag{66}$$

$$\langle \boldsymbol{\Pi}_{ne}^{m,c},\boldsymbol{\Pi}_{ne}^{m,c}\rangle = \langle \boldsymbol{\Pi}_{ne}^{m,s},\boldsymbol{\Pi}_{ne}^{m,s}\rangle = n. \tag{67}$$
An interesting discussion of how this alternative basis can be used for the purpose of describing the Earth’s magnetic field within a shell where sources with simplified geometry exist can be found in Winch et al. (2005).
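The norms quoted in Eqs. 62 and 63 can be checked numerically for one particular harmonic. The sketch below is only an illustration under stated assumptions: it takes the Schmidt quasi-normalized harmonic of degree 2 and order 1, written explicitly as $\sqrt{3}\sin\theta\cos\theta\cos\varphi$, and evaluates the inner product of Eq. 60 by quadrature. The helper names (`sphere_mean`, `y21c`, `grad_s_y21c_squared`) are hypothetical and not part of the chapter.

```python
import numpy as np

def sphere_mean(f, n_theta=32, n_phi=64):
    """Average of f(theta, phi) over the unit sphere (the 1/(4 pi) integral of Eq. 60)."""
    # Gauss-Legendre nodes and weights in x = cos(theta); sin(theta) d(theta) becomes dx.
    x, w = np.polynomial.legendre.leggauss(n_theta)
    theta = np.arccos(x)
    # Uniform grid in longitude; the rectangle rule is exact for low-order trigonometric terms.
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    t, p = np.meshgrid(theta, phi, indexing="ij")
    integral = np.sum(w[:, None] * f(t, p)) * (2.0 * np.pi / n_phi)
    return float(integral / (4.0 * np.pi))

def y21c(t, p):
    """Schmidt quasi-normalized real harmonic of degree 2, order 1 (cosine part)."""
    return np.sqrt(3.0) * np.sin(t) * np.cos(t) * np.cos(p)

def grad_s_y21c_squared(t, p):
    """Squared norm of the surface gradient of y21c (theta and phi components)."""
    d_theta = np.sqrt(3.0) * np.cos(2.0 * t) * np.cos(p)
    d_phi = -np.sqrt(3.0) * np.cos(t) * np.sin(p)  # (1/sin theta) d/dphi
    return d_theta**2 + d_phi**2

n = 2
norm_p = sphere_mean(lambda t, p: y21c(t, p) ** 2)   # compare with 1/(2n+1)
norm_b = sphere_mean(grad_s_y21c_squared)            # compare with n(n+1)/(2n+1)
print(norm_p, 1.0 / (2 * n + 1))
print(norm_b, n * (n + 1) / (2 * n + 1))
```

Because the radial unit vector has unit length, the first quantity is also the norm of the corresponding $\mathbf{P}$ vector harmonic, and the second is the norm of the corresponding $\mathbf{B}$ (and, by rotation through 90 degrees on the sphere, $\mathbf{C}$) vector harmonic.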
4.2
Mie Representation
Both Helmholtz representations, Eqs. 45 and 52, apply to any well-behaved vector field $\mathbf{F}$. Another, more restrictive, representation can be derived if one requires the field $\mathbf{F}$ to be solenoidal, as the magnetic field must be because of Eq. 2. Indeed, if a solenoidal field is written in the form of Eq. 45, then $S$ must be harmonic, i.e., it must satisfy Laplace's equation (Eq. 11). However, consider the following identity for the $\nabla\times\nabla\times$ operator in spherical coordinates:

$$\nabla\times\nabla\times P\mathbf{r} = \nabla\left(\frac{\partial}{\partial r}(rP)\right) - \mathbf{r}\nabla^2 P. \tag{68}$$

This suggests that if an additional scalar potential of the form

$$P_S = \frac{1}{r}\int S\,dr \tag{69}$$

is added to the original scalar potential $P$, then $\nabla S$ may be absorbed into Eq. 68. This is true since $P_S$ is then also harmonic and

$$S = \frac{\partial}{\partial r}(rP_S). \tag{70}$$

What this means is that any solenoidal field such as $\mathbf{B}$ may be written as the curl of a vector potential, which is to say, the last two terms of Eq. 45. The condition $\nabla\cdot\mathbf{B} = 0$ has eliminated the need for one of the three original scalar potentials and leaves $\mathbf{B}$ in the form

$$\mathbf{B} = \nabla\times T\mathbf{r} + \nabla\times\nabla\times P\mathbf{r}. \tag{71}$$
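This representation can be shown to be unique provided one requests

$$\langle T\rangle_{S(r)} = \langle P\rangle_{S(r)} = 0, \tag{72}$$

for all values of $r$ within the spherical shell of interest $a < r < c$. It is known as the Mie representation for a solenoidal vector (Mie 1908; Backus 1986; Backus et al. 1996; Dahlen and Tromp 1998). The first term in Eq. 71 is known as the toroidal part of $\mathbf{B}$, denoted as $\mathbf{B}_{tor}$, and $T$ is known as the toroidal scalar potential

$$\mathbf{B}_{tor} = \nabla\times T\mathbf{r} \tag{73}$$

$$\phantom{\mathbf{B}_{tor}} = \frac{1}{\sin\theta}\frac{\partial T}{\partial\varphi}\,\hat{\boldsymbol{\theta}} - \frac{\partial T}{\partial\theta}\,\hat{\boldsymbol{\varphi}}. \tag{74}$$

This part has no radial component and its surface divergence is zero

$$\nabla_S\cdot\mathbf{B}_{tor} = 0, \tag{75}$$

where the surface divergence operates on a given vector field $\mathbf{F}$ as (recall the definition Eq. 53 of $\nabla_S$)

$$\nabla_S\cdot\mathbf{F} = 2F_r + \frac{1}{\sin\theta}\left[\frac{\partial}{\partial\theta}(F_\theta\sin\theta) + \frac{\partial F_\varphi}{\partial\varphi}\right]. \tag{76}$$

The second term in Eq. 71 is known as the poloidal part of $\mathbf{B}$, denoted as $\mathbf{B}_{pol}$, and $P$ is known as the poloidal scalar potential

$$\mathbf{B}_{pol} = \nabla\times\nabla\times P\mathbf{r} \tag{77}$$

$$\phantom{\mathbf{B}_{pol}} = -\frac{1}{r}\nabla_S^2 P\,\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial}{\partial\theta}\left[\frac{\partial}{\partial r}(rP)\right]\hat{\boldsymbol{\theta}} + \frac{1}{r\sin\theta}\frac{\partial}{\partial\varphi}\left[\frac{\partial}{\partial r}(rP)\right]\hat{\boldsymbol{\varphi}}. \tag{78}$$

This part has a vanishing surface curl on the sphere, i.e., it satisfies

$$\hat{\mathbf{r}}\cdot(\nabla_S\times\mathbf{B}_{pol}) = \Lambda_S\cdot\mathbf{B}_{pol} = 0, \tag{79}$$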
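where the operator $\Lambda_S = \hat{\mathbf{r}}\times\nabla_S$, known as the surface curl operator, has been introduced (Backus et al. 1996; Dahlen and Tromp 1998). This operator satisfies for any vector field $\mathbf{F}$:

$$\Lambda_S\cdot\mathbf{F} = \frac{1}{\sin\theta}\left[\frac{\partial}{\partial\theta}(F_\varphi\sin\theta) - \frac{\partial F_\theta}{\partial\varphi}\right]. \tag{80}$$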
The terms toroidal and poloidal were coined by Elsasser (1946) and so the Mie representation is sometimes also referred to as the toroidal-poloidal decomposition.
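The identity of Eq. 68, on which the Mie representation rests, can be verified in a few lines. The following is a sketch of one possible derivation, using only the standard identity $\nabla\times\nabla\times\mathbf{A} = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$ with $\mathbf{A} = P\mathbf{r}$, together with $\nabla\cdot\mathbf{r} = 3$ and $\mathbf{r}\cdot\nabla P = r\,\partial P/\partial r$; it is not necessarily the route taken in the references cited above.

```latex
\begin{aligned}
\nabla\cdot(P\mathbf{r}) &= P\,\nabla\cdot\mathbf{r} + \mathbf{r}\cdot\nabla P
                          = 3P + r\frac{\partial P}{\partial r},\\
\nabla^2(P\mathbf{r})    &= \mathbf{r}\,\nabla^2 P + 2\nabla P,\\
\nabla\times\nabla\times P\mathbf{r}
  &= \nabla\!\left(3P + r\frac{\partial P}{\partial r}\right) - 2\nabla P - \mathbf{r}\,\nabla^2 P\\
  &= \nabla\!\left(P + r\frac{\partial P}{\partial r}\right) - \mathbf{r}\,\nabla^2 P
   = \nabla\!\left(\frac{\partial}{\partial r}(rP)\right) - \mathbf{r}\,\nabla^2 P .
\end{aligned}
```

When $P$ is harmonic the last term vanishes, so the poloidal field reduces to a pure gradient, which is why a harmonic $S$ can be absorbed through Eqs. 69 and 70.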
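Just like for the Helmholtz representations, one can again expand the toroidal and poloidal scalar fields in terms of real spherical harmonics:

$$T(r,\theta,\varphi) = \sum_{n=1}^{\infty}\sum_{m=0}^{n}\left(T_n^{m,c}(r)\,Y_n^{m,c}(\theta,\varphi) + T_n^{m,s}(r)\,Y_n^{m,s}(\theta,\varphi)\right), \tag{81}$$

$$P(r,\theta,\varphi) = \sum_{n=1}^{\infty}\sum_{m=0}^{n}\left(P_n^{m,c}(r)\,Y_n^{m,c}(\theta,\varphi) + P_n^{m,s}(r)\,Y_n^{m,s}(\theta,\varphi)\right), \tag{82}$$

where the sum starts at $n = 1$, and not at $n = 0$, because of Eq. 72. This then leads to (recalling Eq. 59)

$$\mathbf{B}_{tor}(r,\theta,\varphi) = \sum_{n=1}^{\infty}\sum_{m=0}^{n}\left(T_n^{m,c}(r)\,\mathbf{C}_n^{m,c}(\theta,\varphi) + T_n^{m,s}(r)\,\mathbf{C}_n^{m,s}(\theta,\varphi)\right), \tag{83}$$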
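and (recalling Eqs. 57 and 58 and making use of Eq. 27; the factors $1/r$ are those of Eq. 78)

$$\begin{aligned}
\mathbf{B}_{pol}(r,\theta,\varphi) ={}& \sum_{n=1}^{\infty}\sum_{m=0}^{n}\frac{n(n+1)}{r}\left(P_n^{m,c}(r)\,\mathbf{P}_n^{m,c}(\theta,\varphi) + P_n^{m,s}(r)\,\mathbf{P}_n^{m,s}(\theta,\varphi)\right)\\
&+ \sum_{n=1}^{\infty}\sum_{m=0}^{n}\frac{1}{r}\left(\frac{d}{dr}\bigl(rP_n^{m,c}(r)\bigr)\,\mathbf{B}_n^{m,c}(\theta,\varphi) + \frac{d}{dr}\bigl(rP_n^{m,s}(r)\bigr)\,\mathbf{B}_n^{m,s}(\theta,\varphi)\right),
\end{aligned} \tag{84}$$

which shows that $\mathbf{B}_{tor}$ and $\mathbf{B}_{pol}$ are expressible in terms of different families of vector spherical harmonics. As a result, it also follows that for any value of $r$ within the shell of interest $a < r < c$, $\mathbf{B}_{tor}$ and $\mathbf{B}_{pol}$ are orthogonal over the sphere of radius $r$, i.e., with respect to the inner product defined by Eq. 60:

$$\langle \mathbf{B}_{tor}(r),\mathbf{B}_{pol}(r)\rangle = 0. \tag{85}$$

4.3
Relationship of B and J Mie Representations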
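Recall from the pre-Maxwell form of Ampere's law that $\mathbf{J}$ is also the curl of a vector. This means that it also is a solenoidal field

$$\nabla\cdot\mathbf{J} = 0 \tag{86}$$

and so also possesses a toroidal-poloidal decomposition. Let $T_B$ and $P_B$ be the toroidal and poloidal scalars representing the magnetic field $\mathbf{B}$ and let $T_J$ and $P_J$ be the same for the volume current density $\mathbf{J}$. It follows from Ampere's law (Eq. 8) and Eqs. 68 and 71 that

$$\mu_0\mathbf{J} = \nabla\times\nabla\times T_B\mathbf{r} + \nabla\times\nabla\times\nabla\times P_B\mathbf{r} \tag{87}$$

$$\phantom{\mu_0\mathbf{J}} = \nabla\times(-\nabla^2 P_B)\,\mathbf{r} + \nabla\times\nabla\times T_B\mathbf{r}. \tag{88}$$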
Taking advantage of the uniqueness of the Mie representation when Eq. 72 is satisfied, one can then identify the following relationships between the various scalar functions

$$T_J = -\frac{1}{\mu_0}\nabla^2 P_B, \tag{89}$$

$$P_J = \frac{1}{\mu_0}T_B. \tag{90}$$

This then shows that poloidal magnetic fields are associated with toroidal current densities and toroidal magnetic fields are associated with poloidal current densities. Conversely, one may solve Eqs. 89 and 90 for $T_B$ and $P_B$ in terms of $T_J$ and $P_J$. The first equation yields Poisson's equation for $P_B$ at position $\mathbf{r}$

$$\nabla^2 P_B(\mathbf{r}) = -\mu_0 T_J(\mathbf{r}) \tag{91}$$

whose classical solution (see, e.g., Jackson 1998) is given by

$$P_B(\mathbf{r}) = \frac{\mu_0}{4\pi}\int_{\tau'}\frac{T_J(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\tau', \tag{92}$$

where $\tau'$ is a volume enclosing all currents, $d\tau'$ is the differential volume element, and $\mathbf{r}'$ is the position within the volume. The second equation yields simply

$$T_B(\mathbf{r}) = \mu_0 P_J(\mathbf{r}). \tag{93}$$

Substituting these into Eqs. 73 and 77 shows the dependence of the toroidal and poloidal parts of $\mathbf{B}$ with respect to those of $\mathbf{J}$

$$\mathbf{B}_{tor}(\mathbf{r}) = \nabla\times\left[\mu_0 P_J(\mathbf{r})\right]\mathbf{r}, \tag{94}$$

$$\mathbf{B}_{pol}(\mathbf{r}) = \nabla\times\nabla\times\left[\frac{\mu_0}{4\pi}\int_{\tau'}\frac{T_J(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\tau'\right]\mathbf{r}. \tag{95}$$
This is an interesting result considering that $\nabla$ is operating at the position $\mathbf{r}$ where $\mathbf{B}$ is being calculated. If this position happens to be in a source-free region where $\mathbf{J} = \mathbf{0}$, then $P_J(\mathbf{r}) = 0$ and the toroidal magnetic field $\mathbf{B}_{tor}(\mathbf{r}) = \mathbf{0}$. This implies that toroidal magnetic fields only exist within a conductor or magnetized material where the associated poloidal $\mathbf{J}$ is present. At the same location $\mathbf{r}$, the poloidal magnetic field $\mathbf{B}_{pol}$, however, does not identically vanish. This is because $\mathbf{B}_{pol}$ is sensitive to the toroidal current scalar $T_J(\mathbf{r}')$ evaluated within the distant current-carrying volume and not only to its local value. Of course at this point, Eq. 95 collapses to the usual potential form

$$\mathbf{B}_{pol}(\mathbf{r}) = \nabla\frac{\partial}{\partial r}\bigl(rP_B(\mathbf{r})\bigr) = \nabla\frac{\partial}{\partial r}\left[r\,\frac{\mu_0}{4\pi}\int_{\tau'}\frac{T_J(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\tau'\right]. \tag{96}$$

This is because of Eq. 68 and of the fact that

$$\nabla^2 P_B(\mathbf{r}) = \begin{cases} -\mu_0 T_J(\mathbf{r}), & \mathbf{r}\ \text{inside}\ \tau',\\ 0, & \mathbf{r}\ \text{outside}\ \tau'. \end{cases} \tag{97}$$
This result then also provides the possibility of relating the present formalism to the one previously described in Sect. 3 when considering potential fields that arise in a source-free shell $a < r < c$. Then, indeed, the internal (below $r = a$) and external (above $r = c$) current sources lead to internal $P_B^i$ and external $P_B^e$ magnetic poloidal scalar potentials, which can be related to the harmonic scalar potentials $V^i$ and $V^e$ defined by Eqs. 13 and 14 through

$$V^i = -\frac{\partial}{\partial r}(rP_B^i), \tag{98}$$

$$V^e = -\frac{\partial}{\partial r}(rP_B^e), \tag{99}$$

and leading to expansions of the form:

$$P_B^i(r,\theta,\varphi) = a\sum_{n=1}^{\infty}\left(\frac{a}{r}\right)^{n+1}\sum_{m=0}^{n}\left(G_n^m\cos m\varphi + H_n^m\sin m\varphi\right)P_n^m(\cos\theta), \tag{100}$$

$$P_B^e(r,\theta,\varphi) = a\sum_{n=1}^{\infty}\left(\frac{r}{a}\right)^{n}\sum_{m=0}^{n}\left(Q_n^m\cos m\varphi + S_n^m\sin m\varphi\right)P_n^m(\cos\theta), \tag{101}$$

where

$$G_n^m = \frac{1}{n}g_n^m,\quad H_n^m = \frac{1}{n}h_n^m,\quad Q_n^m = -\frac{1}{n+1}q_n^m,\quad S_n^m = -\frac{1}{n+1}s_n^m. \tag{102}$$
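This amounts to stating that within a source-free spherical shell $a < r < c$, the $r$ dependence of the $P_n^{m,c}(r)$ and $P_n^{m,s}(r)$ in Eq. 82 is entirely specified and of the form

$$P_n^{m,c}(r) = a\left[\frac{g_n^m}{n}\left(\frac{a}{r}\right)^{n+1} - \frac{q_n^m}{n+1}\left(\frac{r}{a}\right)^{n}\right], \tag{103}$$

$$P_n^{m,s}(r) = a\left[\frac{h_n^m}{n}\left(\frac{a}{r}\right)^{n+1} - \frac{s_n^m}{n+1}\left(\frac{r}{a}\right)^{n}\right], \tag{104}$$

while of course $T_n^{m,c}(r) = T_n^{m,s}(r) = 0$.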
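A quick numerical sanity check of Eq. 103 may help. By Eq. 78 the radial part of the poloidal field is $-(1/r)\nabla_S^2 P$, so each degree-$n$ term contributes $n(n+1)P_n^{m,c}(r)/r$ to the radial field; this should coincide with the radial factors of Eqs. 119 and 120. The sketch below assumes a reference radius $a = 6371.2$ km and purely hypothetical coefficient values; the function names are illustrative only.

```python
import math

def poloidal_radial_function(r, n, g, q, a=6371.2):
    """P_n^{m,c}(r) of Eq. 103 for internal (g) and external (q) Gauss coefficients."""
    return a * ((g / n) * (a / r) ** (n + 1) - (q / (n + 1)) * (r / a) ** n)

def radial_field_factor(r, n, g, q, a=6371.2):
    """Degree-n radial factor of the potential field, as in Eqs. 119 and 120."""
    return (n + 1) * (a / r) ** (n + 2) * g - n * (r / a) ** (n - 1) * q

# Hypothetical values, chosen only for illustration (nT for g and q, km for r).
n, g, q, r = 3, 1300.0, 15.0, 6800.0
from_poloidal_scalar = n * (n + 1) / r * poloidal_radial_function(r, n, g, q)
from_potential = radial_field_factor(r, n, g, q)
print(from_poloidal_scalar, from_potential)
```

The two printed numbers agree to rounding error, which confirms that the factors $1/n$ and $-1/(n+1)$ of Eq. 102 are the ones needed to recover the usual Gauss-coefficient description.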
4.4
Magnetic Fields in a Current-Carrying Shell
The toroidal-poloidal decomposition provides a convenient way of describing the magnetic field in a current-carrying shell $a < r < c$. The magnetic field then has four parts

$$\mathbf{B} = \mathbf{B}_{pol}^{i} + \mathbf{B}_{pol}^{e} + \mathbf{B}_{pol}^{sh} + \mathbf{B}_{tor}^{sh}, \tag{105}$$

where $\mathbf{B}_{pol}^{i}$ is the potential field due to toroidal currents $\mathbf{J}_{tor}^{i}$ in the region $r < a$ (in fact, due to all currents in the region $r < a$, since poloidal currents in this region do not produce any field in the shell $a < r < c$), $\mathbf{B}_{pol}^{e}$ is the potential field due to toroidal currents $\mathbf{J}_{tor}^{e}$ in the region $c < r$ (in fact, due to all currents in the region $c < r$, since poloidal currents in this region do not produce any field in the shell $a < r < c$), and $\mathbf{B}_{pol}^{sh}$ and $\mathbf{B}_{tor}^{sh}$ are the nonpotential poloidal and toroidal fields due to in situ toroidal $\mathbf{J}_{tor}^{sh}$ and poloidal $\mathbf{J}_{pol}^{sh}$ currents in the shell. From what was just seen, it is now known that the behavior of $\mathbf{B}_{pol}^{i}$ and $\mathbf{B}_{pol}^{e}$ is entirely dictated by the fact that their respective poloidal scalars $P_B^i$ and $P_B^e$ must take the forms Eqs. 100 and 101. Those potential fields are thus entirely defined by the knowledge of the $G_n^m$, $H_n^m$, $Q_n^m$, and $S_n^m$ coefficients (or, equivalently, of the Gauss $g_n^m$, $h_n^m$, $q_n^m$, and $s_n^m$ coefficients). By contrast, all that is known about $\mathbf{B}_{pol}^{sh}$ and $\mathbf{B}_{tor}^{sh}$ is that they are associated with toroidal and poloidal scalars which can be written in the form Eqs. 81 and 82, so that they themselves can be written in the form Eqs. 83 and 84. The radial dependence of the corresponding $P_n^{m,(c,s)}(r)$ and $T_n^{m,(c,s)}(r)$ functions is unknown, but dictated by the distribution of the current sources $\mathbf{J}_{tor}^{sh}$ and $\mathbf{J}_{pol}^{sh}$ within the shell, because of Eqs. 92 and 93. It is interesting at this stage to introduce an inner shell $a' < r < c'$ within the spherical shell $a < r < c$ considered so far (and which is referred to here as the outer shell). Space can then naturally be divided into five regions (Fig. 3): region I for $r < a$, region II for $a < r < a'$, region III for $a' < r < c'$ (the inner shell), region IV for $c' < r < c$, and region V for $r > c$. As was just seen, everywhere within the outer shell $a < r < c$, the field $\mathbf{B}$ can be written in the form of Eq. 105.
Fig. 3 Schematic showing a meridional cross section of a general current-carrying shell $a < r < c$ within which an inner shell $a' < r < c'$ is defined. This then defines five regions, as identified by their numbers (I, II, III, IV, and V), within which both toroidal and poloidal sources can be found. These will contribute differently to the field, depending on where the field is observed. Of particular interest is the way these contribute when the field is observed in the inner shell, and this shell progressively shrinks to the sphere $r = R$ (see text for details). The regions are bounded, from the center outward, by $r = a$, $r = a'$, $r = c'$, and $r = c$, with the sphere $r = R$ lying inside the inner shell.

But in exactly the same way, everywhere within the inner shell $a' < r < c'$, the field can also be written in the form

$$\mathbf{B} = \mathbf{B}_{pol}^{i'} + \mathbf{B}_{pol}^{e'} + \mathbf{B}_{pol}^{sh'} + \mathbf{B}_{tor}^{sh'}, \tag{106}$$

where $\mathbf{B}_{pol}^{i'}$ is the potential field due to toroidal currents $\mathbf{J}_{tor}^{i'}$ in the region $r < a'$, $\mathbf{B}_{pol}^{e'}$ is the potential field due to toroidal currents $\mathbf{J}_{tor}^{e'}$ in the region $c' < r$, and $\mathbf{B}_{pol}^{sh'}$ and $\mathbf{B}_{tor}^{sh'}$ are the nonpotential poloidal and toroidal fields due to in situ toroidal $\mathbf{J}_{tor}^{sh'}$ and poloidal $\mathbf{J}_{pol}^{sh'}$ currents in the inner shell. It is important to note that $\mathbf{B}_{pol}^{i'} \neq \mathbf{B}_{pol}^{i}$ and $\mathbf{B}_{pol}^{e'} \neq \mathbf{B}_{pol}^{e}$. Toroidal currents flowing in region I (resp. V), which already contributed to $\mathbf{B}_{pol}^{i}$ (resp. $\mathbf{B}_{pol}^{e}$), still contribute to $\mathbf{B}_{pol}^{i'}$ (resp. $\mathbf{B}_{pol}^{e'}$). But toroidal currents flowing in region II (resp. IV), which contributed to $\mathbf{B}_{pol}^{sh}$ and not to $\mathbf{B}_{pol}^{i}$ (resp. $\mathbf{B}_{pol}^{e}$), now contribute to $\mathbf{B}_{pol}^{i'}$ (resp. $\mathbf{B}_{pol}^{e'}$). Only the toroidal currents flowing in region III, which already contributed to $\mathbf{B}_{pol}^{sh}$, still contribute to $\mathbf{B}_{pol}^{sh'}$. By contrast, $\mathbf{B}_{tor}^{sh'} = \mathbf{B}_{tor}^{sh}$, both because the toroidal-poloidal decomposition is unique and because the toroidal field is only sensitive to the local behavior of the poloidal currents (recall Eq. 94). Obviously, the smaller the thickness $h = c' - a'$ of the inner shell, the fewer sources contributing to $\mathbf{B}_{pol}^{sh}$ still contribute to $\mathbf{B}_{pol}^{sh'}$. In fact, it can be shown that if this inner shell eventually shrinks to a sphere of radius $r = R$ (Fig. 3), then $\mathbf{B}_{pol}^{sh'}$ goes to zero as $h/R \to 0$, while $\mathbf{B}_{tor}^{sh'}$ remains equal to $\mathbf{B}_{tor}^{sh}$ (Backus 1986). Further introducing $\mathbf{B}_{iR}$ (resp. $\mathbf{B}_{eR}$) as the limit of $\mathbf{B}_{pol}^{i'}$ (resp. $\mathbf{B}_{pol}^{e'}$) when $h/R \to 0$ then makes it possible to introduce the following unique decomposition of a magnetic field $\mathbf{B}$ on a sphere of radius $r = R$ surrounded by sources:

$$\mathbf{B}(R,\theta,\varphi) = \mathbf{B}_{iR}(R,\theta,\varphi) + \mathbf{B}_{eR}(R,\theta,\varphi) + \mathbf{B}_{tor}^{sh}(R,\theta,\varphi), \tag{107}$$

where (recall Eq. 83)

$$\mathbf{B}_{tor}^{sh}(R,\theta,\varphi) = \sum_{n=1}^{\infty}\sum_{m=0}^{n}\left(T_n^{m,c}(R)\,\mathbf{C}_n^{m,c}(\theta,\varphi) + T_n^{m,s}(R)\,\mathbf{C}_n^{m,s}(\theta,\varphi)\right) \tag{108}$$

is the toroidal field produced on the sphere $r = R$ by the local poloidal currents and $\mathbf{B}_{iR}(R,\theta,\varphi)$ and $\mathbf{B}_{eR}(R,\theta,\varphi)$ are the potential fields produced on the sphere $r = R$ by all sources, respectively, below and above $r = R$. These fields are in fact the values taken for $r = R$ of the potential fields $\mathbf{B}_{iR}(r,\theta,\varphi)$ and $\mathbf{B}_{eR}(r,\theta,\varphi)$, which may be defined more generally for, respectively, $r \geq R$ and $r \leq R$, with the help of (recall Eqs. 98–102 and 16–17)
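$$\mathbf{B}_{iR}(r,\theta,\varphi) = \sum_{n=1}^{\infty}\left(\frac{a}{r}\right)^{n+2}\sum_{m=0}^{n}\left(g_n^m(R)\,\boldsymbol{\Pi}_{ni}^{m,c}(\theta,\varphi) + h_n^m(R)\,\boldsymbol{\Pi}_{ni}^{m,s}(\theta,\varphi)\right), \tag{109}$$

$$\mathbf{B}_{eR}(r,\theta,\varphi) = \sum_{n=1}^{\infty}\left(\frac{r}{a}\right)^{n-1}\sum_{m=0}^{n}\left(q_n^m(R)\,\boldsymbol{\Pi}_{ne}^{m,c}(\theta,\varphi) + s_n^m(R)\,\boldsymbol{\Pi}_{ne}^{m,s}(\theta,\varphi)\right), \tag{110}$$

where the $R$ dependence of the Gauss coefficients is here to recall that these Gauss coefficients describe the fields produced by all sources, respectively, below ($g_n^m(R)$, $h_n^m(R)$) and above ($q_n^m(R)$, $s_n^m(R)$) $r = R$.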
4.5
Thin-Shell Approximation
Finally, if $h/R$ is not zero, but small enough, one may still write within the (then thin) shell $a' < r < c'$ around $r = R$:

$$\mathbf{B}(r,\theta,\varphi) = \mathbf{B}_{i}(r,\theta,\varphi) + \mathbf{B}_{e}(r,\theta,\varphi) + \mathbf{B}_{tor}^{sh}(r,\theta,\varphi), \tag{111}$$

where $\mathbf{B}_{i}(r,\theta,\varphi)$, $\mathbf{B}_{e}(r,\theta,\varphi)$, and $\mathbf{B}_{tor}^{sh}(R,\theta,\varphi)$ are given by Eqs. 108–110. Equation 111 is then correct to within $\mathbf{B}_{pol}^{sh}$ and $\mathbf{B}_{tor}^{sh}$ corrections of order $h/R$ (Backus 1986). This approximation is known as the thin-shell approximation.
5
Spatial Power Spectra
In discussing the various components of the Earth's magnetic field, and in particular the way each component contributes on average to the observed magnetic field, it will prove useful to deal with the concept of spatial power spectra. This concept was introduced by Mauersberger (1956) and popularized by Lowes (1966, 1974), both in the case of potential fields. However, it is quite straightforward to introduce these also in the case of nonpotential fields. Indeed, consider the sphere $S(R)$ of radius $r = R$ and assume the most general case when this sphere is surrounded by sources. Then, as seen previously, the field can be written in the form of Eq. 107, and its average squared magnitude over $S(R)$ can be written in the form

$$\langle \mathbf{B}^2(R,\theta,\varphi)\rangle_{S(R)} = \frac{1}{4\pi}\int_{\varphi=0}^{2\pi}\int_{\theta=0}^{\pi}\mathbf{B}(R,\theta,\varphi)\cdot\mathbf{B}(R,\theta,\varphi)\,\sin\theta\,d\theta\,d\varphi = W^{i}(R) + W^{e}(R) + W^{T}(R), \tag{112}$$

where

$$W^{i}(R) = \sum_{n=1}^{\infty}W_n^{i}(R),\quad W^{e}(R) = \sum_{n=1}^{\infty}W_n^{e}(R),\quad W^{T}(R) = \sum_{n=1}^{\infty}W_n^{T}(R), \tag{113}$$

with

$$W_n^{i}(R) = (n+1)\left(\frac{a}{R}\right)^{2n+4}\sum_{m=0}^{n}\left[\bigl(g_n^m(R)\bigr)^2 + \bigl(h_n^m(R)\bigr)^2\right], \tag{114}$$

$$W_n^{e}(R) = n\left(\frac{R}{a}\right)^{2n-2}\sum_{m=0}^{n}\left[\bigl(q_n^m(R)\bigr)^2 + \bigl(s_n^m(R)\bigr)^2\right], \tag{115}$$

$$W_n^{T}(R) = \frac{n(n+1)}{2n+1}\sum_{m=0}^{n}\left[\bigl(T_n^{m,c}(R)\bigr)^2 + \bigl(T_n^{m,s}(R)\bigr)^2\right], \tag{116}$$
all of which follows from Eqs. 108 to 110 and the orthogonality properties of the $\mathbf{C}_n^{m,(c,s)}$ and $\boldsymbol{\Pi}_{n(i,e)}^{m,(c,s)}$ spherical harmonic vectors (recall Eqs. 60–62). Equations 112–113 then show that each type of field (the potential field produced by all sources above $r = R$, the potential field produced by all sources below $r = R$, and the nonpotential (toroidal) field produced by the local (poloidal) sources on $r = R$) and, within each type of field, each degree $n$ (in fact, each elementary field of degree $n$ and order $m$, as is further shown by Eqs. 114–116) contributes independently to the average squared magnitude $\langle \mathbf{B}^2(R,\theta,\varphi)\rangle_{S(R)}$ on the sphere $r = R$. Hence, plotting $W_n^{i}(R)$ (resp. $W_n^{e}(R)$, $W_n^{T}(R)$) as a function of $n$ provides a very convenient means of identifying which sources, and within each type of source, which degrees $n$, contribute most on average to the magnetic field $\mathbf{B}(R,\theta,\varphi)$ on the sphere $r = R$. Such plots are known as spatial power spectra. In the more restrictive (and better known) case when the field is potential within a source-free shell $a < r < c$, the field takes the form (Eq. 15) $\mathbf{B}(r,\theta,\varphi) = \mathbf{B}_i(r,\theta,\varphi) + \mathbf{B}_e(r,\theta,\varphi)$, where $\mathbf{B}_i(r,\theta,\varphi)$ and $\mathbf{B}_e(r,\theta,\varphi)$ are respectively defined by Eqs. 16 and 17. In exactly the same way, it can again be shown that for any value of $r$ within the shell, each degree $n$ of the field $\mathbf{B}_i(r,\theta,\varphi)$ of internal origin (with sources below $r = a$) and of the field $\mathbf{B}_e(r,\theta,\varphi)$ of external origin (with sources above $r = c$) contributes to the average squared magnitude $\langle \mathbf{B}^2(r,\theta,\varphi)\rangle_{S(r)}$ on the sphere $S(r)$ by, respectively,

$$W_n^{i}(r) = (n+1)\left(\frac{a}{r}\right)^{2n+4}\sum_{m=0}^{n}\left[(g_n^m)^2 + (h_n^m)^2\right], \tag{117}$$

$$W_n^{e}(r) = n\left(\frac{r}{a}\right)^{2n-2}\sum_{m=0}^{n}\left[(q_n^m)^2 + (s_n^m)^2\right], \tag{118}$$

which again define spatial power spectra. Note that in that case, the only $r$-dependence is the one due to the geometric factors $(a/r)^{2n+4}$ and $(r/a)^{2n-2}$. The spectrum for the field of internal origin $W_n^{i}(r)$ is what is often referred to as the Lowes-Mauersberger spectrum.
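As an illustration of Eq. 117, the internal spectrum can be computed directly from a table of Gauss coefficients. The sketch below is a minimal example, not an established implementation: the function name, the dictionary layout keyed by $(n, m)$, the reference radius $a = 6371.2$ km, and the coefficient values are all assumptions made for the sake of the example (the values are hypothetical and not taken from any published field model).

```python
import math

def lowes_mauersberger_spectrum(g, h, r, a=6371.2):
    """Return {n: W_n^i(r)} following Eq. 117.

    g, h: dicts mapping (n, m) to internal Gauss coefficients (missing entries count as 0).
    r, a: observation radius and reference radius, in the same length unit.
    """
    degrees = sorted({n for (n, _m) in g} | {n for (n, _m) in h})
    spectrum = {}
    for n in degrees:
        # Sum of squared coefficients over all orders m = 0..n of this degree.
        power = sum(g.get((n, m), 0.0) ** 2 + h.get((n, m), 0.0) ** 2
                    for m in range(n + 1))
        spectrum[n] = (n + 1) * (a / r) ** (2 * n + 4) * power
    return spectrum

# Hypothetical coefficients in nT, for illustration only.
g = {(1, 0): -30000.0, (1, 1): -1500.0, (2, 0): -2500.0}
h = {(1, 1): 5000.0}

w_surface = lowes_mauersberger_spectrum(g, h, r=6371.2)
w_up = lowes_mauersberger_spectrum(g, h, r=2 * 6371.2)
print(w_surface)
print(w_up)
```

Doubling the radius divides the degree-1 term by $2^{6}$ and the degree-2 term by $2^{8}$, which is the geometric attenuation $(a/r)^{2n+4}$ noted in the text: higher degrees decay faster with altitude.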
6
Mathematical Uniqueness Issue
Sections 2–4 introduced several mathematical ways of representing magnetic fields in spherical shells, when sources of those fields lie either below, within, or above those shells. The goal is now to take advantage of those representations to explain how the best possible description of the Earth’s magnetic field can be recovered from available observations. Obviously, the more the observations, the better the description. However, even an infinite number of observations might not be enough to guarantee that one does eventually achieve a proper description of the field. Observations do not only need to be numerous; they also need to provide adequate information. This is an important issue since, in practice, observations cannot be made anywhere. Historical observations have all been made at the Earth’s surface. Aeromagnetic surveys provided additional observations after 1950, and satellite missions only started in the 1960s. Assuming that all those observations could have been made in an infinitely dense way (here, the issue of the limited density of observations is not discussed) and instantly (temporal issues are not discussed either), this means that the best information historical observations and aeromagnetic surveys can provide is the knowledge of B or of some derived quantities (components, direction, or intensity) over the entire surface of the Earth. Satellites bring additional information, which at best is a complete knowledge of B or of some derived quantities within a thin shell where sources can be found. To what extent is this enough to recover a complete mathematical description of the Earth’s magnetic field?
6.1
Uniqueness of Magnetic Fields in a Source-Free Shell
First consider the situation when the sources of the magnetic field are a priori known to be either internal to an inner surface $\Sigma_i$ or external to an outer surface $\Sigma_e$, so that the shell in between $\Sigma_i$ and $\Sigma_e$ is source-free. Within this shell, Eqs. 2 and 8 show that both $\nabla\cdot\mathbf{B}\ (= 0)$ and $\nabla\times\mathbf{B}\ (= \mathbf{0})$ are known. Applying Helmholtz theorem then shows that the field is completely characterized within the shell, provided the normal component of the field is known everywhere on both $\Sigma_i$ and $\Sigma_e$. This very general result, also known in potential theory as the uniqueness theorem for Neumann boundary conditions (e.g., Kellogg 1954; Blakely 1995), can be made more explicit in the case considered in Sect. 3 when the shell is spherical, and $\Sigma_i$ and $\Sigma_e$ are defined by $r = a$ and $r = c$. In that case, it is known that within the shell, $\mathbf{B}$ is described by $\mathbf{B}(r,\theta,\varphi) = \mathbf{B}_i(r,\theta,\varphi) + \mathbf{B}_e(r,\theta,\varphi)$ (Eq. 15) where $\mathbf{B}_i(r,\theta,\varphi)$ and $\mathbf{B}_e(r,\theta,\varphi)$ are given by Eqs. 16 and 17, the radial components of which can easily be inferred from Eqs. 18–21:

$$B_{ir}(r,\theta,\varphi) = \sum_{n=1}^{\infty}(n+1)\left(\frac{a}{r}\right)^{n+2}\sum_{m=0}^{n}\left(g_n^m\cos m\varphi + h_n^m\sin m\varphi\right)P_n^m(\cos\theta), \tag{119}$$

$$B_{er}(r,\theta,\varphi) = \sum_{n=1}^{\infty}(-n)\left(\frac{r}{a}\right)^{n-1}\sum_{m=0}^{n}\left(q_n^m\cos m\varphi + s_n^m\sin m\varphi\right)P_n^m(\cos\theta). \tag{120}$$

This can then be used together with Eqs. 32 and 33 to show that

$$\frac{2n+1}{4\pi}\int_{\theta=0}^{\pi}\int_{\varphi=0}^{2\pi}B_r(r,\theta,\varphi)\,P_n^m(\cos\theta)\cos m\varphi\,\sin\theta\,d\theta\,d\varphi = (n+1)\left(\frac{a}{r}\right)^{n+2}g_n^m - n\left(\frac{r}{a}\right)^{n-1}q_n^m, \tag{121}$$

$$\frac{2n+1}{4\pi}\int_{\theta=0}^{\pi}\int_{\varphi=0}^{2\pi}B_r(r,\theta,\varphi)\,P_n^m(\cos\theta)\sin m\varphi\,\sin\theta\,d\theta\,d\varphi = (n+1)\left(\frac{a}{r}\right)^{n+2}h_n^m - n\left(\frac{r}{a}\right)^{n-1}s_n^m. \tag{122}$$
Then, assuming that the normal component of $\mathbf{B}$ is known on both $\Sigma_i\ (r = a)$ and $\Sigma_e\ (r = c)$ amounts to assuming that $B_r(a,\theta,\varphi)$ and $B_r(c,\theta,\varphi)$ are both completely known; making use of Eqs. 121 and 122 once for $r = a$ and again for $r = c$ leads to a set of linear equations from which all Gauss coefficients $g_n^m$, $h_n^m$, $q_n^m$, $s_n^m$ can be inferred; and recalling Eqs. 15–17 shows that the field $\mathbf{B}$ is then indeed completely defined within the spherical shell $a < r < c$. More generally, it is important to note that the field $\mathbf{B}$ can be characterized just as well as soon as $B_r(r,\theta,\varphi)$ is completely known for two different values $r_1$ and $r_2$ of $r$, provided $a \leq r_1 < r_2 \leq c$ (i.e., provided the two spherical surfaces defined by $r = r_1$ and $r = r_2$ lie within the source-free spherical shell). Similar conclusions can be reached if the potential $V(r,\theta,\varphi)$ in place of the radial component $B_r(r,\theta,\varphi)$ is assumed to be known on both $\Sigma_i$ and $\Sigma_e$. This very general result, known in potential theory as the uniqueness theorem for Dirichlet boundary conditions (e.g., Kellogg 1954; Blakely 1995), can also be made more explicit in the case when the shell is spherical, and $\Sigma_i$ and $\Sigma_e$ are defined by $r = a$ and $r = c$. In that case, it is known that within the shell, $V(r,\theta,\varphi) = V_i(r,\theta,\varphi) + V_e(r,\theta,\varphi)$, where $V_i(r,\theta,\varphi)$ and $V_e(r,\theta,\varphi)$ are given by Eqs. 13 and 14, which can be used together with Eqs. 32 and 33 to show that

$$\frac{2n+1}{4\pi a}\int_{\theta=0}^{\pi}\int_{\varphi=0}^{2\pi}V(r,\theta,\varphi)\,P_n^m(\cos\theta)\cos m\varphi\,\sin\theta\,d\theta\,d\varphi = \left(\frac{a}{r}\right)^{n+1}g_n^m + \left(\frac{r}{a}\right)^{n}q_n^m, \tag{123}$$

$$\frac{2n+1}{4\pi a}\int_{\theta=0}^{\pi}\int_{\varphi=0}^{2\pi}V(r,\theta,\varphi)\,P_n^m(\cos\theta)\sin m\varphi\,\sin\theta\,d\theta\,d\varphi = \left(\frac{a}{r}\right)^{n+1}h_n^m + \left(\frac{r}{a}\right)^{n}s_n^m. \tag{124}$$
Then, assuming that $V(r,\theta,\varphi)$ is known on both $\Sigma_i\ (r = a)$ and $\Sigma_e\ (r = c)$ makes it possible to use Eqs. 123 and 124, once for $r = a$, and again for $r = c$, to form another set of linear equations from which all Gauss coefficients $g_n^m$, $h_n^m$, $q_n^m$, $s_n^m$ can again be inferred. The same reasoning as in the previous case then follows, leading to the same conclusions (and generalization to the case when $V(r,\theta,\varphi)$ is known for $r = r_1$ and $r_2$, provided $a \leq r_1 < r_2 \leq c$). In practice however, only components of the magnetic field $\mathbf{B}$ and not its potential $V$ are directly accessible to observations. The previous result can nevertheless be used to show yet another, directly useful, uniqueness property now applying to the $B_\theta(r,\theta,\varphi)$ component of the field, when this component is assumed to be known on the two spherical surfaces defined by $r = r_1$ and $r_2$, where $a \leq r_1 < r_2 \leq c$. This component is such that

$$B_\theta(r,\theta,\varphi) = -\frac{1}{r}\frac{\partial V(r,\theta,\varphi)}{\partial\theta}. \tag{125}$$

Integrating along a meridian (with fixed values of $r$ and $\varphi$) starting from $\theta = 0$ thus leads to

$$V(r,\theta,\varphi) = C(r) - r\int_{0}^{\theta}B_\theta(r,\theta',\varphi)\,d\theta', \tag{126}$$

which shows that if $B_\theta(r,\theta,\varphi)$ is known, so is $V(r,\theta,\varphi)$, to within a function $C(r)$. But it is known from Eqs. 13, 14, and 37 that the average value of $V(r,\theta,\varphi)$ over the sphere $S(r)$ of radius $r$ is such that $\langle V(r,\theta,\varphi)\rangle_{S(r)} = 0$ (recall that this is true only because magnetic fields do not have monopole sources). It thus follows that

$$C(r) = r\left\langle\int_{0}^{\theta}B_\theta(r,\theta',\varphi)\,d\theta'\right\rangle_{S(r)}, \tag{127}$$

and that as soon as $B_\theta(r,\theta,\varphi)$ is known for a given value of $r$, so is $V(r,\theta,\varphi)$. $V(r,\theta,\varphi)$ can then again be used to compute the set of linear equations (Eqs. 123 and 124) for two different values $r_1$ and $r_2$ of $r$ such that $a \leq r_1 < r_2 \leq c$, from which all Gauss coefficients $g_n^m$, $h_n^m$, $q_n^m$, $s_n^m$ can finally again be inferred. It is important to note that by contrast, no similar conclusion applies when $B_\varphi(r,\theta,\varphi)$, rather than $B_\theta(r,\theta,\varphi)$, is considered. This is because, as is clear from, e.g., Eqs. 16–21, $B_\varphi(r,\theta,\varphi)$ is totally insensitive to zonal fields (described by spherical harmonic terms of order $m = 0$). Of course, and as must now be obvious to the reader, other useful uniqueness properties can also be derived by combining the knowledge of $B_r(r,\theta,\varphi)$ for a given value $r_1$ of $r$ and of $B_\theta(r,\theta,\varphi)$ for another value $r_2$ of $r$. Of particular relevance to the historical situation (when observations are only available at the Earth's surface) is the case when both $B_r(r,\theta,\varphi)$ and $B_\theta(r,\theta,\varphi)$ are simultaneously known for a given value $R$ of $r$, where $a \leq R \leq c$. In that case, Eqs. 121 and 122 on the one hand, and Eqs. 123 and 124 on the other hand, hold for $r = R$. This again leads to a set of linear equations from which all the Gauss coefficients $g_n^m$, $h_n^m$, $q_n^m$, $s_n^m$ can be inferred, and the field once again fully determined.
Finally, it is important to note that independently of the method used to define the field in a unique way within the source-free shell a < r < c, all Gauss coefficients are then recovered, and it therefore becomes possible to identify Bi (Eq. 16) and Be (Eq. 17).
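The "set of linear equations" invoked throughout this section is, for each degree and order, a 2 by 2 system. The following sketch illustrates the case where the radial component is known on two spheres: given the two right-hand sides of Eq. 121 evaluated at $r_1$ and $r_2$, it solves for the internal coefficient $g_n^m$ and the external coefficient $q_n^m$. The function name, the reference radius $a = 6371.2$ km, and the numerical values are hypothetical choices made for illustration.

```python
import numpy as np

def separate_internal_external(n, b1, b2, r1, r2, a=6371.2):
    """Solve Eq. 121 written at r1 and r2 for (g_n^m, q_n^m).

    b1, b2: values of the left-hand side of Eq. 121 (the projection of B_r onto
    one spherical harmonic) on the spheres r = r1 and r = r2.
    """
    matrix = np.array([
        [(n + 1) * (a / r1) ** (n + 2), -n * (r1 / a) ** (n - 1)],
        [(n + 1) * (a / r2) ** (n + 2), -n * (r2 / a) ** (n - 1)],
    ])
    g, q = np.linalg.solve(matrix, np.array([b1, b2]))
    return float(g), float(q)

# Forward-compute synthetic "observations" from hypothetical coefficients, then invert.
n, g_true, q_true = 2, -1500.0, 20.0
a, r1, r2 = 6371.2, 6371.2, 6800.0
b1 = (n + 1) * (a / r1) ** (n + 2) * g_true - n * (r1 / a) ** (n - 1) * q_true
b2 = (n + 1) * (a / r2) ** (n + 2) * g_true - n * (r2 / a) ** (n - 1) * q_true
g_est, q_est = separate_internal_external(n, b1, b2, r1, r2)
print(g_est, q_est)
```

The system is solvable because the two radial factors vary differently with $r$ (one decays, the other grows), so the matrix is nonsingular whenever $r_1 \neq r_2$. The same structure applies when Eq. 121 at $r = R$ is paired with Eq. 123 at the same radius, as in the historical single-surface case.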
6.2
Uniqueness Issues Raised by Directional-Only Observations
Now focus a little more on the possibility of completely defining a magnetic field when information is only available on a single surface within the source-free shell. It has been shown that if the field has both internal and external sources, knowing both $B_r(r,\theta,\varphi)$ and $B_\theta(r,\theta,\varphi)$ over a sphere defined by $r = R$ within the source-free shell is enough to achieve uniqueness and identify the fields of internal and external origin. Implicit is also the conclusion that just knowing one component of the field (be it $B_r$, $B_\theta$, or $B_\varphi$) would not be enough. At least two components are needed and in fact not just any two components, since, as previously seen, $B_\varphi$ does not provide as much information as $B_r$ or $B_\theta$. Of course, knowing the entire field $\mathbf{B}$ is even better. The field is then overdetermined (at least its nonzonal component), which is very useful since, in practice, the field is known just at a finite number of sites and not everywhere at the Earth's surface (cf. chapter Satellite-to-Satellite Tracking (Low-Low/High-Low SST)). This in fact is the way Gauss first proved that the Earth's magnetic field is mainly of internal origin (Gauss 1839, for a detailed account of how Gauss did proceed in practice, see, e.g., Langel 1987). But before Gauss first introduced a way to measure the magnitude of a magnetic field (which he did in 1832; see, e.g., Malin 1987), only inclination and declination observations were made. Such observations have a serious drawback. They cannot tell the difference between the real magnetic field and the same magnetic field multiplied by some arbitrary positive constant $\lambda$. But if such directional-only observations are available everywhere at the Earth's surface, could it be that they nevertheless provide enough information for the Earth's magnetic field to be completely characterized, to within the global positive factor $\lambda$?
Until recently, most authors felt that this was indeed the case, at least when a priori assuming the field to be of internal origin (e.g., Kono 1976). However, directional-only observations are not linearly related to the Gauss coefficients, and the answer turns out to be more subtle, as first recognized by Proctor and Gubbins (1990). Relying on complex variables, mathematical tools very different from those used in this chapter, Proctor and Gubbins (1990) investigated axisymmetric fields (i.e., zonal fields, with only $m = 0$ Gauss coefficients) of internal origin and succeeded in exhibiting a family of different fields all sharing exactly the same direction everywhere on a spherical surface $r = R$ enclosing all sources. Clearly, even a perfect knowledge of the direction of the field on the sphere $r = R$ would not be enough to fully characterize a field belonging to such a family, even to within a global positive factor.
The fields exhibited by Proctor and Gubbins (1990) are however special in several respects: they are axisymmetric, antisymmetric with respect to the equator, and of octupole type (displaying four loci of magnetic poles on the sphere $r = R$, one magnetic pole at each geographic pole, and two midlatitude axisymmetric lines of magnetic poles, a magnetic pole being defined as a point where the field is perpendicular to the surface). The nonuniqueness property they reveal may therefore very well not transpose to Earth-like situations involving magnetic fields strongly dominated by their dipole component. This led Hulot et al. (1997) to reconsider the problem in a more general context. These authors again assumed the field to entirely be of internal origin, but only requested it to be regular enough (i.e., physical). Using a potential theory type of approach (which is again not given in any detail here), they were able to show that if the direction of the field is known everywhere on a smooth enough surface $\Sigma_i$ enclosing all sources, and if those directions do not reveal any more than $N$ loci of magnetic poles on $\Sigma_i$, then all fields satisfying these boundary conditions belong to an open cone (in such an open cone, any nonzero positive linear combination of solutions is a solution) of dimension $N - 1$. Applying this result to the fields exhibited by Proctor and Gubbins (1990), which display $N = 4$ loci of poles on $\Sigma_i$ (which is then the sphere $r = R$), shows that those fields would belong to open cones of dimension 3 (in fact, 2, when the equatorial antisymmetry is taken into account, see Hulot et al. 1997), leaving enough space for fields from this family to share the same directions on $\Sigma_i$ and yet not be proportional to each other. But applying this result to the historical magnetic field, which only displays $N = 2$ poles on $\Sigma_i$ (which is then the Earth's surface), leads to a different conclusion.
The Earth’s field belongs to an open cone of dimension 1, which shows that it can indeed be recovered from directional-only data on †i (the Earth’s surface) to within the already discussed global positive factor . Note however that this result only holds when all contributions from external sources are ignored. Similar results can be derived in the case, less relevant to the Earth, when the field is assumed to have all its sources outside the surface on which its direction is assumed to be known (Hulot et al. 1997). But to the authors knowledge, no results applying to the most general situation when both internal and external sources are simultaneously considered have yet been derived. Only a relatively weak statement can easily be made in the much more trivial case when the direction of the field is assumed to be known in a subshell of the source-free shell separating external and internal sources (and not just on a surface, Bloxham 1985; Proctor and Gubbins 1990; Lowes et al. 1995). In that case indeed, if two fields B.r/ and B0 .r/ share the same direction within the subshell, then a scalar function .r/ exists such that B0 .r/ D .r/B.r/ within it. But within that subshell, B.r/ and B0 .r/ must satisfy r B D 0; r B D 0; r B0 D 0, and r B0 D 0. This implies r B D 0 and r B D 0, i.e., that .r/ is a constant within the subshell. Since both fields B.r/ and B0 .r/ can be written in a unique way in the form Eqs. 15 to 21 within that source-free subshell, this means that all Gauss coefficients describing B0 .r/ are then proportional (by a factor ) to those describing B.r/.
Hence $\mathbf{B}'(\mathbf{r}) = \lambda\mathbf{B}(\mathbf{r})$ not only within the subshell within which observations are available but also beyond this subshell, provided one remains within the source-free shell.
6.3 Uniqueness Issues Raised by Intensity-Only Observations
Measuring the full magnetic field $\mathbf{B}$ requires a lot of care. Nowadays, by far the most demanding step turns out to be the orientation of the measured field with respect to the geocentric reference frame. This is especially true in the context of satellite measurements (cf. chapter Satellite-to-Satellite Tracking (Low-Low/High-Low SST)). By contrast, measuring the intensity $F = |\mathbf{B}|$ of the field is comparatively easier, and when satellites first started making magnetic measurements from space, they only measured the intensity of the field. But to what extent can a magnetic field be completely determined (to within a global sign, of course) when only its intensity is measured? First consider the now familiar case when a source-free shell can be defined, and information (i.e., intensity $F$) is only available on a single surface within that shell. Though no explicit results have yet been published (at least to the authors' knowledge) in the general case when the field has both external and internal sources, it can be easily anticipated that this will be a very unfavorable situation: it is already known that the knowledge of only one of the $B_r$, $B_\theta$, or $B_\varphi$ components everywhere on this surface is not enough and that at least $B_r$ and $B_\theta$ need to be simultaneously known. In fact, even in the case when the field is further assumed to only have internal sources (i.e., enclosed within the surface where intensity is assumed known), no general conclusion can be drawn. Some important specific results are however available. In particular, Backus (1968) showed that if the field is of internal origin and further assumed to be a finite sum of vector spherical harmonics (i.e., if all Gauss coefficients $g_n^m$ and $h_n^m$ are a priori known to be zero for $n > N$, where $N$ is a finite integer, and the sum in Eq. 16 is therefore finite), then the field can indeed be completely determined (to within a global sign) by the knowledge of $F$ everywhere on a spherical surface $r = R$ enclosing all sources.
But this result will generally not hold if the field is not a finite sum of vector spherical harmonics. This was first shown by Backus (1970), who exhibited what are now known as the Backus series (see also the comment by Walker 1992). These are fields $\mathbf{B}_M(r,\theta,\varphi)$ of internal origin and order $M$ (i.e., defined by Gauss coefficients with $g_n^m = h_n^m = 0$ if $m \neq M$), where $M$ can be any positive integer. The Gauss (internal) coefficients of $\mathbf{B}_M(r,\theta,\varphi)$ are defined by a recursion relation, which need not be given explicitly here. Suffice it to say that this relation is chosen so as to ensure that each field $\mathbf{B}_M(r,\theta,\varphi)$ (1) has Gauss coefficients that converge fast enough with increasing degree $n$ for the convergence of the infinite sum (Eq. 16) defining $\mathbf{B}_M(r,\theta,\varphi)$ to be ensured for all values of $r \geq R$, and (2) satisfies $\mathbf{B}_M(r,\theta,\varphi) \cdot \mathbf{B}_D(r,\theta,\varphi) = 0$ everywhere on the surface $r = R$, where $\mathbf{B}_D(r,\theta,\varphi)$ is the axial dipole of internal origin defined by the single Gauss coefficient $g_1^0 = 1$
in Eq. 16. These Backus series can then be used to define an infinite number of pairs of magnetic fields of internal origin $\mathbf{B}_{M+} = \alpha\mathbf{B}_D + \beta\mathbf{B}_M$ and $\mathbf{B}_{M-} = \alpha\mathbf{B}_D - \beta\mathbf{B}_M$, where $(\alpha, \beta)$ can be any pair of real (nonzero) values. These pairs will automatically satisfy (as one can easily check) $|\mathbf{B}_{M-}(r,\theta,\varphi)| = |\mathbf{B}_{M+}(r,\theta,\varphi)|$ on the sphere $r = R$, while obviously $\mathbf{B}_{M-} \neq \mathbf{B}_{M+}$. A perfect knowledge of the intensity of the field on the sphere $r = R$ would therefore not be enough to characterize a field belonging to any such pair, even to within a global sign. An interesting generalization of this result to the case when $\mathbf{B}_D$ does not need to be an axial dipole field has more recently been published by Alberto et al. (2004). If $\mathbf{B}_N$ is any arbitrary field of internal origin defined by a finite number of Gauss coefficients with maximum degree $N$, a field $\mathbf{B}'_N$ can again always be found such that $\mathbf{B}'_N(r,\theta,\varphi) \cdot \mathbf{B}_N(r,\theta,\varphi) = 0$ everywhere on the surface $r = R$. Many additional pairs of magnetic fields $\mathbf{B}_{N+} = \alpha\mathbf{B}_N + \beta\mathbf{B}'_N$ and $\mathbf{B}_{N-} = \alpha\mathbf{B}_N - \beta\mathbf{B}'_N$, again sharing the same intensity on the sphere $r = R$, can thus be found. Those results are interesting. However, they do not provide a general answer to the question of the uniqueness of arbitrary fields of internal origin only constrained by intensity data on a surface enclosing all sources (when the field is not a finite sum of vector spherical harmonics). But does this really matter in practice?
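The equal-intensity property of such pairs follows from pointwise orthogonality alone: if two vectors are orthogonal, then $|\alpha\mathbf{u} + \beta\mathbf{v}|^2 = |\alpha\mathbf{u} - \beta\mathbf{v}|^2 = \alpha^2|\mathbf{u}|^2 + \beta^2|\mathbf{v}|^2$. The following minimal Python sketch checks this numerically at randomly drawn "surface points". It does not construct an actual Backus series: the vectors standing in for $\mathbf{B}_D$ and $\mathbf{B}_M$ are random, and the helper names (`orthogonal_companion`, `pair_intensities`) and the values of $\alpha$ and $\beta$ are purely illustrative assumptions.

```python
import math
import random


def dot(u, v):
    """Euclidean dot product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(u):
    """Euclidean norm of a 3-vector."""
    return math.sqrt(dot(u, u))


def orthogonal_companion(b_d, rng):
    """Random vector made orthogonal to b_d by removing its component along b_d.

    Hypothetical stand-in for the value of B_M at one point of the sphere r = R.
    """
    w = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    scale = dot(w, b_d) / dot(b_d, b_d)
    return [wi - scale * bi for wi, bi in zip(w, b_d)]


def pair_intensities(b_d, b_m, alpha, beta):
    """Intensities of alpha*b_d + beta*b_m and alpha*b_d - beta*b_m."""
    plus = [alpha * d + beta * m for d, m in zip(b_d, b_m)]
    minus = [alpha * d - beta * m for d, m in zip(b_d, b_m)]
    return norm(plus), norm(minus)


rng = random.Random(0)
max_gap = 0.0        # largest intensity difference found between the two fields
max_separation = 0.0  # largest vector difference found between the two fields
for _ in range(1000):
    b_d = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    if dot(b_d, b_d) < 1e-6:
        continue  # skip (very unlikely) near-zero draws to avoid dividing by ~0
    b_m = orthogonal_companion(b_d, rng)
    f_plus, f_minus = pair_intensities(b_d, b_m, alpha=1.3, beta=0.7)
    max_gap = max(max_gap, abs(f_plus - f_minus))
    max_separation = max(max_separation, 2.0 * 0.7 * norm(b_m))

print("largest intensity mismatch:", max_gap)
print("largest vector difference:", max_separation)
```

The intensity mismatch stays at rounding level while the two vector fields differ by a finite amount, which is the pointwise content of the nonuniqueness discussed above.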
It may indeed be argued that the best model of the Earth's magnetic field of internal origin that will ever be recovered will anyway be in the form of a finite number of Gauss coefficients (those compatible with the resolution matching the spatial distribution of the limited number of observations) and that because of the Backus (1968) uniqueness result previously discussed, this model would necessarily be determined in a unique way (to within a sign) by the knowledge of the intensity of the field at the Earth's surface (assuming that some strategy of measurement has been used so as to minimize any contribution of the field of external origin). In practice, this indeed is the case. However, this practical uniqueness turns out to be very relative and misleading, as the study of Stern et al. (1980) illustrates. In this study, the authors use a data set of full vector magnetic field observations collected by the 1980 Magsat satellite (and carefully selected to minimize any local or external sources, which they then consider as a source of noise) once to produce a field model A best explaining the observed intensity and once again to produce a model B best explaining the observed vector field. They found that the two models differ very significantly. Both models predict similar intensity at the Earth's surface (to within measurement error, typically a few nT), but they strongly disagree when predicting the full vector field, with model A leading to errors (up to 2,000 nT!) far larger than tolerated by measurement errors on the measured vector field, contrary to model B. This disagreement can be traced back to the fact that the difference $\mathbf{B}_B - \mathbf{B}_A$ between the predictions $\mathbf{B}_A$ and $\mathbf{B}_B$ of the two models tends to satisfy $(\mathbf{B}_B - \mathbf{B}_A) \cdot (\mathbf{B}_B + \mathbf{B}_A) = 0$.
One way of interpreting this result is to note that any practical optimizing procedure used to look for a model A best fitting the intensity (and just the intensity) will be much less sensitive to an error $\delta B$ perpendicular to the observed field $\mathbf{B}$ (which then produces a second-order error $(\delta B)^2/B$ in the intensity) than to a comparable error along the observed field (which then produces a first-order error
$\delta B$ in the intensity). By contrast, the optimizing procedure used to look for a model B best fitting the full vector $\mathbf{B}$ will make sure this error is kept small, whatever its direction. As a result, the difference $\mathbf{B}_B - \mathbf{B}_A$ will be largest in the direction perpendicular to the observed field, the value of which, to lowest order, is close to $(\mathbf{B}_B + \mathbf{B}_A)/2$. The two models are thus bound to lead to predictions such that $(\mathbf{B}_B - \mathbf{B}_A) \cdot (\mathbf{B}_B + \mathbf{B}_A) = 0$. Since model A makes erroneous predictions in the direction perpendicular to the observed field, this effect is often referred to as the perpendicular error effect (Lowes 1975; Langel 1987). Some authors however also refer to this effect as the Backus effect (e.g., Stern et al. 1980; Langel 1987), quite correctly, since this error also seems to be closely related to the type of nonuniqueness exhibited by the fields constructed with the help of the Backus series. To see this, first recall that model B is constrained by full vector field observations. Were those perfect and available everywhere at the Earth's surface, model B would be perfectly and uniquely determined. Improving the data set would thus lead model B to eventually match the true Earth model. Model A differs from model B in a macroscopic way, and it is very likely that improving the intensity data set would not lead model A to converge toward the true Earth model. This suggests that at least one model other than the true Earth model could be found in the limit where perfect intensity data are available everywhere at the Earth's surface and that when only a finite imperfect intensity data set is available, a model recovered by optimizing the fit just to this intensity data set, such as model A, is a truncated and approximate version of this alternative model. Indeed, the magnitude and geographical distribution of the difference $\mathbf{B}_B - \mathbf{B}_A$ is very comparable to that of the Backus series (e.g., Stern and Bredekamp 1975). This undesired nonuniqueness property is problematic.
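The asymmetry between parallel and perpendicular errors can be made concrete with a few lines of arithmetic. The Python sketch below adds an error of fixed size to a field of given intensity, once along the field and once perpendicular to it, and compares the resulting change in intensity. The background intensity of 50,000 nT is an illustrative assumption (it is not taken from the Stern et al. study); the error size of 2,000 nT echoes the magnitude quoted above. The leading term of the perpendicular case is $(\delta B)^2/(2B)$, i.e., of order $(\delta B)^2/B$.

```python
import math


def intensity_error(b_obs, delta, angle_deg):
    """Change in |B| when an error of size delta is added at angle_deg to the field.

    The observed field is taken along x; the error lies in the x-y plane.
    """
    angle = math.radians(angle_deg)
    bx = b_obs + delta * math.cos(angle)
    by = delta * math.sin(angle)
    return math.hypot(bx, by) - b_obs


B_OBS = 50000.0  # nT, illustrative background intensity (assumption)
DELTA = 2000.0   # nT, size of the model error considered

parallel = intensity_error(B_OBS, DELTA, 0.0)        # first-order effect
perpendicular = intensity_error(B_OBS, DELTA, 90.0)  # second-order effect
second_order = DELTA ** 2 / (2.0 * B_OBS)            # leading term of the expansion

print("intensity change, error along B:        ", parallel)
print("intensity change, error perpendicular:  ", perpendicular)
print("leading-order estimate (deltaB^2 / 2B): ", second_order)
```

With these numbers, a perpendicular error changes the intensity by roughly 40 nT, about fifty times less than the same error along the field, so a fit to intensity alone constrains the perpendicular direction far more weakly.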
To alleviate the resulting Backus/perpendicular-error effect, several practical solutions have therefore been proposed, such as adding a minimum of vector field data to the intensity-only data set (as first investigated by Barraclough and Nevitt 1976), or taking into account even poor determinations of the field direction, which indeed brings considerable improvement (Holme and Bloxham 1995). From a more formal point of view, the next conceptual improvement was however brought by Khokhlov et al. (1997). Again relying on a potential theory type of approach, these authors showed that a simple and unambiguous way of characterizing a regular enough field of internal origin only defined by the knowledge of its intensity $F$ on a smooth enough surface $\Sigma_i$ enclosing all its sources is to locate its (possibly several) magnetic dip equators on $\Sigma_i$ (defined as curves across which the component of $\mathbf{B}$ normal to $\Sigma_i$ changes sign). In other words, if a field of internal origin is defined by both its intensity everywhere on $\Sigma_i$ and the location of its dip equator(s) on $\Sigma_i$, it then is completely and uniquely determined (to within a global sign). Note that, indeed, any two fields of pairs built with the help of the Backus series (or their generalization by Alberto et al. 2004) do not share the same equators. The practical usefulness of this theoretical result was subsequently investigated by Khokhlov et al. (1999) and demonstrated in real situations by Ultré-Guérard et al. (1998) and, more recently, by Holme et al. (2005). However, the most obvious practical conclusion that should be drawn from these various results is that intensity-only strategies of measuring the
Earth's magnetic field on a planetary scale are dangerous. Indeed all advanced near-Earth magnetic field missions are now designed to make sure the full field $\mathbf{B}$ is measured (cf. Hulot et al. 2007). Besides, and as shown later, this turns out to be mandatory also because near-Earth satellites do not orbit in a source-free shell. Finally, and for completeness, the situation when the intensity of the field is no longer known on a surface, but within a volume within the source-free shell, should be considered (in which case, again a field with both internal and external sources may be considered). Then, as shown by Backus (1974), provided the source-free shell can be considered as connected (any two points in the shell can be joined by a smooth curve within the shell, which is the case of the spherical shell $a < r < c$), knowing the intensity of the field in an open region contained entirely within the source-free shell is enough to determine the field entirely up to its global sign.
6.4 Uniqueness of Magnetic Fields in a Shell Enclosing a Spherical Sheet Current
So far, the possibility of recovering a complete description of a magnetic field has only been addressed when some information is available on some surface or in some volume within a source-free shell, the sources of the field lying above and/or below the shell. This is typically the situation when one attempts to describe the Earth's magnetic field within the neutral atmosphere, using ground-based, shipborne, and aeromagnetic observations. It was also noted that once the field is fully determined within such a source-free shell, then both the field of internal origin $\mathbf{B}_i$ and the field of external origin $\mathbf{B}_e$ can be identified. Now, what if one considers a shell within which currents can be found? This is a situation that must be considered when attempting to also analyze satellite data, since satellites fly above the E-region where the ionospheric dynamo resides and within the F-region where additional currents can be found (cf. chapter Satellite-to-Satellite Tracking (Low-Low/High-Low SST)). Here, currents in the F-region are first ignored, and a spherical shell $a < r < c$ is assumed which only contains a spherical sheet current (which would typically describe the currents produced by the ionospheric dynamo in the E-region) at radius $r = b$. Then the sources of the field are assumed to lie below $r = a$ (sources referred to as $\mathbf{J}(r < a)$ sources), on $r = b$ ($\mathbf{J}_s(r = b)$ sources), and above $r = c$ ($\mathbf{J}(r > c)$ sources). What kind of information does one then need to make sure the field produced by those sources can be completely described everywhere within that shell? To address this question, first note that since $\mathbf{J}(r < a)$ sources lie below $r = a$, the potential field they produce can be described by a set of Gauss coefficients $g_n^m, h_n^m$ which may be used in Eq. 16 to predict this field for any value $r > a$.
In the same way, since $\mathbf{J}(r > c)$ sources lie above $r = c$, the potential field they produce can be described by a set of Gauss coefficients $q_n^m, s_n^m$ which may be used in Eq. 17 to predict this field for any value $r < c$. Finally, since the $\mathbf{J}_s(r = b)$
sources correspond to a spherical sheet current, $a_n^m, b_n^m$ Gauss coefficients can also be introduced and used to predict the potential field produced by those sources thanks to Eq. 38 for $r > b$ and Eq. 40 for $r < b$.

Now, consider the lower subshell $a < r < b$. This shell is a source-free shell of the type considered so far. All the previous results relevant to the case when both internal and external sources are to be found therefore apply. Take the most relevant case when information about the field is available only on a surface (say, the Earth's surface) within the subshell $a < r < b$. Then one must be able to know at least two components of the field everywhere on that surface, typically $B_r(r,\theta,\varphi)$ and $B_\theta(r,\theta,\varphi)$, to fully characterize the field within that subshell. Once one knows these, one can recover all Gauss coefficients describing the field within that subshell. In particular, one will recover the $g_n^m, h_n^m$ Gauss coefficients of the field produced by the $\mathbf{J}(r < a)$ sources, which is seen as the field of internal origin for that subshell. One can also recover the Gauss coefficients of the field of external origin for that subshell. But this field is the one produced by both $\mathbf{J}_s(r = b)$ sources and $\mathbf{J}(r > c)$ sources. One will therefore recover the sum of their Gauss coefficients, i.e. (recall Eq. 40), $q_n^m - \frac{n+1}{n}\left(\frac{a}{b}\right)^{2n+1} a_n^m$ and $s_n^m - \frac{n+1}{n}\left(\frac{a}{b}\right)^{2n+1} b_n^m$.

Next, consider the upper subshell $b < r < c$. Obviously, the same reasoning as above can be made, and good use can again be made of, for instance, the knowledge of two components of the field on a surface within that subshell (say the surface covered by an orbiting satellite, assuming local sources can be neglected). But it must be acknowledged that the $\mathbf{J}_s(r = b)$ sources are now seen as sources of internal origin. This then leads to the conclusion that the following sums of Gauss coefficients will also be recovered, $g_n^m + a_n^m$ and $h_n^m + b_n^m$ (recall Eq. 40), together with the Gauss coefficients $q_n^m$ and $s_n^m$. The reader will then notice that for each degree $n$ and order $m$, four quantities ($g_n^m$, $q_n^m - \frac{n+1}{n}\left(\frac{a}{b}\right)^{2n+1} a_n^m$, $g_n^m + a_n^m$, and $q_n^m$) will have been recovered to constrain the three Gauss coefficients $g_n^m$, $q_n^m$, and $a_n^m$, while four additional quantities ($h_n^m$, $s_n^m - \frac{n+1}{n}\left(\frac{a}{b}\right)^{2n+1} b_n^m$, $h_n^m + b_n^m$, and $s_n^m$) will have been recovered to constrain the three Gauss coefficients $h_n^m$, $s_n^m$, and $b_n^m$. In each case, one has one constraint too many. One constraint could have therefore been dropped. Indeed, and as the reader can easily check, in such a situation when only a spherical sheet current is to be found in an otherwise source-free spherical shell $a < r < c$, the knowledge of two components (typically $B_r(r,\theta,\varphi)$ and $B_\theta(r,\theta,\varphi)$) on a surface within one of the subshells, and of only one component (typically $B_r(r,\theta,\varphi)$ or $B_\theta(r,\theta,\varphi)$) on a surface in the other subshell, is enough to fully characterize the field everywhere within the shell $a < r < c$. Beware however that all the limitations already identified in Sects. 6.1 and 6.3 when attempting to make use of either the $B_\varphi$ component or the intensity also hold in the present case.

It is important to stress that the above results hold only because it was assumed that all sources to be found within the shell $a < r < c$ lie on a spherical infinitely thin sheet. What if the sheet is itself a subshell of some thickness (defined by, say, $b - e < r < b + e$)? In that case, one may still use Eq. 38 to describe the field the sources within that shell would produce for $r > b + e$.
However, the corresponding Gauss coefficients $a_n^m$, $b_n^m$ may then no longer be used in Eq. 40 to predict the field the same sources would produce for $r < b - e$, because the continuity equation (Eq. 39) no longer holds. As a result, if the spherical sheet current at $r = b$ does have some non-negligible thickness, one must back away from the previous conclusions. In that case, one really does need to have two independent sources of information (such as two components of the field on a surface) within each subshell, to fully characterize the field within each subshell. Also, the reader should note that the field is then still not fully defined within the current-carrying shell $b - e < r < b + e$. This finally brings one to the issue of uniquely defining a magnetic field within a current-carrying shell.
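For the infinitely thin sheet, the counting argument of this section (two components below the sheet plus one component above it determine $g_n^m$, $q_n^m$, and $a_n^m$) can be illustrated for a single degree and order. The Python sketch below is only a sketch under stated assumptions: the sheet is taken to contribute $-\frac{n+1}{n}(a/b)^{2n+1} a_n^m$ to the external coefficient seen below it (as read from Eq. 40), and the radial component on $r = R$ is taken to carry the coefficient $(n+1)(a/R)^{n+2}(g_n^m + a_n^m) - n(R/a)^{n-1} q_n^m$, which assumes $\mathbf{B} = -\nabla V$ with the usual internal and external potentials. The function names, radii, and coefficient values are hypothetical.

```python
def sheet_factor(n, a, b):
    """Weight of a sheet coefficient in the external coefficient seen for r < b.

    Assumed form ((n + 1) / n) * (a / b)**(2n + 1), entering with a minus sign.
    """
    return (n + 1) / n * (a / b) ** (2 * n + 1)


def radial_factors(n, a, R):
    """B_r weights of internal and external coefficients on the sphere r = R."""
    c_int = (n + 1) * (a / R) ** (n + 2)
    c_ext = n * (R / a) ** (n - 1)
    return c_int, c_ext


def forward(n, a, b, R, g, q, alpha):
    """Quantities recoverable from data for one (n, m); alpha stands for a_n^m."""
    k = sheet_factor(n, a, b)
    c_int, c_ext = radial_factors(n, a, R)
    g_low = g                                # internal coefficient for a < r < b
    q_low = q - k * alpha                    # external coefficient for a < r < b
    br_up = c_int * (g + alpha) - c_ext * q  # B_r coefficient on r = R (b < R < c)
    return g_low, q_low, br_up


def invert(n, a, b, R, g_low, q_low, br_up):
    """Recover (g, q, alpha) from two lower-subshell quantities and one upper one."""
    k = sheet_factor(n, a, b)
    c_int, c_ext = radial_factors(n, a, R)
    # br_up = c_int * (g_low + alpha) - c_ext * (q_low + k * alpha), solved for alpha
    alpha = (br_up - c_int * g_low + c_ext * q_low) / (c_int - c_ext * k)
    q = q_low + k * alpha
    return g_low, q, alpha


# Illustrative radii in km (assumptions): reference radius, sheet, satellite sphere.
A_REF, B_SHEET, R_SAT = 6371.2, 6481.2, 6821.0
true_coeffs = (-1200.0, 15.0, 8.0)  # hypothetical g, q, alpha in nT
observed = forward(3, A_REF, B_SHEET, R_SAT, *true_coeffs)
recovered = invert(3, A_REF, B_SHEET, R_SAT, *observed)
print(recovered)
```

Under these assumptions the denominator equals $(n+1)(a/R)^{n+2}\left[1 - (R/b)^{2n+1}\right]$, which cannot vanish when $R > b$, so the three quantities always determine the three coefficients. For a sheet of finite thickness the relation used for `q_low` is no longer available, which is why the text then requires two components in each subshell.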
6.5 Uniqueness of Magnetic Fields in a Current-Carrying Shell
If the most general case of a current-carrying shell $a < r < c$ is now considered within which any type of currents can be found, the field may no longer be written in the form $\mathbf{B}(r,\theta,\varphi) = \mathbf{B}_i(r,\theta,\varphi) + \mathbf{B}_e(r,\theta,\varphi)$ (Eq. 15). But it may be written in the form (Eq. 105, recall Sect. 4)
$$\mathbf{B} = \mathbf{B}^{i}_{pol} + \mathbf{B}^{e}_{pol} + \mathbf{B}^{sh}_{pol} + \mathbf{B}^{sh}_{tor}, \qquad (128)$$
where $\mathbf{B}^{i}_{pol}$ describes the field produced by all internal sources (referred to as the $\mathbf{J}(r < a)$ sources just above, only the toroidal type of which contribute to $\mathbf{B}^{i}_{pol}$, as seen in Sect. 4), $\mathbf{B}^{e}_{pol}$ describes the field produced by all external sources (the $\mathbf{J}(r > c)$ sources, only the toroidal type of which contribute to $\mathbf{B}^{e}_{pol}$), and $\mathbf{B}^{sh}_{pol}$ and $\mathbf{B}^{sh}_{tor}$ are the nonpotential poloidal and toroidal fields due to in situ toroidal $\mathbf{J}^{sh}_{tor}$ and poloidal $\mathbf{J}^{sh}_{pol}$ currents in the shell. Because of those currents, the field $\mathbf{B}$ may generally no longer be defined in a unique way everywhere in such a shell. However, if appropriate additional assumptions with respect to the nature of those local currents are introduced, some uniqueness results can again be derived. To see this, it is useful to start from the decomposition Eq. 107 of the magnetic field $\mathbf{B}(R,\theta,\varphi)$ on the sphere of radius $r = R$ (recall Sect. 4):
$$\mathbf{B}(R,\theta,\varphi) = \mathbf{B}_{iR}(R,\theta,\varphi) + \mathbf{B}_{eR}(R,\theta,\varphi) + \mathbf{B}^{sh}_{tor}(R,\theta,\varphi), \qquad (129)$$
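where $\mathbf{B}^{sh}_{tor}(R,\theta,\varphi)$, $\mathbf{B}_{iR}(R,\theta,\varphi)$, and $\mathbf{B}_{eR}(R,\theta,\varphi)$ are given by Eqs. 108–110, respectively. Because of the orthogonality of the vector spherical harmonics $\mathbf{C}_n^{m,(c,s)}$, $\boldsymbol{\Pi}_{ni}^{m,(c,s)}$, and $\boldsymbol{\Pi}_{ne}^{m,(c,s)}$, one may then write (recall Eqs. 63, 66, and 67)
$$\langle \mathbf{B}(R,\theta,\varphi), \boldsymbol{\Pi}_{ni}^{m,c}(\theta,\varphi) \rangle = \frac{1}{4\pi} \int_{\theta=0}^{\pi} \int_{\varphi=0}^{2\pi} \mathbf{B}(R,\theta,\varphi) \cdot \boldsymbol{\Pi}_{ni}^{m,c}(\theta,\varphi) \, \sin\theta \, d\theta \, d\varphi = (n+1) \left(\frac{a}{R}\right)^{n+2} g_n^m(R), \qquad (130)$$
$$\langle \mathbf{B}(R,\theta,\varphi), \boldsymbol{\Pi}_{ni}^{m,s}(\theta,\varphi) \rangle = (n+1) \left(\frac{a}{R}\right)^{n+2} h_n^m(R), \qquad (131)$$
$$\langle \mathbf{B}(R,\theta,\varphi), \boldsymbol{\Pi}_{ne}^{m,c}(\theta,\varphi) \rangle = n \left(\frac{R}{a}\right)^{n-1} q_n^m(R), \qquad (132)$$
$$\langle \mathbf{B}(R,\theta,\varphi), \boldsymbol{\Pi}_{ne}^{m,s}(\theta,\varphi) \rangle = n \left(\frac{R}{a}\right)^{n-1} s_n^m(R), \qquad (133)$$
$$\langle \mathbf{B}(R,\theta,\varphi), \mathbf{C}_{n}^{m,c}(\theta,\varphi) \rangle = \frac{n(n+1)}{2n+1} \, T_n^{m,c}(R), \qquad (134)$$
$$\langle \mathbf{B}(R,\theta,\varphi), \mathbf{C}_{n}^{m,s}(\theta,\varphi) \rangle = \frac{n(n+1)}{2n+1} \, T_n^{m,s}(R), \qquad (135)$$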
which shows that the complete knowledge of $\mathbf{B}(R,\theta,\varphi)$ on the spherical surface $r = R$ already makes it possible to identify the Gauss coefficients $g_n^m(R), h_n^m(R)$ of the potential field $\mathbf{B}_{iR}(r,\theta,\varphi)$ produced above $r = R$ by all sources below $r = R$, the Gauss coefficients $q_n^m(R), s_n^m(R)$ of the potential field $\mathbf{B}_{eR}(r,\theta,\varphi)$ produced below $r = R$ by all sources above $r = R$, and the coefficients $T_n^{m,c}(R)$ and $T_n^{m,s}(R)$ defining the toroidal field $\mathbf{B}^{sh}_{tor}(R,\theta,\varphi)$ produced by the local (poloidal) sources at $r = R$. This uniqueness theorem, due to Backus (1986), is however not powerful enough in general to reconstruct the field $\mathbf{B}(r,\theta,\varphi)$ in a unique way everywhere within the shell $a < r < c$. But one may introduce additional assumptions that reasonably apply to the near-Earth environment (see Fig. 4 for a schematic sketch and chapter Satellite-to-Satellite Tracking (Low-Low/High-Low SST) in the handbook for more details). Assume that the shell $a < r < c$ can be divided into two subshells $a < r < b$ and $b < r < c$ separated by a spherical sheet current at $r = b$. Assume that the lower subshell describes the neutral atmosphere and is therefore source-free, while the spherical sheet current describes the ionospheric E-region (which is again considered as infinitely thin). Finally assume that the upper subshell describes the ionospheric current-carrying F-region within which near-Earth satellites orbit. Since those currents are known to be mainly the so-called field-aligned currents at polar latitudes (i.e., aligned with the dominant poloidal field, see, e.g., Olsen 1997), it is not unreasonable to further assume that those currents have no toroidal components (they are mainly in the radial direction). In other words, assume that no $\mathbf{J}^{sh}_{tor}$ sources but only $\mathbf{J}^{sh}_{pol}$ sources lie in the $b < r < c$ upper subshell. These assumptions, together with the previous uniqueness theorem, can then be combined in a very powerful way.
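The way the projections of Eqs. 130 and 132 separate internal from external contributions on a single sphere can be checked numerically in the simplest case, degree $n = 1$ and order $m = 0$. The Python sketch below is a hedged illustration: the explicit $(r, \theta)$ components used for the two vector harmonics, $(2\cos\theta, \sin\theta)$ for the internal one and $(-\cos\theta, \sin\theta)$ for the external one, are assumptions chosen to reproduce the normalization factors $(n+1)$ and $n$ of Eqs. 130 and 132; they are not copied from the chapter's definitions. The coefficient values and radii are likewise illustrative.

```python
import math


def pi_internal(theta):
    """(r, theta) components assumed for the n = 1, m = 0 internal vector harmonic."""
    return (2.0 * math.cos(theta), math.sin(theta))


def pi_external(theta):
    """(r, theta) components assumed for the n = 1, m = 0 external vector harmonic."""
    return (-math.cos(theta), math.sin(theta))


def make_field(g10, q10, a, R):
    """Axial internal dipole plus uniform external field, evaluated on r = R."""
    c_int = (a / R) ** 3 * g10  # (a/R)**(n+2) with n = 1
    c_ext = q10                 # (R/a)**(n-1) with n = 1 equals 1

    def field(theta):
        p_i = pi_internal(theta)
        p_e = pi_external(theta)
        return (c_int * p_i[0] + c_ext * p_e[0], c_int * p_i[1] + c_ext * p_e[1])

    return field


def inner_product(field, basis, n_theta=2000):
    """(1/4pi) times the surface integral of field . basis (axisymmetric case).

    Midpoint rule in colatitude; the longitude integral contributes 2*pi.
    """
    step = math.pi / n_theta
    total = 0.0
    for k in range(n_theta):
        theta = (k + 0.5) * step
        f = field(theta)
        p = basis(theta)
        total += (f[0] * p[0] + f[1] * p[1]) * math.sin(theta) * step
    return 0.5 * total


A_REF, R_SAT = 6371.2, 6821.0  # km, illustrative radii (assumptions)
G10, Q10 = -29400.0, 20.0      # nT, illustrative coefficients (assumptions)
b_on_sphere = make_field(G10, Q10, A_REF, R_SAT)

# Eq. 130 with n = 1: <B, Pi_i> = 2 (a/R)^3 g;  Eq. 132 with n = 1: <B, Pi_e> = q
g_recovered = inner_product(b_on_sphere, pi_internal) / (2.0 * (A_REF / R_SAT) ** 3)
q_recovered = inner_product(b_on_sphere, pi_external)
print(g_recovered, q_recovered)
```

Both coefficients are recovered from values on the single sphere $r = R$, up to quadrature error, even though the external term is more than a thousand times weaker than the internal one. This works because the two assumed harmonics are orthogonal under the surface inner product.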
Indeed, since $\mathbf{B}(R,\theta,\varphi)$ is already assumed to be known on a sphere $r = R$ in the upper subshell, the Gauss coefficients $g_n^m(R), h_n^m(R), q_n^m(R), s_n^m(R)$ can be inferred from Eqs. 130 to 133. The $q_n^m(R), s_n^m(R)$ Gauss coefficients then describe the field produced below $r = R$ by all sources above $r = R$. But since between $r = R$ and $r = c$ only $\mathbf{J}^{sh}_{pol}$ sources are to be found, which only produce local toroidal fields, the only sources contributing to the field described by the $q_n^m(R), s_n^m(R)$ coefficients are the $\mathbf{J}(r > c)$ sources. It thus follows that the $q_n^m(R), s_n^m(R)$ are the $q_n^m, s_n^m$ Gauss coefficients describing the potential field produced below $r = c$
[Figure 4: schematic cross-section of the near-Earth environment. Labels, from the outside in: external sources J(r > c); the boundary r = c; poloidal currents J_pol in the upper subshell, which contains the satellite sphere r = R; the sheet current J_s(r = b) at r = b; the source-free lower subshell ("No sources"); observatories at r = a; internal sources J(r < a).]
Fig. 4 Uniqueness of a magnetic field recovered from partial information within a current-carrying shell. In this special case relevant to geomagnetism, it is assumed that any source can lie below $r = a$ (internal $\mathbf{J}(r < a)$ sources) and above $r = c$ (external $\mathbf{J}(r > c)$ sources), no sources can lie within the lower subshell ($a < r < b$, the neutral atmosphere), a spherical sheet current can lie at $r = b$ (the E-region $\mathbf{J}_s(r = b)$ sources), and only poloidal sources can lie within the upper subshell ($b < r < c$, the F-region ionosphere). The knowledge of $\mathbf{B}$ on a sphere $r = R$ in the upper subshell (as provided by, e.g., a satellite) and of enough components of $\mathbf{B}$ on the sphere $r = a$ (as provided by, e.g., observatories at the Earth's surface) is then enough to recover the field produced by most sources in many places (see text for details)
by the $\mathbf{J}(r > c)$ sources. A similar use of the Gauss coefficients $g_n^m(R), h_n^m(R)$ recovered from Eqs. 130 and 131 can then also be made. Those describe the field produced above $r = R$ by all sources below $r = R$. But no sources between $r = b$ and $r = R$ contribute. Only $\mathbf{J}(r < a)$ and $\mathbf{J}_s(r = b)$ sources do. It then also follows that the $g_n^m(R), h_n^m(R)$ are the sums $(g_n^m + a_n^m, h_n^m + b_n^m)$ of the Gauss coefficients describing the field produced above $r = a$ by the $\mathbf{J}(r < a)$ sources and above $r = b$ by the $\mathbf{J}_s(r = b)$ sources. This then brings one back to a situation similar to the one previously encountered when considering a shell $a < r < c$ just enclosing a spherical sheet current at $r = b$. If, in addition to knowing the field $\mathbf{B}(R,\theta,\varphi)$ on the spherical surface $r = R$, enough information is also available at the Earth's surface (say at least $B_r$), all Gauss coefficients $g_n^m, h_n^m$, $q_n^m, s_n^m$, and $a_n^m, b_n^m$ can again be recovered. In that case, the field produced by the $\mathbf{J}(r < a)$ sources can be predicted everywhere for $r > a$, the one produced by the $\mathbf{J}(r > c)$ sources can be predicted everywhere for $r < c$, and the field produced by $\mathbf{J}_s(r = b)$ sources can be predicted everywhere (except at $r = b$). In particular, the total field $\mathbf{B}(r,\theta,\varphi)$ is then completely defined within the lower subshell $a < r < b$. Note, however, and this is important, that the field $\mathbf{B}^{sh}_{tor}(r,\theta,\varphi)$ (which is zero in the source-free lower subshell) is then still only known for $r = R$ and cannot be predicted elsewhere in the upper subshell (defined by $b < r < c$). Of course, this is not the only set of assumptions one may introduce. One may first relax the infinitely thin assumption made for the spherical sheet current, which
could then extend from $r = b - e$ to $r = b + e$, as was assumed earlier. Provided enough information is available within the lower subshell $a < r < b$ (say the two components $B_r(r,\theta,\varphi)$ and $B_\theta(r,\theta,\varphi)$ at the Earth's surface), then again the field produced by the $\mathbf{J}(r < a)$ sources can be predicted everywhere for $r > a$, the one produced by the $\mathbf{J}(r > b)$ sources can be predicted everywhere for $r < b$, and the field produced by the spherical (thick) sheet sources can be predicted everywhere except within $b - e < r < b + e$. Of course, the field $\mathbf{B}^{sh}_{tor}(r,\theta,\varphi)$ would still only be known for $r = R$ and would not be predicted elsewhere in the upper subshell. In fact, and as must now be obvious to the reader, this last point is precisely one of the several issues that make satellite data difficult to take advantage of. In particular, satellites are never on a rigorously circular orbit and do not exactly sample $\mathbf{B}(R,\theta,\varphi)$ (apart from the difficulties related to the proper sampling in space and time of the temporal variations of these currents). Rather they sample drifting elliptic shells within a spherical shell of average radius $R$ but with some thickness $h$. When this shell is thin enough, one may however rely on the thin-shell approximation introduced in Sect. 4, in which case one may use Eq. 111. As was then noted, this approximation is correct to within $\mathbf{B}^{sh}_{pol}$ and $\mathbf{B}^{sh}_{tor}$ corrections of order $h/R$. If the satellite indeed orbits within a region where only poloidal currents are to be found, $\mathbf{B}^{sh}_{pol} = 0$ and this correction only affects $\mathbf{B}^{sh}_{tor}$. If $h/R$ is indeed small, this correction is small enough, and all the reasoning above may be repeated. The practical applicability of such an approach for satellite magnetic measurements has been investigated by Backus (1986) and Olsen (1997). Olsen (1997) points out that for the Magsat 1980 mission, these numbers are $h = 100$ km and $R = 6{,}821$ km, i.e., $h/R \approx 0.015$, which seems to justify the thin-shell approximation.
Then, indeed, magnetic signatures from both field-aligned currents in the polar latitudes and meridional coupling currents associated with the equatorial electrojet (EEJ) (Maeda et al. 1982) are detected in the Magsat data, each an example of $\mathbf{B}^{sh}_{tor}$ produced by local $\mathbf{J}^{sh}_{pol}$ sources. Of course, additional or alternative assumptions can also be used. Olsen (1997), for instance, considers only radial poloidal currents in the sampling shell of Magsat, an assumption which is basically valid except for midlatitude interhemispheric currents. From Eq. 68, one can obtain purely radial currents if the radial dependency of $P_J$ is proportional to $1/r$, thus eliminating the first term. This idea can also be extended to purely meridional currents ($J_\varphi = 0$), either in the standard geocentric coordinates (Olsen 1997) or in a quasi-dipole (QD) coordinate system (Richmond 1995), as has been done by Sabaka et al. (2004) (the QD system is a warped coordinate system useful in describing phenomena which are organized according to the Earth's main field; see Richmond (1995) and Sect. 3.1 of the chapter by Olsen et al. in the handbook). Two classes of admissible scalar functions are then found to contribute to meridional currents: (1) those which are purely radial and (2) those which are QD zonal, i.e., $m = 0$. Clearly only the second class contributes to the horizontal components of $\mathbf{J}$, and so a nonvanishing first term in Eq. 68 for $P_J$ is required. Finally, it should also be mentioned that advanced investigations of the CHAMP satellite (Maus 2007) data have recently provided several examples of situations
revealing the presence of some $\mathbf{B}^{sh}_{pol}$ fields (and $\mathbf{J}^{sh}_{tor}$ sources) at satellite altitude, contradicting the assumptions described above (e.g., Lühr et al. 2002; Maus and Lühr 2006; Stolle et al. 2006). Although those fields are usually small (on the order of a few nT at most), they can be of comparable magnitude to the weakest signals produced by the smallest scales of the field of internal origin (the $\mathbf{J}(r < a)$ sources), which sets a limit to the satellite's ability to recover this field, despite the high quality of the measurements. There is no doubt that this limit is one of the greatest challenges the soon-to-be-launched (2011) ESA Swarm mission will have to face (Friis-Christensen et al. 2006, 2009).
7 Concluding Comments: From Theory to Practice
The present review was intended to provide the reader with the mathematical background relevant to geomagnetic field modeling. Often, mathematical rigor required that a number of simplifying assumptions be introduced with respect to the location of the various magnetic field sources and to the type and distribution of magnetic observations. In particular, these observations were systematically assumed to continuously sample idealized regions (be it an idealized spherical "Earth" surface or an idealized spherical "ionospheric" layer or shell). Also, all of the observations were implicitly assumed to be error free and synchronous in time, thereby avoiding the issue of the mathematical representation of the time variation of the various fields. Yet, those fields vary in time, sometimes quite fast in the case of the field produced by external sources, and in practice, observations are limited in number, affected by measurement errors, and not always synchronous (satellites take some time to complete their orbits). These departures from the ideal situations considered in this chapter are a significant source of concern for the practical computation of geomagnetic field models based on the various mathematical properties derived here. But they can fortunately be handled. Provided relevant temporal parameterizations are introduced, appropriate data selection used, and adequate so-called inverse methods employed, geomagnetic field models defined in terms of time-varying Gauss coefficients of the type described here can indeed be computed. Details about the way this is achieved are however very much dependent on the type of observations analyzed and on the field contribution one is more specifically interested in. Examples of geomagnetic field modeling based on historical ground-based observations, with special emphasis on the main field produced within the Earth's core, can be found in, e.g., Jackson et al. (2000) and Jackson and Finlay (2007).
A recent example of a geomagnetic field model based on satellite data and focusing on the field produced by the magnetization within the Earth's crust is provided by Maus et al. (2008). Additional examples based on the joint use of contemporary ground-based and satellite-borne observations for the modeling of both the field of internal and external origin can otherwise be found in, e.g., Sabaka et al. (2004), Thomson and Lesur (2007), Lesur et al. (2008), Olsen et al. (2009) and in the review paper by Hulot et al. (2007), where many more references are
provided. For approximation of the geomagnetic field, the conventional system of vector spherical harmonics is used. An approach based on locally supported vector wavelets is studied in the next chapter (chapter Multiscale Modeling of the Geomagnetic Field and Ionospheric Currents). Acknowledgment This is IPGP contribution 2596.
References Abramowitz M, Stegun IA (1964) Handbook of mathematical functions. Dover, New York Alberto P, Oliveira O, Pais MA (2004) On the non-uniqueness of main geomagnetic field determined by surface intensity measurements: the Backus problem. Geophys J Int 159: 558–554. 10.1111/j.1365-246X.2004.02413.x Backus GE (1968) Applications of a non-linear boundary value problem for Laplace’s equation to gravity and geomagnetic intensity surveys. Q J Mech Appl Math 21:195–221 Backus GE (1970) Non-uniqueness of the external geomagnetic field determined by surface intensity measurements. J Geophys Res 75(31):6339–6341 Backus GE (1974) Determination of the external geomagnetic field from intensity measurements. Geophys Res Lett 1(1):21 Backus G (1986) Poloidal and toroidal fields in geomagnetic field modeling. Rev Geophys 24:75–109 Backus G, Parker R, Constable C (1996) Foundations of geomagnetism. Cambridge University Press, New York Barraclough DR, Nevitt C (1976) The effect of observational errors on geomagnetic field models based solely on total-intensity measurements. Phys Earth Planet Int 13:123–131 Blakely RJ (1995) Potential theory in gravity and magnetic applications. Cambridge University Press, Cambridge Bloxham J (1985) Geomagnetic secular variation. PhD thesis, Cambridge University Dahlen F, Tromp J (1998) Theoretical global seismology. Princeton University Press, Princeton Edmonds A (1996) Angular momentum in quantum mechanics. Princeton University Press, Princeton Elsasser W (1946) Induction effects in terrestrial magnetism. Part I. Theory. Phys Rev 69(3–4):106–116 Ferrers NM (1877) An elementary treatise on spherical harmonics and subjects connected with them. Macmillan, London Friis-Christensen E, Lühr H, Hulot G (2006) Swarm: a constellation to study the Earth’s magnetic field. Earth Planets Space 58:351–358 Friis-Christensen E, Lühr H, Hulot G, Haagmans R, Purucker M (2009) Geomagnetic research from space. 
Eos 90:25 Gauss CF (1839) Allgemeine Theorie des Erdmagnetismus. Resultate aus den Beobachtungen des Magnetischen Vereins im Jahre 1838. Göttinger Magnetischer Verein, Leipzig Goldie AHR, Joyce JW (1940) In: Proceedings of the 1939 Washington Assembly of the Association of Terrestrial Magnetism and Electricity of the International Union of Geodesy and Geophysics vol 11(6). Neill & Co, Edinburgh Granzow DK (1983) Spherical harmonic representation of the magnetic field in the presence of a current density. Geophys J R Astron Soc 74:489–505 Harrison CGA (1987) The crustal field. In: Jacobs JA (ed) Geomagnetism, vol 1. Academic, London, pp 513–610 Holme R, Bloxham J (1995) Alleviation of the backus effect in geomagnetic field modelling. Geophys Res Lett 22:1641–1644 Holme R, James MA, Lühr H (2005) Magnetic field modelling from scalar-only data: resolving the Backus effect with the equatorial electrojet. Earth Planets Space 57:1203–1209
Hulot G, Khokhlov A, Le Mouël JL (1997) Uniqueness of mainly dipolar magnetic fields recovered from directional data. Geophys J Int 129:347–354 Hulot G, Sabaka TJ, Olsen N (2007) The present field. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam Jackson J (1998) Classical electrodynamics. Wiley, New York Jackson A, Finlay CC (2007) Geomagnetic secular variation and its application to the core. In: Kono M, (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam Jackson A, Jonkers ART, Walker MR (2000) Four centuries of geomagnetic secular variation from historical records. Philos Trans R Soc Lond A 358:957–990 Kellogg OD (1954) Foundations of potential theory. Dover, New York Khokhlov A, Hulot G, Le Mouël JL (1997) On the Backus effect – I. Geophys J Int 130:701–703 Khokhlov A, Hulot G, Le Mouël JL (1999) On the Backus effect – II. Geophys J Int 137:816–820 Kono M (1976) Uniqueness problems in the spherical analysis of the geomagnetic field direction data. J Geomagn Geoelectr 28:11–29 Langel RA (1987) The main field. In: Jacobs JA (ed) Geomagnetism, vol 1. Academic, London, pp 249–512 Langel RA, Hinze WJ (1998) The magnetic field of the Earth’s lithosphere: the satellite perspective. Cambridge University Press, Cambridge Lesur V, Wardinski I, Rother M, Mandea M (2008) GRIMM: the GFZ reference internal magnetic model based on vector satellite and observatory data. Geophys J Int 173:382–294 Lorrain P, Corson D (1970) Electromagnetic fields and waves. WH Freeman, San Francisco Lowes FJ (1966) Mean-square values on sphere of spherical harmonic vector fields. J Geophys Res 71:2179 Lowes FJ (1974) Spatial power spectrum of the main geomagnetic field, and extrapolation to the core. Geophys J R Astron Soc 36:717–730 Lowes FJ (1975) Vector errors in spherical harmonic analysis of scalar data. 
Geophys J R Astron Soc 42:637–651 Lowes FJ, De Santis A, Duka B (1995) A discussion of the uniqueness of a Laplacian potential when given only partial information on a sphere. Geophys J Int 121:579–584 Lühr H, Maus S, Rother M (2002) First in-situ observation of night-time F region currents with the CHAMP satellite. Geophys Res Lett 29(10):127.1–127.4. 10.1029/2001 GL 013845 Maeda H, Iyemori T, Araki T, Kamei T (1982) New evidence of a meridional current system in the equatorial ionosphere. Geophys Res Lett 9:337–340 Malin S (1987) Historical introduction to geomagnetism. In: Jacobs JA (ed) Geomagnetism, vol 1. Academic, London, pp 1–49 Mauersberger P (1956) Das Mittel der Energiedichte des geomagnetischen Hauptfeldes an der Erdoberfläche und seine säkulare Änderung. Gerl Beitr Geophys 65:207–215 Maus S (2007) CHAMP magnetic mission. In: Gubbins D, Herrero-Bervera E (eds) Encyclopedia of geomagnetism and paleomagnetism. Springer, Heidelberg Maus S, Lühr H (2006) A gravity-driven electric current in the Earth’s ionosphere identified in champ satellite magnetic measurements. Geophys Res Lett 33:L02812. doi:10.1029/2005GL024436 Maus S, Yin F, Lühr H, Manoj C, Rother M, Rauberg J, Michaelis I, Stolle C, Müller R (2008) Resolution of direction of oceanic magnetic lineations by the sixth-generation lithospheric magnetic field model from CHAMP satellite magnetic measurements. Geochem Geophys Geosyst 9(7):Q07021 Merrill R, McElhinny M (1983) The Earth’s magnetic field. Academic, London Merrill R, McFadden P, McElhinny M (1998) The magnetic field of the Earth: paleomagnetism, the core, and the deep mantle. Academic, London Mie G (1908) Considerations on the optic of turbid media, especially colloidal metal sols. Ann Phys (Leipzig) 25:377–442 Morse P, Feshbach H (1953) Methods of theoretical physics. International series in pure and applied physics. McGraw-Hill, New York
Olsen N (1997) Ionospheric F region currents at middle and low latitudes estimated from Magsat data. J Geophys Res 102(A3):4563–4576 Olsen N, Mandea M, Sabaka TJ, Tøffner-Clausen L (2009) CHAOS-2-a geomagnetic field model derived from one decade of continuous satellite data. Geophys J Int 199(3):1477–1487. doi:10.1111/j.1365-246X.2009.04386.x Proctor MRE, Gubbins D (1990) Analysis of geomagnetic directional data. Geophys J Int 100:69–77 Purucker M, Whaler K (2007) Crustal magnetism. In: Kono M (ed) Treatise on geophysics, vol 5. Elsevier, Amsterdam, pp 195–235 Richmond AD (1995) Ionospheric electrodynamics using magnetic Apex coordinates. J Geomagn Geoelectr 47:191–212 Sabaka TJ, Olsen N, Langel RA (2002) A comprehensive model of the quiet-time near-Earth magnetic field: phase 3. Geophys J Int 151:32–68 Sabaka TJ, Olsen N, Purucker ME (2004) Extending comprehensive models of the Earth’s magnetic field with Ørsted and CHAMP data. Geophys J Int 159:521–547. doi:10.1111/j.1365246X.2004.02421.x Schmidt A (1935) Tafeln der Normierten Kugelfunktionen. Engelhard-Reyher Verlag, Gotha Stern DP (1976) Representation of magnetic fields in space. Rev Geophys 14:199–214 Stern DP, Bredekamp JH (1975) Error enhancement in geomagnetic models derived from scalar data. J Geophys Res 80:1776–1782 Stern DP, Langel RA, Mead GD (1980) Backus effect observed by Magsat. Geophys Res Lett 7:941–944 Stolle C, Lühr H, Rother M, Balasis G (2006) Magnetic signatures of equatorial spread F , as observed by the CHAMP satellite. J Geophys Res 111:A02304. doi:10.1029/2005JA011184 Thomson AWP, Lesur V (2007) An improved geomagnetic data selection algorithm for global geomagnetic field modelling. Geophys J Int 169(3):951–963 Ultré-Guérard P, Hamoudi M, Hulot G (1998) Reducing the Backus effect given some knowledge of the dip-equator. 
Geophys Res Lett 22(16):3201–3204 Walker AD (1992) Comment on “Non-uniqueness of the external geomagnetic field determined by surface intensity measurements” by Georges E. Backus. J Geophys Res 97(B10):13991 Watson GN (1966) A treatise on the theory of Bessel function. Cambridge University Press, London Winch D, Ivers D, Turner J, Stening R (2005) Geomagnetism and Schmidt quasi-normalization. Geophys J Int 160(2):487–504
Multiscale Modeling of the Geomagnetic Field and Ionospheric Currents Christian Gerhards
Contents
1 Introduction
2 Relevant Function Systems
  2.1 Spherical Harmonics
  2.2 Green’s Function for the Beltrami Operator
  2.3 Single Layer Kernel
3 Two Approaches to Multiscale Representations
  3.1 Wavelets as Frequency Packages
  3.2 Locally Supported Wavelets
4 Application to Geomagnetic Problems
  4.1 Crustal Field Modeling and Separation of Sources
  4.2 Reconstruction of Radial Current Systems
  4.3 Multiscale Power Spectrum
5 Conclusion and Outlook
References
Abstract
This chapter gives a brief overview of the application of multiscale techniques to the modeling of geomagnetic problems. Two approaches are presented: one focusing on the construction of scaling and wavelet kernels in the frequency domain and the other focusing on a spatially oriented construction resulting in locally supported wavelets. Both approaches are applied, by way of example, to the modeling of the crustal field, the reconstruction of radial current systems, and the definition of a multiscale power spectrum.
C. Gerhards, Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_18
1 Introduction
The last decade has considerably improved the understanding of the Earth’s magnetic field due to high-precision vector satellite data supplied, e.g., by the Ørsted and CHAMP missions that were launched in 1999 and 2000, respectively. ESA’s new satellite constellation Swarm (cf. Friis-Christensen et al. 2006) is anticipated to conduct even more accurate measurements. The provided data generally contain contributions from various sources of the Earth’s magnetic field (cf. Fig. 1). The major contributions are due to dynamo processes in the Earth’s interior (core/main field), electric currents in the iono- and magnetosphere (external field), and static magnetization in the Earth’s lithosphere in combination with induction processes (crustal/lithospheric field). A better understanding and quantification of the different contributions requires sophisticated mathematical tools and a great effort in modeling. Current models comprising different parts of the geomagnetic field are, e.g., IGRF11 (cf. IAGA 2010), MF7 (2010), CHAOS-4 (cf. Olsen et al. 2014), and GRIMM-3 (2011). Further overviews of the geophysical background and the satellite data situation can be found, e.g., in Langel (1987), Langel and Hinze (1998), Hulot et al. (2007, 2010), and Thébault et al. (2010). In this article, we are mainly concerned with parts of the geomagnetic field whose length and time scales are such that the displacement currents in the Maxwell equations can be neglected (cf. Backus et al. 1996). Therefore, it is reasonable to concentrate the modeling effort on the pre-Maxwell equations

$$\nabla \wedge b = \mu_0\, j, \tag{1}$$
$$\nabla \cdot b = 0, \tag{2}$$

where $b$ denotes the magnetic field and $j$ the electric current density (for all theoretical considerations the vacuum permeability $\mu_0$ is chosen to be equal to one;
Fig. 1 Schematic description of the Earth’s magnetic field contributions
for the application to satellite data, we switch to the SI unit system). In regions with vanishing current density $j = 0$, such as the neutral atmosphere between the Earth’s surface $\Omega_{R_E} = \{x \in \mathbb{R}^3 \mid |x| = R_E\}$ and the lower bound of the ionosphere $\Omega_{R_I} = \{x \in \mathbb{R}^3 \mid |x| = R_I\}$, the classical Gauss representation holds true: the magnetic field $b$ can be expressed by a harmonic potential $U$ via

$$b = \nabla U, \tag{3}$$

where $U$ has the well-known representation

$$U = \sum_{n=1}^{\infty} \sum_{k=1}^{2n+1} a_{n,k}\, H^{\mathrm{int}}_{n,k}(R;\cdot) + c_{n,k}\, H^{\mathrm{ext}}_{n,k}(R;\cdot), \tag{4}$$

for adequate coefficients $a_{n,k}$, $c_{n,k}$, and a fixed radius $R \in (R_E, R_I)$. By $H^{\mathrm{int}}_{n,k}(R;\cdot)$ and $H^{\mathrm{ext}}_{n,k}(R;\cdot)$, we denote the inner and outer harmonics, respectively, which are the harmonic extensions of the spherical harmonics $\frac{1}{R} Y_{n,k}$ into $\Omega_R^{\mathrm{int}} = \{x \in \mathbb{R}^3 \mid |x| < R\}$ and $\Omega_R^{\mathrm{ext}} = \{x \in \mathbb{R}^3 \mid |x| > R\}$, respectively. The part $b^{\mathrm{int}} = \nabla U^{\mathrm{int}}$ of $b = \nabla U$ that consists of all outer harmonic contributions represents the magnetic field that is generated by source currents $j$ in the interior of $\Omega_{R_E}$, while the part $b^{\mathrm{ext}} = \nabla U^{\mathrm{ext}}$ of $b = \nabla U$ that consists of all inner harmonic contributions represents the magnetic field that is generated by sources in the exterior of $\Omega_{R_I}$. Satellite measurements, however, are generally conducted in regions of ionospheric currents, i.e., $j \neq 0$. Thus, the Gauss representation breaks down for modeling from such data sources. As a substitute, Gerlich (1972) and Backus (1986) suggest the Mie representation for geomagnetic applications. Any solenoidal vector field, such as the magnetic field $b$, can be decomposed into
$$b = p_b + q_b = \nabla \wedge L P_b + L Q_b, \tag{5}$$

with uniquely determined vector fields $p_b$, $q_b$ (the operator $L$ is the curl gradient acting at a point $x \in \mathbb{R}^3$ via $x \wedge \nabla_x$, where $\wedge$ denotes the vector product). The corresponding scalars $P_b$, $Q_b$ are uniquely determined if they have vanishing integral mean values, i.e., if $\frac{1}{4\pi}\int_\Omega P_b(r\eta)\, d\omega(\eta) = \frac{1}{4\pi}\int_\Omega Q_b(r\eta)\, d\omega(\eta) = 0$. The term $p_b$ is known as the poloidal part of the magnetic field, while $q_b$ is the toroidal part. The associated scalar functions $P_b$ and $Q_b$ are often just called Mie scalars. Due to (1), the corresponding current density allows a Mie representation as well:

$$j = p_j + q_j = \nabla \wedge L P_j + L Q_j. \tag{6}$$

In combination with the pre-Maxwell equations, this yields a fundamental connection for the Mie scalars of the magnetic field and its source currents:

$$Q_b = P_j, \tag{7}$$
$$\Delta P_b = -Q_j. \tag{8}$$
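The step from the Mie representations (5) and (6) to the relations (7) and (8) can be sketched as follows. This is a short outline, not taken from the original text, and it assumes sufficiently smooth Mie scalars, $\mu_0 = 1$, and the standard identities $\nabla \cdot (L P) = 0$ and $\Delta L = L \Delta$:

```latex
\begin{align*}
\nabla \wedge b
  &= \nabla \wedge (\nabla \wedge L P_b) + \nabla \wedge L Q_b \\
  &= \nabla (\nabla \cdot L P_b) - \Delta (L P_b) + \nabla \wedge L Q_b \\
  &= \nabla \wedge L Q_b + L(-\Delta P_b).
\end{align*}
% Comparing with j = \nabla \wedge L P_j + L Q_j and using the uniqueness
% of Mie scalars with vanishing integral mean values gives
% Q_b = P_j and \Delta P_b = -Q_j.
```

The poloidal scalar of the current is thus read off from the toroidal scalar of the field, and the toroidal scalar of the current from the Laplacian of the poloidal scalar of the field.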
In other words, toroidal magnetic fields are caused by poloidal currents and poloidal fields by toroidal currents. Furthermore, the Mie scalars represent the analogue of the scalar potential $U$ of the Gauss representation and can be expanded in terms of spherical harmonics on a fixed sphere $\Omega_R$. An extension into the exterior space $\Omega_R^{\mathrm{ext}}$ or the interior space $\Omega_R^{\mathrm{int}}$ by outer and inner harmonics, respectively, is not possible without further information since $P_b$, $P_j$, and $Q_b$, $Q_j$ are generally not harmonic.

Within the framework described above, Fourier expansions with respect to spherical harmonics are the most popular and widespread tool to describe the geomagnetic field and related quantities. They form the basis of all models mentioned in the beginning of this introduction. An overview of spherical harmonic methods in geomagnetism can be found, e.g., in Backus et al. (1996) and the contribution Sabaka et al. (2014) of this handbook. These methods perform well for global approximations and uniformly distributed data and have the advantage of physically relevant interpretations of the degree $n$ in terms of multipoles and frequencies. However, the global nature of spherical harmonics makes them less suitable for the reconstruction of strongly localized features, such as crustal field anomalies, or for modeling from local or unevenly distributed data sets. Over the years, different methods have been developed to address these topics. Haines (1985) has introduced spherical cap harmonics, a basis system for spherical caps that is obtained by solving the Laplace equation with adequate boundary values. A revision of the approach can be found in Thébault et al. (2006). Spline functions, as used for the gravitational potential in Freeden (1981), have been formulated for geomagnetic purposes in Shure et al. (1982). So-called Slepian functions on the sphere, which are obtained by optimization of the spatial localization under certain constraints of band-limitation in the frequency domain (or vice versa), are treated in Simons et al. (2006) and Simons and Plattner (2014). All of these approaches show the capability of modeling from local data, but the area of localization has to be fixed in advance. Multiscale approaches, on the other hand, are able to start on a global scale and then refine the localization step by step. Spherical versions have already been introduced in Dahlke et al. (1995), Schröder and Sweldens (1995), Freeden and Windheuser (1996), Holschneider (1996), and Freeden et al. (1998). The application to geomagnetic problems, however, is rather recent (see, e.g., Bayer et al. 2001; Holschneider et al. 2003; Maier and Mayer 2003; Chambodut et al. 2005; Mayer and Maier 2006; Freeden and Gerhards 2010; Gerhards 2012).

The goal of this chapter is to give an overview of the spherical multiscale methods developed at the Geomathematics Group of the University of Kaiserslautern and their application to problems in geomagnetism based on the geophysical foundations described in the beginning of the introduction. More precisely, in Sect. 2 we introduce necessary function systems such as spherical harmonics, Green’s function for the Beltrami operator, and the single layer kernel. Two different ways of constructing multiscale representations, one focused on the frequency domain and the other one focused on locally compact support in the spatial domain, are provided in Sect. 3. The application to different problems in geomagnetic modeling, namely, crustal field modeling, reconstruction of radial current systems, and the definition of a multiscale power spectrum, is then treated in Sect. 4.
2 Relevant Function Systems
Fourier expansions on the sphere in terms of spherical harmonics are the classical approach to potential field modeling in geosciences, so that an introduction of this function system is inevitable. We are, however, in the first place interested in spherical harmonics as stepping stones leading from the frequency-oriented Fourier expansions to spatially oriented multiscale methods. Further important function systems on this path are Green’s function for the Beltrami operator and the single layer kernel. In many geomagnetic applications they make the explicit use of spherical harmonics unnecessary.
2.1 Spherical Harmonics
The (scalar) spherical harmonics $Y_{n,k}$ of degree $n$ and order $k$ denote the infinitely often differentiable eigenfunctions of the Beltrami operator $\Delta^*$ to the eigenvalues $-n(n+1)$, i.e.,

$$\Delta^* Y_{n,k} = -n(n+1)\, Y_{n,k}, \quad n \in \mathbb{N}_0,\; k = 1, \ldots, 2n+1. \tag{9}$$

They can be orthonormalized in the sense

$$(Y_{n,k}, Y_{m,l})_{L^2(\Omega)} = \int_\Omega Y_{n,k}(\eta)\, Y_{m,l}(\eta)\, d\omega(\eta) = \begin{cases} 0, & \text{if } n \neq m \text{ or } k \neq l, \\ 1, & \text{if } n = m \text{ and } k = l. \end{cases} \tag{10}$$

Note that $\Omega = \Omega_1$ denotes the unit sphere. Whenever we mention spherical harmonics $Y_{n,k}$ in this chapter, we mean them to be $L^2(\Omega)$-orthonormalized in the above sense. It should be noted that, different from our convention, Schmidt semi-normalized spherical harmonics are more commonly used in geomagnetic literature. Any square integrable function $F \in L^2(\Omega)$ can be represented by the Fourier expansion

$$F = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} F^\wedge(n,k)\, Y_{n,k}, \tag{11}$$

with Fourier coefficients

$$F^\wedge(n,k) = \int_\Omega F(\eta)\, Y_{n,k}(\eta)\, d\omega(\eta), \tag{12}$$

where convergence of the series is meant with respect to the $L^2(\Omega)$-topology. Of some further importance for the analysis of harmonic potentials are the so-called inner and outer harmonics
$$H^{\mathrm{int}}_{n,k}(R;x) = \frac{1}{R}\left(\frac{r}{R}\right)^{n} Y_{n,k}(\xi), \quad x \in \Omega_R^{\mathrm{int}}, \tag{13}$$
$$H^{\mathrm{ext}}_{n,k}(R;x) = \frac{1}{R}\left(\frac{R}{r}\right)^{n+1} Y_{n,k}(\xi), \quad x \in \Omega_R^{\mathrm{ext}}, \tag{14}$$

where $x = r\xi$ with $r = |x|$, $\xi = \frac{x}{|x|}$, as always in this chapter. They are the uniquely determined harmonic functions in $\Omega_R^{\mathrm{int}}$ and $\Omega_R^{\mathrm{ext}}$, respectively, with spherical harmonics $\frac{1}{R} Y_{n,k}$ as boundary values on $\Omega_R$. We know that the scalar potential $U$ of the Gauss representation $b = \nabla U$ can be expressed in terms of (scalar) spherical harmonics and inner and outer harmonics (cf. expression (4)). The magnetic field $b$ itself, however, is a vectorial quantity, so that it would be convenient to have vectorial counterparts of the (scalar) spherical harmonics at hand. To this end, we define the three spherical operators $o^{(1)}_\xi F(\xi) = \xi F(\xi)$, $o^{(2)}_\xi F(\xi) = \nabla^*_\xi F(\xi)$, and $o^{(3)}_\xi F(\xi) = L^*_\xi F(\xi)$, acting at $\xi \in \Omega$ on a sufficiently often differentiable scalar function $F$ on the sphere $\Omega$. The operator $\nabla^*$ denotes the surface gradient, related to the gradient via $\nabla_x = \xi \frac{\partial}{\partial r} + \frac{1}{r} \nabla^*_\xi$, while $L^*$ denotes the surface curl gradient, acting at a point $\xi \in \Omega$ via $L^*_\xi = \xi \wedge \nabla^*_\xi$. The well-known spherical Helmholtz decomposition states that any vector field $f \in c^{(1)}(\Omega)$ can be represented by

$$f = o^{(1)} F_1 + o^{(2)} F_2 + o^{(3)} F_3, \tag{15}$$

with uniquely determined scalar functions $F_1 \in C^{(1)}(\Omega)$ and $F_2, F_3 \in C^{(2)}(\Omega)$ satisfying

$$\frac{1}{4\pi} \int_\Omega F_2(\eta)\, d\omega(\eta) = \frac{1}{4\pi} \int_\Omega F_3(\eta)\, d\omega(\eta) = 0. \tag{16}$$

Therefore, it is straightforward to define a system of vector spherical harmonics of type $i$ in the following way.

Definition 1. The functions $y^{(i)}_{n,k}$ given by

$$y^{(i)}_{n,k} = \left(\mu^{(i)}_n\right)^{-\frac{1}{2}} o^{(i)} Y_{n,k}, \quad i = 1, 2, 3,\; n \in \mathbb{N}_{0_i},\; k = 1, \ldots, 2n+1, \tag{17}$$

are called vector spherical harmonics of type $i$, degree $n$, and order $k$. For brevity, we use the notations $0_1 = 0$, $0_2 = 0_3 = 1$, and normalization constants $\mu^{(1)}_n = 1$, $\mu^{(2)}_n = \mu^{(3)}_n = n(n+1)$.

As in the scalar case, any vectorial function $f \in l^2(\Omega)$ can be expressed as a Fourier series
$$f = \sum_{i=1}^{3} \sum_{n=0_i}^{\infty} \sum_{k=1}^{2n+1} \left(f^{(i)}\right)^\wedge(n,k)\, y^{(i)}_{n,k}, \tag{18}$$

with Fourier coefficients

$$\left(f^{(i)}\right)^\wedge(n,k) = \int_\Omega f(\eta) \cdot y^{(i)}_{n,k}(\eta)\, d\omega(\eta). \tag{19}$$
This way we have defined a vectorial orthonormal basis for $l^2(\Omega)$ that decomposes a vector field $f$ with respect to the Helmholtz representation, i.e., it decomposes $f$ into a radial part ($i = 1$) and two tangential parts ($i = 2, 3$). There are further ways to define vector basis systems. A system especially suited for the separation of the geomagnetic field into interior and exterior sources requires the operators $\tilde{o}^{(1)} = o^{(1)}\left(D + \frac{1}{2}\right) - o^{(2)}$, $\tilde{o}^{(2)} = o^{(1)}\left(D - \frac{1}{2}\right) + o^{(2)}$, and $\tilde{o}^{(3)} = o^{(3)}$, with

$$D = \left(-\Delta^* + \frac{1}{4}\right)^{\frac{1}{2}}. \tag{20}$$

The operator $D$ is characterized by the spherical harmonics $Y_{n,k}$ being the infinitely often differentiable eigenfunctions to the eigenvalues $n + \frac{1}{2}$, i.e.,

$$D\, Y_{n,k} = \left(n + \frac{1}{2}\right) Y_{n,k}, \quad n \in \mathbb{N}_0,\; k = 1, \ldots, 2n+1. \tag{21}$$

Its inverse is called the (spherical) single layer operator (some more details are given in Sect. 2.3). We can now define an alternative orthonormal system of vector spherical harmonics.

Definition 2. The functions $\tilde{y}^{(i)}_{n,k}$ given by

$$\tilde{y}^{(i)}_{n,k} = \left(\tilde{\mu}^{(i)}_n\right)^{-\frac{1}{2}} \tilde{o}^{(i)} Y_{n,k}, \quad i = 1, 2, 3,\; n \in \mathbb{N}_{0_i},\; k = 1, \ldots, 2n+1, \tag{22}$$

are called vector spherical harmonics of type $i$, degree $n$, and order $k$. The normalization constants are $\tilde{\mu}^{(1)}_n = (n+1)(2n+1)$, $\tilde{\mu}^{(2)}_n = n(2n+1)$, and $\tilde{\mu}^{(3)}_n = n(n+1)$.

Again, any vectorial function $f \in l^2(\Omega)$ can be expressed as a Fourier series

$$f = \sum_{i=1}^{3} \sum_{n=0_i}^{\infty} \sum_{k=1}^{2n+1} \left(\tilde{f}^{(i)}\right)^\wedge(n,k)\, \tilde{y}^{(i)}_{n,k}, \tag{23}$$

with Fourier coefficients

$$\left(\tilde{f}^{(i)}\right)^\wedge(n,k) = \int_\Omega f(\eta) \cdot \tilde{y}^{(i)}_{n,k}(\eta)\, d\omega(\eta). \tag{24}$$
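The normalization constants of Definition 2 can be checked numerically for a single sample harmonic. The sketch below is illustrative only and not part of the original text: it assumes the degree-1 harmonic $Y(\xi) = \sqrt{3/(4\pi)}\,\xi_3$, for which the surface gradient is known in closed form, and a simple product quadrature on the sphere (Gauss-Legendre in $\cos\vartheta$, equidistant in longitude). All function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

C1 = np.sqrt(3.0 / (4.0 * np.pi))   # L2-normalization of Y(xi) = C1 * xi_3
E3 = np.array([0.0, 0.0, 1.0])
N = 1                                # degree of the sample harmonic (assumption)


def sphere_quadrature(n_t=16, n_phi=32):
    """Product rule on the unit sphere: Gauss-Legendre in t = cos(theta),
    equidistant nodes in longitude. Exact for low-degree polynomials."""
    t, w_t = legendre.leggauss(n_t)
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    T, PHI = np.meshgrid(t, phi, indexing="ij")
    s = np.sqrt(1.0 - T**2)
    points = np.stack([s * np.cos(PHI), s * np.sin(PHI), T], axis=-1).reshape(-1, 3)
    weights = np.repeat(w_t, n_phi) * (2.0 * np.pi / n_phi)
    return points, weights


def tilde_fields(xi):
    """Return o~(1)Y, o~(2)Y, o~(3)Y at the points xi (shape (M, 3))."""
    y = C1 * xi[:, 2]                              # Y(xi)
    grad = C1 * (E3 - xi[:, 2:3] * xi)             # surface gradient of Y
    curl = np.cross(xi, grad)                      # surface curl gradient of Y
    o1 = (N + 1) * xi * y[:, None] - grad          # o(1)(D + 1/2)Y - o(2)Y
    o2 = N * xi * y[:, None] + grad                # o(1)(D - 1/2)Y + o(2)Y
    return o1, o2, curl


points, weights = sphere_quadrature()
o1, o2, o3 = tilde_fields(points)
norms_sq = [float(np.sum(weights * np.sum(o * o, axis=1))) for o in (o1, o2, o3)]
cross_12 = float(np.sum(weights * np.sum(o1 * o2, axis=1)))

print("squared norms:", norms_sq)    # compare with (n+1)(2n+1), n(2n+1), n(n+1)
print("inner product of types 1 and 2:", cross_12)
```

For $n = 1$ the three squared norms should agree with $(n+1)(2n+1)$, $n(2n+1)$, and $n(n+1)$, and the mixed inner product should vanish up to rounding.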
The advantage of this system is its connection to outer and inner harmonics:

$$\nabla_x H^{\mathrm{int}}_{n,k}(R;x) = \frac{1}{R^2}\left(\frac{r}{R}\right)^{n-1} \left(\tilde{\mu}^{(2)}_n\right)^{\frac{1}{2}} \tilde{y}^{(2)}_{n,k}(\xi), \quad x \in \Omega_R^{\mathrm{int}}, \tag{25}$$
$$\nabla_x H^{\mathrm{ext}}_{n,k}(R;x) = -\frac{1}{R^2}\left(\frac{R}{r}\right)^{n+2} \left(\tilde{\mu}^{(1)}_n\right)^{\frac{1}{2}} \tilde{y}^{(1)}_{n,k}(\xi), \quad x \in \Omega_R^{\mathrm{ext}}. \tag{26}$$

A more detailed overview of scalar and vector spherical harmonics, as well as of tensor spherical harmonics, which are not treated here, can be found, e.g., in the monographs Müller (1966), Freeden et al. (1998), and Freeden and Schreiner (2009). The second set of vector spherical harmonics and some of its applications are additionally treated, e.g., in Edmonds (1957), Backus et al. (1996), and Mayer and Maier (2006).
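The link between the gradients of inner and outer harmonics and the operators $\tilde{o}^{(1)}$, $\tilde{o}^{(2)}$ can be verified numerically. The following sketch is an illustration under stated assumptions, not the author's code: it uses the degree-1 harmonic $Y(\xi) = \sqrt{3/(4\pi)}\,\xi_3$, an arbitrary reference radius, and central finite differences, and it takes the signs as they follow from differentiating (13) and (14) directly.

```python
import numpy as np

C1 = np.sqrt(3.0 / (4.0 * np.pi))   # normalization of the sample harmonic
E3 = np.array([0.0, 0.0, 1.0])
N = 1                                # degree of the sample harmonic (assumption)
R = 1.3                              # reference radius (arbitrary choice)


def Y(xi):
    """Sample spherical harmonic of degree 1 on the unit sphere."""
    return C1 * xi[2]


def surface_grad_Y(xi):
    """Surface gradient of Y: tangential projection of C1 * e_3."""
    return C1 * (E3 - xi[2] * xi)


def H_int(x):
    r = np.linalg.norm(x)
    return (1.0 / R) * (r / R) ** N * Y(x / r)


def H_ext(x):
    r = np.linalg.norm(x)
    return (1.0 / R) * (R / r) ** (N + 1) * Y(x / r)


def numerical_gradient(f, x, h=1e-6):
    """Central finite-difference gradient in Cartesian coordinates."""
    grad = np.zeros(3)
    for i in range(3):
        step = np.zeros(3)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * h)
    return grad


def grad_H_int_formula(x):
    r = np.linalg.norm(x)
    xi = x / r
    o_tilde_2 = N * xi * Y(xi) + surface_grad_Y(xi)        # o(1)(D - 1/2)Y + o(2)Y
    return (r / R) ** (N - 1) / R**2 * o_tilde_2


def grad_H_ext_formula(x):
    r = np.linalg.norm(x)
    xi = x / r
    o_tilde_1 = (N + 1) * xi * Y(xi) - surface_grad_Y(xi)  # o(1)(D + 1/2)Y - o(2)Y
    return -((R / r) ** (N + 2)) / R**2 * o_tilde_1


x_in = np.array([0.2, -0.3, 0.4])    # |x_in| < R
x_out = np.array([1.0, -2.0, 1.5])   # |x_out| > R
err_int = np.max(np.abs(numerical_gradient(H_int, x_in) - grad_H_int_formula(x_in)))
err_ext = np.max(np.abs(numerical_gradient(H_ext, x_out) - grad_H_ext_formula(x_out)))
print("max deviation, inner harmonic:", err_int)
print("max deviation, outer harmonic:", err_ext)
```

Both deviations should be at the level of the finite-difference error. In this example the gradient of the outer harmonic is proportional to $-\tilde{o}^{(1)} Y$, while the gradient of the inner harmonic is proportional to $+\tilde{o}^{(2)} Y$.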
2.2 Green’s Function for the Beltrami Operator
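As Green’s function with respect to the Beltrami operator, we define the rotationally invariant function $G(\Delta^*; \xi \cdot \eta)$ satisfying

$$\Delta^*_\xi\, G(\Delta^*; \xi \cdot \eta) = -\frac{1}{4\pi}, \quad \xi, \eta \in \Omega,\; \xi \neq \eta, \tag{27}$$

with a singularity of type

$$G(\Delta^*; \xi \cdot \eta) = O(\ln(1 - \xi \cdot \eta)), \quad \xi, \eta \in \Omega,\; \xi \to \eta. \tag{28}$$

Uniqueness is guaranteed by the claim that $G(\Delta^*; \cdot)$ has vanishing integral mean value on the unit sphere. Observing that the spherical harmonics $Y_{n,k}$ are eigenfunctions of the Beltrami operator with eigenvalues $-n(n+1)$, one can derive the Fourier representation

$$G(\Delta^*; \xi \cdot \eta) = \sum_{n=1}^{\infty} \sum_{k=1}^{2n+1} \frac{1}{-n(n+1)}\, Y_{n,k}(\xi)\, Y_{n,k}(\eta) = -\sum_{n=1}^{\infty} \frac{2n+1}{4\pi\, n(n+1)}\, P_n(\xi \cdot \eta), \quad \xi, \eta \in \Omega,\; \xi \neq \eta, \tag{29}$$

where $P_n$ denotes the Legendre polynomial of degree $n$. Of more relevance to us is that a closed representation

$$G(\Delta^*; \xi \cdot \eta) = \frac{1}{4\pi} \ln(1 - \xi \cdot \eta) + \frac{1}{4\pi}(1 - \ln(2)), \quad \xi, \eta \in \Omega,\; \xi \neq \eta, \tag{30}$$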
is available. Furthermore, making use of the defining properties of Green’s function, we obtain a simple integral relationship of a twice continuously differentiable function $F$ to its Beltrami derivative $\Delta^* F$.

Theorem 1. Let $F$ be of class $C^{(2)}(\Omega)$. Then

$$F(\xi) = \frac{1}{4\pi} \int_\Omega F(\eta)\, d\omega(\eta) + \int_\Omega G(\Delta^*; \xi \cdot \eta)\, \Delta^*_\eta F(\eta)\, d\omega(\eta), \quad \xi \in \Omega. \tag{31}$$

As a consequence of the above representation, it is straightforward to see that

$$F(\xi) = \int_\Omega G(\Delta^*; \xi \cdot \eta)\, H(\eta)\, d\omega(\eta), \quad \xi \in \Omega, \tag{32}$$

is the uniquely determined solution with vanishing integral mean value of the Beltrami equation

$$\Delta^*_\xi F(\xi) = H(\xi), \quad \xi \in \Omega, \tag{33}$$

where $H$ is a continuous function having vanishing integral mean value itself. Since the Beltrami equation as well as differential equations involving the surface gradient $\nabla^*$ and the surface curl gradient $L^*$ appear frequently in geomagnetic problems, Eq. (31) is a helpful companion in our later modeling. Using the expression (29) together with (32), we obtain the classical Fourier expansion of the solution $F$, while the closed expression (30) is a first step towards a more spatially oriented representation of $F$. The crucial idea for a multiscale representation with spatially locally supported wavelets is a regularization of $G(\Delta^*; \cdot)$ around its singularity. A typical choice for the regularization $G^\rho(\Delta^*; \cdot)$ with regularization parameter $\rho \in (0, 2)$ is to substitute the Green function on the spherical cap $\Gamma_\rho(\xi) = \{\eta \in \Omega \mid 1 - \xi \cdot \eta < \rho\}$ with center $\xi$ and radius $\rho$ by its Taylor polynomial centered at $1 - \rho$. Taking the Taylor polynomial of degree 2 leads to a twice continuously differentiable function:

Definition 3. By $G^\rho(\Delta^*; \cdot)$, $\rho \in (0, 2)$, we denote the regularized Green function

$$G^\rho(\Delta^*; \xi \cdot \eta) = \begin{cases} \dfrac{1}{4\pi} \ln(1 - \xi \cdot \eta) + \dfrac{1}{4\pi}(1 - \ln(2)), & 1 - \xi \cdot \eta \geq \rho, \\[2ex] -\dfrac{1}{8\pi\rho^2}(1 - \xi \cdot \eta)^2 + \dfrac{1}{2\pi\rho}(1 - \xi \cdot \eta) + \dfrac{1}{4\pi}(\ln(\rho) - \ln(2)) - \dfrac{1}{8\pi}, & 1 - \xi \cdot \eta < \rho. \end{cases} \tag{34}$$
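The regularization can be made concrete with a few lines of code. The sketch below is illustrative and not part of the original text: it evaluates the closed form of Green's function and its degree-2 Taylor regularization as functions of $t = \xi \cdot \eta$, with hypothetical function names, and prints values on both sides of the cap boundary $1 - t = \rho$, where the two branches are expected to match.

```python
import numpy as np


def green(t):
    """Closed form of Green's function for the Beltrami operator, t = xi . eta < 1."""
    return (np.log(1.0 - t) + 1.0 - np.log(2.0)) / (4.0 * np.pi)


def green_reg(t, rho):
    """Regularized Green function: the logarithm is replaced on the cap
    1 - t < rho by its degree-2 Taylor polynomial centered at 1 - t = rho."""
    t = np.asarray(t, dtype=float)
    s = 1.0 - t
    in_cap = s < rho
    s_safe = np.where(in_cap, rho, s)          # avoids log(0) inside the cap
    outside = (np.log(s_safe) + 1.0 - np.log(2.0)) / (4.0 * np.pi)
    inside = (-s**2 / (8.0 * np.pi * rho**2)
              + s / (2.0 * np.pi * rho)
              + (np.log(rho) - np.log(2.0)) / (4.0 * np.pi)
              - 1.0 / (8.0 * np.pi))
    return np.where(in_cap, inside, outside)


rho = 0.5
boundary = 1.0 - rho
for t in (boundary - 1e-3, boundary, boundary + 1e-3, 1.0):
    print(f"t = {t:.4f}   G_rho = {float(green_reg(t, rho)): .6f}")
```

Unlike the original kernel, the regularized kernel stays finite at $t = 1$, and smaller values of $\rho$ confine the modification to a smaller cap.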
Higher-degree Taylor polynomials yield smoother regularizations and become necessary when higher-order differential operators are involved. But for our later applications, it is generally sufficient to use (34). Only the example in Sect. 4.2 requires $G^\rho(\Delta^*; \cdot)$ to be three times continuously differentiable (for more details on higher-order regularizations, the reader is referred to Gerhards (2011a) and Freeden and Gerhards (2012)). At this point it should be emphasized that $G(\Delta^*; \cdot)$ and its regularization only depend on the scalar product $\xi \cdot \eta$, so that they can be regarded as functions acting on the interval $[-1, 1)$ and $[-1, 1]$, respectively. This simplifies many calculations. A plot of the regularized Green function (34) can be found in Fig. 2. The next question is how a convolution with the regularized Green function behaves in comparison to a convolution with the original Green function, asymptotically as $\rho \to 0+$. Elementary calculations yield the following relation for functions $F \in C^{(0)}(\Omega)$:

$$\lim_{\rho \to 0+} \sup_{\xi \in \Omega} \left| \int_\Omega G^\rho(\Delta^*; \xi \cdot \eta)\, F(\eta)\, d\omega(\eta) - \int_\Omega G(\Delta^*; \xi \cdot \eta)\, F(\eta)\, d\omega(\eta) \right| = 0. \tag{35}$$

Vectorial kernels are achieved by application of the surface gradient and the surface curl gradient. Since the tangential derivatives of $G(\Delta^*; \cdot)$ are still integrable on the sphere, a similar argumentation yields

$$\lim_{\rho \to 0+} \sup_{\xi \in \Omega} \left| \int_\Omega \Lambda^*_\xi G^\rho(\Delta^*; \xi \cdot \eta) \cdot f(\eta)\, d\omega(\eta) - \int_\Omega \Lambda^*_\xi G(\Delta^*; \xi \cdot \eta) \cdot f(\eta)\, d\omega(\eta) \right| = 0, \tag{36}$$

for $\Lambda^*$ being one of the operators $\nabla^*$ or $L^*$ and $f \in c^{(0)}(\Omega)$. Using the surface theorem of Gauss, twofold application of the surface differential operators implies tensorial kernels satisfying
Fig. 2 Regularized Green’s function # ! G . I cos.#// (left) and regularized single layer kernel # ! S .cos.#// (right), plotted with respect to the angular distance
$$\lim_{\rho \to 0+} \sup_{\xi \in \Omega} \left| \int_\Omega \left( \Lambda^{*,(1)}_\xi \otimes \Lambda^{*,(2)}_\xi G^\rho(\Delta^*; \xi \cdot \eta) \right) f(\eta)\, d\omega(\eta) - \Lambda^{*,(1)}_\xi \int_\Omega \Lambda^{*,(2)}_\xi G(\Delta^*; \xi \cdot \eta) \cdot f(\eta)\, d\omega(\eta) \right| = 0, \tag{37}$$

for $f \in c^{(1)}(\Omega)$ and $\Lambda^{*,(1)}$, $\Lambda^{*,(2)}$ one of the operators $\nabla^*$, $L^*$ (by $\otimes$ we mean the tensor product of two vectorial quantities). Various further relations, e.g., those involving the Beltrami operator $\Delta^*$, can be shown under sufficient smoothness assumptions on the data $F$ and $f$. The limit relations provided here should be sufficient to motivate the convergence of the multiscale representations treated later on. Such regularizations of Green’s function for the Beltrami operator were first introduced in Freeden and Schreiner (2006) and Fehlinger et al. (2007, 2008). In geomagnetic applications they have been used in Freeden and Gerhards (2010) and Gerhards (2011a, 2012).
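The convergence of convolutions with the regularized kernel towards those with the original kernel can be observed in the frequency domain. The sketch below is an illustration, not the author's implementation, and relies on two assumptions: by the Funk-Hecke formula, convolving a zonal kernel $K(\xi \cdot \eta)$ with a degree-$n$ spherical harmonic multiplies it by $2\pi \int_{-1}^{1} K(t) P_n(t)\, dt$, and for the unregularized Green function this factor equals $-1/(n(n+1))$. The integral is split at the cap boundary and evaluated with Gauss-Legendre quadrature; all names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def green(t):
    """Closed form of Green's function for the Beltrami operator (t < 1)."""
    return (np.log(1.0 - t) + 1.0 - np.log(2.0)) / (4.0 * np.pi)


def green_cap(t, rho):
    """Degree-2 Taylor branch of the regularized kernel, valid for 1 - t < rho."""
    s = 1.0 - t
    return (-s**2 / (8.0 * np.pi * rho**2) + s / (2.0 * np.pi * rho)
            + (np.log(rho) - np.log(2.0)) / (4.0 * np.pi) - 1.0 / (8.0 * np.pi))


def legendre_p(n, t):
    """Legendre polynomial P_n evaluated at t."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return legendre.legval(t, coeffs)


def gauss_legendre(f, a, b, nodes=200):
    """Gauss-Legendre quadrature of f over [a, b]."""
    x, w = legendre.leggauss(nodes)
    half = 0.5 * (b - a)
    return half * np.sum(w * f(0.5 * (a + b) + half * x))


def symbol_reg(n, rho):
    """Funk-Hecke multiplier of the regularized Green function for degree n."""
    outer = gauss_legendre(lambda t: green(t) * legendre_p(n, t), -1.0, 1.0 - rho)
    cap = gauss_legendre(lambda t: green_cap(t, rho) * legendre_p(n, t), 1.0 - rho, 1.0)
    return 2.0 * np.pi * (outer + cap)


for n in (1, 2, 5):
    limit = -1.0 / (n * (n + 1))
    for rho in (0.5, 0.1, 0.01):
        print(f"n = {n}, rho = {rho:5.2f}: symbol = {symbol_reg(n, rho): .6f}, limit = {limit: .6f}")
```

For each fixed degree the multiplier approaches $-1/(n(n+1))$ as $\rho$ decreases, which is the frequency-domain counterpart of the limit relation (35).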
2.3 Single Layer Kernel
In this subsection we briefly come back to the operator $D$ from (20), more precisely its inverse $D^{-1}$. Since the spherical harmonics form the eigenfunctions of $D$ to the eigenvalues $n + \frac{1}{2}$, we get

$$D^{-1} F = \sum_{n=0}^{\infty} \sum_{k=1}^{2n+1} \frac{1}{n + \frac{1}{2}}\, F^\wedge(n,k)\, Y_{n,k}, \tag{38}$$

for $F \in C^{(0)}(\Omega)$. More interesting for our spatial considerations is again that $D^{-1}$ can be rewritten as a convolution operator:

$$D^{-1} F(\xi) = \frac{1}{2\pi} \int_\Omega S(\xi \cdot \eta)\, F(\eta)\, d\omega(\eta), \quad \xi \in \Omega, \tag{39}$$

where $S$ is the (spherical) single layer kernel given by

$$S(\xi \cdot \eta) = \frac{1}{\sqrt{2(1 - \xi \cdot \eta)}}, \quad \xi, \eta \in \Omega,\; \xi \neq \eta. \tag{40}$$

The function $S$ coincides, up to a multiplicative constant, with the fundamental solution of the Laplace operator in $\mathbb{R}^3$ restricted to the unit sphere, which is also the reason for the name (spherical) single layer kernel. $S$ and $D^{-1}$ are not as naturally connected to the governing geomagnetic equations as Green’s function for the Beltrami operator and the spherical differential operators $\nabla^*$, $L^*$, and $\Delta^*$,
890
C. Gerhards
but they are required for a decomposition of the magnetic field with respect to interior and exterior sources involving the operators oQ .i / , i D 1; 2; 3. A regularization S , 2 .0; 2/, of S around its singularity can be achieved in analogy to the Green function for the Beltrami operator. For our purposes the following continuously differentiable example is sufficient: Definition 4. By S , 2 .0; 2/, we denote the regularized single layer kernel (
S . / D
1 p1 .1 / 2 ; 2 3 1 p 2 .1 2 2
1 ;
/ C
1 3 p 2 ; 2 2
1 < :
(41)
An illustration is given in Fig. 2. The behavior of convolutions with the regularized single layer kernel is also similar to those involving G . I /. Thus, we get for F 2 C .0/ ./: ˇZ ˇ Z ˇ ˇ ˇ lim sup ˇ S . /F . /d !. / S . /F . /d !. /ˇˇ D 0: !0C 2
(42)
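The regularization in Definition 4 is easy to evaluate numerically. The following is a minimal sketch, assuming NumPy and writing $t = \xi\cdot\eta$; the function names `single_layer` and `single_layer_reg` are chosen here for illustration and do not come from the chapter.

```python
import numpy as np

def single_layer(t):
    """Single layer kernel S(t) = 1 / sqrt(2 (1 - t)) from (40), valid for t < 1."""
    t = np.asarray(t, dtype=float)
    return 1.0 / np.sqrt(2.0 * (1.0 - t))

def single_layer_reg(t, rho):
    """Regularized kernel S^rho from (41): equals S for 1 - t >= rho,
    and is an affine function of (1 - t) inside the cap 1 - t < rho."""
    s = 1.0 - np.asarray(t, dtype=float)
    # np.maximum keeps the unused branch finite inside the cap (no division by zero)
    outer = 1.0 / np.sqrt(2.0 * np.maximum(s, rho))
    inner = 3.0 / (2.0 * np.sqrt(2.0 * rho)) - s / (2.0 * np.sqrt(2.0) * rho**1.5)
    return np.where(s >= rho, outer, inner)

rho = 0.25
# Value at the former singularity t = 1 is finite after regularization
value_at_pole = single_layer_reg(1.0, rho)
```

Evaluating both branches at $1 - t = \rho$ gives the same value $(2\rho)^{-1/2}$ and the same slope with respect to $1-t$, which is the continuous differentiability mentioned before Definition 4.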
If we want to apply $\nabla^*_\xi$ and $L^*_\xi$ to the above expression, we need to be aware that $\eta \mapsto \nabla^*_\xi S(\xi\cdot\eta)$ and $\eta \mapsto L^*_\xi S(\xi\cdot\eta)$ are not integrable on the sphere. Assuming that $F\in C^{(1)}(\Omega)$ we still get
\[
\lim_{\rho\to 0+}\,\sup_{\xi\in\Omega}\left| \Lambda^*_\xi\int_\Omega S^\rho(\xi\cdot\eta)\,F(\eta)\,d\omega(\eta) - \Lambda^*_\xi\int_\Omega S(\xi\cdot\eta)\,F(\eta)\,d\omega(\eta) \right| = 0, \tag{43}
\]
where $\Lambda^*_\xi$ is one of the operators $\nabla^*_\xi$, $L^*_\xi$. Corresponding relations for further applications of differential operators and combinations of the single layer kernel and Green's function become necessary for some applications later on, but we skip them at this point because the argumentation is similar. More details can be found in Gerhards (2011a, 2012). In the different context of gravity disturbances, regularizations of the single layer kernel have been used as well in Freeden and Wolf (2008) and Wolf (2009).
3 Two Approaches to Multiscale Representations
As mentioned in the introduction, we want to substitute global Fourier expansions by multiscale representations involving scaling kernels $\Phi_J(\cdot,\cdot)$ and wavelet kernels $\Psi_J(\cdot,\cdot)$ with certain spatial localization properties. In the following we motivate and describe these representations and the choice of the kernels. We present a construction scheme in frequency domain as well as spatial domain. Both approaches are based on the same principles in the sense that the wavelet kernels arise as differences of scaling kernels. However, the frequency approach is somewhat more flexible in the choice of the scaling parameters and leads to a multiresolution analysis, while the spatial approach has its advantages in the locally supported wavelets and the closed representations in terms of elementary functions. Explicit examples are provided in the applications in Sect. 4.
3.1 Wavelets as Frequency Packages
In the scalar case, a so-called scaling kernel $\Phi_J(\cdot,\cdot)$ at scale $J$ is formally expressed by
\[
\Phi_J(\xi,\eta) = \sum_{n=0}^{\infty}\sum_{k=1}^{2n+1} \Phi_J^\wedge(n)\,Y_{n,k}(\xi)\,Y_{n,k}(\eta)
= \sum_{n=0}^{\infty} \frac{2n+1}{4\pi}\,\Phi_J^\wedge(n)\,P_n(\xi\cdot\eta), \quad \xi,\eta\in\Omega, \tag{44}
\]
where $\Phi_J^\wedge(n)$ are coefficients (also called symbols) reflecting the spatial and frequency behavior of the kernel. Typically the choice is such that the kernels show a better spatial localization as the scale $J$ increases. The fact that $\Phi_J^\wedge(n)$ is assumed to be independent of the order $k$ makes the kernel zonal, i.e., it only depends on the scalar product $\xi\cdot\eta$ and can be regarded as a function on the interval $[-1,1]$. This simplifies many calculations but is not crucial for the approach. The notion "wavelets as frequency packages" in the title of this subsection simply stands for the superposition of spherical harmonics of different degrees, where the influence of every frequency, i.e., the influence of every degree $n$, is given by $\Phi_J^\wedge(n)$.

One can distinguish four main types of kernels: band-limited, non-band-limited, space-limited, and non-space-limited ones. Band-limited kernels are characterized by $\Phi_J^\wedge(n) = 0$, $n \geq N$, for some sufficiently large $N$. In other words, they are strongly localized in frequency domain. Perfect frequency localization would be given by the Legendre kernel defined via $\Phi_J^\wedge(n) = 1$, $n = m$, and $\Phi_J^\wedge(n) = 0$, $n \neq m$, for some fixed integer $m$. The non-band-limited counterparts generally show a much stronger spatial localization. No space-limited kernel (i.e., $\Phi_J(\xi,\cdot)$ has locally compact support) can simultaneously be band-limited. Perfect spatial localization is formally achieved by the Dirac kernel, setting $\Phi_J^\wedge(n) = 1$ for all $n\in\mathbb{N}_0$ (however, this is only to be understood in a distributional sense since (44) becomes singular for this particular choice of symbols). The variation of the symbols $\Phi_J^\wedge(n)$ for different scales $J$ represents the trade-off between spatial and frequency localization (see illustration below).
[Illustration: kernel types arranged between ideal frequency localization (Legendre kernel) and ideal space localization (Dirac kernel), with the labels band-limited, non-band-limited, non-space-limited, and space-limited.]
For a more precise quantitative categorization in terms of the uncertainty principle, the reader is referred to Freeden (1998) and the contribution Freeden and Schreiner (2014a) of this handbook. A band-limited example is given by the cubic polynomial (CP) kernel
\[
\Phi_J^\wedge(n) =
\begin{cases}
\left(1 - n2^{-J}\right)^2\left(1 + n2^{-(J-1)}\right), & n \leq 2^J - 1,\\
0, & \text{else},
\end{cases} \tag{45}
\]
a non-band-limited one by the Abel-Poisson kernel
\[
\Phi_J^\wedge(n) = e^{-n2^{-J}}, \quad n\in\mathbb{N}_0. \tag{46}
\]
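The two symbol families (45) and (46) and the Legendre series in (44) can be sketched in a few lines. The code below is an illustration only, assuming NumPy; the function names are hypothetical. The closed form used for the Abel-Poisson kernel is the classical identity $\sum_n (2n+1)h^nP_n(t) = (1-h^2)/(1+h^2-2ht)^{3/2}$ with $h = e^{-2^{-J}}$, which is a standard result and not quoted from the chapter.

```python
import numpy as np
from numpy.polynomial import legendre

def cp_symbol(n, J):
    """Cubic polynomial (CP) symbol (45); it vanishes for n >= 2**J."""
    x = np.asarray(n, dtype=float) * 2.0**(-J)
    return np.where(x < 1.0, (1.0 - x)**2 * (1.0 + 2.0 * x), 0.0)

def abel_poisson_symbol(n, J):
    """Abel-Poisson symbol (46)."""
    return np.exp(-np.asarray(n, dtype=float) * 2.0**(-J))

def zonal_kernel(symbols, t):
    """Truncated Legendre series sum_n (2n+1)/(4 pi) * symbols[n] * P_n(t) from (44)."""
    n = np.arange(len(symbols))
    coeffs = (2 * n + 1) / (4.0 * np.pi) * np.asarray(symbols, dtype=float)
    return legendre.legval(t, coeffs)

def abel_poisson_closed(t, J):
    """Closed representation of the Abel-Poisson kernel (assumed classical identity)."""
    h = np.exp(-2.0**(-J))
    t = np.asarray(t, dtype=float)
    return (1.0 - h**2) / (4.0 * np.pi * (1.0 + h**2 - 2.0 * h * t)**1.5)

# Compare the series truncated at degree 599 with the closed form at scale J = 3
t = np.cos(np.linspace(0.0, np.pi, 7))
series = zonal_kernel(abel_poisson_symbol(np.arange(600), 3), t)
max_dev = np.max(np.abs(series - abel_poisson_closed(t, 3)))
```

For the CP kernel the series is finite, so `zonal_kernel(cp_symbol(np.arange(2**J), J), t)` evaluates it without truncation error.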
The Abel-Poisson kernel has the advantage of being a non-band-limited kernel with a known closed representation. In general, such representations are not known, and the numerical evaluation of the kernel requires the truncation of the series representation at some degree, so that de facto a band-limited kernel is evaluated. Figure 3 illustrates the evolution of space and frequency localization of the two kernels above at different scales $J$. The increase in spatial localization with increasing scale $J$ reflects a "zooming-in" capability for regions of higher data density. This motivates us to start at a low scale $J_0$ to reconstruct coarse global features of the quantity under investigation via use of $\Phi_{J_0}(\cdot,\cdot)$ and then locally improve the approximation by adding contributions for higher scales $J$ involving related wavelet kernels $\Psi_J(\cdot,\cdot)$. We now give a more precise description of the method outlined above. First, the symbols $\Phi_J^\wedge(n)$ need to satisfy the following conditions:
(i) $\lim_{J\to\infty}\left(\Phi_J^\wedge(n)\right)^2 = \lambda_n$, $n\in\mathbb{N}_0$,
(ii) $\sum_{n=0}^{\infty}\frac{2n+1}{4\pi}\left(\Phi_J^\wedge(n)\right)^2 < \infty$, $J\in\mathbb{Z}$,
(iii) $\left(\Phi_{J+1}^\wedge(n)\right)^2 \geq \left(\Phi_J^\wedge(n)\right)^2$, $J\in\mathbb{Z}$, $n\in\mathbb{N}_0$.
The CP as well as the Abel-Poisson symbols satisfy all three conditions for $\lambda_n = 1$, $n\in\mathbb{N}_0$, leading to so-called approximate identities. This, however, is only a special case. In many applications, $\{\lambda_n\}_{n\in\mathbb{N}_0}$ denotes the set of symbols of a pseudodifferential operator $\Lambda$ relating the modeled quantity $F$ to the given data $G$ via $\Lambda G = F$, i.e.,
\[
F = \Lambda G = \sum_{n=0}^{\infty}\sum_{k=1}^{2n+1} \lambda_n\, G^\wedge(n,k)\, Y_{n,k}. \tag{47}
\]

Fig. 3 Illustration of $\vartheta \mapsto \Phi_J(\cos(\vartheta))$ with respect to angular distance (left) and the corresponding symbols $\Phi_J^\wedge(n)$ for degrees $n = 0,\ldots,20$ (right) for two different kernels: CP kernel (top) and Abel-Poisson kernel (bottom)

We do not go into detail on summability conditions for the symbols $\lambda_n$ and the corresponding Sobolev spaces. For that the reader is referred to Freeden et al. (1998) and references therein. We are simply aiming at outlining the ideas of the multiscale concept. Under these circumstances, assumptions (i)–(iii) yield
\[
F = \lim_{J\to\infty} \Phi_J * \Phi_J * G = \lim_{J\to\infty} \sum_{n=0}^{\infty}\sum_{k=1}^{2n+1} \left(\Phi_J^\wedge(n)\right)^2 G^\wedge(n,k)\, Y_{n,k}. \tag{48}
\]
For brevity, we write $\Phi_J * G$ for the convolution of a (scalar) scaling kernel $\Phi_J(\cdot,\cdot)$ with a (scalar) function $G\in L^2(\Omega)$, i.e.,
\[
\Phi_J * G = \int_\Omega \Phi_J(\cdot,\eta)\,G(\eta)\,d\omega(\eta). \tag{49}
\]
Condition (iii) implies that the scale spaces $V_J = \left\{\Phi_J * \Phi_J * G \mid G\in L^2(\Omega)\right\}$, $J\in\mathbb{Z}$, are nested in the sense
\[
V_J \subset V_{J+1} \subset L^2(\Omega), \quad J\in\mathbb{Z}. \tag{50}
\]
This nesting leads to a so-called multiresolution analysis. It implies the nice property that the approximation of $F$ improves with every increase of the scale $J$. In order to make use of the previously mentioned "zooming-in" capability, we now define wavelet kernels of scale $J$ by
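\[
\Psi_J(\xi,\eta) = \sum_{n=0}^{\infty}\sum_{k=1}^{2n+1} \Psi_J^\wedge(n)\,Y_{n,k}(\xi)\,Y_{n,k}(\eta)
= \sum_{n=0}^{\infty} \frac{2n+1}{4\pi}\,\Psi_J^\wedge(n)\,P_n(\xi\cdot\eta), \quad \xi,\eta\in\Omega, \tag{51}
\]
where the symbols $\Psi_J^\wedge(n)$ read
\[
\Psi_J^\wedge(n) = \left(\left(\Phi_{J+1}^\wedge(n)\right)^2 - \left(\Phi_J^\wedge(n)\right)^2\right)^{\frac{1}{2}}, \quad J\in\mathbb{Z}. \tag{52}
\]
From this setting it is easily seen that
\[
\Phi_{J+1} * \Phi_{J+1} * G = \Phi_J * \Phi_J * G + \Psi_J * \Psi_J * G, \quad J\in\mathbb{Z}, \tag{53}
\]
i.e., the wavelet contribution of scale $J$ represents the remainder between the approximations at scales $J$ and $J+1$. In regard of (48), we are now able to state the following multiscale representation.

Theorem 2. Let $\{\Phi_J\}_{J\in\mathbb{Z}}$ be a set of scaling kernels satisfying (i)–(iii), $\{\Psi_J\}_{J\in\mathbb{Z}}$ the corresponding set of wavelet kernels, and $J_0\in\mathbb{Z}$ be fixed. Furthermore, assume that $F, G\in L^2(\Omega)$ are related by $\Lambda G = F$ for some pseudodifferential operator $\Lambda$ (cf. expression (47)). Then
\[
F = \Phi_{J_0} * \Phi_{J_0} * G + \sum_{J=J_0}^{\infty} \Psi_J * \Psi_J * G, \tag{54}
\]
where equality is meant in the sense of the $L^2(\Omega)$-topology.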
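On the level of symbols, (52) and (53) amount to a telescoping sum. The following sketch checks this numerically for the CP symbols from (45); it assumes NumPy, and all function names are chosen for this illustration. The linear variant shown alongside uses plain differences of the scaling symbols instead of differences of their squares.

```python
import numpy as np

def cp_symbol(n, J):
    """Cubic polynomial (CP) scaling symbol from (45)."""
    x = np.asarray(n, dtype=float) * 2.0**(-J)
    return np.where(x < 1.0, (1.0 - x)**2 * (1.0 + 2.0 * x), 0.0)

def wavelet_symbol_bilinear(n, J):
    """Wavelet symbol (52): square root of the difference of squared scaling symbols."""
    diff = cp_symbol(n, J + 1)**2 - cp_symbol(n, J)**2
    # clip guards against tiny negative values caused by rounding
    return np.sqrt(np.clip(diff, 0.0, None))

def wavelet_symbol_linear(n, J):
    """Linear variant: plain difference of consecutive scaling symbols."""
    return cp_symbol(n, J + 1) - cp_symbol(n, J)

n = np.arange(0, 64)
J0, Jmax = 2, 5

# Bilinear telescoping: scaling part plus wavelet parts up to Jmax
bilinear_sum = cp_symbol(n, J0)**2 + sum(
    wavelet_symbol_bilinear(n, J)**2 for J in range(J0, Jmax + 1))

# Linear telescoping
linear_sum = cp_symbol(n, J0) + sum(
    wavelet_symbol_linear(n, J) for J in range(J0, Jmax + 1))
```

Here `bilinear_sum` reproduces the squared scaling symbols at scale `Jmax + 1`, and `linear_sum` reproduces the scaling symbols themselves at that scale, so truncating the wavelet sum at `Jmax` is the same as approximating at scale `Jmax + 1`.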
Since a low scale $J_0$ typically implies a concentration of the symbols $\Phi_J^\wedge(n)$ around low spherical harmonic degrees $n$, the scaling contribution $\Phi_{J_0} * \Phi_{J_0} * G$ yields low frequency, i.e., coarse global features. More and more spatially localized features are added by the wavelet contributions $\Psi_J * \Psi_J * G$. The increasing localization of the wavelet kernels $\Psi_J(\cdot,\cdot)$ also allows the use of local data sets to improve the approximation. This reflects the property previously denoted as "zooming-in." It should be noted that the wavelet kernels mentioned in this subsection are typically not locally supported but only show an increasingly good localization, so that some error is made if only local information on $G$ is used. Yet, if the scale $J$ is sufficiently large, this error is negligible.

Essentially the same concept as described above also holds true for the approximation of vector fields. We just briefly describe the necessary adaptations: the vector scaling kernel of type $i = 1, 2, 3$ and scale $J$ is given by
\[
\varphi_J^{(i)}(\xi,\eta) = \sum_{n=0_i}^{\infty}\sum_{k=1}^{2n+1} \left(\varphi_J^{(i)}\right)^\wedge(n)\,Y_{n,k}(\xi)\,y_{n,k}^{(i)}(\eta), \quad \xi,\eta\in\Omega, \tag{55}
\]
where the symbols $(\varphi_J^{(i)})^\wedge(n)$, $i = 1, 2, 3$ fixed, satisfy conditions (i)–(iii). The use of the vector spherical harmonics $\tilde{y}_{n,k}^{(i)}$ instead of $y_{n,k}^{(i)}$ is valid as well. The associated vector wavelet kernel of type $i$ and scale $J$ reads
\[
\psi_J^{(i)}(\xi,\eta) = \sum_{n=0_i}^{\infty}\sum_{k=1}^{2n+1} \left(\psi_J^{(i)}\right)^\wedge(n)\,Y_{n,k}(\xi)\,y_{n,k}^{(i)}(\eta), \quad \xi,\eta\in\Omega, \tag{56}
\]
with the symbols
\[
\left(\psi_J^{(i)}\right)^\wedge(n) = \left(\left(\left(\varphi_{J+1}^{(i)}\right)^\wedge(n)\right)^2 - \left(\left(\varphi_J^{(i)}\right)^\wedge(n)\right)^2\right)^{\frac{1}{2}}, \quad J\in\mathbb{Z}. \tag{57}
\]

Theorem 3. Let $\{\varphi_J^{(i)}\}_{i=1,2,3;\,J\in\mathbb{Z}}$ be a set of vector scaling kernels satisfying the conditions (i)–(iii), $\{\psi_J^{(i)}\}_{i=1,2,3;\,J\in\mathbb{Z}}$ the corresponding set of vector wavelet kernels, and $J_0\in\mathbb{Z}$ be fixed. Furthermore, assume that $f, g\in l^2(\Omega)$ are related by $\Lambda g = f$ for some vectorial pseudodifferential operator $\Lambda$. Then
\[
f = \sum_{i=1}^{3} \varphi_{J_0}^{(i)} \star \varphi_{J_0}^{(i)} * g + \sum_{i=1}^{3}\sum_{J=J_0}^{\infty} \psi_J^{(i)} \star \psi_J^{(i)} * g, \tag{58}
\]
where equality is meant in the sense of the $l^2(\Omega)$-topology.

The scalar-valued convolution of a vector scaling kernel $\varphi_J^{(i)}(\cdot,\cdot)$ with a vector field $g$ appearing in (58) is defined by
\[
\varphi_J^{(i)} * g = \int_\Omega \varphi_J^{(i)}(\cdot,\eta)\cdot g(\eta)\,d\omega(\eta), \tag{59}
\]
while the vector-valued convolution of a vector scaling kernel $\varphi_J^{(i)}(\cdot,\cdot)$ with a scalar function $F$ reads
\[
\varphi_J^{(i)} \star F = \int_\Omega \varphi_J^{(i)}(\eta,\cdot)\,F(\eta)\,d\omega(\eta). \tag{60}
\]
Remark 1. The multiscale representations from Theorems 2 and 3 represent a bilinear approach. However, the concept can be formulated in a linear approach as well. The assumptions (i)–(iii) on the scaling kernel just need to be substituted by
(i') $\lim_{J\to\infty}\Phi_J^\wedge(n) = \lambda_n$, $n\in\mathbb{N}_0$,
(ii') $\sum_{n=0}^{\infty}\frac{2n+1}{4\pi}\left(\Phi_J^\wedge(n)\right)^2 < \infty$, $J\in\mathbb{Z}$,
(iii') $\Phi_{J+1}^\wedge(n) \geq \Phi_J^\wedge(n) \geq 0$, $J\in\mathbb{Z}$, $n\in\mathbb{N}_0$.
Then the (scalar) multiscale representation (54) turns into
\[
F = \Phi_{J_0} * G + \sum_{J=J_0}^{\infty} \Psi_J * G, \tag{61}
\]
where the symbols of the wavelet kernels are defined by
\[
\left(\Psi_J^{(i)}\right)^\wedge(n) = \left(\Phi_{J+1}^{(i)}\right)^\wedge(n) - \left(\Phi_J^{(i)}\right)^\wedge(n), \quad J\in\mathbb{Z}. \tag{62}
\]
In the vectorial case the changes are slightly bigger since we now need tensorial scaling and wavelet kernels. More precisely,
\[
\boldsymbol{\Phi}_J^{(i)}(\xi,\eta) = \sum_{n=0_i}^{\infty}\sum_{k=1}^{2n+1} \left(\Phi_J^{(i)}\right)^\wedge(n)\; y_{n,k}^{(i)}(\xi)\otimes y_{n,k}^{(i)}(\eta), \quad \xi,\eta\in\Omega. \tag{63}
\]
The tensorial wavelet kernels $\boldsymbol{\Psi}_J^{(i)}(\cdot,\cdot)$ are defined analogously. The vector-valued convolution of a tensor scaling kernel of type $i = 1, 2, 3$ with a vector field $g$ is given via
\[
\boldsymbol{\Phi}_J^{(i)} \star g = \int_\Omega \boldsymbol{\Phi}_J^{(i)}(\cdot,\eta)\,g(\eta)\,d\omega(\eta). \tag{64}
\]
As a consequence, (58) turns into
\[
f = \sum_{i=1}^{3} \boldsymbol{\Phi}_{J_0}^{(i)} \star g + \sum_{i=1}^{3}\sum_{J=J_0}^{\infty} \boldsymbol{\Psi}_J^{(i)} \star g. \tag{65}
\]
The linear approach is mentioned since it forms the direct connection between the wavelets constructed as frequency packages in this subsection and the locally supported wavelets in the next subsection.
3.2 Locally Supported Wavelets
We start from the assumption that we already know a set of (scalar) scaling kernels $\Phi_J: \Omega\times\Omega\to\mathbb{R}$ or a set of tensorial scaling kernels $\boldsymbol{\Phi}_J: \Omega\times\Omega\to\mathbb{R}^{3\times 3}$ satisfying
(i) $\Phi_J(\xi,\eta) = \Phi_{J+1}(\xi,\eta)$, $\boldsymbol{\Phi}_J(\xi,\eta) = \boldsymbol{\Phi}_{J+1}(\xi,\eta)$, for $1-\xi\cdot\eta \geq \rho_J \geq \rho_{J+1}$,
(ii) $\lim_{J\to\infty}\sup_{\xi\in\Omega}\left|\Phi_J(\xi,\cdot) * G - F(\xi)\right| = 0$, $\lim_{J\to\infty}\sup_{\xi\in\Omega}\left|\boldsymbol{\Phi}_J(\xi,\cdot) \star g - f(\xi)\right| = 0$,
where $F$, $f$ denote the modeled scalar and vectorial quantities, respectively, and $G$, $g$ the corresponding input data. For the parameter $\rho_J\in(0,2)$ we assume that it tends to zero as $J\to\infty$. We suppose that all appearing functions $F$, $f$, $G$, $g$ are at least continuous on the sphere (or of higher-order differentiability, depending on the problem). The main difficulty is to find problem-specific scaling kernels that actually satisfy (i) and (ii). The regularized Green function $G^{\rho_J}(\Delta^*;\cdot)$ and the regularized single layer kernel $S^{\rho_J}$ clearly satisfy condition (i) and form an important tool for their construction (more details can be found in the examples of Sect. 4). Property (i) implies that $\Phi_{J+1}(\xi,\eta) - \Phi_J(\xi,\eta)$ and $\boldsymbol{\Phi}_{J+1}(\xi,\eta) - \boldsymbol{\Phi}_J(\xi,\eta)$ vanish for $1-\xi\cdot\eta \geq \rho_J$, so that the wavelet kernels
\[
\Psi_J(\xi,\eta) = \Phi_{J+1}(\xi,\eta) - \Phi_J(\xi,\eta), \quad \eta\in\Omega, \tag{66}
\]
\[
\boldsymbol{\Psi}_J(\xi,\eta) = \boldsymbol{\Phi}_{J+1}(\xi,\eta) - \boldsymbol{\Phi}_J(\xi,\eta), \quad \eta\in\Omega, \tag{67}
\]
are locally supported on the spherical cap $\Gamma_{\rho_J}(\xi) = \{\eta\in\Omega \mid 1-\xi\cdot\eta < \rho_J\}$ with radius $\rho_J$ and center $\xi$. The multiscale representation of the next theorem follows directly from (i) and (ii).

Theorem 4. Let $\{\Phi_J\}_{J\in\mathbb{Z}}$ and $\{\boldsymbol{\Phi}_J\}_{J\in\mathbb{Z}}$ be scaling kernels satisfying properties (i), (ii), $\{\Psi_J\}_{J\in\mathbb{Z}}$ and $\{\boldsymbol{\Psi}_J\}_{J\in\mathbb{Z}}$ the corresponding wavelet kernels, and $J_0\in\mathbb{Z}$ be fixed. Then we have
\[
F = \Phi_{J_0} * G + \sum_{J=J_0}^{\infty} \Psi_J * G, \tag{68}
\]
\[
f = \boldsymbol{\Phi}_{J_0} \star g + \sum_{J=J_0}^{\infty} \boldsymbol{\Psi}_J \star g, \tag{69}
\]
where equality is meant in the uniform sense of the $C^{(0)}(\Omega)$- and $c^{(0)}(\Omega)$-topology, respectively.

Equations (66) and (67) show that the symbols $\Psi_J^\wedge(n)$ of the wavelet kernel can be obtained by just taking the difference of the symbols $\Phi_{J+1}^\wedge(n)$ and $\Phi_J^\wedge(n)$ of the scaling kernels. Thus, we end up exactly with the linear approach mentioned in Remark 1, and we have the same desirable "zooming-in" capability of the multiscale representation. Since the wavelet kernels in Theorem 4 have locally compact support, the disregard of data outside the spherical cap $\Gamma_{\rho_J}(\xi)$ actually does not lead to any deterioration at all. However, different from the construction of the scaling kernels in frequency domain, the construction of kernels with closed representations satisfying (i) and (ii) is sometimes more tedious and problem specific (especially condition (i) would be difficult to realize just by choice of the symbols $\Phi_J^\wedge(n)$). Regularizations of Green's function for the Beltrami operator and the single layer kernel are a great help but limit the approach to the operators $\Delta^*$, $\nabla^*$, $L^*$, and $D^{-1}$. For other types of equations, one has to hope for similar closed representations of the fundamental solutions and their regularizations. In our examples, the function systems presented in Sects. 2.2 and 2.3 are sufficient and supply us with kernels that allow an easy numerical evaluation.

Concerning further relations to the frequency-oriented approach, it has to be mentioned that scaling kernels satisfying the conditions (i) and (ii) from Sect. 3.2 typically do not satisfy condition (iii') from Sect. 3.1. A calculation of the Fourier coefficients for the regularized Green function can be found, e.g., in Freeden and Schreiner (2009), and shows that they frequently turn negative. In other words, the scale spaces are not nested in the sense of (50), and the approximation of $F$ does not have to improve for every single increase of scale $J$, although in our numerical examples it typically does.
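Property (i) and the resulting local support of the wavelets can be illustrated with the regularized single layer kernel from Definition 4. The sketch below is only a toy example under stated assumptions: it takes $S^{\rho_J}$ itself as a stand-in for a scaling kernel, uses the scaling parameter $\rho_J = 2^{-J}$, assumes NumPy, and uses function names invented for this illustration. The problem-specific kernels of Sect. 4 are built from such regularizations but are not reproduced here.

```python
import numpy as np

def single_layer_reg(t, rho):
    """Regularized single layer kernel S^rho(t) from Definition 4, with t = xi . eta."""
    s = 1.0 - np.asarray(t, dtype=float)
    outer = 1.0 / np.sqrt(2.0 * np.maximum(s, rho))  # equals S where 1 - t >= rho
    inner = 3.0 / (2.0 * np.sqrt(2.0 * rho)) - s / (2.0 * np.sqrt(2.0) * rho**1.5)
    return np.where(s >= rho, outer, inner)

def rho(J):
    """Scaling parameter rho_J = 2**(-J); it tends to zero as J grows."""
    return 2.0**(-J)

def wavelet(t, J):
    """Difference of consecutive regularizations, in the spirit of (66)."""
    return single_layer_reg(t, rho(J + 1)) - single_layer_reg(t, rho(J))

J = 3
t_outside_cap = np.linspace(-1.0, 1.0 - rho(J), 50)  # points with 1 - t >= rho_J
t_inside_cap = np.linspace(1.0 - rho(J), 1.0, 50)    # points within the spherical cap

outside_values = wavelet(t_outside_cap, J)
inside_values = wavelet(t_inside_cap, J)
```

Outside the cap both regularizations coincide with $S$, so `outside_values` vanishes identically, while `inside_values` carries the detail added when passing from scale $J$ to $J+1$. Data outside the cap therefore contributes nothing to the wavelet convolution at that scale.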
Remark 2. There are cases where a scalar quantity F is modeled from vectorial input data g (see, e.g., field-aligned currents in Sect. 4.2) or vice versa. The corresponding multiscale concepts are completely analogous to those presented up to now and have only been left out for reasons of clarity and comprehensibility.
4 Application to Geomagnetic Problems
Some geophysical background has already been presented in the introduction. In this section we indicate how multiscale techniques can be applied to crustal field modeling and the separation of the geomagnetic field with respect to its sources, to the reconstruction of field-aligned current systems, and to a power spectrum that reflects spatially localized properties. In all cases, the approach via frequency packages as well as the approach via locally supported wavelets is described.

4.1 Crustal Field Modeling and Separation of Sources
A first attempt to separate internal and external contributions of the Earth's magnetic field has been undertaken in Gauss (1839), using that $b = \nabla U$ in source-free regions. Under these circumstances, the part of the representation (4) that involves inner harmonics is due to sources in the exterior of the Earth, while the part involving outer harmonics represents the magnetic field that originates inside the Earth. If the measurements are conducted in source regions (i.e., the current density $j$ in (1) does not vanish), the Gauss representation breaks down and we have to use the Mie representation. This is the case with satellite data collected on a spherical orbit $\Omega_R$ in the ionosphere. Yet, the poloidal part $p_b$ of the magnetic field $b$ can be split up into an internal and an external contribution satisfying
\[
\nabla_x \wedge p_b^{\mathrm{int}}(x) =
\begin{cases}
q_j(x), & x\in\Omega_R^{\mathrm{int}},\\
0, & x\in\Omega_R^{\mathrm{ext}},
\end{cases} \tag{70}
\]
\[
\nabla_x \cdot p_b^{\mathrm{int}}(x) = 0, \quad x\in\mathbb{R}^3, \tag{71}
\]
and
\[
\nabla_x \wedge p_b^{\mathrm{ext}}(x) =
\begin{cases}
0, & x\in\Omega_R^{\mathrm{int}},\\
q_j(x), & x\in\Omega_R^{\mathrm{ext}},
\end{cases} \tag{72}
\]
\[
\nabla_x \cdot p_b^{\mathrm{ext}}(x) = 0, \quad x\in\mathbb{R}^3, \tag{73}
\]
respectively, where $q_j$ denotes the toroidal part of the current density $j$. The remaining toroidal part $q_b$ of the magnetic field can be interpreted as the contribution due to poloidal currents crossing the sphere $\Omega_R$, with $R = R_S$ being the satellite altitude. In Backus et al. (1996) it is shown that the above split-up and its interpretation in terms of the location of the sources is in accordance with the law of Biot-Savart. Summarizing, the desired separation of the magnetic field is given by
\[
b = p_b^{\mathrm{int}} + p_b^{\mathrm{ext}} + q_b. \tag{74}
\]
Equations (70) and (71) imply that $p_b^{\mathrm{int}} = \nabla U^{\mathrm{int}}$ in $\Omega_R^{\mathrm{ext}}$, where the harmonic potential $U^{\mathrm{int}}$ can be expressed by outer harmonics, as we already indicated for the Gauss representation (4). Analogously, $p_b^{\mathrm{ext}} = \nabla U^{\mathrm{ext}}$ in $\Omega_R^{\mathrm{int}}$, and $U^{\mathrm{ext}}$ is entirely represented by inner harmonics. This approach has also been discussed in Backus et al. (1996) and Olsen et al. (2010). The latter additionally provides an overview and further references on the extraction of external sources by physically motivated models and parametrizations (which, however, require a priori knowledge of the space-time structure of the magnetic field). We continue by applying the previously introduced multiscale representations to a separation based on (74). The use of wavelets as frequency packages for this problem is described in more detail in Mayer (2003, 2006) and Mayer and Maier (2006), while locally supported wavelets are presented in Gerhards (2011a, 2012).

For crustal field modeling, only the internal contribution $p_b^{\mathrm{int}}$ is relevant at satellite altitude. So the goal is to extract this quantity from an adequate set of magnetic field data. By adequate we mean that the data has been collected during magnetically quiet times and that a model of the core field and magnetospheric field has already been subtracted. Then the split-up (74) can correct the preprocessed data for possibly remaining undesired external contributions, as can be seen in the examples from Mayer and Maier (2006) and Gerhards (2012), where a CHAMP data set from 2001 has been used. The careful data selection becomes necessary because of the poor space-time coverage of satellite data, so that the time variability of the external sources is not entirely accounted for. More information on the data processing can be found, e.g., in Langel and Hinze (1998) and Maus et al. (2003).
Wavelets as Frequency Packages
We have already concluded that the potential $U^{\mathrm{int}}$ contains only outer harmonic contributions. In consequence, the relation (26) states that $p_b^{\mathrm{int}} = \nabla U^{\mathrm{int}}$ can be expanded in terms of the vector spherical harmonics $\tilde{y}_{n,k}^{(1)}$. Analogously, (25) indicates the relation of $p_b^{\mathrm{ext}}$ to $\tilde{y}_{n,k}^{(2)}$. More precisely,
\[
p_b^{\mathrm{int}}(x) = \frac{1}{R}\sum_{n=0}^{\infty}\sum_{k=1}^{2n+1} \left(\tilde{b}_R^{(1)}\right)^\wedge(n,k)\,\tilde{y}_{n,k}^{(1)}(\xi), \quad x\in\Omega_R, \tag{75}
\]
\[
p_b^{\mathrm{ext}}(x) = \frac{1}{R}\sum_{n=1}^{\infty}\sum_{k=1}^{2n+1} \left(\tilde{b}_R^{(2)}\right)^\wedge(n,k)\,\tilde{y}_{n,k}^{(2)}(\xi), \quad x\in\Omega_R, \tag{76}
\]
\[
q_b(x) = \frac{1}{R}\sum_{n=1}^{\infty}\sum_{k=1}^{2n+1} \left(\tilde{b}_R^{(3)}\right)^\wedge(n,k)\,\tilde{y}_{n,k}^{(3)}(\xi), \quad x\in\Omega_R, \tag{77}
\]
where $R = |x|$ and $\xi = \frac{x}{|x|}$. By $(\tilde{b}_R^{(i)})^\wedge(n,k)$ we mean the Fourier coefficient
\[
\left(\tilde{b}_R^{(i)}\right)^\wedge(n,k) = \int_{\Omega_R} b(y)\cdot\frac{1}{R}\,\tilde{y}_{n,k}^{(i)}\!\left(\frac{y}{|y|}\right) d\omega(y). \tag{78}
\]
In other words, the vector spherical harmonics $\{\tilde{y}_{n,k}^{(i)}\}_{i=1,2,3}$ form a basis system that decomposes the magnetic field with respect to its sources. A multiscale reconstruction can then be achieved as described in Sect. 3.1 by setting the scaling kernel to
\[
\varphi_J^{(i)}(\xi,\eta) = \sum_{n=0_i}^{\infty}\sum_{k=1}^{2n+1} \left(\varphi_J^{(i)}\right)^\wedge(n,k)\,Y_{n,k}(\xi)\,\tilde{y}_{n,k}^{(i)}(\eta), \quad \xi,\eta\in\Omega. \tag{79}
\]
The symbols $(\varphi_J^{(i)})^\wedge(n,k)$ could be those corresponding to the CP kernel (observe that for the current example $\lambda_n = 1$ in condition (i) of Sect. 3.1). Defining the wavelet kernels $\psi_J^{(i)}$ accordingly and choosing a sufficiently large scale $J_{\max}$, we end up with the approximation
\[
p_b^{\mathrm{int}} \approx \varphi_{J_0}^{(1)} \star \varphi_{J_0}^{(1)} * b + \sum_{J=J_0}^{J_{\max}} \psi_J^{(1)} \star \psi_J^{(1)} * b, \tag{80}
\]
\[
p_b^{\mathrm{ext}} \approx \varphi_{J_0}^{(2)} \star \varphi_{J_0}^{(2)} * b + \sum_{J=J_0}^{J_{\max}} \psi_J^{(2)} \star \psi_J^{(2)} * b, \tag{81}
\]
for the poloidal parts on $\Omega_R$, and
\[
q_b \approx \varphi_{J_0}^{(3)} \star \varphi_{J_0}^{(3)} * b + \sum_{J=J_0}^{J_{\max}} \psi_J^{(3)} \star \psi_J^{(3)} * b \tag{82}
\]
for the toroidal part on $\Omega_R$. The scale $J_{\max}$ typically depends on the data density and the quality of the input data in order to guarantee a sufficiently good numerical evaluation of the occurring integrals.
Locally Supported Wavelets
The connection of the operators $\tilde{o}^{(i)}$ to the vector spherical harmonics $\tilde{y}_{n,k}^{(i)}$ implies that a representation
\[
b = \tilde{o}^{(1)}\tilde{B}_1 + \tilde{o}^{(2)}\tilde{B}_2 + \tilde{o}^{(3)}\tilde{B}_3 \tag{83}
\]
of the magnetic field $b$ decomposes it with respect to the sources. More precisely, $\tilde{o}^{(1)}\tilde{B}_1$ denotes the part due to sources inside the sphere $\Omega_R$, $\tilde{o}^{(2)}\tilde{B}_2$ the part due to exterior sources, and $\tilde{o}^{(3)}\tilde{B}_3$ the part due to sources on the sphere or crossing the sphere. One can derive that the scalars $\tilde{B}_1$, $\tilde{B}_2$, and $\tilde{B}_3$ are uniquely determined by the condition of vanishing integral mean values
\[
\frac{1}{4\pi}\int_{\Omega_R} \left(\tilde{B}_1(y) - \tilde{B}_2(y)\right) d\omega(y) = \frac{1}{4\pi}\int_{\Omega_R} \tilde{B}_3(y)\, d\omega(y) = 0 \tag{84}
\]
Fig. 4 Absolute values of the scaling kernels $\boldsymbol{\Phi}_J^{(1)}(\varepsilon^2,\cdot)$ (left), centered at $\varepsilon^2 = (0,1,0)^T$, and the corresponding wavelet kernels $\boldsymbol{\Psi}_J^{(1)}(\varepsilon^2,\cdot)$ (right), for scales $J = 1, 4, 8$ (global maps with logarithmic color scale)
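and that they are expressible by
\[
\tilde{B}_1 = \frac{1}{2}D^{-1}B_1 + \frac{1}{4}D^{-1}B_2 - \frac{1}{2}B_2, \tag{85}
\]
\[
\tilde{B}_2 = \frac{1}{2}D^{-1}B_1 + \frac{1}{4}D^{-1}B_2 + \frac{1}{2}B_2, \tag{86}
\]
\[
\tilde{B}_3 = B_3, \tag{87}
\]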
where $B_1$, $B_2$, and $B_3$ correspond to the spherical Helmholtz decomposition $b = o^{(1)}B_1 + o^{(2)}B_2 + o^{(3)}B_3$ (more information on such representations can be found, e.g., in Gerhards (2011b, 2012)). Theorem 1 and its consequences for solutions of the Beltrami equation imply a representation of the Helmholtz scalars $B_1$, $B_2$, and $B_3$ in terms of Green's function with respect to the Beltrami operator, so that, after some lengthy but elementary calculations, (83) and (85)–(87) lead to the scaling kernels
1 J 1 1 J D ˝
G . I / C S . / ˝ r S J . / 2 8 4 (88)
.1/ ˆ J . ; /
1 1 J 1 r S . ; / ˝ r ˝ r G J C r ˝ r D 1 S J . / 4 4 2 . I /; 1 J 1 1 J .2/ G . I / S . / C ˝ r S J . / ˆ J . ; / D ˝
2 8 4 (89) 1 J 1 r S . ; / ˝
r ˝ r D 1 S J . / C 4 4 1 r ˝ r G J . I /; 2 .3/
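\[
\boldsymbol{\Phi}_J^{(3)}(\xi,\eta) = -L^*_\xi \otimes L^*_\eta\, G^{\rho_J}(\Delta^*;\xi\cdot\eta), \quad \xi,\eta\in\Omega. \tag{90}
\]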
The wavelets $\boldsymbol{\Psi}_J^{(i)}(\cdot,\cdot)$ are defined by differences of the scaling kernels as indicated in (67). An illustration is given in Fig. 4 (the scaling parameter is chosen to be $\rho_J = 2^{-J}$ for all examples in this chapter). It can clearly be seen that the wavelet kernels have locally compact support on a spherical cap of scale-dependent radius. Although not locally supported, the scaling kernels reveal a good spatial localization as well (note the logarithmic scaling of the plots). Eventually, the limit relations from Sects. 2.2 and 2.3 and some slight modifications tell us that a sufficiently large $J_{\max}$ can be chosen so that
\[
p_b^{\mathrm{int}} \approx \boldsymbol{\Phi}_{J_0}^{(1)} \star b + \sum_{J=J_0}^{J_{\max}} \boldsymbol{\Psi}_J^{(1)} \star b, \tag{91}
\]
\[
p_b^{\mathrm{ext}} \approx \boldsymbol{\Phi}_{J_0}^{(2)} \star b + \sum_{J=J_0}^{J_{\max}} \boldsymbol{\Psi}_J^{(2)} \star b, \tag{92}
\]
\[
q_b \approx \boldsymbol{\Phi}_{J_0}^{(3)} \star b + \sum_{J=J_0}^{J_{\max}} \boldsymbol{\Psi}_J^{(3)} \star b. \tag{93}
\]
In Fig. 5, the approximation of the radial part of the Earth's crustal field over Africa by the multiscale representation (91) of $p_b^{\mathrm{int}}$ is indicated. The data set contains CHAMP satellite measurements collected between January 2009 and May 2010 and has been kindly supplied and preprocessed (core field and magnetospheric contributions have been subtracted using the CHAOS-4 model, cf. Olsen et al. 2014) by Nils Olsen, DTU Space. A first trend approximation at scale $J = 2$ is achieved using coarse global data, while for increasing scales and increasing local support of the wavelets, a higher data density has been used. At scale $J = 10$, the numerical integration has been conducted on a $360\times 360$ equiangular data grid. This is also the maximal scale $J_{\max}$ resolvable for the given data situation since for higher scales the number of data points in the support of the wavelet kernels would be too small to guarantee a reliable numerical evaluation. Furthermore, the series of pictures in Fig. 5 indicates that crustal field anomalies can be outlined more precisely by the different wavelet scales than by just using the final approximation at scale $J = 10$.

Fig. 5 Radial component of the internal contributions over Africa at scale 2 (top left) and 10 (bottom right) with wavelet contributions at intermediate scales from 2 to 8. The white areas correspond to the calculation regions for all of Africa, while the circles indicate the calculation region only for the marked location. Panels: $\Phi_2^{(1)} * b$, $\Psi_2^{(1)} * b$, $\Psi_3^{(1)} * b$, $\Psi_5^{(1)} * b$, $\Psi_8^{(1)} * b$, $\Phi_{10}^{(1)} * b$ (color scales in nT)
4.2 Reconstruction of Radial Current Systems
Field-aligned current systems in the polar regions were first treated in Birkeland (1908), while a system of field-aligned currents at low latitudes due to the equatorial electrojet was proposed in Untiedt (1967). As the name suggests, they are directed along the main magnetic field lines. In polar regions, this implies that field-aligned currents are nearly radial with respect to the spherical Earth. Radial current systems have the advantage that they can be calculated from the knowledge of the toroidal magnetic field on a single sphere $\Omega_R$. A spherical harmonic approach applying the Mie representation is given in Olsen (1997), where more references on the geophysical background can also be found. Here we give representations in terms of the two multiscale approaches presented in Sect. 3. First, we deal with the approach via frequency packages, which can be found in more detail in Bayer et al. (2001) and Maier (2005), and then we turn to the construction of locally supported wavelets, as treated in Freeden and Gerhards (2010). The general point of departure is the observation that the combination of (7) and (8) with the spherical Helmholtz and the Mie representation yields
\[
j(x) = \xi\,\frac{\Delta^*_\xi Q_b(r\xi)}{r} - \nabla^*_\xi\,\frac{\frac{\partial}{\partial r}\left(rQ_b(r\xi)\right)}{r} - L^*_\xi\,\Delta_x P_b(r\xi), \quad x\in\mathbb{R}^3, \tag{94}
\]
where $r = |x|$ and $\xi = \frac{x}{|x|}$. In particular, the radial current density $J_{\mathrm{rad}}$ (i.e., $J_{\mathrm{rad}}(x) = \xi\cdot j(x)$) is connected to the toroidal magnetic field scalar $Q_b$ by
\[
J_{\mathrm{rad}}(x) = \frac{\Delta^*_\xi Q_b(r\xi)}{r}, \tag{95}
\]
while $Q_b$ can be obtained from the measured magnetic field by $L^* Q_b = q_b$, or equivalently,
\[
\Delta^*_\xi Q_b(r\xi) = L^*_\xi\cdot b(r\xi). \tag{96}
\]
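The step from (94) to (95) and (96) can be spelled out briefly. The sketch below is not taken from the chapter; it assumes the standard properties that $\nabla^*_\xi F$ and $L^*_\xi F$ are tangential at $\xi$, that $L^*\cdot L^* = \Delta^*$, and that $L^*_\xi\cdot(\xi F) = 0$ and $L^*_\xi\cdot\nabla^*_\xi F = 0$. It also assumes that the poloidal part can be written as $p_b = \xi F_1 + \nabla^* F_2$ with auxiliary scalars $F_1$, $F_2$ introduced only for this illustration.

```latex
\begin{align*}
J_{\mathrm{rad}}(x) = \xi\cdot j(x)
  &= (\xi\cdot\xi)\,\frac{\Delta^*_\xi Q_b(r\xi)}{r}
     - \underbrace{\xi\cdot\nabla^*_\xi(\ldots)}_{=0}
     - \underbrace{\xi\cdot L^*_\xi(\ldots)}_{=0}
   = \frac{\Delta^*_\xi Q_b(r\xi)}{r},\\
L^*_\xi\cdot b(r\xi)
  &= \underbrace{L^*_\xi\cdot\left(\xi F_1 + \nabla^*_\xi F_2\right)}_{=0}
     + L^*_\xi\cdot L^*_\xi\, Q_b(r\xi)
   = \Delta^*_\xi Q_b(r\xi).
\end{align*}
```

Only surface operators on a single sphere of radius $r$ appear in both lines, which is why the radial current can be reconstructed from data on that sphere alone.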
Since (95) and (96) solely involve spherical differential operators, it is sufficient for the reconstruction of $J_{\mathrm{rad}}$ on a sphere $\Omega_R$ to know the magnetic field $b$ only on $\Omega_R$. For the tangential parts of the current density $j$, relation (94) shows that additionally the radial derivative of $b$ needs to be known for the reconstruction of $j$. This quantity is not supplied by today's satellite missions. The new Swarm mission with its constellation of two satellites flying side by side and another satellite at a higher altitude allows a somewhat rough estimation of the East-West gradient of the magnetic field (hopefully improving crustal magnetic field models (cf. Maus et al. 2006) and allowing a direct evaluation of the coarse features of field-aligned currents (cf. Ritter and Lühr 2006)) but still does not supply the radial derivative. Gradiometry methods as they are frequently used for gravitational problems (see, e.g., Rummel et al. 1993; Rummel 2010; Freeden and Schreiner 2014b) would be a step forward but are difficult to realize due to the more complex structure of the magnetic field. A mathematical formulation for magnetic field modeling in terms of spherical harmonics can be found, e.g., in Kotsiaros and Olsen (2012).
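Wavelets as Frequency Packages
The toroidal part of the magnetic field is given by $q_b = L^* Q_b$ so that it can be expanded in terms of vector spherical harmonics $y_{n,k}^{(3)}$. In other words, defining
\[
\varphi_J^{(i)}(\xi,\eta) = \sum_{n=0_i}^{\infty}\sum_{k=1}^{2n+1} \left(\varphi_J^{(i)}\right)^\wedge(n)\,Y_{n,k}(\xi)\,y_{n,k}^{(i)}(\eta), \quad \xi,\eta\in\Omega, \tag{97}
\]
with $(\varphi_J^{(i)})^\wedge(n)$, e.g., the symbols of the CP kernel, we get
\[
q_b = \varphi_{J_0}^{(3)} \star \varphi_{J_0}^{(3)} * b + \sum_{J=J_0}^{\infty} \psi_J^{(3)} \star \psi_J^{(3)} * b \tag{98}
\]
on $\Omega_R$. For a representation of the toroidal scalar $Q_b$, we use the kernel
\[
\Phi_J^{(3)}(\xi,\eta) = \sum_{n=1}^{\infty}\sum_{k=1}^{2n+1} \frac{1}{\sqrt{n(n+1)}}\left(\varphi_J^{(3)}\right)^\wedge(n)\,Y_{n,k}(\xi)\,Y_{n,k}(\eta), \quad \xi,\eta\in\Omega, \tag{99}
\]
observing that it satisfies $L^*_\eta\Phi_J^{(3)}(\xi,\eta) = \varphi_J^{(3)}(\xi,\eta)$. This implies
\[
Q_b = \Phi_{J_0}^{(3)} * \varphi_{J_0}^{(3)} * b + \sum_{J=J_0}^{\infty} \Psi_J^{(3)} * \psi_J^{(3)} * b \tag{100}
\]
on $\Omega_R$, where $R = R_S$ is supposed to be the satellite altitude. Equation (95) states that the kernel
\[
\Phi_J^{(3),\Delta^*}(\xi,\eta) = \frac{1}{R}\,\Delta^*_\xi\Phi_J^{(3)}(\xi,\eta) = -\sum_{n=1}^{\infty}\sum_{k=1}^{2n+1} \frac{\sqrt{n(n+1)}}{R}\left(\varphi_J^{(3)}\right)^\wedge(n)\,Y_{n,k}(\xi)\,Y_{n,k}(\eta) \tag{101}
\]
leads to the desired multiscale representation of the radial currents. For a sufficiently large scale $J_{\max}$, we finally end up with
\[
J_{\mathrm{rad}} \approx \Phi_{J_0}^{(3),\Delta^*} * \varphi_{J_0}^{(3)} * b + \sum_{J=J_0}^{J_{\max}} \Psi_J^{(3),\Delta^*} * \psi_J^{(3)} * b. \tag{102}
\]
Furthermore, we see that the symbol $\lambda_n$ of the underlying pseudodifferential operator is given by $-\sqrt{n(n+1)}/R$.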
Multiscale Modeling of the Geomagnetic Field and Ionospheric Currents
907
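Locally Supported Wavelets

The spatially oriented method starts from the same situation as above, i.e., we need a representation of the toroidal scalar $Q_b$. Equation (96) and the properties of Green’s function for the Beltrami operator imply

$$Q_b(x) = \int_\Omega \big(L^*_\xi\, G(\Delta^*; \xi\cdot\eta)\big) \cdot b(R\eta)\, d\omega(\eta), \qquad x \in \Omega_R, \tag{103}$$

where $R = |x|$ and $\xi = x/|x|$. From (95) we are led to

$$J_{\mathrm{rad}}(x) = \frac{1}{R} \int_\Omega \big(\Delta^*_\xi L^*_\xi\, G(\Delta^*; \xi\cdot\eta)\big) \cdot b(R\eta)\, d\omega(\eta), \qquad x \in \Omega_R. \tag{104}$$

Substituting Green’s function by its regularized version, and defining the scaling kernel

$$\varphi_J(\xi,\eta) = \frac{1}{R}\, \Delta^*_\xi L^*_\xi\, G^{\rho_J}(\Delta^*; \xi\cdot\eta), \qquad \xi,\eta \in \Omega, \tag{105}$$

limit relations like those in Sect. 3.2 yield

$$J_{\mathrm{rad}} \approx \varphi_{J_0} * b + \sum_{J=J_0}^{J_{\max}} \psi_J * b \tag{106}$$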
for a sufficiently large $J_{\max}$. The choice of (105) indicates that the regularized Green function has to be three times continuously differentiable in this example (while the one described in Definition 3 is only twice continuously differentiable). An illustration of the scaling kernels is given in Fig. 6. It is interesting to see that they are locally supported, as opposed to the general construction, where only the wavelet kernels have local support. On the other hand, this fact is not so surprising, considering that the current density $j$ is obtained from the magnetic field $b$ by differentiation, a local operation. As a consequence, the multiscale procedure of starting at a low scale $J_0$ and then successively adding further wavelet contributions to deal with unevenly distributed data sets is not necessary. Yet, the fine resolution of the wavelet contributions reveals features that cannot be seen in the final reconstruction for some large scale $J$. This is illustrated in Fig. 7: the wavelet contributions for $J = 4, 5$ show disturbances along the satellite tracks that are less prominent in the final reconstruction at scale $J = 6$. Besides the data density, this is an indicator of the scale at which to truncate the approximation. The disturbances along the satellite tracks are not geophysical effects but simply measurement/preprocessing artefacts. The data used for Fig. 7 were collected by MAGSAT during 1 month centered around March 21, 1980. They have been preprocessed by Nils Olsen, DTU Space, using the geomagnetic reference field GSFC(12/83) (cf. Langel and Estes 1985) in order to obtain the part of the magnetic field that is due to ionospheric current systems. An application of the approach via locally supported wavelets to recent CHAMP data together with a more sophisticated data selection (similar to Papitashvili et al. 2002) can be found in Gerhards (2011a).

Fig. 6 Scaling kernels $\varphi_J(\varepsilon^2,\cdot)$ for scales $J = 2, 4, 6$ (colors indicate the absolute value, arrows the direction)
Fig. 7 Radial current density at scale 2 (top left) and 6 (bottom right) with wavelet contributions at intermediate scales from 2 to 5 (panels $\varphi_2 * b$, $\psi_2 * b$, $\psi_3 * b$, $\psi_4 * b$, $\psi_5 * b$, $\varphi_6 * b$; units nA/m$^2$)

4.3 Multiscale Power Spectrum

Let us assume we are only interested in the internal part of the geomagnetic field (i.e., core field and crustal field). Then we know from the Gauss representation that

$$b = \nabla U = \sum_{n=1}^{\infty} \sum_{k=1}^{2n+1} U_R^\wedge(n,k)\, \nabla H_{n,k}^{\mathrm{ext}}(R;\cdot) \tag{107}$$

in $\Omega_R^{\mathrm{ext}}$, with $R = R_E$ being the mean Earth radius. The contribution of the spherical harmonic degree $n$ to the mean square value $\frac{1}{4\pi r^2}\|b\|^2_{l^2(\Omega_r)}$ is represented by the degree variance

$$\mathrm{Var}_n(r) = \frac{1}{4\pi r^2} \Big\| \sum_{k=1}^{2n+1} U_R^\wedge(n,k)\, \nabla H_{n,k}^{\mathrm{ext}}(R;\cdot) \Big\|^2_{l^2(\Omega_r)} = \frac{2n+1}{4\pi R^4}\,(n+1) \Big(\frac{R}{r}\Big)^{2n+4} \sum_{k=1}^{2n+1} \big|U_R^\wedge(n,k)\big|^2, \tag{108}$$
which has been studied, e.g., as early as Mauersberger (1956) and Lowes (1974). The corresponding spectrum is often referred to as the Mauersberger-Lowes spectrum or power spectrum, although it has been argued, e.g., in Maus (2008), that the term power spectrum is more suitable for a slightly modified quantity. The unusual prefactor $\frac{2n+1}{4\pi R^4}$ in (108) is only due to the fact that we have used $L^2(\Omega)$-normalized spherical harmonics and not Schmidt semi-normalized ones.
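The closed form on the right-hand side of (108) can be evaluated directly from the coefficients of a single degree. The following Python sketch is only an illustration: the function name `degree_variance` and the flat-list storage of the coefficients $U_R^\wedge(n,k)$ are assumptions made here, not part of the chapter.

```python
import math


def degree_variance(coeffs, n, R, r):
    """Evaluate the closed form of Var_n(r) from (108).

    coeffs : the 2n+1 coefficients U_R^(n, k), k = 1, ..., 2n+1
             (hypothetical layout: one flat sequence per degree)
    n      : spherical harmonic degree
    R      : reference radius of the expansion (mean Earth radius in the text)
    r      : radius of the sphere on which the variance is evaluated
    """
    if len(coeffs) != 2 * n + 1:
        raise ValueError("degree n requires exactly 2n+1 coefficients")
    prefactor = (2 * n + 1) * (n + 1) / (4.0 * math.pi * R ** 4)
    attenuation = (R / r) ** (2 * n + 4)
    return prefactor * attenuation * sum(abs(u) ** 2 for u in coeffs)


# Degree 1 with a single unit coefficient, evaluated on the reference sphere.
var_surface = degree_variance([1.0, 0.0, 0.0], n=1, R=1.0, r=1.0)
# The same degree at twice the reference radius is damped by (1/2)**(2n+4).
var_aloft = degree_variance([1.0, 0.0, 0.0], n=1, R=1.0, r=2.0)
```

The factor $(R/r)^{2n+4}$ makes explicit how strongly high degrees are damped with increasing altitude.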
A significant feature of the geomagnetic field revealed by the degree variances is a sharp “knee” around degree $n = 15$, which is generally interpreted as the transition from the core field-dominated part to the crustal field-dominated part (see, e.g., the contributions Sabaka et al. (2014) and Olsen et al. (2014) of this handbook for some more details and illustrations). However, the global character of spherical harmonics implies that the degree variances essentially reflect global properties, i.e., they give no information on where a signal originates. In Beggan et al. (2013), Slepian functions have been used to model the continental part of the crustal magnetic field separately from the oceanic part. The Slepian functions were designed such that they are spatially concentrated either on the continents or on the oceans. Approximating the magnetic field with these functions, and afterwards transforming the Slepian coefficients into spherical harmonic coefficients, allows the derivation of degree variances for each of the two regions. This approach reduces aliasing effects that might occur when trying to obtain such regionally reflected degree variances by methods based purely on spherical harmonics. Yet, each degree variance still reflects a spherical harmonic degree, i.e., a global frequency. Our philosophy in defining a local/regional power spectrum is somewhat different. We are aiming at a multiscale power spectrum where each multiscale variance reflects the influence of features of scale-dependent spatial extent. More precisely, we use the previously introduced multiscale representation to define such a spatially oriented power spectrum. High scales $J$ reflect the contributions to $|b(x)|^2$ that originate in the vicinity of the location $x$, while lower scales represent the influence of global sources or sources of a larger spatial extent. Integrating these contributions over all $x \in \Omega_r$ or over all $x$ in a subregion of the Earth’s surface yields the desired multiscale variances. This is described in more detail in the following paragraphs.
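Wavelets as Frequency Packages

Let $\varphi_J^{(i)}(\cdot,\cdot)$ and $\psi_J^{(i)}(\cdot,\cdot)$ be the scaling and wavelet kernels as defined in Sect. 3.1, with respect to symbols that are equal to 1 for all degrees $n$. Then we know that

$$b = \nabla U = \sum_{i=1}^{2} \varphi_{J_0}^{(i)} \star \varphi_{J_0}^{(i)} * b + \sum_{i=1}^{2} \sum_{J=J_0}^{\infty} \psi_J^{(i)} \star \psi_J^{(i)} * b. \tag{109}$$

Kernels of type $i = 3$ are not required in the representation (109) because the gradient $\nabla_x = \xi \frac{\partial}{\partial r} + \frac{1}{r} \nabla^*_\xi$ only generates kernels of type 1 and 2. A canonical extension of (108) to the multiscale setting is the multiscale variance

$$\mathrm{Var}_J(r) = \frac{1}{4\pi r^2} \Big\| \sum_{i=1}^{2} \psi_J^{(i)} \star \psi_J^{(i)} * b \Big\|^2_{l^2(\Omega_r)} = \frac{1}{4\pi r^2} \sum_{n=1}^{\infty} \sum_{k=1}^{2n+1} \big(\psi_J^\wedge(n)\big)^4 \Big( \big((b_r^{(1)})^\wedge(n,k)\big)^2 + \big((b_r^{(2)})^\wedge(n,k)\big)^2 \Big). \tag{110}$$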
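The second equality in (110) also indicates how the multiscale variances relate to the degree variances. The following short calculation is a sketch under two assumptions that are not spelled out in this passage: the wavelet symbols $\psi_J^\wedge(n)$ do not depend on the type $i$, and the degree variance (108) can be expressed through the same coefficients $(b_r^{(i)})^\wedge(n,k)$ that appear in (110).

$$\mathrm{Var}_n(r) = \frac{1}{4\pi r^2}\sum_{k=1}^{2n+1}\Big(\big((b_r^{(1)})^\wedge(n,k)\big)^2 + \big((b_r^{(2)})^\wedge(n,k)\big)^2\Big) \quad\Longrightarrow\quad \mathrm{Var}_J(r) = \sum_{n=1}^{\infty} \big(\psi_J^\wedge(n)\big)^4\, \mathrm{Var}_n(r).$$

In this reading, each multiscale variance is a weighted sum of degree variances, with weights given by the fourth power of the wavelet symbol.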
Fig. 8 The symbols $(\varphi_J^{(i)})^\wedge(n)$ of the adapted CP scaling kernel (left) and the symbols $(\psi_J^{(i)})^\wedge(n)$ of the corresponding wavelet kernel (right) for scales $J = 3, 8, 15, 25, 40$
This concept has already been introduced in Freeden and Maier (2003) in the different context of signal-to-noise thresholding. At scale $J = J_0$ the wavelet kernels in (110) need to be substituted by scaling kernels (we take $J_0 = 0$ in the examples of this subsection). For the specific choice of symbols

$$(\varphi_J^{(i)})^\wedge(n) = \begin{cases} 1, & n \le J, \\ 0, & \text{else}, \end{cases} \tag{111}$$
the multiscale variances (110) and the degree variances (108) coincide. In other words, $\mathrm{Var}_J(r)$ is a generalization of $\mathrm{Var}_n(r)$. However, the original purpose was to obtain a multiscale power spectrum that reflects local/regional properties of the magnetic field. Thus, one would not choose (111) but a scaling kernel that generates spatially localizing wavelets, such as the CP kernel. In our examples we use a slightly adapted version of the CP kernel (cf. Figs. 8 and 9), namely,

$$(\varphi_J^{(i)})^\wedge(n) = \begin{cases} \big(1 - n\,2^{-\frac{J+5}{4}}\big)^2 \big(1 + n\,2^{-\frac{J+5}{4}}\big), & n \le 2^{\frac{J+5}{4}}, \\ 0, & \text{else}. \end{cases} \tag{112}$$
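The two symbol choices (111) and (112) can be compared numerically. The Python sketch below rests on several assumptions that should be checked against Sect. 3.1: the wavelet symbols are taken to satisfy $(\psi_J^\wedge(n))^2 = (\varphi_{J+1}^\wedge(n))^2 - (\varphi_J^\wedge(n))^2$ (which is what the telescoping sum in (109) requires), the multiscale variance is assembled as a sum of degree variances weighted by $(\psi_J^\wedge(n))^4$ (obtained by comparing (110) with (108)), and the adapted CP profile is written as it reads in (112). All function names are hypothetical.

```python
def shannon_symbol(n, J):
    """Scaling symbol (111): 1 for n <= J, 0 otherwise."""
    return 1.0 if n <= J else 0.0


def adapted_cp_symbol(n, J):
    """Adapted CP scaling symbol as read from (112) (an assumption)."""
    t = n * 2.0 ** (-(J + 5) / 4.0)
    return (1.0 - t) ** 2 * (1.0 + t) if t <= 1.0 else 0.0


def wavelet_symbol_squared(n, J, scaling):
    """Squared wavelet symbol, assuming psi_J^2 = phi_{J+1}^2 - phi_J^2."""
    return scaling(n, J + 1) ** 2 - scaling(n, J) ** 2


def multiscale_variance(degree_var, J, scaling):
    """Weighted sum of degree variances with weights psi_J^(n)**4.

    degree_var : dict mapping degree n to Var_n(r)
    """
    return sum(
        wavelet_symbol_squared(n, J, scaling) ** 2 * v
        for n, v in degree_var.items()
    )


# Made-up degree variances, for illustration only.
toy_degree_var = {1: 2.0, 2: 3.0, 3: 5.0}

# With the Shannon-type symbol, scale J picks out the single degree n = J + 1.
shannon_J1 = multiscale_variance(toy_degree_var, 1, shannon_symbol)

# With the adapted CP symbol, several degrees contribute to each scale.
cp_J1 = multiscale_variance(toy_degree_var, 1, adapted_cp_symbol)
```

Under these assumptions the Shannon-type choice reproduces individual degree variances (up to the index shift between scale and degree), whereas the CP-type choice blends neighbouring degrees.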
Fig. 9 Absolute values of the adapted CP wavelet kernel $\psi_J^{(1)}(\varepsilon^2,\cdot)$ at scales $J = 3, 8, 15$

The adaptation has simply been undertaken to reduce the changes between two consecutive scales, so that we get a finer spectrum. The term $\big|\sum_{i=1}^{2} \psi_J^{(i)} \star \psi_J^{(i)} * b(x)\big|^2$ represents signals originating in a scale-dependent region around $x$ (determined by the localization of the kernel $\psi_J^{(i)}(\xi,\cdot)$, with $\xi = \frac{x}{|x|}$). Integrating over all $x \in \Omega_r$, we obtain the multiscale variance $\mathrm{Var}_J(r)$. Therefore, $\mathrm{Var}_J(r)$ does not actually reflect the contribution of a specific region to $\frac{1}{4\pi r^2}\|b\|^2_{l^2(\Omega_r)}$, but it rather reflects the spatial extent of the contributing features. Low scales $J$ are influenced by signals of global origin, while higher scales account for signals of local origin. Thus, a multiscale power spectrum of the form (110) yields a spatially oriented alternative (in terms of its interpretation) to power spectra based on degree variances. If one is interested in the multiscale power spectra of subregions $\Gamma_r \subset \Omega_r$, such as continents and oceans, one should look at the modified multiscale variances

$$\mathrm{Var}_{J,\Gamma_r}(r) = \frac{1}{|\Gamma_r|} \Big\| \sum_{i=1}^{2} \psi_J^{(i)} \star \psi_J^{(i)} * b \Big\|^2_{l^2(\Gamma_r)}, \tag{113}$$
where $|\Gamma_r|$ denotes the surface area of $\Gamma_r$. In this case, $\mathrm{Var}_{J,\Gamma_r}(r)$ only accounts for signals originating in $\Gamma_r$, while the interpretation of the scales $J$ in terms of the spatial extent of the signals within that region remains. One should be aware that, as opposed to the Slepian functions used in Beggan et al. (2013), the scaling and wavelet kernels are (vectorial) radial basis functions that are not specifically adapted to the region $\Gamma_r$. Thus, at the lower scales, with less localizing kernels $\psi_J^{(i)}(\cdot,\cdot)$, aliasing might occur in a similar manner as it does for degree variances based on spherical harmonics. However, at higher scales this effect is reduced more and more due to the improving localization of the wavelet kernels. In this sense, $\mathrm{Var}_{J,\Gamma_r}(r)$ is less prone to aliasing issues than degree variances (if solely based on spherical harmonics).
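The regional variance (113) is an area-normalized integral of a pointwise quantity over the subregion. As a rough illustration, the Python sketch below approximates such a regional mean on an equiangular latitude-longitude grid with cosine-latitude weights. This simple midpoint-type rule, the function name `regional_mean_square`, and the callable `in_region` describing the subregion are all assumptions made for this sketch; they are not the quadrature rules used in the chapter.

```python
import math


def regional_mean_square(values, lats_deg, lons_deg, in_region):
    """Approximate (1/|Gamma|) * integral over Gamma of f d(omega).

    values    : values[i][j] = f at (lats_deg[i], lons_deg[j]); for (113),
                f would be the squared modulus of the scale-J contribution
    lats_deg  : grid latitudes in degrees (equiangular spacing assumed)
    lons_deg  : grid longitudes in degrees (equiangular spacing assumed)
    in_region : callable (lat, lon) -> bool selecting the subregion
    """
    weighted_sum = 0.0
    weighted_area = 0.0
    for i, lat in enumerate(lats_deg):
        # cos(latitude) is proportional to the area of a grid cell; the
        # radius and the grid spacing cancel in the ratio below.
        w = math.cos(math.radians(lat))
        for j, lon in enumerate(lons_deg):
            if in_region(lat, lon):
                weighted_sum += w * values[i][j]
                weighted_area += w
    if weighted_area == 0.0:
        raise ValueError("no grid point falls inside the region")
    return weighted_sum / weighted_area


# Toy field: value 3 in the southern and 1 in the northern hemisphere.
lats = [-45.0, 45.0]
lons = [0.0, 90.0, 180.0, 270.0]
field = [[3.0] * 4, [1.0] * 4]

northern = regional_mean_square(field, lats, lons, lambda lat, lon: lat > 0)
everywhere = regional_mean_square(field, lats, lons, lambda lat, lon: True)
ratio = northern / everywhere  # analogue of a regional/global quotient
```

Because the result is normalized by the area of the region, regions of different size can be compared directly, which is the point of the factor $1/|\Gamma_r|$ in (113).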
A distinct advantage of the multiscale approach over degree variances (whether based purely on spherical harmonics or on Slepian functions) is the information obtained from the evolution of the expression $\big|\sum_{i=1}^{2} \psi_J^{(i)} \star \psi_J^{(i)} * b(x)\big|^2$ over different scales $J$ for a single location $x$. The hope is that the trade-off between spatial and frequency localization contained in the construction of the wavelet kernels $\psi_J^{(i)}(\cdot,\cdot)$ allows, at least to some extent, a better extraction of information on depth as well as surface localization of geomagnetic/gravity anomalies. Related results can be found in Fehlinger et al. (2008) and Freeden et al. (2009).

In the remainder of this section, we return to $\mathrm{Var}_J(r)$ and $\mathrm{Var}_{J,\Gamma_r}(r)$ from (110) and (113), respectively, and illustrate the application of the multiscale variances to the CHAOS-4 and MF7 models for different continental/oceanic regions. Spherical harmonic-based models are not the actual aim of multiscale variances (rather the application to actual data or high-resolution models), but they serve well for a first illustration. The degree variances (108) of CHAOS-4 and MF7 are shown in Fig. 10 for comparison. CHAOS-4 is a model of the entire geomagnetic field up to spherical harmonic degree $N_{\max} = 100$. We only regard the crustal field contribution at degrees $n = 16, \ldots, 100$. MF7 is purely a crustal magnetic field model for spherical harmonic degrees $n = 16, \ldots, 133$. In Fig. 11, we have indicated the global multiscale variances $\mathrm{Var}_J(R)$ and the regional multiscale variances $\mathrm{Var}_{J,\Gamma_R}(R)$ for the MF7 model, where $\Gamma_R$ denotes either the continental or the oceanic shelf (we have used the same shelf boundaries as Beggan et al. 2013). The first thing that we notice is that the multiscale power spectrum is significantly smoother than the power spectrum based on degree variances in Fig. 10. This is not surprising since an increasing set of spherical harmonics contributes to the wavelet kernel $\psi_J^{(i)}(\cdot,\cdot)$ at each scale $J$, causing this smoothing effect. The second thing to notice is the split-up between continental and oceanic regions, showing a significantly greater power over the continents than over the oceans (note that the regional multiscale variances are scaled by the surface area $|\Gamma_R|$ of the regions under consideration, so that the different size of areas covered by continents and oceans, respectively, does not contribute to the power). An interesting quantity to study is the deviation of the regional multiscale variances from the global average, given by the ratios as illustrated in the right plot of Fig. 11. At the low scales $J \approx 11, 12, 13$, there is no significant difference between the continental and the oceanic multiscale power spectrum. This is due to the property that lower scales reflect features of rather global influence, affecting the continents and oceans in a similar manner. As the scale increases, the influence comes from a more and more local origin, causing the oceanic and continental power spectrum to drift apart. However, it can be seen that there are no major changes in the quotients for scales higher than $J \approx 30$. This is simply due to the fact that the MF7 model is truncated at spherical harmonic degree $N_{\max} = 133$, while the wavelet kernels at these scales contain contributions up to much higher degrees.

Fig. 10 Degree variances $\mathrm{Var}_n(R)$ of the CHAOS-4 and MF7 model

Fig. 11 Left: global multiscale variances $\mathrm{Var}_J(R)$ and regional multiscale variances $\mathrm{Var}_{J,\Gamma_R}(R)$ of the continental and oceanic shelf for the entire MF7 model. Right: the quotients $\mathrm{Var}_{J,\Gamma_R}(R)/\mathrm{Var}_J(R)$ of the regional and the global multiscale variances

In Fig. 12, we compare the multiscale power spectra of the MF7 and the CHAOS-4 model. The top left plot shows that the MF7 power is generally stronger and shifted towards higher scales. The latter is due to the higher spherical harmonic degrees of the MF7 model that generate contributions at the higher scales. The ratio of the two models in the top right plot indicates that the MF7 and CHAOS-4 model coincide quite well up to scale $J \approx 23$. After that, the higher spherical harmonic degrees of the MF7 model, and therefore the more localized structures that can be modeled, gain influence and the power of the MF7 model dominates that of the CHAOS-4 model. If we restrict the MF7 model to spherical harmonic degree $N_{\max} = 100$ as well, the agreement between the two multiscale power spectra is apparently a lot better (bottom row of Fig. 12). Yet, at higher scales, the power of the MF7 model restricted to $N_{\max} = 100$ still seems to be moderately stronger than the power of CHAOS-4. Only around scale $J \approx 23$ is the CHAOS-4 model slightly dominating. The latter probably reflects the stronger degree variances of CHAOS-4 in the degree range $n = 70, \ldots, 90$ (cf. Fig. 10).

Fig. 12 Left: global multiscale variances $\mathrm{Var}_J(R)$ and regional multiscale variances $\mathrm{Var}_{J,\Gamma_R}(R)$ for the entire MF7 model (top) and for the MF7 model truncated at spherical harmonic degree $N_{\max} = 100$ (bottom). As a reference, the CHAOS-4 multiscale variances are plotted in faint colors. Right: the quotients $\mathrm{Var}_J^{\mathrm{MF7}}(R)/\mathrm{Var}_J^{\mathrm{CHAOS\text{-}4}}(R)$ and $\mathrm{Var}_{J,\Gamma_R}^{\mathrm{MF7}}(R)/\mathrm{Var}_{J,\Gamma_R}^{\mathrm{CHAOS\text{-}4}}(R)$ of the multiscale variances for the MF7 and the CHAOS-4 model

Last, we take a look at the multiscale power spectra of the separate continents. This is done for the CHAOS-4 model, for the MF7 model restricted to $N_{\max} = 100$, and for the entire MF7 model. The results are shown in Fig. 13. Again, the more interesting property to study is the deviation of the regional multiscale variances from the global average (right plots in Fig. 13) as it offers a more suitable illustration of the variability of the single continents. At low scales, Africa and Australia show a weaker power than the global average, and then the quotient increases steadily with a peak around scale $J \approx 19$. All other continents are stronger than the global average over the entire multiscale power spectrum. At scales higher than $J \approx 25$, all continents in the CHAOS-4 model and the MF7 model restricted to $N_{\max} = 100$ reveal flat ratios of approximately the same order. An exception is given by Antarctica, which has a clearly dominating spectrum. Something interesting happens when going over from the models truncated at $N_{\max} = 100$ to the entire MF7 model with $N_{\max} = 133$ (top right plot in Fig. 13). The behavior at the higher scales diversifies. In particular, the power over Antarctica suddenly becomes weaker than that of all other continents. This probably reflects the problems resulting from the polar gap of satellite data, which mainly affects spatially localized features.
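The truncation effects discussed above can be related to the support of the scaling symbols. The Python sketch below assumes the support bound $n \le 2^{(J+5)/4}$ as it reads in (112) and converts between scale and highest contributing degree. It is a heuristic only: the wavelet at scale $J$ also involves the scaling symbol at scale $J+1$, and the function names are hypothetical.

```python
import math


def max_degree_at_scale(J):
    """Largest degree in the support of the scaling symbol at scale J,
    assuming the bound n <= 2**((J + 5) / 4) read from (112)."""
    return 2.0 ** ((J + 5) / 4.0)


def scale_reaching_degree(N):
    """Scale J at which the support bound of the scaling symbol equals N
    (the inverse of max_degree_at_scale)."""
    return 4.0 * math.log2(N) - 5.0


# Scales at which the support bound reaches the truncation degrees of the
# two models quoted in the text (N_max = 100 and N_max = 133).
scale_chaos4 = scale_reaching_degree(100)
scale_mf7 = scale_reaching_degree(133)

# Support bound at a scale well beyond both truncation degrees.
bound_at_30 = max_degree_at_scale(30)
```

Under this assumption, the support bound at scale 30 lies far above both truncation degrees, in line with the remark that the quotients hardly change at the highest scales.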
Locally Supported Wavelets

Starting from the representation in Theorem 1, the surface theorem of Gauss, and the substitution of Green’s function for the Beltrami operator by its regularization, we find that the scaling kernel

$$\Phi_J(\xi,\eta) = \frac{1}{4\pi} + \Delta^*_\xi\, G^{\rho_J}(\Delta^*; \xi\cdot\eta), \qquad \xi,\eta \in \Omega, \tag{114}$$

leads to the multiscale representation

$$b = \Phi_{J_0} * b + \sum_{J=J_0}^{\infty} \Psi_J * b. \tag{115}$$
Fig. 13 Left: global multiscale variances $\mathrm{Var}_J(R)$ and regional multiscale variances $\mathrm{Var}_{J,\Gamma_R}(R)$ of the separate continents (Eurasia, Americas, Africa, Australia, Antarctica) for the entire MF7 model (top), for the MF7 model truncated at spherical harmonic degree $N_{\max} = 100$ (center), and for the CHAOS-4 model (bottom). Right: the quotients $\mathrm{Var}_{J,\Gamma_R}(R)/\mathrm{Var}_J(R)$ of the regional and the global multiscale variances
Since we are not actually solving any differential equation involving $\Delta^*$, the scaling kernel (114) is not the canonical choice for a regularization of the Dirac kernel. We have simply picked this one because it complies with our previous considerations. The smoothed Haar scaling function as used, e.g., in Freeden et al. (2005) is another choice that leads to locally supported wavelet kernels that could be used here. With (115) at hand, we are led to the following definitions of multiscale variances:

$$\mathrm{Var}_J(r) = \frac{1}{4\pi r^2}\, \|\Psi_J * b\|^2_{l^2(\Omega_r)}, \tag{116}$$

$$\mathrm{Var}_{J,\Gamma_r}(r) = \frac{1}{|\Gamma_r|}\, \|\Psi_J * b\|^2_{l^2(\Gamma_r)}. \tag{117}$$

Their interpretation is analogous to that for scaling kernels constructed as frequency packages.
5 Conclusion and Outlook
Today, a variety of mathematical methods are available to approximate and analyze geophysical quantities. Which of these methods is the most adequate strongly depends on the available data, the properties of the modeled quantity, and the intentions of the user. Spherical harmonic expansions are very popular as they yield good global results and a meaningful physical interpretation. In this chapter, we have presented alternative multiscale representations that are suitable for unevenly distributed data and strongly spatially varying quantities such as the Earth’s crustal magnetic field. The application of the multiscale power spectrum to the CHAOS-4 model and the MF7 model shows that it can indicate local/regional differences at the higher scales. As a consequence, a future goal would be a more extensive application of multiscale methods to real data or high-resolution models in order to gain insights into the localized/regional contributions. In this sense, spatially oriented multiscale methods can supplement the already well-established frequency-oriented methods. In contrast to many other approaches, the evaluation of the multiscale representations presented in this chapter is based on numerical integration rather than the solution of systems of linear equations. On the one hand, this avoids problems with ill-conditioned matrices; on the other hand, it requires adequate quadrature rules. An overview of numerical integration methods can be found in the contribution Hesse et al. (2014) of this handbook. The upcoming Swarm mission will be a further step forward in modeling the Earth’s magnetic field. More accurate measurements and new possibilities arising from the constellation of the three satellites are anticipated to reduce the spectral gap occurring between satellite data and ground measurements of the crustal magnetic field at or near the Earth’s surface. This is another point where the potential of multiscale methods could be used, namely, the combination of globally available satellite data with only locally available ground data.
References Augustin M, Bauer M, Blick C, Eberle S, Freeden W, Gerhards C, Ilyasov M, Kahnt R, Klug M, Möhringer S, Neu T, Nutz H, Ostermann I, Punzi A (2014) Modeling deep geothermal reservoirs: recent advances and future perspectives. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg Backus GE (1986) Poloidal and toroidal fields in geomagnetic field modeling. Rev Geophys 24: 75–109 Backus GE, Parker R, Constable C (1996) Foundations of geomagnetism. Cambridge University Press, Cambridge Bayer M, Freeden W, Maier T (2001) A vector wavelet approach to iono- and magnetospheric geomagnetic satellite data. J Atm Sol-Ter Phys 63:581–597 Beggan CD, Saarimäki J, Whaler K, Simons FJ (2013) Spectral and spatial decomposition of lithospheric magnetic field models using spherical Slepian functions. Geophys J Int 193: 136–148 Birkeland K (1908) The Norwegian aurora polaris expedition 1902–1903, vol. 1. H. Aschehoug, Oslo
Chambodut A, Panet I, Mandea M, Diament M, Holschneider M (2005) Wavelet frames: an alternative to spherical harmonic representation of potential fields. Geophys J Int 163:875–899 Dahlke S, Dahmen W, Schmitt W, Weinreich I (1995) Multiresolution analysis and wavelets on S 2 and S 3 . Numer Funct Anal Opt 16:19–41 Edmonds AR (1957) Angular momentum in quantum mechanics. Princeton University Press, Princeton Fehlinger T, Freeden W, Gramsch S, Mayer C, Michel D, Schreiner M (2007) Local modelling of sea surface topography from (geostrophic) Ocean flow. ZAMM 87:775–791 Fehlinger T, Freeden W, Mayer C, Schreiner M (2008) On the local multiscale determination of the Earth’s disturbing potential from discrete deflections of the vertical. Comput Geosci 12:473–490 Freeden W (1981) On approximation by harmonic splines. Manuscr Geod 6:193–244 Freeden W (1998) The uncertainty principle and its role in physical geodesy. In: Freeden W (ed) Progress in geodetic science. Shaker, Aachen Freeden W, Gerhards C (2010) Poloidal and toroidal field modeling in terms of locally supported vector wavelets. Math Geosci 42:817–838 Freeden W, Gerhards C (2012) Geomathematically oriented potential theory. Chapman & Hall/CRC, Boca Raton Freeden W, Maier T (2003) Spectral and multiscale signal-to-noise thresholding of spherical vector fields. Comput Geosci 7:215–250 Freeden W, Schreiner M (2006) Local multiscale modeling of geoidal undulations from deflections of the vertical. J Geod 78:641–651 Freeden W, Schreiner M (2009) Spherical functions of mathematical (geo-) sciences. Springer, Heidelberg Freeden W, Schreiner M (2014a) Special functions in mathematical geosciences – an attempt of categorization. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics 2nd edn. Springer, Heidelberg Freeden W, Schreiner M (2014b) Satellite gravity gradiometry (SGG): from scalar to tensorial solution. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics 2nd edn. 
Springer, Heidelberg Freeden W, Windheuser U (1996) Spherical wavelet transform and its discretization. Adv Comput Math 5:51–94 Freeden W, Wolf K (2008) Klassische Erdschwerefeldbestimmung aus der Sicht moderner Geomathematik. Math Semesterber 56:53–77 Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere (with applications to geosciences). Oxford University Press, New York Freeden W, Michel D, Michel V (2005) Local multiscale approximations of geostrophic oceanic flow: theoretical background and aspects of scientific computing. Mar Geod 28:313–329 Freeden W, Fehlinger T, Klug M, Mathar M, Wolf K (2009) Classical globally reflected gravity field determination in modern locally oriented multiscale framework. J Geod 83:1171–1191 Friis-Christensen E, Lühr H, Hulot G (2006) Swarm: a constellation to study the Earth’s magnetic field. Earth Planets Space 58:351–358 Gauss CF (1839) Allgemeine Theorie des Erdmagnetismus. Resultate aus den Beobachtungen des Magnetischen Vereins im Jahre 1838. Göttinger Magnetischer Verein, Leipzig Gerhards C (2011a) Spherical multiscale methods in terms of locally supported wavelets: theory and application to geomagnetic modeling. PhD thesis, University of Kaiserslautern Gerhards C (2011b) Spherical decompositions in a global and local framework: theory and an application to geomagnetic modeling. Int J Geomath 1:1–52 Gerhards C (2012) Locally supported wavelets for the separation of spherical vector fields with respect to their sources. Int J Wavel Multires Inf Process 10. doi:10.1142/S0219691312500348 Gerlich G (1972) Magnetfeldbeschreibung mit verallgemeinerten poloidalen und toroidalen Skalaren. Z Naturforsch 8:1167–1172
GRIMM-3 (2011) GFZ reference internal magnetic model 3. http://www.gfz-potsdam.de/en/res earch/organizational-units/departments/department-2/earths-magnetic-field/topics/field-models/ grimm-x/grimm-3. Accessed date 26 Aug 2014 Haines GV (1985) Spherical cap harmonic analysis. J Geophys Res 90:2583–2591 Hesse K, Sloan IH, Womersley R (2014) Numerical integration on the sphere. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics 2nd edn. Springer, Heidelberg Holschneider M (1996) Continuous wavelet transforms on the sphere. J Math Phys 37:4156–4165 Holschneider M, Chambodut A, Mandea M (2003) From global to regional analysis of the magnetic field on the sphere using wavelet frames. Phys Earth Planet Int 135:107–124 Hulot G, Sabaka TJ, Olsen N (2007) The present field. In: Kono M (ed) Treatise on geophysics, vol. 5. Elsevier, Amsterdam Hulot G, Finlay CC, Constable CG, Olsen N, Mandea M (2010) The magnetic field of planet earth. Space Sci Rev 152:159–222 IAGA (International Association of Geomagnetism and Aeronomy), Working Group V-MOD (2010) International geomagnetic reference field: the eleventh generation. Geophys J Int 183:1216–1230 Kotsiaros S, Olsen N (2012) The geomagnetic field gradient tensor. Int J Geomath 3:297–314 Langel RA (1987) The main field. In: Jacobs JA (ed) Geomagnetism, vol 1. Academic, London Langel RA, Estes RH (1985) The near-earth magnetic field at 1980 determined from MAGSAT data. J Geophys Res 90:2495–2510 Langel RA, Hinze WJ (1998) The magnetic field of the Earth’s lithosphere: the satellite perspective. Cambridge University Press, Cambridge Lowes FJ (1974) Spatial power spectrum of the main geomagnetic field, and extrapolation to the core. Geophys J R Astron Soc 36:717–730 Maier T (2005) Wavelet-Mie-representation for solenoidal vector fields with applications to ionospheric geomagnetic data. SIAM J Appl Math 65:1888–1912 Maier T, Mayer C (2003) Multiscale downward continuation of the crustal field from CHAMP FGM data. 
In: Reigber C, Lühr H, Schwintzer P (eds) First CHAMP mission results for gravity, magnetic and atmospheric studies. Springer, Heidelberg Mauersberger P (1956) Das Mittel der Energiedichte des Geomagnetischen Hauptfeldes an der Erdoberfläche und seine sekuläre Änderung. Gerlands Beitr Geophys 65:135–142 Maus S (2008) The geomagnetic power spectrum. Geophys J Int 174:135–142 Maus S, Hemant K, Rother M, Lühr H (2003) Mapping the lithospheric magnetic field from CHAMP scalar and vector magnetic data. In: Reigber C, Lühr H, Schwintzer P (eds) First CHAMP mission results for gravity, magnetic and atmospheric studies. Springer, Heidelberg Maus S, Lühr H, Purucker M (2006) Simulation of the high-degree lithospheric field recovery for the Swarm constellation of satellites. Earth Planets Space 58:397–407 Mayer C (2003) Wavelet modeling of ionospheric currents and induced magnetic fields from satellite data. PhD thesis, University of Kaiserslautern Mayer C (2006) Wavelet decomposition of spherical vector fields with respect to sources. J Fourier Anal Appl 12:345–369 Mayer C, Maier T (2006) Separating inner and outer Earth’s magnetic field from CHAMP satellite measurements by means of vector scaling functions and wavelets. Geophys J Int 167:1188–1203 MF7 (2010) Magnetic field model MF7. http://www.geomag.us/models/MF7.html. Accessed date 28 Aug 2014 Müller C (1966) Spherical harmonics. Lecture notes in mathematics, vol 17. Springer, Berlin Olsen N (1997) Ionospheric F-region currents at middle and low latitudes estimated from MAGSAT data. J Geophys Res 102:4563–4576 Olsen N, Glassmeier K-H, Jia X (2010) Separation of the magnetic field into external and internal parts. Space Sci Rev 152:135–157 Olsen N, Hulot G, Sabaka TJ (2014) The geomagnetic field – from observations to separation of the various field contributions. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics 2nd edn. Springer, Heidelberg
Olsen N, Lühr H, Finlay CC, Sabaka TJ, Michaelis I, Rauber J, Tøffner-Clausen L (2014) The CHAOS-4 geomagnetic field model. Geophys J Int 197:815–827 Papitashvili VO, Christiansen F, Neubert T (2002) A new model of field-aligned currents derived from high-precision satellite magnetic field data. Geophys Res Lett 29. doi:10.1029/2001GL014207 Ritter P, Lühr H (2006) Curl-B technique applied to Swarm constellation for determining fieldaligned currents. Earth Planets Space 58:463–476 Rummel R (2010) GOCE: gravitational gradiometry in a satellite. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics. Springer, Heidelberg Rummel R, van Gelderen M, Koop R, Schrama E, Sanso F, Brovelli M, Miggliaccio F, Sacerdote F (1993) Spherical harmonic analysis of satellite gradiometry. Publications on geodesy, vol 39. Nederlandse Commissie voor Geodesie, Delft Sabaka T, Hulot G, Olsen N (2014) Mathematical properties relevant to geomagnetic field modeling. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics 2nd edn. Springer, Heidelberg Schröder P, Swelden W (1995) Spherical wavelets on the sphere. In: Approximation theory VIII. World Scientific, Singapore Shure L, Parker RL, Backus GE (1982) Harmonic splines for geomagnetic modeling. Phys Earth Planet Int 28:215–229 Simons FJ, Plattner A (2014) Scalar and vector slepian functions, spherical signal estimation and spectral analysis. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics 2nd edn. Springer, Heidelberg Simons FJ, Dahlen FA, Wieczorek MA (2006) Spatiospectral localization on a sphere. SIAM Rev 48:504–536 Thébault E, Schott JJ, Mandea M (2006) Revised spherical cap harmonics analysis (R-SCHA): validation and properties. J Geophys Res 111. doi:10.1029/2005JB003836 Thébault E, Purucker E, Whaler KA, Langlais B, Sabaka TJ (2010) The magnetic field of the Earth’s lithosphere. Space Sci Rev 155:95–127 Untied J (1967) A model of the equatorial electrojet involving meridional currents. 
J Geophys Res 72:5799–5810 Wolf K (2009) Multiscale modeling of classical boundary value problems in physical geodesy by locally support wavelets. PhD thesis, University of Kaiserslautern
Toroidal-Poloidal Decompositions of Electromagnetic Green’s Functions in Geomagnetic Induction
Jin Sun
Contents
1 Introduction
2 The Electromagnetic Green’s Functions in a Homogeneous Medium
3 Toroidal-Poloidal Decomposition Under Cartesian Geometry
4 Interaction with a Half-Space
5 The Quasi-static Limit
6 Toroidal-Poloidal Decomposition Under Spherical Geometry
7 Summary
References
Abstract
In this chapter, we make use of electromagnetic Green’s functions to analyze quasi-static electromagnetic fields outside and inside the domain of external current sources. The scalar potential representation of the magnetic field in a source-free domain is readily derived from the physical currents using Green’s functions. It may be shown that, although the electric field generated by external sources has a poloidal mode component, this component causes no induction inside a conductive ground and vanishes at the surface of a conductive ground in the transverse directions. An effective external current source represented as a toroidal source is thus justified for low-frequency natural source induction such as natural source magnetotellurics and global geomagnetic induction.
J. Sun
Institute of Geophysics, ETH Zürich, Zürich, Switzerland
© Springer-Verlag Berlin Heidelberg 2015
W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_66
1 Introduction
Models for the spatial structure of magnetic fields resulting from complex current systems in the ionosphere and magnetosphere are critical for a range of geophysical research specialties, including electromagnetic (EM) induction studies of deep Earth conductivity, magnetotelluric (MT) studies in exploration, fundamental studies of the near-Earth space environment, and applied studies of currents induced in electrical power grids. To a good approximation, no electric current flows in the neutral atmosphere, so if frequencies are low enough to justify the quasi-static approximation, simple arguments can be used to demonstrate that the magnetic fields at and just above the Earth’s surface can be represented as the gradient of a magnetic scalar potential (Backus 1986). External sources, including ionospheric and magnetospheric currents, are represented by external magnetic scalar potentials parameterized by a series of spherical harmonics, with coefficients known as Gauss coefficients. Internal sources from the Earth’s core and lithosphere, as well as those induced by the external sources, are represented similarly. The magnetic scalar potential representation of the magnetic field has obvious shortcomings. First, it is invalid within source domains, such as the ionosphere, the magnetosphere, and the region in between, where field-aligned currents flow. Second, the relationship of this representation to the physical current sources is not obvious. Third, the electric field, which serves as the true driving force of EM induction, is not readily derived from the magnetic scalar potential. To overcome these difficulties, we make use of the EM Green’s functions of Maxwell’s equations to explicitly derive EM fields in terms of physical sources. The magnetic scalar potential representation may readily be derived from the Green’s function representation under the quasi-static limit.
Within the source domain where this representation is invalid, an extra contribution is obtained in terms of a surface integration of the currents. Furthermore, EM induction effects from external current sources may be shown to be generated only by the so-called toroidal mode component, which is precisely the component associated with the magnetic field represented by the magnetic scalar potential, thus completely justifying its use in geomagnetic induction studies.
2 The Electromagnetic Green’s Functions in a Homogeneous Medium
In a linear homogeneous conductive medium with uniform permittivity $\varepsilon$, permeability $\mu$, and conductivity $\sigma_0'$, Maxwell’s equations with $e^{-i\omega t}$ harmonic time dependence are given by

$$\nabla \times \mathbf{E} = i\omega\mu\,\mathbf{H}, \tag{1}$$

$$\nabla \times \mathbf{H} = \mathbf{J} + \sigma_0 \mathbf{E}, \tag{2}$$

where $\mathbf{E}$ is the electric field, $\mathbf{H}$ is the magnetic field, $\mathbf{J}$ is the source current density, and $\sigma_0 = \sigma_0' - i\omega\varepsilon$ is the complex conductivity. Taking the divergence of (1) and (2), respectively, one obtains

$$\nabla \cdot \mathbf{H} = 0, \tag{3}$$

$$\nabla \cdot \mathbf{E} = -\frac{\nabla \cdot \mathbf{J}}{\sigma_0}. \tag{4}$$

Considering the continuity equation of electric charges,

$$\nabla \cdot \mathbf{J} - i\omega\rho = 0, \tag{5}$$

where $\rho$ is the electric charge density, (4) is seen to be also given by

$$\nabla \cdot \mathbf{E} = -\frac{i\omega\rho}{\sigma_0}. \tag{6}$$
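Obviously, the curls of the fields (1), (2) along with the continuity equation (5) imply the divergences of the fields (3), (4), or (6). On taking the curl of (1) and considering (2), one obtains the vector Helmholtz equation for the electric field,

$$\nabla \times \nabla \times \mathbf{E} - k_0^2 \mathbf{E} = i\omega\mu\,\mathbf{J}, \tag{7}$$

where $k_0 = \sqrt{i\omega\mu\sigma_0}$ has a positive imaginary part. To solve (7), invoke the vector identity

$$\nabla \times \nabla \times \mathbf{A} = \nabla\nabla \cdot \mathbf{A} - \nabla^2 \mathbf{A}, \tag{8}$$

where $\mathbf{A}$ is any vector field. Considering (4), (7) is then reduced to the scalar Helmholtz equation

$$\nabla^2 \mathbf{E} + k_0^2 \mathbf{E} = -i\omega\mu\,\mathbf{J} - \frac{1}{\sigma_0}\nabla\nabla \cdot \mathbf{J}. \tag{9}$$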
We seek solutions to (9) satisfying the radiation condition, which we assume for all fields and potentials encountered in this chapter. Moreover, we impose a rather strong variant of the radiation condition, i.e., all fields and potentials decay exponentially toward infinity, even though the decay rate may be very small. Physically, this assumption corresponds to a lossy medium, i.e., $\sigma_0' > 0$ even for free space, where $\sigma_0'$ becomes vanishingly small but not exactly zero. The solution to (9) satisfying such a radiation condition is standard, given by

$$\mathbf{E}(\mathbf{r}) = \int d^3r'\, g(\mathbf{r},\mathbf{r}') \left[ i\omega\mu\,\mathbf{J}(\mathbf{r}') + \frac{1}{\sigma_0}\nabla'\nabla' \cdot \mathbf{J}(\mathbf{r}') \right], \tag{10}$$
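where $\mathbf{r}$ is the position vector. Under Cartesian coordinates, $\mathbf{r} = x\hat{\mathbf{x}} + y\hat{\mathbf{y}} + z\hat{\mathbf{z}}$, where $\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}}$ are unit basis vectors of the Cartesian coordinate system. The scalar Green’s function $g(\mathbf{r},\mathbf{r}')$ satisfies the scalar Helmholtz equation,

$$\left(\nabla^2 + k_0^2\right) g(\mathbf{r},\mathbf{r}') = -\delta(\mathbf{r}-\mathbf{r}') \tag{11}$$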
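as well as the radiation condition. Obviously, $g(\mathbf{r},\mathbf{r}')$ depends only on the difference $\mathbf{r}-\mathbf{r}'$, i.e., $g(\mathbf{r},\mathbf{r}') = g(\mathbf{r}-\mathbf{r}')$. Applying the Fourier transform with respect to $\mathbf{r}-\mathbf{r}'$ to (11) yields

$$\left(-k^2 + k_0^2\right)\tilde{g}(\mathbf{k}) = -1, \tag{12}$$

where $\mathbf{k}$ is the wave vector of the Fourier plane wave component $e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')}$, $k = \|\mathbf{k}\|$ is the amplitude of the wave vector, and

$$\tilde{g}(\mathbf{k}) = \frac{1}{k^2 - k_0^2} \tag{13}$$

is the three-dimensional (3D) Fourier spectrum of $g(\mathbf{r}-\mathbf{r}')$, i.e.,

$$g(\mathbf{r}-\mathbf{r}') = \frac{1}{(2\pi)^3}\int d^3k\, \tilde{g}(\mathbf{k})\, e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')} = \frac{1}{(2\pi)^3}\int d^3k\, \frac{e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')}}{k^2 - k_0^2} = \frac{e^{ik_0|\mathbf{r}-\mathbf{r}'|}}{4\pi|\mathbf{r}-\mathbf{r}'|}, \tag{14}$$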
where the last equality may be obtained by converting to spherical coordinates and integrating with respect to the angles, followed by evaluating the integration with respect to $k$ using Cauchy’s residue theorem along with the radiation condition (see, e.g., Merzbacher 1998, Chap. 13). To obtain an explicit form of the electric Green’s function, apply vector integration by parts twice on the second term of (10) to get

$$\mathbf{E}(\mathbf{r}) = i\omega\mu \int d^3r'\, \mathbf{G}_0^e(\mathbf{r},\mathbf{r}') \cdot \mathbf{J}(\mathbf{r}'), \tag{15}$$

where the tensor electric Green’s function is formally given by

$$\mathbf{G}_0^e(\mathbf{r},\mathbf{r}') = \left[\mathbf{I} - \frac{1}{k_0^2}\nabla\nabla'\right] g(\mathbf{r}-\mathbf{r}'), \tag{16}$$

where the symmetry property of the scalar Green’s function, $\nabla g(\mathbf{r}-\mathbf{r}') = -\nabla' g(\mathbf{r}-\mathbf{r}')$, was used and $\mathbf{I}$ is the identity tensor given by $\mathbf{I} = \hat{\mathbf{x}}\hat{\mathbf{x}} + \hat{\mathbf{y}}\hat{\mathbf{y}} + \hat{\mathbf{z}}\hat{\mathbf{z}}$.
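The wavenumber convention and the closed form of the scalar Green’s function can be checked numerically. The following is a minimal sketch, not part of the chapter: the function names, the vacuum value of the permeability, and the sample parameters (1 Hz, 0.01 S/m) are assumptions chosen only for illustration. It selects the root of $k_0^2 = i\omega\mu\sigma_0$ with positive imaginary part and verifies by finite differences that $e^{ik_0R}/(4\pi R)$ satisfies the homogeneous Helmholtz equation away from the source.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in H/m (assumed for the medium)


def wavenumber(omega, sigma_prime, eps, mu=MU0):
    """Return k0 = sqrt(i*omega*mu*sigma0), choosing the root with Im(k0) > 0."""
    sigma0 = sigma_prime - 1j * omega * eps  # complex conductivity
    k0 = cmath.sqrt(1j * omega * mu * sigma0)
    return k0 if k0.imag > 0 else -k0


def green(R, k0):
    """Closed-form scalar Green's function at distance R = |r - r'| > 0."""
    return cmath.exp(1j * k0 * R) / (4 * math.pi * R)


def helmholtz_residual(R, k0, h=1.0):
    """Finite-difference value of (laplacian + k0^2) g at R > 0.

    For a radially symmetric g, laplacian(g) = (1/R) d^2(R g)/dR^2.
    """
    def u(x):
        return x * green(x, k0)

    laplacian = (u(R + h) - 2 * u(R) + u(R - h)) / (h * h * R)
    return laplacian + k0 ** 2 * green(R, k0)


if __name__ == "__main__":
    k0 = wavenumber(omega=2 * math.pi, sigma_prime=0.01, eps=8.854e-12)
    g = green(1000.0, k0)
    print("k0 =", k0)
    print("skin depth [m] =", 1 / k0.imag)
    print("relative residual =", abs(helmholtz_residual(1000.0, k0)) / abs(k0 ** 2 * g))
```

In the quasi-static regime the imaginary part of $k_0$ is the reciprocal of the skin depth, so the printed value gives a quick sanity check on the chosen root.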
It is noted that (16) represents a non-integrable singular kernel, with the singularity located at $\mathbf{r} = \mathbf{r}'$. Application of the electric Green’s function must therefore be regularized, e.g., by excluding the singularity from the 3D integration and accounting for its contribution separately. (See, e.g., Chew 1995 for discussions about the singularity of the Green’s function; also see Sect. 3 for an example.) The solution for the magnetic field may be obtained by taking the curl of (10) and considering (1), leading to

$$\mathbf{H}(\mathbf{r}) = \int d^3r'\, \mathbf{G}_0^h(\mathbf{r},\mathbf{r}') \cdot \mathbf{J}(\mathbf{r}'), \tag{17}$$

where the magnetic Green’s function is given by

$$\mathbf{G}_0^h(\mathbf{r},\mathbf{r}') = \nabla \times \mathbf{I}\, g(\mathbf{r},\mathbf{r}'). \tag{18}$$

Unlike the electric Green’s function, the magnetic Green’s function is integrable, since the curl of the non-integrable second term in (10) vanishes. Using integration by parts, it is easily verified that (17) is in the same form as the Biot-Savart law (see, e.g., Jackson 1998), to which it reduces under the limits $\omega \to 0$ and $\sigma_0' \to 0$.
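The reduction to the Biot-Savart law can be illustrated for a single point current element. The sketch below is an illustration under stated assumptions, not the chapter’s procedure: for a current moment $\mathbf{J}\,dV$ located at the origin, applying the magnetic Green’s function gives $\mathbf{H} = \nabla g \times (\mathbf{J}\,dV)$, and the helper names `h_current_element` and `h_biot_savart` are hypothetical.

```python
import cmath
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def h_current_element(r, moment, k0):
    """H at offset r (observation minus source) from a point current element.

    moment is J dV in A m; the field is grad(g) x moment with
    g = exp(i k0 R) / (4 pi R) and the gradient taken at the observation point.
    """
    R = math.sqrt(sum(x * x for x in r))
    g = cmath.exp(1j * k0 * R) / (4 * math.pi * R)
    dg_dR = (1j * k0 - 1.0 / R) * g
    grad_g = tuple(dg_dR * x / R for x in r)
    return cross(grad_g, moment)


def h_biot_savart(r, moment):
    """Static Biot-Savart field of the same element: moment x r / (4 pi R^3)."""
    R = math.sqrt(sum(x * x for x in r))
    return tuple(c / (4 * math.pi * R ** 3) for c in cross(moment, r))


if __name__ == "__main__":
    r = (1.0, 2.0, 0.5)
    m = (0.0, 0.0, 1.0)
    print(h_current_element(r, m, k0=1e-9 * (1 + 1j)))
    print(h_biot_savart(r, m))
```

For $|k_0 R| \ll 1$ the two results agree to within terms of order $(k_0R)^2$, which is the static limit described above.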
3 Toroidal-Poloidal Decomposition Under Cartesian Geometry
It is well known from the Helmholtz decomposition theorem that a sufficiently well-behaved vector field $\mathbf{V}$ satisfying the radiation condition may be decomposed into the sum of the curl of a vector potential and the gradient of a scalar potential,

$$\mathbf{V} = \nabla \times \mathbf{A} - \nabla\varphi, \tag{19}$$

where $\mathbf{A}$ is the vector potential and $\varphi$ is the scalar potential. Using integration by parts, it is easily verified that

$$\int d^3r\, (\nabla \times \mathbf{A}) \cdot \nabla\varphi = 0 \tag{20}$$

for scalar and vector potentials satisfying the radiation condition, indicating orthogonality between curl fields and gradient fields under the usual inner product defined as the dot product between vector fields integrated over the 3D space. The orthogonality (20) also implies uniqueness of the decomposition (19). Uniqueness of the scalar and vector potentials is guaranteed by the radiation condition. The curl field $\nabla \times \mathbf{A}$ can further be decomposed into the sum of a toroidal field and a poloidal field (Schmitt and Wahl 1992). Unlike the Helmholtz decomposition, the form of the toroidal-poloidal decomposition depends on the choice of a pilot vector. Under Cartesian coordinates, the pilot vector is often chosen as the unit vector $\hat{\mathbf{z}}$, leading to the decomposition

$$\nabla \times \mathbf{A} = \nabla \times \hat{\mathbf{z}}\Phi + \nabla \times \nabla \times \hat{\mathbf{z}}\Psi, \tag{21}$$
where $\Phi$, $\Psi$ are the toroidal and the poloidal potentials, and $\nabla \times \hat{\mathbf{z}}$, $\nabla \times \nabla \times \hat{\mathbf{z}}$ are the toroidal and the poloidal differential operators operating on these potentials. These potentials defined as in (21) are sometimes known as Hertzian potentials. The toroidal field is in the transverse ($xy$) direction. The poloidal field is the curl of a toroidal field, and its curl is also a toroidal field, as

$$\nabla \times \nabla \times \nabla \times \hat{\mathbf{z}}\Psi = -\nabla \times \hat{\mathbf{z}}\,\nabla^2\Psi \tag{22}$$

is easily verified. Similar to (20), orthogonality between toroidal and poloidal fields can be verified, implying uniqueness of the decomposition (21). Uniqueness of the toroidal and poloidal potentials may be guaranteed by the radiation condition. The Helmholtz decomposition and toroidal-poloidal decomposition may be applied to the electric Green’s function (16). First, observe that the first term of (16) applying to an arbitrary gradient field leads to

$$\int d^3r'\, \mathbf{I}\, g(\mathbf{r}-\mathbf{r}') \cdot \nabla'\varphi(\mathbf{r}') = \int d^3r'\, g(\mathbf{r}-\mathbf{r}')\, \nabla'\varphi(\mathbf{r}') = \int d^3r'\, \nabla g(\mathbf{r}-\mathbf{r}')\, \varphi(\mathbf{r}') = \nabla \int d^3r'\, g(\mathbf{r}-\mathbf{r}')\, \varphi(\mathbf{r}'), \tag{23}$$
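where $\nabla'$ operates on primed coordinates, and integration by parts has been used. Clearly, application of the electric Green’s function on a gradient field leads to a gradient field. Similar observations can be made about toroidal and poloidal fields. Therefore, the electric Green’s function may be written in a diagonal form

$$\mathbf{G}_0^e(\mathbf{r},\mathbf{r}') = \nabla \times \hat{\mathbf{z}}\, \nabla' \times \hat{\mathbf{z}}\, \Phi_0(\mathbf{r},\mathbf{r}') + \frac{1}{k_0^2} \nabla \times \nabla \times \hat{\mathbf{z}}\, \nabla' \times \nabla' \times \hat{\mathbf{z}}\, \Psi_0(\mathbf{r},\mathbf{r}') + \frac{1}{k_0^2} \nabla\nabla' \varphi_0(\mathbf{r},\mathbf{r}'), \tag{24}$$

where the scalar kernels may be solved for by equating (24) with (16), i.e.,

$$\nabla \times \hat{\mathbf{z}}\, \nabla' \times \hat{\mathbf{z}}\, \Phi_0(\mathbf{r},\mathbf{r}') + \frac{1}{k_0^2} \nabla \times \nabla \times \hat{\mathbf{z}}\, \nabla' \times \nabla' \times \hat{\mathbf{z}}\, \Psi_0(\mathbf{r},\mathbf{r}') + \frac{1}{k_0^2} \nabla\nabla' \varphi_0(\mathbf{r},\mathbf{r}') = \mathbf{I}\, g(\mathbf{r}-\mathbf{r}') - \frac{1}{k_0^2} \nabla\nabla' g(\mathbf{r}-\mathbf{r}'). \tag{25}$$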
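Making the ansatz that $\Phi_0$, $\Psi_0$, and $\varphi_0$ depend only on $\mathbf{r}-\mathbf{r}'$ and applying the 3D Fourier transform to (25), one obtains

$$(\hat{\mathbf{z}} \times \mathbf{k}_T)(\hat{\mathbf{z}} \times \mathbf{k}_T)\, \tilde{\Phi}_0(\mathbf{k}) + \frac{1}{k_0^2} \left(\hat{\mathbf{z}} k_T^2 - k_z \mathbf{k}_T\right)\left(\hat{\mathbf{z}} k_T^2 - k_z \mathbf{k}_T\right) \tilde{\Psi}_0(\mathbf{k}) + \frac{1}{k_0^2}\, \mathbf{k}\mathbf{k}\, \tilde{\varphi}_0(\mathbf{k}) = (\hat{\mathbf{x}}\hat{\mathbf{x}} + \hat{\mathbf{y}}\hat{\mathbf{y}} + \hat{\mathbf{z}}\hat{\mathbf{z}})\, \tilde{g}(\mathbf{k}) - \frac{1}{k_0^2}\, \mathbf{k}\mathbf{k}\, \tilde{g}(\mathbf{k}), \tag{26}$$

where the tilde denotes the Fourier spectrum and $\mathbf{k}_T$ is the transverse wave vector in the $xy$ directions, with amplitude $k_T$, so that $\mathbf{k} = \mathbf{k}_T + k_z\hat{\mathbf{z}}$, where $k_z$ is the $z$ component of $\mathbf{k}$. Note that the partial differential operators simplify with respect to the Fourier components, e.g., $\nabla e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')} = i\mathbf{k}\, e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')}$, etc. Considering (13), (26) may be solved as

$$\tilde{\Phi}_0(\mathbf{k}) = \frac{1}{k_T^2\left(k^2 - k_0^2\right)}, \qquad \tilde{\Psi}_0(\mathbf{k}) = \frac{k_0^2}{k^2 k_T^2\left(k^2 - k_0^2\right)}, \qquad \tilde{\varphi}_0(\mathbf{k}) = -\frac{1}{k^2}. \tag{27}$$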
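The spectral solution can be verified by substituting the three kernels back into the Fourier-domain tensor identity for an arbitrary wave vector. The sketch below is an illustrative check only; the function name `decomposition_residual` and the sample values are assumptions. It uses $\hat{\mathbf{z}} \times \mathbf{k}_T = (-k_y, k_x, 0)$ and $\hat{\mathbf{z}}k_T^2 - k_z\mathbf{k}_T = (-k_zk_x, -k_zk_y, k_T^2)$, which are mutually orthogonal and orthogonal to $\mathbf{k}$.

```python
def outer(a, b):
    """Dyadic (outer) product of two 3-vectors as a nested list."""
    return [[ai * bj for bj in b] for ai in a]


def decomposition_residual(kx, ky, kz, k0):
    """Largest entry-wise mismatch between both sides of the spectral identity."""
    kT2 = kx * kx + ky * ky
    k2 = kT2 + kz * kz
    g = 1.0 / (k2 - k0 ** 2)                        # spectrum of the scalar Green's function
    Phi = 1.0 / (kT2 * (k2 - k0 ** 2))              # toroidal kernel
    Psi = k0 ** 2 / (k2 * kT2 * (k2 - k0 ** 2))     # poloidal kernel
    phi = -1.0 / k2                                 # gradient kernel

    t = (-ky, kx, 0.0)                # z x k_T
    p = (-kz * kx, -kz * ky, kT2)     # z k_T^2 - k_z k_T
    k = (kx, ky, kz)
    tt, pp, kk = outer(t, t), outer(p, p), outer(k, k)

    worst = 0.0
    for i in range(3):
        for j in range(3):
            lhs = Phi * tt[i][j] + Psi * pp[i][j] / k0 ** 2 + phi * kk[i][j] / k0 ** 2
            rhs = g * ((1.0 if i == j else 0.0) - kk[i][j] / k0 ** 2)
            worst = max(worst, abs(lhs - rhs))
    return worst


if __name__ == "__main__":
    print(decomposition_residual(0.3, -0.7, 0.5, 0.2 + 0.1j))
```

The residual stays at rounding level for any wave vector with $k_T \neq 0$, which reflects the fact that the three dyads span the identity tensor.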
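The inverse Fourier transforms may be evaluated by first integrating over $k_z$, leading to

$$\Phi_0(\mathbf{r}-\mathbf{r}') = \frac{i}{8\pi^2} \int d^2k_T\, \frac{e^{i\left[\mathbf{k}_T\cdot(\mathbf{r}_T-\mathbf{r}_T') + k_{0z}(k_T)|z-z'|\right]}}{k_T^2\, k_{0z}(k_T)},$$

$$\Psi_0(\mathbf{r}-\mathbf{r}') = \frac{i}{8\pi^2} \int d^2k_T\, \frac{e^{i\mathbf{k}_T\cdot(\mathbf{r}_T-\mathbf{r}_T')}}{k_T^2} \left[ \frac{e^{ik_{0z}(k_T)|z-z'|}}{k_{0z}(k_T)} + i\,\frac{e^{-k_T|z-z'|}}{k_T} \right], \tag{28}$$

$$\varphi_0(\mathbf{r}-\mathbf{r}') = \frac{i}{8\pi^2} \int d^2k_T\, e^{i\mathbf{k}_T\cdot(\mathbf{r}_T-\mathbf{r}_T')} \left[ i\,\frac{e^{-k_T|z-z'|}}{k_T} \right],$$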
where $\mathbf{r}_T = x\hat{\mathbf{x}} + y\hat{\mathbf{y}}$ is the transverse position vector, the spatial frequency is $\mathbf{k} = \mathbf{k}_T + k_z\hat{\mathbf{z}}$, where $k_T$ is the amplitude of its transverse component, and $k_{0z}(k_T) = \sqrt{k_0^2 - k_T^2}$ is the $z$ component with a positive imaginary part. Note that in (28), the integrands are unbounded at $k_T = 0$, rendering these integrations undefined. However, by substituting into (24) and taking the partial differential operators inside the integrals, the apparent singularities at $k_T = 0$ vanish, and (24) remains well defined. The rest of this chapter assumes that all such singular representations of the scalar kernels be understood in this manner. Upon substituting into (24), it can be shown that the second term of $\Psi_0$ cancels with $\varphi_0$. Therefore, these two kernels are effectively

$$\Psi_0(\mathbf{r}-\mathbf{r}') = \frac{i}{8\pi^2} \int d^2k_T\, \frac{e^{i\left[\mathbf{k}_T\cdot(\mathbf{r}_T-\mathbf{r}_T') + k_{0z}(k_T)|z-z'|\right]}}{k_T^2\, k_{0z}(k_T)}, \qquad \varphi_0(\mathbf{r}-\mathbf{r}') = 0, \tag{29}$$

and the toroidal-poloidal decomposition of the electric Green’s function becomes
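$$\mathbf{G}_0^e(\mathbf{r},\mathbf{r}') = \nabla \times \hat{\mathbf{z}}\, \nabla' \times \hat{\mathbf{z}}\, \Phi_0(\mathbf{r}-\mathbf{r}') + \frac{1}{k_0^2} \nabla \times \nabla \times \hat{\mathbf{z}}\, \nabla' \times \nabla' \times \hat{\mathbf{z}}\, \Psi_0(\mathbf{r}-\mathbf{r}'). \tag{30}$$

Taking the curl of (30) yields the toroidal-poloidal decomposition of the magnetic Green’s function

$$\mathbf{G}_0^h(\mathbf{r},\mathbf{r}') = \nabla \times \nabla \times \hat{\mathbf{z}}\, \nabla' \times \hat{\mathbf{z}}\, \Phi_0(\mathbf{r}-\mathbf{r}') + \nabla \times \hat{\mathbf{z}}\, \nabla' \times \nabla' \times \hat{\mathbf{z}}\, \Psi_0(\mathbf{r}-\mathbf{r}'). \tag{31}$$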
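It is noted that the scalar kernels (28) are not differentiable across the $z' = z$ plane, rendering (30) and (31) undefined at the plane. Therefore, an exclusion volume in the form of a transverse thin layer enclosing the $z' = z$ plane may be introduced, and (31) may be written as

$$\mathbf{G}_0^h(\mathbf{r},\mathbf{r}') = \mathrm{P.V.}\, \nabla \times \nabla \times \hat{\mathbf{z}}\, \nabla' \times \hat{\mathbf{z}}\, \Phi_0(\mathbf{r}-\mathbf{r}') + \mathrm{P.V.}\, \nabla \times \hat{\mathbf{z}}\, \nabla' \times \nabla' \times \hat{\mathbf{z}}\, \Psi_0(\mathbf{r}-\mathbf{r}'), \tag{32}$$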
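where P.V. stands for the principal value excluding an exclusion volume of vanishing thickness enclosing the $z' = z$ plane. To define (30) unambiguously and correctly account for the contribution from the exclusion volume, apply (30) to an arbitrary current source $\mathbf{J}$, leading to

$$\int d^3r'\, \mathbf{G}_0^e(\mathbf{r},\mathbf{r}') \cdot \mathbf{J}(\mathbf{r}') = \lim_{\delta z \to 0} \left[ \int_{-\infty}^{z_-} d^3r'\, \mathbf{G}_0^e(\mathbf{r},\mathbf{r}') \cdot \mathbf{J}(\mathbf{r}') + \int_{z_+}^{\infty} d^3r'\, \mathbf{G}_0^e(\mathbf{r},\mathbf{r}') \cdot \mathbf{J}(\mathbf{r}') + \int_{z_-}^{z_+} d^3r'\, \mathbf{G}_0^e(\mathbf{r},\mathbf{r}') \cdot \mathbf{J}(\mathbf{r}') \right], \tag{33}$$

where $z_- = z - \delta z$, $z_+ = z + \delta z$. The first two terms represent contributions from sources below and above the $z' = z$ plane, and the last term corresponds to the singular contribution from sources within the thin exclusion layer. To account for this contribution, we substitute (16) into the last term of (33) and notice that the first term of (16) has a vanishing contribution as $\delta z \to 0$. Thus,

$$\lim_{\delta z \to 0} \int_{z_-}^{z_+} d^3r'\, \mathbf{G}_0^e(\mathbf{r},\mathbf{r}') \cdot \mathbf{J}(\mathbf{r}') = -\frac{1}{k_0^2} \lim_{\delta z \to 0} \int_{z_-}^{z_+} d^3r'\, \nabla\nabla' g(\mathbf{r}-\mathbf{r}') \cdot \mathbf{J}(\mathbf{r}')$$

$$= -\frac{1}{k_0^2} \lim_{\delta z \to 0} \int_{z_-}^{z_+} d^3r'\, \left\{ \nabla' \cdot \left[\nabla g(\mathbf{r}-\mathbf{r}')\, \mathbf{J}(\mathbf{r}')\right] - \nabla g(\mathbf{r}-\mathbf{r}')\, \nabla' \cdot \mathbf{J}(\mathbf{r}') \right\} = -\frac{1}{k_0^2} \lim_{\delta z \to 0} \int_{z_-,\, z_+} d^2r'\, \hat{\mathbf{n}}' \cdot \left[\nabla g(\mathbf{r}-\mathbf{r}')\, \mathbf{J}(\mathbf{r}')\right], \tag{34}$$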
where $\hat{\mathbf{n}}'$ is the outward-directed surface normal of the exclusion volume, and the second term in the second equality vanishes as $\delta z \to 0$, as we assume no surface charge is concentrated within the volume. It can further be shown with some algebra that

$$\lim_{\delta z \to 0} \int_{z_-,\, z_+} d^2r'\, \hat{\mathbf{n}}' \cdot \left[\nabla g(\mathbf{r}-\mathbf{r}')\, \mathbf{J}(\mathbf{r}')\right] = \hat{\mathbf{z}}\hat{\mathbf{z}} \cdot \mathbf{J}(\mathbf{r}). \tag{35}$$
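Therefore, (30) may be written as

$$\mathbf{G}_0^e(\mathbf{r},\mathbf{r}') = \mathrm{P.V.}\, \nabla \times \hat{\mathbf{z}}\, \nabla' \times \hat{\mathbf{z}}\, \Phi_0(\mathbf{r}-\mathbf{r}') + \frac{1}{k_0^2}\, \mathrm{P.V.}\, \nabla \times \nabla \times \hat{\mathbf{z}}\, \nabla' \times \nabla' \times \hat{\mathbf{z}}\, \Psi_0(\mathbf{r}-\mathbf{r}') - \frac{1}{k_0^2}\, \delta(\mathbf{r}-\mathbf{r}')\, \hat{\mathbf{z}}\hat{\mathbf{z}}. \tag{36}$$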
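The electric field generated by the current source $\mathbf{J}$ is given by

$$\mathbf{E}(\mathbf{r}) = i\omega\mu \int_{z'>z} d^3r'\, \nabla \times \hat{\mathbf{z}}\, \nabla' \times \hat{\mathbf{z}}\, \Phi_0(\mathbf{r}-\mathbf{r}') \cdot \mathbf{J}(\mathbf{r}') + \frac{1}{\sigma_0} \int_{z'>z} d^3r'\, \nabla \times \nabla \times \hat{\mathbf{z}}\, \nabla' \times \nabla' \times \hat{\mathbf{z}}\, \Psi_0(\mathbf{r}-\mathbf{r}') \cdot \mathbf{J}(\mathbf{r}') + i\omega\mu \int_{z'<z} d^3r'\, \cdots$$

[…]

… $> 0$:

$$\operatorname{curl}\left(\frac{1}{\sigma} \operatorname{curl} \mathbf{B}\right) + \mu \frac{\partial \mathbf{B}}{\partial t} = 0 \quad \text{in } \mathcal{B}, \tag{21}$$

$$\mathbf{B}_T = 0 \quad \text{on } \partial\mathcal{B}, \tag{22}$$

$$\mathbf{n} \times \mathbf{B}_S = 0 \quad \text{on } \partial\mathcal{B}. \tag{23}$$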
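Multiplying Eq. 21 by $\mathbf{B}$ and making use of Green’s theorem

$$\int_{\mathcal{B}} \operatorname{curl}\left(\frac{1}{\sigma} \operatorname{curl} \mathbf{a}\right) \cdot \mathbf{b}\, dV = \int_{\mathcal{B}} \frac{1}{\sigma} (\operatorname{curl} \mathbf{a} \cdot \operatorname{curl} \mathbf{b})\, dV - \int_{\partial\mathcal{B}} \frac{1}{\sigma} \operatorname{curl} \mathbf{a} \cdot (\mathbf{n} \times \mathbf{b})\, dS, \tag{24}$$

where $\mathbf{a}$ and $\mathbf{b}$ are differentiable vector fields and “$\cdot$” stands for the scalar product of vectors, results in

$$\int_{\mathcal{B}} \frac{1}{\sigma} (\operatorname{curl} \mathbf{B} \cdot \operatorname{curl} \mathbf{B})\, dV + \frac{\mu}{2} \frac{\partial}{\partial t} \int_{\mathcal{B}} (\mathbf{B} \cdot \mathbf{B})\, dV = \int_{\partial\mathcal{B}} \frac{1}{\sigma} \operatorname{curl} \mathbf{B} \cdot (\mathbf{n} \times \mathbf{B})\, dS. \tag{25}$$

Decomposing $\mathbf{B}$ into its toroidal and spheroidal parts, $\mathbf{B} = \mathbf{B}_T + \mathbf{B}_S$, the surface integral on the right-hand side then reads as

$$\int_{\partial\mathcal{B}} \frac{1}{\sigma} \operatorname{curl} \mathbf{B} \cdot (\mathbf{n} \times \mathbf{B})\, dS = \int_{\partial\mathcal{B}} \frac{1}{\sigma} \operatorname{curl} \mathbf{B} \cdot \left[(\mathbf{n} \times \mathbf{B}_T) + (\mathbf{n} \times \mathbf{B}_S)\right] dS. \tag{26}$$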
The Forward and Adjoint Methods of Global Electromagnetic Induction for. . .
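Both of the integrals on the right-hand side are equal to zero because of conditions (22) and (23). Moreover, integrating Eq. 25 over time and using the initial condition $\mathbf{B} = 0$ at $t = 0$, one obtains

$$\int_{\mathcal{B}} (\mathbf{B} \cdot \mathbf{B})\, dV = -\frac{2}{\mu} \int_0^T \int_{\mathcal{B}} \frac{1}{\sigma} (\operatorname{curl} \mathbf{B} \cdot \operatorname{curl} \mathbf{B})\, dV\, dt. \tag{27}$$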
Since $\sigma > 0$ in $\mathcal{B}$, the term on the right-hand side is always equal to or less than zero. On the other hand, the energy integral on the left-hand side is positive or equal to zero. Equation (27) can therefore only be satisfied if the difference field $\mathbf{B}$ is equal to zero at any time $t > 0$. Hence, Eq. 9, together with the boundary conditions (14) and (17) and the initial condition (18) satisfying the divergence-free condition (5), ensures the uniqueness of the solution.
3.2 Special Case: EM Induction in an Axisymmetric Case
Adopting the above assumptions for the case of EM induction for CHAMP magnetic data (see Sect. 2), the discussion is confined to the problem with axisymmetric electrical conductivity $\sigma = \sigma(r,\vartheta)$, and it is assumed that the variations of the magnetic field recorded by the CHAMP magnetometer are induced by a purely zonal external source. Under these two restrictions, both the inducing and induced parts of the magnetic induction $\mathbf{B}$ are axisymmetric vector fields that may be represented in terms of zonal vector spherical harmonics $\mathbf{Y}_j^{\ell}(\vartheta)$ (Varshalovich et al. 1989, Sect. 7.3). Using these harmonics, the toroidal-spheroidal decomposition of the magnetic induction $\mathbf{B}$ can be expressed in the form

$$\mathbf{B}(r,\vartheta) = \mathbf{B}_T(r,\vartheta) + \mathbf{B}_S(r,\vartheta) = \sum_{j=1}^{\infty} \sum_{\ell=j-1}^{j+1} B_j^{\ell}(r)\, \mathbf{Y}_j^{\ell}(\vartheta), \tag{28}$$

where $\mathbf{Y}_j^{j}(\vartheta)$ and $\mathbf{Y}_j^{j\pm1}(\vartheta)$ are zonal toroidal and spheroidal vector spherical harmonics, respectively, and $B_j^{j}(r)$ and $B_j^{j\pm1}(r)$ are spherical harmonic expansion coefficients of the toroidal ($\mathbf{B}_T$) and spheroidal ($\mathbf{B}_S$) parts of $\mathbf{B}$, respectively. Moreover, the $r$ and $\vartheta$ components of $\mathbf{Y}_j^{j}(\vartheta)$ and the $\varphi$ component of $\mathbf{Y}_j^{j\pm1}(\vartheta)$ are identically equal to zero, which means that the toroidal-spheroidal decomposition of $\mathbf{B}$ is also decoupled with respect to the spherical components of $\mathbf{B}_T$ and $\mathbf{B}_S$. The toroidal-spheroidal decomposition (28) can be introduced such that the toroidal part $\mathbf{B}_T$ is divergence-free and the $r$ component of the curl of the spheroidal part $\mathbf{B}_S$ vanishes:

$$\operatorname{div} \mathbf{B}_T = 0, \qquad \mathbf{e}_r \cdot \operatorname{curl} \mathbf{B}_S = 0. \tag{29}$$
Z. Martinec
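It should be emphasized that the axisymmetric geometry of the problem allows one to abbreviate the notation and drop the angular-order index $m = 0$ for scalar and vector spherical harmonics. The IBVP formulated in the previous section is now examined for the axisymmetric case. First, the product $\sigma\mathbf{B}$ is decomposed into the toroidal and spheroidal parts. Considering the two differential identities

$$\operatorname{div}(\sigma\mathbf{B}) = \sigma \operatorname{div} \mathbf{B} + \operatorname{grad} \sigma \cdot \mathbf{B}, \qquad \operatorname{curl}(\sigma\mathbf{B}) = \sigma \operatorname{curl} \mathbf{B} + \operatorname{grad} \sigma \times \mathbf{B} \tag{30}$$

for $\mathbf{B}_T$ and $\mathbf{B}_S$, respectively, and realizing that (i) for an axisymmetric electrical conductivity $\sigma$, the $\varphi$ component of the vector $\operatorname{grad} \sigma$ is identically equal to zero, and (ii) for purely zonal behavior of $\mathbf{B}_T$ and $\mathbf{B}_S$, the $r$ and $\vartheta$ components of $\mathbf{B}_T$ and the $\varphi$ component of $\mathbf{B}_S$ are identically equal to zero, it is found that the products $\operatorname{grad} \sigma \cdot \mathbf{B}_T = 0$ and $\mathbf{e}_r \cdot (\operatorname{grad} \sigma \times \mathbf{B}_S) = 0$. Then

$$\operatorname{div}(\sigma\mathbf{B}_T) = 0, \qquad \mathbf{e}_r \cdot \operatorname{curl}(\sigma\mathbf{B}_S) = 0. \tag{31}$$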
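In other words, the product of $\sigma$ with $\mathbf{B}_T$ and $\mathbf{B}_S$ results again in toroidal and spheroidal vectors, respectively. Note that such a “decoupled” spheroidal-toroidal decomposition of $\sigma\mathbf{B}$ is broken once either of the two basic assumptions of the axisymmetric geometry of the problem is violated. Furthermore, since the rotation of a spheroidal vector is a toroidal vector and vice versa, the IBVP for an axisymmetric case can be split into two decoupled IBVPs: (1) the problem formulated for the spheroidal magnetic induction $\mathbf{B}_S$:

$$\operatorname{curl}\left(\frac{1}{\sigma} \operatorname{curl} \mathbf{B}_S\right) + \mu \frac{\partial \mathbf{B}_S}{\partial t} = 0 \quad \text{in } \mathcal{B}, \tag{32}$$

$$\operatorname{div} \mathbf{B}_S = 0 \quad \text{in } \mathcal{B}, \tag{33}$$

$$\mathbf{n} \times \mathbf{B}_S = \mathbf{b}_t \quad \text{on } \partial\mathcal{B}, \tag{34}$$

with an inhomogeneous initial condition $\mathbf{B}_S = \mathbf{B}_S^0$ at $t = 0$, and (2) the problem for the toroidal magnetic induction $\mathbf{B}_T$:

$$\operatorname{curl}\left(\frac{1}{\sigma} \operatorname{curl} \mathbf{B}_T\right) + \mu \frac{\partial \mathbf{B}_T}{\partial t} = 0 \quad \text{in } \mathcal{B}, \tag{35}$$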
with homogeneous initial and boundary conditions. Attention is hereafter turned to the first IBVP, since the latter one has only the trivial solution $\mathbf{B}_T = 0$ in $\mathcal{B}$. Note again that such a “decoupled” spheroidal-toroidal decomposition of the boundary-value problem for EM induction cannot be achieved if the conductivity $\sigma$ also depends on the longitude $\varphi$ and/or when the external excitation source has not only zonal, but also tesseral and sectoral spherical components.

For convenience, the toroidal vector potential $\mathbf{A}$ ($\mathbf{A}$ is not labeled by subscript $T$ since its spheroidal counterpart is not used in this text) that generates the spheroidal magnetic induction $\mathbf{B}_S$ is introduced:

$$\mathbf{B}_S = \operatorname{curl} \mathbf{A}, \qquad \operatorname{div} \mathbf{A} = 0. \tag{36}$$
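By this prescription, the divergence-free constraint (33) is automatically satisfied and the IBVP for the spheroidal magnetic induction $\mathbf{B}_S$ is transformed to the IBVP for the toroidal vector potential $\mathbf{A} = \mathbf{A}(r,\vartheta)$. In the classical mathematical formulation, the toroidal vector potential $\mathbf{A} \in C^2(\mathcal{B}) \times C^1(\langle 0,\infty))$ is searched for such that $\mathbf{B}_S = \operatorname{curl} \mathbf{A}$ and

$$\frac{1}{\mu} \operatorname{curl} \operatorname{curl} \mathbf{A} + \sigma \frac{\partial \mathbf{A}}{\partial t} = 0 \quad \text{in } \mathcal{B}, \tag{37}$$

$$\operatorname{div} \mathbf{A} = 0 \quad \text{in } \mathcal{B}, \tag{38}$$

$$\mathbf{n} \times \operatorname{curl} \mathbf{A} = \mathbf{b}_t \quad \text{on } \partial\mathcal{B}, \tag{39}$$

$$\mathbf{A}|_{t=0} = \mathbf{A}^0 \quad \text{in } \mathcal{B}, \tag{40}$$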
where the conductivity $\sigma \geq 0$ is a continuous function in $\mathcal{B}$, $\sigma \in C(\mathcal{B})$, $\mu > 0$ is the constant permeability of a vacuum, $\mathbf{b}_t(\vartheta,t) \in C^2(\partial\mathcal{B}) \times C^1(\langle 0,\infty))$, and $\mathbf{A}^0$ is the generating potential for the initial magnetic induction $\mathbf{B}_S^0$ such that $\mathbf{B}_S^0 = \operatorname{curl} \mathbf{A}^0$. At the internal interfaces, where the electrical conductivity changes discontinuously, the continuity of the tangential components of magnetic induction and electric intensity is required. The various functional spaces used in this approach are listed in Table 1. The magnetic diffusion equation (37) for $\mathbf{A}$ follows from Maxwell’s equation (3). To satisfy Faraday’s law (4), the electric intensity has to be a toroidal vector, $\mathbf{E} = \mathbf{E}_T$, of the form

$$\mathbf{E}_T = -\frac{\partial \mathbf{A}}{\partial t}. \tag{41}$$
Table 1 List of functional spaces used
$C(\bar{D})$    Space of the continuous functions defined in the domain $\bar{D}$ ($\bar{D}$ is the closure of $D$)
$C^1(\langle 0,\infty))$    Space of the functions for which the classical derivatives up to first order are continuous in the interval $\langle 0,\infty)$
$C^2(D)$    Space of the functions for which the classical derivatives up to second order are continuous in $D$
$L_2(D)$    Space of square-integrable functions in $D$
Note that the electric intensity $\mathbf{E}_T$ is not generated by the gradient of a scalar electromagnetic potential, since the gradient of a scalar results in a spheroidal vector that would contradict the requirement that $\mathbf{E}_T$ is a toroidal vector. Under the prescription (41), the continuity condition (19) of the tangential components of the electric intensity can be ensured by the continuity of the toroidal vector potential $\mathbf{A}$,

$$\mathbf{A} = \mathbf{A}_0 \quad \text{on } \partial\mathcal{B}, \tag{42}$$

since $\mathbf{A}$ has only nonzero tangential components.
3.3 Gauss Representation of Magnetic Induction in the Atmosphere
As mentioned above, the Earth’s atmosphere in the vicinity of the Earth is assumed to be nonconducting, with the magnetic induction $\mathbf{B}_0$ generated by the magnetic scalar potential $U$, which is a harmonic function satisfying Laplace’s equation (12). Under the assumption of axisymmetric geometry (see Sect. 2), its solution is given in terms of zonal solid scalar spherical harmonics $r^j Y_j(\vartheta)$ and $r^{-j-1} Y_j(\vartheta)$:

$$U(r,\vartheta,t) = a \sum_{j=1}^{\infty} \left[ \left(\frac{r}{a}\right)^{j} G_j^{(e)}(t) + \left(\frac{a}{r}\right)^{j+1} G_j^{(i)}(t) \right] Y_j(\vartheta) \quad \text{for } r \geq a, \tag{43}$$

where $a$ is the radius of the conducting sphere $\mathcal{B}$, which is equal to the mean radius of the Earth, and $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ are the time-dependent, zonal spherical harmonic Gauss coefficients of the external and internal magnetic fields, respectively. Using the following formula for the gradient of a scalar function $f(r)Y_j(\vartheta)$ in spherical coordinates (Varshalovich et al. 1989, p. 217),

$$\operatorname{grad}\left[f(r) Y_j(\vartheta)\right] = \sqrt{\frac{j}{2j+1}} \left(\frac{d}{dr} + \frac{j+1}{r}\right) f(r)\, \mathbf{Y}_j^{j-1}(\vartheta) - \sqrt{\frac{j+1}{2j+1}} \left(\frac{d}{dr} - \frac{j}{r}\right) f(r)\, \mathbf{Y}_j^{j+1}(\vartheta), \tag{44}$$

the magnetic induction in a vacuum ($r \geq a$) may be expressed in terms of solid vector spherical harmonics as

$$\mathbf{B}_0(r,\vartheta,t) = -\sum_{j=1}^{\infty} \left[ \sqrt{j(2j+1)} \left(\frac{r}{a}\right)^{j-1} G_j^{(e)}(t)\, \mathbf{Y}_j^{j-1}(\vartheta) + \sqrt{(j+1)(2j+1)} \left(\frac{a}{r}\right)^{j+2} G_j^{(i)}(t)\, \mathbf{Y}_j^{j+1}(\vartheta) \right]. \tag{45}$$
This formula again demonstrates the fact that the toroidal component of the magnetic field in a vacuum vanishes, $\mathbf{B}_{0,T} = 0$. For the following considerations, it is convenient to express the magnetic induction $\mathbf{B}_0$ in terms of the toroidal vector potential $\mathbf{A}_0$ such that $\mathbf{B}_0 = \operatorname{curl} \mathbf{A}_0$. Using the rotation formulae (222) and (223) for vector spherical harmonics, the spherical harmonic representation of the toroidal vector potential in a vacuum reads as

$$\mathbf{A}_0(r,\vartheta,t) = a \sum_{j=1}^{\infty} \left[ \sqrt{\frac{j}{j+1}} \left(\frac{r}{a}\right)^{j} G_j^{(e)}(t) - \sqrt{\frac{j+1}{j}} \left(\frac{a}{r}\right)^{j+1} G_j^{(i)}(t) \right] \mathbf{Y}_j^{j}(\vartheta). \tag{46}$$

The representation (46) of $\mathbf{A}_0$ by solid spherical harmonics $r^j \mathbf{Y}_j^{j}(\vartheta)$ and $r^{-j-1} \mathbf{Y}_j^{j}(\vartheta)$ is consistent with the fact that $\mathbf{A}_0$ satisfies the vector Laplace equation, $\nabla^2 \mathbf{A}_0 = 0$, as seen from Eqs. 37 and 38 for $\sigma = 0$. As introduced above, $\partial\mathcal{B}$ is a sphere (of radius $a$) with the external normal $\mathbf{n}$ coinciding with the spherical base vector $\mathbf{e}_r$, that is, $\mathbf{n} = \mathbf{e}_r$. Taking into account expression (213) for the polar components of the vector spherical harmonics, the horizontal northward $X$ component of the magnetic induction vector $\mathbf{B}_0$ at radius $r \geq a$ is

$$X(r,\vartheta,t) := -\mathbf{e}_\vartheta \cdot \mathbf{B}_0 = \sum_{j=1}^{\infty} X_j(r,t)\, \frac{\partial Y_j(\vartheta)}{\partial\vartheta}, \tag{47}$$

where $\mathbf{e}_\vartheta$ is the spherical base vector in the colatitudinal direction. The spherical harmonic coefficients $X_j$ are expressed in the form

$$X_j(r,t) = \left(\frac{r}{a}\right)^{j-1} G_j^{(e)}(t) + \left(\frac{a}{r}\right)^{j+2} G_j^{(i)}(t). \tag{48}$$

Similarly, the spherical harmonic representation of the vertical downward $Z$ component of the magnetic induction vector $\mathbf{B}_0$ at radius $r \geq a$ is

$$Z(r,\vartheta,t) := -\mathbf{e}_r \cdot \mathbf{B}_0 = \sum_{j=1}^{\infty} Z_j(r,t)\, Y_j(\vartheta), \tag{49}$$

where the spherical harmonic coefficients $Z_j$ are

$$Z_j(r,t) = j \left(\frac{r}{a}\right)^{j-1} G_j^{(e)}(t) - (j+1) \left(\frac{a}{r}\right)^{j+2} G_j^{(i)}(t). \tag{50}$$
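The relations between the Gauss coefficients and the coefficients of the $X$ and $Z$ components are simple enough to state as code. The following is a minimal sketch under stated assumptions: the function names and the sample values are hypothetical, and the inverse relation is not given in the text but follows from solving the two linear equations for $G_j^{(e)}$ and $G_j^{(i)}$ at a fixed degree $j$ and radius $r$.

```python
def xz_coefficients(j, r, a, g_ext, g_int):
    """Coefficients X_j and Z_j at radius r >= a from the Gauss coefficients.

    g_ext and g_int stand for the external and internal Gauss coefficients
    of degree j at one instant of time.
    """
    up = (r / a) ** (j - 1)      # radial factor of the external part
    down = (a / r) ** (j + 2)    # radial factor of the internal part
    x_j = up * g_ext + down * g_int
    z_j = j * up * g_ext - (j + 1) * down * g_int
    return x_j, z_j


def gauss_from_xz(j, r, a, x_j, z_j):
    """Invert the 2x2 linear system: (j+1) X_j + Z_j and j X_j - Z_j
    isolate the external and internal parts, respectively."""
    up = (r / a) ** (j - 1)
    down = (a / r) ** (j + 2)
    g_ext = ((j + 1) * x_j + z_j) / ((2 * j + 1) * up)
    g_int = (j * x_j - z_j) / ((2 * j + 1) * down)
    return g_ext, g_int


if __name__ == "__main__":
    a, b = 6371.2, 6771.2  # mean Earth radius and an assumed orbit radius, in km
    x1, z1 = xz_coefficients(1, b, a, g_ext=-20.0, g_int=-6.0)
    print(x1, z1)
    print(gauss_from_xz(1, b, a, x1, z1))
```

The round trip shows that the external and internal coefficients can be separated only when both $X_j$ and $Z_j$ are known at the same radius; either component alone fixes just one linear combination of them.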
Equations 48 and 50 show that the coefficients $X_j$ and $Z_j$ are composed of two different linear combinations of the spherical harmonic coefficients $G_j^{(e)}$ of the external electromagnetic sources and the spherical harmonic coefficients $G_j^{(i)}$ of the induced electromagnetic field inside the Earth. Consequently, there is no need to specify these coefficients separately when $X_j$ and $Z_j$ are used as the boundary-value data for the forward and adjoint modeling of EM induction, respectively. Making use of Eq. 45 and formula (219) for the cross product of $\mathbf{e}_r$ with the spheroidal vector spherical harmonics $\mathbf{Y}_j^{j\pm1}(\vartheta)$, the tangential component of magnetic induction $\mathbf{B}_0$ in a vacuum ($r \geq a$) has the form

$$\mathbf{e}_r \times \mathbf{B}_0(r,\vartheta,t) = -\sum_{j=1}^{\infty} \sqrt{j(j+1)} \left[ \left(\frac{r}{a}\right)^{j-1} G_j^{(e)}(t) + \left(\frac{a}{r}\right)^{j+2} G_j^{(i)}(t) \right] \mathbf{Y}_j^{j}(\vartheta), \tag{51}$$

which, in view of Eq. 48, can be rewritten in terms of the spherical harmonic coefficients $X_j(r,t)$:

$$\mathbf{e}_r \times \mathbf{B}_0(r,\vartheta,t) = -\sum_{j=1}^{\infty} \sqrt{j(j+1)}\, X_j(r,t)\, \mathbf{Y}_j^{j}(\vartheta). \tag{52}$$
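In other words, the axisymmetric geometry allows the determination of $\mathbf{e}_r \times \mathbf{B}_0$ from the horizontal northward $X$ component of the magnetic induction vector $\mathbf{B}_0$. In particular, the ground magnetic observation vector $\mathbf{b}_t$, defined by Eq. 17, can be expressed as

$$\mathbf{b}_t(\vartheta,t) = -\sum_{j=1}^{\infty} \sqrt{j(j+1)} \left[ G_j^{(e)}(t) + G_j^{(i)}(t) \right] \mathbf{Y}_j^{j}(\vartheta) = -\sum_{j=1}^{\infty} \sqrt{j(j+1)}\, X_j(a,t)\, \mathbf{Y}_j^{j}(\vartheta). \tag{53}$$

Likewise, the satellite magnetic observation vector $\mathbf{B}_t$, defined by

$$\mathbf{B}_t := \mathbf{n} \times \mathbf{B}_0 |_{\partial\mathcal{A}}, \tag{54}$$

where $\partial\mathcal{A}$ is the mean-orbit sphere of radius $r = b$, can be expressed in terms of the external and internal Gauss coefficients and the spherical harmonic coefficients $X_j(b,t)$, respectively, as

$$\mathbf{B}_t(\vartheta,t) = -\sum_{j=1}^{\infty} \sqrt{j(j+1)} \left[ \left(\frac{b}{a}\right)^{j-1} G_j^{(e)}(t) + \left(\frac{a}{b}\right)^{j+2} G_j^{(i)}(t) \right] \mathbf{Y}_j^{j}(\vartheta) = -\sum_{j=1}^{\infty} \sqrt{j(j+1)}\, X_j(b,t)\, \mathbf{Y}_j^{j}(\vartheta). \tag{55}$$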
4 Forward Method of EM Induction for the X Component of CHAMP Magnetic Data
The forward method of EM induction can be formulated for at least two kinds of boundary-value data: either the X component of the CHAMP magnetic data (considered in this section) or the Gauss coefficients of the external magnetic field (the next section) is specified along the track of the CHAMP satellite. Most of the considerations in this section follow the papers by Martinec (1997), Martinec et al. (2003), and Martinec and McCreadie (2004).
4.1 Classical Formulation
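The IBVP (37)–(40) assumes that magnetic data $\mathbf{b}_t$ are prescribed on the Earth’s surface. For satellite measurements, this requires the continuation of magnetic data from satellite-orbit altitudes down to the Earth’s surface. Since the downward continuation of satellite magnetic data poses a fundamental problem, a modification of the IBVP (37)–(40) such that the X component of CHAMP magnetic data is used directly as boundary values at satellite altitudes is given in this section. The solution domain is extended by the atmosphere $\mathcal{A}$ surrounding the conducting sphere $\mathcal{B}$. Since only the magnetic signals from night-time, mid-latitude tracks are considered, it is assumed that there are no electric currents in $\mathcal{A}$. This assumption is not completely correct, but it is still a good approximation (Langel and Estes 1985b). Moreover, $\mathcal{A}$ is treated as a nonconducting spherical layer with the inner boundary coinciding with the surface $\partial\mathcal{B}$ of the conducting sphere $\mathcal{B}$ with radius $r = a$ and the outer boundary coinciding with the mean-orbit sphere $\partial\mathcal{A}$ of radius $r = b$. The classical mathematical formulation of the IBVP of global EM induction for satellite magnetic data is as follows. Find the toroidal vector potential $\mathbf{A}$ in the conducting sphere $\mathcal{B}$ and the toroidal vector potential $\mathbf{A}_0$ in the nonconducting atmosphere $\mathcal{A}$ such that the magnetic induction vectors in $\mathcal{B}$ and $\mathcal{A}$ are expressed in the forms $\mathbf{B} = \operatorname{curl} \mathbf{A}$ and $\mathbf{B}_0 = \operatorname{curl} \mathbf{A}_0$, respectively, and, for $t > 0$, it holds that

$$\frac{1}{\mu} \operatorname{curl} \operatorname{curl} \mathbf{A} + \sigma \frac{\partial \mathbf{A}}{\partial t} = 0 \quad \text{in } \mathcal{B}, \tag{56}$$

$$\operatorname{div} \mathbf{A} = 0 \quad \text{in } \mathcal{B}, \tag{57}$$

$$\operatorname{curl} \operatorname{curl} \mathbf{A}_0 = 0 \quad \text{in } \mathcal{A}, \tag{58}$$

$$\operatorname{div} \mathbf{A}_0 = 0 \quad \text{in } \mathcal{A}, \tag{59}$$

$$\mathbf{A} = \mathbf{A}_0 \quad \text{on } \partial\mathcal{B}, \tag{60}$$

$$\mathbf{n} \times \operatorname{curl} \mathbf{A} = \mathbf{n} \times \operatorname{curl} \mathbf{A}_0 \quad \text{on } \partial\mathcal{B}, \tag{61}$$

$$\mathbf{n} \times \operatorname{curl} \mathbf{A}_0 = \mathbf{B}_t \quad \text{on } \partial\mathcal{A}, \tag{62}$$

$$\mathbf{A}|_{t=0} = \mathbf{A}^0 \quad \text{in } \mathcal{B} \cup \mathcal{A}, \tag{63}$$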
where the mathematical assumptions imposed on the functions $\mathbf{A}$, $\mathbf{A}_0$, $\sigma$, $\mu$, $\mathbf{B}_t$, and $\mathbf{A}^0$ are the same as for the IBVP (37)–(40), see Table 1. The continuity condition (60) on $\mathbf{A}$ is imposed on the solution since the intention is to apply a different parameterization of $\mathbf{A}$ in the sphere $\mathcal{B}$ and the spherical layer $\mathcal{A}$. The term $\mathbf{B}_t$ represents the tangential components of the magnetic induction $\mathbf{B}_0$ at the satellite altitudes and $\mathbf{n}$ is the unit normal to $\partial\mathcal{A}$. The axisymmetric geometry allows one (see Sect. 7) to determine $\mathbf{B}_t$ from the horizontal northward $X$ component of the magnetic induction vector $\mathbf{B}$ measured by the CHAMP vector magnetometer.
4.2 Weak Formulation
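Ground Magnetic Data
The IBVP (37)–(40) for the ground magnetic data $\mathbf{b}_t$ is now reformulated in a weak sense. The solution space is introduced as

$$V := \{\mathbf{A} \mid \mathbf{A} \in L_2(\mathcal{B}),\ \operatorname{curl} \mathbf{A} \in L_2(\mathcal{B}),\ \operatorname{div} \mathbf{A} = 0 \text{ in } \mathcal{B}\}, \tag{64}$$

where the functional space $L_2(\mathcal{B})$ is introduced in Table 1. The weak formulation of the IBVP (37)–(40) consists of finding $\mathbf{A} \in V \times C^1((0,\infty))$ such that at a fixed time it satisfies the following variational equation:

$$a(\mathbf{A},\delta\mathbf{A}) + b(\mathbf{A},\delta\mathbf{A}) = f(\delta\mathbf{A}) \quad \forall\, \delta\mathbf{A} \in V, \tag{65}$$

where the bilinear forms $a(\cdot,\cdot)$, $b(\cdot,\cdot)$ and the linear functional $f(\cdot)$ are defined as follows:

$$a(\mathbf{A},\delta\mathbf{A}) := \frac{1}{\mu} \int_{\mathcal{B}} (\operatorname{curl} \mathbf{A} \cdot \operatorname{curl} \delta\mathbf{A})\, dV, \tag{66}$$

$$b(\mathbf{A},\delta\mathbf{A}) := \int_{\mathcal{B}} \sigma(r,\vartheta) \left(\frac{\partial \mathbf{A}}{\partial t} \cdot \delta\mathbf{A}\right) dV, \tag{67}$$

$$f(\delta\mathbf{A}) := -\frac{1}{\mu} \int_{\partial\mathcal{B}} (\mathbf{b}_t \cdot \delta\mathbf{A})\, dS. \tag{68}$$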
It can be seen that the assumptions imposed on the potential $\mathbf{A}$ are weaker in the weak formulation than in the classical formulation. Moreover, the assumptions concerning the electrical conductivity $\sigma$ and the boundary data $\mathbf{b}_t$ can also be made weaker in the weak formulation. It is sufficient to assume that the electrical conductivity is a square-integrable function in $B$, $\sigma\in L_2(B)$, and that the boundary data at a fixed time is a square-integrable function on $\partial B$, $\mathbf{b}_t\in L_2(\partial B)\times C^1((0,\infty))$.

To show that the weak solution generalizes the classical solution to the problem (37)–(40), it is for the moment assumed that the weak solution $\mathbf{A}$ is sufficiently smooth and belongs to $C^2(B)$. Then, the following Green's theorem is valid:

\[ \int_B (\operatorname{curl}\mathbf{A}\cdot\operatorname{curl}\delta\mathbf{A})\,dV = \int_B (\operatorname{curl}\operatorname{curl}\mathbf{A}\cdot\delta\mathbf{A})\,dV - \int_{\partial B} (\mathbf{n}\times\operatorname{curl}\mathbf{A})\cdot\delta\mathbf{A}\,dS. \tag{69} \]

In view of this, the variational equation (65) can be rewritten as follows:

\[ \frac{1}{\mu}\int_B (\operatorname{curl}\operatorname{curl}\mathbf{A}\cdot\delta\mathbf{A})\,dV - \frac{1}{\mu}\int_{\partial B} (\mathbf{n}\times\operatorname{curl}\mathbf{A})\cdot\delta\mathbf{A}\,dS + \int_B \sigma\frac{\partial\mathbf{A}}{\partial t}\cdot\delta\mathbf{A}\,dV = -\frac{1}{\mu}\int_{\partial B} (\mathbf{b}_t\cdot\delta\mathbf{A})\,dS. \tag{70} \]

Taking first Eq. 70 only for the test functions $\delta\mathbf{A}\in C_0^\infty(B)$, where $C_0^\infty(B)$ is the space of infinitely differentiable functions with compact support in $B$, and making use of the implication

\[ \mathbf{f}\in L_2(B),\quad \int_B (\mathbf{f}\cdot\delta\mathbf{A})\,dV = 0 \quad \forall\,\delta\mathbf{A}\in C_0^\infty(B) \ \Rightarrow\ \mathbf{f} = 0 \text{ in } B, \tag{71} \]

Eq. 37 is proved. To obtain the boundary condition (39), the following implication is used:

\[ \mathbf{f}\in L_2(\partial B),\quad \int_{\partial B} (\mathbf{f}\cdot\delta\mathbf{A})\,dS = 0 \quad \forall\,\delta\mathbf{A}\in C^\infty(B) \ \Rightarrow\ \mathbf{f} = 0 \text{ on } \partial B, \tag{72} \]

where $C^\infty(B)$ is the space of infinitely differentiable functions in $B$. It can be seen that if a weak solution of the problem exists and is sufficiently smooth, for instance, if $\mathbf{A}\in C^2(B)$, then this solution satisfies the differential equation (37) and the boundary condition (39), all taken at time $t$. Thus, the weak solution generalizes the classical solution in $C^2(B)$, since the weak solution may exist even though the classical solution does not exist. However, if the classical solution exists, it is also the weak solution (Křížek and Neittaanmäki 1990).
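Satellite Magnetic Data

Turning now to the weak formulation of the IBVP (56)–(63) for satellite magnetic data $\mathbf{B}_t$, the intention is to apply different parameterizations of the potentials $\mathbf{A}$ and $\mathbf{A}_0$. In addition to the solution space $V$ for the conducting sphere $B$, the solution space $V_0$ for the nonconducting atmosphere $A$ is introduced:

\[ V_0 := \{\mathbf{A}_0 \mid \mathbf{A}_0\in C^2(A),\ \operatorname{div}\mathbf{A}_0 = 0 \text{ in } A\}. \tag{73} \]

Note that the continuity condition (60) is not imposed on either of the solution spaces $V$ and $V_0$. Instead, the Lagrange multiplier vector $\boldsymbol{\lambda}$ and a solution space for it are introduced:

\[ V_\lambda := \{\boldsymbol{\lambda} \mid \boldsymbol{\lambda}\in L_2(\partial B)\}. \tag{74} \]

The weak formulation of the IBVP (56)–(63) consists of finding $(\mathbf{A},\mathbf{A}_0,\boldsymbol{\lambda})\in (V, V_0, V_\lambda)\times C^1((0,\infty))$ such that at a fixed time they satisfy the following variational equation:

\[ a(\mathbf{A},\delta\mathbf{A}) + b(\mathbf{A},\delta\mathbf{A}) + a_0(\mathbf{A}_0,\delta\mathbf{A}_0) + c(\delta\mathbf{A}-\delta\mathbf{A}_0,\boldsymbol{\lambda}) + c(\mathbf{A}-\mathbf{A}_0,\delta\boldsymbol{\lambda}) = F(\delta\mathbf{A}_0) \quad \forall\,\delta\mathbf{A}\in V,\ \forall\,\delta\mathbf{A}_0\in V_0,\ \forall\,\delta\boldsymbol{\lambda}\in V_\lambda, \tag{75} \]

where the bilinear forms $a(\cdot,\cdot)$ and $b(\cdot,\cdot)$ are defined by Eqs. 66 and 67, and the additional bilinear forms $a_0(\cdot,\cdot)$ and $c(\cdot,\cdot)$ and the new linear functional $F(\cdot)$ are defined as follows:

\[ a_0(\mathbf{A}_0,\delta\mathbf{A}_0) := \frac{1}{\mu}\int_A (\operatorname{curl}\mathbf{A}_0\cdot\operatorname{curl}\delta\mathbf{A}_0)\,dV, \tag{76} \]
\[ c(\mathbf{A}-\mathbf{A}_0,\boldsymbol{\lambda}) := \int_{\partial B} (\mathbf{A}-\mathbf{A}_0)\cdot\boldsymbol{\lambda}\,dS, \tag{77} \]
\[ F(\delta\mathbf{A}_0) := -\frac{1}{\mu}\int_{\partial A} (\mathbf{B}_t\cdot\delta\mathbf{A}_0)\,dS. \tag{78} \]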
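To show that the weak solution generalizes the classical solution to the problem (56)–(63), it is again assumed that the weak solution $\mathbf{A}$ is sufficiently smooth and belongs to $C^2(B)$. By making use of Green's theorem (69), the variational equation (75) can be written as

\[
\begin{aligned}
&\frac{1}{\mu}\int_B (\operatorname{curl}\operatorname{curl}\mathbf{A}\cdot\delta\mathbf{A})\,dV - \frac{1}{\mu}\int_{\partial B} (\mathbf{n}\times\operatorname{curl}\mathbf{A})\cdot\delta\mathbf{A}\,dS + \int_B \sigma\frac{\partial\mathbf{A}}{\partial t}\cdot\delta\mathbf{A}\,dV \\
&\quad + \frac{1}{\mu}\int_A (\operatorname{curl}\operatorname{curl}\mathbf{A}_0\cdot\delta\mathbf{A}_0)\,dV - \frac{1}{\mu}\int_{\partial A} (\mathbf{n}\times\operatorname{curl}\mathbf{A}_0)\cdot\delta\mathbf{A}_0\,dS + \frac{1}{\mu}\int_{\partial B} (\mathbf{n}\times\operatorname{curl}\mathbf{A}_0)\cdot\delta\mathbf{A}_0\,dS \\
&\quad + \int_{\partial B} (\delta\mathbf{A}-\delta\mathbf{A}_0)\cdot\boldsymbol{\lambda}\,dS + \int_{\partial B} (\mathbf{A}-\mathbf{A}_0)\cdot\delta\boldsymbol{\lambda}\,dS = -\frac{1}{\mu}\int_{\partial A} (\mathbf{B}_t\cdot\delta\mathbf{A}_0)\,dS.
\end{aligned} \tag{79}
\]

Now, taking Eq. 79 for the test functions $\delta\mathbf{A}\in C_0^\infty(B)$ and making use of the implication (71), Eq. 56 is proved. Likewise, taking Eq. 79 for the test functions $\delta\mathbf{A}_0\in C_0^\infty(A)$ and using the implication (71) for the domain $A$, Eq. 58 is proved. To obtain the continuity conditions (60) and (61), the implication (72) is used for the test functions $\delta\mathbf{A}\in C^\infty(B)$. The boundary condition (62) can be obtained in an analogous way if the implication (72) is considered for $\partial A$. It can therefore be concluded that if a weak solution of the problem exists and is sufficiently smooth, for instance, if $\mathbf{A}$ belongs to the space of functions whose second-order derivatives are continuous in $B$, then this solution satisfies the differential equations (56) and (58), the interface conditions (60) and (61), and the boundary condition (62), all taken at time $t$.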
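4.3 Frequency-Domain and Time-Domain Solutions

Two approaches to solving the IBVPs of EM induction with respect to the time variable $t$ are now presented. The variational equation (65) is first solved in the Fourier-frequency domain, assuming that all field variables have a harmonic time dependence of the form $e^{i\omega t}$. Denoting the Fourier image of $\mathbf{A}$ by $\hat{\mathbf{A}}$, the weak formulation of EM induction for ground magnetic data in the frequency domain is described by the variational equation

\[ a(\hat{\mathbf{A}},\delta\hat{\mathbf{A}}) + i\omega\, b_1(\hat{\mathbf{A}},\delta\hat{\mathbf{A}}) = f(\delta\hat{\mathbf{A}}) \quad \forall\,\delta\hat{\mathbf{A}}\in V, \tag{80} \]

where the new bilinear form $b_1(\hat{\mathbf{A}},\delta\hat{\mathbf{A}})$ is defined by

\[ b_1(\hat{\mathbf{A}},\delta\hat{\mathbf{A}}) := \int_B \sigma(r,\vartheta)\,(\hat{\mathbf{A}}\cdot\delta\hat{\mathbf{A}})\,dV. \tag{81} \]

Having solved Eq. 80 for $\hat{\mathbf{A}}$, the solution is transformed back to the time domain by applying the inverse Fourier transform.

Alternatively, the IBVP for ground magnetic data can be solved directly in the time domain, which is the approach applied in the following. There are several choices for representing the time derivative of the toroidal vector potential $\mathbf{A}$ in the bilinear form $b(\cdot,\cdot)$. For simplicity, the explicit Euler differencing scheme will be chosen and $\partial\mathbf{A}/\partial t$ will be approximated by the differences of $\mathbf{A}$ at two subsequent time levels (Press et al. 1992):

\[ \frac{\partial\mathbf{A}}{\partial t} \approx \frac{\mathbf{A}(r,\vartheta,t_{i+1}) - \mathbf{A}(r,\vartheta,t_i)}{t_{i+1}-t_i} =: \frac{{}^{i+1}\mathbf{A} - {}^{i}\mathbf{A}}{\Delta t_i}, \tag{82} \]

where ${}^{i}\mathbf{A}$ denotes the values of $\mathbf{A}$ at discrete time levels $0 = t_0 < t_1 < \cdots < t_{i+1} < \cdots$. The variational equation (65), which is now solved at each time level $t_i$, $i = 0, 1, \ldots$, has the form

\[ a({}^{i+1}\mathbf{A},\delta\mathbf{A}) + \frac{1}{\Delta t_i}\, b_1({}^{i+1}\mathbf{A},\delta\mathbf{A}) = \frac{1}{\Delta t_i}\, b_1({}^{i}\mathbf{A},\delta\mathbf{A}) + f({}^{i+1}\mathbf{b}_t,\delta\mathbf{A}) \quad \forall\,\delta\mathbf{A}\in V, \tag{83} \]

where the bilinear form $b_1(\cdot,\cdot)$ is defined by Eq. 81.

The same two approaches can be applied to the IBVP of EM induction for satellite magnetic data. Here, only the time-domain approach is presented. The variational equation (75) is discretized with respect to time and solved at each time level $t_i$:

\[
\begin{aligned}
&a({}^{i+1}\mathbf{A},\delta\mathbf{A}) + \frac{1}{\Delta t_i}\, b_1({}^{i+1}\mathbf{A},\delta\mathbf{A}) + a_0({}^{i+1}\mathbf{A}_0,\delta\mathbf{A}_0) + c(\delta\mathbf{A}-\delta\mathbf{A}_0,{}^{i+1}\boldsymbol{\lambda}) + c({}^{i+1}\mathbf{A}-{}^{i+1}\mathbf{A}_0,\delta\boldsymbol{\lambda}) \\
&\quad = \frac{1}{\Delta t_i}\, b_1({}^{i}\mathbf{A},\delta\mathbf{A}) + F({}^{i+1}\mathbf{B}_t,\delta\mathbf{A}_0) \quad \forall\,\delta\mathbf{A}\in V,\ \forall\,\delta\mathbf{A}_0\in V_0,\ \forall\,\delta\boldsymbol{\lambda}\in V_\lambda.
\end{aligned} \tag{84}
\]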
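To illustrate how Eq. 83 is advanced in time once it has been discretized in space, the following minimal Python sketch performs one step of the scheme. The matrices `K` and `M` (standing for the discretized forms $a(\cdot,\cdot)$ and $b_1(\cdot,\cdot)$), the load vector `f_new`, and the function names are hypothetical and are not defined in this chapter; the sketch only assumes that these arrays have already been assembled. Because the form $a(\cdot,\cdot)$ is evaluated at the new time level in Eq. 83, each step requires the solution of a linear system.

```python
import numpy as np

def euler_step(K, M, x_old, f_new, dt):
    """Advance the spatially discretized Eq. 83 by one time level.

    Solves (K + M/dt) x_new = (M/dt) x_old + f_new, where K and M are
    assumed (hypothetical) matrices of the forms a(.,.) and b1(.,.).
    """
    K = np.asarray(K, dtype=float)
    M = np.asarray(M, dtype=float)
    lhs = K + M / dt
    rhs = M @ np.asarray(x_old, dtype=float) / dt + np.asarray(f_new, dtype=float)
    return np.linalg.solve(lhs, rhs)

def march(K, M, x0, loads, times):
    """Apply euler_step over the levels t_0 < t_1 < ...; loads[i] belongs to times[i + 1]."""
    x = np.asarray(x0, dtype=float)
    history = [x]
    for i in range(len(times) - 1):
        x = euler_step(K, M, x, loads[i], times[i + 1] - times[i])
        history.append(x)
    return np.array(history)
```

The time levels need not be equidistant, since the step $\Delta t_i = t_{i+1} - t_i$ is recomputed for every level.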
4.4 Vector Spherical Harmonic Parameterization over Colatitude

For the axisymmetric geometry of external sources and the conductivity model, it has been shown that the induced electromagnetic field is axisymmetric and the associated toroidal vector potential is an axisymmetric vector. It may be represented in terms of zonal toroidal vector spherical harmonics $\mathbf{Y}_j^j(\vartheta)$. Their explicit forms are as follows (more details are given in the Appendix):

\[ \mathbf{Y}_j^j(\vartheta) := P_{j1}(\cos\vartheta)\,\mathbf{e}_\varphi, \tag{85} \]

where $P_{j1}(\cos\vartheta)$ is the associated Legendre function of degree $j$ and order $m = 1$ and $\mathbf{e}_\varphi$ is the spherical base vector in the longitudinal direction. An important property of the functions $\mathbf{Y}_j^j(\vartheta)$ is that they are divergence-free:

\[ \operatorname{div}\left[f(r)\,\mathbf{Y}_j^j(\vartheta)\right] = 0, \tag{86} \]
where $f(r)$ is a differentiable function.

The required toroidal vector potential $\mathbf{A}$ and test functions $\delta\mathbf{A}$ inside the conducting sphere $B$ can be represented as a series of the functions $\mathbf{Y}_j^j(\vartheta)$:

\[ \begin{Bmatrix} \mathbf{A}(r,\vartheta,t) \\ \delta\mathbf{A}(r,\vartheta) \end{Bmatrix} = \sum_{j=1}^{\infty} \begin{Bmatrix} A_j^j(r,t) \\ \delta A_j^j(r) \end{Bmatrix} \mathbf{Y}_j^j(\vartheta), \tag{87} \]

where $A_j^j(r,t)$ and $\delta A_j^j(r)$ are spherical harmonic expansion coefficients. The divergence-free property of the functions $\mathbf{Y}_j^j(\vartheta)$ implies that both the toroidal vector potential $\mathbf{A}$ and the test functions $\delta\mathbf{A}$ are divergence-free. Therefore, the parameterization (87) of the potentials $\mathbf{A}$ and $\delta\mathbf{A}$ automatically satisfies the requirement that the functions from the solution space $V$ be divergence-free. The parameterization (87) is also employed for the Lagrange multipliers $\boldsymbol{\lambda}(\vartheta,t)$ and the associated test functions $\delta\boldsymbol{\lambda}(\vartheta)$ with the expansion coefficients $\lambda_j^j(t)$ and $\delta\lambda_j^j$, respectively.

Introducing the spherical harmonic representation of the zonal toroidal vector $\mathbf{A}$, the curl of $\mathbf{A}$ is a zonal spheroidal vector:

\[ \operatorname{curl}\mathbf{A} = \sum_{j=1}^{\infty}\ \sum_{\ell=j-1}^{j+1,2} R_j^\ell(\mathbf{A};r)\,\mathbf{Y}_j^\ell(\vartheta), \tag{88} \]
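where $R_j^\ell(\mathbf{A};r)$ are given by Eq. 223 in the Appendix. The substitution of Eqs. 87 and 88 into Eqs. 66 and 81 leads to the spherical harmonic representation of the bilinear forms $a(\cdot,\cdot)$ and $b_1(\cdot,\cdot)$:

\[
\begin{aligned}
a(\mathbf{A},\delta\mathbf{A}) &= \frac{1}{\mu}\sum_{j=1}^{\infty}\ \sum_{\ell=j-1}^{j+1,2}\int_0^a R_j^\ell(\mathbf{A};r)\,R_j^\ell(\delta\mathbf{A};r)\,r^2\,dr, \\
b_1(\mathbf{A},\delta\mathbf{A}) &= \int_0^a E(\mathbf{A},\delta\mathbf{A};r)\,r^2\,dr,
\end{aligned} \tag{89}
\]

where the orthogonality property (217) of vector spherical harmonics has been employed and $E$ denotes the angular part of the ohmic energy:

\[ E(\mathbf{A},\delta\mathbf{A};r) = 2\pi\int_0^{\pi}\sigma(r,\vartheta)\sum_{j_1=1}^{\infty}A_{j_1}^{j_1}(r,t)\,P_{j_1,1}(\cos\vartheta)\sum_{j_2=1}^{\infty}\delta A_{j_2}^{j_2}(r)\,P_{j_2,1}(\cos\vartheta)\,\sin\vartheta\,d\vartheta. \tag{90} \]

Likewise, substituting Eqs. 53 and 87 into Eq. 68 results in the spherical harmonic representation of the linear functional $f(\cdot)$:

\[ f(\delta\mathbf{A}) = \frac{a^2}{\mu}\sum_{j=1}^{\infty}\sqrt{j(j+1)}\left[G_j^{(e)}(t) + G_j^{(i)}(t)\right]\delta A_j^j(a). \tag{91} \]

4.5 Finite-Element Approximation over the Radial Coordinate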
Inside the conducting sphere $B$, the range of integration $\langle 0, a\rangle$ is divided over the radial coordinate into $P$ subintervals by the nodes $0 = r_1 < r_2 < \cdots < r_P < r_{P+1} = a$. The piecewise-linear basis functions defined at the nodes by the relation $\psi_k(r_i) = \delta_{ki}$ can be used as the basis functions of the Sobolev functional space $W_2^1(0,a)$. Note that only two basis functions are nonzero in the interval $r_k \le r \le r_{k+1}$, namely,

\[ \psi_k(r) = \frac{r_{k+1}-r}{h_k}, \qquad \psi_{k+1}(r) = \frac{r-r_k}{h_k}, \tag{92} \]

where $h_k = r_{k+1}-r_k$. Since both the unknown solution $A_j^j(r,t)$ and the test functions $\delta A_j^j(r)$ are elements of this functional space, they can be parameterized by piecewise-linear finite elements $\psi_k(r)$ such that

\[ \begin{Bmatrix} A_j^j(r,t) \\ \delta A_j^j(r) \end{Bmatrix} = \sum_{k=1}^{P+1} \begin{Bmatrix} A_j^{j,k}(t) \\ \delta A_j^{j,k} \end{Bmatrix} \psi_k(r). \tag{93} \]
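The basis functions of Eq. 92 and the expansion of Eq. 93 can be sketched in a few lines of Python. The function names `hat_basis` and `evaluate` are illustrative choices, not part of the chapter, and the sketch relies on linear interpolation of unit nodal vectors, which reproduces $\psi_k(r_i) = \delta_{ki}$ by construction.

```python
import numpy as np

def hat_basis(r, nodes):
    """Evaluate all piecewise-linear basis functions psi_k at the radii r.

    Returns an array of shape (len(nodes), len(r)) with entry [k, m] = psi_k(r[m]).
    """
    nodes = np.asarray(nodes, dtype=float)
    r = np.atleast_1d(np.asarray(r, dtype=float))
    psi = np.empty((nodes.size, r.size))
    for k in range(nodes.size):
        unit = np.zeros(nodes.size)
        unit[k] = 1.0                       # nodal values delta_{ki} of psi_k
        psi[k] = np.interp(r, nodes, unit)  # linear interpolation between nodes
    return psi

def evaluate(coeffs, r, nodes):
    """Evaluate the expansion of Eq. 93, sum_k coeffs[k] * psi_k(r)."""
    return np.asarray(coeffs, dtype=float) @ hat_basis(r, nodes)
```

On any element only the two functions attached to its end nodes are nonzero, so the columns of the returned array contain at most two nonzero entries, which sum to one.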
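The finite-element representation of the coefficients of $\operatorname{curl}\mathbf{A}$ then reads as

\[
\begin{aligned}
R_j^{j-1}(\mathbf{A};r) &= -\sqrt{\frac{j+1}{2j+1}}\left[\left(-\frac{1}{h_k}+\frac{j+1}{r}\,\psi_k(r)\right)A_j^{j,k} + \left(\frac{1}{h_k}+\frac{j+1}{r}\,\psi_{k+1}(r)\right)A_j^{j,k+1}\right], \\
R_j^{j+1}(\mathbf{A};r) &= -\sqrt{\frac{j}{2j+1}}\left[\left(-\frac{1}{h_k}-\frac{j}{r}\,\psi_k(r)\right)A_j^{j,k} + \left(\frac{1}{h_k}-\frac{j}{r}\,\psi_{k+1}(r)\right)A_j^{j,k+1}\right],
\end{aligned} \tag{94}
\]

where $r_k \le r \le r_{k+1}$. Since the electrical conductivity $\sigma(r,\vartheta)\in L_2(B)$, the radial dependence of $\sigma$ can be approximated by piecewise constant functions:

\[ \sigma(r,\vartheta) = \sigma_k(\vartheta), \qquad r_k \le r \le r_{k+1}, \tag{95} \]

where $\sigma_k(\vartheta)$ does not depend on the radial coordinate $r$ and may be further approximated by piecewise constant functions in colatitude $\vartheta$. (However, this approximation will not be denoted explicitly.) The integrals over $r$ in Eq. 89 can be divided into $P$ subintervals:

\[ a(\mathbf{A},\delta\mathbf{A}) = \frac{1}{\mu}\sum_{j=1}^{\infty}\ \sum_{\ell=j-1}^{j+1,2}\ \sum_{k=1}^{P}\int_{r_k}^{r_{k+1}} R_j^\ell(\mathbf{A};r)\,R_j^\ell(\delta\mathbf{A};r)\,r^2\,dr, \tag{96} \]
\[ b_1(\mathbf{A},\delta\mathbf{A}) = \sum_{k=1}^{P}\int_{r_k}^{r_{k+1}} E(\mathbf{A},\delta\mathbf{A};r)\,r^2\,dr, \tag{97} \]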
and the integration over $r$ is reduced to the computation of integrals of the type

\[ \int_{r_k}^{r_{k+1}} \psi_i(r)\,\psi_j(r)\,r^2\,dr, \tag{98} \]

where the indices $i$ and $j$ are equal to $k$ and/or $k+1$. These integrals can be evaluated numerically, for example, by means of the two-point Gauss-Legendre numerical quadrature with the weights equal to 1 and the nodes $x_{1,2} = \pm 1/\sqrt{3}$ (Press et al. 1992, Sect. 4.5). For instance, the quadrature formula for the integral in Eq. 97 can be written in the form

\[ b_1(\mathbf{A},\delta\mathbf{A}) = \sum_{k=1}^{P}\sum_{\alpha=1}^{2} E(\mathbf{A}(r_\alpha),\delta\mathbf{A}(r_\alpha);r_\alpha)\,\frac{r_\alpha^2 h_k}{2}, \tag{99} \]

where $r_\alpha := \tfrac{1}{2}(h_k x_\alpha + r_k + r_{k+1})$, $\alpha = 1, 2$. The integration over colatitude $\vartheta$ in the term $E$, see Eq. (90), can also be carried out numerically by the Gauss-Legendre quadrature formula. Computational details of this approach can be found in Orszag (1970) or Martinec (1989).
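The two-point rule used in Eq. 99 can be sketched as follows. The helper names `gauss2` and `element_integrals` are hypothetical. Note that a two-point Gauss-Legendre rule is exact for polynomials up to degree three, whereas the integrand of Eq. 98 has degree four, so the individual matrix entries computed below are approximations.

```python
import numpy as np

GAUSS_NODES = np.array([-1.0, 1.0]) / np.sqrt(3.0)  # x_{1,2} = -/+ 1/sqrt(3), weights 1

def gauss2(func, r_k, r_k1):
    """Two-point Gauss-Legendre approximation of the integral of func over [r_k, r_k1]."""
    h = r_k1 - r_k
    r_alpha = 0.5 * (h * GAUSS_NODES + r_k + r_k1)  # mapped nodes r_alpha of Eq. 99
    return 0.5 * h * np.sum(func(r_alpha))

def element_integrals(r_k, r_k1):
    """Approximate the 2x2 matrix of integrals (98), psi_i * psi_j * r^2, on one element."""
    h = r_k1 - r_k
    psi = [lambda r: (r_k1 - r) / h, lambda r: (r - r_k) / h]  # Eq. 92
    return np.array([[gauss2(lambda r: psi[i](r) * psi[j](r) * r**2, r_k, r_k1)
                      for j in range(2)] for i in range(2)])
```

Since $\psi_k + \psi_{k+1} = 1$ on an element, the four entries sum to the quadrature of $r^2$, which the rule integrates exactly; this gives a simple consistency check of an implementation.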
4.6 Solid Vector Spherical Harmonic Parameterization of $\mathbf{A}_0$

Our attention is now turned to the parameterization of the toroidal vector potential $\mathbf{A}_0$ and test functions $\delta\mathbf{A}_0$ in an insulating atmosphere $A$. Equation 46 shows that $\mathbf{A}_0$, and also $\delta\mathbf{A}_0$, can be represented as a series of the zonal toroidal vector spherical harmonics $\mathbf{Y}_j^j(\vartheta)$ with the spherical expansion coefficients $A_{0,j}^j$ and $\delta A_{0,j}^j$, respectively, of the form

\[ \begin{Bmatrix} A_{0,j}^j(r,t) \\ \delta A_{0,j}^j(r) \end{Bmatrix} = a\left[\sqrt{\frac{j}{j+1}}\left(\frac{r}{a}\right)^{j}\begin{Bmatrix} G_j^{(e)}(t) \\ \delta G_j^{(e)} \end{Bmatrix} - \sqrt{\frac{j+1}{j}}\left(\frac{a}{r}\right)^{j+1}\begin{Bmatrix} G_j^{(i)}(t) \\ \delta G_j^{(i)} \end{Bmatrix}\right] \quad \text{for } a \le r \le b. \tag{100} \]

The zonal scalar-magnetic Gauss coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ are considered known in the case of the IBVP for ground magnetic data as they constitute the ground magnetic data $\mathbf{b}_t$ on $\partial B$; see (53). However, for satellite magnetic data, the coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ in the insulating atmosphere $A$ are, in addition to the coefficients $A_j^{j,k}(t)$, unknowns and are sought by solving the IBVP (56)–(63). The associated test-function coefficients are denoted by $\delta G_j^{(e)}$ and $\delta G_j^{(i)}$. Applying the operator curl to the parameterization (100) and substituting the result into Eq. (76) results in the parameterization of the bilinear form $a_0(\cdot,\cdot)$:
\[ a_0(\mathbf{A}_0,\delta\mathbf{A}_0) = \frac{a^3}{\mu}\sum_{j=1}^{\infty}\left\{ j\left[\left(\frac{b}{a}\right)^{2j+1}-1\right]G_j^{(e)}(t)\,\delta G_j^{(e)} - (j+1)\left[\left(\frac{a}{b}\right)^{2j+1}-1\right]G_j^{(i)}(t)\,\delta G_j^{(i)}\right\}, \tag{101} \]

where $a$ and $b$ are the radii of the spheres $\partial B$ and $\partial A$, respectively.

The continuity condition (60), that is, $\mathbf{A} = \mathbf{A}_0$ on $\partial B$, is now expressed in terms of spherical harmonics. Substituting the spherical harmonic representations (87) of $\mathbf{A}$ and (100) of $\mathbf{A}_0$ into Eq. (60) results in the constraint between the external coefficients $G_j^{(e)}(t)$, the internal coefficients $G_j^{(i)}(t)$ of the toroidal vector potential $\mathbf{A}_0$ in the atmosphere $A$, and the coefficients $A_j^j(a,t)$ of the toroidal vector potential $\mathbf{A}$ in the conducting sphere $B$:

\[ A_j^j(a,t) = a\left[\sqrt{\frac{j}{j+1}}\,G_j^{(e)}(t) - \sqrt{\frac{j+1}{j}}\,G_j^{(i)}(t)\right]. \tag{102} \]

This continuity condition can be used to express the bilinear form $c(\cdot,\cdot)$, defined by Eq. 77, in terms of spherical harmonics as follows:

\[ c(\mathbf{A}-\mathbf{A}_0,\boldsymbol{\lambda}) = \sum_{j=1}^{\infty}\left[-\frac{1}{a}\sqrt{\frac{j}{j+1}}\,A_j^j(a,t) + \frac{j}{j+1}\,G_j^{(e)}(t) - G_j^{(i)}(t)\right]\lambda_j^j(t), \tag{103} \]

where $\lambda_j^j(t)$ are zonal toroidal vector spherical harmonic expansion coefficients of the Lagrange multiplier $\boldsymbol{\lambda}$. Finally, making use of Eqs. (55) and (100), the linear functional $F(\cdot)$ defined by Eq. 78 can be expressed in the form

\[ F(\delta\mathbf{A}_0) = \frac{b^2}{\mu}\sum_{j=1}^{\infty}\sqrt{j(j+1)}\,X_j(b,t)\,\delta A_{0,j}^j(b), \tag{104} \]

where the spherical harmonic coefficients $\delta A_{0,j}^j(b)$ of the test functions $\delta\mathbf{A}_0(b,\vartheta)$ are given by Eq. 100 for $r = b$:

\[ \delta A_{0,j}^j(b) = a\left[\sqrt{\frac{j}{j+1}}\left(\frac{b}{a}\right)^{j}\delta G_j^{(e)} - \sqrt{\frac{j+1}{j}}\left(\frac{a}{b}\right)^{j+1}\delta G_j^{(i)}\right]. \tag{105} \]

In view of this, the functional $F(\cdot)$ thus reads as

\[ F(\delta\mathbf{A}_0) = \frac{ab^2}{\mu}\sum_{j=1}^{\infty}X_j(b,t)\left[j\left(\frac{b}{a}\right)^{j}\delta G_j^{(e)} - (j+1)\left(\frac{a}{b}\right)^{j+1}\delta G_j^{(i)}\right]. \tag{106} \]
5 Forward Method of EM Induction for the External Gauss Coefficients

The case where the time series of the CHAMP-derived coefficients $X_j^{(obs)}(t)$ and $Z_j^{(obs)}(t)$ (see Sect. 7) are converted to a time series of spherical harmonic coefficients of external and internal fields at the CHAMP satellite altitude is now considered. To obtain these coefficients, denoted by $G_j^{(e,obs)}(t)$ and $G_j^{(i,obs)}(t)$, the Gaussian expansion of the external magnetic potential is undertaken at the satellite orbit of radius $r = b$, which results in Eqs. (48) and (50) where the radius $r$ equals $b$. A straightforward derivation then yields

\[
\begin{aligned}
G_j^{(e,obs)}(t) &= \frac{1}{2j+1}\left[(j+1)\,X_j^{(obs)}(t) + Z_j^{(obs)}(t)\right], \\
G_j^{(i,obs)}(t) &= \frac{1}{2j+1}\left[j\,X_j^{(obs)}(t) - Z_j^{(obs)}(t)\right].
\end{aligned} \tag{107}
\]

The satellite observables $G_j^{(e,obs)}(t)$ and $G_j^{(i,obs)}(t)$ are related to the original, ground-based Gauss coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ by

\[
\begin{aligned}
G_j^{(e,obs)}(t) &= (b/a)^{j-1}\,G_j^{(e)}(t), \\
G_j^{(i,obs)}(t) &= (a/b)^{j+2}\,G_j^{(i)}(t).
\end{aligned} \tag{108}
\]

When the Gauss coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ are computed from the satellite observables $G_j^{(e,obs)}(t)$ and $G_j^{(i,obs)}(t)$ by inverting Eq. 108, the aim is to solve the downward continuation of satellite magnetic data from the satellite's orbit to the Earth's surface. It is, in principle, a numerically unstable problem, in particular for higher-degree spherical harmonic coefficients, since noise contaminating the satellite observables $G_j^{(e,obs)}(t)$ and $G_j^{(i,obs)}(t)$ is amplified by a factor of $(b/a)^j$ when computing the ground-based Gauss coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$. Hence, the IBVP of EM induction is assumed to be solved only for low-degree spherical harmonics, typically up to spherical harmonic degree $j_{max} = 5$. In this case, Martinec and McCreadie (2004) showed that the downward continuation of the satellite-determined coefficients from the CHAMP satellite orbit to the ground is numerically stable. This fact will be adopted and the low-degree coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ are assumed to be calculated from the satellite observables by inverting (108). However, it should be noted that future satellite missions, such as SWARM (Olsen et al. 2006b), may provide reliable information about higher-degree spherical harmonic coefficients. Then, their downward continuation from a satellite's altitude to the ground will become numerically unstable and the forward and adjoint IBVP of EM induction will need to be formulated directly for $G_j^{(e,obs)}(t)$ and $G_j^{(i,obs)}(t)$.
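The conversion of Eq. 107 and the inversion of Eq. 108 amount to a few arithmetic operations per degree, as the following Python sketch shows. The function names are hypothetical; the radii `a` and `b` are inputs that the caller must supply. The last helper returns the growth factor of an error in the internal coefficient that follows directly from the second relation of Eq. 108.

```python
def gauss_obs_from_xz(j, x_obs, z_obs):
    """Eq. 107: external and internal coefficients at satellite altitude r = b."""
    g_ext_obs = ((j + 1) * x_obs + z_obs) / (2 * j + 1)
    g_int_obs = (j * x_obs - z_obs) / (2 * j + 1)
    return g_ext_obs, g_int_obs

def downward_continue(j, g_ext_obs, g_int_obs, a, b):
    """Invert Eq. 108 to obtain the ground-based Gauss coefficients at r = a."""
    g_ext = g_ext_obs / (b / a) ** (j - 1)
    g_int = g_int_obs / (a / b) ** (j + 2)
    return g_ext, g_int

def internal_noise_gain(j, a, b):
    """Growth of an error in G_j^(i,obs) when Eq. 108 is inverted (b > a assumed)."""
    return (b / a) ** (j + 2)
```

Because the gain grows geometrically with degree $j$, restricting the analysis to low degrees keeps the error amplification moderate, in line with the discussion above.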
5.1 Classical Formulation

Given the Gauss coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ as observations, the forward IBVP of EM induction can be reformulated. From several possible combinations of these coefficients, it is natural to consider that the external Gauss coefficients $G_j^{(e)}(t)$ are used as the boundary-value data for the forward EM induction method, while the internal Gauss coefficients $G_j^{(i)}(t)$ are used for the adjoint EM induction method. The modification of the forward method is now derived.

The first modification concerns the solution domain. While in the previous case for the X and Z components of the CHAMP magnetic data, the solution domain consists of a conducting sphere $B$ surrounded by an insulating atmosphere $A$, in the present case where the Gauss coefficients $G_j^{(e)}(t)$ and $G_j^{(i)}(t)$ are used as observations, it is sufficient to consider only the conducting sphere $B$ as the solution domain. Note, however, that the solution domain will again consist of the union of $B$ and $A$ when the satellite observables $G_j^{(e,obs)}(t)$ and $G_j^{(i,obs)}(t)$ are taken as observations.

Another modification concerns the boundary condition (16). Making use of Eq. 45 and formulae (218) and (219) for the scalar and vector products of $\mathbf{e}_r$ with the spheroidal vector spherical harmonics $\mathbf{Y}_j^{j\pm1}(\vartheta)$, the continuity of the normal and tangential components of the magnetic induction vector $\mathbf{B}$ on the boundary $\partial B$, see Eqs. 15 and 20, is of the form

\[ \left[\mathbf{n}\cdot\operatorname{curl}\mathbf{A}(a)\right]_j = -j\,G_j^{(e)} + (j+1)\,G_j^{(i)}, \tag{109} \]
\[ \left[\mathbf{n}\times\operatorname{curl}\mathbf{A}(a)\right]_j^j = -\sqrt{j(j+1)}\left(G_j^{(e)} + G_j^{(i)}\right). \tag{110} \]

Combining these equations such that the external and internal Gauss coefficients are separated, and making the scalar product of $\mathbf{e}_r$ with Eq. 88 for $r = a$, that is,

\[ \left[\mathbf{n}\cdot\operatorname{curl}\mathbf{A}(a)\right]_j = -\frac{\sqrt{j(j+1)}}{a}\,A_j^j(a), \tag{111} \]

results in

\[
\begin{aligned}
-\frac{j}{a}\,A_j^j(a) + \left[\mathbf{n}\times\operatorname{curl}\mathbf{A}(a)\right]_j^j &= -\sqrt{\frac{j}{j+1}}\,(2j+1)\,G_j^{(e)}, \\
\frac{j+1}{a}\,A_j^j(a) + \left[\mathbf{n}\times\operatorname{curl}\mathbf{A}(a)\right]_j^j &= -\sqrt{\frac{j+1}{j}}\,(2j+1)\,G_j^{(i)}.
\end{aligned} \tag{112}
\]

The last two equations represent the boundary conditions, which will be used in the forward and adjoint IBVP of EM induction, respectively.

The forward IBVP of EM induction for the Gauss coefficient $G_j^{(e)}(t)$ can now be formulated as follows. Given the conductivity $\sigma(r,\vartheta)$ of the sphere $B$, the toroidal vector potential $\mathbf{A}$ is searched for such that, for $t > 0$, it holds that

\[ \frac{1}{\mu}\operatorname{curl}\operatorname{curl}\mathbf{A} + \sigma\frac{\partial\mathbf{A}}{\partial t} = 0 \quad \text{in } B \tag{113} \]

with the boundary condition

\[ \left[\mathbf{n}\times\operatorname{curl}\mathbf{A}(a)\right]_j^j - \frac{j}{a}\,A_j^j(a) = -\sqrt{\frac{j}{j+1}}\,(2j+1)\,G_j^{(e)} \quad \text{on } \partial B \tag{114} \]

and the inhomogeneous initial condition

\[ \mathbf{A}|_{t=0} = \mathbf{A}^0 \quad \text{in } B. \tag{115} \]

5.2 Weak Formulation
The IBVP (113)–(115) can again be reformulated in a weak sense. By this it is meant that $\mathbf{A}\in V\times C^1((0,\infty))$ is searched for such that at a fixed time it satisfies the following variational equation:

\[ a_1(\mathbf{A},\delta\mathbf{A}) + b(\mathbf{A},\delta\mathbf{A}) = f_1(\delta\mathbf{A}) \quad \forall\,\delta\mathbf{A}\in V, \tag{116} \]

where the solution space $V$ is defined by Eq. 64. The new bilinear form $a_1(\cdot,\cdot)$ and the new linear functional $f_1(\cdot)$ are expressed in terms of the original bilinear form $a(\cdot,\cdot)$ and the coefficient $A_j^j(a,t)$ as follows:

\[ a_1(\mathbf{A},\delta\mathbf{A}) = a(\mathbf{A},\delta\mathbf{A}) + \frac{a}{\mu}\sum_{j=1}^{\infty} j\,A_j^j(a,t)\,\delta A_j^j(a), \tag{117} \]
\[ f_1(\delta\mathbf{A}) = \frac{a^2}{\mu}\sum_{j=1}^{\infty}\sqrt{\frac{j}{j+1}}\,(2j+1)\,G_j^{(e)}(t)\,\delta A_j^j(a). \tag{118} \]

It should be emphasized that there is a difference in principle between the original variational equation (65) and the modification (116) in prescribing the boundary data on the surface $\partial B$. Equation 65 requires the prescription of the tangential components of the total magnetic induction in a vacuum on $\partial B$. Inspecting the functional $f(\cdot)$ in Eq. (91) shows that this requirement leads to the necessity to define the linear combinations $G_j^{(e)}(t) + G_j^{(i)}(t)$ for $j = 1, 2, \ldots$, as input boundary data for solving Eq. 65. In contrast to this scheme, the functional $f_1(\cdot)$ on the right-hand side of Eq. (116) only contains the spherical harmonic coefficients $G_j^{(e)}(t)$. Hence, to solve Eq. 116, only the spherical harmonic coefficients $G_j^{(e)}(t)$ of the external electromagnetic source need to be prescribed, while the spherical harmonic coefficients $G_j^{(i)}(t)$ of the induced magnetic field within the Earth are determined after solving Eq. (116) by means of Eq. 112:

\[ G_j^{(i)} = \frac{j}{j+1}\,G_j^{(e)} - \frac{1}{a}\sqrt{\frac{j}{j+1}}\,A_j^j(a). \tag{119} \]

The former scheme is advantageous in the case where there is no possibility of separating the external and internal parts of magnetic induction observations by spherical harmonic analysis. The latter scheme can be applied if such an analysis can be carried out or in the case when the external magnetic source is defined by a known physical process.
6 Time-Domain, Spectral Finite-Element Solution

Finally, the spectral finite-element solution to the IBVP of EM induction for CHAMP magnetic data is introduced. For the sake of simplicity, the case where the spherical harmonic coefficients $G_j^{(e)}(t)$ of the external electromagnetic source are considered as input observations is treated first. Introducing the finite-dimensional functional space as

\[ V_h := \left\{\delta\mathbf{A} = \sum_{j=1}^{j_{max}}\sum_{k=1}^{P+1}\delta A_j^{j,k}\,\psi_k(r)\,\mathbf{Y}_j^j(\vartheta)\right\}, \tag{120} \]

where $j_{max}$ and $P$ are finite cutoff degrees, the Galerkin method for approximating the solution of the variational equation (116) at a fixed time $t_{i+1}$ consists in finding ${}^{i+1}\mathbf{A}_h\in V_h$ such that

\[ a_1({}^{i+1}\mathbf{A}_h,\delta\mathbf{A}_h) + \frac{1}{\Delta t_i}\,b_1({}^{i+1}\mathbf{A}_h,\delta\mathbf{A}_h) = \frac{1}{\Delta t_i}\,b_1({}^{i}\mathbf{A}_h,\delta\mathbf{A}_h) + f_1({}^{i+1}G_j^{(e)},\delta\mathbf{A}_h) \quad \forall\,\delta\mathbf{A}_h\in V_h. \tag{121} \]

The discrete solution ${}^{i+1}\mathbf{A}_h$ of this system of equations is called the time-domain, spectral finite-element solution. For a given angular degree $j$ (and a fixed time $t_{i+1}$), there are $P+1$ unknown coefficients ${}^{i+1}A_j^{j,k}$ in the system (121) that describe the solution in the conducting sphere $B$. Once this system is solved, the coefficient ${}^{i+1}G_j^{(i)}$ of the induced magnetic field is computed by means of the continuity condition (119).

The time-domain, spectral finite-element solution can similarly be introduced to the IBVP of EM induction for the CHAMP magnetic data in the case where the spherical harmonic expansion coefficients $X_j(t)$ of the X component of the magnetic induction vector $\mathbf{B}_0$ measured at satellite altitudes are considered as input observations. Besides the functional space $V_h$, the finite-dimensional functional subspaces of the spaces $V_0$ and $V_\lambda$ are constructed by the following prescriptions:

\[ V_{0,h} := \left\{\delta\mathbf{A}_0 = a\sum_{j=1}^{j_{max}}\left[\sqrt{\frac{j}{j+1}}\left(\frac{r}{a}\right)^{j}\delta G_j^{(e)} - \sqrt{\frac{j+1}{j}}\left(\frac{a}{r}\right)^{j+1}\delta G_j^{(i)}\right]\mathbf{Y}_j^j(\vartheta)\right\}, \tag{122} \]
\[ V_{\lambda,h} := \left\{\delta\boldsymbol{\lambda} = \sum_{j=1}^{j_{max}}\delta\lambda_j^j\,\mathbf{Y}_j^j(\vartheta)\right\}. \tag{123} \]

The Galerkin method for approximating the solution of the variational equation (75) at a fixed time $t_{i+1}$ consists in finding ${}^{i+1}\mathbf{A}_h\in V_h$, ${}^{i+1}\mathbf{A}_{0,h}\in V_{0,h}$, and ${}^{i+1}\boldsymbol{\lambda}_h\in V_{\lambda,h}$ satisfying the variational equation

\[
\begin{aligned}
&a({}^{i+1}\mathbf{A}_h,\delta\mathbf{A}_h) + \frac{1}{\Delta t_i}\,b_1({}^{i+1}\mathbf{A}_h,\delta\mathbf{A}_h) + a_0({}^{i+1}\mathbf{A}_{0,h},\delta\mathbf{A}_{0,h}) + c(\delta\mathbf{A}_h-\delta\mathbf{A}_{0,h},{}^{i+1}\boldsymbol{\lambda}_h) + c({}^{i+1}\mathbf{A}_h-{}^{i+1}\mathbf{A}_{0,h},\delta\boldsymbol{\lambda}_h) \\
&\quad = \frac{1}{\Delta t_i}\,b_1({}^{i}\mathbf{A}_h,\delta\mathbf{A}_h) + F({}^{i+1}\mathbf{B}_t,\delta\mathbf{A}_{0,h}) \quad \forall\,\delta\mathbf{A}_h\in V_h,\ \forall\,\delta\mathbf{A}_{0,h}\in V_{0,h},\ \forall\,\delta\boldsymbol{\lambda}_h\in V_{\lambda,h}.
\end{aligned} \tag{124}
\]

For a given angular degree $j$ (and a fixed time $t_{i+1}$), the unknowns in Eq. 124 consist of $P+1$ coefficients ${}^{i+1}A_j^{j,k}$ describing the solution in the conducting sphere $B$, the coefficients ${}^{i+1}G_j^{(e)}$ and ${}^{i+1}G_j^{(i)}$ describing the solution in the nonconducting spherical layer $A$, and the coefficient ${}^{i+1}\lambda_j^j$ ensuring the continuity of the potentials ${}^{i+1}\mathbf{A}$ and ${}^{i+1}\mathbf{A}_0$ on the Earth's surface $\partial B$. In total, there are $P+4$ unknowns in the system for a given $j$.

Martinec et al. (2003) tested the time-domain, spectral finite-element method for the spherical harmonic coefficients $G_j^{(e)}(t)$, described by the variational equation (121), by comparing the results with the analytical and semi-analytical solutions to EM induction in two concentrically and eccentrically nested spheres of different, but constant electrical conductivities. They showed that the numerical code implementing the time-domain, spectral finite-element method for $G_j^{(e)}(t)$ performs correctly, and that the time-domain, spectral finite-element method is particularly appropriate when the external current excitation is transient. Later on, Martinec and McCreadie (2004) made use of these results and tested the time-domain, spectral finite-element method for satellite magnetic data, described by the variational equation (124), by comparing it with the time-domain, spectral finite-element method for ground magnetic data. They showed that agreement between the numerical results of the two methods for synthetic data is excellent.
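A minimal Python sketch of the matrices behind Eq. 121 for a single degree $j$ is given below. It is not the code tested by Martinec et al. (2003); it is an illustration under simplifying assumptions that are not made in the chapter: the conductivity is constant on each radial element and independent of colatitude (so that the degrees decouple), and the zonal harmonics are normalized so that the angular integral in Eq. 90 equals one. All function names are hypothetical.

```python
import numpy as np

GAUSS_NODES = np.array([-1.0, 1.0]) / np.sqrt(3.0)  # two-point Gauss-Legendre nodes, weights 1

def assemble_degree(j, nodes, sigma, mu):
    """Assemble the matrices of a1(.,.) and b1(.,.) for one degree j.

    Assumed here: sigma[k] is constant on element k and independent of colatitude.
    """
    n = len(nodes)
    K = np.zeros((n, n))
    M = np.zeros((n, n))
    c_lo = np.sqrt((j + 1) / (2 * j + 1))
    c_hi = np.sqrt(j / (2 * j + 1))
    for k in range(n - 1):
        r_k, r_k1 = nodes[k], nodes[k + 1]
        h = r_k1 - r_k
        for x in GAUSS_NODES:
            r = 0.5 * (h * x + r_k + r_k1)
            weight = 0.5 * h * r**2
            psi = np.array([(r_k1 - r) / h, (r - r_k) / h])  # Eq. 92
            dpsi = np.array([-1.0 / h, 1.0 / h])
            R_lo = -c_lo * (dpsi + (j + 1) / r * psi)        # Eq. 94, l = j - 1
            R_hi = -c_hi * (dpsi - j / r * psi)              # Eq. 94, l = j + 1
            K[k:k + 2, k:k + 2] += weight / mu * (np.outer(R_lo, R_lo) + np.outer(R_hi, R_hi))
            M[k:k + 2, k:k + 2] += weight * sigma[k] * np.outer(psi, psi)
    K[-1, -1] += nodes[-1] * j / mu                           # surface term of Eq. 117
    return K, M

def load_vector(j, nodes, g_ext, mu):
    """Right-hand side of Eq. 118; only the surface node r = a is loaded."""
    a = nodes[-1]
    f = np.zeros(len(nodes))
    f[-1] = a**2 / mu * np.sqrt(j / (j + 1)) * (2 * j + 1) * g_ext
    return f

def internal_coefficient(j, a, g_ext, A_surface):
    """Eq. 119: induced coefficient G_j^(i) from G_j^(e) and A_j^j(a)."""
    return j / (j + 1) * g_ext - np.sqrt(j / (j + 1)) * A_surface / a
```

One time level of Eq. 121 then amounts to solving the linear system with matrix `K + M / dt` and right-hand side `M @ x_old / dt + f`. In the static limit (no time derivative) the induced coefficient returned by `internal_coefficient` should vanish, which provides a basic check of the assembly.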
7 CHAMP Data Analysis

7.1 Selection and Processing of Vector Data

The data analyzed in this chapter were recorded by the three-component vector magnetometer on board CHAMP. To demonstrate the performance of the forward method, from all records spanning more than 8 years, the 1-year-long time series from January 1, 2001 (track No. 2610), to January 10, 2002 (track No. 8402), has been selected. Judging from the Dst index (Fig. 4), there were about ten events when the geomagnetic field was significantly disturbed by magnetic storms or substorms. In order to minimize the effect of strong day-side ionospheric currents, only night-side data recorded by the satellite between 18:00 and 6:00 local solar time are used. In the first step of the data processing, the CHAOS model of the Earth's magnetic field (Olsen et al. 2006a) is used to separate the signals corresponding to EM induction by storm-time magnetospheric currents. Based on the CHAOS model, the main and crustal fields up to degree 50 and the secular variation up to degree 18 are removed from the CHAMP data. In the next step, the horizontal magnetic components $(X, Y)$ are rotated from geographic coordinates to dipole coordinates, assuming that the north geomagnetic pole is at 78.8°N, 70.7°W. Since an axisymmetric geometry of external currents and mantle electrical conductivity is assumed, the dipolar longitudinal component $Y$ is not considered hereafter and $X$ and $Z$ are used to describe the northward and downward magnetic components in dipolar coordinates, respectively. Figure 1 shows an example of the original and rotated data from CHAMP track No. 6755.
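As a small illustration of the change to dipole coordinates, the following Python sketch computes the geomagnetic colatitude of a point from its geographic latitude and longitude, using the pole position quoted above and the spherical law of cosines. The function name is hypothetical, and the sketch covers only the colatitude; the rotation of the $(X, Y)$ components themselves additionally requires the local angle between the geographic and dipole meridians, which is not shown here.

```python
import numpy as np

POLE_LAT_DEG = 78.8    # north geomagnetic pole latitude quoted in the text (deg N)
POLE_LON_DEG = -70.7   # 70.7 deg W expressed as an east longitude

def geomagnetic_colatitude(lat_deg, lon_deg):
    """Angular distance (deg) of a point from the assumed north geomagnetic pole."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    lat_p, lon_p = np.radians(POLE_LAT_DEG), np.radians(POLE_LON_DEG)
    cos_theta = (np.sin(lat) * np.sin(lat_p)
                 + np.cos(lat) * np.cos(lat_p) * np.cos(lon - lon_p))
    # Clip to guard against round-off slightly outside [-1, 1].
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
```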
7.2 Two-Step, Track-by-Track Spherical Harmonic Analysis

[Figure 1: Track No. 6755. Left panels: $X_g$, $Y_g$, and $Z$ (in $10^3$ nT) against geographic colatitude (0–180°). Right panels: $X$ and $Z$ (in nT) against geomagnetic colatitude (0–180°).]

Fig. 1 CHAMP satellite magnetic data along track No. 6755 (red line on global map shows the satellite track), which samples the initial phase of a magnetic storm on September 26, 2001, above the East Pacific Ocean. Left panels: the original CHAMP data plotted along geographical colatitude. $X_g$, $Y_g$, and $Z$ components point, respectively, to the geographic north, the geographic east, and downwards. Right panels: Black lines denote $X$ and $Z$ CHAMP components after the removal of the CHAOS model and the rotation of the residual field to dipole coordinates. The red lines show the results of the two-step, track-by-track spherical harmonic analysis, including the extrapolation into the polar regions using data from the mid-colatitude interval (40°, 140°), as marked by dotted lines

The input data of the two-step, track-by-track spherical harmonic analysis are the samples of the X component of the residual magnetic signal along an individual satellite track, that is, data set $(\vartheta_i, X_i)$, $i = 1, \ldots, N$, where $\vartheta_i$ is the geomagnetic colatitude of the $i$th measurement site and $N$ is the number of data points. Only the magnetic data from low and mid-latitudes within the interval $(\vartheta_1, \vartheta_2)$ are considered, in accordance with the assumption that global EM induction is driven by the equatorial ring currents in the magnetosphere. Hence, observations from the polar regions, which are contaminated by signals from field-aligned currents and polar electrojets, are excluded from the analyzed time series. The satellite-track data $(\vartheta_i, X_i)$ are referenced to the time when CHAMP passes the magnetic equator. In view of parameterization (47), $N$ observational equations for data $X_i$ are considered in the form

\[ \sum_{j=1}^{j_{max}} X_j(t)\,\frac{\partial Y_j(\vartheta_i)}{\partial\vartheta} + e_i = X_i, \qquad i = 1, \ldots, N, \tag{125} \]
where $X_j(t)$ are the expansion coefficients to be determined by a least-squares method and $j_{max}$ is the cutoff degree. The measurement errors $e_i$ are assumed to have zero means and uniform variances $\sigma^2$ and to be uncorrelated:

\[ E\,e_i = 0, \qquad \operatorname{var} e_i = \sigma^2, \qquad \operatorname{cov}(e_i, e_j) = 0 \ \text{ for } i \ne j, \tag{126} \]

where $E$, var, and cov are the statistical expectation, the variance, and the covariance operator, respectively. The spherical harmonic analysis of satellite-track magnetic measurements of the X component of the magnetic induction vector is performed in two steps.
Change of the Interval of Orthogonality

In the first step, the data $X_i$ are mapped from the mid-latitude interval $\vartheta\in(\vartheta_1,\vartheta_2)$ onto the half-circle interval $\vartheta'\in(0,\pi)$ by the linear transformation

\[ \vartheta'(\vartheta) = \pi\,\frac{\vartheta-\vartheta_1}{\vartheta_2-\vartheta_1}, \tag{127} \]

and then adjusted by a series of Legendre polynomials:

\[ X(\vartheta') = \sum_{j=0}^{N'} X_j'\,Y_j(\vartheta'). \tag{128} \]

Likewise, the samples of the Z component of the residual magnetic signal along an individual satellite track, that is, data set $(\vartheta_i, Z_i)$, are first mapped from the mid-latitude interval $\vartheta\in(\vartheta_1,\vartheta_2)$ onto the half-circle interval $\vartheta'\in(0,\pi)$ and then expanded into a series of Legendre polynomials:

\[ Z(\vartheta') = \sum_{j=0}^{N'} Z_j'\,Y_j(\vartheta'). \tag{129} \]

The expansion coefficients $X_j'$ and $Z_j'$ are determined by fitting the models (128) and (129) to mid-latitude magnetic data $X_i$ and $Z_i$, respectively. Since the accuracy of the CHAMP magnetic measurements is high, both long-wavelength and short-wavelength features of the mid-latitude data are adjusted. That is why the cutoff degree $N'$ is chosen to be large. In the following numerical examples, $N' = 25$, while the number of data points is $N = 1{,}550$. Because of data errors, the observational equations based on the models (128) and (129) are inconsistent and an exact solution to these systems does not exist. The solution to each system of equations is estimated by a least-squares method. Since this method is well documented in the literature (e.g., Bevington 1969), no details are given.
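The first step can be sketched in Python as follows. The sketch assumes unnormalized Legendre polynomials $P_j(\cos\vartheta')$ as the expansion functions, whereas the chapter's $Y_j$ may carry a different normalization; the function names are hypothetical, and NumPy's `legvander` and `lstsq` are used for the design matrix and the least-squares solution.

```python
import numpy as np
from numpy.polynomial import legendre

def map_colatitude(theta, theta1, theta2):
    """Eq. 127: map (theta1, theta2) linearly onto (0, pi); angles in radians."""
    return np.pi * (np.asarray(theta, dtype=float) - theta1) / (theta2 - theta1)

def fit_legendre(theta, values, theta1, theta2, n_prime):
    """Least-squares coefficients of the models (128)/(129) up to degree n_prime.

    Assumption: unnormalized Legendre polynomials P_j(cos theta') are used.
    """
    theta_p = map_colatitude(theta, theta1, theta2)
    design = legendre.legvander(np.cos(theta_p), n_prime)  # columns P_0 ... P_N'
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(values, dtype=float), rcond=None)
    return coeffs
```

With $N = 1{,}550$ samples and $N' = 25$ the system is strongly overdetermined, so the least-squares solution averages out uncorrelated measurement errors of the kind described by Eq. 126.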
The Forward and Adjoint Methods of Global Electromagnetic Induction for. . .
1019
Extrapolation of Magnetic Data from Mid-latitudes to Polar Regions
When the analysis of mid-latitude data $X_i$ is complete, the signal that best fits the mid-latitude data is extrapolated to the polar regions. To do so, it is required that the original parameterization (125) of the X component matches that found in the previous step:
$$\sum_{j=1}^{j_{\max}} X_j(t) \left.\frac{\partial Y_j(\vartheta)}{\partial \vartheta}\right|_{\vartheta(\vartheta')} = \sum_{k=0}^{N'} X'_k\, Y_k(\vartheta'), \tag{130}$$
where $\vartheta = \vartheta(\vartheta')$ denotes the inverse mapping to (127) and the coefficients $X'_k$ are known from the previous step. To determine $X_j(t)$, the orthonormality property of $Y_k(\vartheta')$ is used and the extrapolation condition (130) is rewritten as a system of linear algebraic equations:
$$2\pi \sum_{j=1}^{j_{\max}} X_j(t) \int_{\vartheta'=0}^{\pi} \left.\frac{\partial Y_j(\vartheta)}{\partial \vartheta}\right|_{\vartheta(\vartheta')} Y_k(\vartheta') \sin\vartheta' \, d\vartheta' = X'_k \tag{131}$$
for $k = 0, 1, \ldots, N'$. In a similar way, the extrapolation condition for the Z component can be expressed as
$$2\pi \sum_{j=1}^{j_{\max}} Z_j(t) \int_{\vartheta'=0}^{\pi} Y_j(\vartheta(\vartheta'))\, Y_k(\vartheta') \sin\vartheta' \, d\vartheta' = Z'_k. \tag{132}$$
In contrast to the previous step, only long-wavelength features of mid-latitude data are extrapolated to the polar regions; thus, $j_{\max} \ll N'$. In the following numerical examples, only the range $2 \le j_{\max} \le 6$ is considered, depending on the character of the mid-latitude data. This choice implies that both systems of linear equations are overdetermined and are solved by a least-squares method. The least-squares estimates of the coefficients $X_j(t)$ and $Z_j(t)$ will be denoted by $X_j^{(\mathrm{obs})}(t)$ and $Z_j^{(\mathrm{obs})}(t)$, respectively. Respective substitutions of $X_j^{(\mathrm{obs})}(t)$ and $Z_j^{(\mathrm{obs})}(t)$ into Eqs. 47 and 49 yield smooth approximations of the X and Z components inside the colatitude interval $(\vartheta_1, \vartheta_2)$ as well as undisturbed extrapolations into the polar regions $(0^\circ, \vartheta_1) \cup (\vartheta_2, 180^\circ)$.
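The overdetermined system (131) can be sketched numerically as follows. This is an illustration under stated assumptions, not the chapter's code: $Y_j$ is again taken as $\sqrt{(2j+1)/(4\pi)}\,P_j(\cos\vartheta)$, the integral over $\vartheta'$ is evaluated by Gauss-Legendre quadrature in $x' = \cos\vartheta'$, and the names `extrapolation_matrix` and `solve_extrapolation` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as L


def Y(j, theta):
    """Zonal scalar spherical harmonic (assumed unit-sphere normalization)."""
    c = np.zeros(j + 1)
    c[j] = 1.0
    return np.sqrt((2 * j + 1) / (4 * np.pi)) * L.legval(np.cos(theta), c)


def dY_dtheta(j, theta):
    """Colatitude derivative: dY_j/dtheta = -sin(theta) * N_j * P_j'(cos theta)."""
    c = np.zeros(j + 1)
    c[j] = 1.0
    dP = L.legval(np.cos(theta), L.legder(c))
    return -np.sin(theta) * np.sqrt((2 * j + 1) / (4 * np.pi)) * dP


def extrapolation_matrix(j_max, n_prime, theta1, theta2, n_quad=200):
    """Matrix M[k, j-1] of Eq. (131) for k = 0..n_prime and j = 1..j_max."""
    x, w = L.leggauss(n_quad)                # nodes and weights in x' = cos(theta')
    theta_p = np.arccos(x)
    theta = theta1 + (theta2 - theta1) * theta_p / np.pi   # inverse of Eq. (127)
    M = np.empty((n_prime + 1, j_max))
    for k in range(n_prime + 1):
        for j in range(1, j_max + 1):
            # sin(theta') d(theta') is absorbed into the quadrature in x'.
            M[k, j - 1] = 2 * np.pi * np.sum(w * dY_dtheta(j, theta) * Y(k, theta_p))
    return M


def solve_extrapolation(M, x_prime):
    """Least-squares estimate of X_1..X_jmax from the coefficients X'_0..X'_N'."""
    coef, *_ = np.linalg.lstsq(M, x_prime, rcond=None)
    return coef
```

For $N' = 25$ and $j_{\max} = 3$, `extrapolation_matrix` returns a 26-by-3 matrix, which makes the overdetermination mentioned in the text explicit. The system (132) for the Z component follows by replacing `dY_dtheta(j, theta)` with `Y(j, theta)`.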
Selection Criteria for Extrapolation
The crucial points of the extrapolation are the choice of the truncation degree $j_{\max}$ of the parameterization (125) and the determination of the colatitude interval $(\vartheta_1, \vartheta_2)$ where the data are not disturbed by the polar currents. Martinec and McCreadie (2004) and Velímský et al. (2006) imposed three criteria to determine these two parameters. First, the power of the magnetic field from the external ring currents is concentrated in the low-degree harmonic coefficients, particularly in the $j = 1$ term, and the leakage of electromagnetic energy into higher-degree terms
caused by the Earth's conductivity and electric-current geometry monotonically decreases. This criterion is applied in such a way that the analysis begins with degree $j_{\max} = 1$, increases it by one, and plots the degree-power spectrum of the coefficients $X_j^{(\mathrm{obs})}(t)$. While the degree-power spectrum is a monotonically decreasing function of angular degree $j$, increasing the cutoff degree $j_{\max}$ is continued. Once the degree-power spectrum of $X_j^{(\mathrm{obs})}(t)$ no longer decreases monotonically, the actual cutoff degree is taken from the previous step for which the degree-power spectrum still monotonically decreases. The degree-power spectrum of coefficients $X_j^{(\mathrm{obs})}(t)$ for the final choice of cutoff degree $j_{\max}$ is shown in the third-row panels of Fig. 3. This criterion can be interpreted as follows. The largest proportion of the magnetospheric ring-current excitation energy is concentrated in the low-degree harmonic coefficients, particularly in the $j = 1$ term. The leakage of the electromagnetic energy from degree $j = 1$ to higher degrees is caused by lateral heterogeneities in the electrical conductivity of the Earth's mantle. The more pronounced the lateral heterogeneities, the larger the transport of energy from degree $j = 1$ to higher degrees. Accepting the criterion of a monotonically decreasing degree-power spectrum means therefore that the Earth's mantle is regarded as only weakly laterally heterogeneous.
Second, the first derivative of the X component with respect to colatitude does not change sign in the polar regions. This criterion excludes unrealistic oscillatory behavior of the X component in these regions caused by a high-degree extrapolation.
Third, if the least-squares estimate of the X component of CHAMP data over the colatitude interval $(\vartheta_1 - 5^\circ, \vartheta_2 + 5^\circ)$ differs by more than 10 nT compared to the estimate over the interval $(\vartheta_1, \vartheta_2)$, the field due to the polar currents is assumed to encroach upon the field produced by near-equatorial ring currents, and the narrower colatitude interval $(\vartheta_1, \vartheta_2)$ is considered to contain only the signature generated by the near-equatorial currents.
Applying these criteria to the CHAMP-track data iteratively, starting from degree $j = 1$ and the colatitude interval $(10^\circ, 170^\circ)$ and proceeding to higher degrees and shorter colatitude intervals, it is found that the maximum cutoff degree varies from track to track, but does not exceed $j_{\max} = 6$, and the colatitude interval is usually $(40^\circ, 140^\circ)$. The extrapolation of the Z component from the field at low and mid-latitudes is more problematic than that for the X component. This is because (i) the second selection criterion cannot be applied since the Z component does not approach zero at the magnetic poles, as seen from parameterization (49), and (ii) the Z component of CHAMP magnetic data contains a larger portion of high-frequency noise than the X component, which, in principle, violates the assumption of the third selection criterion. Figure 5 shows that the leakage of electromagnetic energy from $j = 1$ to higher-degree terms is not monotonically decreasing for the Z component. That is why the least-squares estimates $Z_j^{(\mathrm{obs})}(t)$ are extrapolated to polar regions from the colatitude interval $(\vartheta_1, \vartheta_2)$ and up to the spherical degree $j_{\max}$ determined for the X component.
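The first selection criterion amounts to a simple stopping rule, sketched below. The sketch makes two assumptions that are not in the text: the degree power is taken as the squared zonal coefficient, and the least-squares fit for a given cutoff is supplied by a hypothetical callback `fit(j_max)` returning $X_1, \ldots, X_{j_{\max}}$. The second and third criteria are not implemented here.

```python
import numpy as np


def degree_power(coeffs):
    """Assumed degree-power measure: squared coefficient per degree."""
    return np.asarray(coeffs, dtype=float) ** 2


def choose_cutoff(fit, j_limit=6):
    """Raise j_max from 1 while the degree-power spectrum decreases monotonically.

    `fit` is a hypothetical callback: fit(j_max) -> coefficients for j = 1..j_max.
    The last cutoff with a strictly decreasing spectrum is returned.
    """
    best = 1
    for j_max in range(2, j_limit + 1):
        power = degree_power(fit(j_max))
        if np.any(np.diff(power) >= 0.0):
            break                      # spectrum no longer decreases: keep previous
        best = j_max
    return best


# Toy usage with a fixed coefficient table standing in for real fits.
table = np.array([10.0, 3.0, 1.0, 2.0, 0.5, 0.1])
print(choose_cutoff(lambda j_max: table[:j_max]))  # prints 3
```

In a real analysis the fitted coefficients change with every new cutoff, so `fit` would rerun the least-squares solution of system (131) each time rather than slice a fixed table.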
Examples of Spherical Harmonic Analysis of the CHAMP Magnetic Data
Presented here are four examples of the spherical harmonic analysis of the CHAMP magnetic data recorded in the period from September 25 to October 7, 2001. This period is chosen because it includes a magnetic storm followed by a magnetic substorm, as seen from the behavior of the Dst index (see Fig. 2). For demonstration purposes, four CHAMP-track data sets are chosen: the data recorded along track No. 6732 as an example of data analysis before a magnetic storm occurs; track No. 6755 represents the main phase of a magnetic storm; track No. 6780 represents the recovery phase of a storm; and track No. 6830 represents the appearance of a substorm. In Fig. 3, the X component of the original CHAMP magnetic data reduced by the main magnetic field and the lithospheric magnetic field is shown. The top panels show the residual magnetic signals for the night-time mid-latitudes and the filtered signals after the first step of the spherical harmonic analysis has been performed. The mid-latitude data $X_i$ are adjusted by the model (128) rather well by choosing $N' = 25$. For the sake of completeness, the second-row panels of Fig. 3 show the degree-power spectrum of the coefficients $X'_j$. The degree-power spectrum of the coefficients $X_j^{(\mathrm{obs})}(t)$ for the cutoff degree $j_{\max}$ chosen according to the first selection criterion is shown in the third-row panels of Fig. 3. The bottom panels of Fig. 3 show the residual signals over the whole night-time track derived from the CHAMP observations and the signals extrapolated from low-latitude and mid-latitude data. First, the well-known fact can be seen that the original magnetic data are disturbed at the polar regions by sources other than equatorial ring currents in the magnetosphere. Second, since there is no objective criterion for evaluating the quality of the extrapolation of the X component to the polar regions, it is regarded subjectively.
For the track data shown here, but also for the other data for the magnetic storm considered, the extrapolation of the X component from mid-latitudes to the polar regions works reasonably well, provided that the cutoff
Fig. 2 The Dst index for the magnetic storm that occurred between September 25 and October 7, 2001. The arrows mark the satellite tracks chosen to demonstrate the two-step, track-by-track spherical harmonic analysis of satellite magnetic data. (Axes: time after storm onset in days and track number, horizontal; Dst in nT, vertical)
Fig. 3 Examples of the two-step, track-by-track spherical harmonic analysis of magnetic signals along four satellite tracks. The top panels show the X component of the residual magnetic signals at the night-time mid-latitudes derived from the CHAMP magnetic observations (thin lines) and the predicted signals after the first step of the spherical harmonic analysis has been completed (thick lines). The number of samples in the original signals is $N = 1{,}550$. The second- and third-row panels show the degree-power spectrum of the coefficients $X'_j$ and $X_j(t)$, respectively. The cutoff degree of the coefficients $X'_j$ is fixed to $N' = 25$, while the cutoff degree $j_{\max}$ of the coefficients $X_j(t)$ is found by the criteria discussed in the text. The bottom panels show the X component of the residual magnetic signals over the whole night-time tracks (thin lines) and the signals extrapolated from mid-latitude data according to the second step of the spherical harmonic analysis (thick lines). The longitude when the CHAMP satellite crosses the equator of the geocentric coordinate system is 55.19°, 127.19°, 97.15°, and 174.23° for tracks No. 6732, 6755, 6780, and 6830, respectively
degree $j_{\max}$ and the colatitude interval $(\vartheta_1, \vartheta_2)$ are chosen according to the criteria introduced above. The procedure applied to the 2001-CHAMP-track data results in time series of spherical harmonic coefficients $X_j^{(\mathrm{obs})}(t)$ and $Z_j^{(\mathrm{obs})}(t)$ for $j = 1, \ldots, 4$. As an example, the resulting coefficients for degree $j = 1$ are plotted in Fig. 4 as functions of time after January 1, 2001. As expected, there is a high correlation between the first-degree harmonics $X_1^{(\mathrm{obs})}(t)$ and $Z_1^{(\mathrm{obs})}(t)$ and the Dst index for the days that experienced a magnetic storm.
7.3 Power-Spectrum Analysis
Although the method applied in this chapter is based on the time-domain approach, it is valuable to inspect the spectra of the $X_j^{(\mathrm{obs})}(t)$ and $Z_j^{(\mathrm{obs})}(t)$ time series. Figure 5 shows the maximum-entropy power-spectrum estimates (Press et al. 1992, Sect. 7) of the first four spherical harmonics of the horizontal and vertical components. It can be seen that the magnitudes of the power spectra of the X component monotonically decrease with increasing harmonic degree, which is a consequence of the first selection criterion applied in the two-step, track-by-track analysis. For instance, the power spectrum of the second-degree terms is about two orders of magnitude smaller than that of the first-degree terms. As already mentioned, and also seen in Fig. 5, this is not the case for the Z component, where the magnitude of the maximum-entropy power-spectrum of the Z component is larger than that of the X component for $j > 1$, which demonstrates that the Z component of the CHAMP magnetic data contains a larger portion of high-frequency noise than the X component. Despite analyzing only night-side tracks, there is a significant peak at the period of 1 day in the power spectra of the higher-degree harmonics ($j \ge 2$), but, surprisingly, missing in the spectra of the first-degree harmonic. To eliminate the induction effect of residual dawn/dusk ionospheric electric currents, the night-side local-solar time interval is shrunk from (18:00, 6:00) to (22:00, 4:00). However, a 1-day period signal remains present in the CHAMP residual signal (not shown here). To locate a region of potential inducing electric currents, time series of $X_j^{(\mathrm{obs})}(t)$ and $Z_j^{(\mathrm{obs})}(t)$ coefficients are converted to time series of spherical harmonic coefficients $G_j^{(e,\mathrm{obs})}(t)$ and $G_j^{(i,\mathrm{obs})}(t)$ of the external and internal fields counted with respect to the CHAMP satellite altitude by applying Eq. 107. The maximum-entropy power-spectrum estimates of the external and internal coefficients $G_j^{(e,\mathrm{obs})}(t)$ and $G_j^{(i,\mathrm{obs})}(t)$ are shown in Fig. 6. It can be seen that these spectra for degrees $j = 2$–$4$ also have a peak at a period of 1 day. This means that at least part of $G_j^{(e,\mathrm{obs})}(t)$ originates in the magnetosphere or even the magnetopause and magnetic tail, while $G_j^{(i,\mathrm{obs})}(t)$ may originate from the residual night-side ionospheric currents and/or the electric currents induced in the Earth by either effect.
Fig. 4 Time series of the spherical harmonic coefficients $X_1^{(\mathrm{obs})}(t)$ (red) and $Z_1^{(\mathrm{obs})}(t)$ (blue) of horizontal and vertical components obtained by the two-step, track-by-track spherical harmonic analysis of CHAMP data for the year 2001. A mean and linear trend have been removed following Olsen et al. (2005). The coefficients from the missing tracks are filled by cubic spline interpolation applied to the detrended time series. Note that the sign of the $X_1$ component is opposite to that of the Dst index (black line). Time on the horizontal axis is measured from midnight of January 1, 2000. (Axes: Time (MD2000) in days, horizontal; $X_1$, $Z_1$, Dst in nT, vertical)
Fig. 5 The maximum-entropy power-spectrum estimates of the spherical harmonic coefficients of $X_j^{(\mathrm{obs})}(t)$ (top panel) and $Z_j^{(\mathrm{obs})}(t)$ (bottom panel) components. Degrees $j = 1, 2, 3$, and 4 are shown by black, red, blue, and green lines, respectively. The spectra have peaks at higher harmonics of the 27-day solar rotation period, that is, at periods of 9, 6.8, 5.6, 4.8 days, etc. (Axes: period in days, horizontal; $P_{\mathrm{ME}}$ in nT$^2$, vertical; panels: X-component, Z-component)
Figure 6 also shows that, while the periods of peak values in the external and internal magnetic fields for degree $j = 1$ correspond to each other, for the higher-degree spherical harmonic coefficients, such a correspondence is only valid for some periods, for instance, 6.8, 5.6, or 4.8 days. However, the peak for the period of 8.5 days in the internal component for $j = 2$ is hardly detectable in the external field. This could be explained by a three-dimensionality effect in the electrical conductivity of the Earth's mantle that causes the leakage of electromagnetic energy from degree $j = 1$ to the second and higher-degree terms. This leakage may partly shift the characteristic periods in the resulting signal due to interference between signals with various spatial wavelengths and periods.
Fig. 6 As Fig. 5, but for the external and internal Gauss coefficients $G_j^{(e,\mathrm{obs})}(t)$ and $G_j^{(i,\mathrm{obs})}(t)$, counted with respect to the CHAMP satellite altitude. (Axes: period in days, horizontal; $P_{\mathrm{ME}}$ in nT$^2$, vertical; panels: external component, internal component)

8 Adjoint Sensitivity Method of EM Induction for the Z Component of CHAMP Magnetic Data
In this section, the adjoint sensitivity method of EM induction for computing the sensitivities of the Z component of CHAMP magnetic data with respect to the mantle's conductivity structure is formulated. Most of the considerations in this section follow the paper by Martinec and Velímský (2009).
8.1 Forward Method
The forward method of EM induction for the X component of CHAMP magnetic data was formulated in Sect. 4. In this case, the solution domain $G$ for EM induction modeling is the union of the conducting sphere $\mathcal{B}$ and the insulating spherical layer $\mathcal{A}$, that is, $G = \mathcal{B} \cup \mathcal{A}$, with the boundary $\partial G$ coinciding with the mean-orbit sphere, that is, $\partial G = \partial \mathcal{A}$. The forward IBVP (56)–(63) for the toroidal vector potential $\mathbf{A}$ in $G$ can then be written in the abbreviated form as
$$\frac{1}{\mu}\,\mathrm{curl}\,\mathrm{curl}\,\mathbf{A} + \sigma \frac{\partial \mathbf{A}}{\partial t} = 0 \quad \text{in } G \tag{133}$$
with the boundary condition
$$\mathbf{n} \times \mathrm{curl}\,\mathbf{A} = \mathbf{B}_t \quad \text{on } \partial G, \tag{134}$$
and the inhomogeneous initial condition
$$\mathbf{A}|_{t=0} = \mathbf{A}^0 \quad \text{in } G. \tag{135}$$
Note that the conductivity $\sigma = 0$ in the insulating atmosphere $\mathcal{A}$ implies that the second term in Eq. 133 vanishes in $\mathcal{A}$ and Eq. 133 reduces to Eq. 58.
8.2 Misfit Function and Its Gradient in the Parameter Space
Let the conductivity $\sigma(r, \vartheta)$ of the conducting sphere $\mathcal{B}$ now be represented in terms of an $M$-dimensional system of $r$- and $\vartheta$-dependent base functions and denote the expansion coefficients of this representation to be $\sigma_1, \sigma_2, \ldots, \sigma_M$. Defining the conductivity parameter vector $\vec{\sigma} := (\sigma_1, \sigma_2, \ldots, \sigma_M)$, the dependence of the conductivity $\sigma(r, \vartheta)$ on the parameters $\vec{\sigma}$ can be made explicit as
$$\sigma = \sigma(r, \vartheta; \vec{\sigma}). \tag{136}$$
In Sect. 4, it is shown that the solution of the IBVP for CHAMP magnetic data enables the modeling of the time evolution of the normal component $B_n := \mathbf{n} \cdot \mathbf{B}$ of the magnetic induction vector on the mean-orbit sphere $\partial G$ along the satellite tracks. These predicted data $B_n(\vec{\sigma})$ can be compared with the observations $B_n^{(\mathrm{obs})} = -Z^{(\mathrm{obs})}$ of the normal component of the magnetic induction vector by the CHAMP onboard magnetometer. The differences between observed and predicted values can then be used as a misfit for the inverse EM induction modeling. The adjoint method of EM induction presented hereafter calculates the sensitivity of the forward-modeled data $B_n(\vec{\sigma})$ on the conductivity parameters $\vec{\sigma}$ by making use of the differences $B_n^{(\mathrm{obs})} - B_n(\vec{\sigma})$ as boundary-value data.
Let the observations $B_n^{(\mathrm{obs})}$ be made for times $t \in (0, T)$ such that, according to assumption (2) in Sect. 2, $B_n^{(\mathrm{obs})}(\vartheta, t_i)$ at a particular time $t_i \in (0, T)$ corresponds to the CHAMP observations along the $i$th satellite track. The least-squares misfit is then defined as
$$\chi^2(\vec{\sigma}) := \frac{b}{2\mu} \int_0^T \int_{\partial G} w_b^2 \left[ B_n^{(\mathrm{obs})} - B_n(\vec{\sigma}) \right]^2 dS\, dt, \tag{137}$$
where the weighting factor $w_b = w_b(\vartheta, t)$ is chosen to be dimensionless such that the misfit has the SI unit m$^3$ s T$^2$/$[\mu]$, $[\mu]$ = kg m s$^{-2}$ A$^{-2}$. If the observations $B_n^{(\mathrm{obs})}$ contain random errors which are statistically independent, the statistical variance of the observations may be substituted for the reciprocal value of $w_b^2$ (e.g., Bevington 1969, Sects. 6–4). The $\vartheta$ dependence of $w_b$ allows the elimination of the track data from the polar regions which are contaminated by signals from field-aligned currents and polar electrojets, while the time dependence of $w_b$ allows the elimination of the track data for time instances when other undesirable magnetic effects at low and mid-latitudes contaminate the signal excited by equatorial ring currents in the magnetosphere. The sensitivity analysis or inverse modeling requires the computation of the partial derivative of the misfit with respect to the model parameters, that is, the derivatives $\partial \chi^2 / \partial \sigma_m$, $m = 1, \ldots, M$, often termed the sensitivities of the misfit with respect to the model parameters $\sigma_m$ (e.g., Sandu et al. 2003). To abbreviate the notation, the partial derivatives with respect to the conductivity parameters are ordered in the gradient operator in the $M$-dimensional parameter space:
$$\nabla_{\vec{\sigma}} := \sum_{m=1}^{M} \hat{\sigma}_m \frac{\partial}{\partial \sigma_m}, \tag{138}$$
where the hat in $\hat{\sigma}_m$ indicates a unit vector. Realizing that the observations $B_n^{(\mathrm{obs})}$ are independent of the conductivity parameters $\vec{\sigma}$, that is, $\nabla_{\vec{\sigma}} B_n^{(\mathrm{obs})} = 0$, the gradient of $\chi^2(\vec{\sigma})$ is
$$\nabla_{\vec{\sigma}} \chi^2 = -\frac{b}{\mu} \int_0^T \int_{\partial G} \Delta B_n(\vec{\sigma})\, \nabla_{\vec{\sigma}} B_n \, dS\, dt, \tag{139}$$
where $\Delta B_n(\vec{\sigma})$ are the weighted residuals of the normal component of the magnetic induction vector:
$$\Delta B_n(\vec{\sigma}) := w_b^2 \left[ B_n^{(\mathrm{obs})} - B_n(\vec{\sigma}) \right]. \tag{140}$$
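The step from the misfit (137) to its gradient (139) is a direct application of the chain rule. The short derivation below spells it out, assuming (as the text implies) that the weights $w_b$ and the observations do not depend on the conductivity parameters:

```latex
\begin{aligned}
\nabla_{\vec{\sigma}} \chi^2
  &= \frac{b}{2\mu} \int_0^T \!\! \int_{\partial G} w_b^2 \,
     \nabla_{\vec{\sigma}} \bigl[ B_n^{(\mathrm{obs})} - B_n(\vec{\sigma}) \bigr]^2 \, dS \, dt \\
  &= \frac{b}{2\mu} \int_0^T \!\! \int_{\partial G} 2 \, w_b^2 \,
     \bigl[ B_n^{(\mathrm{obs})} - B_n(\vec{\sigma}) \bigr] \,
     \bigl( - \nabla_{\vec{\sigma}} B_n \bigr) \, dS \, dt \\
  &= - \frac{b}{\mu} \int_0^T \!\! \int_{\partial G}
     \Delta B_n(\vec{\sigma}) \, \nabla_{\vec{\sigma}} B_n \, dS \, dt .
\end{aligned}
```

The minus sign comes from differentiating the predicted data inside the residual, and the factor 2 from the square cancels the 1/2 in the prefactor of the misfit.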
The straightforward approach to find $\nabla_{\vec{\sigma}} \chi^2$ is to approximate $\partial \chi^2 / \partial \sigma_m$ by a numerical differentiation of forward model runs. Due to the size of the parameter space, this procedure is often extremely computationally expensive.
8.3 The Forward Sensitivity Equations
The forward sensitivity analysis computes the sensitivities of the forward solution with respect to the conductivity parameters, that is, the partial derivatives $\partial \mathbf{A} / \partial \sigma_m$, $m = 1, \ldots, M$. Using them, the forward sensitivities $\nabla_{\vec{\sigma}} B_n$ are computed and substituted into Eq. 139 for $\nabla_{\vec{\sigma}} \chi^2$. To form the forward sensitivity equations, also called the linear tangent equations of the model (e.g., McGillivray et al. 1994; Cacuci 2003; Sandu et al. 2003, 2005), the conductivity model (136) is considered in the forward model Eqs. 133–135. Differentiating them with respect to the conductivity parameters $\vec{\sigma}$ yields
$$\frac{1}{\mu}\,\mathrm{curl}\,\mathrm{curl}\,\nabla_{\vec{\sigma}} \mathbf{A} + \sigma \frac{\partial \nabla_{\vec{\sigma}} \mathbf{A}}{\partial t} + \nabla_{\vec{\sigma}} \sigma \frac{\partial \mathbf{A}}{\partial t} = 0 \quad \text{in } G \tag{141}$$
with homogeneous boundary condition
$$\mathbf{n} \times \mathrm{curl}\,\nabla_{\vec{\sigma}} \mathbf{A} = 0 \quad \text{on } \partial G \tag{142}$$
and homogeneous initial condition
$$\nabla_{\vec{\sigma}} \mathbf{A}|_{t=0} = 0 \quad \text{in } G, \tag{143}$$
where $\nabla_{\vec{\sigma}} \mathbf{B}_t = \nabla_{\vec{\sigma}} \mathbf{A}^0 = 0$ have been substituted because the boundary data $\mathbf{B}_t$ and the initial condition $\mathbf{A}^0$ are independent of the conductivity parameters $\vec{\sigma}$. In the forward sensitivity analysis, for each parameter $\sigma_m$ and associated forward solution $\mathbf{A}$, a new source term $\nabla_{\vec{\sigma}} \sigma\, \partial \mathbf{A} / \partial t$ is created and the forward sensitivity equations (141)–(143) are solved to compute the partial derivative $\partial \mathbf{A} / \partial \sigma_m$. The forward sensitivity analysis is known to be very effective when the sensitivities of a larger number of output variables are computed with respect to a small number of model parameters (Sandu et al. 2003; Petzold et al. 2006). In Sect. 9, the adjoint sensitivity method of EM induction for the case when the Gauss coefficients are taken as observations will be dealt with. In this case, the boundary condition (142) has a more general form:
$$\mathbf{n} \times \mathrm{curl}\,\nabla_{\vec{\sigma}} \mathbf{A} - L(\nabla_{\vec{\sigma}} \mathbf{A}) = 0 \quad \text{on } \partial G, \tag{144}$$
where $L$ is a linear vector operator acting on a vector function defined on the boundary $\partial G$. For the case studied now, however, $L = 0$.
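8.4 The Adjoint Sensitivity Equations
The adjoint method provides an efficient alternative to the forward sensitivity analysis for evaluating $\nabla_{\vec{\sigma}} \chi^2$ without explicit knowledge of $\nabla_{\vec{\sigma}} \mathbf{A}$, that is, without solving the forward sensitivity equations. Hence, the adjoint method is more efficient for problems involving a large number of model parameters. Because the forward sensitivity equations are linear in $\nabla_{\vec{\sigma}} \mathbf{A}$, an adjoint equation exists (Cacuci 2003). The adjoint sensitivity analysis proceeds by forming the inner product of Eqs. 141 and 144 with a yet unspecified adjoint function $\hat{\mathbf{A}}(r, \vartheta, t)$, then integrated over $G$ and $\partial G$, respectively, and subtracted from each other:
$$\frac{1}{\mu} \int_G \mathrm{curl}\,\mathrm{curl}\,\nabla_{\vec{\sigma}} \mathbf{A} \cdot \hat{\mathbf{A}} \, dV - \frac{1}{\mu} \int_{\partial G} (\mathbf{n} \times \mathrm{curl}\,\nabla_{\vec{\sigma}} \mathbf{A}) \cdot \hat{\mathbf{A}} \, dS + \frac{1}{\mu} \int_{\partial G} L(\nabla_{\vec{\sigma}} \mathbf{A}) \cdot \hat{\mathbf{A}} \, dS + \int_G \sigma \frac{\partial \nabla_{\vec{\sigma}} \mathbf{A}}{\partial t} \cdot \hat{\mathbf{A}} \, dV + \int_G \nabla_{\vec{\sigma}} \sigma \frac{\partial \mathbf{A}}{\partial t} \cdot \hat{\mathbf{A}} \, dV = 0, \tag{145}$$
where the dot stands for the scalar product of vectors. In the next step, the integrals in Eq. 145 are transformed such that $\nabla_{\vec{\sigma}} \mathbf{A}$ and $\hat{\mathbf{A}}$ interchange. To achieve this, Green's theorem is considered for two sufficiently smooth functions $\mathbf{f}$ and $\mathbf{g}$ in the form
$$\int_G \mathrm{curl}\,\mathbf{f} \cdot \mathrm{curl}\,\mathbf{g} \, dV = \int_G \mathrm{curl}\,\mathrm{curl}\,\mathbf{f} \cdot \mathbf{g} \, dV - \int_{\partial G} (\mathbf{n} \times \mathrm{curl}\,\mathbf{f}) \cdot \mathbf{g} \, dS. \tag{146}$$
Interchanging the functions $\mathbf{f}$ and $\mathbf{g}$ and subtracting the new equation from the original one results in the integral identity
$$\int_G \mathrm{curl}\,\mathrm{curl}\,\mathbf{f} \cdot \mathbf{g} \, dV - \int_{\partial G} (\mathbf{n} \times \mathrm{curl}\,\mathbf{f}) \cdot \mathbf{g} \, dS = \int_G \mathrm{curl}\,\mathrm{curl}\,\mathbf{g} \cdot \mathbf{f} \, dV - \int_{\partial G} (\mathbf{n} \times \mathrm{curl}\,\mathbf{g}) \cdot \mathbf{f} \, dS. \tag{147}$$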
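By this, the positions of $\nabla_{\vec{\sigma}} \mathbf{A}$ and $\hat{\mathbf{A}}$ can be exchanged in the first two integrals in Eq. 145:
$$\frac{1}{\mu} \int_G \mathrm{curl}\,\mathrm{curl}\,\hat{\mathbf{A}} \cdot \nabla_{\vec{\sigma}} \mathbf{A} \, dV - \frac{1}{\mu} \int_{\partial G} (\mathbf{n} \times \mathrm{curl}\,\hat{\mathbf{A}}) \cdot \nabla_{\vec{\sigma}} \mathbf{A} \, dS + \frac{1}{\mu} \int_{\partial G} L(\nabla_{\vec{\sigma}} \mathbf{A}) \cdot \hat{\mathbf{A}} \, dS + \int_G \sigma \frac{\partial \nabla_{\vec{\sigma}} \mathbf{A}}{\partial t} \cdot \hat{\mathbf{A}} \, dV + \int_G \nabla_{\vec{\sigma}} \sigma \frac{\partial \mathbf{A}}{\partial t} \cdot \hat{\mathbf{A}} \, dV = 0. \tag{148}$$
To perform the same transformation in the fourth integral, Eq. 148 is integrated over the time interval $t \in (0, T)$, that is,
$$\frac{1}{\mu} \int_0^T \!\! \int_G \mathrm{curl}\,\mathrm{curl}\,\hat{\mathbf{A}} \cdot \nabla_{\vec{\sigma}} \mathbf{A} \, dV dt - \frac{1}{\mu} \int_0^T \!\! \int_{\partial G} (\mathbf{n} \times \mathrm{curl}\,\hat{\mathbf{A}}) \cdot \nabla_{\vec{\sigma}} \mathbf{A} \, dS dt + \frac{1}{\mu} \int_0^T \!\! \int_{\partial G} L(\nabla_{\vec{\sigma}} \mathbf{A}) \cdot \hat{\mathbf{A}} \, dS dt + \int_0^T \!\! \int_G \sigma \frac{\partial \nabla_{\vec{\sigma}} \mathbf{A}}{\partial t} \cdot \hat{\mathbf{A}} \, dV dt + \int_0^T \!\! \int_G \nabla_{\vec{\sigma}} \sigma \frac{\partial \mathbf{A}}{\partial t} \cdot \hat{\mathbf{A}} \, dV dt = 0. \tag{149}$$
Then the order of integration over the spatial variables and time is exchanged in the fourth integral, and the time integration is performed by parts:
$$\int_0^T \frac{\partial \nabla_{\vec{\sigma}} \mathbf{A}}{\partial t} \cdot \hat{\mathbf{A}} \, dt = \nabla_{\vec{\sigma}} \mathbf{A} \cdot \hat{\mathbf{A}} \big|_{t=T} - \nabla_{\vec{\sigma}} \mathbf{A} \cdot \hat{\mathbf{A}} \big|_{t=0} - \int_0^T \nabla_{\vec{\sigma}} \mathbf{A} \cdot \frac{\partial \hat{\mathbf{A}}}{\partial t} \, dt. \tag{150}$$
The second term on the right-hand side is equal to zero because of the homogeneous initial condition (143). Finally, Eq. 149 takes the form
$$\frac{1}{\mu} \int_0^T \!\! \int_G \mathrm{curl}\,\mathrm{curl}\,\hat{\mathbf{A}} \cdot \nabla_{\vec{\sigma}} \mathbf{A} \, dV dt - \frac{1}{\mu} \int_0^T \!\! \int_{\partial G} (\mathbf{n} \times \mathrm{curl}\,\hat{\mathbf{A}}) \cdot \nabla_{\vec{\sigma}} \mathbf{A} \, dS dt + \frac{1}{\mu} \int_0^T \!\! \int_{\partial G} L(\nabla_{\vec{\sigma}} \mathbf{A}) \cdot \hat{\mathbf{A}} \, dS dt + \int_G \sigma\, \nabla_{\vec{\sigma}} \mathbf{A} \cdot \hat{\mathbf{A}} \big|_{t=T} \, dV - \int_0^T \!\! \int_G \sigma \frac{\partial \hat{\mathbf{A}}}{\partial t} \cdot \nabla_{\vec{\sigma}} \mathbf{A} \, dV dt + \int_0^T \!\! \int_G \nabla_{\vec{\sigma}} \sigma \frac{\partial \mathbf{A}}{\partial t} \cdot \hat{\mathbf{A}} \, dV dt = 0. \tag{151}$$
Remembering that $\nabla_{\vec{\sigma}} B_n$ is the derivative that is to be eliminated from $\nabla_{\vec{\sigma}} \chi^2$, the homogeneous equation (151) is added to Eq. 139 (note the physical units of Eq. 151 are the same as $\nabla_{\vec{\sigma}} \chi^2$, namely, m$^3$ s T$^2$/$[\mu\sigma]$, provided that the physical units of $\hat{\mathbf{A}}$ are the same as of $\mathbf{A}$, namely, T m):
$$\begin{aligned} \nabla_{\vec{\sigma}} \chi^2 = {} & \frac{1}{\mu} \int_0^T \!\! \int_G \mathrm{curl}\,\mathrm{curl}\,\hat{\mathbf{A}} \cdot \nabla_{\vec{\sigma}} \mathbf{A} \, dV dt - \frac{1}{\mu} \int_0^T \!\! \int_{\partial G} (\mathbf{n} \times \mathrm{curl}\,\hat{\mathbf{A}}) \cdot \nabla_{\vec{\sigma}} \mathbf{A} \, dS dt \\ & + \frac{1}{\mu} \int_0^T \!\! \int_{\partial G} L(\nabla_{\vec{\sigma}} \mathbf{A}) \cdot \hat{\mathbf{A}} \, dS dt + \int_G \sigma\, \nabla_{\vec{\sigma}} \mathbf{A} \cdot \hat{\mathbf{A}} \big|_{t=T} \, dV - \int_0^T \!\! \int_G \sigma \frac{\partial \hat{\mathbf{A}}}{\partial t} \cdot \nabla_{\vec{\sigma}} \mathbf{A} \, dV dt \\ & + \int_0^T \!\! \int_G \nabla_{\vec{\sigma}} \sigma \frac{\partial \mathbf{A}}{\partial t} \cdot \hat{\mathbf{A}} \, dV dt - \frac{b}{\mu} \int_0^T \!\! \int_{\partial G} \Delta B_n\, \nabla_{\vec{\sigma}} B_n \, dS dt. \end{aligned} \tag{152}$$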
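The adjoint function $\hat{\mathbf{A}}$ has been considered arbitrary so far. The aim is now to impose constraints on it such that the originally arbitrary $\hat{\mathbf{A}}$ transforms to the well-defined adjoint toroidal vector potential. The volume integrals over $G$ proportional to $\nabla_{\vec{\sigma}} \mathbf{A}$ are first eliminated by requiring that
$$\frac{1}{\mu}\,\mathrm{curl}\,\mathrm{curl}\,\hat{\mathbf{A}} - \sigma \frac{\partial \hat{\mathbf{A}}}{\partial t} = 0 \quad \text{in } G, \tag{153}$$
with the terminal condition on $\hat{\mathbf{A}}$:
$$\hat{\mathbf{A}}|_{t=T} = 0 \quad \text{in } G. \tag{154}$$
The boundary condition for $\hat{\mathbf{A}}$ on $\partial G$ is derived from the requirement that the surface integrals over $\partial G$ in Eq. 152 cancel each other, that is,
$$\int_{\partial G} (\mathbf{n} \times \mathrm{curl}\,\hat{\mathbf{A}}) \cdot \nabla_{\vec{\sigma}} \mathbf{A} \, dS - \int_{\partial G} L(\nabla_{\vec{\sigma}} \mathbf{A}) \cdot \hat{\mathbf{A}} \, dS + b \int_{\partial G} \Delta B_n\, \nabla_{\vec{\sigma}} B_n \, dS = 0 \tag{155}$$
at any time $t \in (0, T)$. This condition will be elaborated on in the next section. Under these constraints, the gradient of $\chi^2(\vec{\sigma})$ takes the form
$$\nabla_{\vec{\sigma}} \chi^2 = \int_0^T \!\! \int_G \nabla_{\vec{\sigma}} \sigma \frac{\partial \mathbf{A}}{\partial t} \cdot \hat{\mathbf{A}} \, dV dt. \tag{156}$$

8.5 Boundary Condition for the Adjoint Potential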
To relate $\nabla_{\vec{\sigma}} \mathbf{A}$ and $\nabla_{\vec{\sigma}} B_n$ in the constraint described by Eq. 155 and, subsequently, to eliminate $\nabla_{\vec{\sigma}} \mathbf{A}$ from it, $\mathbf{A}$ needs to be parameterized. In the colatitudinal direction, $\mathbf{A}$ will be represented as a series of the zonal toroidal vector spherical harmonics $\mathbf{Y}_j^j(\vartheta)$ in the form given by Eq. 87, which is also employed for the adjoint potential $\hat{\mathbf{A}}$:
$$\begin{Bmatrix} \mathbf{A}(r, \vartheta, t) \\ \hat{\mathbf{A}}(r, \vartheta, t) \end{Bmatrix} = \sum_{j=1}^{\infty} \begin{Bmatrix} A_j^j(r, t) \\ \hat{A}_j^j(r, t) \end{Bmatrix} \mathbf{Y}_j^j(\vartheta). \tag{157}$$
In the radial direction, inside a conducting sphere $\mathcal{B}$ of radius $a$, the spherical harmonic expansion coefficients $A_j^j(r, t)$ are parameterized by $P + 1$ piecewise-linear finite elements $\psi_k(r)$ on the interval $0 \le r \le a$ as shown by Eq. 93:
$$A_j^j(r, t) = \sum_{k=1}^{P+1} A_j^{j,k}(t)\, \psi_k(r). \tag{158}$$
In an insulating atmosphere $\mathcal{A}$, the spherical harmonic expansion coefficients $A_j^j(r, t)$ are parameterized in the form given by Eq. 100:
$$A_j^j(r, t) = a \left[ \sqrt{\frac{j}{j+1}} \left( \frac{r}{a} \right)^{j} G_j^{(e)}(t) - \sqrt{\frac{j+1}{j}} \left( \frac{a}{r} \right)^{j+1} G_j^{(i)}(t) \right]. \tag{159}$$
The same parameterizations as shown by Eqs. 158 and 159 are taken for the coefficients $\hat{A}_j^j(r, t)$. The first aim is to express the gradient $\nabla_{\vec{\sigma}} \chi^2$ in terms of spherical harmonics. Since the upper boundary $\partial G$ of the solution domain $G$ is the mean-orbit sphere of radius $b$, the external normal $\mathbf{n}$ to $\partial G$ coincides with the unit vector $\mathbf{e}_r$, that is, $\mathbf{n} = \mathbf{e}_r$. Applying the gradient operator $\nabla_{\vec{\sigma}}$ on the equation $B_n = \mathbf{e}_r \cdot \mathrm{curl}\,\mathbf{A}$ and using Eq. 224 yields
$$\nabla_{\vec{\sigma}} B_n(r, \vartheta, t) = -\frac{1}{r} \sum_{j=1}^{\infty} \sqrt{j(j+1)}\, \nabla_{\vec{\sigma}} A_j^j(r, t)\, Y_j(\vartheta). \tag{160}$$
Moreover, applying a two-step, track-by-track spherical harmonic analysis on the residual satellite-track data $\Delta B_n$ defined by Eq. 140, these observables can, at a particular time $t \in (0, T)$, be represented as a series of the zonal scalar spherical harmonics:
$$\Delta B_n(\vartheta, t; \vec{\sigma}) = \sum_{j=1}^{\infty} \Delta B_{n,j}(t; \vec{\sigma})\, Y_j(\vartheta) \tag{161}$$
with spherical harmonic coefficients of the form
$$\Delta B_{n,j}(t; \vec{\sigma}) = \frac{1}{b^2} \int_{\partial G} w_b^2 \left[ B_n^{(\mathrm{obs})}(\vartheta, t) - B_n(b, \vartheta, t; \vec{\sigma}) \right] Y_j(\vartheta) \, dS. \tag{162}$$
Substituting Eqs. 160 and 161 into Eq. 139 and employing the orthonormality property (212) of the zonal scalar spherical harmonics $Y_j(\vartheta)$, the gradient of the misfit $\chi^2$ becomes
$$\nabla_{\vec{\sigma}} \chi^2 = \frac{b^2}{\mu} \int_0^T \sum_{j=1}^{\infty} \sqrt{j(j+1)}\, \Delta B_{n,j}(t; \vec{\sigma})\, \nabla_{\vec{\sigma}} A_j^j(b, t) \, dt. \tag{163}$$
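The constraint (155) with $L = 0$, that is, for the case of the boundary condition (142), is now expressed in terms of spherical harmonics. By the parameterization (157) and the assumption $\mathbf{n} = \mathbf{e}_r$, the differential relation (226) applied to $\hat{\mathbf{A}}$ yields
$$\mathbf{n} \times \mathrm{curl}\,\hat{\mathbf{A}} = \sum_{j=1}^{\infty} \left[ \mathbf{n} \times \mathrm{curl}\,\hat{\mathbf{A}}(r, t) \right]_j^j \mathbf{Y}_j^j(\vartheta), \tag{164}$$
where
$$\left[ \mathbf{n} \times \mathrm{curl}\,\hat{\mathbf{A}}(r, t) \right]_j^j = -\left( \frac{d}{dr} + \frac{1}{r} \right) \hat{A}_j^j(r, t). \tag{165}$$
The first constituent in the first integral of the constraint (155) is expressed by Eq. 164, while the second constituent can be obtained by applying the gradient operator $\nabla_{\vec{\sigma}}$ to Eq. 157. The two constituents in the second integral of the constraint (155) are expressed by Eqs. 160 and 161, respectively. Performing all indicated substitutions, one obtains
$$\begin{aligned} & \int_{\varphi=0}^{2\pi} \int_{\vartheta=0}^{\pi} \sum_{j_1=1}^{\infty} \left[ \mathbf{n} \times \mathrm{curl}\,\hat{\mathbf{A}}(b, t) \right]_{j_1}^{j_1} \mathbf{Y}_{j_1}^{j_1}(\vartheta) \cdot \sum_{j_2=1}^{\infty} \nabla_{\vec{\sigma}} A_{j_2}^{j_2}(b, t)\, \mathbf{Y}_{j_2}^{j_2}(\vartheta)\, b^2 \sin\vartheta \, d\vartheta \, d\varphi \\ & \quad = b \int_{\varphi=0}^{2\pi} \int_{\vartheta=0}^{\pi} \sum_{j_1=1}^{\infty} \Delta B_{n,j_1}(t; \vec{\sigma})\, Y_{j_1}(\vartheta)\, \frac{1}{b} \sum_{j_2=1}^{\infty} \sqrt{j_2(j_2+1)}\, \nabla_{\vec{\sigma}} A_{j_2}^{j_2}(b, t)\, Y_{j_2}(\vartheta)\, b^2 \sin\vartheta \, d\vartheta \, d\varphi. \end{aligned} \tag{166}$$
Interchanging the order of integration over the full solid angle and summations over $j$'s, and making use of the orthonormality properties (212) and (217) of the zonal scalar and vector spherical harmonics, respectively, Eq. 166 reduces to
$$\sum_{j=1}^{\infty} \left[ \mathbf{n} \times \mathrm{curl}\,\hat{\mathbf{A}}(b, t) \right]_j^j \nabla_{\vec{\sigma}} A_j^j(b, t) = \sum_{j=1}^{\infty} \sqrt{j(j+1)}\, \Delta B_{n,j}(t; \vec{\sigma})\, \nabla_{\vec{\sigma}} A_j^j(b, t), \tag{167}$$
which is to be valid at any time $t \in (0, T)$. To satisfy this constraint independently of $\nabla_{\vec{\sigma}} A_j^j(b, t)$, one last condition is imposed upon the adjoint potential $\hat{\mathbf{A}}$, namely,
$$\left[ \mathbf{n} \times \mathrm{curl}\,\hat{\mathbf{A}}(b, t) \right]_j^j = \sqrt{j(j+1)}\, \Delta B_{n,j}(t; \vec{\sigma}) \quad \text{on } \partial G \tag{168}$$
at any time $t \in (0, T)$.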
8.6 Adjoint Method
The formulation of the adjoint method of EM induction for the Z component of CHAMP satellite magnetic data can be summarized as follows. Given the electrical conductivity model $\sigma(r, \vartheta)$ in the sphere $\mathcal{B}$, the forward solution $\mathbf{A}(r, \vartheta, t)$ in $\mathcal{B}$ and the atmosphere $\mathcal{A}$ for $t \in (0, T)$, and the observations $B_n^{(\mathrm{obs})}(t)$ on the mean-orbit sphere $\partial G$ of radius $r = b$, with uncertainties quantified by weighting factor $w_b$, find the adjoint potential $\hat{\mathbf{A}}(r, \vartheta, t)$ in $G = \mathcal{B} \cup \mathcal{A}$ by solving the adjoint problem:
$$\frac{1}{\mu}\,\mathrm{curl}\,\mathrm{curl}\,\hat{\mathbf{A}} - \sigma \frac{\partial \hat{\mathbf{A}}}{\partial t} = 0 \quad \text{in } G \tag{169}$$
with the boundary condition
$$\left[ \mathbf{n} \times \mathrm{curl}\,\hat{\mathbf{A}}(b, t) \right]_j^j = \sqrt{j(j+1)}\, \Delta B_{n,j}(t) \quad \text{on } \partial G \tag{170}$$
and the terminal condition
$$\hat{\mathbf{A}}|_{t=T} = 0 \quad \text{in } G. \tag{171}$$
The gradient of the misfit $\chi^2(\vec{\sigma})$ is then expressed as
$$\nabla_{\vec{\sigma}} \chi^2 = \int_0^T \!\! \int_G \nabla_{\vec{\sigma}} \sigma \frac{\partial \mathbf{A}(t)}{\partial t} \cdot \hat{\mathbf{A}}(t) \, dV dt. \tag{172}$$
The set of Eqs. 169–171 is referred to as the adjoint problem of the forward problem specified by Eqs. 133–135. Combining the forward solution $\mathbf{A}$ and the adjoint solution $\hat{\mathbf{A}}$ according to Eq. 172 thus gives the exact derivative of the misfit $\chi^2$.
8.7 Reverse Time
The numerical solution of Eq. 169, solved backwards in time from $t = T$ to $t = 0$, is inherently unstable. Unlike the case of the forward model equation and the forward sensitivity equation, the adjoint equation effectively includes negative diffusion, which enhances numerical perturbations instead of smoothing them, leading to an unstable solution. To avoid such numerical instability, the sign of the diffusive term in Eq. 169 is changed by reversing the time variable. Let the reverse time $\tau = T - t$, $\tau \in (0, T)$, and the reverse-time adjoint potential $\check{\mathbf{A}}(\tau)$ be introduced such that
$$\hat{\mathbf{A}}(t) = \hat{\mathbf{A}}(T - \tau) =: \check{\mathbf{A}}(\tau). \tag{173}$$
Hence
$$\frac{\partial \hat{\mathbf{A}}}{\partial t} = -\frac{\partial \check{\mathbf{A}}}{\partial \tau}, \tag{174}$$
and Eq. 169 transforms to the diffusion equation for the reverse-time adjoint potential $\check{\mathbf{A}}(\tau)$:
$$\frac{1}{\mu}\,\mathrm{curl}\,\mathrm{curl}\,\check{\mathbf{A}} + \sigma \frac{\partial \check{\mathbf{A}}}{\partial \tau} = 0 \quad \text{in } G \tag{175}$$
with the boundary condition
$$\left[ \mathbf{n} \times \mathrm{curl}\,\check{\mathbf{A}}(b, \tau) \right]_j^j = \sqrt{j(j+1)}\, \Delta B_{n,j}(T - \tau) \quad \text{on } \partial G. \tag{176}$$
The terminal condition (171) for $\hat{\mathbf{A}}$ becomes the initial condition for the potential $\check{\mathbf{A}}$:
$$\check{\mathbf{A}}|_{\tau=0} = 0 \quad \text{in } G. \tag{177}$$
With these changes, the adjoint equations become similar to those of the forward method, and thus nearly identical numerical methods can be applied. In addition, the gradient (172) transforms to
$$\nabla_{\vec{\sigma}} \chi^2 = \int_0^T \!\! \int_G \nabla_{\vec{\sigma}} \sigma \frac{\partial \mathbf{A}(t)}{\partial t} \cdot \check{\mathbf{A}}(T - t) \, dV dt. \tag{178}$$
The importance of Eq. 178 is that, once the forward problem (133)–(135) is solved and the misfit $\chi^2$ is evaluated from Eq. 137, the gradient $\nabla_{\vec{\sigma}} \chi^2$ may be evaluated for little more than the cost of a single solution of the adjoint system (175)–(177) and a single scalar product in Eq. 178, regardless of the dimension of the conductivity parameter vector $\vec{\sigma}$. By comparison, other methods of evaluating $\nabla_{\vec{\sigma}} \chi^2$ typically require one solution of the forward problem (133)–(135) per component of $\vec{\sigma}$.
The specific steps involved in the adjoint computations are now explained. First, the forward solutions $\mathbf{A}(t_i)$ are calculated at discrete times $0 = t_0 < t_1 < \cdots < t_n = T$ by solving the forward problem (133)–(135), and each solution $\mathbf{A}(t_i)$ must be stored. Then, the reverse-time adjoint solutions $\check{\mathbf{A}}(t_i)$, $i = 0, \ldots, n$, are calculated, proceeding again forwards in time according to Eqs. 175–177. As each adjoint solution is computed, the misfit and its derivative are updated according to Eqs. 137 and 178, respectively. When $\check{\mathbf{A}}(T)$ has finally been calculated, both $\chi^2$ and $\nabla_{\vec{\sigma}} \chi^2$ are known. The forward solutions $\mathbf{A}(t_i)$ are stored because Eqs. 176 and 178 depend on them for the adjoint calculation. As a result, the numerical algorithm has memory requirements that are linear with respect to the number of time steps. This is the main drawback of the adjoint method.
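The store-then-sweep workflow described above can be illustrated on a toy problem. The sketch below is not the chapter's EM induction solver: it replaces the curl-curl operator by a small symmetric matrix `K`, takes one conductivity parameter per node, uses implicit Euler time stepping, and forms the discrete adjoint of that scheme. All names (`forward`, `misfit`, `adjoint_gradient`, `K`, `src`, `c`) are hypothetical. The adjoint variable is fixed here by the Lagrangian convention of the discrete scheme, so its sign and scaling need not coincide with those of $\hat{\mathbf{A}}$ in the text.

```python
import numpy as np


def forward(sigma, K, src, dt):
    """Implicit-Euler solution of diag(sigma) du/dt + K u = src(t), u(0) = 0.

    Every state u[0..n_steps] is kept, because the adjoint sweep needs them.
    """
    n_steps, n = src.shape
    lhs = np.diag(sigma / dt) + K
    u = np.zeros((n_steps + 1, n))
    for i in range(1, n_steps + 1):
        u[i] = np.linalg.solve(lhs, sigma / dt * u[i - 1] + src[i - 1])
    return u


def misfit(u, c, data, dt):
    """Least-squares misfit between the observable c.u and the data."""
    res = u[1:] @ c - data
    return 0.5 * dt * np.sum(res ** 2)


def adjoint_gradient(sigma, K, src, c, data, dt):
    """Misfit and its gradient w.r.t. sigma from one forward and one adjoint sweep."""
    u = forward(sigma, K, src, dt)
    n_steps, n = src.shape
    lhs_t = (np.diag(sigma / dt) + K).T
    lam_next = np.zeros(n)            # homogeneous terminal condition
    grad = np.zeros(n)
    for i in range(n_steps, 0, -1):   # sweep from t = T back to t = 0
        res = u[i] @ c - data[i - 1]  # residual drives the adjoint equation
        lam = np.linalg.solve(lhs_t, sigma / dt * lam_next - dt * res * c)
        # Discrete analogue of integrating (d sigma) * du/dt * adjoint over time.
        grad += lam * (u[i] - u[i - 1]) / dt
        lam_next = lam
    return misfit(u, c, data, dt), grad
```

The gradient costs one extra sweep regardless of the number of parameters, whereas a finite-difference check needs two forward runs per parameter. The array `u` holding all forward states is the linear-in-time memory cost mentioned in the text.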
8.8
Weak Formulation
The adjoint IBVP (175)–(177) can again be reformulated in a weak sense. Creating an auxiliary boundary-value vector
$$\mathbf{B}_t^{(\mathrm{adj})}(\vartheta,\tau) := \sum_{j=1}^{\infty}\sqrt{j(j+1)}\,\Delta B_{n,j}(T-\tau)\,\mathbf{Y}_j^j(\vartheta) = -\sum_{j=1}^{\infty}\sqrt{j(j+1)}\,\Delta Z_j(T-\tau)\,\mathbf{Y}_j^j(\vartheta), \tag{179}$$
where the negative vertical downward $Z$ component of the magnetic induction vector $\mathbf{B}_0$ has been substituted for the normal upward $B_n$ component of $\mathbf{B}_0$, the boundary condition (176) can be written as
$$\mathbf{n}\times\operatorname{curl}\check{\mathbf{A}} = \mathbf{B}_t^{(\mathrm{adj})} \quad \text{on } \partial G. \tag{180}$$
It can be seen that the adjoint problem (175), (177), and (180) for the reverse-time adjoint potential $\check{\mathbf{A}}(\tau)$ has the same form as the forward problem (133)–(135) for the forward potential $\mathbf{A}$. Hence, the weak formulation of the adjoint problem is given by the variational equation (75), where the forward boundary data vector $\mathbf{B}_t$ is to be replaced by the adjoint boundary data vector $\mathbf{B}_t^{(\mathrm{adj})}$. In addition, the form similarity between the expression (57) for $\mathbf{B}_t$ and the expression (179) for $\mathbf{B}_t^{(\mathrm{adj})}$ enables one to express the spherical harmonic representation (106) of the linear functional $F(\cdot)$ in a unified form:
$$F(\delta\mathbf{A}_0) = \frac{ab^2}{\mu}\sum_{j=1}^{\infty} D_j(t)\left[\, j\left(\frac{b}{a}\right)^{j}\delta G_j^{(e)} - (j+1)\left(\frac{a}{b}\right)^{j+1}\delta G_j^{(i)}\right], \tag{181}$$
where
$$D_j(t) = \begin{cases} X_j(t) & \text{for the forward method},\\ \Delta Z_j(T-t) & \text{for the adjoint method}, \end{cases} \tag{182}$$
and $\Delta Z_j(t)$ is the residual between the $Z$ component of the CHAMP observations and the forward-modeled data, that is, $\Delta Z_j(t) = Z_j^{(\mathrm{obs})}(t) - Z_j(t;\vec{\sigma})$.
9
Adjoint Sensitivity Method of EM Induction for the Internal Gauss Coefficients of CHAMP Magnetic Data
In this section, the adjoint sensitivity method of EM induction for computing the sensitivities of the internal Gauss coefficients of CHAMP magnetic data with respect to mantle conductivity structure is formulated.
9.1
Forward Method
The forward method of EM induction for the external Gauss coefficients of CHAMP magnetic data was formulated in Sect. 5. As discussed, the solution domain for EM induction modeling is the conducting sphere $B$ with the boundary $\partial B$ coinciding with the mean Earth surface with radius $r = a$. Since both the external and internal Gauss coefficients are associated with the spherical harmonic expansion of the magnetic scalar potential $U$ in the near-space atmosphere adjacent to the Earth's surface, the boundary condition for $G_j^{(e)}(t)$ can only be formulated in terms of spherical harmonic expansion coefficients of the sought-after toroidal vector potential $\mathbf{A}$.
First, the forward IBVP of EM induction for the Gauss coefficient $G_j^{(e)}(t)$ is briefly reviewed: The toroidal vector potential $\mathbf{A}$ inside the conductive sphere $B$ with a given conductivity $\sigma(r,\vartheta)$ is sought such that, for $t > 0$, it holds that
$$\frac{1}{\mu}\operatorname{curl}\operatorname{curl}\mathbf{A} + \sigma\frac{\partial \mathbf{A}}{\partial t} = 0 \quad \text{in } B \tag{183}$$
with the boundary condition
$$\left[\mathbf{n}\times\operatorname{curl}\mathbf{A}(a)\right]_j^j - \frac{j}{a}A_j^j(a) = -\sqrt{\frac{j}{j+1}}\,(2j+1)\,G_j^{(e)} \quad \text{on } \partial B \tag{184}$$
and the inhomogeneous initial condition
$$\mathbf{A}\big|_{t=0} = \mathbf{A}^0 \quad \text{in } B. \tag{185}$$
9.2
Misfit Function and Its Gradient in the Parameter Space
In comparison with the adjoint sensitivity method for the $Z$ component of CHAMP magnetic data, yet another modification concerns the definition of a misfit function. Let the observations $G_j^{(i,\mathrm{obs})}(t)$, $j = 1, 2, \dots, j_{\max}$, be made over the time interval $(0, T)$. The least-squares misfit is then defined as
$$\chi^2(\vec{\sigma}) := \frac{a^3}{2}\int_0^T \sum_{j=1}^{j_{\max}}\left[G_j^{(i,\mathrm{obs})}(t) - G_j^{(i)}(t;\vec{\sigma})\right]^2 dt, \tag{186}$$
where the forward-modeled data $G_j^{(i)}(t;\vec{\sigma})$ are computed according to Eq. 119 after solving the forward IBVP of EM induction (183)–(185). To bring the misfit function (186) to a form that is analogous to Eq. 137, two auxiliary quantities are introduced:
$$\begin{Bmatrix} G^{(i,\mathrm{obs})}(\vartheta,t) \\ G^{(i)}(\vartheta,t;\vec{\sigma}) \end{Bmatrix} = \sum_{j=1}^{j_{\max}} \begin{Bmatrix} G_j^{(i,\mathrm{obs})}(t) \\ G_j^{(i)}(t;\vec{\sigma}) \end{Bmatrix} Y_j(\vartheta), \tag{187}$$
by means of which, and considering the orthonormality property of spherical harmonics $Y_j(\vartheta)$, the misfit (186) can be written as
$$\chi^2(\vec{\sigma}) = \frac{a}{2}\int_0^T\!\!\int_{\partial B}\left[G^{(i,\mathrm{obs})} - G^{(i)}(\vec{\sigma})\right]^2 dS\,dt. \tag{188}$$
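As a minimal sketch of how the misfit (186) could be evaluated from sampled coefficient time series, the function below sums the squared residuals over degrees and integrates over time with the trapezoidal rule. The quadrature rule, the array layout (time along the first axis, degree $j = 1, \dots, j_{\max}$ along the second), and the name `gauss_misfit` are assumptions made for illustration only.

```python
import numpy as np


def gauss_misfit(t, g_obs, g_model, a):
    """Misfit (a^3 / 2) * int_0^T sum_j [G_obs_j - G_model_j]^2 dt.

    t        : sample times, shape (n_times,)
    g_obs    : observed internal coefficients, shape (n_times, j_max)
    g_model  : forward-modeled coefficients, same shape
    a        : radius of the mean sphere
    """
    t = np.asarray(t, dtype=float)
    diff = np.asarray(g_obs, dtype=float) - np.asarray(g_model, dtype=float)
    squared = np.sum(diff**2, axis=1)            # sum over degrees j
    # Trapezoidal rule in time (an assumed quadrature, not from the text).
    integral = np.sum(0.5 * (squared[1:] + squared[:-1]) * np.diff(t))
    return 0.5 * a**3 * integral
```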
In contrast to Eq. 137, the weighting factor $w_b^2$ does not appear in the integral (188) since possible inconsistencies in the CHAMP magnetic data are already considered in data processing for $G_j^{(i,\mathrm{obs})}(t)$ (see Sect. 7). Realizing that the observations $G^{(i,\mathrm{obs})}$ are independent of the conductivity parameters $\vec{\sigma}$, that is, $\nabla_{\vec{\sigma}}G^{(i,\mathrm{obs})} = 0$, the gradient of $\chi^2(\vec{\sigma})$ is
$$\nabla_{\vec{\sigma}}\chi^2 = -a\int_0^T\!\!\int_{\partial B}\Delta G^{(i)}(\vec{\sigma})\,\nabla_{\vec{\sigma}}G^{(i)}\,dS\,dt, \tag{189}$$
where $\Delta G^{(i)}(\vec{\sigma})$ are the residuals of the internal Gauss coefficients:
$$\Delta G^{(i)}(\vec{\sigma}) := G^{(i,\mathrm{obs})} - G^{(i)}(\vec{\sigma}). \tag{190}$$
9.3
Adjoint Method
Differentiating Eqs. 183 and 185 for the forward solution with respect to conductivity parameters $\vec{\sigma}$ results in the sensitivity equations of the same form as Eqs. 141 and 143, but now valid inside the sphere $B$. The appropriate boundary condition for the sensitivities $\nabla_{\vec{\sigma}}\mathbf{A}$ is obtained by differentiating Eq. 184 with respect to the parameters $\vec{\sigma}$:
$$\left[\mathbf{n}\times\operatorname{curl}\nabla_{\vec{\sigma}}\mathbf{A}(a)\right]_j^j - \frac{j}{a}\nabla_{\vec{\sigma}}A_j^j(a) = 0 \quad \text{on } \partial B. \tag{191}$$
Multiplying the last equation by the zonal toroidal vector spherical harmonics and summing up the result over $j$, the sensitivity equation (191) can be written in the form of Eq. 144, where the linear vector boundary operator $\mathcal{L}$ has the form
$$\mathcal{L}(\nabla_{\vec{\sigma}}\mathbf{A}) = \frac{1}{a}\sum_{j=1}^{\infty} j\,\nabla_{\vec{\sigma}}A_j^j(a)\,\mathbf{Y}_j^j(\vartheta). \tag{192}$$
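In view of the form similarity between the expressions (139) and (189), the boundary condition for $\hat{\mathbf{A}}$ on $\partial B$ can be deduced from the condition (155):
$$\int_{\partial B}(\mathbf{n}\times\operatorname{curl}\hat{\mathbf{A}})\cdot\nabla_{\vec{\sigma}}\mathbf{A}\,dS - \int_{\partial B}\mathcal{L}(\nabla_{\vec{\sigma}}\mathbf{A})\cdot\hat{\mathbf{A}}\,dS + a\int_{\partial B}\Delta G^{(i)}\,\nabla_{\vec{\sigma}}G^{(i)}\,dS = 0, \tag{193}$$
which must be valid at any time $t \in (0, T)$.
To express the partial derivatives of the forward-modeled data $G_j^{(i)}$ with respect to the conductivity parameters $\vec{\sigma}$, that is, the gradient $\nabla_{\vec{\sigma}}G_j^{(i)}$, in terms of the sensitivities $\nabla_{\vec{\sigma}}\mathbf{A}$, Eq. 119 is differentiated with respect to the conductivity parameters $\vec{\sigma}$:
$$\nabla_{\vec{\sigma}}G_j^{(i)} = -\frac{1}{a}\sqrt{\frac{j}{j+1}}\,\nabla_{\vec{\sigma}}A_j^j(a), \tag{194}$$
where $\nabla_{\vec{\sigma}}G_j^{(e)} = 0$ has been considered because the forward model boundary data $G_j^{(e)}$ are independent of the conductivity parameters $\vec{\sigma}$.
The constraint (193) can finally be expressed in terms of spherical harmonics. The first constituent in the first integral of the constraint (193) is expressed by Eq. 164, while the second constituent can be obtained by applying the gradient operator $\nabla_{\vec{\sigma}}$ to Eq. 157. The two constituents in the second and third integrals of the constraint (193) are expressed by Eqs. 192 and 157 and by Eqs. 190 and 194, respectively. Performing all indicated substitutions results in
$$\begin{aligned}
&\int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi}\sum_{j_1=1}^{\infty}\left[\mathbf{n}\times\operatorname{curl}\hat{\mathbf{A}}(a,t)\right]_{j_1}^{j_1}\mathbf{Y}_{j_1}^{j_1}(\vartheta)\cdot\sum_{j_2=1}^{\infty}\nabla_{\vec{\sigma}}A_{j_2}^{j_2}(a,t)\,\mathbf{Y}_{j_2}^{j_2}(\vartheta)\,a^2\sin\vartheta\,d\vartheta\,d\varphi\\
&\quad-\int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi}\frac{1}{a}\sum_{j_1=1}^{\infty}j_1\,\nabla_{\vec{\sigma}}A_{j_1}^{j_1}(a,t)\,\mathbf{Y}_{j_1}^{j_1}(\vartheta)\cdot\sum_{j_2=1}^{\infty}\hat{A}_{j_2}^{j_2}(a,t)\,\mathbf{Y}_{j_2}^{j_2}(\vartheta)\,a^2\sin\vartheta\,d\vartheta\,d\varphi\\
&\quad= a\int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi}\sum_{j_1=1}^{\infty}\Delta G_{j_1}^{(i)}(t;\vec{\sigma})\,Y_{j_1}(\vartheta)\;\frac{1}{a}\sum_{j_2=1}^{\infty}\sqrt{\frac{j_2}{j_2+1}}\,\nabla_{\vec{\sigma}}A_{j_2}^{j_2}(a,t)\,Y_{j_2}(\vartheta)\,a^2\sin\vartheta\,d\vartheta\,d\varphi.
\end{aligned} \tag{195}$$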
Interchanging the order of integration over the full solid angle and summations over $j$'s, and making use of the orthonormality properties (212) and (217) of the zonal scalar and vector spherical harmonics, respectively, Eq. 195 reduces to
$$\sum_{j=1}^{\infty}\left\{\left[\mathbf{n}\times\operatorname{curl}\hat{\mathbf{A}}(a,t)\right]_j^j - \frac{j}{a}\hat{A}_j^j(a,t)\right\}\nabla_{\vec{\sigma}}A_j^j(a,t) = \sum_{j=1}^{\infty}\sqrt{\frac{j}{j+1}}\,\Delta G_j^{(i)}(t;\vec{\sigma})\,\nabla_{\vec{\sigma}}A_j^j(a,t). \tag{196}$$
To satisfy this constraint independent of $\nabla_{\vec{\sigma}}A_j^j(a,t)$, one last condition is imposed upon the adjoint potential $\hat{\mathbf{A}}$, namely,
$$\left[\mathbf{n}\times\operatorname{curl}\hat{\mathbf{A}}(a,t)\right]_j^j - \frac{j}{a}\hat{A}_j^j(a,t) = \sqrt{\frac{j}{j+1}}\,\Delta G_j^{(i)}(t;\vec{\sigma}) \quad \text{on } \partial B \tag{197}$$
for $j = 1, 2, \dots, j_{\max}$, and at any time $t \in (0, T)$.
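The formulation of the adjoint method of EM induction for the internal Gauss coefficients is now summarized. Given the electrical conductivity $\sigma(r,\vartheta)$ in the conducting sphere $B$, the forward solution $\mathbf{A}(r,\vartheta,t)$ in $B$ and the observations $G_j^{(i,\mathrm{obs})}(t)$, $j = 1, 2, \dots, j_{\max}$, on the mean sphere $\partial B$ of radius $r = a$ for the time interval $(0, T)$, find the adjoint potential $\hat{\mathbf{A}}(r,\vartheta,t)$ in $B$, such that, for $t > 0$, it satisfies the magnetic diffusion equation
$$\frac{1}{\mu}\operatorname{curl}\operatorname{curl}\hat{\mathbf{A}} - \sigma\frac{\partial \hat{\mathbf{A}}}{\partial t} = 0 \quad \text{in } B \tag{198}$$
with the boundary condition
$$\left[\mathbf{n}\times\operatorname{curl}\hat{\mathbf{A}}(a,t)\right]_j^j - \frac{j}{a}\hat{A}_j^j(a,t) = \sqrt{\frac{j}{j+1}}\,\Delta G_j^{(i)}(t) \quad \text{on } \partial B \tag{199}$$
for $j = 1, 2, \dots, j_{\max}$, and the terminal condition
$$\hat{\mathbf{A}}\big|_{t=T} = 0 \quad \text{in } B. \tag{200}$$
9.4
Weak Formulation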
To find a stable solution of diffusion equation (198), the reverse time $\tau = T - t$ and the reverse-time adjoint potential $\check{\mathbf{A}}(\tau)$ are introduced in the same manner as in Sect. 8.7. By this transformation, the negative sign at the diffusive term in Eq. 198 is inverted to a positive sign. The adjoint IBVP for $\check{\mathbf{A}}$ can be reformulated in a weak sense and described by the variational equation
$$a_1(\check{\mathbf{A}}, \delta\mathbf{A}) + b(\check{\mathbf{A}}, \delta\mathbf{A}) = f_2(\delta\mathbf{A}) \quad \forall\,\delta\mathbf{A} \in V, \tag{201}$$
where the solution space $V$ and the bilinear forms $a_1(\cdot,\cdot)$ and $b(\cdot,\cdot)$ are given by Eqs. 64, 117, and 67, respectively, and the new linear functional $f_2(\cdot)$ is defined by
$$f_2(\delta\mathbf{A}) = -\frac{a^2}{\mu}\sum_{j=1}^{\infty}\sqrt{\frac{j}{j+1}}\,\Delta G_j^{(i)}(T-\tau)\,\delta A_j^j(a). \tag{202}$$
Here, $\Delta G_j^{(i)}(t)$ are the residuals between the coefficients $G_j^{(i,\mathrm{sur})}(t)$, determined from the CHAMP observations of the $X$ and $Z$ components of the magnetic induction vector and continued downwards from the satellite's altitude to the Earth's surface according to Eq. 107:
$$G_j^{(i,\mathrm{sur})}(t) = (b/a)^{j+2}\,G_j^{(i,\mathrm{obs})}(t), \tag{203}$$
and the forward-modeled coefficients $G_j^{(i)}(t;\vec{\sigma})$:
$$\Delta G_j^{(i)}(t) = G_j^{(i,\mathrm{sur})}(t) - G_j^{(i)}(t;\vec{\sigma}). \tag{204}$$
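Equations 203 and 204 amount to a degree-dependent rescaling followed by a subtraction. A minimal sketch, assuming the coefficients are stored with degree $j = 1, \dots, j_{\max}$ along the last array axis (the function names and the array layout are hypothetical):

```python
import numpy as np


def continue_internal_to_surface(g_obs, a, b):
    """Scale internal coefficients from orbit radius b down to radius a.

    Applies G_sur_j = (b / a)**(j + 2) * G_obs_j for j = 1, ..., j_max,
    with the degree running along the last axis of g_obs.
    """
    g_obs = np.asarray(g_obs, dtype=float)
    degrees = np.arange(1, g_obs.shape[-1] + 1)
    return (b / a) ** (degrees + 2) * g_obs


def internal_residuals(g_obs, g_model, a, b):
    """Residuals between surface-continued data and forward-modeled data."""
    return continue_internal_to_surface(g_obs, a, b) - np.asarray(g_model, dtype=float)
```

Because $b > a$, the factor $(b/a)^{j+2}$ grows with degree, so any noise in the higher-degree coefficients is amplified by the downward continuation.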
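Having determined the forward solution $\mathbf{A}$ and the reverse-time adjoint solution $\check{\mathbf{A}}$, the gradient of the misfit $\chi^2(\vec{\sigma})$ with respect to the conductivity parameters $\vec{\sigma}$ can be computed by
$$\nabla_{\vec{\sigma}}\chi^2 = \mu\int_0^T\!\!\int_G \nabla_{\vec{\sigma}}\sigma\;\frac{\partial \mathbf{A}(t)}{\partial t}\cdot\check{\mathbf{A}}(T-t)\,dV\,dt. \tag{205}$$
9.5
Summary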
The forward and adjoint IBVPs of EM induction for the CHAMP satellite data can be formulated in a unified way. Let $\mathbf{F}$ denote either the toroidal vector potential $\mathbf{A}$ or the reverse-time adjoint toroidal vector potential $\check{\mathbf{A}}$ for the forward and the adjoint problems, respectively. $\mathbf{F}$ is sought inside the conductive sphere $S$ with a given conductivity $\sigma(r,\vartheta)$ such that, for $t > 0$, it satisfies the magnetic diffusion equation
$$\frac{1}{\mu}\operatorname{curl}\operatorname{curl}\mathbf{F} + \sigma\frac{\partial \mathbf{F}}{\partial t} = 0 \quad \text{in } S \tag{206}$$
with the inhomogeneous initial condition
$$\mathbf{F}\big|_{t=0} = \mathbf{F}^0 \quad \text{in } S \tag{207}$$
and an appropriate boundary condition chosen from the set of boundary conditions summarized for convenience in Table 2.
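Table 2 Boundary conditions for the forward and adjoint methods
Method | Satellite magnetic components ($r = b$) | Ground-based Gauss coefficients ($r = a$)
Forward | $[\mathbf{n}\times\operatorname{curl}\mathbf{F}(t)]_j^j = -\sqrt{j(j+1)}\,X_j(t)$ | $[\mathbf{n}\times\operatorname{curl}\mathbf{F}(t)]_j^j - \frac{j}{a}F_j^j(t) = -\sqrt{\frac{j}{j+1}}\,(2j+1)\,G_j^{(e)}(t)$
Adjoint | $[\mathbf{n}\times\operatorname{curl}\mathbf{F}(\tau)]_j^j = -\sqrt{j(j+1)}\,\Delta Z_j(T-\tau)$ | $[\mathbf{n}\times\operatorname{curl}\mathbf{F}(\tau)]_j^j - \frac{j}{a}F_j^j(\tau) = \sqrt{\frac{j}{j+1}}\,\Delta G_j^{(i)}(T-\tau)$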
10
Sensitivity Analysis for CHAMP Magnetic Data
The forward and adjoint solutions are now computed for the 2001 CHAMP data (see Sect. 7) with spherical harmonic cutoff degree $j_{\max} = 4$ and time step $\Delta t = 1$ h. The sensitivity analysis of the data is performed with respect to two different conductivity models: a three-layer, 1-D conductivity model and a two-layer, 2-D conductivity model. For each case, the approximation error of the adjoint sensitivity method is first investigated, and then the conjugate gradient method is run to search for an optimal conductivity model by fitting the $Z$ component of CHAMP data in a least-squares sense.
10.1
Brute-Force Sensitivities
Sensitivities generated with the adjoint sensitivity method (ASM), hereafter called the adjoint sensitivities, will be compared to those generated by direct numerical differentiation of the misfit, the so-called brute-force method (BFM) (e.g., Bevington 1969), in which the partial derivative of the misfit with respect to $\sigma_m$ at the point $\vec{\sigma}^0$ is approximated by the second-order-accurate centered difference of two forward model runs:
$$\left.\frac{\partial\chi^2}{\partial\sigma_m}\right|_{\vec{\sigma}^0} \approx \frac{\chi^2(\sigma_1^0,\dots,\sigma_m^0+\varepsilon,\dots,\sigma_M^0) - \chi^2(\sigma_1^0,\dots,\sigma_m^0-\varepsilon,\dots,\sigma_M^0)}{2\varepsilon}, \tag{208}$$
where $\varepsilon$ refers to a perturbation applied to the nominal value of $\sigma_m^0$.
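The centered difference (208) can be sketched as follows. The callable `misfit` stands in for a full forward model run followed by a misfit evaluation; it and the function name `brute_force_gradient` are hypothetical. The default step follows the value $\varepsilon = 0.01$ quoted in the caption of Fig. 7.

```python
import numpy as np


def brute_force_gradient(misfit, sigma0, eps=0.01):
    """Centered-difference estimate of the misfit gradient at sigma0.

    Each component costs two evaluations of misfit, so a model with M
    parameters needs 2 * M forward runs in total.
    """
    sigma0 = np.asarray(sigma0, dtype=float)
    grad = np.empty_like(sigma0)
    for m in range(sigma0.size):
        step = np.zeros_like(sigma0)
        step[m] = eps                       # perturb one parameter at a time
        grad[m] = (misfit(sigma0 + step) - misfit(sigma0 - step)) / (2.0 * eps)
    return grad
```

The cost grows linearly with the number of parameters, which is the reason the brute-force method serves here as a check on the adjoint sensitivities rather than as a replacement for them.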
10.2
Model Parameterization
To parameterize the electrical conductivity, the radial interval $\langle 0, a\rangle$ is divided into $L$ subintervals by the nodes $0 = R_1 < R_2 < \dots < R_L < R_{L+1} = a$ such that the radial dependence of the electrical conductivity $\sigma(r,\vartheta)$ is approximated by piecewise constant functions:
$$\sigma(r,\vartheta) = \sigma_\ell(\vartheta), \quad R_\ell \le r \le R_{\ell+1}, \tag{209}$$
where $\sigma_\ell(\vartheta)$ for a given layer $\ell = 1, \dots, L$ does not depend on the radial coordinate $r$. Moreover, let $\sigma_\ell(\vartheta)$ be parameterized by the zonal scalar spherical harmonics $Y_j(\vartheta)$. As a result, the logarithm of the electrical conductivity is considered in the form
$$\log\sigma(r,\vartheta;\vec{\sigma}) = \sqrt{4\pi}\sum_{\ell=1}^{L}\sum_{j=0}^{J}\sigma_{\ell j}\,\xi_\ell(r)\,Y_j(\vartheta), \tag{210}$$
where $\xi_\ell(r)$ is equal to 1 in the interval $R_\ell \le r \le R_{\ell+1}$ and 0 elsewhere. The number of conductivity parameters $\sigma_{\ell j}$, that is, the size of the conductivity parameter vector $\vec{\sigma}$, is $M = L(J+1)$.
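A short sketch of evaluating the parameterization (210), assuming that the logarithm is to base 10 (as the nominal values quoted in Sect. 10.3 suggest) and using $\sqrt{4\pi}\,Y_j(\vartheta) = \sqrt{2j+1}\,P_j(\cos\vartheta)$ from Eq. 211. The function name, the array layout, and the clipping of radii outside the node range are choices made for illustration only.

```python
import numpy as np
from numpy.polynomial import legendre


def log10_conductivity(r, theta, params, radii):
    """Evaluate log10 of the conductivity at radius r and colatitude theta.

    params : array of shape (L, J + 1) holding the parameters sigma_{l j}
    radii  : increasing layer nodes R_1 < ... < R_{L+1}
    Radii outside the node range are assigned to the nearest layer.
    """
    params = np.asarray(params, dtype=float)
    radii = np.asarray(radii, dtype=float)
    layer = int(np.searchsorted(radii, r, side="right")) - 1
    layer = min(max(layer, 0), params.shape[0] - 1)
    degrees = np.arange(params.shape[1])
    # sqrt(4 pi) * Y_j(theta) = sqrt(2 j + 1) * P_j(cos theta)
    coeffs = params[layer] * np.sqrt(2 * degrees + 1)
    return legendre.legval(np.cos(theta), coeffs)
```

For $J = 1$ the expression reduces to $\sigma_{\ell 0} + \sqrt{3}\,\sigma_{\ell 1}\cos\vartheta$, so at colatitude $\vartheta = 90^\circ$ the value equals the degree-0 parameter alone.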
10.3
Three-Layer, 1-D Conductivity Model
Consider a 1-D conducting sphere $B$ consisting of the lithosphere, the upper mantle (UM), the upper (ULM) and lower (LLM) parts of the lower mantle, and the core. The interfaces between the conductivity layers are kept fixed at depths of 220, 670, 1,500, and 2,890 km, respectively. The conductivities of the lithosphere and the core are 0.001 and 10,000 S/m, respectively, and fixed at these values for all computation runs; hence the number of conductivity parameters $\sigma_{\ell 0}$ is $L = 3$. The nominal values of the conductivity parameters are $\sigma_{10}^0 = 1$ (hence $\sigma_{\mathrm{LLM}} = 10$ S/m), $\sigma_{20}^0 = 0$ ($\sigma_{\mathrm{ULM}} = 1$ S/m), and $\sigma_{30}^0 = -1$ ($\sigma_{\mathrm{UM}} = 0.1$ S/m).
Sensitivity Comparison
The results of the sensitivity tests computed for the three-layer, 1-D conductivity model are summarized in Fig. 7, where the top panels show the misfit $\chi^2$ as a function of one conductivity parameter $\sigma_{\ell 0}$, with the other two equal to the nominal values. The bottom panels compare the derivatives of the misfit obtained by the ASM with the BFM. From these results, two conclusions can be drawn. First, the differences between the derivatives of the misfit obtained by the ASM and BFM (the dashed lines in the bottom panels) are about one order (for $\sigma_{30}$) and at least two orders (for $\sigma_{10}$ and $\sigma_{20}$) of magnitude smaller than the derivatives themselves, which justifies the validity of the ASM. The differences between the adjoint and brute-force sensitivities are caused by the approximation error in the numerical time differentiation (82). This error can be reduced by low-pass filtering of the CHAMP time series (Martinec and Velímský 2009). Second, both the top and bottom panels show that the misfit $\chi^2$ is most sensitive to the conductivity changes in the upper mantle and decreases with increasing depth of the conductivity layer, being least sensitive to conductivity changes in the lower part of the lower mantle.
Conjugate Gradient Inversion
The sensitivity results in Fig. 7 are encouraging with regard to the solution of the inverse problem for a 1-D mantle conductivity structure. The conjugate gradient (CG) minimization with bracketing and line searching is employed using Brent's method with derivatives (Press et al. 1992, Sect. 10.3) obtained by the ASM. The inverse problem is solved for the three parameters $\sigma_{\ell 0}$, with starting values equal to $(1.5, 0, -1)$. Figure 8 shows the results of the inversion, where the left panel displays the conductivity structure in the three-layer mantle and the right panel the misfit $\chi^2$ as a function of the CG iterations.
The blue line shows the starting model of the CG minimization, the dotted line the model after the first iteration, and the red line the model after ten iterations. As expected from the sensitivity tests, the minimization first modifies the conductivities of the UM and ULM, to which the misfit $\chi^2$ is the most sensitive. When the UM and ULM conductivities are improved, the CG minimization also changes the LLM conductivity. The optimal values of the conductivity parameters after ten iterations are $(\sigma_{10}, \sigma_{20}, \sigma_{30}) = (1.990, 0.186, -0.501)$. This corresponds to the conductivities $\sigma_{\mathrm{ULM}} = 1.53$ S/m and $\sigma_{\mathrm{UM}} = 0.32$ S/m for the ULM and UM, which are considered to be well resolved, while the conductivity $\sigma_{\mathrm{LLM}} = 97.8$ S/m should be treated with some reservation because of its poor resolution. A CHAMP time series longer than 1 year would be necessary to increase the sensitivity of CHAMP data to the LLM conductivity.
[Figure 7: three columns of panels for the depth ranges CMB–1,500 km, 1,500–670 km, and 670–220 km; top row $\chi^2$, bottom row $\log|\nabla_{\vec{\sigma}}\chi^2|$, plotted against $\sigma_{10}$, $\sigma_{20}$, and $\sigma_{30}$]
Fig. 7 The misfit $\chi^2$ (top panels) and the magnitude of its sensitivities $\nabla_{\vec{\sigma}}\chi^2$ (bottom panels) as functions of the conductivity parameters $\sigma_{\ell 0}$ for the three-layer, 1-D conductivity model consisting of the lower and upper parts of the lower mantle ($\ell = 1, 2$) and the upper mantle ($\ell = 3$). The two panels in a column show a cross section through the respective hypersurfaces $\chi^2$ and $|\nabla_{\vec{\sigma}}\chi^2|$ in the 3-D parameter space along one parameter, while the other two model parameters are kept fixed and equal to the nominal values $\vec{\sigma}^0 = (2, 0, -1)$. The adjoint sensitivities computed for $\Delta t = 1$ h (the solid lines in the bottom panels) are compared with the brute-force sensitivities ($\varepsilon = 0.01$), and their differences are shown (the dashed lines)
[Figure 8: left panel, depth (km) versus conductivity (S/m), with interfaces marked at 220, 670, and 1,500 km and at the CMB; right panel, $\chi^2$ versus CG iteration]
Fig. 8 Three-layer, 1-D conductivity model (left panel) best fitting the 2001 CHAMP data (red line), the starting model for the CG minimization (blue line), and the model after the first iteration (dotted line). The right panel shows the misfit $\chi^2$ as a function of CG iterations
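The quoted conductivities follow from the optimal parameters if the logarithm in Eq. 210 is taken to base 10, an assumption that is consistent with the nominal values of Sect. 10.3. A minimal arithmetic check (the function name is hypothetical):

```python
def conductivity_from_parameter(log10_sigma):
    """Conductivity in S/m for a degree-0 parameter, assuming log base 10."""
    return 10.0 ** log10_sigma


# Optimal degree-0 parameters after ten CG iterations, as quoted in the text.
optimal = {"LLM": 1.990, "ULM": 0.186, "UM": -0.501}
for layer, value in optimal.items():
    print(f"{layer}: {conductivity_from_parameter(value):.2f} S/m")
```

The results agree with the values in the text to within the rounding of the three-decimal parameters.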
10.4
Two-Layer, 2-D Conductivity Model
Sensitivity Comparison
The adjoint sensitivities are now computed for the 2-D conductivity model, again consisting of the lithosphere, the upper mantle, and the upper and lower parts of the lower mantle, with the interfaces at depths of 220, 670, 1,500, and 2,890 km, respectively. The conductivities of the UM and ULM are now considered to be $\vartheta$ dependent, such that the cutoff degree $J$ in the conductivity parameterization (210) is equal to $J = 1$. The conductivity of the lithosphere is again fixed to 0.001 S/m. Because of the rather poor resolution of the LLM conductivity, this conductivity is chosen to be equal to the optimal value obtained by the CG minimization, that is, 97.8 S/m, and is kept fixed throughout the sensitivity tests and subsequent inversion. Complementary to the sensitivity tests for the zonal coefficients $\sigma_{\ell 0}$ shown in Fig. 7, the sensitivity tests for non-zonal coefficients $\sigma_{\ell 1}$ of the ULM ($\ell = 1$) and UM ($\ell = 2$) are now carried out in a manner similar to that applied in section "Sensitivity Comparison" in Sect. 10.3, with the same nominal values for the zonal coefficients $\sigma_{\ell 0}^0$ and $\sigma_{11}^0 = \sigma_{21}^0 = 0$. The forward and adjoint solutions are again computed for the 2001 CHAMP data (see Sect. 7) with spherical harmonic cutoff degree $j_{\max} = 4$ and time step $\Delta t = 1$ h. The Earth model is again divided into 40 finite-element layers with layer thicknesses increasing with depth. Figure 9 summarizes the results of the sensitivity tests. The top panels show the misfit $\chi^2$ as a function of the parameters $\sigma_{11}$ and $\sigma_{21}$, where only one conductivity parameter is varied and the other is set to zero. The bottom panels compare the derivatives of the misfit obtained by the ASM and the BFM. It can be seen that the adjoint sensitivities show very good agreement with the brute-force results, with differences not exceeding 0.01 % of the magnitude of the sensitivities themselves.
Moreover, the sensitivities to latitudinal dependency of conductivity are significant, again more pronounced in the upper mantle than in the lower mantle. This indicates that the CHAMP data are capable of revealing lateral variations of conductivity in the upper and lower mantle.
Conjugate Gradient Inversion
The sensitivity results in Fig. 9 encourage an attempt to solve the inverse problem for lateral variations of conductivity in the mantle. For this purpose, the CG minimization with derivatives obtained by the ASM is again employed. The inverse problem is solved for four parameters, $\sigma_{\ell 0}$ and $\sigma_{\ell 1}$, $\ell = 1, 2$. The starting values of $\sigma_{\ell 0}$ are the nominal values of the three-layer, 1-D conductivity model (see Sect. 10.3), while the values of $\sigma_{\ell 1}$ are set equal to zero at the start of minimization. The results of the inversion are summarized in Fig. 10, where the left and center panels show the conductivity structure in the ULM and UM, while the right panel shows the misfit $\chi^2$ as a function of CG iterations. The blue lines show the starting model of minimization, the dotted lines the model after the first iteration, and the red lines the final model after eight iterations. These models are compared with the optimal three-layer, 1-D conductivity model (black lines) found in section "Conjugate Gradient Inversion" in Sect. 10.3. Again, as indicated by the sensitivity tests, the minimization, at the first stage, adjusts the conductivity in the upper mantle, to which the misfit $\chi^2$ is the most sensitive, and then varies the ULM conductivity, to which the misfit is less sensitive. The optimal values of the conductivity parameters after eight iterations are $(\sigma_{10}, \sigma_{11}, \sigma_{20}, \sigma_{21}) = (0.192, 0.008, -0.476, 0.106)$. It is concluded that the mantle conductivity variations in the latitudinal direction reach about 20 % of the mean value in the upper mantle and about 4 % in the upper part of the lower mantle. Comparing the optimal values of the zonal coefficients $\sigma_{10}$ and $\sigma_{20}$ with those found in section "Conjugate Gradient Inversion" in Sect. 10.3 for a 1-D conductivity model, it is concluded that the averaged optimal 2-D conductivity structure closely approaches the optimal 1-D structure. This is also indicated in Fig. 10, where the final 2-D conductivity profile (red lines) intersects the optimal 1-D conductivity profile (black lines) at the magnetic equator.
[Figure 9: two columns of panels for the depth ranges 1,500–670 km and 670–220 km; top row $\chi^2$, bottom row $\log|\nabla_{\vec{\sigma}}\chi^2|$, plotted against $\sigma_{11}$ and $\sigma_{21}$]
Fig. 9 As for Fig. 7, but with respect to the conductivity parameters $\sigma_{\ell 1}$ of the latitudinally dependent conductivities of the upper part of the lower mantle ($\ell = 1$) and the upper mantle ($\ell = 2$). The nominal values of the conductivity parameters are $(\sigma_{\ell 0}^0, \sigma_{\ell 1}^0) = (0, 0, -1, 0)$. The results apply to conductivities of the lithosphere, the lower part of the lower mantle, and the core equal to 0.001, 97.8, and $10^4$ S/m, respectively
[Figure 10: left and middle panels, conductivity (S/m) versus colatitude (°) for the depth ranges 1,500–670 km and 670–220 km; right panel, $\chi^2$ versus CG iteration]
Fig. 10 Two-layer, latitudinally dependent conductivity model of the upper part of the lower mantle and the upper mantle (left and middle panels). The model best fitting the 2001 CHAMP data (red lines), the starting model for the CG minimization (blue lines), and the model after the first iteration (dotted line) are compared to the best 1-D conductivity model from Fig. 8 (black lines). The right panel shows the misfit $\chi^2$ as a function of the number of CG iterations; the dashed line shows the misfit $\chi^2$ for the best 1-D conductivity model
11
Conclusions
This chapter has been motivated by efforts to give a detailed presentation of the advanced mathematical methods available for interpreting the time series of CHAMP magnetic data such that the complete time series, not only parts of them, can be considered in forward and inverse modeling while remaining computationally feasible. It turned out that these criteria are satisfied by highly efficient methods of forward and adjoint sensitivity analysis that are numerically based on the time-domain, spectral finite-element method. This has been demonstrated for the year 2001 CHAMP time series with a time step of 1 h. Applying the forward and adjoint sensitivity methods to longer time series is straightforward, leading to memory and computational time requirements that are linear with respect to the number of time steps undertaken. The analysis of the complete, more than 8-year-long, CHAMP time series is ongoing with the particular objective of determining the lower-mantle conductivity.
The achievement of the present approach is its ability to use satellite data directly, without continuing them from the satellite altitude to the ground level or decomposing them into the exciting and induced parts by spherical harmonic analysis. This fact is demonstrated for a 2-D configuration, for which the electrical conductivity and the external sources of the electromagnetic variations are axisymmetrically distributed and for which the external current excitation is transient, as for a magnetic storm. The 2-D case corresponds to the situation where vector magnetic data along each track of a satellite, such as the CHAMP satellite, are used. The present approach can be extended to the transient electromagnetic induction in a 3-D heterogeneous sphere if the signals from multiple satellites, simultaneously supplemented by ground-based magnetic observations, are available in the future.
The presented sensitivity analysis has shown that the 2001 CHAMP data are clearly sensitive to latitudinal variations in mantle conductivity. This result suggests the need to modify the forward and adjoint methods for an axisymmetric distribution of mantle conductivity to the case where the CHAMP data will only be considered over particular areas above the Earth's surface, for instance, the Pacific Ocean, allowing the study of how latitudinal variations in conductivity differ from region to region. This procedure would enable one to find conductivity variations not only in the latitudinal direction but also in the longitudinal direction. This idea warrants further investigation, because it belongs to the category of problems related to data assimilation, to which methods of constrained minimization can be applied. Similar methods can also be applied to the assimilation of the recordings at permanent geomagnetic observatories into the conductivity models derived from satellite observations.
Appendix: Zonal Scalar and Vector Spherical Harmonics
In this section, we define the zonal scalar and vector spherical harmonics, introduce their orthonormality properties, and give some other relations. All considerations follow the book by Varshalovich et al. (1989), which is referenced in the following. The zonal scalar spherical harmonics $Y_j(\vartheta)$ can be defined in terms of the Legendre polynomials $P_j(\cos\vartheta)$ of degree $j$ (ibid., p. 134, Eq. 6):
$$Y_j(\vartheta) := \sqrt{\frac{2j+1}{4\pi}}\,P_j(\cos\vartheta), \tag{211}$$
where $j = 0, 1, \dots$. The orthogonality property of the Legendre polynomials over the interval $0 \le \vartheta \le \pi$ (ibid., p. 149, Eq. 10) results in the orthonormality property of the zonal scalar spherical harmonics $Y_j(\vartheta)$ over the full solid angle ($0 \le \vartheta \le \pi$, $0 \le \varphi < 2\pi$):
$$\int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi} Y_{j_1}(\vartheta)\,Y_{j_2}(\vartheta)\,\sin\vartheta\,d\vartheta\,d\varphi = \delta_{j_1 j_2}, \tag{212}$$
where $\delta_{ij}$ stands for the Kronecker delta symbol. Note that the integration over longitude $\varphi$ can be performed analytically, resulting in the multiplication by a factor of $2\pi$. However, the form of the double integration will be kept since it is consistent with surface integrals considered in the main text.
The zonal vector spherical harmonics $\mathbf{Y}_j^\ell(\vartheta)$, $j = 0, 1, \dots$; $\ell = j \pm 1, j$, can be defined (see also chapters Gravitational Viscoelastodynamics and Elastic and Viscoelastic Response of the Lithosphere to Surface Loading) via their polar components (ibid., p. 211, Eq. (10); pp. 213–214, Eqs. 25–27):
$$\begin{aligned}
\sqrt{j(2j+1)}\,\mathbf{Y}_j^{j-1}(\vartheta) &= j\,Y_j(\vartheta)\,\mathbf{e}_r + \frac{\partial Y_j(\vartheta)}{\partial\vartheta}\,\mathbf{e}_\vartheta,\\
\sqrt{(j+1)(2j+1)}\,\mathbf{Y}_j^{j+1}(\vartheta) &= -(j+1)\,Y_j(\vartheta)\,\mathbf{e}_r + \frac{\partial Y_j(\vartheta)}{\partial\vartheta}\,\mathbf{e}_\vartheta,\\
\sqrt{j(j+1)}\,\mathbf{Y}_j^{j}(\vartheta) &= -i\,\frac{\partial Y_j(\vartheta)}{\partial\vartheta}\,\mathbf{e}_\varphi,
\end{aligned} \tag{213}$$
where $i = \sqrt{-1}$, and $\mathbf{e}_r$, $\mathbf{e}_\vartheta$, and $\mathbf{e}_\varphi$ are spherical base vectors. The vector functions $\mathbf{Y}_j^{j\pm1}(\vartheta)$ are called the zonal spheroidal vector spherical harmonics and $\mathbf{Y}_j^j(\vartheta)$ are the zonal toroidal vector spherical harmonics. A further useful form of the zonal toroidal vector spherical harmonics can be obtained considering $\partial Y_j(\vartheta)/\partial\vartheta = \sqrt{j(j+1)}\,P_{j1}(\cos\vartheta)$ (ibid., p. 146, Eq. 5), where $P_{j1}(\cos\vartheta)$ are the fully normalized associated Legendre functions of order $m = 1$:
$$\mathbf{Y}_j^j(\vartheta) = -i\,P_{j1}(\cos\vartheta)\,\mathbf{e}_\varphi. \tag{214}$$
The orthonormality property of the spherical base vectors and the zonal scalar spherical harmonics combine to give the orthonormality property of the zonal vector spherical harmonics (ibid., p. 227, Eq. 117):
$$\int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi}\mathbf{Y}_{j_1}^{\ell_1}(\vartheta)\cdot\left[\mathbf{Y}_{j_2}^{\ell_2}(\vartheta)\right]^{*}\sin\vartheta\,d\vartheta\,d\varphi = \delta_{j_1 j_2}\,\delta_{\ell_1 \ell_2}, \tag{215}$$
where the dot stands for the scalar product of vectors and the asterisk denotes complex conjugation. Since both the zonal scalar spherical harmonics and the spherical base vectors are real functions, Eq. 213 shows that the spheroidal vector harmonics $\mathbf{Y}_j^{j\pm1}(\vartheta)$ are real functions, whereas the toroidal vector harmonics $\mathbf{Y}_j^j(\vartheta)$ are pure imaginary functions. To avoid complex arithmetic, $\mathbf{Y}_j^j(\vartheta)$ is redefined in such a way that it becomes a real function (of colatitude $\vartheta$):
$$\mathbf{Y}_j^j(\vartheta) := P_{j1}(\cos\vartheta)\,\mathbf{e}_\varphi. \tag{216}$$
At this stage, a remark about this step is required. To avoid additional notation, the same notation is used for the real and complex versions of $\mathbf{Y}_j^j(\vartheta)$, since the real version of $\mathbf{Y}_j^j(\vartheta)$ is exclusively used throughout this chapter. This is in contrast to Martinec (1997), Martinec et al. (2003), and Martinec and McCreadie (2004), where the complex functions $\mathbf{Y}_j^j(\vartheta)$, defined by Eq. 214, have been used. However, the redefinition (216) only makes sense for studying a phenomenon with an axisymmetric geometry. For a more complex phenomenon, the original definition (214) is to be used. The orthonormality property (215) for the real zonal vector spherical harmonics now reads as
$$\int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi}\mathbf{Y}_{j_1}^{\ell_1}(\vartheta)\cdot\mathbf{Y}_{j_2}^{\ell_2}(\vartheta)\,\sin\vartheta\,d\vartheta\,d\varphi = \delta_{j_1 j_2}\,\delta_{\ell_1 \ell_2}. \tag{217}$$
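The orthonormality relations for the zonal scalar harmonics and for the real toroidal harmonics can be checked numerically. The sketch below uses Gauss–Legendre quadrature in $x = \cos\vartheta$ and carries out the longitude integral analytically as a factor $2\pi$; the function names are hypothetical, and the toroidal harmonics are represented by their single $\mathbf{e}_\varphi$ component $P_{j1}(\cos\vartheta) = [j(j+1)]^{-1/2}\,\partial Y_j/\partial\vartheta$.

```python
import numpy as np
from numpy.polynomial import legendre


def zonal_scalar(j, x):
    """Y_j as a function of x = cos(theta)."""
    c = np.zeros(j + 1)
    c[j] = 1.0
    return np.sqrt((2 * j + 1) / (4 * np.pi)) * legendre.legval(x, c)


def zonal_toroidal_phi(j, x):
    """e_phi component of the real toroidal harmonic, valid for j >= 1."""
    c = np.zeros(j + 1)
    c[j] = 1.0
    dP_dx = legendre.legval(x, legendre.legder(c))
    # Chain rule: d/dtheta = -sin(theta) d/dx, with sin(theta) = sqrt(1 - x^2).
    dY_dtheta = -np.sqrt(1.0 - x**2) * np.sqrt((2 * j + 1) / (4 * np.pi)) * dP_dx
    return dY_dtheta / np.sqrt(j * (j + 1))


def gram_matrix(func, degrees, n_nodes=64):
    """Matrix of surface integrals of func(j1) * func(j2) over the sphere."""
    x, w = legendre.leggauss(n_nodes)
    values = np.array([func(j, x) for j in degrees])
    return 2.0 * np.pi * (values * w) @ values.T
```

Both Gram matrices should come out as identity matrices to rounding error, since the integrands are polynomials in $x$ and the quadrature is exact for the degrees used.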
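The formulae for the scalar and vector products of the radial unit vector $\mathbf{e}_r$ and the zonal vector spherical harmonics $\mathbf{Y}_j^\ell(\vartheta)$ follow from Eq. 213:
$$\begin{aligned}
\mathbf{e}_r\cdot\mathbf{Y}_j^{j-1}(\vartheta) &= \sqrt{\frac{j}{2j+1}}\,Y_j(\vartheta),\\
\mathbf{e}_r\cdot\mathbf{Y}_j^{j+1}(\vartheta) &= -\sqrt{\frac{j+1}{2j+1}}\,Y_j(\vartheta),\\
\mathbf{e}_r\cdot\mathbf{Y}_j^{j}(\vartheta) &= 0,
\end{aligned} \tag{218}$$
and
$$\begin{aligned}
\mathbf{e}_r\times\mathbf{Y}_j^{j-1}(\vartheta) &= \sqrt{\frac{j+1}{2j+1}}\,\mathbf{Y}_j^j(\vartheta),\\
\mathbf{e}_r\times\mathbf{Y}_j^{j+1}(\vartheta) &= \sqrt{\frac{j}{2j+1}}\,\mathbf{Y}_j^j(\vartheta),\\
\mathbf{e}_r\times\mathbf{Y}_j^{j}(\vartheta) &= -\sqrt{\frac{j+1}{2j+1}}\,\mathbf{Y}_j^{j-1}(\vartheta) - \sqrt{\frac{j}{2j+1}}\,\mathbf{Y}_j^{j+1}(\vartheta).
\end{aligned} \tag{219}$$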
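Any vector $\mathbf{A}(\vartheta)$ which depends on colatitude $\vartheta$ and which is square-integrable over the interval $0 \le \vartheta \le \pi$ may be expanded in a series of the zonal vector spherical harmonics, that is,
$$\mathbf{A}(\vartheta) = \sum_{j=0}^{\infty}\sum_{\ell=|j-1|}^{j+1} A_j^\ell\,\mathbf{Y}_j^\ell(\vartheta) \tag{220}$$
with the expansion coefficients given by
$$A_j^\ell = \int_{\varphi=0}^{2\pi}\!\int_{\vartheta=0}^{\pi}\mathbf{A}(\vartheta)\cdot\mathbf{Y}_j^\ell(\vartheta)\,\sin\vartheta\,d\vartheta\,d\varphi. \tag{221}$$
The curl of vector $\mathbf{A}(r,\vartheta)$ is then
$$\operatorname{curl}\mathbf{A} = \sum_{j=1}^{\infty}\sum_{\ell=j-1}^{j+1} R_j^\ell(r)\,\mathbf{Y}_j^\ell(\vartheta), \tag{222}$$
where (ibid., p. 217, Eq. 54)
$$\begin{aligned}
R_j^{j-1}(r) &= -\sqrt{\frac{j+1}{2j+1}}\left(\frac{d}{dr} + \frac{j+1}{r}\right)A_j^j(r),\\
R_j^{j+1}(r) &= -\sqrt{\frac{j}{2j+1}}\left(\frac{d}{dr} - \frac{j}{r}\right)A_j^j(r),\\
R_j^{j}(r) &= \sqrt{\frac{j+1}{2j+1}}\left(\frac{d}{dr} - \frac{j-1}{r}\right)A_j^{j-1}(r) + \sqrt{\frac{j}{2j+1}}\left(\frac{d}{dr} + \frac{j+2}{r}\right)A_j^{j+1}(r).
\end{aligned} \tag{223}$$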
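The radial and tangential components of $\operatorname{curl}\mathbf{A}$ may be evaluated as
$$\mathbf{e}_r\cdot\operatorname{curl}\mathbf{A} = -\frac{1}{r}\sum_{j=1}^{\infty}\sqrt{j(j+1)}\,A_j^j(r)\,Y_j(\vartheta), \tag{224}$$
and
$$\mathbf{e}_r\times\operatorname{curl}\mathbf{A} = -\sum_{j=1}^{\infty}\left(\frac{d}{dr} + \frac{1}{r}\right)A_j^j(r)\,\mathbf{Y}_j^j(\vartheta) + \sum_{j=1}^{\infty}R_j^j(r)\left[\mathbf{e}_r\times\mathbf{Y}_j^j(\vartheta)\right]. \tag{225}$$
In particular, for a toroidal vector $\mathbf{A}(\vartheta)$, the coefficients $A_j^{j\pm1}(r) = 0$ and Eq. 225 reduces to
$$\mathbf{e}_r\times\operatorname{curl}\mathbf{A} = -\sum_{j=1}^{\infty}\left(\frac{d}{dr} + \frac{1}{r}\right)A_j^j(r)\,\mathbf{Y}_j^j(\vartheta). \tag{226}$$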
Acknowledgements The author thanks Kevin Fleming for his comments on the manuscript. The author acknowledges support from the Grant Agency of the Czech Republic through Grant No. 205/09/0546.
References Avdeev DB, Avdeeva AD (2006) A rigorous three-dimensional magnetotelluric inversion. PIER 62:41–48 Banks R (1969) Geomagnetic variations and the electrical conductivity of the upper mantle. Geophys J R Astron Soc 17:457–487 Banks RJ, Ainsworth JN (1992) Global induction and the spatial structure of mid-latitude geomagnetic variations. Geophys J Int 110:251–266 Bevington PR (1969) Data reduction and error analysis for the physical sciences. McGraw-Hill, New York Cacuci DG (2003) Sensitivity and uncertainty analysis. Volume I. Theory. Chapman & Hall/CRC, Boca Raton Constable S, Constable C (2004) Observing geomagnetic induction in magnetic satellite measurements and associated implications for mantle conductivity. Geochem Geophys Geosyst 5:Q01006. doi:10.1029/2003GC000634
Daglis IA, Thorne RM, Baumjohann W, Orsini S (1999) The terrestrial ring current: origin, formation and decay. Rev Geophys 37:407–438 Didwall EM (1984) The electrical conductivity of the upper mantle as estimated from satellite magnetic field data. J Geophys Res 89:537–542 Dorn O, Bertete-Aquirre H, Berryman JG, Papanicolaou GC (1999) A nonlinear inversion method for 3-D electromagnetic imaging using adjoint fields. Inverse Probl 15:1523–1558 Eckhardt D, Lamer K, Madden T (1963) Long periodic magnetic fluctuations and mantle conductivity estimates. J Geophys Res 68:6279–6286 Everett ME, Martinec Z (2003) Spatiotemporal response of a conducting sphere under simulated geomagnetic storm conditions. Phys Earth Planet Inter 138:163–181 Everett ME, Schultz A (1996) Geomagnetic induction in a heterogeneous sphere: azimuthally symmetric test computations and the response of an undulating 660-km discontinuity. J Geophys Res 101:2765–2783 Fainberg EB, Kuvshinov AV, Singer BSh (1990) Electromagnetic induction in a spherical Earth with non-uniform oceans and continents in electric contact with the underlying medium – I. Theory, method and example. Geophys J Int 102:273–281 Farquharson CG, Oldenburg DW (1996) Approximate sensitivities for the electromagnetic inverse problem. Geophys J Int 126:235–252 Hamano Y (2002) A new time-domain approach for the electromagnetic induction problem in a three-dimensional heterogeneous earth. Geophys J Int 150:753–769 Hultqvist B (1973) Perturbations of the geomagnetic field. In: Egeland A, Holter O, Omholt A (eds) Cosmical geophysics. Universitetsforlaget, Oslo, pp 193–201 Jupp DLB, Vozoff K (1977) Two-dimensional magnetotelluric inversion. Geophys J R Astron Soc 50:333–352 Kelbert A, Egbert GD, Schultz A (2008) Non-linear conjugate gradient inversion for global EM induction: resolution studies. Geophys J Int 173:365–381 Kivelson MG, Russell CT (1995) Introduction to space physics, Cambridge University Press, Cambridge. 
Korte M, Constable S, Constable C (2003) Separation of external magnetic signal for induction studies. In: Reigber Ch, Lühr H, Schwintzer P (eds) First CHAMP mission results for gravity, magnetic and atmospheric studies. Springer, Berlin, pp 315–320 Kˇrížek M, Neittaanmäki P (1990) Finite element approximation of variational problems and applications. Longmann Scientific and Technical/Wiley, New York Kuvshinov AV (2010) Deep electromagnetic studies from land, sea, and space: progress status in the past 10 years. Surv Geophs 33:169–209 Kuvshinov A, Olsen N (2006) A global model of mantle conductivity derived from 5 years of CHAMP, Ørsted, and SAC-C magnetic data. Geophys Res Lett 33:L18301. doi:10.1029/2006GL027083 Kuvshinov AV, Avdeev DB, Pankratov OV (1999a) Global induction by Sq and Dst sources in the presence of oceans: bimodal solutions for non-uniform spherical surface shells above radially symmetric earth models in comparison to observations. Geophys J Int 137:630–650 Kuvshinov AV, Avdeev DB, Pankratov OV, Golyshev SA (1999b) Modelling electromagnetic fields in 3-D spherical earth using fast integral equation approach. In: Expanded abstract of the 2nd international symposium on 3-D electromagnetics, pp 84–88. The university of Utah Lanczos C (1961) Linear differential operators. Van Nostrand, Princeton Langel RA, Estes RH (1985a) Large-scale, near-field magnetic fields from external sources and the corresponding induced internal field. J Geophys Res 90:2487–2494 Langel RA, Estes RH (1985b) The near-Earth magnetic field at 1980 determined from Magsat data. J Geophys Res 90:2495–2510 Langel RA, Sabaka TJ, Baldwin RT, Conrad JA (1996) The near-Earth magnetic field from magneto spheric and quiet-day ionospheric sources and how it is modeled. Phys Earth Planet Inter 98:235–268 Madden TM, Mackie RL (1989) Three-dimensional magnetotelluric modelling and inversion. Proc Inst Electron Electric Eng 77:318–333
Marchuk GI (1995) Adjoint equations and analysis of complex systems. Kluwer, Dordrecht Martinec Z (1989) Program to calculate the spectral harmonic expansion coefficients of the two scalar fields product. Comput Phys Commun 54:177–182 Martinec Z (1997) Spectral-finite element approach to two-dimensional electromagnetic induction in a spherical earth. Geophys J Int 130:583–594 Martinec Z (1999) Spectral-finite element approach to three-dimensional electromagnetic induction in a spherical earth. Geophys J Int 136:229–250 Martinec Z, McCreadie H (2004) Electromagnetic induction modelling based on satellite magnetic vector data. Geophys J Int 157:1045–1060 Martinec Z, Velímský J (2009) The adjoint sensitivity method of global electromagnetic induction for CHAMP magnetic data. Geophys J Int 179:1372–1396. doi:10.1111/j.1365246X.2009.04356.x Martinec Z, Everett ME, Velímský J (2003) Time-domain, spectral-finite element approach to transient two-dimensional geomagnetic induction in a spherical heterogeneous earth. Geophys J Int 155:33–43 McGillivray PR, Oldenburg DW (1990) Methods for calculating Fréchet derivatives and sensitivities for the non-linear inverse problems: a comparative study. Geophys Prospect 38:499–524 McGillivray PR, Oldenburg DW, Ellis RG, Habashy TM (1994) Calculation of sensitivities for the frequency-domain electromagnetic problem. Geophys J Int 116:1–4 Morse PW, Feshbach H (1953) Methods of theoretical physics. McGraw-Hill, New York Newman GA, Alumbaugh DL (1997) Three-dimensional massively parallel electromagnetic inversion – I. Theory Geophys J Int 128:345–354 Newman GA, Alumbaugh DL (2000) Three-dimensional magnetotelluric inversion using nonlinear conjugate on induction effects of geomagnetic daily variations from equatorial gradients. Geophys J Int 140:410–424 Oldenburg DW (1990) Inversion of electromagnetic data: an overview of new techniques. Surv Geophys 11:231–270 Olsen N (1999) Induction studies with satellite data. 
Surv Geophys 20:309–340 Olsen N, Stolle C (2012) Satellite Geomagnetism. Annu Rev Earth Planet Sci 40:441–465 Olsen N, Sabaka TJ, Lowes F (2005) New parameterization of external and induced fields in geomagnetic field modeling, and a candidate model for IGRF 2005. Earth Planets Space 57:1141–1149 Olsen N, Lühr H, Sabaka TJ, Mandea M, Rother M, Toffiner-Clausen L, Choi S (2006a) CHAOS – a model of the Earth’s magnetic field derived from CHAMP, Øersted & SAC-C magnetic satellite data. Geophys J Int 166:67–75 Olsen N, Haagmans R, Sabaka T, Kuvshinov A, Maus S, Purucker M, Rother M, Lesur V, Mandea M (2006b) The swarm end-to-end mission simulator study: separation of the various contributions to earths magnetic field using synthetic data. Earth Planets Space 58:359–370 Oraevsky VN, Rotanova NM, Semenov VYu, Bondar TN, Abramova DYu (1993) Magnetovariational sounding of the Earth using observatory and MAGSAT satellite data. Phys Earth Planet Inter 78:119–130 Orszag SA (1970) Transform method for the calculation of vector-coupled sums: application to the spectral form of the vorticity equation. J Atmos Sci 27:890 Pˇecˇ K, Martinec Z (1986) Spectral theory of electromagnetic induction in a radially and laterally inhomogeneous Earth. Studia Geoph et Geod 30:345–355 Petzold L, Li ST, Cao Y, Serban R (2006) Sensitivity analysis of differential-algebraic equations and partial differential equations. Comput Chem Eng 30:1553–1559 ˇ Praus OJ, Pˇecˇ ová J, Cerv V, Kovaˇcíková S, Pek J, Velímský J (2011) Electrical conductivity at midmantle depths estimated from the data of Sq and long period geomagnetic variations. Studia Geoph Geod 55:241–264 Press WH, Teukolsky SA, Vetterling WT, Flannery BP (1992) Numerical recipes in Fortran. The art of scientific computing. Cambridge University Press, Cambridge Rodi WL (1976) A technique for improving the accuracy of finite element solutions of MT data. Geophys J R Astron Soc 44:483–506
Rodi WL, Mackie RL (2001) Nonlinear conjugate gradients algorithm for 2-D magnetotel-luric inversion. Geophysics 66:174–187 Sandu A, Daescu DN, Carmichael GR (2003) Direct and adjoint sensitivity analysis of chemical kinetic systems with KPP: I-theory and software tools. Atmos Environ 37:5083–5096 Sandu A, Daescu DN, Carmichael GR, Chai T (2005) Adjoint sensitivity analysis of regional air quality models. J Comput Phys 204:222–252 Schultz A, Larsen JC (1987) On the electrical conductivity of the mid-mantle, I, Calculation of equivalent scalar magnetotelluric response functions. Geophys J R Astron Soc 88:733–761 Schultz A, Larsen JC (1990) On the electrical conductivity of the mid-mantle, II. Delineation of heterogeneity by application of extremal inverse solutions. Geophys J Int 101:565–580 Stratton JA (1941) Electromagnetic theory. Wiley, New Jersey (reissued in 2007) Tarantola A (2005) Inverse problem theory and methods for model parameter estimation. SIAM, Philadelphia Tarits P, Grammatica N (2000) Electromagnetic induction effects by the solar quiet magnetic field at satellite altitude. Geophys Res Lett 27:4009–4012 Uyeshima M, Schultz A (2000) Geoelectromagnetic induction in a heterogeneous sphere: a new three-dimensional forward solver using a conservative staggered-grid finite difference method. Geophys J Int 140:636–650 Varshalovich DA, Moskalev AN, Khersonskii VK (1989) Quantum theory of angular momentum World Scientific, Singapore Velímský J (2010) Electrical conductivity in the lower mantle: constraints from CHAMP satellite data by time-domain EM induction modelling. Phys Earth Planet Inter 180:111–117 Velímský J, Martinec Z (2005) Time-domain, spherical harmonic-finite element approach to transient three-dimensional geomagnetic induction in a spherical heterogeneous Earth. Geophys J Int 161:81–101 Velímský J, Martinec Z, Everett ME (2006) Electrical conductivity in the Earth’s mantle inferred from CHAMP satellite measurements – I. 
Data processing and 1-D inversion. Geophys J Int 166:529–542 Weaver JT (1994) Mathematical methods for geo-electromagnetic induction, research studies press. Wiley, New York Weidelt P (1975) Inversion of two-dimensional conductivity structure. Phys Earth Planet Inter 10:282–291 Weiss CJ, Everett ME (1998) Geomagnetic induction in a heterogeneous sphere: fully threedimensional test computations and the response of a realistic distribution of oceans and continents. Geophys J Int 135: 650–662
Modern Techniques for Numerical Weather Prediction: A Picture Drawn from Kyrill
Nils Dorband, Martin Fengler, Andreas Gumann, and Stefan Laps
Contents
1 Introduction
1.1 What is a Weather Forecast?
2 Data Assimilation Methods: The Journey from 1d-Var to 4d-Var
2.1 Observational Nudging
2.2 Variational Analysis
3 Basic Equations
3.1 Vertical Coordinate System
3.2 The Eulerian Formulation of the Continuous Equations
3.3 Physical Background Processes
3.4 The Discretization
4 Ensemble Forecasts
5 Statistical Weather Forecast (MOS)
6 Applying the Techniques to Kyrill
6.1 Analysis of the Air Pressure and Temperature Fields
6.2 Analysis of Kyrill's Surface Winds
6.3 Analysis of Kyrill's 850 hPa Winds
6.4 Ensemble Forecasts
6.5 MOS Forecasts
6.6 Weather Radar
7 Conclusion
References
N. Dorband • M. Fengler
Meteomatics GmbH, St. Gallen, Switzerland
A. Gumann
Zurich, Switzerland
S. Laps
Bochum, Germany
© Springer-Verlag Berlin Heidelberg 2015
W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_21
Abstract
This chapter gives a short overview of modern numerical weather prediction (NWP). It sketches the mathematical formulation of the underlying physical problem and its numerical treatment and gives an outlook on statistical weather forecasting (MOS). Special emphasis is given to the Kyrill event in order to demonstrate the application of the different methods.
1 Introduction
On 18 and 19 January 2007, one of the most severe storms of recent decades crossed Europe: Kyrill, as it was named by German meteorologists. With its hurricane-force winds, Kyrill left a trail of destruction in its wake as it traveled across Northern and Central Europe. Public life broke down completely: schools, universities, and many companies had been closed beforehand; parts of the energy supply and public transport came to a virtual standstill; hundreds of flights were canceled; and several dozen people died from injuries sustained in accidents. The total economic damage was estimated at EUR 2.4 billion. Starting from this dramatic event, this chapter sketches the strengths, weaknesses, and challenges of modern weather forecasting. First, we introduce the basic setup for tackling the underlying physical problem. This leads us to the governing dynamical equations and the approach used by the European Centre for Medium-Range Weather Forecasts (ECMWF, www.ecmwf.int) for solving them. To complete the picture, we also show how ensemble prediction techniques help to identify potential risks of a forthcoming event. Finally, we show how Model Output Statistics (MOS) techniques, a statistical post-processing method, help to estimate the impact of such weather phenomena at specific locations. Clearly, mathematics plays a fundamental role in each of these steps. In order to clarify the relation between mathematics and meteorology, we demonstrate several of the aspects discussed below using the example of the winter storm Kyrill.
1.1 What is a Weather Forecast?
Weather forecasting is nothing other than telling someone how the weather is going to develop. However, this definition does not explain the mechanics that are necessary for doing this. At the beginning of the twenty-first century, it is a lot more than reading the future from the stars or throwing chicken bones, as it might have been hundreds of years ago, although some readers might suggest otherwise. Today the demands for precise weather forecasts are manifold. Classically, the main drivers of modern meteorology were marine and aeronautical applications. Beyond these, the weather is an increasingly important economic factor for agriculture, the insurance business, and energy producers. In fact, there are studies estimating that 80 % of the value-added chain depends directly or at least indirectly on the weather. Last but not least, people have an intrinsic interest in the weather, for instance when choosing clothes for the day. Currently, the market for weather forecasts is still developing strongly. At the moment, the turnover in the weather business is estimated to be $10 billion. When one talks about meteorology and weather forecasts, one usually means forecasts for the next couple of days. One distinguishes between real-time forecasts of the next few hours, commonly called nowcasting, and the classical weather forecast for several days (usually up to 10). Beyond 10 days it becomes more appropriate to speak of trend analyses for the next several weeks, a terminology indicating that even the scientists are aware of the considerable challenges of forecasting over such a time interval. Forecasts that cover months or years are not common topics in meteorology but rather in climatology. However, it should be pointed out that meteorology is pushing the virtual border to climatology step by step from days to weeks, and work is currently under way on monthly forecasts by introducing coupled atmosphere-ocean systems into meteorology, which were once the hobby-horse of climatologists. Finally, there is a rather small community that specializes in analyzing the dynamical forecasts described above. By applying statistical techniques, ranging from linear regression to complex nonlinear models, to the relation between historical model data and station measurements, one is able to refine the forecasts for specific stations. This directly distills the real benefit of the forecast. Now, let us have a closer look at the different ingredients of a weather forecast. At first glance, we are confronted with an initial value problem. 
Thus, once endowed with the initial state of the atmosphere and the complete set of all physical processes describing the world outside, we are able to compute in a deterministic way all future states of the atmosphere, such as temperature, precipitation, wind, etc. Unfortunately, in practice we know little about this initial state which introduces a significant uncertainty right from the beginning. Due to the nonlinear nature of the dynamical problem, this uncertainty can lead to very large errors in the prediction. In the most extreme cases, it can even drive the numerical model into a completely wrong atmospheric state, which can lead to missing or not properly predicting important events like Kyrill. Statistical methods can be applied for dealing with such uncertainties. We sketch out so-called ensemble techniques immediately after having discussed the deterministic case.
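The growth of a small initial-state error in a nonlinear system can be illustrated with a toy model. The following minimal sketch uses the classical three-variable Lorenz system, which is not part of this chapter and merely stands in for a real NWP model; the function names, the step size, and the size of the perturbation are illustrative assumptions.

```python
import numpy as np

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the three-variable Lorenz system (toy 'atmosphere')."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def integrate(state, n_steps, dt=0.01):
    """Deterministic forecast: advance the state n_steps time steps."""
    for _ in range(n_steps):
        state = rk4_step(lorenz63, state, dt)
    return state

# Two initial states that differ by a tiny "analysis error" (assumed value).
truth0 = np.array([1.0, 1.0, 20.0])
perturbed0 = truth0 + np.array([1e-8, 0.0, 0.0])

initial_error = np.linalg.norm(perturbed0 - truth0)
final_error = np.linalg.norm(integrate(perturbed0, 2500) - integrate(truth0, 2500))
print(f"initial error {initial_error:.1e}, error after 2500 steps {final_error:.1e}")
```

Both runs are fully deterministic, yet the two forecasts end up far apart. Running many such perturbed forecasts and examining their spread is the basic idea behind the ensemble techniques mentioned in the text.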
2 Data Assimilation Methods: The Journey from 1d-Var to 4d-Var
Numerical weather models are central tools for modern meteorology. With rapidly increasing computer power during the last decade, decreasing cost of hardware and improvements in weather and climate codes and numerical methods, it has become possible to model the global and mesoscale dynamics of the atmosphere with accurate physics and well-resolved dynamics. However, the predictive power
of all models is still limited due to some very fundamental problems. Arguably the most severe one is our ignorance of the initial condition for a simulation. While, in many areas of the world, we do have lots of data from ground-based weather stations, for the higher layers of the atmosphere we must rely either on very sparse direct measurements, like radio soundings, or on remote sensing observations that are obtained for example from radar stations or satellites. The amount and quality of such remote sensing data is increasing rapidly, but they are often in a form that is not particularly useful for Numerical Weather Prediction (NWP). It is highly nontrivial to properly inject information into the NWP models, when the observed quantities (as for example radar reflectivity) are only indirectly related to the model parameters (typically temperature, pressure, humidity, and wind velocities). At the same time, assimilation of such data is a key element for creating realistic initial states of the atmosphere. Some of the basic techniques for assimilating data are described in this section.
2.1 Observational Nudging
A simple but effective approach to data assimilation is to modify the background analysis by terms proportional to the difference between the model state and the observational data. An example is the widely used Cressman analysis scheme. If $x_b$ is the background analysis and $y$ a vector of $n$ observations $y_i$, the model state $x$ provided by a simple Cressman analysis at a point $j$ would be

$$x = x_b + \frac{\sum_{i=1}^{n} w(i,j)\,(y_i - x_{b,i})}{\sum_{i=1}^{n} w(i,j)}, \tag{1}$$

where $x_{b,i}$ is the background value at the location of observation $i$, and the weights $w(i,j)$ are a function of the distance $d_{i,j}$ between the points $i$ and $j$ and take on the value 1 for $i = j$ (Daley 1991). There are different possible definitions for these weights. In methods that are commonly referred to as observational nudging, the condition $w_{i=j} = 1$ is dropped, so that a weighted average between the background state and the observations is performed. Observational nudging can be used as a four-dimensional analysis method, that is, observational data from different points in time are considered. Instead of modifying the background state directly at an initial time, source terms are added to the evolution equations, so that the model is forced dynamically toward the observed fields. Effectively, this is equivalent to changing the governing evolution equations. Therefore, the source term has to be chosen small enough to prevent the model from drifting into unrealistic physical configurations. Observational nudging is still used in many operational NWP systems. It is a straightforward method with a lot of flexibility. A common criticism is that the modifications are made without respecting the consistency of the atmospheric state, which might lead to unrealistic configurations.
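As a rough illustration of Eq. 1, the following sketch applies a Cressman-type correction on a one-dimensional grid. It assumes the classical Cressman weight $(R^2 - d^2)/(R^2 + d^2)$ inside an influence radius $R$ and zero outside, which is only one of the possible weight definitions mentioned above; the function names, the grid, and all numbers are hypothetical.

```python
import numpy as np

def cressman_weights(dist, radius):
    """Assumed weight choice: (R^2 - d^2) / (R^2 + d^2) for d < R, else 0."""
    d2, r2 = dist ** 2, radius ** 2
    return np.where(dist < radius, (r2 - d2) / (r2 + d2), 0.0)

def cressman_analysis(grid_x, background, obs_x, obs_val, radius):
    """Correct a 1-D background field with observations following Eq. 1."""
    # Background interpolated to the observation locations (x_{b,i}).
    bg_at_obs = np.interp(obs_x, grid_x, background)
    innovation = obs_val - bg_at_obs                    # y_i - x_{b,i}
    dist = np.abs(grid_x[:, None] - obs_x[None, :])     # d_{i,j}
    w = cressman_weights(dist, radius)                  # shape (n_grid, n_obs)
    wsum = w.sum(axis=1)
    correction = np.zeros_like(background, dtype=float)
    has_obs = wsum > 0                                  # points with an observation in range
    correction[has_obs] = (w[has_obs] @ innovation) / wsum[has_obs]
    return background + correction

# Hypothetical example: constant 280 K background, two temperature observations.
grid_x = np.arange(0.0, 11.0)
background = np.full(grid_x.shape, 280.0)
obs_x = np.array([3.0, 7.0])
obs_val = np.array([282.0, 279.0])
analysis = cressman_analysis(grid_x, background, obs_x, obs_val, radius=2.0)
```

Grid points without any observation inside the influence radius keep their background value. Because the weights are normalized, a point influenced by a single observation receives the full innovation of that observation.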
2.2 Variational Analysis
In variational data assimilation, a cost function is defined from the error covariances, and an atmospheric state that minimizes that cost function is constructed. If model errors can be neglected and the background and observation errors are assumed to be unbiased and normally distributed, the cost function is

$$J = \frac{1}{2}\,\bigl(x(0) - x_b(0)\bigr)^{T} B^{-1} \bigl(x(0) - x_b(0)\bigr) + \frac{1}{2}\,\bigl(y - H(x)\bigr)^{T} R^{-1} \bigl(y - H(x)\bigr), \tag{2}$$
where xb is the background field, y the vector of observations, and H(x) the observation operator that translates the model fields to the observed quantities. B and R are the background error covariance and observation error covariance, respectively. The operator H(x) provides a mechanism for assimilating any observational quantities that can be derived from the model parameters, without the necessity of solving the inverse problem. An example is radar reflectivity: it is much more difficult to match the atmospheric conditions to a radar image than to compute a reflectivity out of model parameters. The latter is what H(x) does and is all that is needed for defining the cost function. Having defined a suitable cost function, variational data analysis is reduced to a high-dimensional linear or nonlinear (depending on the properties of H(x)) minimization problem (see Menke (1984), Lorenc (1986), Tarantola (1987) for more detailed discussions and Courtier et al. (1998) or Baker et al. (2004) as examples for implementations of such algorithms). In order to find solutions to the minimization problem, a number of simplifying assumptions can be made. One simplification has already been introduced when we neglected the model errors in Eq. 2. Another common simplification is to evaluate all observation operators at a fixed time only, neglecting the time dependence of the observations. This leads to the so-called 3D-VAR scheme (see Parrish and Derber (1992), for more details). The resulting cost function is then minimized using variational methods. A method closely related to 3D-VAR is Optimal Interpolation, which uses the same approximation, but solves the minimization problem not via variational methods but by direct inversion (see Bouttier and Coutier (2001), and references given therein). In 4D-VAR methods, the time parameter of the observations is taken into account. 
Therefore, the assimilation of data is not only improving the initial state of the model, but also the dynamics during some period of time. The minimization of the cost function is then considerably more difficult. Commonly applied methods for finding a solution are iterative methods of solving in a linearized regime, and using a sequence of linearized solutions for approximating the solution to the full nonlinear problem (Bouttier and Coutier 2001). While the variational methods outlined above are aiming toward optimizing the state vector of the full three-dimensional atmosphere, they are general enough to be applied to simpler problems. An example that is frequently encountered is 1D-VAR,
where assimilation is done only for a vertical column at a fixed coordinate and time. This method plays an important role in the analysis of satellite data. There are other assimilation methods that can be combined with the previously discussed methods, or even replace them. One of the more widely used ones is the Kalman filter, which can be used to assimilate 4D data (space and time), and is capable of taking into account the time dependence of the model errors.
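To make the cost function of Eq. 2 more concrete, the following sketch evaluates it for a linear observation operator and computes the minimizer by direct inversion, as in the Optimal Interpolation approach mentioned above. The three-point state, the observation operator, and both covariance matrices are made-up illustrative values, and the function names are hypothetical; an operational system would minimize iteratively rather than invert matrices.

```python
import numpy as np

def cost(x, xb, y, H, B_inv, R_inv):
    """Cost function J of Eq. 2 for a linear observation operator H(x) = H @ x."""
    dxb = x - xb
    dy = y - H @ x
    return 0.5 * dxb @ B_inv @ dxb + 0.5 * dy @ R_inv @ dy

def gradient(x, xb, y, H, B_inv, R_inv):
    """Gradient of J with respect to x (valid for linear H only)."""
    return B_inv @ (x - xb) - H.T @ R_inv @ (y - H @ x)

def analysis_by_direct_inversion(xb, y, H, B, R):
    """Minimizer of J obtained by direct inversion instead of iteration."""
    gain = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    return xb + gain @ (y - H @ xb)

# Illustrative numbers only: three grid values of temperature in K.
xb = np.array([280.0, 282.0, 284.0])
H = np.array([[1.0, 0.0, 0.0],      # first observation samples grid point 0
              [0.0, 0.5, 0.5]])     # second one averages grid points 1 and 2
y = np.array([281.0, 282.0])
idx = np.arange(3)
B = np.exp(-np.abs(idx[:, None] - idx[None, :]))   # assumed correlated background errors
R = 0.25 * np.eye(2)                               # assumed uncorrelated observation errors

B_inv, R_inv = np.linalg.inv(B), np.linalg.inv(R)
xa = analysis_by_direct_inversion(xb, y, H, B, R)
J_background = cost(xb, xb, y, H, B_inv, R_inv)
J_analysis = cost(xa, xb, y, H, B_inv, R_inv)
grad_norm = np.linalg.norm(gradient(xa, xb, y, H, B_inv, R_inv))
```

The gradient of J vanishes (up to rounding) at the analysis state, and the cost there is lower than at the background state. Only the forward operator H is ever applied to model states, which is the point made above about not having to solve the inverse problem.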
3 Basic Equations
This section is dedicated to the basic, rather complex physical equations that are commonly used for numerical weather prediction. Before entering into the details of the partial differential equations, it is necessary to talk about coordinate systems. While the horizontal one, $(\lambda, \theta)$, is quite common in mathematics, we pay special attention to the vertical one.
3.1 Vertical Coordinate System
σ-Coordinates
Any quantity that exhibits a one-to-one relation to height $z$ may be used as vertical coordinate. If the hydrostatic approximation is made, pressure is such a quantity, since $\rho > 0$ ensures that $\partial p/\partial z = -\rho g$ is negative everywhere. In these so-called pressure coordinates, the independent variables are $\lambda, \theta, p, t$ instead of $\lambda, \theta, z, t$, and the height $z$ becomes a dependent variable (Norbury and Roulstone 2002). The physical height is rarely used as the vertical coordinate in atmospheric simulation models. Some models use pressure as vertical coordinate, because it simplifies the equations, at least if the atmosphere is in hydrostatic balance, which is generally true for synoptic and mesoscale motion. In such models, the 500 hPa isobaric surface (which undulates in space and time), for instance, is a fixed reference level. For complex terrain it is better to use sigma-coordinates instead of pressure, because a sigma (or terrain-following) coordinate system allows for a high resolution just above ground level, whatever altitude the ground level may be. σ-coordinates are defined by

$$\sigma = \frac{p}{p_{\mathrm{sfc}}},$$

where $p_{\mathrm{sfc}}$ is the ground-level pressure and $p$ the variable pressure. The coordinates range from 1 at the ground to 0 at the top of the atmosphere. The sigma coordinate forms the basis for an essential modification, which is introduced by the eta coordinates.
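The terrain-following property can be seen in a short sketch. It assumes the standard definition σ = p/p_sfc, which matches the stated range of 1 at the ground and 0 at the top of the atmosphere; the function names and the pressure values are hypothetical.

```python
import numpy as np

def sigma_from_pressure(p, p_sfc):
    """sigma = p / p_sfc: 1 at the ground, 0 at the top of the atmosphere."""
    return np.asarray(p, dtype=float) / p_sfc

def pressure_on_sigma_levels(sigma_levels, p_sfc):
    """Pressure of each sigma level for each column; shape (n_columns, n_levels)."""
    return np.outer(np.asarray(p_sfc, dtype=float),
                    np.asarray(sigma_levels, dtype=float))

sigma_levels = np.array([1.0, 0.9, 0.5, 0.0])
p_sfc = np.array([1000.0, 700.0])   # hPa: a sea-level column and a mountain column
p_levels = pressure_on_sigma_levels(sigma_levels, p_sfc)
print(p_levels)
```

In both columns the level σ = 1 coincides with the ground, and the level σ = 0.9 lies 10 % of the surface pressure above it, so the resolution near the surface is the same over the mountain as over the sea.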
η-Coordinates
The fundamental base in the eta system is not at the ground surface, but at mean sea level (Simmons and Burridge 1981). The eta coordinate system has surfaces that remain relatively horizontal at all times. At the same time, it retains the mathematical advantages of the pressure-based system that does not intersect the ground. It does this by allowing the bottom atmospheric layer to be represented within each grid box as a flat "step." The eta coordinate system defines the vertical position of a point in the atmosphere as the ratio of the pressure difference between that point and the top of the domain to the pressure difference between a fundamental base below the point and the top of the domain. The eta coordinate varies from 1 at the base to 0 at the top of the domain. Because it is pressure based and normalized, it is easy to cast the governing equations of the atmosphere into a relatively simple form. Eta coordinates have several advantages over sigma coordinates:
1. Eta models do not need to perform the vertical interpolations that are necessary to calculate the pressure gradient force (PGF) in sigma models (Mesinger and Janjić 1985). This reduces the error in PGF calculations and improves the forecast of wind, temperature, and moisture changes in areas of steeply sloping terrain.
2. Although the numerical formulation near the surface is more complex, the low-level convergence in areas of steep terrain is far more representative of real atmospheric conditions than in the simpler formulations of sigma models (Black 1994). In particular, precipitation forecasts improve significantly in these areas, which more than compensates for the slightly increased computer run time.
3. Compared with sigma models, eta models can often improve forecasts of cold air outbreaks, damming events, and lee-side cyclogenesis. For example, in cold-air damming events, the inversion in the real atmosphere above the cold air mass on the east side of a mountain is preserved almost exactly in an eta model.
Unfortunately, eta coordinates also introduce some drawbacks and come with certain limitations, for example:
1. The step nature of the eta coordinate makes it difficult to retain detailed vertical structure in the boundary layer over the entire model domain, particularly over elevated terrain.
2. Gradually sloping terrain is not reflected within eta models. Since all terrain is represented in discrete steps, gradual slopes that extend over large distances can be concentrated within as few as one step. This unrealistic compression of the slope into a small area can be compensated, in part, by increasing the vertical and/or horizontal resolution.
3. Because of their step nature, eta models have difficulty predicting extreme downslope wind events.
For models using eta coordinates, the reader is referred to the ETA Model (Black 1994) and, naturally, to the ECMWF model as introduced below.
3.2 The Eulerian Formulation of the Continuous Equations
In the following, we gather the set of equations used at ECMWF for describing the atmospheric flow. In detail, we follow the extensive documentation provided by IFS (2006a). To be more specific, we introduce a spherical coordinate system given by $(\lambda, \theta, \eta)$, where $\lambda$ denotes the longitude, $\theta$ the latitude, and $\eta$ the so-called hybrid vertical coordinate as introduced above. The vertical coordinate $\eta$ can then be considered as a monotonic function of the pressure $p$ and the surface pressure $p_{\mathrm{sfc}}$, that is, $\eta(p, p_{\mathrm{sfc}})$, such that

$$\eta(0, p_{\mathrm{sfc}}) = 0 \quad \text{and} \quad \eta(p_{\mathrm{sfc}}, p_{\mathrm{sfc}}) = 1. \tag{3}$$
Then the equations of momentum can be written as @U 1 @U @U @U C .U C V cos / C P 2 @t a cos @ @ @
1 @ @ C Rdry T* lnp D PU C KU f V C a @ @ 1 @V @V @V @V C .U C V cos C sin .U 2 C V 2 // C P @t a cos2 @ @ @
@ cos @ C Rdry T* ln p D PV C KV ; Cf U C a @ @
(4)
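where $a$ is the Earth’s radius, $\dot\eta$ is the vertical velocity $\dot\eta = \frac{d\eta}{dt}$, $\varphi$ is the geopotential, $R_{\mathrm{dry}}$ is the gas constant of dry air, and $T_v$ is the virtual temperature defined by

$$T_v = T\left[ 1 + \left( \frac{R_{\mathrm{vap}}}{R_{\mathrm{dry}}} - 1 \right) q \right],$$

where $T$ is the temperature, $q$ is the specific humidity, and $R_{\mathrm{vap}}$ is the gas constant of water vapor. The terms $P_U$ and $P_V$ represent contributions of additional physical background processes that are discussed later on. $K_U$ and $K_V$ denote horizontal diffusion. Equation (4) is coupled with the thermodynamic equation given by

$$\frac{\partial T}{\partial t} + \frac{1}{a\cos^2\theta}\left( U\,\frac{\partial T}{\partial \lambda} + V\cos\theta\,\frac{\partial T}{\partial \theta} \right) + \dot\eta\,\frac{\partial T}{\partial \eta} - \frac{\kappa\,T_v\,\omega}{\left(1 + (\delta - 1)\,q\right) p} = P_T + K_T, \tag{5}$$

with $\kappa = R_{\mathrm{dry}} / c_{p_{\mathrm{dry}}}$ (with $c_{p_{\mathrm{dry}}}$ the specific heat of dry air at constant pressure), $\omega$ is the pressure coordinate vertical velocity $\omega = \frac{dp}{dt}$, and $\delta = c_{p_{\mathrm{vap}}} / c_{p_{\mathrm{dry}}}$ with $c_{p_{\mathrm{vap}}}$ the specific heat of water vapor at constant pressure. Again $P_T$ abbreviates additional physical background processes, whereas $K_T$ denotes horizontal diffusion terms. The moisture equation reads as

$$\frac{\partial q}{\partial t} + \frac{1}{a\cos^2\theta}\left( U\,\frac{\partial q}{\partial \lambda} + V\cos\theta\,\frac{\partial q}{\partial \theta} \right) + \dot\eta\,\frac{\partial q}{\partial \eta} = P_q + K_q, \tag{6}$$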
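As a small worked example of the virtual temperature definition used in (4) and (5), the following sketch evaluates $T_v$ for a given temperature and specific humidity. The numerical values of the gas constants are common textbook values and are an assumption here; the function name is hypothetical.

```python
R_DRY = 287.06   # gas constant of dry air in J kg^-1 K^-1 (assumed textbook value)
R_VAP = 461.5    # gas constant of water vapor in J kg^-1 K^-1 (assumed textbook value)


def virtual_temperature(temperature, specific_humidity):
    """Virtual temperature T_v = T * (1 + (R_vap / R_dry - 1) * q).

    temperature in K, specific_humidity in kg/kg.
    """
    return temperature * (1.0 + (R_VAP / R_DRY - 1.0) * specific_humidity)
```

For example, air at 300 K with a specific humidity of 10 g/kg has a virtual temperature that is roughly 1.8 K higher than its actual temperature, which is why moist air is less dense than dry air at the same temperature and pressure.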
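where $P_q$ and $K_q$ are again background process and diffusion terms. The set of Eqs. (4)–(6) is closed by the continuity equation

$$\frac{\partial}{\partial t}\left( \frac{\partial p}{\partial \eta} \right) + \nabla\cdot\left( \mathbf{v}_H\,\frac{\partial p}{\partial \eta} \right) + \frac{\partial}{\partial \eta}\left( \dot\eta\,\frac{\partial p}{\partial \eta} \right) = 0, \tag{7}$$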
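where $\mathbf{v}_H$ is the vector $(u, v)$ of the horizontal wind speed. Now, under the assumption of a hydrostatic flow, the geopotential $\varphi$ in (4) is determined by

$$\frac{\partial \varphi}{\partial \eta} = -\frac{R_{\mathrm{dry}} T_v}{p}\,\frac{\partial p}{\partial \eta}.$$

Then the vertical velocity $\omega$ in (5) is given by

$$\omega = -\int_0^{\eta} \nabla\cdot\left( \mathbf{v}_H\,\frac{\partial p}{\partial \eta} \right) d\eta + \mathbf{v}_H\cdot\nabla p.$$

By integrating Eq. (7) with the boundary conditions $\dot\eta = 0$ taken at the levels from Eq. (3), we end up with expressions for the change in surface pressure and for the vertical velocity $\dot\eta$:

$$\frac{\partial p_{\mathrm{sfc}}}{\partial t} = -\int_0^{1} \nabla\cdot\left( \mathbf{v}_H\,\frac{\partial p}{\partial \eta} \right) d\eta, \qquad \dot\eta\,\frac{\partial p}{\partial \eta} = -\frac{\partial p}{\partial t} - \int_0^{\eta} \nabla\cdot\left( \mathbf{v}_H\,\frac{\partial p}{\partial \eta} \right) d\eta.$$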
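The surface pressure equation states that the surface pressure changes by the vertically integrated convergence of mass. A minimal discrete sketch, assuming a periodic one-dimensional domain, a handful of layers with given pressure thickness, and centered differences (all of which are simplifications chosen for illustration, not the spectral discretization described in this chapter), could look as follows.

```python
def surface_pressure_tendency(u_layers, dp_layers, dx):
    """Return d(p_sfc)/dt in Pa/s at each point of a periodic 1-D domain.

    u_layers[k][i]  : horizontal wind of layer k at grid point i (m/s)
    dp_layers[k][i] : pressure thickness of layer k at grid point i (Pa)
    dx              : grid spacing (m)
    The vertical integral over eta becomes a sum over layers, and the
    divergence of the mass flux u * dp is taken with centered differences.
    """
    n_points = len(u_layers[0])
    tendency = [0.0] * n_points
    for u, dp in zip(u_layers, dp_layers):
        flux = [u[i] * dp[i] for i in range(n_points)]
        for i in range(n_points):
            # flux[i - 1] wraps around for i = 0, which gives periodicity
            divergence = (flux[(i + 1) % n_points] - flux[i - 1]) / (2.0 * dx)
            tendency[i] -= divergence
    return tendency
```

In this sketch, converging winds raise the surface pressure and diverging winds lower it, while the domain-wide sum of the tendencies vanishes because mass is only redistributed.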
3.3
Physical Background Processes
Physical background processes comprise, for example, radiation, turbulent diffusion, and interactions with the surface; subgrid-scale orographic drag, convection, clouds and large-scale precipitation, surface parametrization, methane oxidation, ozone chemistry parametrization, climatological data, etc. All of the above-mentioned processes have in common that they are parametrized and triggered after the computation of the prognostic equations. To make this idea more evident, we take a closer look at the generation of clouds and precipitation in the following.
Clouds and Precipitation
The equations described so far allow for modeling subsequent physical processes that influence the prognostic equations via the forcing terms $P_x$. For convenience, we concentrate on two important processes in order to demonstrate the general idea: cloud modeling and large-scale (stratiform) precipitation. We follow the representation given in IFS (2006b).

Clouds
For simplicity we focus only on stratiform (non-convective) clouds. Having once implicitly introduced a vertical and horizontal grid in space (the Gauss-Legendre transform implies a regular grid), one can define the cloud and ice water content of a specific grid volume (cell) as

$$l = \frac{1}{V}\int_V \frac{\rho_w}{\rho}\, dV,$$

where $\rho_w$ is the density of cloud water, $\rho$ is the density of moist air, and $V$ is the volume of the grid box. The fraction of the grid box that is covered by clouds is given by $a$. Then the time change of cloud water and ice can be obtained by

$$\frac{\partial l}{\partial t} = A(l) + S_{\mathrm{conv}} + S_{\mathrm{strat}} - E_{\mathrm{cld}} - G_{\mathrm{prec}}$$

together with

$$\frac{\partial a}{\partial t} = A(a) + \delta a_{\mathrm{conv}} + \delta a_{\mathrm{strat}} - \delta a_{\mathrm{evap}},$$

where $A(l)$ and $A(a)$ denote the transport of cloud water/ice and cloud area through the boundaries of the grid volume. $S_{\mathrm{conv}}$ and $\delta a_{\mathrm{conv}}$ describe the formation of cloud water/ice and cloud area by convective processes, and $S_{\mathrm{strat}}$ and $\delta a_{\mathrm{strat}}$ the formation by stratiform condensation processes. $E_{\mathrm{cld}}$ is the evaporation rate of cloud water/ice, and $G_{\mathrm{prec}}$ is the rate of precipitation falling out of the cloud. Finally, $\delta a_{\mathrm{evap}}$ denotes the rate of decrease of the cloud area by evaporation. For the formation of clouds one distinguishes two cases, namely the processes in case of already existing clouds and the formation of new clouds. Details can be found in IFS (2006b), but, put simply, new clouds are assumed to form when the relative humidity is larger than a certain threshold that depends on the pressure level, that is, tropospheric clouds are generated if the relative humidity exceeds 100 %. Finally, it is important to mention that the formation of new clouds comes along with evaporative processes, which introduce reversibility into the system.
Precipitation
Similarly, we sketch the procedure for estimating the amount of precipitation in a grid box. The precipitation is given by

$$P = \frac{1}{A}\int P\,H(l)\, dA,$$

where the step function $H(l)$ marks the portion of the cell containing clouds at condensate specific humidity $l$ and $A$ denotes the area of the grid box. The precipitation fraction can then be expressed as

$$a = \frac{1}{A}\int H(l)\,H(P)\, dA.$$

The autoconversion from liquid cloud water to rain and also from ice to snow is parametrized in Sundqvist (1978) and can be written as

$$G = a\,c_0\left[ 1 - \exp\left( -\left( \frac{l_{\mathrm{cld}}}{l_{\mathrm{crit}}} \right)^{2} \right) \right].$$

The reader should be aware that there is also an additional, completely different process contributing to the large-scale precipitation budget, namely in case of clear-sky conditions. Again we refer to IFS (2006b), which also describes ice sedimentation, evaporation of precipitation, and melting of snow. Rain and snow are removed from the atmospheric column immediately but can evaporate, melt, and interact with the cloud water in all layers through which they pass.
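The behavior of the autoconversion term can be explored with a few lines of code. The sketch below implements the expression exactly as it is written in the text above; the function name and the parameter values in the usage note (`c0`, `l_crit`) are illustrative assumptions and not values taken from IFS (2006b).

```python
import math


def autoconversion_rate(l_cld, cloud_fraction, c0, l_crit):
    """Autoconversion G = a * c0 * (1 - exp(-(l_cld / l_crit)**2)).

    l_cld          : in-cloud condensate content (kg/kg)
    cloud_fraction : cloud cover a of the grid box (0..1)
    c0             : rate coefficient (1/s), illustrative
    l_crit         : critical condensate content (kg/kg), illustrative
    """
    return cloud_fraction * c0 * (1.0 - math.exp(-(l_cld / l_crit) ** 2))
```

With, say, `c0 = 1e-4` and `l_crit = 3e-4`, the rate is negligible for thin clouds (`l_cld` well below `l_crit`) and saturates at `cloud_fraction * c0` once the condensate content clearly exceeds the critical value, which is the threshold-like behavior the parametrization is meant to capture.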
3.4
The Discretization
There are many ways of tackling the above-mentioned coupled set of partial differential equations. However, for meteorological reasons, one uses a scheme introduced by Simmons and Burridge (1981) based on frictionless adiabatic flow. It is designed such that it conserves angular momentum, which helps to avoid timing problems in traveling fronts. To this end, one introduces a fixed number of vertical layers at fixed pressure levels, the so-called vertical (finite element) discretization built from cubic B-splines. The prognostic horizontal variables $T$, $u$, $v$, $\varphi$, $q$, and $p$ are represented in terms of scalar spherical harmonics (see Freeden et al. (1998) for an extensive introduction). At the moment ECMWF uses a representation of 92 layers in the vertical and, horizontally, spherical harmonics of degree 1–1,279. All (nonlinear) differential operators acting on the spherical harmonics are applied after the transformation from the Fourier domain into the space domain on the grid. Then, the parametrized physical background processes are applied in space and one projects back to the Fourier domain, where finally the diffusion terms are applied.

For the discretization one leaves the Eulerian representation and uses a semi-Lagrangian formulation. This is for two reasons. First, Eulerian schemes often require small time-steps to avoid numerical instability (CFL condition): that is, the prognostic variable must not be advected more than one grid length per time-step. The maximum time-step is therefore defined by the strongest winds. To overcome this problem one uses a Lagrangian numerical scheme where the prognostic variable is assumed to be conserved for an individual particle in the advection process along its trajectory. Second, a pure Lagrangian framework would make it impossible to maintain uniform resolution over the forecast region: a set of marked particles would ultimately result in dense congestion at some geographical locations and complete absence at others. To overcome this difficulty a semi-Lagrangian scheme has been developed. In this numerical scheme, at every time-step the grid points of the numerical mesh represent the arrival points of backward trajectories at the future time. The point reached during this back-tracking defines where an air parcel was at the beginning of the time-step. During the transport, the particle is subject to various physical and dynamical forcings. Essentially, all prognostic variables are then found through interpolation (using values at the previous time-step for the interpolation grid) to this departure point. In contrast to the Eulerian framework, the semi-Lagrangian scheme allows the use of large time-steps without limiting the stability. The remaining stability requirements are that trajectories should not cross each other and that particles should not overtake one another. Therefore, the choice of the time-step in the semi-Lagrangian scheme is only limited by numerical accuracy. However, despite its stability properties, severe truncation errors may cause misleading results. Interestingly, one should note when talking about accuracy that the convergence order of the underlying Galerkin method could be massively improved by switching to a nonlinear formulation as proposed in Fengler (2005).
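The difference between the Eulerian time-step limit and the semi-Lagrangian back-tracking can be illustrated with a deliberately simple sketch: one-dimensional advection with a constant wind on a periodic grid and linear interpolation at the departure point. All names, the constant wind, and the linear interpolation are assumptions made for illustration; operational schemes use curved trajectories and higher-order interpolation.

```python
import math


def max_eulerian_time_step(dx, max_wind):
    """Largest time-step allowed by the CFL condition: at most one grid length per step."""
    return dx / abs(max_wind)


def semi_lagrangian_step(q, u, dx, dt):
    """Advance the periodic 1-D field q by one time-step with constant wind u.

    For every arrival grid point i the trajectory is traced backward to the
    departure point, and q is interpolated linearly between its two neighbors.
    """
    n = len(q)
    shift = u * dt / dx              # Courant number: grid lengths travelled per step
    new_q = []
    for i in range(n):
        x_dep = i - shift            # departure point in grid units
        j = math.floor(x_dep)        # grid point to the left of the departure point
        w = x_dep - j                # interpolation weight of the right neighbor
        new_q.append((1.0 - w) * q[j % n] + w * q[(j + 1) % n])
    return new_q
```

Calling `semi_lagrangian_step` with a Courant number of 2.5, which would violate the CFL condition of an explicit Eulerian scheme, still yields a bounded result: the initial peak is simply moved 2.5 grid lengths downstream and smeared over two grid points by the interpolation. This smearing is the accuracy limitation mentioned above.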
Finally, we would like to point out that the horizontal discretization used for the Gauss-Legendre transformation (e.g., see Fengler 2005) is slightly modified for performance reasons. Originally, the points of the Gauss-Legendre grid cluster strongly toward the poles, which is due to the zeros of the Legendre polynomials accumulating at the boundary. This naturally introduces a work overload in polar regions, where little is known about the atmospheric conditions, as well as numerical noise due to the pole convergence/singularity of the underlying vector spherical harmonics. To overcome these problems one integrates over reduced lattices that drop points on each row such that the number of points per row remains of the form $2^n 3^m 5^k$, which allows for fast Fourier transforms. Experimentally, it has been shown that this modification introduces only minor artifacts that are negligible in comparison to the effort required to avoid them.
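The idea of a reduced lattice whose row lengths stay of the form $2^n 3^m 5^k$ can be sketched as follows. The rule used here for the target row length (scaling the equatorial row by the cosine of the latitude) is a simplifying assumption for illustration and not the operational criterion; all function names are hypothetical.

```python
import math


def is_fft_friendly(n):
    """True if n has no prime factors other than 2, 3, and 5."""
    if n < 1:
        return False
    for prime in (2, 3, 5):
        while n % prime == 0:
            n //= prime
    return n == 1


def next_fft_friendly(n):
    """Smallest integer >= n of the form 2**a * 3**b * 5**c."""
    while not is_fft_friendly(n):
        n += 1
    return n


def reduced_row_length(n_equator, latitude_deg):
    """Number of longitude points kept on the row at the given latitude.

    Assumption: the target is n_equator * cos(latitude), rounded up to the
    next FFT-friendly integer.
    """
    target = max(1, math.ceil(n_equator * math.cos(math.radians(latitude_deg))))
    return next_fft_friendly(target)
```

For an equatorial row of 360 points, a row at 89° latitude keeps only 8 points in this sketch, which shows how strongly the reduction cuts the work in polar regions.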
4
Ensemble Forecasts
Now, having sketched the basic dynamics behind numerical weather prediction, we have to draw a somewhat more sobering picture. Indeed, numerical weather prediction generally suffers from two types of uncertainty: First, the initial state of the atmosphere is known only to an approximate extent and, second, the
numerical weather prediction models themselves exhibit an intrinsic uncertainty. In modern numerical weather prediction systems, assessing this double uncertainty by means of ensemble forecasting has, on the one hand, become a major challenge and, on the other, provides a set of tools for probability-based decision-making. Ultimately, ensemble forecasting allows one to quantitatively estimate the potential environmental and entrepreneurial risks of a forthcoming severe weather event. Over the past 20 years, ensemble forecasting has been implemented and further developed at the main weather prediction centers. For an overview of historic and recent developments, see Lewis (2005) and Leutbecher and Palmer (2008). Historically and practically, there are different ways to tackle the uncertainty issue inherent in numerical weather forecasts. The most basic idea is the one that probably any professional forecaster employs in daily work: comparing the forecasts of different numerical models, which is sometimes referred to as the “poor man’s ensemble.” The more sophisticated version is an ensemble simulation, for which a certain numerical model is evaluated many times using different sets of initial conditions as well as different parameter sets for the parametrizations of the atmospheric physics. Due to the increased computational requirements compared to a single deterministic run, ensemble simulations are usually carried out using a lower horizontal resolution and a smaller number of vertical levels than the main deterministic run of the respective model. In addition to the perturbed ensemble members, one usually launches a control run, which has the resolution of the ensemble members but uses the best initialization available, namely the one in use for the main deterministic run. The different ensemble members ideally represent the different ways in which the current state of the atmosphere might evolve.
The variance or spread of the different members of the ensemble as well as the deviation from the control run provide useful information on the reliability of the forecast and on its future development. The two sources of uncertainty present in numerical weather prediction cannot really be distinguished in the final output of a numerical forecast. A numerical weather prediction model is a highly nonlinear dynamical system living in a phase space of about $10^6$–$10^8$ dimensions. Nevertheless, the underlying evolution equations are well defined and deterministic and, accordingly, the system exhibits deterministic chaotic behavior. This implies that small variations in the initial state of the system may rapidly grow and lead to diverging final states. At the same time, the errors from the initial state blend with the errors caused by the model itself, which stem from the choice of the parametrization coefficients, from truncation errors, and from discretization errors. Thus, the errors in the final state are flow-dependent and change from one run of the model to another. Technically speaking, the purpose of ensemble forecasting is to appropriately sample the phase space of the numerical model in order to estimate the probability density function of the final outcome. There are several methods which are commonly used to create the perturbations of the initial state. The perturbations of the initial state have to be set up in such a way that they are propagated during the model run, and thus lead to significant
deviations of the final state of the ensemble members. The perturbations which grow strongly during the dynamical evolution identify the directions of initial uncertainty which lead to the largest forecast errors. The first group of methods to create the perturbations of the initial state is based on ensemble-specific data assimilation techniques. The ensemble Kalman filter used by the Canadian Meteorological Service, which adds pseudo-random numbers to the assimilated observations, belongs to this category (Houtekamer et al. 2005). The second group of methods is based on the so-called bred vector technique. This technique is based on the idea of repeatedly propagating and rescaling a random initial perturbation in order to breed the perturbations which are the most important ones in the dynamical evolution. A bred vector technique is employed by the US National Centers for Environmental Prediction (NCEP) (Toth and Kalnay 1997). The third group of methods is based on the identification of the leading singular vectors of the operator which is responsible for the propagation of the perturbations. The leading singular vectors have to be identified for each initial state, and different ensemble members can then be initialized with different linear combinations of the leading singular vectors. The singular vector technique is employed by ECMWF (Molteni et al. 1996). The physical effects living on spatial scales which are not resolved by numerical weather prediction models are usually represented by parametrizations as sketched above. The most common approach to introducing model uncertainty is to perturb the parameters of the model’s parametrizations. Other less commonly used approaches are multi-model ensembles and stochastic-dynamic parametrizations. For a review of the current methods used to represent model error, see Palmer et al. (2005).
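The rapid growth of small initial perturbations in a deterministic chaotic system can be demonstrated with a toy ensemble. The sketch below uses the three-variable Lorenz (1963) system as a stand-in for a weather model, which is an assumption made purely for illustration; the perturbations are small fixed offsets rather than bred or singular vectors, and all function names are hypothetical.

```python
def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz (1963) system."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = lorenz63(state)
    k2 = lorenz63(shifted(state, k1, dt / 2.0))
    k3 = lorenz63(shifted(state, k2, dt / 2.0))
    k4 = lorenz63(shifted(state, k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def run_ensemble(control, perturbations, n_steps, dt=0.01):
    """Integrate one member per perturbation, each started from control + perturbation."""
    members = [tuple(c + p for c, p in zip(control, pert)) for pert in perturbations]
    for _ in range(n_steps):
        members = [rk4_step(member, dt) for member in members]
    return members


def spread(members, index=0):
    """Standard deviation of one state variable across the ensemble."""
    values = [member[index] for member in members]
    mean = sum(values) / len(values)
    return (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
```

Starting five members from a common control state with offsets of the order of $10^{-6}$ in the first variable, the ensemble spread grows by many orders of magnitude within a few thousand time-steps, although every single member evolves deterministically.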
It is a highly nontrivial task to adjust an ensemble simulation such that it is neither over nor under dispersive. The spread of the members of an ideal ensemble should be such that its probability density function perfectly matches the probability distribution of the possible atmospheric configurations. This can only be reached by properly adjusting both the perturbations of the initial state as well as the perturbations of the model. Interestingly, numerical weather prediction models rather tend to be under dispersive and rapidly converge toward the climate normals if the perturbations of the models are insufficient. Calibration techniques based on statistics of past ensemble forecasts can be used in a post-processing step in order to improve the forecast skill and in order to adjust the statistical distribution of an ensemble simulation. Interpreting the outcome of an ensemble simulation is much more sophisticated than interpreting a single deterministic run. The first step is usually to compare the main deterministic run of the respective model with the control run of the ensemble. Both these runs are initialized using the best guess for the initial state of the atmosphere which is available. Large deviations indicate that the model resolution has a strong impact on the outcome for the given atmospheric configuration. In the second step, one usually investigates the spread of the members of the ensemble, their median, and the deviation from the control run. A small spread of the ensemble members indicates a comparatively predictable state of the atmosphere, whereas a
large spread indicates an unstable and less predictable state. Finally, the ensemble members allow for probabilistic weather predictions. If, for example, only a fraction of the ensemble members predict a specific event for a certain region, the actual probability of the occurrence of the event can be derived from this fraction. For a more detailed overview and further references on measuring the forecast skill of ensemble simulations, see Candille and Talagrand (2005). There are many different ways to depict the outcome of an ensemble simulation. The most prominent one is probably the ensemble plume for a certain location in the simulation domain. For an ensemble plume, the forecasts of all members for a certain parameter are plotted against the lead time. Additional information can be incorporated by including the control run, the deterministic run, the ensemble median, and optionally the climate characteristics. Ensemble plumes allow for a quick overview of the spread of the ensemble members and an easy comparison with the ensemble median, the control run, the deterministic run, and the climate normal (see, e.g., Fig. 17).
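The step from ensemble members to a probabilistic statement can be sketched in a few lines. The example assumes that every member is equally likely and uses made-up wind speeds; the function names and the simple quantile rule (linear interpolation between order statistics) are illustrative choices, not the procedure of any particular forecasting center.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members whose forecast exceeds the threshold."""
    return sum(1 for value in members if value > threshold) / len(members)


def quantile(members, q):
    """Quantile of the ensemble, linearly interpolated between order statistics."""
    ordered = sorted(members)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


# Hypothetical 10 m wind forecasts (kn) of a ten-member ensemble for one lead time
winds = [10.0, 12.0, 15.0, 18.0, 22.0, 25.0, 26.0, 30.0, 31.0, 35.0]
probability_above_24_kn = exceedance_probability(winds, 24.0)
median_wind = quantile(winds, 0.5)
```

Evaluating the median together with the 10 % and 90 % quantiles for every lead time gives exactly the curves that are drawn in an ensemble plume.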
5
Statistical Weather Forecast (MOS)
Once endowed with the algorithms and techniques sketched above, one can do what is known as (dynamical) numerical weather prediction. Dropping for a moment any thoughts on physics and mathematics, that is, convergence, stability, formulation of the equations, technical difficulties, and accuracy, a weather model provides us with nothing else but numerical output that is usually given either as gridded data on different layers and some (native) mesh or as spherical harmonic coefficients. Either one can be mapped onto a regular grid for drawing charts and maps. Moreover, any kind of regularity in the data immediately gives convenient access to time series of model data if one archives model runs from the past. Clearly, this opens the way to answering questions concerning accuracy and model performance at a specific location but also allows one to refine the model in a so-called post-processing step. This leads us to MOS (Model Output Statistics), which relates the historical model information to measurements that have been taken at a certain location by linear or nonlinear regression. Hence, a dense station network helps to improve the outcome of a numerical weather prediction tremendously. For example, regional and especially local effects that are either not modeled physically or happen at a scale that is not resolved by the underlying model can be made visible in this statistically improved weather forecast. Such local effects could be, for example, windward and lee effects, cold air basins, exposure to special wind systems in valleys, Foehn, inversions, and so on. Figure 1 shows the comparison between the accuracy of the ECMWF direct model output (DMO) and the ECMWF-MOS at a station that is located on Hiddensee in the Baltic Sea. The MOS system significantly improves the model performance by detecting station-specific characteristics like sea breezes, sea-land wind circulation, sea warming in autumn, and so forth.
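In its simplest form, the regression step behind MOS can be sketched as an ordinary least-squares fit between archived direct model output and station observations. The single predictor, the made-up numbers, and the function names below are assumptions for illustration; operational MOS systems use many predictors and separate equations per station, lead time, and season.

```python
def fit_mos(model_values, observations):
    """Least-squares fit of observation = slope * model_value + intercept."""
    n = len(model_values)
    mean_x = sum(model_values) / n
    mean_y = sum(observations) / n
    s_xx = sum((x - mean_x) ** 2 for x in model_values)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(model_values, observations))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(forecasts, observations):
    """Mean absolute error, the score shown in Fig. 1."""
    return sum(abs(f - o) for f, o in zip(forecasts, observations)) / len(forecasts)


# Hypothetical archive: direct model output (DMO) and observed temperature at one station
dmo = [0.0, 2.0, 4.0, 6.0]
observed = [1.0, 2.0, 3.0, 4.0]
slope, intercept = fit_mos(dmo, observed)
mos = [slope * value + intercept for value in dmo]
```

In this constructed example the station reacts only half as strongly as the model suggests, and the fitted relation removes the resulting systematic error entirely; with real data the improvement is smaller but follows the same principle.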
Fig. 1 Comparison of MOS and DMO error in temperature and wind speed (station 100850; mean absolute error of EZ MOS and EZ model versus lead time [h])
Fig. 2 Combination of different MOS systems (diagram elements: real-time data, radar, UKMO MOS, ECMWF MOS, GFS MOS, MOS Mix)
Finally, the accuracy of statistical post-processing systems can be improved by combining the MOS systems of different models: for example, an ECMWF-MOS together with a GFS-MOS derived from the NOAA/NCEP GFS model, a UKMO-/UKNAE-MOS derived from the Met Office’s global and mesoscale (NAE) models, and additional MOS systems. The combination, done by an expert system (see Fig. 2), adjusts the weighting according to the current forecast skill and reduces the error variances tremendously and, thus, further improves the performance as shown in Fig. 3.
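One simple way to weight several MOS forecasts by their current skill is inverse mean-squared-error weighting, sketched below. This is an assumed stand-in for illustration only: the text does not specify how the expert system computes its weights, and the names and numbers are hypothetical.

```python
def skill_weights(recent_errors):
    """Weights proportional to the inverse mean squared error of each system.

    recent_errors maps a system name to its recent forecast errors
    (forecast minus observation); every system needs a nonzero error history.
    """
    inverse_mse = {name: 1.0 / (sum(e * e for e in errors) / len(errors))
                   for name, errors in recent_errors.items()}
    total = sum(inverse_mse.values())
    return {name: value / total for name, value in inverse_mse.items()}


def mos_mix(forecasts, weights):
    """Weighted combination of the individual MOS forecasts."""
    return sum(weights[name] * forecasts[name] for name in forecasts)


# Hypothetical recent temperature errors (K) and current forecasts (deg C)
errors = {"ECMWF-MOS": [1.0, -1.0], "GFS-MOS": [2.0, -2.0]}
weights = skill_weights(errors)
combined = mos_mix({"ECMWF-MOS": 10.0, "GFS-MOS": 20.0}, weights)
```

If the errors of the individual systems are independent, such a weighting yields a combined forecast whose error variance is lower than that of the best single system, which is the effect referred to above.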
6
Applying the Techniques to Kyrill
Keeping in mind the methods introduced so far, we now take a closer look at how they apply to the Kyrill event. We use ECMWF and Met Office NAE model data for an analysis of the weather conditions around January 17 and 18, shown in Fig. 4. A detailed synoptic analysis of this severe storm has been done in Fink et al. (2009). We chose this event for an analysis based on model output for two reasons. First, the mesoscale flow patterns of this winter storm are well captured, even 8–10 days before landfall. This underlines the good formulation of the flow and its parametrization done by ECMWF. Second, we observe certain effects that are not resolved by a spherical harmonic representation and which show the limitations of this formulation, namely sharp and fast traveling cold fronts accompanied by convectively embedded rainfall. Therefore, we first take a closer look at the ECMWF analysis to describe the geopotential height field and afterward use the Met Office NAE model, which uses a finite difference representation, to resolve very local effects. This model is nested into a global one but operates with lead times of only up to 36 h. The reader should note that all times are given in Greenwich Mean Time, also known as UTC, Zulu time, or z-time.

Fig. 3 Increase of accuracy when combining the different forecasts (mean absolute error [°C] versus lead time [h] for the individual MOS systems and the MOS Mix)
Fig. 4 Geopotential height (dm) and temperature at 500 hPa on 17 January 2007 18z (left) and 18 January 2007 0z (right)
6.1
Analysis of the Air Pressure and Temperature Fields
The images in Fig. 4 show a strong westerly flow that dominated the first weeks of January. In the evening of January 17, the depression Jürgen, as it was dubbed by German meteorologists, influenced Central Europe with windy and rainy weather. The first signals of Kyrill can be seen on the western edge of the domain over the Atlantic Ocean. This map also shows that Kyrill is accompanied by very cold air on its rear side at great heights, which indicates the high potential of a quickly intensifying low-pressure cell. In the night of the 18th we observe in Fig. 4 a strong separation of cold air masses in northern and milder ones in southern parts of Central Europe. This line of separation developed over the British Isles into the frontal zone of Kyrill. Meanwhile, Kyrill showed a pressure minimum of 962 hPa at mean sea level and traveled quickly eastward due to the strong flow at 500 hPa (about 278–315 km/h). In the early morning of the 18th, Kyrill developed strong gradients toward a high located over Spain and northern parts of Africa. This distinctive gradient led, in the warm sector of Kyrill, to the first damage in Ireland, South England, and northern parts of France with gusts of more than 65 kn. The pressure gradients intensified further and Kyrill was classified as a strong winter storm (see Fig. 5). While the pressure gradients started rising in southern and southwestern parts of Germany, the eastern parts were influenced by relatively calm parts of the ridge between Kyrill and Jürgen. The embedded warm front of Kyrill brought strong rainfall in western parts of Germany, especially in the low mountain ranges. The warm front drove mild air into South England, Northern France, and West Germany. By noon of the 18th, Kyrill had developed into a large storm depression yielding gusts of more than 120 km/h in wide areas of South England, North France, the Netherlands, Belgium, Luxembourg, Germany, and Switzerland (see Fig. 5). Even parts of Austria were affected. From the afternoon until the early evening of the 18th, the heavy and impressively organized cold front of Kyrill traveled from the northwest of Germany to the southeast, bringing heavy rain, hail, strong gusts, and thunderstorms (see Fig. 6). Due to the strong pressure gradient and the cold front, gusts at velocities between 120 and 160 km/h were measured even in the plains and lowlands. Finally, it should be mentioned that after the cold front had passed, a convergence and backward-oriented occlusion connected to the depression brought heavy rainfall in the northwest of Germany. In the rear of Kyrill, which was passing through quickly, the weather calmed down and heavy winds only occurred in the south and the low mountain ranges.

Fig. 5 Geopotential height (dm) and temperature at 500 hPa on 18 January 2007 6z (left) and 12z (right)
Fig. 6 Geopotential height (dm) and temperature at 500 hPa on 18 January 2007 18z (left) and 19 January 2007 0z (right)
6.2
Analysis of Kyrill’s Surface Winds
The following figures have been computed from the UK Met Office NAE model, which uses a finite difference formulation at a horizontal resolution of ca. 12 km. They show the model surface winds at a height of 10 m above West France, Germany, Switzerland, Austria, and eastern parts of Poland. The reader should note that the colors indicate the wind speed at the indicated time and not the absolute value of gusts in a certain time interval. The overlaid white isolines show the pressure field reduced to mean sea level. Starting at midnight of the 18th in Figs. 7 and 8, we observe a strong intensification of the surface winds over the North Sea and the described landfall. The strong winds inshore are due to the Channel acting like a nozzle. Noteworthy, when comparing Figs. 9–11, is the change in wind direction when the cold front entered northern parts of Germany.
6.3
Analysis of Kyrill’s 850 hPa Winds
To understand the severity of Kyrill’s gusts, we have a closer look at the model on the 850 hPa pressure level. In these layers, due to the convective nature of the cold front, heavy rainfall causes transport of the strong horizontal momentum into the vertical. Caused by this kind of mixing, fast traveling air at heights between 1,200 and 1,500 m is pushed down to the ground. These so-called downbursts are commonly responsible for the heavy damage caused by such a storm. From the images shown in Fig. 12 we observe a westerly flow at strong but not too heavy wind speeds. The reader should note that the wind speeds are given in knots. Moreover, it should be pointed out that the “calm” regions in the Alps are numerical artifacts that originate from the fact that, in these mountainous regions, the 850 hPa layer lies below the ground. In the following images shown in Figs. 13–16 we now observe the cold front with these strong winds passing over Germany.

Fig. 7 Wind speed 10 m above ground on 18 January 2007 0z (left) and 3z (right)
Fig. 8 Wind speed 10 m above ground on 18 January 2007 6z (left) and 9z (right)
Fig. 9 Wind speed 10 m above ground on 18 January 2007, 12z (left) and 15z (right)
Fig. 10 Wind speed 10 m above ground on 18 January 2007, 18z (left) and 21z (right)
Fig. 11 Wind speed 10 m above ground on 19 January 2007, 0z (left) and 3z (right)
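The link between the 850 hPa level and the heights quoted above can be checked with the barometric formula of a standard atmosphere. The constants below (sea-level pressure 1013.25 hPa, sea-level temperature 288.15 K, lapse rate 6.5 K/km) are the usual standard-atmosphere values and are an assumption here; the actual height of the 850 hPa surface varies with the weather situation, and the function name is hypothetical.

```python
def standard_atmosphere_height(p_hpa, p0_hpa=1013.25, t0=288.15,
                               lapse=0.0065, r_dry=287.05, g=9.80665):
    """Height (m) of the pressure level p_hpa in a standard atmosphere.

    Uses z = (T0 / L) * (1 - (p / p0) ** (R_dry * L / g)), valid in the troposphere.
    """
    exponent = r_dry * lapse / g
    return (t0 / lapse) * (1.0 - (p_hpa / p0_hpa) ** exponent)
```

Under these assumptions the 850 hPa level lies at roughly 1,450 m. This also explains the artificial “calm” regions in the Alps: wherever the terrain is higher than about 1,450 m, the 850 hPa surface lies below the model ground.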
6.4
Ensemble Forecasts
In the scope of the 50-member ECMWF ensemble forecasting system, first signals for a severe winter storm showed up as early as 10 days in advance. In Fig. 17, we show a so-called ensemble plume plot for the 10 m wind for the city of Frankfurt, Germany. Plotted over the lead time, the plume plot contains the results for the 50 ensemble members, the ensemble median, as well as the 10 % and the 90 % quantile. In the ensemble run initialized on Monday, January 8, 12z, at least a fraction of the ensemble members exceeded wind speeds of 24 kn for January 18. At the same time, a fraction of the ensemble members exceeded 14 mm for the 6-h precipitation for the same forecast time (see Fig. 17).

Fig. 12 850 hPa wind on 18 January 2007 0z (left) and 3z (right)
Fig. 13 850 hPa wind on 18 January 2007 6z (left) and 9z (right)
Fig. 14 850 hPa wind on 18 January 2007, 12z (left) and 15z (right)
Fig. 15 850 hPa-Wind on 18 January 2007, 18z (left) and 21z (right)
Fig. 16 850 hPa-Wind on 19 January 2007, 0z (left) and 3z (right)
Fig. 17 Ensemble plumes for Frankfurt from 8 January 2007, 12z. 10 m Wind (left) and precipitation (right)
Fig. 18 Ensemble plumes for Frankfurt from 10 January 2007, 12z. 10 m Wind (left) and precipitation (right)
Fig. 19 Ensemble plumes for Frankfurt from 12 January 2007, 12z. Wind (left) and precipitation (right)
It should be noted that despite the underestimated strength of the event, the timing was already highly precise at that time. However, on January 8, the different members of the ensemble did not exhibit a very consistent outcome for the future Kyrill event. In the following, we track the ensemble forecasts for the Kyrill winter storm for the city of Frankfurt, Germany, while approaching the time of the passage of the main front. In Fig. 18, we show the corresponding ensemble plume plots for the wind and the 6-h precipitation for the ensemble run initialized on January 10, 12z. The signals for the 10 m wind increased considerably compared to the previous run. A fraction of the ensemble members now exceeds 30 kn and the ensemble median clearly exceeds 15 kn. At the same time, a distinct peak in the expected precipitation develops, with an increased number of the ensemble members exceeding 14 mm for the 6-h precipitation. The timing for the main event is almost unchanged compared to the previous run. The strongest winds are still expected to occur around 19 January 0z.
Fig. 20 Ensemble plumes for Frankfurt from 14 January 2007, 12z. Wind (left) and precipitation (right)
In the plots for the ensemble run initialized on January 12, 12z, shown in Fig. 19, the signals for the severe winter storm Kyrill become more pronounced. At that time, about 7 days in advance, almost all ensemble members exceed 10 m winds of 15 kn and the 90 % quantile exceeds 30 kn. At the same time, the distinct peak in the 6-h precipitation becomes more pronounced. The timing of the event is hardly altered, with the strongest winds still expected around January 19, 0z. About 5 days before the main event, the 10 % quantile for the 10 m winds almost reaches 18 kn and a very consistent picture develops in the scope of the ECMWF ensemble forecasts (see Fig. 20). Based on this very consistent picture, 10 m winds of up to 30 kn can be expected for the time between January 18, 12z and January 19, 0z. The 6-h precipitation can be expected to reach about 5 mm based on the ensemble median.
6.5
MOS Forecasts
In order to stress the importance of local forecasts which take into account local effects, we will show in the following three MOS forecast charts for the city of Frankfurt, Germany (Figs. 21–23). The MOS forecasts shown in the following are based on the deterministic run of the ECMWF model. Already from the MOS run based on the model output from January 12, 12z, a detailed and very precise picture of the Kyrill winter storm can be drawn. All the main features of the event are contained as well as the precise timing, which had already been found in the ensemble forecasts. The strongest gusts with more than 50 km/h are expected for the late evening of January 19. At the same time, the reduced pressure is expected to drop to about 999 hPa at this specific location. In the MOS run based on the January 14, 12z model output, the pass of the cold front with the following peak of the gusts and the wind becomes more pronounced
N. Dorband et al.
Fig. 21 MOS chart for Frankfurt from 12 January 2007, 12z
than in the previous run. The peak in the precipitation was to be expected after the passage of the cold front. In the MOS run based on the January 17, 12z ECMWF model output, the speed of the maximum gusts increased even more and the expected precipitation rose. These MOS charts make the importance of local forecasts easy to understand. Comparing the MOS charts with the ensemble forecasts shown in the previous section, it is obvious that local effects play a crucial role. In Fig. 24, we show an observation chart in order to verify the MOS forecast from January 17, 12z. The forecasts overestimated the precipitation, but the sea level pressure, the wind, the gusts, and the temperature profile were captured with very high precision.
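The idea behind such a statistical correction can be sketched with a one-predictor regression. This is a deliberately minimal illustration and not the MOS system used for the charts: operational MOS typically uses many predictors, and the training pairs and the names `fit_mos` and `apply_mos` below are hypothetical.

```python
def fit_mos(model_values, observed_values):
    """Least-squares fit of observed ≈ a + b * model for a single predictor."""
    n = len(model_values)
    mx = sum(model_values) / n
    my = sum(observed_values) / n
    sxx = sum((x - mx) ** 2 for x in model_values)
    sxy = sum((x - mx) * (y - my) for x, y in zip(model_values, observed_values))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def apply_mos(a, b, model_value):
    """Correct a raw model value with the fitted station-specific relation."""
    return a + b * model_value

# Hypothetical training data for one station: raw model gusts (km/h)
# and the gusts actually observed at the station (km/h).
model = [20.0, 35.0, 50.0, 65.0, 80.0]
obs = [26.0, 44.0, 62.0, 80.0, 98.0]

a, b = fit_mos(model, obs)
print("corrected forecast for a raw model gust of 70 km/h:", apply_mos(a, b, 70.0))
```

The fitted coefficients encode the local effect at the station, which is exactly the information a grid-point forecast lacks.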
6.6 Weather Radar
For the sake of completeness, we should also take a closer look at the radar images for the same time period. The radar images in Figs. 25–29 show shaded areas where strong rainfall or hail is observed. The stronger the precipitation event, the brighter the color. These images provide deep insight into the strong, narrow cold front traveling from northern Germany to the south. In Figs. 27 and 28, we see the extreme sharpness and the impressively strong organization of the fast-traveling cold front.
Fig. 22 MOS chart for Frankfurt from 14 January 2007, 12z
Fig. 23 MOS chart for Frankfurt from 17 January 2007, 12z
Fig. 24 Observation chart for Frankfurt for the Kyrill event
This highly damaging front had a cross diameter of 46 km, which the models discussed above unfortunately did not show because of their limited resolution. Resolving such patterns properly remains a challenging task for current research.
7 Conclusion
Since the early days of computer simulations, scientists have been interested in using them for modeling the atmosphere and predicting the weather. Nowadays, these efforts have evolved into essential tools for meteorologists. Rapid development continues and will continue in the future, owing to access to sufficiently fast computers and the availability of highly sophisticated mathematical methods and algorithms for solving the underlying nonlinear problems. Some of the most important methods used in current operational forecasting codes have been reviewed in this chapter. After the fundamental set of governing partial differential equations was defined, a formulation and discretization suitable for solving these equations numerically on three-dimensional grids was introduced. Using the example of cloud formation and precipitation, we showed how microphysical processes are coupled to such a model.
Fig. 25 Radar images from 18.1.2007 0z (left) and 3z (right)
Fig. 26 Radar images from 18.1.2007 6z (left) and 9z (right)
Some commonly used methods for assimilating information from observational data to improve the accuracy of predictions have been outlined. It is crucial to understand that even though the models are deterministic, being well-defined initial value problems, the complexity and nonlinearity of the underlying mathematics as well as our ignorance of the exact initial conditions make it difficult to predict the quality of a single forecast. To tackle this problem, statistical methods known as ensemble forecasts have been developed. Finally, statistical post-processing techniques known as Model Output Statistics (MOS) can be used to further improve the forecast quality at specific locations.
Fig. 27 Radar images from 18.1.2007 12z (left) and 15z (right)
Fig. 28 Radar images from 18.1.2007 18z (left) and 21z (right)
During the last two decades, through methods such as the ones described here, numerical weather models have opened a window for observing and investigating the atmosphere at an unprecedented level of detail, and they contribute significantly to our ability to understand and predict the weather and its dynamics. As an example of the type and quality of information we can extract from atmospheric simulations in combination with statistical analysis and observational data (weather station reports and radar maps), we analyzed the winter storm Kyrill and the processes that led to this event. For such an analysis, model data provide us with detailed temperature, pressure, and wind maps that would not be available at a comparable frequency through observational data alone. The dynamics at different pressure levels that eventually led to this devastating storm were described
Fig. 29 Radar images from 19.1.2007 0z (left) and 19.1. 3z (right)
and analyzed in detail based on data obtained from the ECMWF and the UK Met Office NAE models. We then looked at the event through ensemble plumes and MOS diagrams, which give us first hints of the storm more than a week in advance and rather precise quantitative predictions of wind speeds and precipitation at specific locations a couple of days before the event. The comparison with radar images reveals limitations of the model predictions, due to fine, localized structures that are not accurately resolved on the model grids. This demonstrates the need for better local models with very high spatial resolution and more reliable coupling to all available observational data. Despite the overall accurate picture we already obtain, such high-resolution models can be very valuable when preparing for a severe weather situation, for example for emergency teams that have to decide where to start evacuations or move manpower and machinery.

Acknowledgements The authors would like to acknowledge Meteomedia, especially Markus Pfister and Mark Vornhusen, for many fruitful discussions and help with the ensemble and MOS charts. Moreover, the authors' gratitude goes to ECMWF for providing extensive documentation and scientific material on their current systems.
Radio Occultation via Satellites
Christian Blick and Sarah Eberle
Contents
1 Introduction
2 Physical Background of Radio Occultation
2.1 GPS Signals
2.2 The Radio Occultation Techniques
3 Mathematical Modeling of Radio Occultation Data
3.1 Notation
3.2 Spherical Harmonics
3.3 Green's Function with Respect to the Beltrami Operator
3.4 Spherical Spline Functions
3.5 Results
4 Conclusion
References
Abstract
Radio Occultation is a method for measuring atmospheric properties of the Earth, as well as of distant planets, via satellites. The basic idea of the method is to determine the Doppler shift of a signal emitted by a satellite as it passes through the atmosphere of the planet. In this chapter, we give an introduction to Radio Occultation (RO), including its physical properties and its modeling aspects. Furthermore, in order to visualize the data obtained by RO and to compare them with other measurements, such as radiosonde data and data obtained by other satellites, we introduce combined spherical interpolating and smoothing splines, which are particularly suited to handling RO data. In doing so, we obtain approximations that are consistent with the data and/or smooth out short-lived atmospheric weather phenomena. As examples,
C. Blick • S. Eberle, Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_100
C. Blick and S. Eberle
we use spherical splines to depict certain layers of the atmosphere on a global and local scale, to illustrate the change over time in a certain layer, to compute differences in order to compare 2 years at a certain layer, and to show atmospheric profiles at arbitrary locations on the Earth.
1 Introduction
Over the past years, discussions about climate change have become more and more important. In order to prove or disprove the arguments used in these discussions, a large, globally distributed dataset is required over a sufficiently long time interval. To this end, the RO method, a satellite-based measuring technique, was introduced in the Earth sciences. This method, first suggested by a group at Stanford University in 1962, was originally developed to provide atmospheric data of distant planets in our solar system. Radio Occultation provides a globally distributed dataset of vertical profiles of a variety of atmospheric parameters such as density, pressure, temperature, and water vapor. Several satellites equipped with measuring instruments have been launched into low Earth orbit. One of them is the German CHAllenging Minisatellite Payload (CHAMP), which also provided the data used in this work. The satellite was launched in July 2000 and took its first measurements in February 2001. CHAMP operated until September 2010 and collected measurements over the whole operating period. The Radio Occultation method, e.g., as based on CHAMP data, has several advantages over other techniques used to obtain atmospheric data, such as radiosondes and aircraft-based measurements. These benefits are weather independence, global distribution of the data from the Earth's surface up to 40 km altitude, and high precision. In order to handle the climate data provided by the Radio Occultation method, they have to be visualized via mathematical methods. For that purpose, a certain spline approximation method is introduced and applied to the CHAMP dataset, which was provided by the German Research Center for Geosciences (GFZ).
We would like to mention that other institutions are also working with Radio Occultation, such as the Danish Meteorological Institute (DMI), EUMETSAT Darmstadt, the Jet Propulsion Laboratory Pasadena (JPL), the University Corporation for Atmospheric Research (UCAR), and the Wegener Center, University of Graz (see, e.g., Steiner et al. 2013). The main objectives of this chapter are the modeling aspect of Radio Occultation and the visualization of the data with the help of a combined interpolation and smoothing spline method. In particular, we are interested in visualizing the temperature for specific layers, which shows the vertical decomposition as well as the change over time for a specific layer. In addition, the spline approximation method is applied to visualize the change over different years for the same layer as well as to compute vertical profiles of atmospheric parameters at arbitrary positions.
Fig. 1 Simplified general setting of Radio Occultation (figure labels: position of the GPS satellite, position of the LEO, Earth, atmosphere)
2 Physical Background of Radio Occultation
First of all, we have a look at the physical modeling of Radio Occultation (RO). As mentioned in the Introduction, the signals we deal with are emitted from a GPS satellite and received by a Low Earth Orbiter (LEO), e.g., CHAMP, as shown in Fig. 1. The basic idea is to measure, with the LEO, the change of the signal emitted by the GPS satellite. The modification of the signal itself is based on refraction, diffraction, and scattering while passing through the atmosphere and leads to conclusions about the atmospheric properties. To get more insight, we start with the physical background (see also, e.g., Wickert 2002). For Radio Occultation, we conventionally assume that the signals are monochromatic and that the transmitted signals behave like rays. Under these assumptions, we are able to apply the concepts of geometrical optics. In this context, we are immediately led to Fermat's principle in the form of the following integral relation:
\[ \int_{\text{transmitter}}^{\text{receiver}} n(s)\, ds = \min, \tag{1} \]
where $n$ denotes the (real) refraction index. This principle tells us that the length of the optical path between the transmitter and the receiver has to be minimal. Furthermore, we have to identify the different types of propagation. For the RO signal, we distinguish the following three types:
Scattering: The signal is dispersed by different particles in the air.
Diffraction: The signal is deflected by a near surface.
Refraction: The signal is bent toward the surface.
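Fermat's principle can be made concrete with a small numerical experiment. The sketch below assumes a strongly simplified setting that is not taken from the chapter: two homogeneous layers with refraction indices `n1` and `n2`, separated by the plane y = 0, and a ray consisting of two straight segments. Minimizing the optical path over the crossing point reproduces Snell's law of refraction; all names and values are illustrative.

```python
import math

def optical_path(x, n1, n2, a=(0.0, 1.0), b=(1.0, -1.0)):
    """Optical length of the broken ray a -> (x, 0) -> b.

    The medium above y = 0 has index n1, the medium below has index n2.
    """
    return (n1 * math.hypot(x - a[0], a[1]) +
            n2 * math.hypot(b[0] - x, b[1]))

def minimize(f, lo, hi, tol=1e-12):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c
            c = hi - g * (hi - lo)
        else:
            lo, c = c, d
            d = lo + g * (hi - lo)
    return 0.5 * (lo + hi)

# Illustrative indices; atmospheric values differ from 1 far less than this.
n1, n2 = 1.0, 1.5
x_star = minimize(lambda x: optical_path(x, n1, n2), 0.0, 1.0)

# Sines of the angles between the two ray segments and the interface normal.
sin1 = x_star / math.hypot(x_star, 1.0)
sin2 = (1.0 - x_star) / math.hypot(1.0 - x_star, 1.0)
print("n1 * sin(theta1) =", n1 * sin1)
print("n2 * sin(theta2) =", n2 * sin2)
```

The two printed values agree up to the accuracy of the search, which is Snell's law. In a continuously stratified atmosphere, the same minimization bends the ray gradually instead of at a single interface.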
Another important RO feature is the Fresnel zone, which gives the distance between the transmitter and receiver such that there are no scattering effects: L
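and
\[ \lim_{\tau \to 0,\, \tau > 0} \big[ w(f, x + \tau\nu(x)) - w(f, x - \tau\nu(x)) \big] = f(x), \tag{93} \]
\[ \lim_{\tau \to 0,\, \tau > 0} \big[ Q(f, x + \tau\nu(x)) - Q(f, x - \tau\nu(x)) \big] = 0. \tag{94} \]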
Stokes Problem, Layer Potentials and Regularizations, and Multiscale Applications
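Furthermore,
\[ \lim_{\tau \to 0,\, \tau > 0} \big[ \boldsymbol{\sigma}[v(f,\cdot)](x + \tau\nu(x))\,\nu(x) - \boldsymbol{\sigma}[v(f,\cdot)](x - \tau\nu(x))\,\nu(x) \big] = -f(x). \tag{95} \]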
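Since all the abbreviations and conversions which have been applied in the proofs of Theorems 4 and 5 also hold in the uniform sense, we are able to formulate the following result. The reader is referred to Günter (1957) and Freeden and Gerhards (2013) for an analogous argument in the case of the limit and jump relations corresponding to the Laplace equation.

Theorem 6. For all $f \in c^{(0)}(\Sigma)$, the limit relations formulated in Theorems 4 and 5 are valid in the uniform sense, i.e.,
\[ \lim_{\tau \to 0,\, \tau > 0}\ \sup_{x \in \Sigma} \big| v(f, x \pm \tau\nu(x)) - v(f, x) \big| = 0, \tag{96} \]
\[ \lim_{\tau \to 0,\, \tau > 0}\ \sup_{x \in \Sigma} \Big| P(f, x \pm \tau\nu(x)) - P(f, x) \pm \tfrac{1}{2}\,\nu(x) \cdot f(x) \Big| = 0, \tag{97} \]
\[ \lim_{\tau \to 0,\, \tau > 0}\ \sup_{x \in \Sigma} \Big| w(f, x \pm \tau\nu(x)) - w(f, x) \mp \tfrac{1}{2}\, f(x) \Big| = 0, \tag{98} \]
\[ \lim_{\tau \to 0,\, \tau > 0}\ \sup_{x \in \Sigma} \big| Q(f, x \pm \tau\nu(x)) - Q(f, x) \big| = 0, \tag{99} \]
\[ \lim_{\tau \to 0,\, \tau > 0}\ \sup_{x \in \Sigma} \Big| \boldsymbol{\sigma}[v(f,\cdot)](x \pm \tau\nu(x))\,\nu(x) - \Big( \mp\tfrac{1}{2}\, f(x) + \boldsymbol{\sigma}[v(f,\cdot)](x)\,\nu(x) \Big) \Big| = 0. \tag{100} \]
Furthermore, for all $f \in c^{(0)}(\Sigma)$, the jump relations formulated in Corollary 2 are valid in the uniform sense, i.e.,
\[ \lim_{\tau \to 0,\, \tau > 0}\ \sup_{x \in \Sigma} \big| v(f, x + \tau\nu(x)) - v(f, x - \tau\nu(x)) \big| = 0, \tag{101} \]
\[ \lim_{\tau \to 0,\, \tau > 0}\ \sup_{x \in \Sigma} \big| P(f, x + \tau\nu(x)) - P(f, x - \tau\nu(x)) + \nu(x) \cdot f(x) \big| = 0, \tag{102} \]
\[ \lim_{\tau \to 0,\, \tau > 0}\ \sup_{x \in \Sigma} \big| w(f, x + \tau\nu(x)) - w(f, x - \tau\nu(x)) - f(x) \big| = 0, \tag{103} \]
\[ \lim_{\tau \to 0,\, \tau > 0}\ \sup_{x \in \Sigma} \big| Q(f, x + \tau\nu(x)) - Q(f, x - \tau\nu(x)) \big| = 0, \tag{104} \]
\[ \lim_{\tau \to 0,\, \tau > 0}\ \sup_{x \in \Sigma} \big| \boldsymbol{\sigma}[v(f,\cdot)](x + \tau\nu(x))\,\nu(x) - \boldsymbol{\sigma}[v(f,\cdot)](x - \tau\nu(x))\,\nu(x) + f(x) \big| = 0. \tag{105} \]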
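The jump behaviour of a double-layer potential can be checked numerically in the potential-theoretic analogue mentioned above. The sketch below does not use the Stokes kernels of this chapter. As an assumption made purely for illustration, it evaluates the classical Laplace double-layer potential with unit density on the unit sphere, whose value is known from Gauss's solid-angle formula to be $-4\pi$ inside and $0$ outside the sphere. The function name and the quadrature resolution are hypothetical choices.

```python
import math

def double_layer_unit_density(x, n_t=200, n_phi=200):
    """Midpoint-rule value of the integral of nu(y).(x - y)/|x - y|^3 over the unit sphere.

    The sphere is parametrized by t = cos(theta) and phi, so the surface
    element is dt * dphi, and the outer normal is nu(y) = y.
    """
    dt = 2.0 / n_t
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_t):
        t = -1.0 + (i + 0.5) * dt
        r = math.sqrt(1.0 - t * t)
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            y = (r * math.cos(phi), r * math.sin(phi), t)
            d = (x[0] - y[0], x[1] - y[1], x[2] - y[2])
            dist = math.sqrt(d[0] ** 2 + d[1] ** 2 + d[2] ** 2)
            total += (y[0] * d[0] + y[1] * d[1] + y[2] * d[2]) / dist ** 3 * dt * dphi
    return total

inside = double_layer_unit_density((0.3, 0.2, 0.5))    # point inside the sphere
outside = double_layer_unit_density((0.6, 0.4, 1.0))   # point outside the sphere
print("inside :", inside, " (theory: -4*pi)")
print("outside:", outside, " (theory: 0)")
print("jump   :", outside - inside, " (theory: 4*pi)")
```

The potential is constant on each side of the surface but jumps across it by a fixed multiple of the density, which is the same qualitative behaviour that the jump relation for the double-layer potential $w$ expresses in the Stokes case.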
C. Mayer and W. Freeden
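Potential Operators: In order to extend the limit and jump relations from the space of continuous vector fields on $\Sigma$, $c^{(0)}(\Sigma)$, to the space of square-integrable vector fields on $\Sigma$, $l^2(\Sigma)$, we express the single- and double-layer potentials using the so-called potential operators (for potential theoretic analogues, see Freeden 1980; Freeden and Mayer 2003; Freeden and Gerhards 2013).

Definition 4. For $\tau \neq \sigma$, $|\tau|, |\sigma|$ sufficiently small, we define the potential operators $V(\tau,\sigma) : l^2(\Sigma) \to l^2(\Sigma)$ and $P(\tau,\sigma) : l^2(\Sigma) \to L^2(\Sigma)$ using the abbreviations $x_\tau = x + \tau\nu(x)$, $x \in \Sigma$, and $y_\sigma = y + \sigma\nu(y)$, $y \in \Sigma$, as follows:
\[ V(\tau,\sigma) f(x) = \int_\Sigma u(x_\tau, y_\sigma)\, f(y)\, d\omega(y), \tag{106} \]
\[ P(\tau,\sigma) f(x) = \int_\Sigma q(x_\tau, y_\sigma) \cdot f(y)\, d\omega(y). \tag{107} \]
For $\sigma = 0$ and $\tau > 0$ sufficiently small, the operators $V(\tau,0)$ and $P(\tau,0)$ are called operators of the single-layer potential on $\Sigma$ with values on $\Sigma(\tau)$. For $\tau \neq \sigma$, $|\tau|, |\sigma|$ sufficiently small, we define the potential operators $W(\tau,\sigma) : l^2(\Sigma) \to l^2(\Sigma)$ and $Q(\tau,\sigma) : l^2(\Sigma) \to L^2(\Sigma)$ as follows:
\[ W(\tau,\sigma) f(x) = \sum_{k=1}^{3} \varepsilon^{(k)} \int_\Sigma \nu(y) \cdot \boldsymbol{\sigma}\big[u\,\varepsilon^{(k)},\, q \cdot \varepsilon^{(k)}\big](x_\tau, y_\sigma)\, f(y)\, d\omega(y), \tag{108} \]
\[ Q(\tau,\sigma) f(x) = \nabla_x \cdot \Big( \int_\Sigma \big( q(x_\tau, y_\sigma) \cdot f(y) \big)\, \nu(y)\, d\omega(y) \Big). \tag{109} \]
For $\sigma = 0$ and $\tau > 0$ sufficiently small, the operators $W(\tau,0)$ and $Q(\tau,0)$ are called operators of the double-layer potential on $\Sigma$ with values on $\Sigma(\tau)$. Finally, for $\tau \neq \sigma$, $|\tau|, |\sigma|$ sufficiently small, we define the operator $V'(\tau,\sigma) : l^2(\Sigma) \to l^2(\Sigma)$ as follows:
\[ V'(\tau,\sigma) f(x) = \boldsymbol{\sigma}[V(\tau,\sigma) f](x + \tau\nu(x))\, \nu(x). \tag{110} \]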
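Explicitly calculating the expressions in (108) and (109) for $\sigma = 0$, we obtain, for $f \in l^2(\Sigma)$,
\[ W(\tau,0) f(x) = \int_\Sigma \mathbf{k}(x + \tau\nu(x), y)\, f(y)\, d\omega(y), \quad x \in \Sigma, \tag{111} \]
\[ Q(\tau,0) f(x) = \int_\Sigma k(x + \tau\nu(x), y) \cdot f(y)\, d\omega(y), \quad x \in \Sigma, \tag{112} \]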
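with kernel functions $\mathbf{k} \in \mathbf{l}^2(\mathbb{R}^3 \times \mathbb{R}^3)$ and $k \in l^2(\mathbb{R}^3)$. For $\tau = \sigma = 0$, the kernels of the potentials in (106), (107), (111), and (112) have weak singularities. The integrals formally defined by
\[ V(0,0) f(x) = \int_\Sigma u(x, y)\, f(y)\, d\omega(y), \tag{113} \]
\[ P(0,0) f(x) = \int_\Sigma q(x, y) \cdot f(y)\, d\omega(y), \tag{114} \]
\[ W(0,0) f(x) = \sum_{k=1}^{3} \varepsilon^{(k)} \int_\Sigma \nu(y) \cdot \boldsymbol{\sigma}\big[u\,\varepsilon^{(k)},\, q \cdot \varepsilon^{(k)}\big](x, y)\, f(y)\, d\omega(y), \tag{115} \]
\[ Q(0,0) f(x) = \nabla_x \cdot \Big( \int_\Sigma \big( q(x, y) \cdot f(y) \big)\, \nu(y)\, d\omega(y) \Big), \tag{116} \]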
for $x \in \Sigma$, however, exist and define linear bounded operators on $l^2(\Sigma)$. $V(0,0)$ and $W(0,0)$ map $c^{(0)}(\Sigma)$ into $c^{(0)}(\Sigma)$. Furthermore, the operators $P(0,0)$ and $Q(0,0)$ map $c^{(0)}(\Sigma)$ into $C^{(0)}(\Sigma)$. The operators $V(0,0)$ and $W(0,0)$ are even compact in $c^{(0)}(\Sigma)$ due to Theorem 22. Furthermore, the operator $V(0,0)$ is self-adjoint with respect to the $l^2(\Sigma)$-inner product, i.e., for $f, g \in l^2(\Sigma)$, we have
\[ \int_\Sigma \big( V(0,0) f(x) \big) \cdot g(x)\, d\omega(x) = \int_\Sigma f(x) \cdot \big( V(0,0) g(x) \big)\, d\omega(x). \tag{117} \]
For the operator of the double-layer potential, the relation is not as simple. For the operator $W(0,0)$, we have the following result.

Lemma 1. The right normal projection of the single-layer potential and the negative double-layer potential are adjoint to each other with respect to the $l^2(\Sigma)$-inner product, i.e., for each $f, g \in l^2(\Sigma)$,
\[ \int_\Sigma \big( V'(0,0) f(x) \big) \cdot g(x)\, d\omega(x) = -\int_\Sigma f(x) \cdot \big( W(0,0) g(x) \big)\, d\omega(x). \tag{118} \]
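Proof. Explicit calculations give, for $f, g \in l^2(\Sigma)$,
\begin{align}
&\int_\Sigma \boldsymbol{\sigma}[V(0,0) f](x)\,\nu(x) \cdot g(x)\, d\omega(x) \notag \\
&\quad = \int_\Sigma \Big( -\frac{3}{4\pi} \int_\Sigma \frac{(x-y) \otimes (x-y)}{|x-y|^5}\, \big( (x-y) \cdot f(y) \big)\, d\omega(y) \Big)\, \nu(x) \cdot g(x)\, d\omega(x) \notag \\
&\quad = -\frac{3}{4\pi} \sum_{i,j,k=1}^{3} \int_\Sigma \int_\Sigma \frac{(x_i - y_i)(x_j - y_j)(x_k - y_k)}{|x-y|^5}\, f_k(y)\, d\omega(y)\, \nu_j(x)\, g_i(x)\, d\omega(x) \notag \\
&\quad = -\frac{3}{4\pi} \sum_{i,j,k=1}^{3} \int_\Sigma \int_\Sigma \frac{(x_k - y_k)(x_i - y_i)(x_j - y_j)}{|x-y|^5}\, \nu_j(x)\, g_i(x)\, d\omega(x)\, f_k(y)\, d\omega(y) \notag \\
&\quad = -\int_\Sigma \Big( \int_\Sigma \mathbf{k}(y, x)\, g(x)\, d\omega(x) \Big) \cdot f(y)\, d\omega(y) \notag \\
&\quad = -\int_\Sigma \big( W(0,0) g(y) \big) \cdot f(y)\, d\omega(y). \tag{119}
\end{align}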
This yields the desired result. The relation of Lemma 1 is of particular importance for our later discussion of boundary integral equations. It relates the inner Dirichlet problem to the outer Neumann problem and vice versa, as we will see later on. As an immediate consequence of Lemma 1, we have, for $f \in l^2(\Sigma)$,
\[ \big( V'(0,0) f \big)(x) = \boldsymbol{\sigma}[V(0,0) f](x)\, \nu(x) = -\int_\Sigma \big( \mathbf{k}(y, x) \big)^T f(y)\, d\omega(y), \quad x \in \Sigma. \tag{120} \]
The formally defined integrals in (113)–(116), however, do not coincide with the inner or outer limits of the potentials (59), (60), (63), and (64), as we have seen in Theorems 4 and 5. The potential operators defined in Definition 4 enable us to give a concise formulation of the limits of the potentials and their formal values on the regular surface, i.e., a reformulation of Theorems 4 and 5. To do so, we define limit and jump operators, $L_i^\pm(\tau)$ and $J_i(\tau)$, $\tau > 0$, by
\[ L_1^\pm(\tau) = V(\pm\tau, 0) - V(0,0), \tag{121} \]
\[ L_2^\pm(\tau) = P(\pm\tau, 0) - P(0,0) \pm \tfrac{1}{2}\,\nu\,\cdot, \tag{122} \]
\[ L_3^\pm(\tau) = W(\pm\tau, 0) - W(0,0) \mp \tfrac{1}{2}\, I, \tag{123} \]
\[ L_4^\pm(\tau) = Q(\pm\tau, 0) - Q(0,0), \tag{124} \]
\[ L_5^\pm(\tau) = V'(\pm\tau, 0) - \Big( \mp\tfrac{1}{2}\, I + V'(0,0) \Big), \tag{125} \]
and
\[ J_1(\tau) = V(\tau, 0) - V(-\tau, 0), \tag{126} \]
\[ J_2(\tau) = P(\tau, 0) - P(-\tau, 0) + \nu\,\cdot, \tag{127} \]
\[ J_3(\tau) = W(\tau, 0) - W(-\tau, 0) - I, \tag{128} \]
\[ J_4(\tau) = Q(\tau, 0) - Q(-\tau, 0), \tag{129} \]
\[ J_5(\tau) = V'(\tau, 0) - V'(-\tau, 0) + I. \tag{130} \]
Here $I$ denotes the identity operator and $\nu\,\cdot$ denotes the operator $f \mapsto \nu \cdot f$.
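Let $T(\tau)$ be one of the operators $L_i^\pm(\tau)$ or $J_i(\tau)$, $i = 1, 3, 5$; then the adjoint operator $T(\tau)^*$ is, as usual, defined by
\[ \big( g, T(\tau) f \big)_{l^2(\Sigma)} = \big( T(\tau)^* g, f \big)_{l^2(\Sigma)} \tag{131} \]
for all $f, g \in l^2(\Sigma)$. For the cases $i = 2, 4$, the adjoint operator is defined by
\[ \big( G, T(\tau) f \big)_{L^2(\Sigma)} = \big( T(\tau)^* G, f \big)_{l^2(\Sigma)} \tag{132} \]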
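for all $f \in l^2(\Sigma)$ and $G \in L^2(\Sigma)$. An explicit representation can be obtained, for example, for the operator $J_1(\tau)$ as follows. Let $f, g \in l^2(\Sigma)$; then we have
\begin{align}
\big( g, J_1(\tau) f \big)_{l^2(\Sigma)} &= \int_\Sigma g(x) \cdot \big( J_1(\tau) f \big)(x)\, d\omega(x) \notag \\
&= \int_\Sigma g(x) \cdot \Big( \int_\Sigma \big( u(x + \tau\nu(x), y) - u(x - \tau\nu(x), y) \big)\, f(y)\, d\omega(y) \Big)\, d\omega(x) \notag \\
&= \int_\Sigma \Big( \int_\Sigma \big( u(x + \tau\nu(x), y) - u(x - \tau\nu(x), y) \big)^T g(x)\, d\omega(x) \Big) \cdot f(y)\, d\omega(y) \notag \\
&= \int_\Sigma \big( J_1(\tau)^* g \big)(y) \cdot f(y)\, d\omega(y) = \big( J_1(\tau)^* g, f \big)_{l^2(\Sigma)} \tag{133}
\end{align}
with $J_1(\tau)^* : l^2(\Sigma) \to l^2(\Sigma)$ defined by
\[ \big( J_1(\tau)^* g \big)(x) = \int_\Sigma \big( u(y + \tau\nu(y), x) - u(y - \tau\nu(y), x) \big)^T g(y)\, d\omega(y), \quad x \in \Sigma. \tag{134} \]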
Similarly, explicit representations of the adjoint operators $L_i^\pm(\tau)^*$, $i = 1, 3, 5$, and $J_i(\tau)^*$, $i = 3, 5$, can be obtained. Essentially, the explicit representations of the adjoint operators are based on transposing the corresponding tensor kernels of the operators $L_i^\pm(\tau)$ and $J_i(\tau)$ and interchanging variables for the cases $i = 1, 3, 5$, and on interchanging variables in the vector kernels of the operators $L_i^\pm(\tau)$ and $J_i(\tau)$ for the cases $i = 2, 4$. As mentioned before, we can reformulate Theorems 4 and 5 using the operators $L_i^\pm(\tau)$ and $J_i(\tau)$, $i = 1, \ldots, 5$, as follows.

Corollary 3. Let the operators $L_i^\pm(\tau)$ and $J_i(\tau)$ be defined by (121)–(125) and (126)–(130); then, for all $f \in c^{(0)}(\Sigma)$, the following limit relations are valid:
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| L_i^\pm(\tau) f \big\|_{c^{(0)}(\Sigma)} = 0, \quad i = 1, 3, 5, \tag{135} \]
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| L_i^\pm(\tau) f \big\|_{C^{(0)}(\Sigma)} = 0, \quad i = 2, 4, \tag{136} \]
and
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| J_i(\tau) f \big\|_{c^{(0)}(\Sigma)} = 0, \quad i = 1, 3, 5, \tag{137} \]
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| J_i(\tau) f \big\|_{C^{(0)}(\Sigma)} = 0, \quad i = 2, 4. \tag{138} \]
As we have stated before, forming the adjoints of the operators $L_i^\pm(\tau)$ and $J_i(\tau)$, $i = 1, \ldots, 5$, is essentially based on transposing the corresponding tensor kernels and interchanging variables for the cases $i = 1, 3, 5$, and on interchanging variables in the vector kernels of the operators for the cases $i = 2, 4$. These procedures do not change the nature of the operators, i.e., properties like continuity or weak singularity of the operator kernels stay unchanged. Thus, the proofs of Theorems 4 and 5 can analogously be applied to the adjoint operators. For a derivation of the limit and jump relations of the adjoint operators in the case of the scalar Laplace equation, which is very similar to the case of the Stokes equations, the reader is referred to Freeden (1980) and Kersten (1980). For the case of the Cauchy-Navier equations of linear elasticity, the relations of the adjoint operators are shown in Abeyratne (2003) and Abeyratne et al. (2003).

Corollary 4. Let the operators $L_i^\pm(\tau)^*$ and $J_i(\tau)^*$ be the adjoint operators of the operators $L_i^\pm(\tau)$ and $J_i(\tau)$ defined in (121)–(125) and (126)–(130) with respect to the $(\cdot,\cdot)_{l^2(\Sigma)}$-inner product; then, for all $f \in c^{(0)}(\Sigma)$, the following limit and jump relations are valid:
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| L_i^\pm(\tau)^* f \big\|_{c^{(0)}(\Sigma)} = 0, \quad i = 1, 3, 5, \tag{139} \]
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| L_i^\pm(\tau)^* f \big\|_{C^{(0)}(\Sigma)} = 0, \quad i = 2, 4, \tag{140} \]
and
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| J_i(\tau)^* f \big\|_{c^{(0)}(\Sigma)} = 0, \quad i = 1, 3, 5, \tag{141} \]
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| J_i(\tau)^* f \big\|_{C^{(0)}(\Sigma)} = 0, \quad i = 2, 4. \tag{142} \]
Formulation in $(l^2(\Sigma), \|\cdot\|_{l^2(\Sigma)})$: Using the norm estimate
\[ \| f \|_{l^2(\Sigma)} \le \sqrt{|\Sigma|}\; \| f \|_{c^{(0)}(\Sigma)}, \quad f \in c^{(0)}(\Sigma), \tag{143} \]
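The norm estimate that carries uniform convergence over to convergence in the mean-square sense can be checked on a simple example. The sketch below assumes, for illustration only, that the surface is the unit sphere and that integrals are approximated by a midpoint rule; the sample vector field and the helper name `norms_on_unit_sphere` are hypothetical.

```python
import math

def norms_on_unit_sphere(f, n_t=100, n_phi=100):
    """Return (L2 norm, sup norm, surface area) of a vector field f on the unit sphere.

    Midpoint rule in t = cos(theta) and phi; the surface element is dt * dphi.
    """
    dt, dphi = 2.0 / n_t, 2.0 * math.pi / n_phi
    l2_sq, sup, area = 0.0, 0.0, 0.0
    for i in range(n_t):
        t = -1.0 + (i + 0.5) * dt
        r = math.sqrt(1.0 - t * t)
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            v = f((r * math.cos(phi), r * math.sin(phi), t))
            mag_sq = sum(c * c for c in v)
            l2_sq += mag_sq * dt * dphi
            sup = max(sup, math.sqrt(mag_sq))
            area += dt * dphi
    return math.sqrt(l2_sq), sup, area

def field(y):
    """Arbitrary continuous sample vector field on the sphere."""
    return (y[1], -y[0], y[2] ** 2)

l2, sup, area = norms_on_unit_sphere(field)
print("L2 norm            :", l2)
print("sqrt(area) * sup   :", math.sqrt(area) * sup)
```

Since the squared integrand is bounded pointwise by the squared sup norm, the first printed value can never exceed the second. This is why every limit relation that holds uniformly also holds in the $l^2(\Sigma)$ norm.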
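where $|\Sigma|$ is the surface area of the regular surface $\Sigma$, we immediately obtain from Corollary 3 the following result.

Corollary 5. Let the operators $L_i^\pm(\tau)$ and $J_i(\tau)$ be defined by (121)–(125) and (126)–(130), and let $L_i^\pm(\tau)^*$ and $J_i(\tau)^*$ be the corresponding adjoint operators with respect to the $(\cdot,\cdot)_{l^2(\Sigma)}$-inner product; then, for all $f \in c^{(0)}(\Sigma)$, the following limit and jump relations are valid:
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| L_i^\pm(\tau) f \big\|_{l^2(\Sigma)} = 0, \quad \lim_{\tau \to 0,\, \tau > 0} \big\| L_i^\pm(\tau)^* f \big\|_{l^2(\Sigma)} = 0, \quad i = 1, 3, 5, \tag{144} \]
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| L_i^\pm(\tau) f \big\|_{L^2(\Sigma)} = 0, \quad \lim_{\tau \to 0,\, \tau > 0} \big\| L_i^\pm(\tau)^* f \big\|_{L^2(\Sigma)} = 0, \quad i = 2, 4, \tag{145} \]
and
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| J_i(\tau) f \big\|_{l^2(\Sigma)} = 0, \quad \lim_{\tau \to 0,\, \tau > 0} \big\| J_i(\tau)^* f \big\|_{l^2(\Sigma)} = 0, \quad i = 1, 3, 5, \tag{146} \]
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| J_i(\tau) f \big\|_{L^2(\Sigma)} = 0, \quad \lim_{\tau \to 0,\, \tau > 0} \big\| J_i(\tau)^* f \big\|_{L^2(\Sigma)} = 0, \quad i = 2, 4. \tag{147} \]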
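Finally, the limit and jump relations can be generalized to the Hilbert space $l^2(\Sigma)$ in the following way.

Theorem 7. Let the operators $L_i^\pm(\tau)$ and $J_i(\tau)$ be defined by (121)–(125) and (126)–(130), and let $L_i^\pm(\tau)^*$ and $J_i(\tau)^*$ be the corresponding adjoint operators with respect to the $(\cdot,\cdot)_{l^2(\Sigma)}$-inner product; then, for all $f \in l^2(\Sigma)$, the following limit and jump relations are valid:
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| L_i^\pm(\tau) f \big\|_{l^2(\Sigma)} = 0, \quad \lim_{\tau \to 0,\, \tau > 0} \big\| L_i^\pm(\tau)^* f \big\|_{l^2(\Sigma)} = 0, \quad i = 1, 3, 5, \tag{148} \]
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| L_i^\pm(\tau) f \big\|_{L^2(\Sigma)} = 0, \quad \lim_{\tau \to 0,\, \tau > 0} \big\| L_i^\pm(\tau)^* f \big\|_{L^2(\Sigma)} = 0, \quad i = 2, 4, \tag{149} \]
and
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| J_i(\tau) f \big\|_{l^2(\Sigma)} = 0, \quad \lim_{\tau \to 0,\, \tau > 0} \big\| J_i(\tau)^* f \big\|_{l^2(\Sigma)} = 0, \quad i = 1, 3, 5, \tag{150} \]
\[ \lim_{\tau \to 0,\, \tau > 0} \big\| J_i(\tau) f \big\|_{L^2(\Sigma)} = 0, \quad \lim_{\tau \to 0,\, \tau > 0} \big\| J_i(\tau)^* f \big\|_{L^2(\Sigma)} = 0, \quad i = 2, 4. \tag{151} \]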
Proof. The concept follows in parallel to arguments given in Freeden (1980) (see also Kersten 1980; Freeden and Mayer 2003; Freeden and Gerhards 2013).
3.4 Existence and Uniqueness of the Stokes Problems
We start with the formulations of the Stokes boundary-value problems. The inner Stokes problem can easily be formulated, while for the well-posedness of the outer Stokes problem, we have to prescribe a certain decay of the flow field $u$ and the pressure $P$ at infinity, which is similar to the regularity at infinity in the theory of the Laplace equation (see Günter 1957; Freeden 1980; Freeden and Mayer 2003) and the Müller radiation condition in the theory of electromagnetic scattering (see Müller 1969; Colton and Kress 1992; Freeden and Mayer 2007). Let $\Sigma$ be a regular surface. Assume that $D$ is either the inner space $\Sigma_{\mathrm{int}}$ or the outer space $\Sigma_{\mathrm{ext}}$. A pair $(u, P) \in \big( c^{(2)}(D) \cap c^{(0)}(\overline{D}) \big) \times \big( C^{(1)}(D) \cap C^{(0)}(\overline{D}) \big)$ is said to satisfy the Stokes system of equations if
\[ \Delta_x u(x) = \nabla_x P(x), \quad x \in D, \tag{152} \]
\[ \nabla_x \cdot u(x) = 0, \quad x \in D. \tag{153} \]
Furthermore, a pair $(u, P) \in c^{(0)}(\Sigma_{\mathrm{ext}}) \times C^{(0)}(\Sigma_{\mathrm{ext}})$ is called regular at infinity if
\[ |u(x)| = O\Big( \frac{1}{|x|} \Big), \quad |\nabla_x \otimes u(x)| = O\Big( \frac{1}{|x|^2} \Big), \quad |x| \to \infty, \tag{154} \]
and
\[ |P(x)| = O\Big( \frac{1}{|x|^2} \Big), \quad |x| \to \infty. \tag{155} \]
Before we formulate the Stokes problems corresponding to different types of boundary values in more detail, we give a reformulation of Theorem 3.

Corollary 6. (1) Let $(u, P) \in c^{(2)}(\Sigma_{\mathrm{int}}) \times C^{(1)}(\Sigma_{\mathrm{int}})$ be a solution of the interior Stokes system of equations. Then $u$ can be represented in the form
\[ u(x) = w(u|_\Sigma, x) - v\big( (\nu \cdot \boldsymbol{\sigma}[u])|_\Sigma, x \big), \quad x \in \Sigma_{\mathrm{int}}. \tag{156} \]
(2) Let $(u, P) \in c^{(2)}(\Sigma_{\mathrm{ext}}) \times C^{(1)}(\Sigma_{\mathrm{ext}})$ be a solution of the exterior Stokes system of equations and let $(u, P)$ be regular at infinity. Then $u$ can be represented in the form
\[ u(x) = w(u|_\Sigma, x) - v\big( (\nu \cdot \boldsymbol{\sigma}[u])|_\Sigma, x \big), \quad x \in \Sigma_{\mathrm{ext}}. \tag{157} \]
Proof. As stated before, the first part of this corollary is a reformulation of Theorem 3 using the representations of the double-layer potential (61) and the single-layer potential (59).
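To show the second part of the corollary, let $(u, P)$ be a solution of the exterior Stokes problem. Furthermore, take $x \in \Sigma_{\mathrm{ext}}$ to be arbitrary but fixed and let, for $\varepsilon > 0$, $B_\varepsilon(x)$ be the ball with radius $\varepsilon$ around $x$ such that $B_\varepsilon(x) \subset \Sigma_{\mathrm{ext}}$. Furthermore, let $R > 0$ be such that $x \in B_R(0)$ and $\Sigma \subset B_R(0)$. We know that the pair $(u(x, y)\varepsilon^{(k)}, q(x, y) \cdot \varepsilon^{(k)})$ is a solution of the Stokes system of equations in the area $B_R(0) \setminus \{ \Sigma_{\mathrm{int}} \cup B_\varepsilon(x) \}$. Applying the second Green's identity to $(u, P)$ and $(u(x, y)\varepsilon^{(k)}, q(x, y) \cdot \varepsilon^{(k)})$ on the area $B_R(0) \setminus \{ \Sigma_{\mathrm{int}} \cup B_\varepsilon(x) \}$, we get a vanishing left-hand side, since both pairs are solutions of the Stokes equations. Furthermore, we get three boundary integrals on the right-hand side, i.e., we have
\begin{align}
&\int_\Sigma \Big( \nu(y) \cdot \boldsymbol{\sigma}\big[u(x,\cdot)\varepsilon^{(k)}, q(x,\cdot) \cdot \varepsilon^{(k)}\big](y)\, u(y) - \nu(y) \cdot \boldsymbol{\sigma}[u, P](y)\, u(x, y)\varepsilon^{(k)} \Big)\, d\omega(y) \notag \\
&\quad = \int_{\partial B_\varepsilon(x)} \Big( \nu_{\partial B_\varepsilon(x)}(y) \cdot \boldsymbol{\sigma}[u, P](y)\, u(x, y)\varepsilon^{(k)} - \nu_{\partial B_\varepsilon(x)}(y) \cdot \boldsymbol{\sigma}\big[u(x,\cdot)\varepsilon^{(k)}, q(x,\cdot) \cdot \varepsilon^{(k)}\big](y)\, u(y) \Big)\, d\omega(y) \notag \\
&\qquad + \int_{\partial B_R(0)} \Big( \nu_{\partial B_R(0)}(y) \cdot \boldsymbol{\sigma}[u, P](y)\, u(x, y)\varepsilon^{(k)} - \nu_{\partial B_R(0)}(y) \cdot \boldsymbol{\sigma}\big[u(x,\cdot)\varepsilon^{(k)}, q(x,\cdot) \cdot \varepsilon^{(k)}\big](y)\, u(y) \Big)\, d\omega(y). \tag{158}
\end{align}
Using (61) and (59), the first boundary integral in (158) over $\Sigma$ can be written as
\begin{align}
&\int_\Sigma \Big( \nu(y) \cdot \boldsymbol{\sigma}\big[u(x,\cdot)\varepsilon^{(k)}, q(x,\cdot) \cdot \varepsilon^{(k)}\big](y)\, u(y) - \nu(y) \cdot \boldsymbol{\sigma}[u, P](y)\, u(x, y)\varepsilon^{(k)} \Big)\, d\omega(y) \notag \\
&\quad = \big( w(u|_\Sigma, x) \big)_k - \big( v( (\nu \cdot \boldsymbol{\sigma}[u])|_\Sigma, x ) \big)_k \tag{159}
\end{align}
for $x \in \Sigma_{\mathrm{ext}}$ and $k \in \{1, 2, 3\}$, i.e., it is exactly the $k$th component of the term we have in the assertion. The second boundary integral over $\partial B_\varepsilon(x)$ can be treated exactly as in the proof of Theorem 3. Thus, we have
\[ \lim_{\varepsilon \to 0} \int_{\partial B_\varepsilon(x)} \Big( \nu_{\partial B_\varepsilon(x)}(y) \cdot \boldsymbol{\sigma}[u, P](y)\, u(x, y)\varepsilon^{(k)} - \nu_{\partial B_\varepsilon(x)}(y) \cdot \boldsymbol{\sigma}\big[u(x,\cdot)\varepsilon^{(k)}, q(x,\cdot) \cdot \varepsilon^{(k)}\big](y)\, u(y) \Big)\, d\omega(y) = u_k(x). \tag{160} \]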
For the third term in (158), the boundary integral over @BR .0/, we can conclude using the regularity of u at infinity that
1186
Z
C. Mayer and W. Freeden
.y/ Œu; P .y/ u.x; y/".k/ Œu.x; /".k/ ; q.x; / ".k/ .y/u.y/ d!.y/
@BR .0/
DO
1 ; R
(161)
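for $R \to \infty$. Finally, combining (158) with (159), (160), and (161) gives the desired second part of the corollary.

Using the nomenclature of regularity at infinity given above and the Stokes equations (39) and (40), we formulate the following problems.

Interior Stokes Problem of the First Kind. Given $f \in c^{(0)}(\Sigma)$, find a pair $(u, P)$ of class $\big(c^{(2)}(\Sigma_{\mathrm{int}}) \cap c^{(0)}(\overline{\Sigma_{\mathrm{int}}})\big) \times \big(C^{(1)}(\Sigma_{\mathrm{int}}) \cap C^{(0)}(\overline{\Sigma_{\mathrm{int}}})\big)$, which satisfies (39) and (40) with $D = \Sigma_{\mathrm{int}}$ and the boundary condition

$$u|_\Sigma = f \quad \text{on } \Sigma.$$

Interior Stokes Problem of the Second Kind. Given $g \in c^{(0)}(\Sigma)$, find a pair $(u, P)$ of class $\big(c^{(2)}(\Sigma_{\mathrm{int}}) \cap c^{(1)}(\overline{\Sigma_{\mathrm{int}}})\big) \times \big(C^{(1)}(\Sigma_{\mathrm{int}}) \cap C^{(0)}(\overline{\Sigma_{\mathrm{int}}})\big)$, which satisfies (39) and (40) with $D = \Sigma_{\mathrm{int}}$ and the boundary condition

$$(\sigma[u]\nu)|_\Sigma = g \quad \text{on } \Sigma. \tag{162}$$

Exterior Stokes Problem of the First Kind. Given $f \in c^{(0)}(\Sigma)$, find a pair $(u, P)$ of class $\big(c^{(2)}(\Sigma_{\mathrm{ext}}) \cap c^{(0)}(\overline{\Sigma_{\mathrm{ext}}})\big) \times \big(C^{(1)}(\Sigma_{\mathrm{ext}}) \cap C^{(0)}(\overline{\Sigma_{\mathrm{ext}}})\big)$, which is regular at infinity, satisfying (39) and (40) with $D = \Sigma_{\mathrm{ext}}$ and the boundary condition

$$u|_\Sigma = f \quad \text{on } \Sigma. \tag{163}$$

Exterior Stokes Problem of the Second Kind. Given $g \in c^{(0)}(\Sigma)$, find a pair $(u, P)$ of class $\big(c^{(2)}(\Sigma_{\mathrm{ext}}) \cap c^{(1)}(\overline{\Sigma_{\mathrm{ext}}})\big) \times \big(C^{(1)}(\Sigma_{\mathrm{ext}}) \cap C^{(0)}(\overline{\Sigma_{\mathrm{ext}}})\big)$, which is regular at infinity, satisfying (39) and (40) with $D = \Sigma_{\mathrm{ext}}$ and the boundary condition

$$(\sigma[u]\nu)|_\Sigma = g \quad \text{on } \Sigma. \tag{164}$$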
These four problems are discussed in more detail. At first, we have a look at the uniqueness of a solution of the problems if a solution exists, and after that, we discuss the existence of a solution when certain boundary data are given. Uniqueness of Solutions of the Stokes Problems: As a first result, using Green’s first identity (43), we can easily prove the following statement. Theorem 8. For boundary values f 2 c.0/ .†/, the interior Stokes problem of the first kind has at most one solution.
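Proof. Let $(u_1, P_1), (u_2, P_2) \in \big(c^{(2)}(\Sigma_{\mathrm{int}}) \cap c^{(0)}(\overline{\Sigma_{\mathrm{int}}})\big) \times \big(C^{(1)}(\Sigma_{\mathrm{int}}) \cap C^{(0)}(\overline{\Sigma_{\mathrm{int}}})\big)$ be two solutions of the interior Stokes problem of the first kind and let $u = u_1 - u_2$ and $P = P_1 - P_2$. Then $(u, P)$ solves the interior Stokes problem of the first kind with homogeneous boundary data. Using (46), we get

$$\begin{aligned}
\int_{\Sigma_{\mathrm{int}}} \frac{\mu}{2}\,\big|\nabla_x \otimes u(x) + (\nabla_x \otimes u(x))^T\big|^2\,dx
&= \int_{\Sigma_{\mathrm{int}}} \frac{\mu}{2}\,\big(\nabla_x \otimes u(x) + (\nabla_x \otimes u(x))^T\big) : \big(\nabla_x \otimes u(x) + (\nabla_x \otimes u(x))^T\big)\,dx \\
&= \int_\Sigma \nu(x)\cdot\sigma[u,P](x)\,u(x)\,d\omega(x) = 0.
\end{aligned} \tag{165}$$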
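But this can only be true if

$$\nabla_x \otimes u(x) + (\nabla_x \otimes u(x))^T = 0 \quad \text{for all } x \in \Sigma_{\mathrm{int}},$$

i.e., if $\nabla \otimes u$ is skew symmetric. Having a look especially at the diagonal elements, we obtain

$$(\nabla_x u_i(x))^2 = 0 \quad \text{for all } x \in \Sigma_{\mathrm{int}},\ i = 1, 2, 3,$$

which can only be true if $u$ is constant in $\Sigma_{\mathrm{int}}$. Since $u \in c^{(0)}(\overline{\Sigma_{\mathrm{int}}})$ and $u|_\Sigma = 0$, this implies $u = 0$ in $\Sigma_{\mathrm{int}}$. From the Stokes system of equations (39) and (40), it follows that also $P = 0$ in $\Sigma_{\mathrm{int}}$, which gives $u_1 = u_2$ and $P_1 = P_2$ in $\Sigma_{\mathrm{int}}$.

Similar to the above result, we can show the following statement.

Theorem 9. For boundary values $f \in c^{(0)}(\Sigma)$, the exterior Stokes problems of the first and of the second kind each have at most one solution.

Proof. Let, as before, $(u, P) \in \big(c^{(2)}(\Sigma_{\mathrm{ext}}) \cap c^{(0)}(\overline{\Sigma_{\mathrm{ext}}})\big) \times \big(C^{(1)}(\Sigma_{\mathrm{ext}}) \cap C^{(0)}(\overline{\Sigma_{\mathrm{ext}}})\big)$ be a solution of the exterior Stokes problem of the first kind with homogeneous boundary data. Let $\Omega_R$ be a sphere with radius $R > 0$ around the origin such that $\Sigma \subset B_R(0)$. Using (46) for the region $B_R(0) \setminus \Sigma_{\mathrm{int}}$, we obtain

$$\begin{aligned}
\int_{B_R(0) \setminus \Sigma_{\mathrm{int}}} \frac{\mu}{2}\,\big|\nabla_x \otimes u(x) + (\nabla_x \otimes u(x))^T\big|^2\,dx
&= \int_{B_R(0) \setminus \Sigma_{\mathrm{int}}} \frac{\mu}{2}\,\big(\nabla_x \otimes u(x) + (\nabla_x \otimes u(x))^T\big) : \big(\nabla_x \otimes u(x) + (\nabla_x \otimes u(x))^T\big)\,dx \\
&= \int_\Sigma \nu_\Sigma(x)\cdot\sigma[u,P](x)\,u(x)\,d\omega(x) + \int_{\Omega_R} \nu_{\Omega_R}(x)\cdot\sigma[u,P](x)\,u(x)\,d\omega(x),
\end{aligned} \tag{166}$$

where $\nu_{\Omega_R}$ is the unit normal field on $\Omega_R$ pointing into the outer space. Since $u$ is assumed to be regular at infinity, the second integral in (166) can, for $R$ being large enough, be estimated as follows:

$$\left|\int_{\Omega_R} \nu_{\Omega_R}(x)\cdot\sigma[u,P](x)\,u(x)\,d\omega(x)\right| \le \sup_{x \in \Omega_R} |\sigma[u,P](x)|\ \sup_{x \in \Omega_R} |u(x)|\ |\Omega_R| \le \frac{C}{R}. \tag{167}$$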
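The first integral in (166) vanishes anyway, since $u$ vanishes on $\Sigma$. Thus, we obtain in the limit $R \to \infty$, observing that $B_R(0) \setminus \Sigma_{\mathrm{int}} \to \Sigma_{\mathrm{ext}}$ for $R \to \infty$,

$$\int_{\Sigma_{\mathrm{ext}}} \frac{\mu}{2}\,\big|\nabla_x \otimes u(x) + (\nabla_x \otimes u(x))^T\big|^2\,dx = 0. \tag{168}$$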
But then the same arguments as in the proof of Theorem 8 show that $u$ is constant in $\Sigma_{\mathrm{ext}}$. Since $u$ is regular at infinity, this can only be true if $u = 0$ in $\Sigma_{\mathrm{ext}}$. From the Stokes system of equations (39) and (40), it follows that also $P = 0$ in $\Sigma_{\mathrm{ext}}$, which is the desired result. For the exterior Stokes problem of the second kind, the same line of arguments can be used. It should be observed that in (166) not $u$ but $\sigma[u,P]\nu$ vanishes on $\Sigma$, since $(u, P)$ is a solution of a problem of the second kind with homogeneous boundary data.

In Theorems 8 and 9, we have shown uniqueness for the different cases of boundary values for the Stokes problem except for the interior Stokes problem of the second kind. In this case, the following statement is true.

Theorem 10. The interior Stokes problem of the second kind with homogeneous boundary data has 6 linearly independent nontrivial solutions.

Proof. As in the proof of Theorem 8, the same line of arguments can be used. If $(u, P) \in \big(c^{(2)}(\Sigma_{\mathrm{int}}) \cap c^{(1)}(\overline{\Sigma_{\mathrm{int}}})\big) \times \big(C^{(1)}(\Sigma_{\mathrm{int}}) \cap C^{(0)}(\overline{\Sigma_{\mathrm{int}}})\big)$ is a solution of the interior Stokes problem of the second kind with homogeneous boundary data, we can conclude that

$$\nabla_x \otimes u(x) + (\nabla_x \otimes u(x))^T = 0 \quad \text{for all } x \in \Sigma_{\mathrm{int}}. \tag{169}$$
This, together with the homogeneous boundary condition $(\sigma[u,P]\nu)|_\Sigma = 0$, does not imply that $u$ vanishes. Equation (169) gives six conditions for the flow field $u$ in $\Sigma_{\mathrm{int}}$. The system of equations has at most six linearly independent solutions. They are explicitly given by

$$\varphi^{(k)}(x) = \varepsilon^{(k)}, \qquad x \in \Sigma_{\mathrm{int}}, \quad k = 1, 2, 3, \tag{170}$$

$$\varphi^{(k)}(x) = x \wedge \varepsilon^{(k-3)}, \qquad x \in \Sigma_{\mathrm{int}}, \quad k = 4, 5, 6. \tag{171}$$
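As a numerical sanity check of (169) for the fields (170) and (171), the following Python sketch evaluates the symmetrized gradient $\nabla_x \otimes u + (\nabla_x \otimes u)^T$ by central differences. It is only an illustration: the helper names `rigid_body_motion` and `symmetric_gradient`, the test point, and the step size are assumptions made here, and for $k = 4, 5, 6$ the rotation is taken about the axis $\varepsilon^{(k-3)}$.

```python
def rigid_body_motion(k, x):
    """Return phi^(k)(x): a unit translation for k = 1, 2, 3 and the
    rotation field x ^ e_(k-3) for k = 4, 5, 6 (assumed indexing)."""
    e = [0.0, 0.0, 0.0]
    e[(k - 1) % 3] = 1.0
    if k <= 3:
        return e
    return [x[1] * e[2] - x[2] * e[1],
            x[2] * e[0] - x[0] * e[2],
            x[0] * e[1] - x[1] * e[0]]


def symmetric_gradient(field, x, h=1e-4):
    """Central-difference approximation of grad u + (grad u)^T at x."""
    grad = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        up, um = field(xp), field(xm)
        for i in range(3):
            grad[i][j] = (up[i] - um[i]) / (2.0 * h)
    return [[grad[i][j] + grad[j][i] for j in range(3)] for i in range(3)]


def max_abs(matrix):
    """Largest absolute entry of a 3 x 3 matrix."""
    return max(abs(v) for row in matrix for v in row)


x0 = [0.3, -1.2, 0.7]  # arbitrary test point (assumption)
residuals = [
    max_abs(symmetric_gradient(lambda x, k=k: rigid_body_motion(k, x), x0))
    for k in range(1, 7)
]
# A pure stretching field is not a rigid body motion and violates (169).
stretch_residual = max_abs(symmetric_gradient(lambda x: [x[0], 0.0, 0.0], x0))
```

All six entries of `residuals` should be zero up to rounding, whereas `stretch_residual` is of order one, which illustrates that (169) singles out translations and rotations.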
It is easy to verify that these functions are linearly independent. The corresponding pressure is simply taken to be constant.

Theorem 10 implies that every solution $u$ of the interior Stokes problem of the second kind can be represented as

$$u(x) = u_P(x) + \sum_{k=1}^{6} C_k\, \varphi^{(k)}(x), \qquad x \in \Sigma_{\mathrm{int}}, \tag{172}$$
where $u_P$ is a particular solution of the interior Stokes problem of the second kind and $\varphi^{(k)}$ are the nontrivial solutions of the homogeneous problem given in (170) and (171). Thus, it is clear that the interior Stokes problem of the second kind is not uniquely solvable. We later derive conditions for the boundary data $g \in c^{(1)}(\Sigma)$ such that we can guarantee uniqueness of the particular solution $u_P$.

Existence of Solutions of the Stokes Problems: Next, we turn our attention to existence results for the different kinds of Stokes problems. As usual, we use the boundary integral equation approach in connection with the theorem of Fredholm to show the existence of a solution for given boundary values.

Interior Problem of the First Kind: The easiest condition on the boundary data $f$ for the interior Stokes problem of the first kind to have a solution is a direct consequence of (40).

Theorem 11. A necessary condition on the boundary data $f \in c^{(0)}(\Sigma)$ for the interior Stokes problem of the first kind to be solvable is

$$\int_\Sigma f(x)\cdot\nu(x)\,d\omega(x) = 0, \tag{173}$$
which is called a no-flux condition.

Proof. Let the interior Stokes problem of the first kind with boundary data $f \in c^{(0)}(\Sigma)$ be solvable. Then the solution $u$ fulfills Eq. (40). Thus, we have by Gauss' theorem

$$0 = \int_{\Sigma_{\mathrm{int}}} \nabla_x \cdot u(x)\,dx = \int_\Sigma u(x)\cdot\nu(x)\,d\omega(x) = \int_\Sigma f(x)\cdot\nu(x)\,d\omega(x), \tag{174}$$
which is the desired result.

Now, let us have a look at the existence of a solution of the interior Stokes problem of the first kind for given boundary data $f \in c^{(0)}(\Sigma)$. For the first interior problem, we seek a solution in the form of a double-layer potential

$$w(\varphi, x) = \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y), \qquad x \in \Sigma_{\mathrm{int}}. \tag{175}$$
To obtain the unknown layer function $\varphi \in c^{(0)}(\Sigma)$, we take the limit $x \to \Sigma$, $x \in \Sigma_{\mathrm{int}}$, and get, using the limit relation for the double-layer potential and the boundary condition for the first interior problem,

$$\begin{aligned}
f(x) = \lim_{\tau \to 0,\ \tau > 0} w(\varphi, x - \tau\nu(x)) &= \frac{1}{2}\varphi(x) + \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y) \\
&= \frac{1}{2}\varphi(x) + (W(0,0)\varphi)(x), \qquad x \in \Sigma.
\end{aligned} \tag{176}$$
Since the operator $W(0,0) : c^{(0)}(\Sigma) \to c^{(0)}(\Sigma)$ is compact, Eq. (176) is an integral equation of the second kind. By the theorem of Fredholm, it has a solution $\varphi \in c^{(0)}(\Sigma)$ if and only if the right-hand side $f$ is orthogonal to all nontrivial solutions $\psi \in c^{(0)}(\Sigma)$ of the corresponding homogeneous adjoint integral equation of (176) given by

$$\frac{1}{2}\psi(x) - (V'(0,0)\psi)(x) = 0 \quad\Longleftrightarrow\quad \frac{1}{2}\psi(x) + \int_\Sigma (k(y,x))^T \psi(y)\,d\omega(y) = 0, \qquad x \in \Sigma. \tag{177}$$

For this boundary integral equation, we can formulate the following result.

Lemma 2. The vector field $\psi = \nu$ is the only linearly independent nontrivial solution of the integral equation (177).

Proof. First, we show that the outer normal field $\nu$ on $\Sigma$ is a solution of the integral equation (177). Inserting $\psi = \nu$ into (177), we obtain

$$\begin{aligned}
\frac{1}{2}\nu(x) + \int_\Sigma (k(y,x))^T \nu(y)\,d\omega(y)
&= \frac{1}{2}\nu(x) + \frac{3}{4\pi}\int_\Sigma \frac{(y-x)\otimes(y-x)\,\big((y-x)\cdot\nu(x)\big)}{|y-x|^5}\,\nu(y)\,d\omega(y) \\
&= \frac{1}{2}\nu(x) + \left(\int_\Sigma k(x,y)\,d\omega(y)\right)\nu(x), \qquad x \in \Sigma.
\end{aligned} \tag{178}$$

From (81), we get

$$\int_\Sigma k(x,y)\,d\omega(y) = -\frac{1}{2}\,\mathbf{i}, \qquad x \in \Sigma, \tag{179}$$

such that we have

$$\frac{1}{2}\nu(x) + \left(\int_\Sigma k(x,y)\,d\omega(y)\right)\nu(x) = \frac{1}{2}\nu(x) - \frac{1}{2}\,\mathbf{i}\,\nu(x) = 0, \qquad x \in \Sigma, \tag{180}$$
which shows that $\psi = \nu$ is a solution of (177). To verify that there are no other linearly independent solutions of (177), we suppose that $\psi$ is a solution. Using $\psi$ as layer function, we construct the single-layer potentials $v(\psi, x)$ and $P(\psi, x)$ as defined in (59) and (60) and get by (46)

$$\int_{\Sigma_{\mathrm{ext}}} \frac{\mu}{2}\,\big|\nabla_x \otimes v(\psi, x) + (\nabla_x \otimes v(\psi, x))^T\big|^2\,dx = \int_\Sigma \nu(x)\cdot\sigma[v(\psi,\cdot), P(\psi,\cdot)](x)\,v(\psi, x)\,d\omega(x). \tag{181}$$

The right-hand side of this equation vanishes since we can conclude by a combination of Theorem 5 and (177) that $\sigma[v(\psi,\cdot), P(\psi,\cdot)](x)\,\nu(x) = 0$ for all $x \in \Sigma$. Therefore, we get

$$\big|\nabla_x \otimes v(\psi, x) + (\nabla_x \otimes v(\psi, x))^T\big|^2 = 0, \qquad x \in \Sigma_{\mathrm{ext}}, \tag{182}$$
and, since $v(\psi,\cdot)$ vanishes at infinity, we have $v(\psi,\cdot) = 0$ in $\Sigma_{\mathrm{ext}}$. But now $v(\psi,\cdot)$ vanishes on $\Sigma$, it is continuous in $\overline{\Sigma_{\mathrm{int}}}$, and it fulfills the Stokes equations in $\Sigma_{\mathrm{int}}$. By the uniqueness of the interior Stokes problem of the first kind, we can conclude that $v(\psi,\cdot) = 0$ in $\Sigma_{\mathrm{int}}$. Furthermore, the pressure $P(\psi,\cdot)$ corresponding to $v(\psi,\cdot)$ is constant in $\Sigma_{\mathrm{int}}$ and $\Sigma_{\mathrm{ext}}$ because of the Stokes equations. Since it has to be regular at infinity in $\Sigma_{\mathrm{ext}}$, we can conclude that $P(\psi,\cdot) = 0$ in $\Sigma_{\mathrm{ext}}$. For the stress tensor of the flow $v(\psi,\cdot)$ in $\Sigma_{\mathrm{int}}$, we obtain

$$\sigma[v(\psi,\cdot), P(\psi,\cdot)](x) = -\mathbf{i}\,P(\psi, x) \tag{183}$$

for $x \in \Sigma_{\mathrm{int}}$. For the solution $\psi = \nu$ of (177), we get for the stress tensor

$$\sigma[v(\nu,\cdot), P = \mathrm{const}](x) = \frac{3}{4\pi}\int_\Sigma \frac{(x-y)\otimes(x-y)}{|x-y|^5}\,\big((x-y)\cdot\nu(y)\big)\,d\omega(y), \qquad x \in \Sigma_{\mathrm{int}}. \tag{184}$$

Using the first formula of (81), the integral on the right-hand side can be calculated as

$$\sigma[v(\nu,\cdot)](x) = \mathbf{i}, \qquad x \in \Sigma_{\mathrm{int}}. \tag{185}$$

Now, consider a density $\tilde\psi = \psi + \nu\,P(\psi,\cdot)$ and the corresponding single-layer potential $v(\tilde\psi,\cdot)$. For the stress tensor of $v(\tilde\psi,\cdot)$, we get by combining (183) and (185) the equation

$$\sigma[v(\tilde\psi,\cdot)](x) = 0, \qquad x \in \mathbb{R}^3 \setminus \Sigma.$$

By (93), we have $\tilde\psi = 0$ on $\Sigma$. This shows that $\psi$, which is assumed to be an arbitrary solution of (177), can be expressed linearly in terms of $\nu$. Thus, the field $\nu$ is the only linearly independent nontrivial solution of the integral equation (177).

By the theorem of Fredholm, we now know that in order to get a solution of the integral equation (176), the right-hand side $f$ has to be orthogonal to the linearly independent solutions of the integral equation (177), i.e., to the vector field $\nu$ on the regular surface $\Sigma$. But this is nothing else than the necessary solvability condition we have already formulated in Theorem 11.
According to the theorem of Fredholm, the homogeneous form of the integral equation (176) has exactly one linearly independent nontrivial solution $\varphi_{\mathrm{hom}}$. Every solution $\varphi$ of (176) can be expressed as $\varphi = \varphi_P + C\varphi_{\mathrm{hom}}$, where $\varphi_P$ is a particular solution of (176) and $C \in \mathbb{R}$ is a constant. Nevertheless, the velocity field obtained by a double-layer potential with the vector density $\varphi$ is unique, since a double-layer potential with density $\varphi_{\mathrm{hom}}$ is identically zero in $\Sigma_{\mathrm{int}}$ because it corresponds to an interior Stokes flow with vanishing boundary condition of the first kind on $\Sigma$. Thus, we can summarize our results as follows.

Lemma 3. For every $f \in c^{(0)}(\Sigma)$ which is orthogonal to the outer normal field $\nu$ on $\Sigma$, the interior Stokes problem of the first kind is solvable. Every solution $u$ of the problem can be written as a double-layer potential

$$u(x) = \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y), \qquad x \in \Sigma_{\mathrm{int}}, \tag{186}$$
where the layer function $\varphi$ fulfills the integral equation (176).

The condition on $f$ to be orthogonal to the normal field on $\Sigma$ cannot be discarded, as we have already seen in Theorem 11.

Interior Problem of the Second Kind: Next, we show the existence of a solution for the interior Stokes problem of the second kind. We seek a solution in the form of a single-layer potential

$$v(\psi, x) = \int_\Sigma u(x,y)\,\psi(y)\,d\omega(y), \qquad x \in \Sigma_{\mathrm{int}}. \tag{187}$$
To obtain the unknown layer function $\psi \in c^{(0)}(\Sigma)$, we take the limit $x \to \Sigma$, $x \in \Sigma_{\mathrm{int}}$, of the stress tensor of $v(\psi,\cdot)$ and get, using the limit relation for the stress tensor of the single-layer potential (Theorem 5) and the boundary condition for the interior problem of the second kind,

$$\begin{aligned}
g(x) = \lim_{\tau \to 0,\ \tau > 0} \sigma[v(\psi,\cdot)](x - \tau\nu(x))\,\nu(x) &= \frac{1}{2}\psi(x) + \sigma[v(\psi,\cdot)](x)\,\nu(x) \\
&= \frac{1}{2}\psi(x) + (P'(0,0)\psi)(x) \\
&= \frac{1}{2}\psi(x) - \int_\Sigma (k(y,x))^T \psi(y)\,d\omega(y), \qquad x \in \Sigma.
\end{aligned} \tag{188}$$

Since the operator $P'(0,0) : c^{(0)}(\Sigma) \to c^{(0)}(\Sigma)$ is compact, (188) is an integral equation of the second kind. By the well-known theorem of Fredholm (see, e.g., Heuser 1992), (188) has a solution $\psi \in c^{(0)}(\Sigma)$ if and only if the right-hand side $g$ is orthogonal to all nontrivial solutions $\varphi \in c^{(0)}(\Sigma)$ of the corresponding homogeneous adjoint integral equation of (188) given by

$$0 = \frac{1}{2}\varphi(x) - (W(0,0)\varphi)(x) \quad\Longleftrightarrow\quad 0 = \frac{1}{2}\varphi(x) - \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y), \qquad x \in \Sigma. \tag{189}$$
For the nontrivial solutions of (189), we can formulate the following result.

Lemma 4. (1) The restrictions of the six linearly independent rigid body motion vectors to the surface $\Sigma$, $\{\varphi^{(k)}|_\Sigma\}_{k=1,\ldots,6}$, given by

$$\varphi^{(k)}|_\Sigma(x) = \varepsilon^{(k)}, \qquad x \in \Sigma, \quad k = 1, 2, 3, \tag{190}$$

$$\varphi^{(k)}|_\Sigma(x) = x \wedge \varepsilon^{(k-3)}, \qquad x \in \Sigma, \quad k = 4, 5, 6, \tag{191}$$
are solutions of the homogeneous integral equation (189).

(2) The dimension of the solution space of the integral equation (189) is 6.

Proof. To realize the first part, let us take any of the vector fields $\varphi^{(k)}$, $k \in \{1, \ldots, 6\}$, which, together with $P^{(k)} = 0$, fulfill the Stokes equations in $\Sigma_{\mathrm{int}}$. Since $\sigma[\varphi^{(k)}] = 0$, it follows from Theorem 3 that the flow field $\varphi^{(k)}$ is given by

$$\varphi^{(k)}(x) = w(\varphi^{(k)}|_\Sigma, x), \qquad x \in \Sigma_{\mathrm{int}}. \tag{192}$$

In the limit $x \to \Sigma$ coming from $\Sigma_{\mathrm{int}}$ and using the jump relation for the double-layer potential, we obtain

$$\begin{aligned}
\varphi^{(k)}|_\Sigma(x) &= \frac{1}{2}\varphi^{(k)}|_\Sigma(x) + (W(0,0)\varphi^{(k)}|_\Sigma)(x) \\
&= \frac{1}{2}\varphi^{(k)}|_\Sigma(x) + \int_\Sigma k(x,y)\,\varphi^{(k)}|_\Sigma(y)\,d\omega(y), \qquad x \in \Sigma.
\end{aligned} \tag{193}$$
This shows that each $\varphi^{(k)}|_\Sigma$, $k = 1, \ldots, 6$, is a solution of (189).

To show that the solutions of the integral equation (189) form a vector space of dimension 6, we assume this were not the case. Then, because of the theorem of Fredholm, also the homogeneous form of the integral equation (188) would have more than 6 linearly independent solutions. We assume it to have 7 linearly independent solutions which we denote by $\psi^{(k)}$, $k = 1, \ldots, 7$. To each layer function $\psi^{(k)}$ on $\Sigma$, there corresponds a single-layer potential $v(\psi^{(k)},\cdot)$. According to the limit relation for the stress tensor of the single-layer potential and due to (188), we have

$$\begin{aligned}
\lim_{\tau \to 0,\ \tau > 0} \sigma[v(\psi^{(k)},\cdot)](x - \tau\nu(x))\,\nu(x) &= \frac{1}{2}\psi^{(k)}(x) + \sigma[v(\psi^{(k)},\cdot)](x)\,\nu(x) \\
&= \frac{1}{2}\psi^{(k)}(x) - \int_\Sigma (k(y,x))^T \psi^{(k)}(y)\,d\omega(y) = 0, \qquad x \in \Sigma.
\end{aligned} \tag{194}$$

But then we get because of (46) that $v(\psi^{(k)},\cdot)$ satisfies the identity

$$\int_{\Sigma_{\mathrm{int}}} \frac{\mu}{2}\,\big|\nabla_x \otimes v(\psi^{(k)}, x) + (\nabla_x \otimes v(\psi^{(k)}, x))^T\big|^2\,dx = 0 \tag{195}$$

and we are able to deduce that

$$\big|\nabla_x \otimes v(\psi^{(k)}, x) + (\nabla_x \otimes v(\psi^{(k)}, x))^T\big|^2 = 0, \qquad x \in \Sigma_{\mathrm{int}}. \tag{196}$$

Hence, no more than 6 of the fields $v(\psi^{(k)},\cdot)$ are linearly independent. These 6 linearly independent ones correspond to the 6 rigid body motions given by (170) and (171). Assume that $v(\psi^{(1)},\cdot), \ldots, v(\psi^{(6)},\cdot)$ are linearly independent. Then there exist constants $C_i \in \mathbb{R}$ such that

$$v(\psi^{(7)}, x) = \sum_{i=1}^{6} C_i\, v(\psi^{(i)}, x), \qquad x \in \Sigma_{\mathrm{int}}. \tag{197}$$
Now, define $\psi = \psi^{(7)} - \sum_{i=1}^{6} C_i\,\psi^{(i)}$; then $v(\psi,\cdot) = 0$ in $\Sigma_{\mathrm{int}}$. By the limit relation of the single-layer potential, we can conclude that $\psi = 0$ on $\Sigma$. Thus, $\psi^{(7)}$ can be written as a linear combination of $\psi^{(1)}, \ldots, \psi^{(6)}$. But this is a contradiction to the linear independence of the set $\{\psi^{(1)}, \ldots, \psi^{(7)}\}$, which finishes the proof.

Due to the theorem of Fredholm, the dimensions of the solution spaces of the homogeneous analogue of the integral equation (188) and of (189) are equal. In contrast to (189), the six linearly independent solutions of the homogeneous analogue of (188) are generally unknown. Yet another consequence of the theorem of Fredholm is that the inhomogeneous equation (188) has a solution if and only if the right-hand side $g$ is orthogonal to the vector fields $\{\varphi^{(k)}|_\Sigma\}_{k=1,\ldots,6}$. Furthermore, we can conclude that each solution $\psi$ of (188) can be written as

$$\psi(x) = \psi_P(x) + \sum_{k=1}^{6} C_k\,\psi^{(k)}(x), \qquad x \in \Sigma, \tag{198}$$
where $\psi_P$ is a particular solution of (188), $C_k \in \mathbb{R}$, $k = 1, \ldots, 6$, are constants, and $\psi^{(k)}$, $k = 1, \ldots, 6$, are the six linearly independent nontrivial solutions of the homogeneous analogue of (188). As stated before, the fields $\psi^{(k)}$ are generally unknown, but the corresponding Stokes flows in $\Sigma_{\mathrm{int}}$ are known. We have characterized them in Theorem 10; they are the rigid body motion vectors $\{\varphi^{(k)}\}_{k=1,\ldots,6}$. Summarizing our considerations, we can formulate the following result.

Lemma 5. For every $g \in c^{(0)}(\Sigma)$ which is orthogonal to the 6 linearly independent rigid body motions restricted to $\Sigma$, $\{\varphi^{(k)}|_\Sigma\}_{k=1,\ldots,6}$, the interior Stokes problem of the second kind is solvable. Every solution $u$ can be written as

$$u(x) = u_P(x) + \sum_{k=1}^{6} C_k\,\varphi^{(k)}(x), \qquad x \in \Sigma_{\mathrm{int}}, \tag{199}$$
where $C_k \in \mathbb{R}$, $k = 1, \ldots, 6$, are constants and $u_P$ is a particular solution which can be written as a single-layer potential

$$u_P(x) = v(\psi, x) = \int_\Sigma u(x,y)\,\psi(y)\,d\omega(y), \qquad x \in \Sigma_{\mathrm{int}}, \tag{200}$$

where the layer function $\psi$ fulfills the integral equation (188).
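The orthogonality condition of Lemma 5 amounts to six scalar integrals $\int_\Sigma g \cdot \varphi^{(k)}|_\Sigma \, d\omega$ that have to vanish. The following Python sketch approximates them for the special case that $\Sigma$ is the unit sphere, which is an assumption made only for this illustration. The midpoint quadrature, the helper names, and the two sample boundary fields are likewise hypothetical choices, and the rotations use the axis $\varepsilon^{(k-3)}$ for $k = 4, 5, 6$.

```python
import math


def sphere_nodes(n_theta=40, n_phi=80):
    """Midpoint quadrature nodes (point, weight) on the unit sphere."""
    dt, dp = math.pi / n_theta, 2.0 * math.pi / n_phi
    nodes = []
    for i in range(n_theta):
        t = (i + 0.5) * dt
        for j in range(n_phi):
            p = (j + 0.5) * dp
            x = (math.sin(t) * math.cos(p), math.sin(t) * math.sin(p), math.cos(t))
            nodes.append((x, math.sin(t) * dt * dp))
    return nodes


def rigid_body_motion(k, x):
    """Translation e_k for k = 1, 2, 3 and rotation x ^ e_(k-3) for k = 4, 5, 6."""
    e = [0.0, 0.0, 0.0]
    e[(k - 1) % 3] = 1.0
    if k <= 3:
        return e
    return [x[1] * e[2] - x[2] * e[1],
            x[2] * e[0] - x[0] * e[2],
            x[0] * e[1] - x[1] * e[0]]


def rigid_body_moments(g, nodes):
    """Approximate the six integrals of g . phi^(k) over the unit sphere."""
    moments = [0.0] * 6
    for x, w in nodes:
        gx = g(x)
        for k in range(1, 7):
            phi = rigid_body_motion(k, x)
            moments[k - 1] += w * sum(a * b for a, b in zip(gx, phi))
    return moments


nodes = sphere_nodes()
# g = nu (on the unit sphere nu(x) = x): all six moments vanish.
normal_moments = rigid_body_moments(lambda x: list(x), nodes)
# g = e_3 (uniform field): the moment against the translation e_3 does not vanish.
uniform_moments = rigid_body_moments(lambda x: [0.0, 0.0, 1.0], nodes)
```

In this sketch the field $g = \nu$ passes the orthogonality test, while the uniform field $g = \varepsilon^{(3)}$ fails it, because its integral against the translation $\varphi^{(3)}$ equals the surface area of the sphere.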
The condition on $g$ in Lemma 5 seems to be a disadvantage, but it is not. If $g$ is not orthogonal to one or more of the $\{\varphi^{(k)}|_\Sigma\}_{k=1,\ldots,6}$, then the single-layer approach resulting in an integral equation of the second kind is not applicable. Nevertheless, the interior Stokes problem of the second kind is solvable in this case. A solution is then given by an arbitrary linear combination of the six rigid body motions $\{\varphi^{(k)}\}_{k=1,\ldots,6}$. Uniqueness is often a problem when discussing interior problems of the second kind (which are also called interior Neumann problems). For example, in the case of the Laplace equation, an arbitrary constant can be added to every solution of the interior Neumann problem.

Exterior Problem of the First Kind: Next, we want to show the existence of a solution for the exterior Stokes problem of the first kind. As before for the interior Stokes problem of the first kind, we express the solution in the form of a double-layer potential

$$w(\varphi, x) = \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y), \qquad x \in \Sigma_{\mathrm{ext}}. \tag{201}$$
To obtain the unknown layer function $\varphi \in c^{(0)}(\Sigma)$, we take the limit $x \to \Sigma$, $x \in \Sigma_{\mathrm{ext}}$, and get, using the limit relation for the double-layer potential and the boundary condition for the first exterior problem,

$$\begin{aligned}
f(x) = \lim_{\tau \to 0,\ \tau > 0} w(\varphi, x + \tau\nu(x)) &= -\frac{1}{2}\varphi(x) + \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y) \\
&= -\frac{1}{2}\varphi(x) + (W(0,0)\varphi)(x), \qquad x \in \Sigma.
\end{aligned} \tag{202}$$
Equation (202) is an integral equation of the second kind. By the well-known theorem of Fredholm (see, e.g., Heuser 1992), it has a solution $\varphi \in c^{(0)}(\Sigma)$ if and only if the right-hand side $f$ is orthogonal to all nontrivial solutions $\psi \in c^{(0)}(\Sigma)$ of the corresponding homogeneous adjoint integral equation of (202) given by

$$-\frac{1}{2}\psi(x) - (V'(0,0)\psi)(x) = 0 \quad\Longleftrightarrow\quad -\frac{1}{2}\psi(x) + \int_\Sigma (k(y,x))^T \psi(y)\,d\omega(y) = 0, \qquad x \in \Sigma. \tag{203}$$
The nontrivial solutions of (203) are usually unknown. The theorem of Fredholm tells us that the nontrivial solutions of (203) and the nontrivial solutions of the homogeneous analogue of (202) form vector spaces of equal dimensions. In Lemma 4, we have seen that the vector fields $\{\varphi^{(k)}|_\Sigma\}_{k=1,\ldots,6}$ fulfill the homogeneous form of (202), i.e.,

$$0 = -\frac{1}{2}\varphi^{(k)}|_\Sigma(x) + \int_\Sigma k(x,y)\,\varphi^{(k)}|_\Sigma(y)\,d\omega(y), \qquad x \in \Sigma, \quad k = 1, \ldots, 6. \tag{204}$$

Furthermore, the dimension of the solution space of (204) is precisely 6. Thus, we can conclude that the dimension of the vector space of solutions of (203) is also 6, and we are able to formulate the following result.

Lemma 6. For every $f \in c^{(0)}(\Sigma)$ which is orthogonal to the 6 linearly independent solutions of (203), the exterior Stokes problem of the first kind is solvable. In this case, the solution $u$ can be written as a double-layer potential

$$u(x) = \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y), \qquad x \in \Sigma_{\mathrm{ext}}, \tag{205}$$
where the layer function $\varphi$ fulfills the integral equation (202).

The results of this lemma are not satisfactory. First of all, there seems to be no physical reason why the exterior Stokes problem of the first kind should not be solvable for all boundary data $f \in c^{(0)}(\Sigma)$. Furthermore, we have seen in Theorem 9 that the exterior Stokes problem of the first kind has at most one solution. This is not adequately reflected by the nonuniqueness of the solution of (202). Summarizing, we can say that the double-layer approach presented here seems to be not the right choice for solving the exterior Stokes problem of the first kind. We present two techniques to overcome the problems occurring in the “single” double-layer approach.

Exterior Problem of the Second Kind: For the exterior Stokes problem of the second kind, we seek, as for the corresponding interior problem, a solution in the form of a single-layer potential

$$v(\psi, x) = \int_\Sigma u(x,y)\,\psi(y)\,d\omega(y), \qquad x \in \Sigma_{\mathrm{ext}}. \tag{206}$$
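To obtain the unknown layer function $\psi \in c^{(0)}(\Sigma)$, we take the limit $x \to \Sigma$, $x \in \Sigma_{\mathrm{ext}}$, of the stress tensor of $v(\psi,\cdot)$ and get, using the limit relation for the stress tensor of the single-layer potential (Theorem 5) and the boundary condition for the exterior problem of the second kind,

$$\begin{aligned}
g(x) = \lim_{\tau \to 0,\ \tau > 0} \sigma[v(\psi,\cdot)](x + \tau\nu(x))\,\nu(x) &= -\frac{1}{2}\psi(x) + \sigma[v(\psi,\cdot)](x)\,\nu(x) \\
&= -\frac{1}{2}\psi(x) + (P'(0,0)\psi)(x) \\
&= -\frac{1}{2}\psi(x) - \int_\Sigma (k(y,x))^T \psi(y)\,d\omega(y), \qquad x \in \Sigma.
\end{aligned} \tag{207}$$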
Similar to (188), this is an integral equation of the second kind. By the theorem of Fredholm, it has a solution $\psi \in c^{(0)}(\Sigma)$ if and only if the right-hand side $g$ is orthogonal to all nontrivial solutions of the corresponding homogeneous adjoint integral equation of (207) given by

$$-\frac{1}{2}\varphi(x) - (W(0,0)\varphi)(x) = 0 \quad\Longleftrightarrow\quad -\frac{1}{2}\varphi(x) - \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y) = 0, \qquad x \in \Sigma. \tag{208}$$
Due to the theorem of Fredholm, the dimensions of the vector spaces of solutions of (208) and the homogeneous form of (207) are equal. In Lemma 2, we have seen that the homogeneous form of (207) has only one linearly independent solution given by $\psi = \nu$, where $\nu$ is the unit normal field on the surface $\Sigma$. Every solution $\psi$ of (207) can be expressed as $\psi = \psi_P + C\nu$, where $\psi_P$ is a particular solution of (207) and $C \in \mathbb{R}$ is a constant. Nevertheless, the velocity field obtained by a single-layer approach with the density $\psi$ is unique, since a single-layer potential with density $\nu$ is identically zero in $\Sigma_{\mathrm{ext}}$ because it corresponds to an exterior Stokes flow of the second kind with vanishing boundary condition on $\Sigma$. Furthermore, we know from the theorem of Fredholm that (208) has only one linearly independent nontrivial solution denoted by $\varphi_{\mathrm{hom}}$, which is generally unknown. Thus, we are able to formulate the following result.

Lemma 7. For every $g \in c^{(0)}(\Sigma)$ which is orthogonal to $\varphi_{\mathrm{hom}}$, where $\varphi_{\mathrm{hom}}$ is a nontrivial solution of the integral equation (208), the exterior Stokes problem of the second kind is solvable. The solution $u$ of the problem can be written as a single-layer potential

$$u(x) = \int_\Sigma u(x,y)\,\psi(y)\,d\omega(y), \qquad x \in \Sigma_{\mathrm{ext}}, \tag{209}$$
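where the layer function $\psi$ fulfills the integral equation (207).

Summarizing the previous considerations, we have seen that the following relations exist between the integral equations of the second kind on the regular surface $\Sigma$:

$$\underbrace{\frac{1}{2}\varphi(x) + \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y) = 0}_{\text{(176) and (208)}} \quad\overset{\text{adjoint}}{\longleftrightarrow}\quad \underbrace{\frac{1}{2}\psi(x) + \int_\Sigma (k(y,x))^T \psi(y)\,d\omega(y) = 0}_{\text{(207) and (177)}} \tag{210}$$

$$\underbrace{\frac{1}{2}\varphi(x) - \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y) = 0}_{\text{(202) and (189)}} \quad\overset{\text{adjoint}}{\longleftrightarrow}\quad \underbrace{\frac{1}{2}\psi(x) - \int_\Sigma (k(y,x))^T \psi(y)\,d\omega(y) = 0}_{\text{(188) and (203)}} \tag{211}$$

For the corresponding homogeneous Stokes problems, this means that the following relations hold true:

Interior problem of the first kind $\overset{\text{adjoint}}{\longleftrightarrow}$ exterior problem of the second kind,

Exterior problem of the first kind $\overset{\text{adjoint}}{\longleftrightarrow}$ interior problem of the second kind.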
Especially, the connection of the exterior Stokes problem of the first kind and the interior Stokes problem of the second kind poses some difficulties. Since both problems are adjoint to each other in the layer potential approach presented above, the nonuniqueness of the interior Stokes problem of the second kind influences the solvability of the exterior Stokes problem of the first kind. But, as already discussed before, there is no physical reason for this specific problem not to be uniquely solvable for arbitrary boundary values $f \in c^{(0)}(\Sigma)$. Thus, the double-layer approach presented above seems to be not appropriate to solve the exterior Stokes problem of the first kind. In the following sections, we present two different methods to overcome this problem.

The Completed Double-Layer Approach: In the previous consideration, we have seen that the double-layer approach for the exterior problem of the first kind can only guarantee the existence of a solution for boundary data $f \in c^{(0)}(\Sigma)$ which are orthogonal to the solutions of the homogeneous adjoint boundary integral equation (203). We have seen that the dimension of the solution space of (203) is 6. Thus, it follows from the theorem of Fredholm that the homogeneous analogue of (202) has 6 linearly independent nontrivial solutions. These solutions are explicitly known (see (170) and (171)) and can be used to construct a solution for arbitrary boundary data $f \in c^{(0)}(\Sigma)$. For this construction, the nontrivial solutions of the homogeneous adjoint equation (203) have also to be calculated. They are not known explicitly for a general regular surface.

Having a closer look at the double-layer potential $w(\varphi,\cdot)$ as an ansatz for solving the exterior problem of the first kind, it becomes clear that this approach could not lead to a satisfying result. The decay of a double-layer potential at infinity is given by

$$w(\varphi, x) = O\!\left(\frac{1}{|x|^2}\right) \quad \text{for } |x| \to \infty. \tag{212}$$

This decay of $w(\varphi,\cdot)$ is too fast compared to the condition for the flow field $u$ at infinity in the exterior Stokes problem, which is given by

$$u(x) = O\!\left(\frac{1}{|x|}\right) \quad \text{for } |x| \to \infty. \tag{213}$$
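The slower decay (213) is exactly the decay of a Stokeslet, which can be checked numerically. The Python sketch below assumes the commonly used form $u(x,0)\alpha = \frac{1}{8\pi\mu}\big(\alpha/|x| + x\,(x\cdot\alpha)/|x|^3\big)$ with $\mu = 1$; this is an assumption about the normalization of the Stokeslet defined in (48), and the function names, the ray direction, and the vector $\alpha$ are illustrative choices.

```python
import math


def stokeslet_apply(x, alpha, mu=1.0):
    """Apply the (assumed standard) Stokeslet u(x, 0) to a vector alpha."""
    r = math.sqrt(sum(c * c for c in x))
    x_dot_a = sum(c * a for c, a in zip(x, alpha))
    return [(a / r + c * x_dot_a / r ** 3) / (8.0 * math.pi * mu)
            for c, a in zip(x, alpha)]


def norm(v):
    """Euclidean norm of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


direction = [1.0 / math.sqrt(3.0)] * 3   # unit vector defining a ray (assumption)
alpha = [1.0, 0.0, 0.0]                  # strength of the Stokeslet (assumption)
radii = [1.0, 10.0, 100.0, 1000.0]

# |x| * |u(x, 0) alpha| stays constant along the ray, i.e. the decay is O(1/|x|).
scaled = [R * norm(stokeslet_apply([R * d for d in direction], alpha))
          for R in radii]
```

Along the chosen ray, every entry of `scaled` has the same value, so a Stokeslet decays like $1/|x|$ and not faster. A pure double-layer potential, which decays like $1/|x|^2$ by (212), cannot reproduce this far-field behavior.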
There are generally three ways to circumvent these difficulties and to guarantee the existence of a solution of the exterior Stokes problem of the first kind for arbitrary boundary data $f \in c^{(0)}(\Sigma)$ using a boundary integral equation approach. The first way is to make an ansatz by a single-layer potential $v(\varphi,\cdot)$ with unknown layer function $\varphi \in c^{(0)}(\Sigma)$. Due to the limit relation for the single-layer potential and the compactness of the single-layer potential operator, this approach results in an integral equation of the first kind for the unknown layer function $\varphi$ including the boundary data $f$ (see Faxen 1929; Fischer 1982). This ansatz is not discussed here in more detail. As is known, Fredholm integral equations of the first kind are ill-posed problems, and they generally give rise to unstable numerical schemes if any discretization is applied to the integral equation. The resulting linear systems are in most cases highly ill-conditioned.

A second way to circumvent the difficulties was published in Power (1987) and Power and Miranda (1987) and intensively discussed in Power and Wrobel (1995). The authors of these articles observed that the double-layer ansatz can represent only those flow fields corresponding to surfaces which are force- and torque-free. To overcome this problem, certain terms with variable total force and torque are added to the double-layer potential. To be more precise, a Stokeslet and a Rotlet (which is defined below) located in the interior of the surface are added to the double-layer approach. Following this second approach, we seek a solution of the exterior Stokes problem of the first kind in the following form:

$$u(x) = w(\varphi, x) + u(x,0)\,\alpha(\varphi) + r(x,0)\,\omega(\varphi), \qquad x \in \Sigma_{\mathrm{ext}}, \tag{214}$$

where $u$ is the well-known Stokeslet introduced in (48) and $r$, which is called Rotlet, is given by

$$r(x,y) = \frac{1}{8\pi}\,\frac{1}{|x-y|^3}\begin{pmatrix} 0 & -(x_3-y_3) & x_2-y_2 \\ x_3-y_3 & 0 & -(x_1-y_1) \\ -(x_2-y_2) & x_1-y_1 & 0 \end{pmatrix}, \qquad x, y \in \mathbb{R}^3,\ x \ne y. \tag{215}$$

In (214), we have added to a double-layer potential $w(\varphi,\cdot)$ with unknown density $\varphi$ a Stokeslet located at the origin, whose strength is given by the constant vector $\alpha(\varphi)$, and a Rotlet with constant strength $\omega(\varphi)$, which is also located at the origin. This special Rotlet can be written as

$$r(x,0)\,\omega(\varphi) = \frac{1}{8\pi}\,\frac{x \wedge \omega(\varphi)}{|x|^3}, \qquad x \in \mathbb{R}^3,\ x \ne 0. \tag{216}$$
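Two properties of the Rotlet term can be read off from (216) and checked numerically: it decays like $1/|x|^2$, and it is divergence-free away from the origin. The following Python sketch evaluates (216) directly; the strength vector, the evaluation points, and the helper names are illustrative assumptions, and the divergence is only approximated by central differences.

```python
import math


def cross(a, b):
    """Vector product a ^ b in R^3."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def norm(v):
    """Euclidean norm of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


def rotlet_apply(x, omega):
    """Evaluate r(x, 0) omega = (x ^ omega) / (8 pi |x|^3) as in (216)."""
    r = norm(x)
    return [c / (8.0 * math.pi * r ** 3) for c in cross(x, omega)]


def divergence(field, x, h=1e-5):
    """Central-difference approximation of the divergence of field at x."""
    total = 0.0
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        total += (field(xp)[j] - field(xm)[j]) / (2.0 * h)
    return total


omega = [0.2, -0.5, 1.0]                 # Rotlet strength (assumption)
direction = [1.0 / math.sqrt(3.0)] * 3   # unit vector defining a ray (assumption)
radii = [1.0, 10.0, 100.0]

# |x|^2 * |r(x, 0) omega| is constant along the ray, i.e. the decay is O(1/|x|^2).
scaled = [R ** 2 * norm(rotlet_apply([R * d for d in direction], omega))
          for R in radii]
div_at_point = divergence(lambda y: rotlet_apply(y, omega), [0.7, -0.4, 1.1])
```

The constant values in `scaled` show that the Rotlet does not change the $O(1/|x|)$ far field contributed by the Stokeslet in (214), and `div_at_point` vanishes up to discretization error.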
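It is convenient to choose (see Power and Wrobel 1995)

$$\alpha(\varphi) = \sum_{k=1}^{3} \left(\int_\Sigma \varphi(x)\cdot\varphi^{(k)}|_\Sigma(x)\,d\omega(x)\right)\varepsilon^{(k)}, \tag{217}$$

$$\omega(\varphi) = \sum_{k=1}^{3} \left(\int_\Sigma \varphi(x)\cdot\varphi^{(k+3)}|_\Sigma(x)\,d\omega(x)\right)\varepsilon^{(k)}, \tag{218}$$

where $\{\varphi^{(k)}\}_{k=1,\ldots,6}$ are the motions of the fluid as a rigid body introduced in (170) and (171). To obtain the unknown layer function $\varphi \in c^{(0)}(\Sigma)$, we take the limit $x \to \Sigma$, $x \in \Sigma_{\mathrm{ext}}$, and get, using the limit relation for the double-layer potential and the boundary condition for the first exterior problem,

$$\begin{aligned}
f(x) &= \lim_{\tau \to 0,\ \tau > 0} \Big(w(\varphi, x + \tau\nu(x)) + u(x + \tau\nu(x), 0)\,\alpha(\varphi) + r(x + \tau\nu(x), 0)\,\omega(\varphi)\Big) \\
&= -\frac{1}{2}\varphi(x) + \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y) + u(x,0)\,\alpha(\varphi) + r(x,0)\,\omega(\varphi).
\end{aligned} \tag{219}$$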
Theorem 12. For all $f \in c^{(0)}(\Sigma)$, the boundary integral equation (219) is uniquely solvable.

Proof. The double-layer potential operator $\varphi \mapsto \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y)$ is compact on the space $c^{(0)}(\Sigma)$. Furthermore, it can easily be shown that the operators

$$c^{(0)}(\Sigma) \to c^{(0)}(\Sigma), \quad \varphi \mapsto u(\cdot, 0)\,\alpha(\varphi), \qquad \text{and} \qquad c^{(0)}(\Sigma) \to c^{(0)}(\Sigma), \quad \varphi \mapsto r(\cdot, 0)\,\omega(\varphi),$$

where $\alpha(\varphi)$ and $\omega(\varphi)$ are given by (217) and (218), are compact on $c^{(0)}(\Sigma)$. Thus, (219) is a Fredholm integral equation of the second kind and the theorem of Fredholm can be applied. In the following, we show that the homogeneous form of (219) given by

$$-\frac{1}{2}\varphi(x) + \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y) + u(x,0)\,\alpha(\varphi) + r(x,0)\,\omega(\varphi) = 0 \tag{220}$$

admits only the trivial solution. We define two flow fields $u_1$ and $u_2$ by

$$u_1(x) = \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y) + r(x,0)\,\omega(\varphi), \qquad x \in \Sigma_{\mathrm{ext}}, \tag{221}$$

$$u_2(x) = -u(x,0)\,\alpha(\varphi), \qquad x \in \Sigma_{\mathrm{ext}}. \tag{222}$$
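Using the limit relation for the double-layer potential and (220), we can conclude that the restrictions of $u_1$ and $u_2$ to the surface $\Sigma$ coincide. By the uniqueness of the exterior Stokes problem of the first kind, it follows that $u_1$ and $u_2$ are equal in $\Sigma_{\mathrm{ext}}$. Since

$$u_1(x) = O\!\left(\frac{1}{|x|^2}\right) \quad \text{and} \quad u_2(x) = O\!\left(\frac{1}{|x|}\right), \qquad |x| \to \infty, \tag{223}$$

we can conclude that both $u_1$ and $u_2$ are identically zero in $\Sigma_{\mathrm{ext}}$, i.e., for all $x \in \Sigma_{\mathrm{ext}}$,

$$\int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y) + r(x,0)\,\omega(\varphi) = 0, \tag{224}$$

$$\alpha(\varphi) = \sum_{k=1}^{3} \left(\int_\Sigma \varphi(x)\cdot\varphi^{(k)}|_\Sigma(x)\,d\omega(x)\right)\varepsilon^{(k)} = 0. \tag{225}$$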
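Next, we define two flow fields $u_3$ and $u_4$ by

$$u_3(x) = \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y), \qquad x \in \Sigma_{\mathrm{ext}}, \tag{226}$$

$$u_4(x) = r(x,0)\,\omega(\varphi), \qquad x \in \Sigma_{\mathrm{ext}}. \tag{227}$$

Since the torque resulting from $u_4$ on $\Sigma$ is equal to $\omega$, and the torque of a double-layer potential on $\Sigma$ is zero, it can be concluded that, for $x \in \Sigma_{\mathrm{ext}}$,

$$\int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y) = 0, \tag{228}$$

$$\omega(\varphi) = \sum_{k=1}^{3} \left(\int_\Sigma \varphi(x)\cdot\varphi^{(k+3)}|_\Sigma(x)\,d\omega(x)\right)\varepsilon^{(k)} = 0. \tag{229}$$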
Because of (225) and (229), the integral equation (220) reduces to

$$-\frac{1}{2}\varphi(x) + \int_\Sigma k(x,y)\,\varphi(y)\,d\omega(y) = 0. \tag{230}$$

As we have seen before (see Lemma 4), this homogeneous equation has precisely six linearly independent solutions given by the restrictions of the six rigid body motions to $\Sigma$. Thus, every solution $\varphi$ can be represented as a linear combination of these six linearly independent solutions, i.e., there exist coefficients $C_j \in \mathbb{R}$, $j = 1, \ldots, 6$, such that $\varphi = \sum_{j=1}^{6} C_j\,\varphi^{(j)}|_\Sigma$. Inserting this equation into (225) and (229) results in

$$\sum_{j=1}^{6} C_j \int_\Sigma \varphi^{(j)}|_\Sigma(x)\cdot\varphi^{(k)}|_\Sigma(x)\,d\omega(x) = 0, \qquad k = 1, \ldots, 6. \tag{231}$$
This system of linear equations has only the trivial solution Cj D 0, j D 1; : : : ; 6, since the matrix Z .j / .k/ ' j† .x/ ' j† .x/ d!.x/ (232) †
j;kD1;:::;6
is regular because the functions ' .k/ j† are linearly independent. This shows that ' D 0 is the only solution of (220), and we can conclude by the theorem of Fredholm that the integral equation (219) is uniquely solvable for all right-hand sides f 2 c.0/ .†/. Using this result, we are finally able to formulate the following lemma. Lemma 8. For every f 2 c.0/ .†/, the exterior Stokes problem of the first kind is uniquely solvable. The solution u can be written in the form Z u.x/ D
k.x; y/'.y/ d!.y/Cu.x; 0/˛.'/Cr.x; 0/!.'/;
x 2 †ext ;
(233)
†
where the layer function $\varphi$ fulfills the integral equation (219) and $\alpha$ and $\omega$ are given by (217) and (218).

The Modified Double-Layer Approach: Finally, a third method to overcome the problems of solvability for the exterior Stokes problem of the first kind has been proposed in Hebeker (1986). As we have seen before, the classical way of solving this problem leads to a system of boundary integral equations with eigensolutions (see (202)). These are known explicitly, but the usual modified approach to circumvent these eigensolutions requires all the eigensolutions of the corresponding adjoint system of equations, which are usually not known. In Hebeker (1986), a modified boundary layer approach is proposed to obtain a system of uniquely solvable boundary integral equations. We seek a solution of the exterior Stokes problem of the first kind in the following form:

$$u(x) = w(\varphi, x) + \eta\, v(\varphi, x), \qquad x \in \Sigma_{\mathrm{ext}}. \tag{234}$$

In (234), we have added a single-layer potential with density $\eta\varphi$, where $\eta \in \mathbb{R}$ is a free constant, to a double-layer potential $w(\varphi,\cdot)$ with unknown density $\varphi$. This approach corresponds to an idea due to Leis (1964), Brakhage and Werner (1965), and Panich (1965) for the scalar Helmholtz equation. To obtain the unknown layer function $\varphi \in c^{(0)}(\Sigma)$, we take the limit $x \to \Sigma$, $x \in \Sigma_{\mathrm{ext}}$. Using the limit relations for the double-layer and the single-layer potentials and the boundary condition for the first exterior Stokes problem, we get

$$f(x) = \lim_{\tau\to 0,\ \tau>0} \bigl( w(\varphi, x+\tau\nu(x)) + \eta\, v(\varphi, x+\tau\nu(x)) \bigr) = \frac{1}{2}\varphi(x) + \int_\Sigma k(x,y)\varphi(y)\,d\omega(y) + \eta \int_\Sigma u(x,y)\varphi(y)\,d\omega(y). \tag{235}$$
Theorem 13. For all $\eta > 0$ and all $f \in c^{(0)}(\Sigma)$, the boundary integral equation (235) is uniquely solvable.

Proof. Since the single- and double-layer potentials are compact operators on the space $c^{(0)}(\Sigma)$, (235) is a Fredholm boundary integral equation of the second kind. Using the theorem of Fredholm, it suffices to show that the homogeneous adjoint system

$$\frac{1}{2}\psi(x) + \int_\Sigma (k(y,x))^{T}\psi(y)\,d\omega(y) + \eta \int_\Sigma u(x,y)\psi(y)\,d\omega(y) = 0, \qquad x \in \Sigma, \tag{236}$$
has only the trivial solution. Because of Lemma 1 and Corollary 3, (236) is equivalent to the integral equation 0D
1 .x/ C ŒV .0; 0/f .x/.x/ C V .0; 0/ .x/; 2
D lim V 0 .; 0/ .x/ C V .0; 0/ .x/; !0 >0
(237)
x 2 †:
Integrating over †, this gives Z
Z jV .0; 0/ .x/j2 d!.x/ D
†
.V .0; 0/ .x// .V .0; 0/ .x// d!.x/ †
!
Z D
.V .0; 0/ .x// †
lim V 0 .; 0/ .x/ !0 >0
d!.x/ : (238)
Using third Green’s identity (46), we obtain !
Z
.V .0; 0/ .x// †
Z
d!.x/
(239)
Z
†
ˇ ˇ2 ˇ ˇ ˇrx ˝ V .; 0/ .x/ C .rx ˝ V .0; 0/ .x//T ˇ dx 0
D lim !0 >0
!0 >0
.V .0; 0/ .x// V 0 .; 0/ .x/ d!.x/
D lim !0 >0
lim V 0 .; 0/ .x/
†int
which clearly gives using (238) and > 0 that V .0; 0/ D 0 on †. Because of the unique solvability of the interior as well as the exterior Stokes problem of the first kind, this immediately leads to V D 0 in R3 . The limit relation for the single-layer potential (91) now gives .x/ D lim .V .; 0/ .x/ V .; 0/ .x// D 0; !0 >0
x 2 †;
(240)
which finally shows that the integral equation (236) has only the trivial solution $\psi = 0$. By the theorem of Fredholm, we can conclude that the integral equation (235) is uniquely solvable, which is the desired result.

Using this result, we are finally able to formulate the following lemma.

Lemma 9. For every $f \in c^{(0)}(\Sigma)$, the exterior Stokes problem of the first kind is uniquely solvable. The solution $u$ can be written in the form

$$u(x) = \int_\Sigma k(x,y)\varphi(y)\,d\omega(y) + \eta \int_\Sigma u(x,y)\varphi(y)\,d\omega(y), \qquad x \in \Sigma_{\mathrm{ext}}, \tag{241}$$
where the layer function ' fulfills the integral equation (235). To complete this section, we want to give two regularity results for the exterior Stokes problems, which connect the solution u of the Stokes problem to the respective boundary data. (see Freeden (1980) for an analogous result involving the Laplace operator). Theorem 14. Let † be a regular surface and let f 2 c.0/ .†/ be given. Let u 2 c.2/ .†ext / \ c.0/ .†ext / be the unique solution of the exterior Stokes problem of the first kind with boundary data f . Then, for every sufficiently small > 0, there exists a constant C .D C .K; †// such that .k/ r u
c.0/ .K/
C kf kl2 .†/
(242)
for all K †ext with dist.K; †ext / and for all k 2 N0 . Proof. Let us recall that the exterior Stokes problem of the first kind can be solved by (234), where the unknown layer function ' fulfills the boundary integral equation (235). In the operator notation, (235) reads as follows: 1 T ' D I W .0; 0/ C V .0; 0/ ' D f : 2
(243)
The operator T W l2 .†/ ! l2 .†/ defined above and its adjoint T with respect to the l2 .†/-inner product are bijective in the Banach space .c.0/ .†/; kkc.0/ .†/ /. By virtue of the inverse mapping theorem (see, e.g., Heuser 1992), the operators T and T 1 are linear and bounded with respect to the kkc.0/ .†/ -norm. Using a technique due to Lax (1954) (see also the proof of Theorem 7), the boundedness of the operators T and T 1 can be transferred to the kkl2 .†/ norm. Now, let > 0 and let K †ext with dist.K; †ext / be given. Using (234) and the Cauchy-Schwarz inequality, we get, for all x 2 K, ˇZ ˇ ˇ ˇ ˇ ˇ .k/ ˇ r u .x/ˇ D ˇ .k.x; y/ C u.x; y// '.y/ d!.y/ˇ ˇ ˇ
(244)
†
Z
ˇ ˇ .k/ ˇr .k.x; y/ C u.x; y//ˇ2 d!.y/
†
1=2 k'kl2 .†/ :
Thus, we have shown that .k/ r u .0/ D k'kl2 .†/ c .K/
(245)
with Z D D sup x2K
ˇ ˇ .k/ ˇr .k.x; y/ C u.x; y//ˇ2 d!.y/
1=2 < 1:
(246)
†
Unfortunately, this is not the desired result, since we have the layer function $\varphi$ on the right-hand side and not the boundary data $f$. However, we know that $\varphi = T^{-1}f$ with the operator $T$ being defined in (243). Defining $C = D\,\|T^{-1}\|$, we get

$$\bigl\| \nabla^{(k)} u \bigr\|_{c^{(0)}(K)} \le D\,\bigl\|T^{-1}f\bigr\|_{l^2(\Sigma)} \le D\,\bigl\|T^{-1}\bigr\|\,\|f\|_{l^2(\Sigma)} = C\,\|f\|_{l^2(\Sigma)}, \tag{247}$$

which is the desired result.

An analogous argument using the representation (201) and the integral equation (202) yields the following result (see also Freeden (1980) and Freeden and Gerhards (2013) for results for the Laplace operator, and Freeden and Michel (2004) for estimates involving the Cauchy-Navier operator).
Theorem 15. Let † be a regular surface and let g 2 c.0/ .†/ be given such that the exterior Stokes problem of the second kind is solvable and let u 2 c.2/ .†ext / \ c.1/ .†ext / be the unique solution of the Stokes problem with boundary data g. Then, for every sufficiently small > 0, there exists a constant C .D C .K; †// such that .k/ r u
c.0/ .K/
@g C @ 2 l .†/
(248)
for all K †ext with dist.K; †ext / and for all k 2 N0 .
3.5 The Stress Tensor of the Double-Layer Potential
For a regular surface, the integral kernel arising from the stress tensor of the double-layer potential is, in general, not weakly singular, even after taking its right normal projection to obtain the tractions. There is no guarantee that a finite limit exists for the stress tensor and its right normal projection as the surface is approached from inside or outside. In Odqvist (1930), it is shown that, for the limit to exist, it is sufficient that the double-layer density and its first derivative on the regular surface are Hölder continuous. Moreover, if the limit of the right normal projection of the stress of a double-layer potential from one side of the surface exists, then the limit from the other side of the surface exists and the two limits take the same value at the surface. In other words, the projection of the stress tensor of a double layer passes continuously through the surface. This result is called the Lyapunov-Tauber theorem in Power and Wrobel (1995), while it is called Faxen's theorem in Hebeker (1986) due to Faxen (1929). In this section, we present two results: the first one is just the statement from above, and the second one is the result of Odqvist (1930).

Theorem 16. The existence of either the exterior limit

$$\lim_{\tau\to 0,\ \tau>0} \bigl( \sigma[w(f,\cdot)](x+\tau\nu(x)) \bigr)\,\nu(x) \tag{249}$$

or the interior limit

$$\lim_{\tau\to 0,\ \tau>0} \bigl( \sigma[w(f,\cdot)](x-\tau\nu(x)) \bigr)\,\nu(x) \tag{250}$$

implies the existence of the other limit, respectively, and

$$\lim_{\tau\to 0,\ \tau>0} \Bigl[ \bigl( \sigma[w(f,\cdot)](x+\tau\nu(x)) \bigr)\,\nu(x) - \bigl( \sigma[w(f,\cdot)](x-\tau\nu(x)) \bigr)\,\nu(x) \Bigr] = 0, \qquad x \in \Sigma. \tag{251}$$
Proof. The realization of this result follows an idea of Kupradze (1965) for a corresponding version of this theorem in the theory of linear elasticity. Let w.l.o.g an exterior limit of Œw.f; / exist and let g.x/ D lim Œw.f; / .x C .x//.x/; !0 >0
x 2 †:
(252)
We construct a single-layer potential v.'; / satisfying the boundary condition of the second kind Œv.'; / .x/.x/ D g.x/;
x 2 †:
(253)
From the limit relation for the stress tensor of a single-layer potential (100), we obtain Z 1 '.x/ .k.y; x//T '.y/ d!.y/ D g.x/; x 2 † : (254) 2 † This is a Fredholm integral equation of the second kind which is solvable if and only if the right-hand side g fulfills the conditions Z g.x/' .k/ j† .x/ d!.x/ D 0;
k D 1; : : : ; 6;
(255)
†
where the functions ' .k/ , k D 1; : : : ; 6 are the six rigid body motions introduced in (170). The external limit for the given double-layer potential is given by (see (98)) lim w.f; x C .x// D !0 >0
1 .x/ C 2
Z k.x; y/ d!.y/;
x 2 †:
(256)
†
Considering this equation as a Fredholm integral equation of the second kind, it has a solution if and only if the boundary data lim !0 w.f; x C .x// fulfills >0
Z lim w.f; x C .x// †
!0 >0
.k/
.x/ d!.x/ D 0;
k D 1; : : : ; 6;
(257)
where the .k/ are the six linearly independent nontrivial solutions of the homogeneous adjoint integral equation of (256) given by
1 2
Z .k/
.k.y; x//T
.x/ C
.k/
.y/ d!.y/ D 0;
x 2 †:
(258)
†
To each of these nontrivial solutions, there corresponds a single-layer potential $v(\psi^{(k)},\cdot)$, $k = 1,\dots,6$, for which $\lim_{\tau\to 0,\ \tau>0} \sigma[v(\psi^{(k)},\cdot)](x-\tau\nu(x))\,\nu(x) = 0$ for all $x \in \Sigma$ because of (258) and the limit relation (99). Now, since $v(\psi^{(k)},\cdot)$ and $w(f,\cdot)$ are solutions of the Stokes problem in $\Sigma_{\mathrm{ext}}$, we have, following Theorem 2 (Lorentz reciprocal theorem),

$$\int_\Sigma \Bigl[ \lim_{\tau\to 0,\ \tau>0} v(\psi^{(k)}, x+\tau\nu(x)) \cdot \lim_{\tau\to 0,\ \tau>0} \sigma[w(f,\cdot)](x+\tau\nu(x))\,\nu(x) - \lim_{\tau\to 0,\ \tau>0} w(f, x+\tau\nu(x)) \cdot \lim_{\tau\to 0,\ \tau>0} \sigma[v(\psi^{(k)},\cdot)](x+\tau\nu(x))\,\nu(x) \Bigr]\, d\omega(x) = 0 \tag{259}$$

for all $k = 1,\dots,6$. Using the jump relation of the right normal projection of a single-layer potential (105), the fact that $\lim_{\tau\to 0,\ \tau>0} \sigma[v(\psi^{(k)},\cdot)](x-\tau\nu(x))\,\nu(x) = 0$ for all $x \in \Sigma$, and the continuity of the single-layer potential across the surface $\Sigma$, this equation is equivalent to

$$\int_\Sigma \lim_{\tau\to 0,\ \tau>0} v(\psi^{(k)}, x-\tau\nu(x)) \cdot \lim_{\tau\to 0,\ \tau>0} \sigma[w(f,\cdot)](x+\tau\nu(x))\,\nu(x)\, d\omega(x) = 0. \tag{260}$$
(260) is equivalent to (255), since v. .k/ ; / is a solution of the homogeneous interior Stokes problem of the second kind and can be written as a linear combination of the six rigid body motions ' .k/ , k D 1; : : : ; 6. Thus, we finally have shown that there always exists a single-layer potential v.'; / satisfying the boundary condition (253). Now, we consider the vector field u.x/ D w. ; x/ v.'; x/;
x 2 †ext :
(261)
Using the limit relation for a single-layer potential (96) and the boundary condition (253), we get lim Œu .x C .x//.x/ D '.x/;
x 2 †:
!0 >0
(262)
Since u fulfills the Stokes equations in †ext , it can be represented according to Corollary 6 by u.x/ D w.uj† ; x/ v.. Œu /j† ; x/;
x 2 †ext :
(263)
Subtracting the two representations (261) and (263) of u and keeping (262) in mind, we obtain w.u ; x/ D 0;
x 2 †ext :
(264)
Taking the limit x ! † coming from †ext , this gives
1 .u.x/ 2
Z .x// C
k.x; y/ .u.y/
.y// d!.y/ D 0;
x 2 †:
(265)
†
Every solution of this homogeneous Fredholm integral equation of the second kind can be written as a linear combination of the six rigid body motions ' .k/ restricted to †, i.e., 6 X Ck ' .k/ j† .x/; x 2 †; (266) u.x/ .x/ D kD1
where Ck 2 R are real coefficients. Therefore, u.x/ D
.x/ C Ck ' .k/ j† .x/;
x 2 †:
(267)
x 2 †int :
(268)
Now, if we observe (263) for x to be in †int , we have 0 D w.uj† ; x/
3 X
v.".i / Œu ; x/".i / ;
i D1
Substituting (267) and (262) into (268), we get 0 D w. ; x/ C
6 X
Ck w.' .k/ ; x/ v.'; x/;
x 2 †int ;
(269)
kD1
which can be written as 0 D w. ; x/ C
6 X
Ck0 ' .k/ v.'; x/;
x 2 †int ;
(270)
kD1
with new coefficients Ck0 2 R, since every double-layer potential with rigid body motions as densities can be written as linear combination of the six rigid body motions themselves in †int (see (170)). Taking the limit of (270) for x ! † coming from †int , using (253) and the fact that Œ' .k/ D 0 on †, we get lim Œw. ; / .x .x//.x/ lim Œv.'; / .x .x//.x/ !0 >0
!0 >0
(271)
D lim Œw. ; / .x .x//.x/ lim Œw. ; / .x C .x//.x/ D 0; !0 >0
!0 >0
for all x 2 † which finally shows the desired result.
To complete this section, we want to give a result which guarantees the existence of the inner and outer limit of the traction of a double-layer potential. A proof of this theorem can be found, e.g., in Faxen (1929) and Odqvist (1930).

Theorem 17. If the layer function $f$ is Hölder continuous, i.e., $f \in c^{(0,\alpha)}(\Sigma)$ with $\alpha > 0$, both the exterior limit

$$\lim_{\tau\to 0,\ \tau>0} \bigl( \sigma[w(f,\cdot)](x+\tau\nu(x)) \bigr)\,\nu(x) \tag{272}$$

and the interior limit

$$\lim_{\tau\to 0,\ \tau>0} \bigl( \sigma[w(f,\cdot)](x-\tau\nu(x)) \bigr)\,\nu(x) \tag{273}$$

exist, and they are pointwise equal, i.e.,

$$\lim_{\tau\to 0,\ \tau>0} \Bigl[ \bigl( \sigma[w(f,\cdot)](x+\tau\nu(x)) \bigr)\,\nu(x) - \bigl( \sigma[w(f,\cdot)](x-\tau\nu(x)) \bigr)\,\nu(x) \Bigr] = 0, \qquad x \in \Sigma. \tag{274}$$
4 Multiscale Regularizations of Layer Potentials
The following section is the core of this work, namely, a multiscale analysis within the space $l^2(\Sigma)$ of square-integrable vector fields on a regular surface $\Sigma$. The special types of tensor kernel functions are developed from the limit and jump relations. The major result which leads us to the definition of tensor scaling functions is the statement of Theorem 7, which presents the limit and jump relations for the single- and double-layer potentials as defined in Definition 3 and their corresponding stress tensors for the topology of the space $(l^2(\Sigma), \|\cdot\|_{l^2(\Sigma)})$. The tensorial layer integral kernels constitute tensor scaling functions on the regular surface, where the distance of a parallel surface to the regular surface acts as the scale parameter. Canonically, tensor wavelet functions are defined by forming the difference between two consecutive scaling functions. In turn, we are led to the settings of scale and detail spaces (see Freeden and Mayer (2003) for the potential theoretic case and Abeyratne et al. (2003) for the case of linear elasticity).
4.1 Scaling Functions and Wavelets
Explicitly written out, the tensor-valued integral kernel functions known from the limit and jump relations as formulated in Corollary 5 read as follows.
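Corollary 7. For all $f \in l^2(\Sigma)$, the limit relation

$$\lim_{\tau\to 0,\ \tau>0} \int_\Sigma \boldsymbol{\Phi}^i_\tau(\cdot,y)\, f(y)\,d\omega(y) = \begin{cases} V(0,0)f, & i = 1, \\ f, & i = 2,3,5,6, \\ 0, & i = 4, \end{cases} \tag{275}$$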
holds true for > 0 and .x; y/ 2 † †, where the tensor-valued kernel functions fˆ i g, i D 1; : : : ; 6, are given by ˆ 1˙ .x; y/ D u.x ˙ .x/; 0/ (276) i .x ˙ .x/ y/ ˝ .x ˙ .x/ y/ 1 ; C D 8 jx ˙ .x/ yj jx ˙ .x/ yj3
ˆ 2˙ .x; y/ D 2k.x ˙ .x/; y/ 2k.x; y/ 3 .x ˙ .x/ y/ ˝ .x ˙ .x/ y/ ..x ˙ .x/ y/ .y// D 2 jx ˙ .x/ yj5 .x y/ ˝ .x y/ ..x y/ .y// ; C jx yj5 (277)
ˆ 3˙ .x; y/
3 .x ˙ .x/ y/ ˝ .x ˙ .x/ y/ D ..x ˙ .x/ y/ .x// 2 jx ˙ .x/ yj5 .x y/ ˝ .x y/ ..x y/ .x// ; C jx yj5 (278)
ˆ 4 .x; y/ D u.x C .x/; y/ u.x .x/; y/; .x C .x/ y/ ˝ .x C .x/ y/ 1 i C D 8 jx C .x/ yj jx C .x/ yj3 i .x .x/ y/ ˝ .x .x/ y/ ; C jx .x/ yj jx .x/ yj3 (279)
ˆ 5 .x; y/ D k.x C .x/; y/ k.x .x/; y/ 3 D 4
.x C .x/ y/ ˝ .x C .x/ y/ jx C .x/ yj5
.x .x/ y/ ˝ .x .x/ y/ jx .x/ yj5
..x C .x/ y/ .y//
..x .x/ y/ .y// ; (280)
ˆ 6 .x; y/
3 D 4
.x C .x/ y/ ˝ .x C .x/ y/ jx C .x/ yj5
..x C.x/ y/ .x//
.x .x/ y/ ˝ .x .x/ y/ ..x .x/ y/ .x// : jx .x/ yj5 (281) Furthermore, for all f 2 l2 .†/, we have 8 i D 1; 3 ; < f 'i .; y/ f .y/ d!.y/ D lim Q.0; 0/f i D 2 ; !0 : † >0 0 i D 4; Z
(282)
provided that > 0 and .x; y/ 2 † †, where the vector-valued kernel functions f'i g, i D 1; : : : ; 4, are given by 1 '˙ .x; y/ D 2 .q.x ˙ .x/; y/ q.x; y// 1 x ˙ .x/ y xy D ; 2 jx ˙ .x/ yj3 jx yj3
(283)
2 '˙ .x; y/ D k.x ˙ .x/; y/
D
jx ˙ .x/ yj2 .y/ 3 ..x ˙ .x/ y/ .y// .x ˙ .x/ y/ ; 2 jx ˙ .x/ yj5
(284) '3 .x; y/ D q.x C .x/; y/ q.x .x/; y/ 1 x C .x/ y x .x/ y D ; 4 jx .x/ yj3 jx .x/ yj3
(285)
'4 .x; y/ D k.x C .x/; y/ k.x .x/; y/ D
2
jx C .x/ yj2 .y/ 3 ..x C .x/ y/ .y// .x C .x/ y/ jx C .x/ yj5
jx .x/ yj2 .y/ 3 ..x .x/ y/ .y// .x .x/ y/ jx .x/ yj5
! :
(286)

We are mainly interested in constructing a multiscale analysis in the space $l^2(\Sigma)$ in terms of tensor scaling functions and wavelets. Hence, only the tensorial case, i.e., the tensor kernel functions $\boldsymbol{\Phi}^i_\tau$, $i = 1,\dots,6$, is discussed in the following.

Definition 5. For $\tau > 0$ and $i \in \{1,\dots,6\}$, the family $\{\boldsymbol{\Phi}^i_\tau\}_{\tau>0}$ of kernels $\boldsymbol{\Phi}^i_\tau : \Sigma\times\Sigma \to \mathbb{R}^{3\times 3}$ is called a $\Sigma$-tensor scaling function of type $i$. Moreover, for the case $\tau = 1$, the kernel $\boldsymbol{\Phi}^i_1 : \Sigma\times\Sigma \to \mathbb{R}^{3\times 3}$ is called the mother kernel of the $\Sigma$-tensor scaling function of type $i$.

Remark 2. Some graphical illustrations of the $\Sigma$-tensor scaling function of type $i = 5$ can be found in Appendix C.

By the use of the aforementioned tensor scaling functions, we are canonically able to introduce tensor wavelets on regular surfaces as follows (cf. Freeden and Mayer (2003)).

Definition 6. Let $\alpha : [0,\infty) \to \mathbb{R}$ be a positive weight function. For $\tau > 0$ and $i \in \{1,\dots,6\}$, the family $\{\boldsymbol{\Psi}^i_\tau\}_{\tau>0}$ of kernels $\boldsymbol{\Psi}^i_\tau : \Sigma\times\Sigma \to \mathbb{R}^{3\times 3}$ given by

$$\boldsymbol{\Psi}^i_\tau(x,y) = -\frac{1}{\alpha(\tau)}\,\frac{d}{d\tau}\boldsymbol{\Phi}^i_\tau(x,y), \qquad x,y \in \Sigma, \tag{287}$$

is called a family of $\Sigma$-tensor wavelet functions of type $i$. Moreover, in the case $\tau = 1$, the kernel $\boldsymbol{\Psi}^i_1 : \Sigma\times\Sigma \to \mathbb{R}^{3\times 3}$ is called the mother kernel of the $\Sigma$-tensor wavelet function of type $i$. Furthermore, the differential equation (287) is called the (scale continuous) $\Sigma$-scaling equation.

Remark 3. In the remainder of this work, we particularly choose $\alpha(\tau) = \tau^{-1}$. Of course, weight functions other than $\alpha(\tau) = \tau^{-1}$ can be chosen as well in (287).

The $\Sigma$-tensor wavelet functions defined in Definition 6 can be calculated explicitly using the representations of the $\Sigma$-tensor scaling functions given in Corollary 7. The explicit formulas for the $\Sigma$-tensor wavelet functions of type $i$, $\{\boldsymbol{\Psi}^i_\tau\}_{\tau>0}$, $i = 1,\dots,6$, and some graphical illustrations can be found in Appendix C.
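The action of a scaling function and of the wavelet obtained from the scaling equation (287) can be tried out numerically. The following is a minimal sketch for type $i = 5$ on the unit sphere, not the authors' implementation. It assumes the sign convention $k(x,y) = \frac{3}{4\pi}\,(x-y)\otimes(x-y)\,((x-y)\cdot\nu(y))\,|x-y|^{-5}$ for the double-layer kernel, the outward normal $\nu(y) = y$, and the weight $\alpha(\tau) = \tau^{-1}$; the helper names `sphere_quadrature`, `apply_phi5`, and `apply_psi5` are hypothetical. Under these assumptions, constant vector fields are reproduced by the scaling function for every $0 < \tau < 1$ and annihilated by the wavelet.

```python
import numpy as np


def sphere_quadrature(n_theta=64):
    """Product rule on the unit sphere: Gauss-Legendre in cos(theta), trapezoid in phi."""
    t, wt = np.polynomial.legendre.leggauss(n_theta)
    n_phi = 2 * n_theta
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    T, P = np.meshgrid(t, phi, indexing="ij")
    s = np.sqrt(1.0 - T**2)
    pts = np.stack([s * np.cos(P), s * np.sin(P), T], axis=-1).reshape(-1, 3)
    w = np.repeat(wt, n_phi) * (2.0 * np.pi / n_phi)
    return pts, w


def double_layer_kernel(x, y, nu_y):
    """Assumed convention: k(x,y) = 3/(4 pi) (r r^T) (r . nu(y)) / |r|^5 with r = x - y."""
    r = x - y
    dist = np.linalg.norm(r, axis=1)
    scal = 3.0 / (4.0 * np.pi) * np.einsum("nj,nj->n", r, nu_y) / dist**5
    return scal[:, None, None] * r[:, :, None] * r[:, None, :]


def apply_phi5(x, tau, f_vals, pts, w):
    """Quadrature of the integral of Phi^5_tau(x, y) f(y) over the unit sphere."""
    nu_x = x / np.linalg.norm(x)
    kern = (double_layer_kernel(x + tau * nu_x, pts, pts)
            - double_layer_kernel(x - tau * nu_x, pts, pts))
    return np.einsum("n,nij,nj->i", w, kern, f_vals)


def apply_psi5(x, tau, f_vals, pts, w, h=1e-4):
    """Wavelet Psi^5_tau = -tau d/dtau Phi^5_tau, approximated by a central difference."""
    d = (apply_phi5(x, tau + h, f_vals, pts, w)
         - apply_phi5(x, tau - h, f_vals, pts, w)) / (2.0 * h)
    return -tau * d


if __name__ == "__main__":
    pts, w = sphere_quadrature()
    c = np.array([1.0, -2.0, 0.5])
    x = np.array([1.0, 2.0, 2.0]) / 3.0
    f_vals = np.tile(c, (len(pts), 1))
    print(apply_phi5(x, 0.3, f_vals, pts, w))  # close to c under the stated convention
    print(apply_psi5(x, 0.3, f_vals, pts, w))  # close to the zero vector
```

The finite difference is used only to keep the sketch short; the explicit wavelet formulas would avoid the additional discretization error.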
Definition 7. Let $\{\boldsymbol{\Phi}^i_\tau\}_{\tau>0}$ be a $\Sigma$-tensor scaling function of type $i$ and $\{\boldsymbol{\Psi}^i_\tau\}_{\tau>0}$ be the associated $\Sigma$-tensor wavelet function of type $i$. Then, the associated $\Sigma$-wavelet transform of type $i$ is defined by $(WT)^{(i)} : l^2(\Sigma) \to l^2((0,\infty)\times\Sigma)$ with

$$(WT)^{(i)}(f)(\tau, x) = \int_\Sigma \boldsymbol{\Psi}^i_\tau(x,y) f(y)\,d\omega(y), \qquad \tau > 0,\ x \in \Sigma. \tag{288}$$

It is not difficult to see that

$$\boldsymbol{\Psi}^i_\tau = O(\tau^{-1}) \quad\text{for } \tau \to \infty \tag{289}$$

is valid for all $i \in \{1,\dots,6\}$. Hence, the convergence of the integrals in Theorem 18 can be guaranteed.

Theorem 18. Let $\{\boldsymbol{\Phi}^i_\tau\}_{\tau>0}$ be a $\Sigma$-tensor scaling function of type $i$. Suppose that the vector field $f$ is of class $l^2(\Sigma)$. Then the reconstruction formula

$$\int_0^\infty (WT)^{(i)}(f)(\tau,\cdot)\,\frac{d\tau}{\tau} = \begin{cases} V(0,0)f, & i = 1, \\ f, & i = 2,3,5,6, \\ 0, & i = 4, \end{cases} \tag{290}$$

holds true in the sense of the $\|\cdot\|_{l^2(\Sigma)}$-norm.

Proof. Let $R > 0$ be arbitrary. Integrating the scaling equation (287) from $R$ to $\infty$, we obtain the identity

$$\boldsymbol{\Phi}^i_R(x,y) = \int_R^\infty \boldsymbol{\Psi}^i_\tau(x,y)\,\frac{d\tau}{\tau}, \qquad (x,y) \in \Sigma\times\Sigma. \tag{291}$$

Observing Fubini's theorem, we get

$$\int_R^\infty (WT)^{(i)}(f)(\tau,\cdot)\,\frac{d\tau}{\tau} = \int_R^\infty \int_\Sigma \boldsymbol{\Psi}^i_\tau(\cdot,y) f(y)\,d\omega(y)\,\frac{d\tau}{\tau} = \int_\Sigma \int_R^\infty \boldsymbol{\Psi}^i_\tau(\cdot,y) f(y)\,\frac{d\tau}{\tau}\,d\omega(y) = \int_\Sigma \boldsymbol{\Phi}^i_R(\cdot,y) f(y)\,d\omega(y). \tag{292}$$

The limit $R \to 0$ in connection with Corollary 7 yields the desired result.

Next, our interest is to reformulate the $\Sigma$-wavelet transform and the reconstruction theorem by the use of dilated and shifted versions of the mother kernel. For that purpose, we introduce the $x$-translation and the $\tau$-dilation operator of a mother kernel as follows:
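$$T_x : \boldsymbol{\Psi}^i_1 \mapsto T_x\boldsymbol{\Psi}^i_1 = \boldsymbol{\Psi}^i_{1;x} = \boldsymbol{\Psi}^i_1(x,\cdot), \qquad x \in \Sigma, \tag{293}$$

$$D_\tau : \boldsymbol{\Psi}^i_1 \mapsto D_\tau\boldsymbol{\Psi}^i_1 = \boldsymbol{\Psi}^i_\tau, \qquad \tau > 0. \tag{294}$$

Consequently, it follows that

$$T_x D_\tau \boldsymbol{\Psi}^i_1 = T_x \boldsymbol{\Psi}^i_\tau = \boldsymbol{\Psi}^i_{\tau;x} = \boldsymbol{\Psi}^i_\tau(x,\cdot), \qquad x \in \Sigma,\ \tau > 0, \tag{295}$$

$i = 1,\dots,6$. In other words,

$$(WT)^{(i)}(f)(\tau; x) = \int_\Sigma \boldsymbol{\Psi}^i_{\tau;x}(y) f(y)\,d\omega(y). \tag{296}$$

Summarizing the results of Corollary 7 and Theorem 18, we obtain the following result.

Theorem 19. Let $\{\boldsymbol{\Phi}^i_\tau\}_{\tau>0}$ be a $\Sigma$-tensor scaling function of type $i$ and $\{\boldsymbol{\Psi}^i_\tau\}_{\tau>0}$ be the associated $\Sigma$-tensor wavelet function of type $i$. For all $x \in \Sigma$ and $f \in l^2(\Sigma)$, we have

$$\lim_{R\to 0,\ R>0} \int_\Sigma \boldsymbol{\Phi}^i_{R;x}(y) f(y)\,d\omega(y) = \begin{cases} V(0,0)f, & i = 1, \\ f, & i = 2,3,5,6, \\ 0, & i = 4, \end{cases} \tag{297}$$

and

$$\int_0^\infty \int_\Sigma \boldsymbol{\Psi}^i_{\tau;x}(y) f(y)\,d\omega(y)\,\frac{d\tau}{\tau} = \begin{cases} V(0,0)f, & i = 1, \\ f, & i = 2,3,5,6, \\ 0, & i = 4. \end{cases} \tag{298}$$

4.2 Scale Discretized Scaling Functions and Wavelets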
Until now, we were concerned with a scale continuous approach to tensor wavelets. In what follows, scale discretized $\Sigma$-tensor scaling functions and wavelets of type $i$ are introduced. We start with the choice of a sequence which divides the continuous scale interval $(0,\infty)$ into discrete pieces. More explicitly, $(\tau_j)_{j\in\mathbb{Z}}$ denotes a sequence of real numbers satisfying $\lim_{j\to\infty}\tau_j = 0$ and $\lim_{j\to-\infty}\tau_j = \infty$.

Remark 4. For example, the dyadic sequence $\tau_j = 2^{-j}$, $j \in \mathbb{Z}$, can be chosen. Note that in this case, $2\tau_{j+1} = \tau_j$ for all $j \in \mathbb{Z}$.

Given a $\Sigma$-tensor scaling function $\{\boldsymbol{\Phi}^i_\tau\}_{\tau>0}$ of type $i$, we define the (scale) discretized $\Sigma$-tensor scaling function of type $i$ by $\{\boldsymbol{\Phi}^i_{\tau_j}\}_{j\in\mathbb{Z}}$. In doing so, we immediately get the following result from Theorem 19.
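Theorem 20. Let $\{\boldsymbol{\Phi}^i_{\tau_j}\}_{j\in\mathbb{Z}}$ be a scale discretized $\Sigma$-tensor scaling function of type $i$; then, for all $f \in l^2(\Sigma)$, the identity

$$\lim_{j\to\infty} \int_\Sigma \boldsymbol{\Phi}^i_{\tau_j}(\cdot,y) f(y)\,d\omega(y) = \begin{cases} V(0,0)f, & i = 1, \\ f, & i = 2,3,5,6, \\ 0, & i = 4, \end{cases} \tag{299}$$

holds in the sense of the $\|\cdot\|_{l^2(\Sigma)}$-norm.

Our procedure canonically leads us to the following type of scale discretized tensor wavelets.

Definition 8. Let $\{\boldsymbol{\Phi}^i_{\tau_j}\}_{j\in\mathbb{Z}}$ be a discretized $\Sigma$-tensor scaling function of type $i$. Then the (scale) discretized $\Sigma$-tensor wavelet function of type $i$ is defined by

$$\boldsymbol{\Psi}^i_{\tau_j}(\cdot,\cdot) = \int_{\tau_{j+1}}^{\tau_j} \boldsymbol{\Psi}^i_\tau(\cdot,\cdot)\,\frac{d\tau}{\tau}, \qquad j \in \mathbb{Z}. \tag{300}$$

In connection with (287), it follows that

$$\boldsymbol{\Psi}^i_{\tau_j}(\cdot,\cdot) = -\int_{\tau_{j+1}}^{\tau_j} \frac{d}{d\tau}\boldsymbol{\Phi}^i_\tau(\cdot,\cdot)\,d\tau = \boldsymbol{\Phi}^i_{\tau_{j+1}}(\cdot,\cdot) - \boldsymbol{\Phi}^i_{\tau_j}(\cdot,\cdot). \tag{301}$$

Formula (301) is called the (scale) discretized $\Sigma$-tensor scaling equation of type $i$. Assume now that $f$ is a function of class $l^2(\Sigma)$. Observing the discretized $\Sigma$-tensor scaling equation, we get for all $J \in \mathbb{Z}$ and $N \in \mathbb{N}$

$$\int_\Sigma \boldsymbol{\Phi}^i_{\tau_{J+N}}(\cdot,y) f(y)\,d\omega(y) = \int_\Sigma \boldsymbol{\Phi}^i_{\tau_J}(\cdot,y) f(y)\,d\omega(y) + \sum_{j=J}^{J+N-1} \int_\Sigma \boldsymbol{\Psi}^i_{\tau_j}(\cdot,y) f(y)\,d\omega(y). \tag{302}$$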
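The telescoping structure behind (300)–(302) can be illustrated on the level of a single scalar scale profile. The sketch below is an illustration only: the profile `phi` is a hypothetical stand-in (it tends to 1 for $\tau \to 0$ and to 0 for $\tau \to \infty$) and is not one of the kernels of this chapter; `psi` is obtained from it by the scaling equation with $\alpha(\tau) = \tau^{-1}$, and the dyadic sequence $\tau_j = 2^{-j}$ of Remark 4 is used.

```python
import numpy as np


def phi(tau):
    """Hypothetical scalar scaling profile (stand-in, not a kernel from the text)."""
    return 1.0 / (1.0 + tau) ** 2


def psi(tau):
    """Wavelet profile psi = -tau * d(phi)/d(tau), i.e. the scaling equation with alpha = 1/tau."""
    return 2.0 * tau / (1.0 + tau) ** 3


def tau_j(j):
    """Dyadic scale sequence tau_j = 2^(-j)."""
    return 2.0 ** (-j)


def psi_discrete(j, n=2001):
    """Integral of psi(tau) d(tau)/tau over [tau_{j+1}, tau_j] by composite Simpson (n odd)."""
    a, b = tau_j(j + 1), tau_j(j)
    t = np.linspace(a, b, n)
    g = psi(t) / t
    h = (b - a) / (n - 1)
    return h / 3.0 * (g[0] + g[-1] + 4.0 * g[1:-1:2].sum() + 2.0 * g[2:-1:2].sum())


def multiscale_sum(J, N):
    """Right-hand side of the telescoping identity: phi(tau_J) plus N wavelet contributions."""
    return phi(tau_j(J)) + sum(psi_discrete(j) for j in range(J, J + N))


if __name__ == "__main__":
    J, N = -2, 6
    print(phi(tau_j(J + N)), multiscale_sum(J, N))  # the two numbers agree up to quadrature error
```

Each wavelet contribution adds exactly the detail between two consecutive scales, which is why the partial sums reproduce the finer scaling profile.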
Therefore, we are able to formulate the following corollary.

Corollary 8. Let $\{\boldsymbol{\Phi}^i_{\tau_j}\}_{j\in\mathbb{Z}}$ be a (scale) discretized $\Sigma$-tensor scaling function of type $i$ and $\{\boldsymbol{\Psi}^i_{\tau_j}\}_{j\in\mathbb{Z}}$ be the corresponding discretized $\Sigma$-tensor wavelet function of type $i$. Then the multiscale representation of a function $f \in l^2(\Sigma)$

$$\sum_{j=-\infty}^{+\infty} \int_\Sigma \boldsymbol{\Psi}^i_{\tau_j}(\cdot,y) f(y)\,d\omega(y) = \begin{cases} V(0,0)f, & i = 1, \\ f, & i = 2,3,5,6, \\ 0, & i = 4, \end{cases} \tag{303}$$

holds in the sense of the $\|\cdot\|_{l^2(\Sigma)}$-norm. Moreover, we have

$$P^i_{\tau_J}(f) + \sum_{j=J}^{+\infty} \int_\Sigma \boldsymbol{\Psi}^i_{\tau_j}(\cdot,y) f(y)\,d\omega(y) = \begin{cases} V(0,0)f, & i = 1, \\ f, & i = 2,3,5,6, \\ 0, & i = 4, \end{cases} \tag{304}$$

for every $J \in \mathbb{Z}$ (in the sense of the $\|\cdot\|_{l^2(\Sigma)}$-norm), where $P^i_{\tau_J}(f)$ is given by

$$P^i_{\tau_J}(f) = \int_\Sigma \boldsymbol{\Phi}^i_{\tau_J}(\cdot,y) f(y)\,d\omega(y). \tag{305}$$
The scale discretized $\Sigma$-tensor wavelets allow the following formulation:

$$T_x D_{\tau_j} \boldsymbol{\Psi}^i_1 = T_x \boldsymbol{\Psi}^i_{\tau_j} = \boldsymbol{\Psi}^i_{\tau_j;x} = \boldsymbol{\Psi}^i_{\tau_j}(x,\cdot) \tag{306}$$

for $i = 1,\dots,6$ and $x \in \Sigma$. Within this notational framework, the (scale) discretized $\Sigma$-tensor wavelet transform of type $i$, $i = 1,\dots,6$, can be defined by
.WT/.i /
8
0, i D 2; 3; 6, in (325)–(327) are remaining tensor kernels Ri D ˆ i ˆ given by R2˙ .x; y/
D .x ˙ .x/ y/ .y/ rxQ ˝ ryQ C .x y/ .y/
rxQ ˝ ryQ
R3˙ .x; y/ D .x ˙ .x/ y/ .x/
C .x y/ .x/
R6 .x; y/ D .x C .x/ y/ .x/
1 4 jxQ yj Q
rxQ ˝ ryQ
rxQ ˝ ryQ
C .x .x/ y/ .x/ rxQ ˝ ryQ
xDx˙ Q .x/; yDy Q
; xDx; Q yDy Q
(331)
1 4 jxQ yj Q
1 4 jxQ yj Q
rxQ ˝ ryQ
1 4 jxQ yj Q
xDx˙ Q .x/; yDy Q
; xDx; Q yDy Q
1 4 jxQ yj Q 1 4 jxQ yj Q
(332)
xDxC Q .x/; yDy Q
; xDx Q .x/; yDy Q
(333)

for $x, y \in \Sigma$ and $\tau > 0$. As before for the case $i = 5$, the well-known limit and jump relations of the single-layer potential corresponding to the Laplace equation (see Günter 1957) lead us to the conclusion that, for $i = 2,3,6$,

$$\lim_{\tau\to 0,\ \tau>0} \int_\Sigma \mathbf{R}^i_\tau(x,y) f(y)\,d\omega(y) = 0 \tag{334}$$

for all $f \in l^2(\Sigma)$ and all $x \in \Sigma$. Thus, the property of the tensor scaling function $\boldsymbol{\Phi}^i_\tau$, $i = 2,3,5,6$, of establishing an approximate identity results from the second part in (320) and (325)–(327), since the scalar kernels $\Phi^i_\tau$, $i = 2,3,5,6$, form a scalar approximate identity (see Freeden and Mayer 2003). The remaining tensor kernels $\mathbf{R}^i_\tau$, $i = 2,3,5,6$, vanish in the limit $\tau \to 0$. These observations motivate the following definition.
Definition 10. For $\tau > 0$ and $i \in \{2,3,5,6\}$, the family $\{\tilde{\boldsymbol{\Phi}}^i_\tau\}_{\tau>0}$ of kernels $\tilde{\boldsymbol{\Phi}}^i_\tau : \Sigma\times\Sigma \to \mathbb{R}^{3\times 3}$ defined by

$$\tilde{\boldsymbol{\Phi}}^i_\tau(x,y) = \Phi^i_\tau(x,y)\,\mathbf{i}, \qquad x,y \in \Sigma, \quad i = 2,3,5,6, \tag{335}$$

where the scalar kernels $\Phi^i_\tau$, $i = 2,3,5,6$, are given in (322) and (328)–(330), is called a $\Sigma$-tensor scaling function of the second kind of type $i$.

Graphical impressions of the $\Sigma$-tensor scaling function of the second kind $\tilde{\boldsymbol{\Phi}}^5_\tau$ are illustrated in Appendix C.

By the use of the aforementioned $\Sigma$-tensor scaling functions, we can define $\Sigma$-tensor wavelets of the second kind on regular surfaces as follows.

Definition 11. Let $\alpha : [0,\infty) \to \mathbb{R}$ be a positive weight function. For $\tau > 0$ and $i \in \{2,3,5,6\}$, the family $\{\tilde{\boldsymbol{\Psi}}^i_\tau\}_{\tau>0}$ of kernels $\tilde{\boldsymbol{\Psi}}^i_\tau : \Sigma\times\Sigma \to \mathbb{R}^{3\times 3}$ given by

$$\tilde{\boldsymbol{\Psi}}^i_\tau(x,y) = -\frac{1}{\alpha(\tau)}\,\frac{d}{d\tau}\tilde{\boldsymbol{\Phi}}^i_\tau(x,y), \qquad x,y \in \Sigma, \tag{336}$$

is called a family of $\Sigma$-tensor wavelet functions of the second kind of type $i$.

Graphical illustrations of the $\Sigma$-tensor wavelet function of the second kind $\tilde{\boldsymbol{\Psi}}^5_\tau$ can be found in Appendix C.

The tensor wavelet functions of the second kind defined in Definition 11 can be calculated explicitly using the representations of the tensor scaling functions of the second kind given in Definition 10. It easily follows that

$$\tilde{\boldsymbol{\Psi}}^i_\tau(x,y) = \Psi^i_\tau(x,y)\,\mathbf{i}, \qquad x,y \in \Sigma, \quad i = 2,3,5,6, \tag{337}$$

with $\Psi^i_\tau : \Sigma\times\Sigma \to \mathbb{R}$ given by

$$\Psi^i_\tau(x,y) = -\frac{1}{\alpha(\tau)}\,\frac{d}{d\tau}\Phi^i_\tau(x,y), \qquad x,y \in \Sigma, \tag{338}$$

where the scalar kernels $\Phi^i_\tau$, $i = 2,3,5,6$, are defined in (322) and (328)–(330). Explicit representations of the scalar kernels $\Psi^i_\tau$, $i = 2,3,5,6$, can be found in Freeden and Mayer (2003).

Starting from the definitions of $\Sigma$-tensor scaling functions and $\Sigma$-tensor wavelets of the second kind, the whole multiscale framework can be realized. In the same way as before, we can define a wavelet transform of the second kind by
$$\widetilde{(WT)}^{(i)} : l^2(\Sigma) \to l^2((0,\infty)\times\Sigma), \qquad \widetilde{(WT)}^{(i)}(f)(\tau, x) = \int_\Sigma \tilde{\boldsymbol{\Psi}}^i_\tau(x,y) f(y)\,d\omega(y), \tag{339}$$

and get the two important vectorial approximate identities, viz.,

$$\lim_{\tau\to 0,\ \tau>0} \int_\Sigma \tilde{\boldsymbol{\Phi}}^i_\tau(\cdot,y) f(y)\,d\omega(y) = f \tag{340}$$

and

$$\int_0^\infty \widetilde{(WT)}^{(i)}(f)(\tau,\cdot)\,\frac{d\tau}{\tau} = f, \tag{341}$$

for $i = 2,3,5,6$ and $f \in c^{(0)}(\Sigma)$. As in Sect. 4.3, scale and detail spaces can be introduced and a scale discretization can be performed analogously.
4.5 Spherical Multiresolution Analysis
Our purpose now is to establish a multiresolution analysis for the $\Sigma$-tensor scaling and wavelet functions. First, we characterize a multiresolution analysis of the space $l^2(\Sigma)$. We easily see that most of the properties of a multiresolution analysis, such as the linearity and the approximate identity, have already been shown earlier. The only feature which is not trivial is the property that the scale spaces form a nested sequence for decreasing scale $\tau$. Let us explain under which conditions a family of subspaces constitutes a multiresolution analysis of the space $l^2(\Sigma)$.

Definition 12. A family of subspaces $\{V_\tau(\Sigma)\}_{\tau\in(0,\infty)} \subset l^2(\Sigma)$ is called a multiresolution analysis if it satisfies the following properties:

(1) $\{0\} \subset V_\tau(\Sigma) \subset V_{\tau'}(\Sigma) \subset l^2(\Sigma)$ for $0 < \tau' < \tau < \infty$;

(2) $\bigcap_{\tau\in(0,\infty)} V_\tau(\Sigma) = \{ f \in l^2(\Sigma) \mid f \in V_\tau(\Sigma) \text{ for all } \tau \in (0,\infty) \} = \{0\}$;

(3) $\overline{\bigcup_{\tau\in(0,\infty)} V_\tau(\Sigma)}^{\|\cdot\|_{l^2(\Sigma)}} = \overline{\{ f \in l^2(\Sigma) \mid f \in V_\tau(\Sigma) \text{ for some } \tau \in (0,\infty) \}}^{\|\cdot\|_{l^2(\Sigma)}} = l^2(\Sigma)$.

Lemma 10 summarizes already known results.

Lemma 10. For the scale spaces $V^i_\tau(\Sigma)$, $i = 2,3,5,6$, of the $\Sigma$-tensor scaling function of type 2, 3, 5, and 6 defined in (312), respectively, the following statements are true:

(1) $V^i_\tau(\Sigma) \subset l^2(\Sigma)$ for all $\tau \in (0,\infty)$;

(2) $\bigcap_{\tau\in(0,\infty)} V^i_\tau(\Sigma) = \left\{ \lim_{\tau\to\infty} \int_\Sigma \boldsymbol{\Phi}^i_\tau(\cdot,y) f(y)\,d\omega(y) \;\middle|\; f \in l^2(\Sigma) \right\} = \{0\}$;

(3) $V^i_\tau(\Sigma)$ is a linear subspace of $l^2(\Sigma)$;

(4) $\overline{\bigcup_{\tau\in(0,\infty)} V^i_\tau(\Sigma)}^{\|\cdot\|_{l^2(\Sigma)}} = \overline{\{ f \in l^2(\Sigma) \mid f \in V^i_\tau(\Sigma) \text{ for some } \tau \in (0,\infty) \}}^{\|\cdot\|_{l^2(\Sigma)}} = l^2(\Sigma)$.

Proof. Property (1) is clear by the definition of the scale spaces. Moreover, statement (2) follows from the fact that the $\Sigma$-tensor scaling functions of type 2, 3, 5, and 6 tend to 0 for $\tau \to \infty$. Finally, property (3) is a result of the linearity of the integral, while (4) essentially consists of the statement of Corollary 7.

Remark 5. The above properties of the scale spaces remain true for the scale spaces defined by using the $\Sigma$-tensor scaling function of the second kind.

Next, we restrict our theory to a sphere $\Omega_\alpha$ with radius $\alpha > 0$ (cf. Freeden and Mayer (2003)). Let $\boldsymbol{\Phi}_\tau$ be one of the $\Sigma$-tensor scaling functions $\boldsymbol{\Phi}^i_\tau$, $i = 2,3,5,6$, or one of the $\Sigma$-tensor scaling functions of the second kind $\tilde{\boldsymbol{\Phi}}^i_\tau$, $i = 2,3,5,6$, which all form an approximate identity in the space $l^2(\Sigma)$. If we are able to verify that the tensor scaling function $\boldsymbol{\Phi}_\tau$ can be decomposed in the form
$$\boldsymbol{\Phi}_\tau(x,y) = \sum_{i,j=1}^{3} \sum_{n=0_i}^{\infty} \frac{2n+1}{4\pi\alpha^2}\,(\boldsymbol{\Phi}_\tau)^\wedge(n,\alpha,i,j)\;\mathbf{p}_n^{(i,j)}\!\left(\frac{x}{\alpha},\frac{y}{\alpha}\right), \qquad x,y \in \Omega_\alpha, \tag{342}$$

where $\mathbf{p}_n^{(i,j)}$ are the Legendre tensor fields as defined in Freeden and Schreiner (2009), and if we can, furthermore, show that the symbols $(\boldsymbol{\Phi}_\tau)^\wedge(n,\alpha,i,j)$, for fixed $i,j = 1,2,3$ and $n \in \mathbb{N}_{0_i}$, are monotonically increasing if $\tau$ decreases, then it follows that the corresponding scale spaces form a nested sequence. Indeed, this conclusion follows from the spherical theory of tensor scaling functions (as presented in Freeden et al. 1994, 1998). In the sequel, we want to restrict ourselves to the cases $i = 5,6$. The two other cases $i = 2,3$ building a vectorial approximate identity in $l^2(\Sigma)$ can be discussed in an analogous way. At the beginning of Sect. 4.4, we already saw that

$$\boldsymbol{\Phi}^5_{\pm\tau}(x,y) = \Phi^5_\tau(x,y)\,\mathbf{i} + \mathbf{R}^5_\tau(x,y), \qquad x,y \in \Sigma, \tag{343}$$

$$\boldsymbol{\Phi}^6_{\pm\tau}(x,y) = \Phi^6_\tau(x,y)\,\mathbf{i} + \mathbf{R}^6_\tau(x,y), \qquad x,y \in \Sigma, \tag{344}$$

where the scalar kernels $\Phi^5_\tau$ and $\Phi^6_\tau$ are given by
$$\Phi^5_\tau(x,y) = \frac{1}{4\pi} \left( \frac{(x+\tau\nu(x)-y)\cdot\nu(y)}{|x+\tau\nu(x)-y|^3} - \frac{(x-\tau\nu(x)-y)\cdot\nu(y)}{|x-\tau\nu(x)-y|^3} \right), \tag{345}$$

$$\Phi^6_\tau(x,y) = \frac{1}{4\pi} \left( \frac{(x+\tau\nu(x)-y)\cdot\nu(x)}{|x+\tau\nu(x)-y|^3} - \frac{(x-\tau\nu(x)-y)\cdot\nu(x)}{|x-\tau\nu(x)-y|^3} \right). \tag{346}$$
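As a plausibility check of the scalar kernels $\Phi^5_\tau$ and $\Phi^6_\tau$, one can integrate them over a sphere. The sketch below assumes $\Sigma$ is a sphere of radius $\alpha$ centered at the origin with outward normal $\nu(x) = x/\alpha$; the function names are hypothetical. By Gauss' law for the Newton potential, the integral of $\Phi^5_\tau(x,\cdot)$ should equal 1 and that of $\Phi^6_\tau(x,\cdot)$ should equal $\alpha^2/(\alpha+\tau)^2$ for $0 < \tau < \alpha$.

```python
import numpy as np


def sphere_quadrature(alpha, n_theta=64):
    """Nodes and weights on the sphere of radius alpha (Gauss-Legendre x trapezoid)."""
    t, wt = np.polynomial.legendre.leggauss(n_theta)
    n_phi = 2 * n_theta
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    T, P = np.meshgrid(t, phi, indexing="ij")
    s = np.sqrt(1.0 - T**2)
    unit = np.stack([s * np.cos(P), s * np.sin(P), T], axis=-1).reshape(-1, 3)
    w = np.repeat(wt, n_phi) * (2.0 * np.pi / n_phi) * alpha**2
    return alpha * unit, w


def phi5(x, tau, y, alpha):
    """Scalar kernel Phi^5_tau(x, y): normal taken at the integration point y."""
    nu_x, nu_y = x / alpha, y / alpha

    def term(p):
        r = p - y
        return np.einsum("nj,nj->n", r, nu_y) / np.linalg.norm(r, axis=1) ** 3

    return (term(x + tau * nu_x) - term(x - tau * nu_x)) / (4.0 * np.pi)


def phi6(x, tau, y, alpha):
    """Scalar kernel Phi^6_tau(x, y): normal taken at the evaluation point x."""
    nu_x = x / alpha

    def term(p):
        r = p - y
        return (r @ nu_x) / np.linalg.norm(r, axis=1) ** 3

    return (term(x + tau * nu_x) - term(x - tau * nu_x)) / (4.0 * np.pi)


if __name__ == "__main__":
    alpha, tau = 2.0, 0.5
    y, w = sphere_quadrature(alpha)
    x = alpha * np.array([1.0, 2.0, 2.0]) / 3.0
    print(w @ phi5(x, tau, y, alpha))                               # close to 1
    print(w @ phi6(x, tau, y, alpha), alpha**2 / (alpha + tau)**2)  # nearly equal
```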
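These two formulas were the motivation to call the kernels $\tilde{\boldsymbol{\Phi}}^5_\tau = \Phi^5_\tau(x,y)\,\mathbf{i}$ and $\tilde{\boldsymbol{\Phi}}^6_\tau = \Phi^6_\tau(x,y)\,\mathbf{i}$ Σ-tensor scaling functions of the second kind of type 5 and 6. To construct a decomposition of the tensor scaling functions $\boldsymbol{\Phi}^5_\tau$ and $\boldsymbol{\Phi}^6_\tau$ of the form (342), we have to deal with the two parts $\tilde{\boldsymbol{\Phi}}^i_\tau$ and $\mathbf{R}^i_\tau$, $i = 5, 6$, separately.

In the note Freeden and Mayer (2003), scalar scaling functions and wavelets have been developed from the single- and double-layer potentials and their normal derivatives corresponding to the scalar Laplace equation. The arising scalar-valued Σ-scaling functions, based on the jump relation of the double-layer potential, $\Phi^5_\tau$, and the jump relation of the normal derivative of the single-layer potential, $\Phi^6_\tau$, are exactly the ones which also appear in (322) and (330). Hence, for the case of the regular surface being a sphere $\Omega_\alpha$ with radius $\alpha > 0$, we are able to deduce that (see Freeden and Mayer 2003)
\[ \Phi^5_\tau(x,y) = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi\alpha^2}\,(\Phi^5_\tau)^\wedge(\alpha;n)\,P_n(t), \tag{347} \]
\[ \Phi^6_\tau(x,y) = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi\alpha^2}\,(\Phi^6_\tau)^\wedge(\alpha;n)\,P_n(t), \tag{348} \]
with $x, y \in \Omega_\alpha$, $t = \frac{x}{|x|}\cdot\frac{y}{|y|} = \frac{x\cdot y}{\alpha^2} \in [-1,1]$, where $\{P_n\}_{n\in\mathbb{N}_0}$ forms the sequence of the scalar Legendre polynomials. The spherical symbols $(\Phi^5_\tau)^\wedge(\alpha;n)$ and $(\Phi^6_\tau)^\wedge(\alpha;n)$ are given by
\[ (\Phi^5_\tau)^\wedge(\alpha;n) = \frac{1}{2n+1}\left[ n\left(\frac{\alpha}{\alpha+\tau}\right)^{n+1} + (n+1)\left(\frac{\alpha-\tau}{\alpha}\right)^{n} \right], \tag{349} \]
\[ (\Phi^6_\tau)^\wedge(\alpha;n) = \frac{1}{2n+1}\left[ n\left(\frac{\alpha-\tau}{\alpha}\right)^{n-1} + (n+1)\left(\frac{\alpha}{\alpha+\tau}\right)^{n+2} \right], \tag{350} \]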
for $n \in \mathbb{N}_0$, $\tau \in (0,\alpha)$ and $\alpha > 0$. This fact is remarkable, since scaling functions as well as spherical symbols are explicitly available. The proof of these equations essentially follows from the multipole decomposition of a single pole (fundamental solution of the Laplace operator):
\[ \frac{1}{4\pi}\frac{1}{|x-y|} = \frac{1}{4\pi}\sum_{n=0}^{\infty} \frac{|x|^n}{|y|^{n+1}}\,P_n(\xi\cdot\eta) \tag{351} \]
with $x = |x|\xi$, $y = |y|\eta$, $\xi, \eta \in \Omega$, $|x| < |y|$, and $|x + \tau\nu(x)| = \alpha + \tau$, $x \in \Omega_\alpha$, $\tau \in (-\alpha, \infty)$. It should be noted that the restriction $\tau < \alpha$ does not really matter, since we are interested in the limit $\tau \to 0$. Figure 1 shows the symbols $(\Phi^5_\tau)^\wedge(\alpha;n)$ and $(\Phi^6_\tau)^\wedge(\alpha;n)$ for the case $\alpha = 1$ and different values of the scale parameter $\tau$. For more details about this construction in the case of the Laplace equation, the reader is referred to Freeden and Mayer (2003).

Similar to the representation (342) in the tensorial case, (347) and (348) are the standard spectral representation for scalar spherical product kernels as discussed in Freeden (1998). For the symbols $(\Phi^i_\tau)^\wedge(\alpha;n)$ of the spherical kernels $\Phi^5_\tau$ and $\Phi^6_\tau$, respectively, we easily obtain the following properties:
(1) $(\Phi^5_\tau)^\wedge(\alpha;0) = 1$ for all $\tau \in (0,\alpha)$,
(2) $(\Phi^6_\tau)^\wedge(\alpha;0) = \frac{\alpha^2}{(\alpha+\tau)^2}$ for all $\tau \in (0,\alpha)$,
(3) $\lim_{\tau\to 0,\ \tau>0} (\Phi^i_\tau)^\wedge(\alpha;n) = 1$ for all $n \in \mathbb{N}_0$ and $i \in \{5,6\}$,
(4) $(\Phi^i_\tau)^\wedge(\alpha;n)$ is monotonically decreasing in $\tau$ for all $n \in \mathbb{N}_0$, $\tau \in (0,\alpha)$ and $i \in \{5,6\}$.
The first three points can be deduced from the definition of the symbols $(\Phi^5_\tau)^\wedge(\alpha;n)$ and $(\Phi^6_\tau)^\wedge(\alpha;n)$, and the fourth point can easily be verified by the facts that
\[ \frac{d}{d\tau}(\Phi^5_\tau)^\wedge(\alpha;n) = \frac{1}{2n+1}\left( -n(n+1)\frac{\alpha^{n+1}}{(\alpha+\tau)^{n+2}} - n(n+1)\frac{(\alpha-\tau)^{n-1}}{\alpha^{n}} \right) = -\frac{n(n+1)}{2n+1}\left( \frac{\alpha^{n+1}}{(\alpha+\tau)^{n+2}} + \frac{(\alpha-\tau)^{n-1}}{\alpha^{n}} \right) \le 0 \tag{352} \]
(with equality only for $n = 0$)
Fig. 1 The spherical symbols $(\Phi^5_\tau)^\wedge(1;n)$ and $(\Phi^6_\tau)^\wedge(1;n)$ of the scalar scaling functions $\Phi^5_\tau$ (left) and $\Phi^6_\tau$ (right) restricted to the unit sphere, plotted for $n = 0, \ldots, 100$ and different values of $\tau$ (the lowest graph corresponds to $\tau = 2^{-1}$, decreasing to $\tau = 2^{-5}$ for the highest one, respectively)
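and
\[ \frac{d}{d\tau}(\Phi^6_\tau)^\wedge(\alpha;n) = \frac{1}{2n+1}\left( -n(n-1)\frac{(\alpha-\tau)^{n-2}}{\alpha^{n-1}} - (n+1)(n+2)\frac{\alpha^{n+2}}{(\alpha+\tau)^{n+3}} \right) = -\frac{1}{2n+1}\left( \frac{n(n-1)(\alpha-\tau)^{n-2}}{\alpha^{n-1}} + \frac{(n+2)(n+1)\alpha^{n+2}}{(\alpha+\tau)^{n+3}} \right) < 0, \tag{353} \]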
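for all $n \in \mathbb{N}_0$ and $\tau \in (0,\alpha)$. By these properties, we are immediately able to deduce that the corresponding scale spaces form a scalar multiresolution analysis in the space $L^2(\Omega_\alpha)$. For a more detailed discussion concerning the idea of a continuous multiresolution analysis, the reader is referred to Freeden and Windheuser (1996); Freeden (1999); Freeden and Mayer (2003).

Let us return to the case of the Σ-tensor scaling functions of the second kind. Using the decomposition (347) and (348) and the vectorial Funk-Hecke formulas (as outlined in Freeden and Schreiner 2009), we can verify that, for $i = 5, 6$, for all $x \in \Omega_\alpha$, $n \in \mathbb{N}_0$, and $k = 1, \ldots, 2n+1$,
\[ \begin{aligned} \int_{\Omega_\alpha} \tilde{\boldsymbol{\Phi}}^i_\tau(x,y)\, y^{(1);\alpha}_{n,k}(y)\, d\omega(y) &= \int_{\Omega_\alpha} \Phi^i_\tau(x,y)\, y^{(1);\alpha}_{n,k}(y)\, d\omega(y) \\ &= \int_{\Omega_\alpha} \sum_{m=0}^{\infty} \frac{2m+1}{4\pi\alpha^2} (\Phi^i_\tau)^\wedge(\alpha;m)\, P_m\!\left(\frac{x\cdot y}{\alpha^2}\right) y^{(1);\alpha}_{n,k}(y)\, d\omega(y) \\ &= \left( \frac{n+1}{2n+1}(\Phi^i_\tau)^\wedge(\alpha;n+1) + \frac{n}{2n+1}(\Phi^i_\tau)^\wedge(\alpha;n-1) \right) y^{(1);\alpha}_{n,k}(x) \\ &\quad + \frac{\sqrt{n(n+1)}}{2n+1}\left( (\Phi^i_\tau)^\wedge(\alpha;n-1) - (\Phi^i_\tau)^\wedge(\alpha;n+1) \right) y^{(2);\alpha}_{n,k}(x). \end{aligned} \tag{354} \]
Furthermore, we have for all $x \in \Omega_\alpha$, $n \in \mathbb{N}$, and $k = 1, \ldots, 2n+1$,
\[ \begin{aligned} \int_{\Omega_\alpha} \tilde{\boldsymbol{\Phi}}^i_\tau(x,y)\, y^{(2);\alpha}_{n,k}(y)\, d\omega(y) &= \frac{\sqrt{n(n+1)}}{2n+1}\left( (\Phi^i_\tau)^\wedge(\alpha;n-1) - (\Phi^i_\tau)^\wedge(\alpha;n+1) \right) y^{(1);\alpha}_{n,k}(x) \\ &\quad + \left( \frac{n}{2n+1}(\Phi^i_\tau)^\wedge(\alpha;n+1) + \frac{n+1}{2n+1}(\Phi^i_\tau)^\wedge(\alpha;n-1) \right) y^{(2);\alpha}_{n,k}(x), \end{aligned} \tag{355} \]
and
\[ \int_{\Omega_\alpha} \tilde{\boldsymbol{\Phi}}^i_\tau(x,y)\, y^{(3);\alpha}_{n,k}(y)\, d\omega(y) = (\Phi^i_\tau)^\wedge(\alpha;n)\, y^{(3);\alpha}_{n,k}(x). \tag{356} \]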
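The properties (1)–(4) of the scalar symbols (349) and (350) can be checked numerically. The following is a minimal sketch, not part of the original text; the function names `symbol_phi5` and `symbol_phi6` are hypothetical, and the formulas are transcribed from (349) and (350).

```python
import numpy as np

def symbol_phi5(n, tau, alpha=1.0):
    """Spherical symbol (349) of Phi^5_tau on a sphere of radius alpha."""
    n = np.asarray(n, dtype=float)
    return (n * (alpha / (alpha + tau)) ** (n + 1)
            + (n + 1) * ((alpha - tau) / alpha) ** n) / (2 * n + 1)

def symbol_phi6(n, tau, alpha=1.0):
    """Spherical symbol (350) of Phi^6_tau on a sphere of radius alpha."""
    n = np.asarray(n, dtype=float)
    return (n * ((alpha - tau) / alpha) ** (n - 1)
            + (n + 1) * (alpha / (alpha + tau)) ** (n + 2)) / (2 * n + 1)

# Degrees 0..100 and scale parameters 2^-1, ..., 2^-5 as in Fig. 1.
degrees = np.arange(0, 101)
scales = 2.0 ** -np.arange(1, 6)
table5 = symbol_phi5(degrees[:, None], scales[None, :])  # shape (101, 5)
table6 = symbol_phi6(degrees[:, None], scales[None, :])
```

Each column of `table5` and `table6` corresponds to one curve of Fig. 1; smaller values of τ give symbols closer to 1, in line with properties (3) and (4).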
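On the other hand, if we assume that the tensor scaling functions $\tilde{\boldsymbol{\Phi}}^i_\tau$, $i = 5, 6$, admit a decomposition of type (342), we can easily show that if we convolve a Σ-tensor scaling function with a vector spherical harmonic of type 1, $y^{(1);\alpha}_{n,k}$, $n \in \mathbb{N}_0$, $k = 1, \ldots, 2n+1$ (for more details, see Freeden and Schreiner 2009; Freeden and Gutting 2013), we get
\[ \int_{\Omega_\alpha} \tilde{\boldsymbol{\Phi}}^i_\tau(\cdot,y)\, y^{(1);\alpha}_{n,k}(y)\, d\omega(y) = (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;1,1)\, y^{(1);\alpha}_{n,k} + (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;2,1)\, y^{(2);\alpha}_{n,k} + (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;3,1)\, y^{(3);\alpha}_{n,k}. \tag{357} \]
For the convolution with a vector spherical harmonic of type 2, $y^{(2);\alpha}_{n,k}$, $n \in \mathbb{N}$, $k = 1, \ldots, 2n+1$, we get
\[ \int_{\Omega_\alpha} \tilde{\boldsymbol{\Phi}}^i_\tau(\cdot,y)\, y^{(2);\alpha}_{n,k}(y)\, d\omega(y) = (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;1,2)\, y^{(1);\alpha}_{n,k} + (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;2,2)\, y^{(2);\alpha}_{n,k} + (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;3,2)\, y^{(3);\alpha}_{n,k}. \tag{358} \]
Finally, for the convolution with a vector spherical harmonic of type 3, $y^{(3);\alpha}_{n,k}$, $n \in \mathbb{N}$, $k = 1, \ldots, 2n+1$, we obtain
\[ \int_{\Omega_\alpha} \tilde{\boldsymbol{\Phi}}^i_\tau(\cdot,y)\, y^{(3);\alpha}_{n,k}(y)\, d\omega(y) = (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;1,3)\, y^{(1);\alpha}_{n,k} + (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;2,3)\, y^{(2);\alpha}_{n,k} + (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;3,3)\, y^{(3);\alpha}_{n,k}. \tag{359} \]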
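Since we know that the system $\{ y^{(i);\alpha}_{n,k} \mid i = 1,2,3;\ n \in \mathbb{N}_{0_i};\ k = 1, \ldots, 2n+1 \}$ of vector spherical harmonics (see Freeden and Schreiner 2009) is a closed and complete orthonormal system in the Hilbert space $l^2(\Omega_\alpha)$, we can conclude, comparing (354)–(356) with (357)–(359), that, for $i = 5, 6$,
\[ \tilde{\boldsymbol{\Phi}}^i_\tau(x,y) = \sum_{p,q=1}^{3} \sum_{n=0_p}^{\infty} \frac{2n+1}{4\pi\alpha^2}\, (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;p,q)\; p_n^{(p,q)}\!\left(\frac{x}{\alpha},\frac{y}{\alpha}\right), \quad x, y \in \Omega_\alpha, \tag{360} \]
with the symbols given by
\[ (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;1,1) = \frac{n+1}{2n+1}(\Phi^i_\tau)^\wedge(\alpha;n+1) + \frac{n}{2n+1}(\Phi^i_\tau)^\wedge(\alpha;n-1), \tag{361} \]
\[ (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;2,1) = \frac{\sqrt{n(n+1)}}{2n+1}\left( (\Phi^i_\tau)^\wedge(\alpha;n-1) - (\Phi^i_\tau)^\wedge(\alpha;n+1) \right) \tag{362} \]
\[ \phantom{(\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;2,1)} = (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;1,2), \tag{363} \]
\[ (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;2,2) = \frac{n}{2n+1}(\Phi^i_\tau)^\wedge(\alpha;n+1) + \frac{n+1}{2n+1}(\Phi^i_\tau)^\wedge(\alpha;n-1), \tag{364} \]
\[ (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;3,3) = (\Phi^i_\tau)^\wedge(\alpha;n), \tag{365} \]
\[ (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;1,3) = (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;3,1) = (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;2,3) = (\tilde{\boldsymbol{\Phi}}^i_\tau)^\wedge(n;\alpha;3,2) = 0, \tag{366} \]
for $n \in \mathbb{N}_0$. For the case $i = 5$, the nonvanishing symbols are illustrated for different values of the scale parameter $\tau$ in Fig. 2.

Finally, we turn our attention to the Σ-tensor scaling function $\boldsymbol{\Phi}^i_\tau$, $i = 5, 6$. In order to get a complete insight into the spectral behavior of the Σ-tensor scaling function $\boldsymbol{\Phi}^i_\tau$, $i = 5, 6$, restricted to a sphere $\Omega_\alpha$, we have to examine the spectral representation of the remaining tensor kernel $\mathbf{R}^i_\tau$, $i = 5, 6$, in more detail. This will be done exemplarily for the case $i = 5$. Restricting the variables $x$ and $y$ to a sphere $\Omega_\alpha$ with radius $\alpha > 0$, we have $\nu(x) = x/\alpha$ and $\nu(y) = y/\alpha$ such that we get, using the abbreviation $t = (x\cdot y)/\alpha^2 \in [-1,1]$,
\[ \begin{aligned} \mathbf{R}^5_\tau(x,y) &= \alpha(1-t)\left( \nabla_{\tilde x}\otimes\nabla_{\tilde y}\, \frac{1}{4\pi|\tilde x-\tilde y|}\,\Bigg|_{\substack{\tilde x = x(1+\tau/\alpha)\\ \tilde y = y}} - \nabla_{\tilde x}\otimes\nabla_{\tilde y}\, \frac{1}{4\pi|\tilde x-\tilde y|}\,\Bigg|_{\substack{\tilde x = x(1-\tau/\alpha)\\ \tilde y = y}} \right) \\ &\quad - \tau t\left( \nabla_{\tilde x}\otimes\nabla_{\tilde y}\, \frac{1}{4\pi|\tilde x-\tilde y|}\,\Bigg|_{\substack{\tilde x = x(1+\tau/\alpha)\\ \tilde y = y}} + \nabla_{\tilde x}\otimes\nabla_{\tilde y}\, \frac{1}{4\pi|\tilde x-\tilde y|}\,\Bigg|_{\substack{\tilde x = x(1-\tau/\alpha)\\ \tilde y = y}} \right) \\ &= \alpha(1-t)\,\mathbf{R}^{5,1}_\tau(x,y) - \tau t\,\mathbf{R}^{5,2}_\tau(x,y), \quad x, y \in \Omega_\alpha. \end{aligned} \tag{367} \]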
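For determining spectral representations of the tensor kernels $\mathbf{R}^{5,1}_\tau$ and $\mathbf{R}^{5,2}_\tau$, we again use the identity, for $x = |x|\xi$, $y = |y|\eta$, and $\xi, \eta \in \Omega$ with $|x| < |y|$,
\[ \frac{1}{4\pi}\frac{1}{|x-y|} = \frac{1}{4\pi}\sum_{n=0}^{\infty} \frac{|x|^n}{|y|^{n+1}}\,P_n(\xi\cdot\eta). \tag{368} \]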
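The expansion (368) is easy to confirm numerically for a fixed pair of points. The sketch below is an illustration only; the function names and the truncation degree `n_max` are assumptions, and the series is summed with NumPy's `legval`.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def single_pole(x, y):
    """Fundamental solution 1 / (4 pi |x - y|) of the Laplace operator."""
    return 1.0 / (4.0 * np.pi * np.linalg.norm(x - y))

def multipole_series(x, y, n_max=80):
    """Truncated right-hand side of (368); requires |x| < |y|."""
    rx, ry = np.linalg.norm(x), np.linalg.norm(y)
    t = np.dot(x, y) / (rx * ry)              # xi . eta
    n = np.arange(n_max + 1)
    coeffs = rx ** n / ry ** (n + 1)          # |x|^n / |y|^(n+1)
    return legval(t, coeffs) / (4.0 * np.pi)  # sum_n coeffs[n] * P_n(t)

x = np.array([0.3, -0.2, 0.4])   # |x| is about 0.54
y = np.array([0.0, 0.6, 0.8])    # |y| = 1
error = abs(single_pole(x, y) - multipole_series(x, y))
```

Because the terms decay like $(|x|/|y|)^n$, a moderate truncation degree suffices whenever $|x|$ is well separated from $|y|$; convergence slows down as $|x| \to |y|$.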
Applying the operator $\nabla_x \otimes \nabla_y$ results in the representation, for $|x| < |y|$ with $x = |x|\xi$, $y = |y|\eta$ and $\xi, \eta \in \Omega$,
\[ \begin{aligned} \nabla_x\otimes\nabla_y\, \frac{1}{4\pi}\frac{1}{|x-y|} = \frac{1}{4\pi}\sum_{n=0}^{\infty} \frac{|x|^{n-1}}{|y|^{n+2}} \Big( & -n(n+1)\,P_n(\xi\cdot\eta)\,\xi\otimes\eta + n\,\xi\otimes\nabla^*_\eta P_n(\xi\cdot\eta) \\ & - (n+1)\,\nabla^*_\xi P_n(\xi\cdot\eta)\otimes\eta + \nabla^*_\xi\otimes\nabla^*_\eta P_n(\xi\cdot\eta) \Big). \end{aligned} \tag{369} \]
Using the notation of Legendre tensor fields $p_n^{(i,j)}$ from Freeden and Schreiner (2009), this can be written in the form (note that $|x| < |y|$ is the case if $\tilde x = x - \tau\nu(x)$)
\[ \begin{aligned} \nabla_x\otimes\nabla_y\, \frac{1}{4\pi}\frac{1}{|x-y|} = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi} \frac{|x|^{n-1}}{|y|^{n+2}} \Big( & -\frac{n(n+1)}{2n+1}\, p_n^{(1,1)}(\xi,\eta) - \frac{(n+1)\sqrt{n(n+1)}}{2n+1}\, p_n^{(2,1)}(\xi,\eta) \\ & + \frac{n\sqrt{n(n+1)}}{2n+1}\, p_n^{(1,2)}(\xi,\eta) + \frac{n(n+1)}{2n+1}\, p_n^{(2,2)}(\xi,\eta) \Big). \end{aligned} \tag{370} \]

Fig. 2 The nonvanishing spherical symbols $(\tilde{\boldsymbol{\Phi}}^5_\tau)^\wedge(n;1;i,j)$ of the Σ-tensor scaling functions of the second kind $\tilde{\boldsymbol{\Phi}}^5_\tau$ restricted to the unit sphere $\Omega$, plotted for $n = 0, \ldots, 100$ in four panels, $(i,j) = (1,1), (1,2), (2,2), (3,3)$. The values for $\tau$ are given by $\tau = 2^{-J}$ with $J \in \{2,3,4,5\}$. The (diagonal) symbols $(\tilde{\boldsymbol{\Phi}}^5_\tau)^\wedge(n;1;i,i)$ for $i = 1,2,3$ tend to 1 for $\tau$ tending to 0, while the mixed symbol $(\tilde{\boldsymbol{\Phi}}^5_\tau)^\wedge(n;1;1,2)$ vanishes for $\tau \to 0$
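Furthermore, for the case $|y| < |x|$, we have with $x = |x|\xi$, $y = |y|\eta$, and $\xi, \eta \in \Omega$,
\[ \begin{aligned} \nabla_x\otimes\nabla_y\, \frac{1}{4\pi}\frac{1}{|x-y|} = \frac{1}{4\pi}\sum_{n=0}^{\infty} \frac{|y|^{n-1}}{|x|^{n+2}} \Big( & -n(n+1)\,P_n(\xi\cdot\eta)\,\xi\otimes\eta - (n+1)\,\xi\otimes\nabla^*_\eta P_n(\xi\cdot\eta) \\ & + n\,\nabla^*_\xi P_n(\xi\cdot\eta)\otimes\eta + \nabla^*_\xi\otimes\nabla^*_\eta P_n(\xi\cdot\eta) \Big). \end{aligned} \tag{371} \]
Using once more the Legendre tensor fields $p_n^{(i,j)}$ as introduced by Freeden and Schreiner (2009), this can be written in the form (note that $|y| < |x|$ is the case if $\tilde x = x + \tau\nu(x)$)
\[ \begin{aligned} \nabla_x\otimes\nabla_y\, \frac{1}{4\pi}\frac{1}{|x-y|} = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi} \frac{|y|^{n-1}}{|x|^{n+2}} \Big( & -\frac{n(n+1)}{2n+1}\, p_n^{(1,1)}(\xi,\eta) + \frac{n\sqrt{n(n+1)}}{2n+1}\, p_n^{(2,1)}(\xi,\eta) \\ & - \frac{(n+1)\sqrt{n(n+1)}}{2n+1}\, p_n^{(1,2)}(\xi,\eta) + \frac{n(n+1)}{2n+1}\, p_n^{(2,2)}(\xi,\eta) \Big). \end{aligned} \tag{372} \]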
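Summarizing our results, we obtain for the kernels $\mathbf{R}^{5,1}_\tau$ and $\mathbf{R}^{5,2}_\tau$
\[ \mathbf{R}^{5,1}_\tau(x,y) = \sum_{i,j=1}^{2}\sum_{n=0}^{\infty} \frac{2n+1}{4\pi\alpha^2}\, (\mathbf{R}^{5,1}_\tau)^\wedge(n;\alpha;i,j)\; p_n^{(i,j)}(\xi,\eta), \tag{373} \]
\[ \mathbf{R}^{5,2}_\tau(x,y) = \sum_{i,j=1}^{2}\sum_{n=0}^{\infty} \frac{2n+1}{4\pi\alpha^2}\, (\mathbf{R}^{5,2}_\tau)^\wedge(n;\alpha;i,j)\; p_n^{(i,j)}(\xi,\eta), \tag{374} \]
with $x = \alpha\xi$, $y = \alpha\eta \in \Omega_\alpha$, $\xi, \eta \in \Omega$, and the symbols given by
\[ (\mathbf{R}^{5,1}_\tau)^\wedge(n;\alpha;1,1) = -\frac{n(n+1)}{2n+1}\left( \frac{\alpha^{n+1}}{(\alpha+\tau)^{n+2}} - \frac{(\alpha-\tau)^{n-1}}{\alpha^{n}} \right), \tag{375} \]
\[ (\mathbf{R}^{5,1}_\tau)^\wedge(n;\alpha;1,2) = -\frac{\sqrt{n(n+1)}}{2n+1}\left( (n+1)\frac{\alpha^{n+1}}{(\alpha+\tau)^{n+2}} + n\frac{(\alpha-\tau)^{n-1}}{\alpha^{n}} \right), \tag{376} \]
\[ (\mathbf{R}^{5,1}_\tau)^\wedge(n;\alpha;2,1) = \frac{\sqrt{n(n+1)}}{2n+1}\left( n\frac{\alpha^{n+1}}{(\alpha+\tau)^{n+2}} + (n+1)\frac{(\alpha-\tau)^{n-1}}{\alpha^{n}} \right), \tag{377} \]
\[ (\mathbf{R}^{5,1}_\tau)^\wedge(n;\alpha;2,2) = \frac{n(n+1)}{2n+1}\left( \frac{\alpha^{n+1}}{(\alpha+\tau)^{n+2}} - \frac{(\alpha-\tau)^{n-1}}{\alpha^{n}} \right), \tag{378} \]
and
\[ (\mathbf{R}^{5,2}_\tau)^\wedge(n;\alpha;1,1) = -\frac{n(n+1)}{2n+1}\left( \frac{(\alpha-\tau)^{n-1}}{\alpha^{n}} + \frac{\alpha^{n+1}}{(\alpha+\tau)^{n+2}} \right), \tag{379} \]
\[ (\mathbf{R}^{5,2}_\tau)^\wedge(n;\alpha;1,2) = \frac{\sqrt{n(n+1)}}{2n+1}\left( n\frac{(\alpha-\tau)^{n-1}}{\alpha^{n}} - (n+1)\frac{\alpha^{n+1}}{(\alpha+\tau)^{n+2}} \right), \tag{380} \]
\[ (\mathbf{R}^{5,2}_\tau)^\wedge(n;\alpha;2,1) = \frac{\sqrt{n(n+1)}}{2n+1}\left( n\frac{\alpha^{n+1}}{(\alpha+\tau)^{n+2}} - (n+1)\frac{(\alpha-\tau)^{n-1}}{\alpha^{n}} \right), \tag{381} \]
\[ (\mathbf{R}^{5,2}_\tau)^\wedge(n;\alpha;2,2) = \frac{n(n+1)}{2n+1}\left( \frac{\alpha^{n+1}}{(\alpha+\tau)^{n+2}} + \frac{(\alpha-\tau)^{n-1}}{\alpha^{n}} \right), \tag{382} \]
for $n \in \mathbb{N}_0$, $\tau \in (0,\alpha)$ and $\alpha > 0$.

In order to establish a decomposition of the Σ-tensor scaling function $\boldsymbol{\Phi}^5_\tau$ of the type (342), we have to put together (367), (373), and (374) such that a decomposition of the remainder kernel $\mathbf{R}^5_\tau$ becomes available. This finally gives, in connection with the representation of the Σ-tensor scaling function of the second kind (360) using (320), the desired representation of $\boldsymbol{\Phi}^5_\tau$. However, we omit these calculations here, because the relation (367) in particular poses serious difficulties in establishing a representation of the form (342) for the remainder kernel function $\mathbf{R}^5_\tau$.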
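As a numerical illustration of the symbols (361)–(365) for $i = 5$, the following sketch evaluates the nonvanishing tensor symbols from the scalar symbol (349) and reproduces the qualitative behavior described for Fig. 2. The function names are hypothetical, and the code is restricted to degrees $n \ge 1$ so that the index $n - 1$ stays nonnegative.

```python
import numpy as np

def symbol_phi5(n, tau, alpha=1.0):
    """Scalar spherical symbol (349)."""
    n = np.asarray(n, dtype=float)
    return (n * (alpha / (alpha + tau)) ** (n + 1)
            + (n + 1) * ((alpha - tau) / alpha) ** n) / (2 * n + 1)

def tensor_symbols(n, tau, alpha=1.0):
    """Nonvanishing symbols (361)-(365) of the second-kind kernel, n >= 1."""
    n = np.asarray(n, dtype=float)
    lower = symbol_phi5(n - 1, tau, alpha)   # scalar symbol at degree n - 1
    upper = symbol_phi5(n + 1, tau, alpha)   # scalar symbol at degree n + 1
    s11 = ((n + 1) * upper + n * lower) / (2 * n + 1)
    s22 = (n * upper + (n + 1) * lower) / (2 * n + 1)
    s12 = np.sqrt(n * (n + 1)) * (lower - upper) / (2 * n + 1)
    s33 = symbol_phi5(n, tau, alpha)
    return {(1, 1): s11, (1, 2): s12, (2, 1): s12, (2, 2): s22, (3, 3): s33}

degrees = np.arange(1, 101)
coarse = tensor_symbols(degrees, tau=2.0 ** -2)   # largest scale used in Fig. 2
fine = tensor_symbols(degrees, tau=1e-6)          # close to the limit tau -> 0
```

For the small value of τ, the diagonal entries of `fine` are close to 1 and the mixed entry is close to 0, which matches the statement in the caption of Fig. 2.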
4.6
A Case Study in Meteorology: The Storm Kyrill
Kyrill is the name of a heavy winter storm which moved across Western and Central Europe on January 18 and 19, 2007. Kyrill caused severe damage in a wide area across Western and Central Europe, especially in the UK and Germany. In total, 47 fatalities were reported, as well as extensive disruptions of public transport, damage to buildings, and major forest damage through windthrow. The maximum wind speed of Kyrill, up to 225 km/h, was measured in the Swiss Alps. In the lowlands, the mean wind speed rose to 90–100 km/h with gusts of up to 130 km/h. In higher areas of Germany, for example in the Harz, the maximum wind speeds reached up to 190 km/h. The lowest measured pressure was 964.8 hPa, with an accompanying pressure gradient over Germany of up to 51 hPa, which is a rather high value for Central Europe.

Next, we apply the Σ-tensor scaling functions developed above to wind field data of the storm Kyrill. The data are hourly mean values sampled between January 17, 2007, and January 19, 2007, at up to 1370 stations in Central Europe. The data have kindly been provided by the Meteomedia AG, Gais, Switzerland (see also the contribution by Meteomedia in this work). The stations where the wind field is measured are illustrated in the left part of Fig. 3. To be more precise, let $\{u(x_i)\}_{i=1,\ldots,N}$ be the measurements of the mean wind field at the stations located at points $\{x_i\}_{i=1,\ldots,N}$; we assume the Earth's surface to be a sphere and hence neglect the altitude of the stations. Furthermore, let $\{y_j\}_{j=1,\ldots,M}$ be a system of equidistributed points located in the area of interest. The points $\{y_j\}$ are called nodal points in the following. The task is to find coefficients $\{a_j\}_{j=1,\ldots,M} \subset \mathbb{R}^3$ such that
\[ \sum_{j=1}^{M} \boldsymbol{\Phi}^i_\tau(x_i, y_j)\, a_j = u(x_i), \quad i = 1, \ldots, N, \tag{383} \]
where the scale parameter $\tau$ is determined a priori by a sophisticated rule of thumb depending on the distribution of the data points $\{x_i\}$.

Remark 6. On the unit sphere $\Omega$, we are able to choose all $\Omega$-tensor scaling functions which establish an approximate identity, i.e., we can arbitrarily choose $i \in \{2, 3, 5, 6\}$. In this section, we particularly take $i = 5$.

If the number of nodal points $M$ is equal to the number of data points $N$, which happens, for example, if we choose $\{y_j\}_{j=1,\ldots,M} = \{x_i\}_{i=1,\ldots,N}$, then the system (383) is square. For an irregular distribution of the data points as it is considered in our case (see Fig. 3), this approach is not feasible. The choice of the scale parameter $\tau$ would be very difficult, because on the one hand the data points in a very dense data area demand a very small $\tau$ near 0. On the other hand, in areas where the data distribution is very sparse, a large scale parameter has to be chosen such that enough data points lie in the effective support of one $\Omega$-tensor scaling function. To circumvent these problems, we decided to take $M < N$ such that the system (383) is overdetermined. The resulting equations are solved by a least-squares method. The major advantage is that we are free in the choice of the system of nodal points $\{y_j\}_{j=1,\ldots,M}$. In fact, we take the nodal points as a part of an equidistributed spiral grid system, where some nodal points in the area of interest have to be picked out by hand because of the irregular distribution of the data points. The resulting nodal system consists of 113 points which are illustrated in the right part of Fig. 3.

Figures 4 and 5 show a time series of reconstructions of the wind field between January 18, 17:00 UTC, and January 19, 02:00 UTC. This is the period when the strongest wind speeds occurred over Germany. The scale parameter of the Σ-tensor scaling function in the system (383) has been selected to be $\tau = 0.025$.
It is clearly visible that after 21:00 UTC the strongest wind field over the German coast has moved eastward and the wind speed decreased over the German Bight. This was one of the reasons that the storm tide, which had been forecast for Hamburg, failed to appear.

Fig. 3 Top: distribution of the data point system $\{x_i\}_{i=1,\ldots,N}$ of the stations where the wind field is measured. Bottom: distribution of the nodal point system $\{y_j\}_{j=1,\ldots,M}$ chosen in (383). The system is based on the so-called spiral grid system (see, e.g., Mayer 2007), where some points in the area of interest have to be picked out by hand because of the irregular distribution of the data points (e.g., over the North Sea)

Fig. 4 A time series of approximations of the wind field between January 18, 17:00 UTC and 22:00 UTC (six hourly panels covering 5°E–15°E and 50°N–55°N). The scale parameter of the $\Omega$-tensor scaling function $\boldsymbol{\Phi}^5_\tau$ in the system (383) has been selected to be $\tau = 0.025$, and the nodal system consists of 113 points shown in the right part of Fig. 3

Fig. 5 A time series of approximations of the wind field between January 18, 23:00 UTC, and January 19, 02:00 UTC (four hourly panels covering 5°E–15°E and 50°N–55°N). The scale parameter of the $\Omega$-tensor scaling function $\boldsymbol{\Phi}^5_\tau$ in the system (383) has been selected to be $\tau = 0.025$, and the nodal system consists of 113 points shown in the right part of Fig. 3

In the following, we want to compare our approximations calculated from local wind field data over Central Europe with the data of a global wind field model. To this end, we took wind field data of the global 4 times daily NCEP/NCAR Reanalysis provided by the NOAA-CIRES Climate Diagnostic Center, University of Colorado, Boulder, USA. The data are gridded on a $71 \times 144$ equiangular longitude-latitude grid, which results in a number of $N = 10{,}224$ global data points. As before, our aim is to determine coefficients $\{a_j\}_{j=1,\ldots,M} \subset \mathbb{R}^3$ such that (383) holds true, where the nodal points $\{y_j\}_{j=1,\ldots,M}$ are chosen to be the points of a spiral grid with $M = 2{,}000$ points. A value of $\tau = 0.25$ turned out to be appropriate for this number of nodal points. The resulting overdetermined system has $3M = 6{,}000$ unknowns and a right-hand side with $3N = 30{,}672$ entries. In double precision, this needs a total memory of 1,404 MB.
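The memory figure quoted above can be reproduced by a short calculation. The sketch assumes that the full dense system matrix with $3N$ rows and $3M$ columns is stored with 8 bytes per entry and that "MB" is meant as $2^{20}$ bytes; both are assumptions, since the text does not spell them out.

```python
# Size of the dense least-squares matrix for the global NCEP/NCAR example.
n_data = 71 * 144          # N: number of grid points of the reanalysis
m_nodes = 2000             # M: number of spiral-grid nodal points
rows = 3 * n_data          # three wind components per data point
cols = 3 * m_nodes         # three coefficient components per nodal point
bytes_per_entry = 8        # IEEE double precision

total_bytes = rows * cols * bytes_per_entry
total_mib = total_bytes / 2 ** 20
print(f"{rows} x {cols} matrix, {total_mib:.0f} MiB")
```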
Fig. 6 Approximations of the wind field at January 18, 18:00 UTC (left), and January 19, 00:00 UTC (right), each covering 5°E–15°E and 50°N–55°N. The scale parameter of the $\Omega$-tensor scaling function $\boldsymbol{\Phi}^5_\tau$ has been selected to be $\tau = 0.25$, and the nodal system consists of 2,000 points of a spiral grid. This approximation has been calculated from wind field data of the global 4 times daily NCEP/NCAR Reanalysis
Figure 6 shows the approximation of the 4 times daily NCEP/NCAR Reanalysis wind field for January 18, 18:00 UTC, and for January 19, 00:00 UTC. In this reconstruction, too, it can clearly be seen that by January 19, 00:00 UTC, the major storm field has moved eastward, as already observed in the approximation of the wind field measurements.
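The least-squares treatment of the overdetermined system (383) used in this case study can be sketched on the unit sphere as follows. This is an illustration under stated assumptions, not the authors' implementation: the kernel is taken as $\boldsymbol{\Phi}^5_\tau(x,y) = \frac{3}{4\pi}\big( z_+\otimes z_+\,(z_+\cdot\nu(y))/|z_+|^5 - z_-\otimes z_-\,(z_-\cdot\nu(y))/|z_-|^5 \big)$ with $z_\pm = x \pm \tau\nu(x) - y$ and $\nu(x) = x$, a Fibonacci-type lattice stands in for the spiral grid of Mayer (2007), and the data are synthetic. All function names are hypothetical.

```python
import numpy as np

def fibonacci_sphere(m):
    """Quasi-uniform points on the unit sphere (stand-in for a spiral grid)."""
    k = np.arange(m) + 0.5
    z = 1.0 - 2.0 * k / m
    phi = np.pi * (1.0 + 5.0 ** 0.5) * k
    r = np.sqrt(1.0 - z * z)
    return np.column_stack((r * np.cos(phi), r * np.sin(phi), z))

def phi5_tensor(x, y, tau):
    """Assumed form of the tensor kernel of type 5 on the unit sphere."""
    def term(z):
        return np.outer(z, z) * np.dot(z, y) / np.linalg.norm(z) ** 5
    return 3.0 / (4.0 * np.pi) * (term((1 + tau) * x - y) - term((1 - tau) * x - y))

def assemble(xs, ys, tau):
    """Block matrix of (383): one 3 x 3 block per (data point, nodal point)."""
    a = np.zeros((3 * len(xs), 3 * len(ys)))
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            a[3 * i:3 * i + 3, 3 * j:3 * j + 3] = phi5_tensor(x, y, tau)
    return a

rng = np.random.default_rng(0)
xs = rng.normal(size=(60, 3))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)   # N = 60 irregular data points
ys = fibonacci_sphere(20)                         # M = 20 nodal points, M < N
tau = 0.5

A = assemble(xs, ys, tau)
a_true = rng.normal(size=3 * len(ys))             # synthetic coefficients
u = A @ a_true                                    # synthetic "measurements"
a_hat, *_ = np.linalg.lstsq(A, u, rcond=None)     # least-squares solution
rel_residual = np.linalg.norm(A @ a_hat - u) / np.linalg.norm(u)
```

With real station data the right-hand side is not exactly in the range of the matrix, so the residual does not vanish; the choice of τ and of the nodal points then controls the trade-off between resolution and stability discussed above.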
Appendices
We conclude our work with a list of appendices.
A
Regular Surfaces
First, the main geometrical reference object discussed in this work, i.e., a regular surface, is defined, and certain properties are explained in more detail (see also Müller 1969; Freeden and Gerhards 2013):

Definition 13. A subset $\Sigma \subset \mathbb{R}^3$ is called a regular surface in $\mathbb{R}^3$ if the following properties are fulfilled:
1. $\Sigma$ is a closed and compact surface free of double points.
2. $\Sigma$ divides the Euclidean space $\mathbb{R}^3$ into the bounded inner region $\Sigma_{\mathrm{int}}$ and the unbounded outer region $\Sigma_{\mathrm{ext}}$ with $\mathbb{R}^3 = \Sigma_{\mathrm{int}} \,\dot\cup\, \Sigma \,\dot\cup\, \Sigma_{\mathrm{ext}}$.
3. The origin is in $\Sigma_{\mathrm{int}}$, $0 \in \Sigma_{\mathrm{int}}$.
4. $\Sigma$ is locally of class $C^{(2)}$.

The fourth property means that, for each point $x \in \Sigma$, there exists a neighborhood $U(x) \subset \mathbb{R}^3$ of $x$ such that $\Sigma \cap U(x)$ can be mapped bijectively onto an open subset $V \subset \mathbb{R}^2$ and that this mapping is twice continuously differentiable. The fourth property of Definition 13 is equivalent to the existence of a continuously differentiable unit normal field $\nu$ on $\Sigma$ pointing, by definition, into the outer space $\Sigma_{\mathrm{ext}}$. Examples of a regular surface are the sphere $\Omega_R$ with radius $R > 0$, the ellipsoid, and, as a geoscientifically relevant example, the real (regular) Earth's surface (obtained by modern GPS technology).

Definition 14. Let $\nu : \Sigma \to \mathbb{R}^3$ denote the unit normal field on $\Sigma$. Then the set
\[ \Sigma(\tau) = \{ x \in \mathbb{R}^3 \mid x = y + \tau\nu(y),\ y \in \Sigma \} \tag{384} \]
generates a parallel surface which is exterior to $\Sigma$ for $\tau > 0$ and interior for $\tau < 0$. It is well known (see, e.g., Müller 1969; Freeden and Gerhards 2013) that if $|\tau|$ is sufficiently small, then the regularity of $\Sigma$ implies the regularity of $\Sigma(\tau)$. According to our regularity assumptions, imposed on $\Sigma$, the functions
\[ (x,y) \mapsto \frac{|\nu(x) - \nu(y)|}{|x-y|}, \quad (x,y) \in \Sigma\times\Sigma,\ x \ne y, \tag{385} \]
and
\[ (x,y) \mapsto \frac{|\nu(x)\cdot(x-y)|}{|x-y|^2}, \quad (x,y) \in \Sigma\times\Sigma,\ x \ne y, \tag{386} \]
are bounded. Hence, there exists a constant $M > 0$ such that, for all $x, y \in \Sigma$,
\[ |\nu(x) - \nu(y)| \le M\,|x-y|, \tag{387} \]
\[ |\nu(x)\cdot(x-y)| \le M\,|x-y|^2. \tag{388} \]
Moreover, it is easy to see that
\[ \inf_{x,y\in\Sigma} |x + \tau\nu(x) - (y + \sigma\nu(y))| = |\tau - \sigma| \tag{389} \]
provided that $|\tau|$ and $|\sigma|$ are sufficiently small. In order to separate members of the class $c(\Sigma)$ of continuous vector fields on $\Sigma$ into their tangential and normal parts with respect to a regular surface, we introduce the projection operators $p_{\mathrm{nor}}$ and $p_{\mathrm{tan}}$ by
\[ p_{\mathrm{nor}} f(x) = (f(x)\cdot\nu(x))\,\nu(x), \quad x \in \Sigma,\ f \in c(\Sigma), \tag{390} \]
\[ p_{\mathrm{tan}} f(x) = f(x) - p_{\mathrm{nor}} f(x), \quad x \in \Sigma,\ f \in c(\Sigma). \tag{391} \]
Hence, the corresponding subspaces of $c(\Sigma)$ are given by
\[ c_{\mathrm{nor}}(\Sigma) = \{ f \in c(\Sigma) \mid f = p_{\mathrm{nor}} f \}, \tag{392} \]
\[ c_{\mathrm{tan}}(\Sigma) = \{ f \in c(\Sigma) \mid f = p_{\mathrm{tan}} f \}. \tag{393} \]
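For the unit sphere, where $\nu(x) = x$, the projections (390) and (391) can be illustrated with a few lines of NumPy. This is a minimal sketch with hypothetical function names; it only demonstrates the pointwise splitting of a sampled vector field into normal and tangential parts.

```python
import numpy as np

def p_nor(f, nu):
    """Normal part (390): (f . nu) nu, evaluated row by row."""
    return np.sum(f * nu, axis=-1, keepdims=True) * nu

def p_tan(f, nu):
    """Tangential part (391): f - p_nor f."""
    return f - p_nor(f, nu)

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 3))
nu = x / np.linalg.norm(x, axis=1, keepdims=True)   # unit normals of the sphere
f = rng.normal(size=(5, 3))                         # arbitrary vector field samples

f_nor, f_tan = p_nor(f, nu), p_tan(f, nu)
```

At every sample point, `f_nor + f_tan` recovers `f`, and the two parts are orthogonal in the Euclidean inner product.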
The spaces $c^{(p)}_{\mathrm{nor}}(\Sigma)$ and $c^{(p)}_{\mathrm{tan}}(\Sigma)$, $0 \le p \le \infty$, are definable in the same fashion. The set of vector fields $f : \Sigma \to \mathbb{R}^3$ which are measurable and for which
\[ \|f\|_{l^p(\Sigma)} = \left( \int_\Sigma |f(x)|^p\, d\omega(x) \right)^{1/p} < \infty \tag{394} \]
is denoted by $l^p(\Sigma)$, where $d\omega(x)$ denotes the surface element on $\Sigma$ (note that in the case of $\Sigma = \Omega_R$ with radius $R > 0$, we write $d\omega_R(x)$ instead of $d\omega(x)$, and $d\omega$ instead of $d\omega_1$ in the case $R = 1$). The definition of the normal and the tangential operator can be extended in a canonical way to vector fields in $l^2(\Sigma)$ by a density argument. Hence, we define
\[ l^2_{\mathrm{nor}}(\Sigma) = \{ f \in l^2(\Sigma) \mid f = p_{\mathrm{nor}} f \}, \tag{395} \]
\[ l^2_{\mathrm{tan}}(\Sigma) = \{ f \in l^2(\Sigma) \mid f = p_{\mathrm{tan}} f \}. \tag{396} \]
Clearly, we have the orthogonal decomposition
\[ l^2(\Sigma) = l^2_{\mathrm{nor}}(\Sigma) \oplus l^2_{\mathrm{tan}}(\Sigma). \tag{397} \]

B
Kernel Functions
When we introduce layer potentials with respect to a regular surface, scalar- and tensor-valued kernel functions defined on the regular surface are of particular importance. Thus, they are discussed in the following.

Definition 15. Let $\Sigma$ be a regular surface. A bivariate scalar kernel function $K : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ is called weakly continuous (the term weakly singular is used synonymously below) if $K$ is continuous for all $x, y \in \Sigma$ with $x \ne y$, and there exist positive constants $M$ and $0 < \alpha \le 2$ such that, for all $x, y \in \Sigma$, $x \ne y$, we have
\[ |K(x,y)| \le M\, \frac{1}{|x-y|^{2-\alpha}}. \tag{398} \]
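As a brief illustration (added here, not part of the original text), the kernel of the single-layer potential of the Laplace equation satisfies (398) with the constant $1/(4\pi)$ and the exponent $\alpha = 1$:

\[ K(x,y) = \frac{1}{4\pi}\frac{1}{|x-y|} \quad\Longrightarrow\quad |K(x,y)| = \frac{1}{4\pi}\,\frac{1}{|x-y|^{2-1}}, \qquad x, y \in \Sigma,\ x \ne y. \]

The kernel of the double-layer potential looks more singular at first sight, but the estimate (388), applied with the roles of $x$ and $y$ interchanged, gives, with the constant $M$ from (388),

\[ \left| \frac{1}{4\pi}\frac{(x-y)\cdot\nu(y)}{|x-y|^3} \right| \le \frac{1}{4\pi}\,\frac{M\,|x-y|^2}{|x-y|^3} = \frac{M}{4\pi}\,\frac{1}{|x-y|}, \]

so that it also fulfills (398) with $\alpha = 1$ on a regular surface.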
The pair $\langle C^{(0)}(\Sigma), C^{(0)}(\Sigma) \rangle$ with $\Sigma$ being a regular surface, together with the $L^2(\Sigma)$-inner product, is a dual system. Thus, the first requirement of the theorem of Fredholm (see, e.g., Kress 1989; Heuser 1992) is fulfilled. To finally apply this theorem, we need compact operators on the space $C^{(0)}(\Sigma)$.

Theorem 22. Let $\Sigma$ be a regular surface and let the integral operator $A : C^{(0)}(\Sigma) \to C^{(0)}(\Sigma)$ be given by
\[ (AF)(x) = \int_\Sigma K(x,y)\,F(y)\, d\omega(y), \quad x \in \Sigma, \tag{399} \]
where the kernel $K$ is continuous or weakly singular. Then the operator $A$ is compact on $C^{(0)}(\Sigma)$.

For a proof of this theorem, the reader is referred to, e.g., Kupradze (1965) and Kress (1989).

Theorem 23. Let $\Sigma$ be a regular surface. Assume the kernel $K$ to be weakly continuous with constant $\alpha$. Furthermore, let us assume that there exists an $N \in \mathbb{N}$ and a constant $M > 0$ such that
\[ |K(x_1,y) - K(x_2,y)| \le M \sum_{j=1}^{N} \frac{|x_1-x_2|^j}{|x_1-y|^{2+j-\alpha}} \tag{400} \]
for all $x_1, x_2 \in \mathbb{R}^3$ and $y \in \Sigma$ with $2|x_1-x_2| \le |x_1-y|$. Then the scalar potential $U : \mathbb{R}^3 \to \mathbb{R}$ formally defined by
\[ U(x) = \int_\Sigma K(x,y)\,F(y)\, d\omega(y), \quad x \in \mathbb{R}^3, \tag{401} \]
with layer density $F \in C^{(0)}(\Sigma)$ belongs to the Hölder space $C^{(0,\beta)}(\mathbb{R}^3)$ for all $0 < \beta \le \alpha$ if $0 < \alpha < 1$, for all $0 < \beta < 1$ if $\alpha = 1$, and for all $0 < \beta \le 1$ if $1 < \alpha < 2$.

Theorem 24. Let $\Sigma$ be a regular surface and let $x_0 \in \Sigma$. Assume the kernel $K$ to be continuous for all $x \in D_0$, $y \in \Sigma$, $x \ne y$ with $D_0$ given by
\[ D_0 = \{ y = x + \tau\nu(x) \mid x \in \Sigma,\ |\tau| \le |\tau_0| \}, \tag{402} \]
and assume that there exists a constant $C > 0$ such that for all $x \in D_0$, $y \in \Sigma$, $x \ne y$, we have
\[ |K(x,y)| \le \frac{C}{|x-y|^2}. \tag{403} \]
Furthermore, let us assume that there exists an $N \in \mathbb{N}$ such that
\[ |K(x_1,y) - K(x_2,y)| \le C \sum_{j=1}^{N} \frac{|x_1-x_2|^j}{|x_1-y|^{2+j}} \tag{404} \]
for all $x_1, x_2 \in D_0$, $y \in \Sigma$, $x \ne y$ with $2|x_1-x_2| \le |x_1-y|$, and that
\[ \left| \int_{\Sigma\setminus(B_r(z)\cap\Sigma)} K(x,y)\, d\omega(y) \right| \le C \tag{405} \]
for all $z \in \Sigma$, $x \in D_0$ and for all $0 < r < R$, where $R$ is chosen sufficiently small such that $B_R(z) \cap \Sigma$ is still connected. We formally define, for $F \in C^{(0,\alpha)}(\Sigma)$,
\[ U(x) = \int_\Sigma K(x,y)\,\big(F(y) - F(z)\big)\, d\omega(y), \quad x \in D_0. \tag{406} \]
Then the potential $U$ is continuous and belongs to the Hölder space $C^{(0,\alpha)}(D_0)$.

The pair $\langle c^{(0)}(\Sigma), c^{(0)}(\Sigma) \rangle$ together with the $l^2(\Sigma)$-inner product is a dual system. In order to use the theorem of Fredholm, we finally need compact operators on the space $c^{(0)}(\Sigma)$. It is clear that Definition 15 and Theorems 22 and 23 can canonically be extended to the case of a tensor kernel function $\mathbf{k}$.

Definition 16. Let $\Sigma$ be a regular surface. A tensorial kernel function $\mathbf{k} : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^{3\times 3}$ is said to be weakly continuous if $\mathbf{k}$ is defined and continuous for all $x, y \in \Sigma$ with $x \ne y$, and there exist positive constants $M$ and $0 < \alpha \le 2$ such that for all $x, y \in \Sigma$, $x \ne y$, we have
\[ |\mathbf{k}(x,y)| \le M\, \frac{1}{|x-y|^{2-\alpha}}. \tag{407} \]
Corollary 9. Let $\Sigma$ be a regular surface.
1. Let the integral operator $A : c^{(0)}(\Sigma) \to c^{(0)}(\Sigma)$ be given by
\[ (Af)(x) = \int_\Sigma \mathbf{k}(x,y)\,f(y)\, d\omega(y), \quad x \in \Sigma, \tag{408} \]
where the tensor kernel $\mathbf{k}$ is continuous or weakly singular. Then the operator $A$ is compact on $c^{(0)}(\Sigma)$.
2. Let us assume that the tensor kernel $\mathbf{k}$ is weakly continuous with constant $\alpha$. Furthermore, let us assume that there exists $N \in \mathbb{N}$ and a constant $M > 0$ such that
\[ |\mathbf{k}(x_1,y) - \mathbf{k}(x_2,y)| \le M \sum_{j=1}^{N} \frac{|x_1-x_2|^j}{|x_1-y|^{2+j-\alpha}} \tag{409} \]
for all $x_1, x_2 \in \mathbb{R}^3$ and $y \in \Sigma$ with $2|x_1-x_2| \le |x_1-y|$. Then the vector potential $u : \mathbb{R}^3 \to \mathbb{R}^3$ defined by
\[ u(x) = \int_\Sigma \mathbf{k}(x,y)\,f(y)\, d\omega(y), \quad x \in \mathbb{R}^3, \tag{410} \]
with layer density $f \in c^{(0)}(\Sigma)$ is an element of the Hölder space $c^{(0,\beta)}(\mathbb{R}^3)$ with the same relations between $\beta$ and $\alpha$ as given in Theorem 23.
3. Let us assume that the tensor kernel $\mathbf{k}$ is continuous for all $x \in D_0$, $y \in \Sigma$, $x \ne y$, where $D_0$ is defined in Theorem 24, and let us assume that there exists a constant $C > 0$ such that for all $x \in D_0$, $y \in \Sigma$, $x \ne y$, we have
\[ |\mathbf{k}(x,y)| \le \frac{C}{|x-y|^2}. \tag{411} \]
Furthermore, assume that there exists $N \in \mathbb{N}$ such that
\[ |\mathbf{k}(x_1,y) - \mathbf{k}(x_2,y)| \le C \sum_{j=1}^{N} \frac{|x_1-x_2|^j}{|x_1-y|^{2+j}} \tag{412} \]
for all $x_1, x_2 \in D_0$, $y \in \Sigma$, $x \ne y$ with $2|x_1-x_2| \le |x_1-y|$, and that
\[ \left| \int_{\Sigma\setminus(B_r(z)\cap\Sigma)} \mathbf{k}(x,y)\, d\omega(y) \right| \le C \tag{413} \]
for all $z \in \Sigma$, $x \in D_0$ and for all $0 < r < R$. We define, for $f \in c^{(0,\alpha)}(\Sigma)$, the vector potential $u$ by
\[ u(x) = \int_\Sigma \mathbf{k}(x,y)\,\big(f(y) - f(z)\big)\, d\omega(y), \quad x \in D_0. \tag{414} \]
Then the vector potential $u$ is continuous and belongs to the space $c^{(0,\alpha)}(D_0)$.
C
Scaling Functions and Wavelets
In the following, we present some helpful auxiliary material. In particular, we are interested in the explicit representations of the †-tensor scaling functions and wavelets presented in Sect. 4.1 as well as those of the second kind as introduced in Sect. 4.4. Furthermore, we give some graphical illustrations of the tensor scaling functions and wavelets.
C.1
Scaling Functions
Tensorial scaling functions on regular surfaces have been introduced in Corollary 7 and Definition 5. Their explicit representations are, for the tensorial case, given by ˆ 1˙ .x; y/
1 D 8
ˆ 2˙ .x; y/ D
3 2
i .x ˙ .x/ y/ ˝ .x ˙ .x/ y/ ; C jx ˙ .x/ yj jx ˙ .x/ yj3 (415)
.x ˙ .x/ y/ ˝ .x ˙ .x/ y/ jx ˙ .x/ yj5
..x ˙ .x/ y/ .y//
(416) .x y/ ˝ .x y/ ..x y/ .y// ; (417) C jx yj5 3 .x ˙ .x/ y/ ˝ .x ˙ .x/ y/ ..x ˙ .x/ y/ .x// ˆ 3˙ .x; y/ D 2 jx ˙ .x/ yj5 (418) .x y/ ˝ .x y/ ..x y/ .x// ; (419) C jx yj5 i 1 .x C .x/ y/ ˝ .x C .x/ y/ 4 C ˆ .x; y/ D 8 jx C .x/ yj jx C .x/ yj3 (420) .x .x/ y/ ˝ .x .x/ y/ i ; C jx .x/ yj jx .x/ yj3 (421) 3 .x C .x/ y/ ˝ .x C .x/ y/ ˆ 5 .x; y/ D ..x C .x/ y/ .y// 4 jx C .x/ yj5 (422) .x .x/ y/ ˝ .x .x/ y/ ..x .x/ y/ .y// ; jx .x/yj5 (423)
ˆ 6 .x; y/ D
3 4
.x C .x/ y/ ˝ .x C .x/ y/ jx C .x/ yj5
.x .x/ y/ ˝ .x .x/ y/ jx .x/ yj5
..x C .x/ y/ .x//
(424) ..x .x/ y/ .x// ; (425)
where $\tau > 0$ is the scale parameter and $x, y \in \Sigma$. Graphical illustrations of the Σ-tensor scaling function of type $i = 5$, $\boldsymbol{\Phi}^5_\tau$, for different values of the scale parameter $\tau$ can be found in Fig. 7.
C.2
Wavelet Functions
The †-tensor wavelet functions corresponding to the †-tensor scaling functions have been, for the weight function ˛./ D 1 , defined in Definition 6 by ‰ i .x; y/ D
d i ˆ .x; y/; d
x; y 2 † :
(426)
Their explicit representations can be calculated to be ‰ 2˙ .x; y/
(427)
d .2k.x ˙ .x/; y/ 2k.x; y// ; d 3 .x ˙ .x/ y/ ˝ .x ˙ .x/ y/ 2 D 5 ..x ˙ .x/ y/ .y// 4 jx ˙ .x/ yj7 .x ˙ .x/ y/ ˝ .x/ C .x/ ˝ .x ˙ .x/ y/ ..x ˙ .x/ y/ .y// jx ˙ .x/ yj5 .x ˙ .x/ y/ ˝ .x ˙ .x/ y/ ..x/ .y// jx ˙ .x/ yj5 .x y/ ˝ .x y/ 2 C5 ..x y/ .y// jx yj7 .x y/ ˝ .x/ C .x/ ˝ .x y/ ..x y/ .y// jx yj5 .x y/ ˝ .x y/ ..x/ .y// ; jx yj5 D
Fig. 7 Σ-tensor scaling function $\boldsymbol{\Phi}^5_\tau$ on the unit sphere $\Omega$ for different values of the scale parameter $\tau$ (three rows of panels, $\tau = 0.75$, $\tau = 0.5$, and $\tau = 0.25$). The left figures show the Frobenius norms of the tensor scaling functions $\boldsymbol{\Phi}^5_\tau(x,y)$ for a fixed value $y \in \Omega$ and variable $x \in \Omega$. The right figures show a sectional cut of the left one along the equator
‰ 3 .x; y/
(428) .x ˙ .x/ y/ ˝ .x ˙ .x/ y/ 3 ..x ˙ .x/ y/ .x//2 5 D 4 jx ˙ .x/ yj7 .x ˙ .x/ y/ ˝ .x/ C .x/ ˝ .x ˙ .x/ y/ ..x ˙ .x/ y/ .x// jx ˙ .x/ yj5 .x ˙ .x/ y/ ˝ .x ˙ .x/ y/ jx ˙ .x/ yj5 .x y/ ˝ .x .x/ y/ 2 ..x y/ .x// C5 jx yj7 .x y/ ˝ .x/ C .x/ ˝ .x y/ ..x y/ .x// jx yj5 .x y/ ˝ .x y/ ; jx yj5
‰ 5 .x; y/
(429)
d .2k.x C .x/; y/ 2k.x .x/; y// ; d 3 .x C .x/ y/ ˝ .x C .x/ y/ 2 D ..x C .x/ y/ .y// 5 4 jx C .x/ yj7 .x C .x/ y/ ˝ .x/ C .x/ ˝ .x C .x/ y/ ..x C .x/ y/ .y// jx C .x/ yj5 .x C .x/ y/ ˝ .x C .x/ y/ ..x/ .y// jx C .x/ yj5 .x .x/ y/ ˝ .x .x/ y/ 2 ..x .x/ y/ .y// C5 jx .x/ yj7 .x .x/ y/ ˝ .x/ C .x/ ˝ .x .x/ y/ ..x .x/ y/ .y// jx .x/ yj5 .x .x/ y/ ˝ .x .x/ y/ ..x/ .y// ; jx .x/ yj5 D
‰ 6 .x; y/
(430) .x C .x/ y/ ˝ .x C .x/ y/ 3 ..x C .x/ y/ .x//2 5 D 4 jx C .x/ yj7 .x C .x/ y/ ˝ .x/ C .x/ ˝ .x C .x/ y/ ..x C .x/ y/ .x// jx C .x/ yj5
.x C .x/ y/ ˝ .x C .x/ y/
C5
.x .x/ y/ ˝ .x .x/ y/ jx .x/ yj7
..x .x/ y/ .x//2
.x .x/ y/ ˝ .x/ C .x/ ˝ .x .x/ y/
jx C .x/ yj5
jx .x/ yj5 .x .x/ y/ ˝ .x .x/ y/
..x .x/ y/ .x//
;
jx .x/ yj5
for $x, y \in \Sigma$ and $\tau > 0$. In Sect. 4.3, we introduced a scale discretization which led to scale discrete Σ-tensor wavelet functions of type $i$. They are given, for $i = 1, \ldots, 6$, by
\[ \boldsymbol{\Psi}^i_{\tau_j}(x,y) = \boldsymbol{\Phi}^i_{\tau_{j+1}}(x,y) - \boldsymbol{\Phi}^i_{\tau_j}(x,y), \quad x, y \in \Sigma, \tag{431} \]
where the sequence $\{\tau_j\}_{j\in\mathbb{Z}}$ is a discretization of the scale interval $(0,\infty)$. Graphical illustrations of the scale discrete Σ-tensor wavelet function of type $i = 5$, $\boldsymbol{\Psi}^5_{\tau_j}$, can be found in Fig. 8.
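On the sphere, the difference (431) carries over to the spherical symbols, because the spectral representation is linear in the kernel. The sketch below applies this to the scalar symbol (349) with the dyadic scales $\tau_j = 2^{-j}$ used in Fig. 8; the function names are hypothetical, and transferring (431) to the scalar symbol level is an assumption made for illustration.

```python
import numpy as np

def symbol_phi5(n, tau, alpha=1.0):
    """Scalar spherical symbol (349)."""
    n = np.asarray(n, dtype=float)
    return (n * (alpha / (alpha + tau)) ** (n + 1)
            + (n + 1) * ((alpha - tau) / alpha) ** n) / (2 * n + 1)

def wavelet_symbol(n, j, alpha=1.0):
    """Symbol of the scale discrete wavelet: Phi_{tau_{j+1}} - Phi_{tau_j}."""
    tau_j, tau_next = 2.0 ** -j, 2.0 ** -(j + 1)
    return symbol_phi5(n, tau_next, alpha) - symbol_phi5(n, tau_j, alpha)

degrees = np.arange(0, 101)
j0, j_max = 1, 6
details = sum(wavelet_symbol(degrees, j) for j in range(j0, j_max))
# Telescoping sum: coarse scaling symbol plus all details up to scale j_max.
reconstructed = symbol_phi5(degrees, 2.0 ** -j0) + details
```

The array `reconstructed` coincides with the scaling symbol at the finest scale $\tau_{j_{\max}}$, which is the spectral form of the usual multiscale reconstruction: a coarse approximation plus a sum of band-pass details.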
C.3
Scaling Functions of the Second Kind
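$\Sigma$-tensor scaling functions of the second kind have been defined in Definition 10. For $\tau > 0$, they are defined by
\[ \tilde{\boldsymbol{\Phi}}^i_\tau(x, y) = \Phi^i_\tau(x, y)\, \mathbf{i}, \qquad x, y \in \Sigma, \quad i = 2, 3, 5, 6, \tag{432} \]
where the scalar kernels $\Phi^i_\tau$, $i = 2, 3, 5, 6$, are given by
\[ \Phi^2_{\pm\tau}(x, y) = \pm\frac{1}{2\pi} \left( \frac{(x \pm \tau\nu(x) - y) \cdot \nu(y)}{|x \pm \tau\nu(x) - y|^3} - \frac{(x - y) \cdot \nu(y)}{|x - y|^3} \right), \tag{433} \]
\[ \Phi^3_{\pm\tau}(x, y) = \pm\frac{1}{2\pi} \left( \frac{(x \pm \tau\nu(x) - y) \cdot \nu(x)}{|x \pm \tau\nu(x) - y|^3} - \frac{(x - y) \cdot \nu(x)}{|x - y|^3} \right), \tag{434} \]
\[ \Phi^5_{\tau}(x, y) = \frac{1}{4\pi} \left( \frac{(x + \tau\nu(x) - y) \cdot \nu(y)}{|x + \tau\nu(x) - y|^3} - \frac{(x - \tau\nu(x) - y) \cdot \nu(y)}{|x - \tau\nu(x) - y|^3} \right), \tag{435} \]
\[ \Phi^6_{\tau}(x, y) = \frac{1}{4\pi} \left( \frac{(x + \tau\nu(x) - y) \cdot \nu(x)}{|x + \tau\nu(x) - y|^3} - \frac{(x - \tau\nu(x) - y) \cdot \nu(x)}{|x - \tau\nu(x) - y|^3} \right), \tag{436} \]
for $x, y \in \Sigma$. Graphical illustrations of the $\Sigma$-tensor scaling function of the second kind of type $i = 5$, $\tilde{\boldsymbol{\Phi}}^5_\tau$, for different values of the scale parameter $\tau$ can be found in Fig. 9.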
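The normalization of the scalar kernel of type $i = 5$ can be checked numerically. The following is a minimal sketch, assuming that $\Sigma$ is the unit sphere (so that $\nu(x) = x$ and $\nu(y) = y$) and that the kernel has the form read from (435); the function names are illustrative and not taken from the text. Under these assumptions the kernel depends only on $u = x \cdot y$, and its integral over the sphere should be close to 1 for every $0 < \tau < 1$.

```python
import math

def phi5_profile(tau: float, u: float) -> float:
    """Scalar kernel Phi^5_tau(x, y) on the unit sphere, written in u = x . y.

    Assumption: nu(x) = x and nu(y) = y, hence
    (x +/- tau*nu(x) - y) . nu(y) = rho*u - 1 and
    |x +/- tau*nu(x) - y|^2 = rho^2 - 2*rho*u + 1, with rho = 1 +/- tau.
    """
    def layer(rho: float) -> float:
        return (rho * u - 1.0) / (rho * rho - 2.0 * rho * u + 1.0) ** 1.5
    return (layer(1.0 + tau) - layer(1.0 - tau)) / (4.0 * math.pi)

def sphere_integral(tau: float, n: int = 20000) -> float:
    """Midpoint rule for the integral of Phi^5_tau(x, .) over the unit sphere."""
    h = 2.0 / n  # u runs over [-1, 1]; the azimuthal angle contributes 2*pi
    total = sum(phi5_profile(tau, -1.0 + (k + 0.5) * h) for k in range(n))
    return 2.0 * math.pi * h * total

for tau in (0.75, 0.5, 0.25):
    print(tau, sphere_integral(tau))
```

The three scale values are those of the panels in Fig. 9; the printed integrals should stay near 1 while the kernel itself concentrates around $y = x$ as $\tau$ decreases.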
Fig. 8 Discrete $\Sigma$-tensor wavelet function $\boldsymbol{\Psi}^5_{\tau_j}$ on the unit sphere $\Omega$ for different values of the discrete scale parameter $j \in \mathbb{Z}$ (panels: $j = 1$, i.e. $\tau_j = 2^{-1} = 0.5$, and $j = 2$, i.e. $\tau_j = 2^{-2} = 0.25$). The left figures show the Frobenius norms of the tensor kernel $\boldsymbol{\Psi}^5_{\tau_j}(x, y)$ for a fixed value $y \in \Omega$ and variable $x \in \Omega$. The right figures show the scalar value $\varepsilon^r(x) \cdot \boldsymbol{\Psi}^5_{\tau_j}(x, y)\, \varepsilon^r(x)$, where $\varepsilon^r(x)$ is the radial unit vector at the point $x$. This "radial projection" of the tensor kernel is suitable to show the wavelet character of the $\Sigma$-tensor wavelet functions
C.4 Wavelet Functions of the Second Kind
The $\Sigma$-tensor wavelet functions of the second kind corresponding to the $\Sigma$-tensor scaling functions of the second kind presented in Appendix C.3 have been, for the weight function $\alpha(\tau) = \tau^{-1}$, $\tau > 0$, and $i = 2, 3, 5, 6$, defined in Definition 11 by
\[ \tilde{\boldsymbol{\Psi}}^i_\tau(x, y) = -\tau \frac{d}{d\tau} \tilde{\boldsymbol{\Phi}}^i_\tau(x, y), \qquad x, y \in \Sigma. \tag{437} \]
They can be calculated explicitly using the representations of the tensor scaling functions of the second kind given in Definition 10. It easily follows that
\[ \tilde{\boldsymbol{\Psi}}^i_\tau(x, y) = \Psi^i_\tau(x, y)\, \mathbf{i}, \qquad x, y \in \Sigma, \quad i = 2, 3, 5, 6, \tag{438} \]
Fig. 9 $\Sigma$-tensor scaling function of the second kind $\tilde{\boldsymbol{\Phi}}^5_\tau$ on the unit sphere $\Omega$ for different values of the scale parameter $\tau$ (panels: $\tau = 0.75$, $\tau = 0.5$, $\tau = 0.25$). The left figures show the Frobenius norms of the tensor scaling functions $\tilde{\boldsymbol{\Phi}}^5_\tau(x, y)$ for a fixed value $y \in \Omega$ and variable $x \in \Omega$. The right figures show a sectional cut of the left one along the equator
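with the scalar kernels $\Psi^i_\tau : \Sigma \times \Sigma \to \mathbb{R}$ given by
\[ \Psi^i_\tau(x, y) = -\frac{1}{\alpha(\tau)} \frac{d}{d\tau} \Phi^i_\tau(x, y), \qquad x, y \in \Sigma, \tag{439} \]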
Fig. 10 Discrete $\Sigma$-tensor wavelet function $\tilde{\boldsymbol{\Psi}}^5_{\tau_j}$ of the second kind on the unit sphere $\Omega$ for different values of the discrete scale parameter $j \in \mathbb{Z}$ (panels: $j = 1$, i.e. $\tau_j = 2^{-1} = 0.5$, and $j = 2$, i.e. $\tau_j = 2^{-2} = 0.25$). The left figures show the Frobenius norms of the tensor wavelets $\tilde{\boldsymbol{\Psi}}^5_{\tau_j}(x, y)$ for a fixed value $y \in \Omega$ and variable $x \in \Omega$. The right figures show the scalar value $\varepsilon^r(x) \cdot \tilde{\boldsymbol{\Psi}}^5_{\tau_j}(x, y)\, \varepsilon^r(x)$, where $\varepsilon^r(x)$ is the radial unit vector at the point $x$. This "radial projection" of the tensor kernel is suitable to show the wavelet character of the $\Sigma$-tensor wavelet functions
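and the scalar kernels $\Phi^i_\tau$, $i = 2, 3, 5, 6$, given in Appendix C. Explicit representations of the scalar wavelet functions $\Psi^i_\tau$ can be calculated to be (see also Freeden and Mayer 2003)
\[ \Psi^2_{\pm\tau}(x, y) = -\frac{\tau}{2\pi} \frac{\nu(x) \cdot \nu(y)}{|x \pm \tau\nu(x) - y|^3} + \frac{3\tau}{2\pi} \frac{((x \pm \tau\nu(x) - y) \cdot \nu(x))\,((x \pm \tau\nu(x) - y) \cdot \nu(y))}{|x \pm \tau\nu(x) - y|^5}, \tag{440} \]
\[ \Psi^3_{\pm\tau}(x, y) = -\frac{\tau}{2\pi} \frac{1}{|x \pm \tau\nu(x) - y|^3} + \frac{3\tau}{2\pi} \frac{((x \pm \tau\nu(x) - y) \cdot \nu(x))^2}{|x \pm \tau\nu(x) - y|^5}, \tag{441} \]
\[ \Psi^5_{\tau}(x, y) = -\frac{\tau}{4\pi} \left( \frac{\nu(x) \cdot \nu(y)}{|x + \tau\nu(x) - y|^3} + \frac{\nu(x) \cdot \nu(y)}{|x - \tau\nu(x) - y|^3} \right) + \frac{3\tau}{4\pi} \left( \frac{((x + \tau\nu(x) - y) \cdot \nu(x))\,((x + \tau\nu(x) - y) \cdot \nu(y))}{|x + \tau\nu(x) - y|^5} + \frac{((x - \tau\nu(x) - y) \cdot \nu(x))\,((x - \tau\nu(x) - y) \cdot \nu(y))}{|x - \tau\nu(x) - y|^5} \right), \tag{442} \]
\[ \Psi^6_{\tau}(x, y) = -\frac{\tau}{4\pi} \left( \frac{1}{|x + \tau\nu(x) - y|^3} + \frac{1}{|x - \tau\nu(x) - y|^3} \right) + \frac{3\tau}{4\pi} \left( \frac{((x + \tau\nu(x) - y) \cdot \nu(x))^2}{|x + \tau\nu(x) - y|^5} + \frac{((x - \tau\nu(x) - y) \cdot \nu(x))^2}{|x - \tau\nu(x) - y|^5} \right), \tag{443} \]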
for $\tau > 0$ and $x, y \in \Sigma$. In Sect. 4.4, we have also defined scale discrete $\Sigma$-tensor wavelet functions of the second kind of type $i$. They are given, for $i = 1, \ldots, 6$, by
\[ \tilde{\boldsymbol{\Psi}}^i_{\tau_j}(x, y) = \tilde{\boldsymbol{\Phi}}^i_{\tau_{j+1}}(x, y) - \tilde{\boldsymbol{\Phi}}^i_{\tau_j}(x, y), \qquad x, y \in \Sigma, \tag{444} \]
where the sequence $\{\tau_j\}_{j \in \mathbb{Z}}$ is a discretization of the scale interval $(0, \infty)$. Some graphical illustrations of the scale discrete $\Sigma$-tensor wavelet function of the second kind of type $i = 5$, $\tilde{\boldsymbol{\Psi}}^5_{\tau_j}$, can be found in Fig. 10.
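The difference construction (444) can be illustrated on the scalar kernel of type $i = 5$. The sketch below assumes the unit sphere ($\nu(x) = x$), the dyadic scales $\tau_j = 2^{-j}$ used in Fig. 10, and the kernel form read from (435); the helper names are hypothetical. Since each scaling kernel integrates to 1 under these assumptions, the discrete wavelet should have (numerically) zero mean, which is one way to see its wavelet character.

```python
import math

def phi5(tau: float, u: float) -> float:
    """Scalar scaling kernel of type 5 on the unit sphere, with u = x . y (assumed form)."""
    def layer(rho: float) -> float:
        return (rho * u - 1.0) / (rho * rho - 2.0 * rho * u + 1.0) ** 1.5
    return (layer(1.0 + tau) - layer(1.0 - tau)) / (4.0 * math.pi)

def psi5_discrete(j: int, u: float) -> float:
    """Scale discrete wavelet as the difference of two scaling kernels, tau_j = 2**(-j)."""
    tau_j = 2.0 ** (-j)
    tau_next = 2.0 ** (-(j + 1))
    return phi5(tau_next, u) - phi5(tau_j, u)

def integral_over_sphere(kernel, n: int = 40000) -> float:
    """Midpoint rule in u = x . y; the azimuthal angle contributes the factor 2*pi."""
    h = 2.0 / n
    return 2.0 * math.pi * h * sum(kernel(-1.0 + (k + 0.5) * h) for k in range(n))

for j in (1, 2):
    mean = integral_over_sphere(lambda u, j=j: psi5_discrete(j, u))
    print(j, mean)
```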
References
Abeyratne MK (2003) Cauchy-Navier wavelet solvers and their application in deformation analysis. PhD thesis, Geomathematics Group, University of Kaiserslautern
Abeyratne MK, Freeden W, Mayer C (2003) Multiscale deformation analysis by Cauchy-Navier wavelets. J Appl Math 12:605–645
Atkinson KE (1997) The numerical solution of integral equations of the second kind. Cambridge University Press, Cambridge, MA
Blatter C (1998) Wavelets – Eine Einführung. Vieweg Verlag, Braunschweig
Brakhage H, Werner P (1965) Über das Dirichlet’sche Aussenraumproblem für die Helmholtz’sche Schwingungsgleichung. Arch Math 16:325–329
Breuer J, Steinbach O, Wendland WL (2002) A wavelet boundary element method for the symmetric boundary integral formulation. In: Proceedings of IABEM 2002, Austin
Colton D, Kress R (1992) Inverse acoustic and electromagnetic scattering theory. Springer, Berlin/New York
Dahmen W (1997) Wavelet and multiscale methods for operator equations. Acta Numer 6:55–228
Dahmen W, Kurdila AJ, Oswald P (1997a) Multiscale wavelet methods for partial differential equations (wavelet analysis and its applications). Academic, New York
Dahmen W, Kurdila AJ, Oswald P (eds) (1997b) Multiscale wavelet methods for partial differential equations. Wavelet analysis and its applications. Academic, San Diego
Daubechies I (1992) Ten lectures on wavelets. SIAM, New York
Faxen H (1929) Fredholm’sche Integralgleichungen zu der Hydrodynamik zäher Flüssigkeiten. Ark Mat Astr Fys 21A:1–40
Fischer TM (1982) An integral equation procedure for the exterior 3-D slow viscous flow. Integral Equ Oper Theory 5:490–505
Freeden W (1980) On the approximation of external gravitational potential with closed systems of (trial) functions. Bull Geod 54:1–20
Freeden W (1998) The uncertainty principle and its role in physical geodesy. In: Freeden W (ed) Progress in geodetic science. Shaker, Aachen, pp 225–236
Freeden W (1999) Multiscale modelling of spaceborne geodata. B.G. Teubner, Stuttgart/Leipzig
Freeden W, Gerhards C (2013) Geomathematically oriented potential theory. Chapman and Hall/CRC, Boca Raton
Freeden W, Gutting M (2013) Special functions of mathematical (geo-)physics. Birkhäuser, Basel/Heidelberg
Freeden W, Mayer C (2003) Wavelets generated by layer potentials. Appl Comput Harmonic Anal 14:195–237
Freeden W, Mayer C (2006) Multiscale solution for the Molodensky problem on regular telluroidal surfaces. Acta Geod Geophys Hung 41:55–86
Freeden W, Mayer C (2007) Wavelet modelling of tangential vector fields on regular surfaces by means of Mie potentials. Int J Wavelets Multiresolut Inf Process 5:417–449
Freeden W, Michel V (2004) Multiscale potential theory: with applications to geoscience. Birkhäuser, Boston
Freeden W, Schreiner M (2009) Spherical functions of mathematical geosciences: a scalar, vectorial, and tensorial setup. Springer, Heidelberg
Freeden W, Windheuser U (1996) Spherical wavelet transform and its discretization. Adv Comput Math 5:51–94
Freeden W, Gervens T, Schreiner M (1994) Tensor spherical harmonics and tensor spherical splines. Manuscr Geod 19:70–100
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere (with applications to geomathematics). Oxford Science Publications/Clarendon, Oxford
Freeden W, Mayer C, Schreiner M (2003) Tree algorithms in wavelet approximation by Helmholtz potential operators. Numer Funct Anal Optim 24:747–782
Goswami JC, Chan AK (1999) Fundamentals of wavelets: theory, algorithms, and applications. Wiley, New York
Günter NM (1957) Die Potentialtheorie und ihre Anwendung auf Grundaufgaben der Mathematischen Physik. B.G. Teubner, Stuttgart
Hebeker F-K (1986) Efficient boundary element methods for three-dimensional exterior viscous flows. Numer Methods Partial Differ Equ 2:273–297
Heuser H (1992) Funktionalanalysis: Theorie und Anwendungen. B.G. Teubner, Stuttgart
Ilyasov M (2011) A tree algorithm for Helmholtz potential wavelets on non-smooth surfaces: theoretical background and application to seismic data postprocessing. PhD thesis, Geomathematics Group, University of Kaiserslautern
Kellogg OD (1967) Foundation of potential theory. Springer, Berlin/Heidelberg/New York
Kersten H (1980) Grenz- und Sprungrelationen für Potentiale mit quadratsummierbarer Flächenbelegung. Resultate der Mathematik 3:17–24
Konik M (2002) A fully discrete wavelet Galerkin boundary element method in three dimensions. Logos Verlag, Berlin
Kress R (1989) Linear integral equations. Springer, Berlin/Heidelberg/New York
Kupradze VD (1965) Potential methods in the theory of elasticity. Israel Program for Scientific Translations, Jerusalem
Ladyzhenskaja OA (1969) The mathematical theory of viscous incompressible flow. Gordon and Breach, New York
Lage C, Schwab C (1999) Wavelet Galerkin algorithms for boundary integral equations. SIAM J Sci Comput 20:2195–2222
Lax PD (1954) Symmetrizable linear transformations. Commun Pure Appl Math 7:633–647
Leis R (1964) Zur Eindeutigkeit der Randwertaufgabe der Helmholtz’schen Schwingungsgleichung. Math Z 85:141–153
Louis A, Maaß P, Rieder A (1994) Wavelets. Teubner Verlag, Stuttgart
Mallat S (1998) A wavelet tour of signal processing. Academic, San Diego/San Francisco/New York
Mayer C (2007) A wavelet approach to the Stokes problem. Habilitation thesis, Geomathematics Group, University of Kaiserslautern
Müller C (1969) Foundation of the mathematical theory of electromagnetic waves. Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen, Bd. 155. Springer, Heidelberg
Odqvist FKG (1930) Über die Randwertaufgaben der Hydrodynamik zäher Flüssigkeiten. Math Z 32:329–375
Panich OI (1965) On the question of the solvability of the exterior boundary-value problem for the wave equation and Maxwell’s equation. Russ Math Surv 20:221–226
Power H (1987) On Rallison and Acrivos solution for the deformation and burst of a viscous drop in an extensional flow. J Fluid Mech 185:547–550
Power H, Miranda G (1987) Second kind integral equation formulation of Stokes’ flows past a particle of arbitrary shape. SIAM J Appl Math 47:689–698
Power H, Wrobel LC (1995) Boundary integral methods in fluid mechanics. Computational Mechanics Publications, Southampton/Boston
On High Reynolds Number Aerodynamics: Separated Flows Mario Aigner
Contents
1 General Introduction and Motivation
2 Marginal Separation Theory
3 Cauchy Problems
3.1 Steady Problems
3.2 Ill-Posedness and Regularized Dynamics
3.3 Self-Similar Finite Time Blow-Up
4 Concluding Remarks
References
Abstract
This treatise deals with the occurrence of locally separated, three-dimensional, unsteady high Reynolds number flows. As is well established, such flows are governed by a triple-deck structure where the wall shear stress in the viscous sublayer of the (in general inviscid) boundary layer is utilized to describe the phenomenon of localized separation bubbles. It is then proved that the Cauchy problem for the local wall shear stress is, in general, ill-posed. Thus, regularization methods need to be applied to compute the time evolution numerically. The numerical scheme comprises a novel technique using rational Chebyshev polynomials. Finally, the breakdown of the triple-deck structure in the sense of a finite time blow-up scenario is shown.
The author would like to thank Stefan Braun, Vienna University of Technology, and the Austrian Science Fund FWF for supervising and funding this work.
M. Aigner, Institute of Fluid Mechanics and Heat Transfer, Vienna University of Technology, Vienna, Austria
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_101
1 General Introduction and Motivation
Contemporary research areas in fluid dynamics range from the design of aircraft and wind power stations, through weather prediction, climate, and meteorological models, to classical existence and uniqueness results for the Navier-Stokes equations. Equally numerous are the techniques and sub-disciplines used. Despite the huge variety of interests and utilized methods there is one common open problem for all the aforementioned fields of study: turbulence. It is yet to be fully understood how to properly describe turbulence per se or even to predict and characterize the onset of transition to turbulent flows. The approach in this work comes from singular perturbation theory and matched asymptotic expansions (cf. Eckhaus 1973) describing the separation of a laminar boundary layer from a (smooth) surface. The overall aim, then, is to gain insight into the phenomenon of transition to turbulence.

The notion of a boundary layer stems from the works of Ludwig Prandtl. He considered viscous fluids at asymptotically large Reynolds numbers where the velocity at the surface shall vanish entirely (i.e., no-slip condition). Since such a flow can be regarded as ideal, i.e., satisfying the Euler equations, the decrease of the velocity to zero happens in a thin layer adjacent to the surface. By (reasonably) arguing the vertical velocity component to be small within this layer, one immediately arrives at the equations describing the boundary layer. In aerodynamics, where high velocities and low viscosities are the dominant fluid flow characteristics, the Reynolds number $Re$ can be assumed to be high, such that it is reasonable to study the limit $Re \to \infty$. Consequently, one obtains a singular perturbation problem.

In general, separation starts with the formation of a (short) separation bubble containing reversed flow. This stage is highly unstable and might, hence, lead to either significantly long separation bubbles or a bubble burst. The critical stages (of short bubbles) are commonly referred to as cases of marginal separation which, being on the verge of a bubble burst, are often seen to trigger the transition process to turbulence. It is thus of high importance to understand the underlying physics. In, e.g., Sychev et al. (1998) it is demonstrated in a very deductive manner that the theory of marginal separation is embedded into the classical (Prandtl) boundary layer concept.
2 Marginal Separation Theory
When studying Navier-Stokes dynamics from a rather qualitative viewpoint, all involved quantities are normally scaled to be non-dimensional. Nevertheless, such an operation is not necessarily a trivial one. In fact, the Reynolds number is the result of substituting suitably scaled, dimensional coordinates, the velocity, and the pressure field into the original Navier-Stokes equations, see Eq. (2). Consequently, the Reynolds number then reads
\[ Re = \frac{u_\infty L}{\nu}, \tag{1} \]
with $L$ being some characteristic length, $\nu$ the kinematic viscosity and $u_\infty$ the free stream velocity. It is due to this relation that one has to be careful when making assertions about the state of a flow at a certain Reynolds number. In aerodynamics, with $u_\infty$ being comparably high and the viscosity $\nu$ small, one can argue to have high Reynolds number flows for various characteristic length scales.

N.b.: In all that follows, we assume the appearing quantities to be non-dimensionalized in the above mentioned manner.

As we will demonstrate in the following sections, the advantage of studying the case of marginal separation is that one does not have to deal with turbulence models or include large separation regions. By being at the verge of separation, the classical laminar boundary layer theory, supplemented with the so-called interaction concepts, provides the necessary framework. In Aigner (2012) and references therein, an experimental argumentation for the occurrence and importance of marginal separation can be found.

It has been established in the original papers (see Ruban 1981; Stewartson et al. 1982) that the asymptotic description of marginally separated boundary layer flows leads to a so-called triple-deck structure. In the following, we shall paraphrase the main ideas presented in these works to demonstrate how the three decks or layers emerge. A thorough and detailed deduction can be found in Sychev et al. (1998) and Ruban (2010).

Let us first set up the coordinate system, see Fig. 1. We say, for a given velocity field $\mathbf{u}^*$, the components $(u^*, v^*, w^*)$ are functions of the coordinates $(x^*, y^*, z^*)$. In many theoretical studies, but also in real applications, the oncoming, unperturbed flow is often considered unidirectional, e.g., $\mathbf{u}^* = (u^*, 0, 0)$ (or has only comparably small $v^*$ and $w^*$ components).
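For the following deduction we assume $\mathbf{u}^* := (u^*, v^*, w^*) = \mathbf{u}^*(x^*, y^*, z^*, t^*)$ and $p^* = p^*(x^*, y^*, z^*, t^*)$ to satisfy the Navier-Stokes equations for incompressible, transient flows on $(x^*, y^*, z^*) \in \mathbb{R}^3$ with a given (suction or blowing) wall velocity $v_w^* = v_w^*(x^*, z^*, t^*)$ at the surface $y^* = 0$ and a Reynolds number $Re$, defined as in (1), assumed to be large. The governing equations together with the initial and boundary conditions hence read
\[ \begin{aligned}
\partial_{t^*} \mathbf{u}^* + \mathbf{u}^* \cdot \nabla \mathbf{u}^* &= -\nabla p^* + Re^{-1} \Delta \mathbf{u}^*, \quad \operatorname{div} \mathbf{u}^* = 0 && \text{on } \Omega \times [0, T], \\
\mathbf{u}^* &= (0, v_w^*, 0) && \text{at } y^* = 0 \ \ \forall x^*, z^*, t^*, \\
|\mathbf{u}^*| &\to 1, \quad p^* \to 0 && \text{as } y^* \to \infty, \\
\mathbf{u}^* &= \mathbf{u}_0^* && \text{at } t^* = 0 \text{ on } \Omega.
\end{aligned} \tag{2} \]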
Fig. 1 The orthogonal coordinate system $(x, y, z)$ with origin at the surface and the corresponding flow field $(u, v, w)$
In the limit $Re \to \infty$, applying the techniques provided in Eckhaus (1973) to find the so-called significant degeneration, we arrive at (for planar flows)
\[ u^* \sim u_0(x^*, y), \quad v^* \sim Re^{-1/2}\, v_0(x^*, y), \quad p^* \sim p_0(x^*, y), \quad y^* = Re^{-1/2}\, y, \]
where $u_0$, $v_0$, and $p_0$ satisfy the classical boundary layer equations in $(x^*, y)$, subject to the no-slip condition.

N.b.: For the sake of readability, the asterisk here shall merely symbolize unscaled, i.e., original coordinates and fields, such that for the main part of the work, dealing with rescaled variables, plain symbols can be used.

A very useful characteristic to study boundary layer flows is known as the wall shear stress or skin friction, given as
\[ \tau(x^*) \propto \partial_y u_0(x^*, y)\big|_{y=0}, \tag{3} \]
which is generally regarded to be positive along $x^*$ for an attached boundary layer, whereas (for steady flows) the situation of $\tau \le 0$ (in some regions) is seen as equivalent to separation of the boundary layer from the surface. Suppose we have a parameter $k$, connected to the geometry or given flow conditions (e.g., the angle of attack of an airfoil or the height of a backward facing step etc.), such that $\tau = \tau(x^*, k)$ and
\[ \exists!\, k_0, x_0^* : \ \tau(x_0^*, k_0) = 0 \ \text{ and } \ \tau(x^*, k_0) > 0 \ \ \forall x^* \ne x_0^*. \]
If, additionally, $\tau(x^*, k)$ is everywhere positive for all $k < k_0$, then we call the limiting, critical case $k = k_0$ marginal separation (with immediate reattachment).
We refer to Aigner (2012) and references therein, for a more detailed deduction of the thus emerging triple-deck structure, using the principle of viscous-inviscid interaction. In knowing the triple-deck structure to play the main role in our treatise, we will state the scalings and expansions in all three layers explicitly, since these build the basis of deriving the fundamental equations of marginal separation theory. With the rescaled original independent variables $x^*, y^*, z^*$ in the interaction region, the domain for the Navier-Stokes equations (2) is given henceforth as $\Omega := \mathbb{R} \times \mathbb{R}_+ \times \mathbb{R}$. We set the perturbation parameter $\varepsilon := Re^{-1/20}$, such that the coordinates are scaled as (indices 1, 2, 3 denoting the upper, main, and lower deck respectively and the point of zero skin friction shifted into the origin)
\[ t^* = \varepsilon^{-1} t, \quad x^* = \varepsilon^{4} x, \quad z^* = \varepsilon^{4} z, \quad y^* = \begin{cases} \varepsilon^{4} y_1, \\ \varepsilon^{10} y_2 + \varepsilon^{14} h, \\ \varepsilon^{11} y_3 + \varepsilon^{14} h. \end{cases} \tag{4} \]
These scalings already incorporate the potential presence of a (smooth) surface obstacle (also viewed as a flow control device) with height $h$, which might not only depend on the spatial coordinates but might also very well vary with time. Figure 2 shows a sketch of the triple-deck structure in accordance with the coordinate scalings above. If not otherwise stated, in what follows, the individual expansion terms in these decks are assumed to depend on $(x, y_i, z, t)$ and the asymptotic expansions are taken in principle from Braun and Kluwick (2004) and references therein.
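To get a feeling for the sizes involved, the spatial scalings in (4) can be evaluated for a concrete Reynolds number. The snippet below is only an illustrative sketch: the function name and the dictionary keys are hypothetical, and it assumes the perturbation parameter is $Re^{-1/20}$ with the spatial exponents 4, 10, 11, and 14 as read from (4).

```python
def triple_deck_scales(reynolds: float) -> dict:
    """Return epsilon = Re**(-1/20) and the spatial scales appearing in (4).

    Assumption: all lengths are non-dimensional (referred to the
    characteristic length L used in the definition of Re).
    """
    if reynolds <= 0.0:
        raise ValueError("Reynolds number must be positive")
    eps = reynolds ** (-1.0 / 20.0)
    return {
        "epsilon": eps,
        "interaction_length": eps ** 4,  # streamwise/spanwise extent, x* and z*
        "upper_deck": eps ** 4,          # thickness of the potential flow region
        "main_deck": eps ** 10,          # equals Re**(-1/2), the boundary layer scale
        "lower_deck": eps ** 11,         # viscous sublayer
        "obstacle_height": eps ** 14,    # scale of the surface obstacle h
    }

scales = triple_deck_scales(1.0e6)
for name, value in scales.items():
    print(f"{name:20s} {value:.3e}")
```

Because $\varepsilon$ depends on $Re$ only through the power $1/20$, it shrinks very slowly; at $Re = 10^6$ it is still about 0.5, which is worth keeping in mind when comparing asymptotic results with finite Reynolds number data.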
Fig. 2 The triple-deck structure of the interaction region (upper deck, main deck, lower deck). I–III indicate the potential flow, the main part of the boundary layer and the viscous sublayer, respectively, upstream of $x_0 = 0$, with a recirculation region (local separation bubble) (a)
We shall stress again that these decks are not assumed to be present, but can be thoroughly deduced using matched asymptotic expansions in the occasion of a point of separation (for more details see Sychev et al. (1998) and Ruban (2010) for the steady, planar case).

(i) The upper deck. Here, one essentially has a potential flow region. That is, at the leading order a constant (within the interaction region), uni-directional velocity field is prescribed, i.e., $(u_1, v_1, w_1) = (U_{00}, 0, 0)$. Thus, we state the expansions to be
\[ \begin{aligned}
u_1 &\sim u_{10} + \varepsilon^{4} u_{11} + \varepsilon^{10} u_{12}, \\
v_1 &\sim \varepsilon^{4} v_{11} + \varepsilon^{10} v_{12}, \\
w_1 &\sim \varepsilon^{10} w_{12}, \\
p_1 &\sim p_{10} + \varepsilon^{4} p_{11} + \varepsilon^{10} p_{12},
\end{aligned} \tag{5} \]
where, by substituting (4) and (5) into (2), one obtains
\[ u_{10} = U_{00}, \quad p_{10} = -\tfrac{1}{2} U_{00}^2, \quad u_{11} = U_{01}\, x, \quad v_{11} = -U_{01}\, y_1, \quad p_{11} = p_{00}\, x. \]
The imposed pressure gradient $p_{00}$ (at $y_1 = 0$, to be precise) is obviously constant within the interaction region and consequently $U_{01} = -p_{00}/U_{00}$. As usually done in asymptotic expansion procedures one considers the resulting equations at every order of $\varepsilon$. Consequently, the next higher order terms, induced through the interaction, then have to satisfy
\[ \operatorname{div}(u_{12}, v_{12}, w_{12}) = 0, \qquad \left. \begin{aligned} U_{00}\, \partial_x u_{12} &= -\partial_x p_{12} \\ U_{00}\, \partial_x v_{12} &= -\partial_{y_1} p_{12} \\ U_{00}\, \partial_x w_{12} &= -\partial_z p_{12} \end{aligned} \right\} \ \Rightarrow \ \Delta p_{12} = 0, \]
which can be seen by first formally differentiating the momentum equations w.r.t. $x$, $y_1$ and $z$ (respectively). Addition of the three equations and equating the divergence of the velocity field appearing on the left hand side to zero yields that $\Delta p_{12}$ has to be zero. In other words, as already expected from the potential flow assumption, the induced pressure $p_{12}$ has to satisfy the Laplace equation on the half space $y_1 > 0$ subject to Neumann boundary conditions (see Aigner 2012). In general, a solution thereof can be derived to be
\[ p_{12}(x, 0, z, t) = \frac{U_{00}}{2\pi} \int_{\mathbb{R}^2} \frac{1}{|(x, z) - (\xi_1, \xi_2)|}\, \partial_{\xi_1} v_{12}(\xi_1, 0, \xi_2, t)\, d\xi_1\, d\xi_2, \]
which is the crucial information needed from the upper deck.

(ii) The main deck. Sychev et al. (1998) showed this layer to be inviscid, although it resembles a classical boundary layer. Consequently, $u_{20}$ represents the boundary layer velocity profile at the verge of separation. Overall, the main deck remains two-dimensional and steady at the leading order. The expansions, thus, read
\[ \begin{aligned}
u_2 &\sim u_{20} + \varepsilon^{4} u_{21}, \\
v_2 &\sim \varepsilon^{10} (v_{21} + h_{21}), \\
w_2 &\sim \varepsilon^{10} w_{21}, \\
p_2 &\sim p_{20} + \varepsilon^{4} p_{21} + \varepsilon^{10} p_{22}.
\end{aligned} \]
Here, $u_{20}$ is found to only depend on $y_2$ and we will henceforth write $u_{20} =: U_0(y_2)$. The function $h_{21}$ can be derived (see e.g., Braun and Kluwick 2002) to be $h_{21} = U_0(y_2)\, \partial_x h(x, z, t)$, with $h$ being the height of the surface obstacle mentioned above. As explained further in Aigner (2012), the important result here is
\[ v_{21}(x, y_2, z, t) = -U_0(y_2) \left( \frac{\partial_x A(x, z, t)}{p_{00}} + \int_0^{y_2} \frac{U_0''(s) - p_{00}}{U_0^2(s)}\, ds \right). \]

Remark 1. At this point, the function $A = A(x, z, t)$ above represents an integration constant stemming from generally solving the according momentum equation for $v_{21}$. Therefore, it remains undetermined. Finding and solving the governing problem for $A$ is seen as the basis for understanding marginally separated flows (from a theoretical viewpoint). The connection stems from the original works, e.g., Stewartson et al. (1982), and can also be easily seen from the definition (3): the integration constant $A$ possesses a physical interpretation as it is proportional to the wall shear stress,
\[ \tau(x, z, t) \propto \partial_y u\big|_{y=0} \propto A(x, z, t). \]
Ergo, within the context of the flow being at the verge of separation, or at a potential bubble burst, the (local) structure and time evolution of $A$ can qualitatively describe these crucial processes.

(iii) The lower deck. From the deduction of the triple-deck structure, one can see that the interaction actually takes place between the lower deck (as a viscous sublayer) and the outer flow. We shall write the lower deck expansions as
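\[ \begin{aligned}
u_3 &\sim \varepsilon^{2} u_{30} + \varepsilon^{5} u_{31} + \varepsilon^{8} u_{32}, \\
v_3 &\sim \varepsilon^{12} (v_{31} + h_{31}) + \varepsilon^{15} (v_{32} + h_{32}), \\
w_3 &\sim \varepsilon^{8} w_{32}, \\
p_3 &\sim p_{30} + \varepsilon^{4} p_{31} + \varepsilon^{10} p_{32}.
\end{aligned} \]
Note that all functions with indices 30 and 31 can be determined by the usual straightforward substitution into (2), eventually yielding
\[ u_{30} = p_{00}\, \frac{y_3^2}{2}, \qquad u_{31} = A\, y_3, \qquad v_{31} = -\frac{y_3^2}{2}\, \partial_x A, \qquad h_{31} = p_{00}\, \frac{y_3^2}{2}\, \partial_x h. \]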
The crucial part here lies in determining the equations for the higher order terms subscripted with 32. These equations will certainly involve the function A, for example in the simple case of h32 D Ay3 @x h. In order to find solutions to these problems, it is essential to investigate the characteristics of A, cf. Remark 1, which is seen as the fundamental problem of marginal separation. In Aigner and Braun (2015), a novel description and technique can be found how to arrive at this problem, which eventually reads (see also Braun and Kluwick 2004) Zx 2
2
A x C D
R Zx @x p32 C 1 @2z p32 d @t .A h/ d d 1=2 .x / .x /1=4
1
Zx
1
vw d ; .x /1=4
(6)
1
with and being positive constants and 2 R denoting a rescaled control parameter, cf. k following Eq. (3). In order to obtain the problem purely in terms of A as the unknown, one needs to relate A to the pressure term p32 , which is relabeled as the interaction pressure pi .x; z; t/ D
1 2
Z R2
1 @2 .A h/ d 1 d 2 : j.x; z/ . 1 ; 2 /j 1
(7)
Equations (6) and (7) now provide all necessary theoretical instruments, such as a control parameter $\Gamma$ and flow control devices $h$ and $v_w$, to study the criteria for detecting when and where the flow might break down.
3 Cauchy Problems
By combining (6) and (7), using the polar coordinates $(x, z) \to (r, \vartheta)$, $A = A(x, z, t)$ shall satisfy the following problem in $\mathbb{R}^2 \times [0, T]$
\[ \begin{aligned}
A^2 - x^2 + \Gamma &= \frac{\lambda}{2\pi} \int_{-\infty}^{x} \frac{1}{(x - s)^{1/2}} \int_{\mathbb{R}^2} \frac{\left[ \partial_{\xi_1}^3 + \partial_{\xi_1} \partial_{\xi_2}^2 \right] A(\xi_1, \xi_2, t)}{|(s, z) - (\xi_1, \xi_2)|}\, d\xi_1\, d\xi_2\, ds \\
&\quad - \gamma \int_{-\infty}^{x} \frac{1}{(x - s)^{1/4}}\, \partial_t A(s, z, t)\, ds + g(x, z, t), \\
A(x, z, 0) &= A_0(x, z) \quad \text{in } \mathbb{R}^2, \\
A(x, z, t) &\sim c(\vartheta)\, r \quad \text{as } r \to \infty, \text{ in } [0, T].
\end{aligned} \tag{8} \]

Remark 2. The function $g$ above contains terms to include a hump and/or a suction/blowing device (cf. the deduction in Aigner and Braun 2015), and hence one can view $g$ as a general inhomogeneity or forcing term. Due to the linearity of the right-hand side operators in (8) it follows immediately that the argument $A - h$, as used in the deduction for (6) and (7), can be separated. Viewing such a hump as a surface mounted obstacle, it can act as a flow control device, i.e., shifting or delaying separation of the laminar boundary layer. More details on these subjects can be found in Braun and Kluwick (2002).

For brevity, we rewrite the equation in (8) by identifying the Abel operators $J^{\alpha}_{\beta}$ and the Riesz potential $R^{\alpha}$ (for definitions see Aigner (2012), Section 3) as
\[ A^2 - x^2 + \Gamma = \frac{\lambda}{2\pi}\, J^{1/2}_{-\infty} R^{1} \left[ \partial_x^3 + \partial_x \partial_z^2 \right] A\, (x, z, t) - \gamma\, J^{3/4}_{-\infty}\, \partial_t A\, (x, z, t) + g(x, z, t), \tag{9} \]
and analogously for the planar problem in $\mathbb{R} \times [0, T]$
\[ A^2 - x^2 + \Gamma = \lambda\, J^{1/2}_{\infty}\, \partial_x^2 A\, (x, t) - \gamma\, J^{3/4}_{-\infty}\, \partial_t A\, (x, t) + g(x, t). \tag{10} \]
Remark 3. The difference between the right-hand side operators in the planar and three-dimensional case can be best seen by assuming the integrals in (8) to act on a function $f$ solely depending on $x$, i.e.,
\[ \begin{aligned}
&\frac{\lambda}{2\pi} \int_{-\infty}^{x} \frac{1}{(x - s)^{1/2}} \int_{\mathbb{R}^2} \frac{\partial_{\xi_1}^3 f(\xi_1)}{\sqrt{(s - \xi_1)^2 + (z - \xi_2)^2}}\, d\xi\, ds \\
&\quad = -\frac{\lambda}{2\pi} \int_{-\infty}^{x} \frac{1}{(x - s)^{1/2}} \int_{\mathbb{R}^2} \frac{(s - \xi_1)\, \partial_{\xi_1}^2 f(\xi_1)}{\left( (s - \xi_1)^2 + (z - \xi_2)^2 \right)^{3/2}}\, d\xi\, ds \\
&\quad = -\frac{\lambda}{2\pi} \int_{-\infty}^{x} \frac{1}{(x - s)^{1/2}} \int_{\mathbb{R}} (s - \xi_1)\, \partial_{\xi_1}^2 f(\xi_1) \underbrace{\int_{\mathbb{R}} \frac{d\xi_2}{\left( (s - \xi_1)^2 + (z - \xi_2)^2 \right)^{3/2}}}_{= 2/(s - \xi_1)^2}\, d\xi_1\, ds \\
&\quad = -\frac{\lambda}{\pi} \int_{-\infty}^{x} \frac{1}{(x - s)^{1/2}} \int_{\mathbb{R}} \frac{\partial_{\xi_1}^2 f(\xi_1)}{s - \xi_1}\, d\xi_1\, ds \\
&\quad = -\frac{\lambda}{\pi} \int_{x}^{\infty} \partial_{\xi_1}^2 f(\xi_1) \underbrace{\int_{-\infty}^{x} \frac{ds}{(s - \xi_1)(x - s)^{1/2}}}_{= -\pi/\sqrt{\xi_1 - x}}\, d\xi_1 \\
&\quad = \lambda \int_{x}^{\infty} \frac{1}{(\xi_1 - x)^{1/2}}\, \partial_{\xi_1}^2 f(\xi_1)\, d\xi_1,
\end{aligned} \]
where we used integration by parts in the second line. By imposing the necessary smoothness and far-field decay conditions on $f$ and using the Cauchy principal value, the above modifications can be made rigorous and also hold for a potential solution $A$ of both problems. $\square$

Remark 4. As is often done in the theory of evolution equations, we consider the problem (9) as an ordinary differential equation with values in some Banach space $X$, i.e., $A : [0, T] \to X$ is continuously differentiable and the right-hand side operators map their domain onto $X$ as well, where $X$ is the space of bounded, continuous functions with an existing limit at infinity. Normally, one also requires the initial condition $A_0$ to lie in the domain of the operators, but under certain conditions such assumptions can be weakened (cf. the definition of a classical solution of abstract Cauchy problems (25)). Here, we will confine the set of initial conditions to the domain of the integro-differential operators, i.e., three times continuously differentiable functions with at most linear growth at infinity. Nevertheless, since we are also interested in certain local qualitative and quantitative characteristics of solutions, a numerical treatment of (8) is eventually needed.
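The two inner integrals used in Remark 3 can be checked numerically. The following sketch is not part of the original derivation; it assumes the closed forms $\int_{\mathbb{R}} (d^2 + w^2)^{-3/2}\, dw = 2/d^2$ and, for $\xi_1 > x$, $\int_{-\infty}^{x} \frac{ds}{(s - \xi_1)(x - s)^{1/2}} = -\pi/\sqrt{\xi_1 - x}$, and it maps both infinite ranges to finite intervals before applying a plain midpoint rule. All function names are illustrative.

```python
import math

def midpoint(f, a: float, b: float, n: int = 20000) -> float:
    """Composite midpoint rule on [a, b]; endpoints are never evaluated."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def inner_xi2_integral(d: float) -> float:
    """Integral over R of (d^2 + w^2)^(-3/2) dw, using w = q / (1 - q^2), q in (-1, 1)."""
    def integrand(q: float) -> float:
        w = q / (1.0 - q * q)
        jacobian = (1.0 + q * q) / (1.0 - q * q) ** 2
        return jacobian / (d * d + w * w) ** 1.5
    return midpoint(integrand, -1.0, 1.0)

def inner_s_integral(x: float, xi: float) -> float:
    """Integral over s in (-inf, x) of 1 / ((s - xi) * sqrt(x - s)) for xi > x.

    With x - s = t^2 and t = q / (1 - q) this becomes
    -int_0^1 2 / (q^2 + a * (1 - q)^2) dq, where a = xi - x > 0.
    """
    a = xi - x
    if a <= 0.0:
        raise ValueError("this sketch only covers xi > x (no principal value needed)")
    return -midpoint(lambda q: 2.0 / (q * q + a * (1.0 - q) ** 2), 0.0, 1.0)

print(inner_xi2_integral(0.5), 2.0 / 0.5 ** 2)
print(inner_s_integral(0.0, 0.25), -math.pi / math.sqrt(0.25))
```

The case $\xi_1 < x$, where the $s$-integral has to be read as a Cauchy principal value, is deliberately left out of this sketch.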
3.1 Steady Problems
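We use the steady version (which has been sufficiently studied) to show convergence of a polynomial discretization scheme applied in spatial coordinates, where we additionally have reference solutions from previous works. Using polar coordinates $(x, z) \to (r, \vartheta)$ and setting $\partial_t A = 0$, we finally say $A = A(x, z)$ shall satisfy the following problem in $\mathbb{R}^2$
\[ \begin{aligned}
A^2 - x^2 + \Gamma &= \frac{\lambda}{2\pi}\, J^{1/2}_{-\infty} R^{1} \left[ \partial_x^3 + \partial_x \partial_z^2 \right] A\, (x, z) + g(x, z), \\
A(x, z) &\sim c(\vartheta)\, r \quad \text{as } r \to \infty,
\end{aligned} \tag{11} \]
and analogously for the planar case in $\mathbb{R}$
\[ \begin{aligned}
A^2 - x^2 + \Gamma &= \lambda\, J^{1/2}_{\infty}\, \partial_x^2 A\, (x) + g(x), \\
A(x) &\sim |x| \quad \text{as } |x| \to \infty.
\end{aligned} \tag{12} \]
The far field condition above just indicates an at most linear growth behavior, depending on $\vartheta$. Alternatively, considering the $x$ and $z$ coordinate separately, we write this condition as
\[ A(x, z) = O(|x|) \ \text{ as } |x| \to \infty, \qquad A(x, z) < \infty \ \text{ as } |z| \to \infty. \tag{13} \]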
Remark 5. From the partial derivatives on the right-hand side in (11), we assume classical solutions of the equation to be at least three times continuously differentiable on $\mathbb{R}^2$. The at most linear growth given as a far field condition in (11) or (13) thus renders the argument $[\partial_x^3 + \partial_x \partial_z^2] A(x,z)$ of the integral operators bounded and continuous on $\mathbb{R}^2$, with a decay rate of $r^{-2}$ (in principle). Therefore, using the fact that combinations of compact and bounded operators are compact, one can claim permissible (classical) solutions to satisfy the requirements of the theorems given in Aigner (2012, Section 3) and assert the right-hand side operators to form a compact mapping.
Problems (12) and (11) are defined in an unbounded domain, where the unknown function admits an algebraic far-field behavior. Since the aim is to depict the time evolution of $A$ governed by (8), we need a highly accurate and low-cost approximation scheme in spatial coordinates. Thus, we want to use a polynomially based collocation method, with so-called rational Chebyshev polynomials as basis functions. Let $T_n$ be the classical Chebyshev polynomial of degree $n$, then
$$R_n(x) := T_n(\phi(x)) = \cos(n\,\theta(x)), \quad \forall x \in \mathbb{R}, \tag{14}$$
M. Aigner
with
$$\phi : \mathbb{R} \to [-1,1], \quad \phi(x) = \frac{x}{\sqrt{1+x^2}}, \tag{15}$$
$$\theta : \mathbb{R} \to [-\pi, 0], \quad \theta(x) = \arctan(x) - \pi/2.$$
As shown in Aigner (2012, Section 3), these polynomials form a complete orthogonal set in the weighted Lebesgue space $L_u^2(\mathbb{R}^n)$, with $u(x) = \prod_{i=1}^n 1/(1+x_i^2)$. Although such spaces allow for functions to grow (weakly) at infinity, due to the given asymptotic behavior in (12) and (11), one cannot expect the function $A$ to lie in these spaces. Additionally, the collocation method should be set up in, e.g., $(C_l(\mathbb{R}^2), \|\cdot\|_\infty)$ (for all $t > 0$) and hence needs an at least bounded unknown function. From (13), it is clear that $A = A(\cdot, z)$ remains bounded and hence we can subtract the growth with respect to $x$, such that
$$A(x,\cdot) = B(x,\cdot) + \sqrt{1+x^2}, \quad B(x,\cdot) = O(|x|^{-1}) \ \text{as } |x| \to \infty. \tag{16}$$
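The two representations of the rational Chebyshev functions in (14) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names (`phi`, `theta`, `chebyshev_T`, `rational_chebyshev`) are illustrative and not taken from the source.

```python
import numpy as np

def phi(x):
    """Algebraic map of the real line onto [-1, 1], cf. (15)."""
    return x / np.sqrt(1.0 + x**2)

def theta(x):
    """Angle map of the real line onto [-pi, 0], cf. (15)."""
    return np.arctan(x) - np.pi / 2

def chebyshev_T(n, y):
    """Classical Chebyshev polynomial T_n(y) via the three-term recurrence."""
    y = np.asarray(y, dtype=float)
    t_prev, t_cur = np.ones_like(y), y.copy()
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * y * t_cur - t_prev
    return t_cur

def rational_chebyshev(n, x):
    """R_n(x) = cos(n * theta(x)), the trigonometric form in (14)."""
    return np.cos(n * theta(x))

x = np.linspace(-50.0, 50.0, 2001)
# Largest deviation between T_n(phi(x)) and cos(n*theta(x)) for n = 0..7
max_err = max(
    np.max(np.abs(rational_chebyshev(n, x) - chebyshev_T(n, phi(x))))
    for n in range(8)
)
```

The agreement rests on $\cos(\arctan(x) - \pi/2) = x/\sqrt{1+x^2}$, so both forms describe the same basis function on the whole real line.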
It is now reasonable to assume B 2 L2u .R2 /, satisfying the modified equation (subject to the far field above) p 1=2 1 3 B 2 C 2 1 C x2 B C C 1 D J R Œ@x C @x @2z B .x; z/ C f .x/ C g.x; z/; 2 1 (17) where the function f results from substituting (16) into the right-hand side of (11). Using the results mentioned in Aigner (2012, Section 3) one can approximate B by the orthogonal projection BN , such that BN .x; z/ D PN B.x; z/ D
$$\sum_{i=0}^{N_x} \sum_{k=0}^{N_z} a_{ik}\, R_i(x)\, R_k(z), \tag{18}$$
with $B_N$ reasonably expected to converge to $B$. Note that by setting $N_z = 0$ in (18), we immediately obtain the one-dimensional expansion, since $R_0 \equiv 1$. Additionally, in view of computational costs, $N_x$ and $N_z$ should be treated independently.
Remark 6. As established in Remark 5, one can expect the operators on the right-hand side in (17) to be compact between the space of three times continuously differentiable functions with at most linear growth at infinity and the space of bounded, continuous functions where a limit exists at infinity. This certainly holds for a function $B$ as in (16) when assuming $A$ to be a classical solution. The convergence rate of the series expansion, also taking into account the decay rate of $B$, can then be estimated using the results in Aigner (2012, Section 3), which show that one actually works on the Sobolev-type space $H_{u,A}^r$, defined as follows.
On High Reynolds Number Aerodynamics: Separated Flows
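For the multi-index $k$ and $x \in \mathbb{R}^n$ let $\partial_x^k := \partial_{x_1}^{k_1} \partial_{x_2}^{k_2} \cdots \partial_{x_n}^{k_n}$, $|k|_s = \sum_i k_i$, and let the space $H_u^r(\mathbb{R}^n)$ be defined as the set
$$H_u^r(\mathbb{R}^n) = \{ f \mid \partial_x^k f \in L_u^2(\mathbb{R}^n),\ |k|_s \le r \}, \quad \text{with} \quad \|f\|_r^2 = \sum_{|k|_s \le r} \|\partial_x^k f\|_u^2,$$
with $\|\cdot\|_u$ being the norm of $L_u^2$. The space $H_{u,A}^r$ shall then be the set
$$H_{u,A}^r(\mathbb{R}^n) = \{ f \mid \|f\|_A < \infty \} \quad \text{with} \quad \|f\|_A^2 := \sum_{|k|_s \le r} \Big\| \prod_{j=1}^n (1+x_j^2)^{\frac{r/n + k_j}{2}}\, \partial_{x_j}^{k_j} f \Big\|_u^2,$$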
see again Aigner (2012, Section 3) for a detailed motivation and relations of such spaces with globally defined polynomials and projection operators. Consequently, due to the comparably slow decay at infinity of B, with respect r to the requirement to lie in Hu;A , we cannot expect the usual exponentially fast convergence such expansions provide on bounded domains. This is supported by Theorems given in Aigner (2012, Section 3), where one can see that the decrease in the absolute value of the expansion coefficients also depends on the far field behavior of the function. Still, uniform convergence of the approximation follows. In virtue of the spectral collocation scheme set up below, one normally assumes functions to be continuous and to have finite L1 norm. Thus we shall note, that the involved operators are compact (or at least bounded) in such function spaces where then, consistency of the approximation follows immediately. Remark 7. Problems (12) and (11) are, above all, nonlinear. It is well established in the field of spectral methods that nonlinear equations (or nonlinear terms) are best treated using interpolation, meaning that Galerkin and collocation methods are not the best choice. The advantage of using interpolation (with function values being the discrete unknowns) lies in the evaluation of the nonlinearity, where this is done by simple pointwise calculations. It is obvious, having the additional integrals, that the Galerkin approach becomes heavily involved, even for weak nonlinearities (as the quadratic term here). Collocation lies somewhere between these two, as the coefficients are the unknowns, but the equation system is set up by pointwise evaluation. We start by substituting (18) into (17) and take the zeros .xj ; zl / of RNx C1 RNz C1 as the collocation points to obtain Nz Nx X q X 2 ai k Ri .xj /Rk .zl / C C 1 D BN .xj ; zl / C 2 1 C xj2 ƒ‚ … „ „ ƒ‚ … i D0 kD0 DWCij kl
DWsj
D
Nz Nx X X i D0 kD0
ai k
1=2 1 3 J1 R Œ@ 1 C @ 1 @2 2 Ri . 1 /Rk . 2 / .xj ; zl /Cf .xj /C g.xj ; zl /; 2 „ ƒ‚ … DWKij kl
hence
$$(C a)^2 + (Cs)\, a + \Gamma + 1 = K a + f + g, \tag{19}$$
with the matrix $Cs$ defined via its elements $(Cs)_{ij} := s_j C_{ji}$. Here, apart from dealing with the nonlinearity, the obvious substantial task is obtaining the matrix $K$ in a fast and accurate way. The corresponding procedures are presented in detail in Aigner (2012, Section 3).
Remark 8. An important advantage of spectral collocation methods can be observed at this point. In setting up the equation system above we have never used any type of boundary conditions or imposed the far field behavior of the unknown function $B$. This is due to the fact that the expansion (18) is assumed to be uniformly convergent, implying that a consistent scheme yields the same behavior at infinity for $B_N$ ($N \gg 1$) as in $B$. For the sake of completeness it shall be noted that if one wanted to impose certain values at infinity, this can be done by either using another set of collocation points (the extrema of $R_N$, for example, are distributed up to the boundary) or by modifying the polynomials themselves, such that every polynomial satisfies the boundary conditions individually.
Remark 9. Solving (19) obviously requires an iteration scheme for the quadratic nonlinearity. Let us go back to Eq. (17), where we replace the unknown $B$ with $B_n + \delta B =: B_{n+1}$. Assuming $\delta B$ to be small (in some norm), hence justifying dropping the $(\delta B)^2$ term, yields an iteration procedure for $B_n$. This can be formally interpreted as using the Fréchet derivative of the nonlinearity, which results in Newton's method. If we then repeat the (discretizing) modifications above, we arrive at the exact same system as if starting from (19) and substituting $a_{n+1} = a_n + \delta a$, eventually defining the iteration procedure. The commutativity of linearization (or Fréchet differentiation) and discretization, as we have just argued, has been mentioned in Golberg (1979), with a general treatment given by Ortega and Rheinboldt (1966). The iteration scheme can thus be obtained to read
$$(C a_n)^2 + (Cs)\, a_n + (Cs)\, \delta a + 2 (C a_n)(C\, \delta a) + \Gamma + 1 = K a_n + K\, \delta a + f$$
$$\Rightarrow \quad \delta a = \big[ 2 (C a_n)\, C + (Cs) - K \big]^{-1} \big[ K a_n - (C a_n)^2 - (Cs)\, a_n + f - \Gamma - 1 \big], \quad \text{iterate } a_n \to a_n + \delta a. \tag{20}$$
As has been argued in Aigner (2012, Section 3) the approximation via polynomial expansion, cf. (18), can be viewed as the application of a projection operator, eventually mapping the unknowns from infinite dimensional function spaces to sequences of coefficients. Define, for the multi-index $i = (i_1, \ldots, i_n)$ and $a := (a_i)_{|i| \ge 0}$,
$$\ell^1(\mathbb{N}^n) := \Big\{ a : \|a\|_{\ell^1} = \sum_{|i|=0}^{\infty} |a_i| < \infty \Big\} \quad \text{and} \quad \ell^2(\mathbb{N}^n) := \Big\{ a : \|a\|_{\ell^2}^2 = \sum_{|i|=0}^{\infty} |a_i|^2 < \infty \Big\},$$
then, from Parseval's identity, we have that if $f \in L_u^2$, the coefficients $a_i$, (uniquely) defining $P_N f$, are in $\ell^2$. In other words, if the components of the solution vector $a$ from (19) (obtained via (20)) are square summable for all $N$, then the discretization converges in $L_u^2$. Furthermore, if the components are absolutely summable, we even have uniform convergence. As for the iteration scheme (20), one can set $\delta B_N = \sum \delta a_i R_i$, where the iteration obviously terminates if $\delta B_N \equiv 0$. To be more precise, one has to add that $\delta a$ depends on the iteration step $n$, such that $\delta a_n$ forms a (null) sequence with respect to $n$.
Remark 10. A serious issue with the procedure (20) is the choice of the initial vector $a_0$. Since the iteration is in fact a Newton method, the initial guess has to be close (in some sense) to the solution in order for the process to converge. For further details on consistency and a convergence study see Aigner (2012).
Overall, due to the nonlinearity, problems (11) and (12) admit multiple solutions for a given parameter $\Gamma$. Hence, besides solving the problems numerically to study the actual shape of solutions, a bifurcation analysis with respect to $\Gamma$ is also of interest. Figure 3 shows upper and lower branch solutions of problem (12) for $\Gamma = 2$, found by using $N = 40$ polynomials. For a detailed bifurcation study with $\Gamma \in \mathbb{R}$ see for example Stewartson et al. (1982) and Brown and Stewartson (1983). Most notable is the fact that for $\Gamma$ greater than some critical value no real solutions of (12) exist.
Remark 11. Note that the negative values of $A$ in Fig. 3(b) indicate the existence of a separation bubble, whereas the upper branch solution at $\Gamma = 2$ is fully attached. This already shows the sensitivity of the marginal separation steady states regarding potential bubble burst, since it is not clear which of the multiple solutions is physically realized when setting $\Gamma$ at a certain value.
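The Newton update (20) can be sketched on a small toy system. This is a hedged illustration, assuming NumPy; the matrices `C`, `Cs`, `K` below are arbitrary stand-ins and not the collocation matrices of the chapter, the product "$2(Ca_n)C$" is interpreted as a row scaling of `C` by the vector `C @ a`, and the right-hand side `f` is manufactured from a known vector `a_true` (any inhomogeneity $g$ is assumed to be absorbed into `f`).

```python
import numpy as np

def newton_quadratic(C, Cs, K, f, gamma, a0, tol=1e-12, max_iter=50):
    """Newton iteration for (C a)^2 + Cs a + gamma + 1 = K a + f, cf. (19)-(20)."""
    a = np.array(a0, dtype=float)
    for iteration in range(1, max_iter + 1):
        Ca = C @ a
        # Right-hand side of the linear system for the correction delta
        rhs = K @ a + f - Ca**2 - Cs @ a - gamma - 1.0
        # Jacobian: 2 diag(C a) C + Cs - K
        jacobian = 2.0 * Ca[:, None] * C + Cs - K
        delta = np.linalg.solve(jacobian, rhs)
        a += delta
        if np.max(np.abs(delta)) < tol:
            return a, iteration
    raise RuntimeError("Newton iteration did not converge")

# Toy data (hypothetical): three nodes, weights s_j = 2*sqrt(1 + x_j^2)
x_nodes = np.array([-1.0, 0.0, 1.0])
s = 2.0 * np.sqrt(1.0 + x_nodes**2)
C = np.array([[1.0, 0.1, 0.0],
              [0.1, 1.0, 0.1],
              [0.0, 0.1, 1.0]])
Cs = s[:, None] * C            # row j of C scaled by s_j
K = 0.1 * np.ones((3, 3))
gamma = 2.0

# Manufacture f so that a_true solves the system exactly
a_true = np.array([1.0, 0.5, 0.25])
f = (C @ a_true)**2 + Cs @ a_true + gamma + 1.0 - K @ a_true

# Start close to the solution, as Remark 10 requires for Newton's method
a_newton, n_iter = newton_quadratic(C, Cs, K, f, gamma, a_true + 0.1)
```

Starting far from `a_true` may lead to a different root or to divergence, which mirrors the remark on the choice of the initial vector.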
For numerical solutions of steady state problem (11) the reader is referred to Aigner (2012). Finally, we shall emphasize the small equation systems (when compared to finite difference schemes used in e.g., Scheichl et al. 2008) needed to obtain the required accuracy of the solutions. This then becomes more important in Sect. 3.3, where the above algorithms will be used for the spatial discretizations of time dependent, three-dimensional problems, hence making it possible to obtain fast explicit and implicit Euler procedures.
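To give a feeling for how small such systems are, the one-dimensional collocation points (zeros of $R_{N+1}$) and the matrix $C_{ji} = R_i(x_j)$ can be set up in a few lines. This is a sketch under the assumption that $R_n(x) = \cos(n(\arctan x - \pi/2))$; the helper names are hypothetical, and the two-dimensional tensor-product case is not shown.

```python
import numpy as np

def collocation_angles(N):
    """Angles theta_j in (-pi, 0) with cos((N+1) * theta_j) = 0."""
    j = np.arange(N + 1)
    return -(2 * j + 1) * np.pi / (2 * (N + 1))

def collocation_points(N):
    """Zeros x_j of R_{N+1}, obtained by inverting theta = arctan(x) - pi/2."""
    return np.tan(collocation_angles(N) + np.pi / 2)

def collocation_matrix(N):
    """Matrix with entries C[j, i] = R_i(x_j) = cos(i * theta_j)."""
    theta_j = collocation_angles(N)
    i = np.arange(N + 1)
    return np.cos(np.outer(theta_j, i))

N = 8
x_j = collocation_points(N)
C = collocation_matrix(N)

# R_{N+1} evaluated at the collocation points should vanish
residual = np.cos((N + 1) * (np.arctan(x_j) - np.pi / 2))
# Discrete orthogonality of the basis on these points
gram = C.T @ C
```

The Gram matrix is diagonal on these points, so `C` is well conditioned and cheap to invert, which is one reason the resulting systems stay small.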
Fig. 3 Solutions $A(x)$ of (12) for the upper (a) and lower (b) branch at $\Gamma = 2$
Remark 12. In Fromme and Golberg (1979) it was mentioned that from a converging Galerkin scheme one can assert the general existence and uniqueness of a solution. Since collocation methods can be viewed as Galerkin methods, with the inner product integrals approximated by quadrature schemes, we can formally claim existence of solutions to problems (12) and (11) in a subspace of L2u (with respect to the differentiability requirements).
3.2 Ill-Posedness and Regularized Dynamics
Hadamard (1923) demonstrated that, besides existence and uniqueness, a third requirement should be imposed on Cauchy problems: continuity with respect to given functions. Later this was paraphrased as continuous dependence on the data, which might include coefficients of the differential operators, and is now commonly known as well-posedness. By definition, ill-posed problems are those violating at least one of the three requirements stated above. Petrowsky (1937) defines the Cauchy problem for $u$, with initial data $\varphi$, to be correctly set if (formally speaking)
(i) for some initial data $\bar{\varphi}$, which differs only by an $\varepsilon$ from $\varphi$, there exists only one solution $\bar{u}$, and
(ii) for every $\varepsilon$ there exists a $\delta$, such that the difference of $\bar{u}$ and $u$ is less than $\varepsilon$, if $\bar{\varphi}$ is as close as $\delta$ to $\varphi$.
He then proved for general, quasilinear Cauchy problems, by means of Fourier expansions, that the Fourier coefficients $a_i(k)$ of the solution satisfy
$$\sum_{i,k} |a_i(k)(t)|^2 \le e^{ct} \Big( c_1 \sum_{i,k} |a_i(k)(0)|^2 + c_2 \int_0^t \sum_{i,k} |f_i(k)(\tau)|^2 \, d\tau \Big), \tag{21}$$
where $f_i(k)$ are the Fourier coefficients of a (if present) inhomogeneity, and $c, c_1, c_2$ are positive constants. Using Parseval's identity and the $L^2$ norm yields the well-known result for well-posed systems of evolution problems (in Banach spaces)
$$\sum_i \|u_i(t)\|^2 \le e^{ct} \Big( c_1 \sum_i \|u_i(0)\|^2 + c_2 \int_0^t \sum_i \|f_i(\tau)\|^2 \, d\tau \Big).$$
Next, we apply this fact to a homogeneous partial differential initial value problem to derive a very simple necessary (and in some cases sufficient) condition for well-posedness. Given a function $f$ and variables $k, x \in \mathbb{R}^n$, with the usual inner product denoted by $\langle \cdot, \cdot \rangle$, the Fourier transform shall be defined as
$$\mathcal{F}(f)(k) := \int_{\mathbb{R}^n} f(x)\, e^{-i \langle k, x \rangle} \, dx.$$
Consider a linear, partial differential evolution equation of the form
$$\partial_t u = P(\partial_x) u \tag{22}$$
subject to some initial condition $u_0$, where $P$ denotes a polynomial (of arbitrary, finite degree) with real, constant coefficients. As is well known for smooth functions $f$ (with sufficient decay),
$$\mathcal{F}(\partial_x^m f) = (ik)^m \mathcal{F} f, \tag{23}$$
where for the multi-index $m = (m_1, \ldots, m_n) \in \mathbb{N}^n$, $\partial_x^m = \prod_j \partial_{x_j}^{m_j}$ and $(ik)^m = \prod_j (i k_j)^{m_j}$. Formally, applying the Fourier transform (with respect to $x$) to (22) yields (denoting $\mathcal{F} u = \hat{u}$)
$$\partial_t \hat{u} = P(ik)\, \hat{u} \quad \Rightarrow \quad \hat{u}(k,t) = \hat{u}_0(k)\, e^{P(ik) t},$$
with $\hat{u}_0$ being the Fourier transform of the initial condition $u_0$. Hence, a condition for well-posedness can be readily deduced from (21), i.e.,
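$$\sum_k |\hat{u}(k,t)|^2 = \sum_k |\hat{u}_0(k)|^2\, |e^{P(ik) t}|^2 \le c_1^2 e^{2 c_2 t} \sum_k |\hat{u}_0(k)|^2, \quad \forall t,$$
if and only if $|e^{P(ik) t}| \le c_1 e^{c_2 t}$ $\forall k, t$; or, necessarily,
$$\Re\, P(ik) \le c < \infty, \quad \forall k \in \mathbb{R}^n, \tag{24}$$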
meaning that the real part of the complex polynomial has to be bounded from above. Originating from physical problems, the notion of dispersion relations is often used to consider the behavior of disturbances in evolution equations, i.e., introducing the perturbation $\tilde{u}(x,t) = e^{\lambda t} e^{i \langle k, x \rangle}$, $\lambda \in \mathbb{C}$, substitution into (22) yields $e^{i \langle k, x \rangle} \partial_t e^{\lambda t} = e^{\lambda t} P(\partial_x) e^{i \langle k, x \rangle}$ and hence,
$$\lambda\, e^{\lambda t} e^{i \langle k, x \rangle} = P(ik)\, e^{\lambda t} e^{i \langle k, x \rangle} \quad \Rightarrow \quad \lambda = P(ik).$$
As a necessary condition for well-posedness $\tilde{u}$ obviously has to remain bounded (or even decay) for all times, meaning that the real part of $\lambda$ has to be bounded from above for all $k$, i.e.,
$$\Re\, \lambda = \Re\, P(ik) \le c < \infty, \quad \forall k \in \mathbb{R}^n,$$
which is the exact same condition as (24), derived from the Fourier transformed equation. The advantage of the dispersion relation can be best seen in the case of nonlinear problems, where a linearization around some steady state has to be performed. Let a (weakly nonlinear) Cauchy problem be given as $\partial_t u = P(\partial_x) u + u^2$ and $u(x,t) = u_0(x) + \tilde{u}(x,t)$, then $\tilde{u}$ satisfies
$$\partial_t \tilde{u} = P(\partial_x) \tilde{u} + 2 \tilde{u} u_0,$$
where a Fourier transform would introduce the problem of dealing with the term $\mathcal{F}(\tilde{u} u_0)$, which, in general, leads to the convolution $\mathcal{F}(\tilde{u}) * \mathcal{F}(u_0)$, and one cannot always assume $u_0$ to be Fourier transformable. On the other hand, assuming $\tilde{u}(x,t) = e^{\lambda t} e^{i \langle k, x \rangle}$, as above, one has
$$\lambda\, e^{\lambda t} e^{i \langle k, x \rangle} = P(ik)\, e^{\lambda t} e^{i \langle k, x \rangle} + 2 u_0\, e^{\lambda t} e^{i \langle k, x \rangle},$$
with the dispersion relation yielding $\Re\, \lambda = \Re\, P(ik) + 2 u_0(x)$. Although this means that the behavior of the disturbance depends on $u_0$, for well-posedness one still needs the upper bound for $\Re\, P(ik)$. Hence, using the dispersion relation ansatz easily shows that nonlinearities do not change the (necessary) requirements for well-posedness, but may enhance or delay the (temporal) growth of perturbations.
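Condition (24) is easy to probe numerically for a given polynomial $P$ in one space dimension. The following sketch, assuming NumPy, evaluates $\Re\, P(ik)$ on a finite grid of wave numbers; the helper names and the two test operators (forward and backward heat equation) are illustrative choices, and a finite grid can of course only hint at unboundedness, not prove it.

```python
import numpy as np

def real_part_symbol(coeffs, k):
    """Re P(ik), where coeffs[m] multiplies the m-th derivative in P."""
    s = 1j * np.asarray(k, dtype=float)
    return sum(c * s**m for m, c in enumerate(coeffs)).real

def sup_real_part(coeffs, k_max, n_points=4001):
    """Largest value of Re P(ik) on a symmetric grid [-k_max, k_max]."""
    k = np.linspace(-k_max, k_max, n_points)
    return real_part_symbol(coeffs, k).max()

heat = [0.0, 0.0, 1.0]            # u_t =  u_xx, Re P(ik) = -k^2
backward_heat = [0.0, 0.0, -1.0]  # u_t = -u_xx, Re P(ik) = +k^2

# The bound stays fixed for the heat equation and grows like k_max^2 otherwise
sup_heat = [sup_real_part(heat, km) for km in (10.0, 100.0)]
sup_backward = [sup_real_part(backward_heat, km) for km in (10.0, 100.0)]
```

Enlarging `k_max` leaves the supremum of the heat symbol at zero, whereas the backward heat symbol grows without bound, which is the classical ill-posed case.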
Abstract Cauchy Problems
We will now briefly present the main aspects of well-posedness of generalized evolution equations, where more details of the theory can be found, e.g., in Engel and Nagel (2000). Let us define an abstract Cauchy problem as the initial value problem given by
$$\partial_t u(t) = \mathcal{A} u(t), \quad t \ge 0, \qquad u(0) = u_0. \tag{25}$$
With $X$ being some Banach space, we further assume the initial value $u_0 \in X$, $\mathcal{A} : D(\mathcal{A}) \to X$ to be a linear operator and call $u : \mathbb{R}_+ \to X$ a classical solution of the Cauchy problem, if it is continuously differentiable with respect to $X$, $u(t) \in X$, $\forall t \ge 0$ and (25) holds. The usual definition of a strongly continuous semigroup $(T(t))_{t \ge 0}$ then gives $u(t) = T(t) u_0$. Engel and Nagel (2000) furthermore essentially proved
Lemma 1. For a closed operator $\mathcal{A}$ the associated abstract Cauchy problem is well-posed, if and only if $\mathcal{A}$ generates a strongly continuous semigroup on $X$.
Let us denote such a semigroup by $(T(t))_{t \ge 0}$, then (under certain conditions on $\mathcal{A}$) the semigroup can be expressed as an exponential $T(t) := e^{\mathcal{A} t}$ with a solution given by $u(t) = T(t) u_0$. Furthermore, there exist constants $\omega \in \mathbb{R}$ and $M \ge 1$ such that
$$\|T(t)\| \le M e^{\omega t}, \quad \forall t \ge 0. \tag{26}$$
Then the infimum $\omega_0$ of the set $\{\omega \in \mathbb{R} : \exists M_\omega \ge 1 \text{ such that (26) holds for } M_\omega\}$ is called (exponential) growth bound. Combining this with the fact that for $\lambda \in \mathbb{C}$, $\Re\, \lambda > \omega$, we have $\lambda \in \rho(\mathcal{A})$, i.e., the resolvent set, and defining the spectral bound
$$s(\mathcal{A}) := \sup\{\Re\, \lambda : \lambda \in \sigma(\mathcal{A})\}, \quad \sigma = \mathbb{C} \setminus \rho,$$
one readily obtains $-\infty \le s(\mathcal{A}) \le \omega_0 < \infty$. Eventually, by relating $P(ik)$ to the spectrum of the differential operator $P(\partial_x)$, we have found the (more or less) exact same conditions for well-posedness as for classical partial differential initial value problems. Moreover, as proved in Engel and Nagel (2000), replacing the partial differential operator $\partial_x$ by $(ik)$, denoting $a(k) := P(ik)$ (also allowing for complex coefficients) and defining $\mathcal{A}$ via these multipliers (cf. (23)), the equivalence
$$\mathcal{A} \text{ generates a strongly continuous semigroup} \quad \Leftrightarrow \quad \sup_{k \in \mathbb{R}^n} \Re\, a(k) < \infty \tag{27}$$
holds for all such $\mathcal{A}$ acting on $L^2(\mathbb{R}^n)$.
Remark 13. The equation for the Cauchy problem stated in (8) or (9) is inhomogeneous and therefore one has to make the connection to inhomogeneous abstract problems by assuming the homogeneous problem to be well-posed, where a possible solution is given by the variation of parameters formula, i.e.,
$$u(t) = T(t) u_0 + \int_0^t T(t-s)\, g(s) \, ds, \quad \forall t \ge 0,$$
with $g$ containing all inhomogeneous terms. Such a description is called mild solution, where further details can again be found in Engel and Nagel (2000).
Operator Symbols and Regularization
The conditions for well-posedness established above shall now be extended to the operators involved in the problems stated in (9), i.e., combinations of singular integral and classical differential operators. One of the earliest works dealing with Fourier transforms and multipliers of (singular integral) operators is that of Mikhlin (1936). He introduced the notion of a symbol of an operator and first mentioned that sums and products of (singular) integral operators correspond to the sums and products of their respective symbols. This has been considered further, providing more details, in the monograph of Mikhlin (1965), where the important connection to the Fourier transform has been made, resulting in the fact that the symbol of a singular integral operator coincides with the Fourier transform of its kernel. That is, let $\mathcal{K}$ be a singular integral operator with kernel $K$, then the symbol $\mathrm{sb}(\mathcal{K})$ can be given as (see Mikhlin 1965)
$$\mathrm{sb}(\mathcal{K}) = \mathcal{F} K.$$
Thus, for $f$ in a suitable function space, another form to represent $\mathcal{K}$ would be
$$\mathcal{K} f = \mathcal{F}^{-1}\, \mathrm{sb}(\mathcal{K})\, \mathcal{F} f. \tag{28}$$
N.b.: This is a very general and often used form to view operators (and combinations thereof) which possess a symbol. Next we apply the results above to the operators appearing in (9) (as well as their two-dimensional equivalents). As shown in Gorenflo and Vessella (1991) the Fourier transform of the Abel kernel, $K_A(x) = H(x)\, x^{\alpha - 1}$, see Aigner (2012, Section 3), is
$$\mathcal{F}(K_A) = \Gamma(\alpha) (ik)^{-\alpha} = \mathrm{sb}(J_{-\infty}^{\alpha}), \tag{29}$$
where $\Gamma$ represents the complete gamma function. Analogously we have $\mathrm{sb}(J_{\infty}^{\alpha}) = \Gamma(\alpha) (-ik)^{-\alpha}$. Here, for definiteness, we set (cf. Gorenflo and Vessella 1991)
$$(\pm ik)^{\alpha} = |k|^{\alpha} \exp\Big( \pm i \alpha \frac{\pi}{2}\, \mathrm{sgn}(k) \Big).$$
Similarly, the Fourier transform of the Riesz potential kernel $K_R(x) = |x|^{\alpha - n}$ can be calculated (distributionally) to be
$$\mathcal{F}(K_R) = \gamma(\alpha) |k|^{-\alpha} = \mathrm{sb}(R^{\alpha}), \tag{30}$$
with $\gamma(\alpha) = \pi^{n/2}\, 2^{\alpha}\, \Gamma(\alpha/2) / \Gamma(n/2 - \alpha/2)$. The following lemma states the straightforward extension of the Fourier multiplier property, see again Aigner (2012, Section 3).
Lemma 2. Let $\mathcal{K}$ be an integral operator with a convolution kernel $K$, of which the Fourier transform exists (in some sense). Then $\mathcal{K} e^{i \langle k, x \rangle} = \mathcal{F}(K)\, e^{i \langle k, x \rangle}$ holds for all $x, k \in \mathbb{R}^n$.
Proof. To see this, we simply modify
$$\mathcal{K} e^{i \langle k, x \rangle} = \int_{\mathbb{R}^n} K(x-y)\, e^{i \langle k, y \rangle} \, dy = \int_{\mathbb{R}^n} K(y)\, e^{i \langle k, x-y \rangle} \, dy = \int_{\mathbb{R}^n} K(y)\, e^{i \langle k, x \rangle} e^{-i \langle k, y \rangle} \, dy = e^{i \langle k, x \rangle} \int_{\mathbb{R}^n} K(y)\, e^{-i \langle k, y \rangle} \, dy = e^{i \langle k, x \rangle}\, \mathcal{F}(K),$$
such that substitution of (29) and (30) shows the validity of Fourier multipliers for integral operators. □
Remark 14. Equation (30) actually represents a formal application of the Fourier transform, where Stein (1970) proved the precise meaning of it in a distributional sense. We shall paraphrase the main idea of this result. Let $\psi$ lie in the Schwartz
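space of rapidly decaying functions, then the assertion of $\gamma(\alpha) |x|^{-\alpha}$ being the Fourier transform of $|x|^{\alpha - n}$ is understood as
$$|x|^{\alpha - n} = \mathcal{F}^{-1}(\mathcal{F} |x|^{\alpha - n}) \overset{\text{formally}}{=} \mathcal{F}^{-1}(\gamma(\alpha) |k|^{-\alpha}),$$
with the second equality actually meaning
$$\int |x|^{\alpha - n} \psi(x) \, dx = \int \gamma(\alpha)\, \mathcal{F}^{-1}(|k|^{-\alpha})\, \psi(x) \, dx = \iint \gamma(\alpha) |k|^{-\alpha} e^{i \langle k, x \rangle} \, dk \, \psi(x) \, dx = \int \gamma(\alpha) |k|^{-\alpha} \int \psi(x)\, e^{i \langle k, x \rangle} \, dx \, dk = \int \gamma(\alpha) |k|^{-\alpha}\, \mathcal{F}(\psi) \, dk,$$
such that $|x|^{\alpha - n}$ is the inverse Fourier transform of $\gamma(\alpha) |x|^{-\alpha}$ in the sense of distributions.
Remark 15. For the Abel kernel there is another, constructive way to prove Lemma 2. Consider
$$\int (x-y)^n e^{iky} \, dy = \int (x-y)^n e^{ikx} e^{-ik(x-y)} \, dy = e^{ikx} (ik)^{-(n+1)} \int (x-y)^n (ik)^{n+1} e^{-ik(x-y)} \, dy = e^{ikx} (ik)^{-(n+1)} \int \big( (x-y)\, ik \big)^n (ik)\, e^{-ik(x-y)} \, dy,$$
where the coordinate transform $t = ik(x-y)$, $dy = -(ik)^{-1} dt$ then yields
$$e^{ikx} (ik)^{-(n+1)} \int (-1)\, t^n e^{-t} \, dt = e^{ikx} (ik)^{-(n+1)}\, \Gamma(n+1, t),$$
with $\Gamma(n,t)$ being the incomplete gamma function, such that $\Gamma(n,0) = \Gamma(n)$. In case of the Abel operator $n = \alpha - 1$, $0 < \alpha < 1$, and hence
$$J_{-\infty}^{\alpha}(e^{ikx}) = e^{ikx} (ik)^{-\alpha} \big[ \Gamma(\alpha, 0) - \Gamma(\alpha, i\infty) \big] = \Gamma(\alpha) (ik)^{-\alpha} e^{ikx},$$
since $\Gamma(n, i\infty) = 0$ if $n < 1$ (and analogously for $J_{\infty}^{\alpha}$). □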
Remark 16. The connection, or even equivalence, between the symbol of a (singular integral) operator and the Fourier multiplier property, as well as the abstract multiplication semigroup, leads to the common agreement of also calling $(ik)^m$ in (23) the symbol of the derivative operator $\partial_x^m$.
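The branch convention for fractional powers of $\pm ik$ used with the Abel symbols can be compared with the principal branch of the complex power as implemented in NumPy. This is a small sketch under that assumption; the function names are illustrative, and the exponent $5/4$ is chosen because it is the power of $ik$ that appears in the symbol of the three-dimensional operator in (31).

```python
import numpy as np

def power_principal(k, alpha, sign=1):
    """(sign * i * k)^alpha using NumPy's principal branch."""
    return (sign * 1j * np.asarray(k, dtype=float)) ** alpha

def power_formula(k, alpha, sign=1):
    """|k|^alpha * exp(sign * i * alpha * pi/2 * sgn(k))."""
    k = np.asarray(k, dtype=float)
    return np.abs(k)**alpha * np.exp(sign * 1j * alpha * np.pi / 2 * np.sign(k))

k = np.array([-7.5, -1.0, -0.2, 0.3, 2.0, 10.0])
alpha = 1.25

diff_plus = np.max(np.abs(power_principal(k, alpha, +1) - power_formula(k, alpha, +1)))
diff_minus = np.max(np.abs(power_principal(k, alpha, -1) - power_formula(k, alpha, -1)))

# Real part per unit modulus: equals cos(5*pi/8) for every k != 0
real_ratio = power_principal(k, alpha).real / np.abs(k)**alpha
```

Since $\cos(5\pi/8) < 0$ for both signs of $k$ and for both $+ik$ and $-ik$, the sign of the real part of such a symbol is fixed by the remaining prefactor alone.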
Remark 17. For the sake of completeness, we remark that singular integral operators of the kind dealt with here, are sometimes also viewed in the theory of pseudo-differential operators, which are classified via smoothness requirements for the according symbols. Hence, singular integrals might actually not correspond to pseudo-differential operators. The fractional derivative calculus can be used to put such integrals in a more general context. We will now show how these concepts apply to the Cauchy problem (8). Formally 1=4 inverting J1 and writing the time derivative term on the left-hand side yields @t A D
2
3=4 J1
1
1=2 J1 R1 Œ@3x C @x @2z A .x; z; t/ C F .A/;
where F .A/ contains the nonlinearity and the inhomogeneity. Thus, (relating .x; z/ with .k; l/ in the Fourier transform and omitting positive constants) 3=4 1 1=2 1 3 J1 R .@x C @x @2z / ) A3D WD J1 sb.A3D / D .i k/3=4 .i k/1=2 .k 2 C l 2 /1=2 ..i k/3 i kl 2 / D .i k/5=4 .k 2 C l 2 /1=2 ; (31) and by considering 5 < sb.A3D / D < jkj5=4 e i sgn.k/ 8 .k 2 C l 2 /1=2 D jkj5=4 .k 2 C l 2 /1=2 cos 5 ; „ ƒ‚8 … 0, i.e., the semigroup remains bounded in the long time limit and furthermore, a maximum principle holds in such cases. To relate this to the present problem, we state for the regularizing operator A2D in (34), see Aigner (2012), Lemma 4. The operator A2D generates a strongly continuous semigroup on L1 .R/ for t < T . Proof. One needs to show condition (35) to be satisfied by the symbol of the operator. First, boundedness follows from ˇZ ˇ Z Z ˇ ˇ ˇ ˇ ˇ ˇ ˇ e sb.A2D /t e i kx d k ˇ ˇe sb.A2D /t e i kx ˇd k D ˇe sb.A2D /t ˇd k; ˇ ˇ
since $|e^{\mathrm{sb}(\mathcal{A}_{2D}) t}| = e^{\Re\, \mathrm{sb}(\mathcal{A}_{2D})(k)\, t} < \infty$, $\forall k$, and decays faster than $|k|^{-1}$ at infinity (see Remark 20). In other words, $e^{\mathrm{sb}(\mathcal{A}_{2D}) t} \in L^1(\mathbb{R})$, which is necessary for the inverse Fourier transform to exist. As for the decay in terms of $x$, we use integration by parts twice, where the boundary terms vanish due to the strong decay of the exponential, yielding
$$\Big| \int_{\mathbb{R}} e^{\mathrm{sb}(\mathcal{A}_{2D}) t} e^{ikx} \, dk \Big| = \frac{1}{x^2} \Big| \int_{\mathbb{R}} \partial_k^2 \big( e^{\mathrm{sb}(\mathcal{A}_{2D}) t} \big) e^{ikx} \, dk \Big| \le \frac{1}{x^2} \big\| \partial_k^2\, e^{\mathrm{sb}(\mathcal{A}_{2D}) t} \big\|_{L^1}.$$
Calculating $|\partial_k^2\, \mathrm{sb}(\mathcal{A}_{2D})|$, one can readily deduce the existence of the $L^1$ norm. The estimate of quadratic decay with respect to $x$ of the term on the very left-hand side finishes the proof. □
Remark 21. In contrast to the symbols considered in Droniou et al. (2002), the semigroup generated by $\mathcal{A}_{2D}$ does not remain bounded as $t \to \infty$ because of the positive parts of the symbol (cf. Remark 20). Droniou et al. (2002) also mentioned that for results in more than one dimension, e.g., for the semigroup generated by $\mathcal{A}_{3D}$, a proof becomes much more involved. Nevertheless, having $e^{\mathrm{sb}(\mathcal{A}_{3D}) t} \in L^1(\mathbb{R}^2)$ is again straightforward, and we reasonably expect the inverse Fourier transform of $e^{\mathrm{sb}(\mathcal{A}_{3D}) t}$ to decay sufficiently fast in $x$ and hence to have finite $L^1$ norm.
Remark 22. To show well-posedness on $[0,T]$ for the whole regularized problem (i.e., using the operators in (34) in (9)), one needs to consider the mild formulation of the solution
$$A(t) = e^{\mathcal{A} t} A_0 + \int_0^t e^{\mathcal{A} (t-s)} F(A) \, ds,$$
where F includes all other terms appearing in (9). From the (local) Lipschitz continuity of the nonlinearity A2 and the boundedness of the inhomogeneity we formally claim, based on the findings in Achleitner et al. (2011), the boundedness (in the L1 norm) of the second term. We will now turn to the numerical solutions of the regularizedpproblem. As explained in (16), it is sufficient to first subtract the linear growth 1 C x 2 from the unknown A D A.x; z; t/, obtaining the new unknown B D B.x; z; t/. Reformulating the Cauchy problem (8) in terms of B gives p 1=2 1 3 J B 2 C 2 1 C x2B C C 1 D R Œ@x C @x @2z B .x; z; t/ 2 1 3=4 @t B .x; z; t/ C f .x/ C g.x; z; t/ J1 B.x; z; 0/ D B0 .x; z/ < 1 in R2 B.x; z; t/ < 1 as .x 2 C z2 / ! 1 1 B.x; / D O.jxj / as jxj ! 1
in Œ0; T ; (36)
where we then say
$$B(x,z,t) \approx B_N(x,z,t) = \frac{1}{\sqrt{1+x^2}} \sum_{i=0}^{N_x} \sum_{k=0}^{N_z} a_{ik}(t)\, R_i(x)\, R_k(z), \tag{37}$$
with the now time dependent coefficients $a_{ik} = a_{ik}(t)$. Considering $z$-symmetric disturbances $g$, we only need to sum over even polynomials in $z$, i.e., $k$ is even.
Remark 23. The weight function $(1+x^2)^{-1/2}$ in (37) stems from the operator $J_{\infty}^{3/4}$ acting on the time derivative. As shown in Aigner (2012), a special weight function (depending on the far field behavior of $B$) is needed in order to make the Abel integral acting on rational Chebyshev polynomials exist.
Substituting the expansion (37) into problem (36) yields
$$(C^w a(t))^2 + 2 C a(t) + \Gamma + 1 = K a(t) - D \frac{d}{dt} a(t) + f + g, \tag{38}$$
where $C^w$ denotes the matrix gained from (37) with $R_i(x)$ replaced by $(1+x^2)^{-1/2} R_i(x)$; analogously for $K$ and $D = \big[ R_k(z_l)\, J_{\infty}^{3/4} (1+\xi^2)^{-1/2} R_i(\xi) \big](x_j)$, with more details shown in Aigner (2012). One can fully discretize (38) by saying
$$t_m = m \Delta t, \quad a(t_m) = a_m, \quad \frac{d}{dt} a(t) \Big|_{t = t_m} \approx \frac{a_{m+1} - a_m}{\Delta t},$$
such that the explicit Euler forward scheme for (38) reads
$$a_{m+1} = a_m + \Delta t\, D^{-1} \big[ K a_m + f + g - (C^w a_m)^2 - 2 C a_m - \Gamma - 1 \big]. \tag{39}$$
The matrix $D$ can very well be expected to be non-singular for all $(N_x, N_z)$, although one shall carefully consider the issues mentioned in Gorenflo and Vessella (1991), since this represents a formal inversion of an Abel operator, cf. (33). As is well established for well-posed problems, an Euler forward marching scheme is conditionally stable. Hence, applying it here will not (and must not) give any actual results, since the evolution equation is ill-posed. This means that without any restrictions or special requirements, solutions to arbitrarily chosen initial conditions shall not be found. Confirmation, illustrations, and further arguments on this can be found in Aigner (2012). Since well-posedness is sufficiently established by substituting the regularizing operators $\mathcal{A}$ from (34), we can apply the explicit Euler scheme described in (39). The term $D^{-1} K$ is replaced by the collocation approximation of $\mathcal{A}$. An advantage of the polynomial approach is having an exact description of the regularizing operator (since derivatives of polynomials can be given in closed form), such that the regularization does not introduce additional discretization errors. As expected, the explicit Euler scheme is (conditionally) stable and yields some plausible time evolution on, e.g., $t \in [0,1]$, which is shown in Fig. 4, starting from $A_0(x) = \sqrt{1+x^2}$.
Remark 24. As is generally known, if an initial value problem is well-posed, an implicit scheme (e.g., backward Euler or Crank-Nicolson) is unconditionally stable and therefore allows for larger time steps. The discrete problem (39) thus reads
$$a_{m+1} = \big[ D - \Delta t\, K + 2 \Delta t\, C \big]^{-1} \big[ D a_m + \Delta t \big( f + g - (C^w a_m)^2 - \Gamma - 1 \big) \big], \tag{40}$$
which is still first order accurate in time, cf. (39). As usual we can reasonably expect the matrix on the right-hand side to be non-singular. Notice that this should rather be called a semi-implicit scheme, since the quadratic term in $a$ is evaluated in an explicit sense, i.e., at the previous time step.
With Fig. 4 showing the time evolution to approach the steady states, one might thus be led to claim some sort of stability or attractiveness of such stationary solutions in terms of dynamical systems. Heuristically, we state that if a solution is close to the steady state, it remains in a certain neighborhood of it and for $t \gg 1$, it is identical to the equilibrium (cf. the definitions of Lyapunov and asymptotic stability). In view of the bifurcation diagram (see Sect. 3.1 and Aigner (2012) and references therein) we claim further the existence of a stable upper branch and an unstable lower branch. This is confirmed by some additional calculations performed within this type of regularization by taking as an initial condition a slightly perturbed lower branch solution. Caveat: As shown in Remark 20, certain parts of the spectrum of $\mathcal{A}$ have positive real parts, such that, not necessarily but likely, some destabilization occurs in long-time asymptotics. With the arguments made in Aigner (2012), we formally assert that the regularized dynamics sufficiently describe, at least qualitatively, the time evolution of solutions to the original Cauchy problem, as long as $\alpha$ is small enough, such that regularized solutions remain close to the sought ones.
Remark 25. Although from a theoretical viewpoint, the Cauchy problems remain well-posed for all $\alpha > 0$ (where maybe $T$ changes), this does not hold for the actual computations.
The conclusion drawn from this is, on the one hand, that all numerical parameters are connected through some functional relation in order to act regularizing, and on the other hand, that the parameter $\alpha$ reveals a very typical behavior of regularized problems: it has to be neither too small nor too large. That is, when solving (10) with, say, $N = 50$ polynomials, as done above, using $\alpha = 1/1{,}000$ shows some small oscillations occurring when nearing the steady state.
Fig. 4 Regularized solution $A(x,t)$ of (10) using $N = 50$, $\Gamma = 2$, $g \equiv 0$, $\alpha = 1/100$, $A_0 = \sqrt{1+x^2}$. Top: explicit Euler, $\Delta t = 10^{-8}$ at $t = 0$ (dashed), $t \in \{0.1, 0.5, 1\}$. Bottom: implicit Euler, $\Delta t = 10^{-2}$ at $t \in \{1, 5, 10\}$, $A_{st}$ (dashed) is the steady state including the regularization
These oscillations will eventually grow arbitrarily (as expected). On the other hand, using $\alpha > 1$ yields solutions approaching a steady state that is notably different from the original one. A neat example for such occurrences can be found in Louis (1989) for the case of discrete differentiation. Simply put, if $\alpha$ gets smaller, it acts destabilizing (and we would have to use more polynomials), and if it increases, the solution deviates more and more from the sought result.
So far we have shown that by regularizing the Cauchy problem through added operators, the problem itself was altered and consequently a different problem was solved. Theoretically, we would have to prove that solutions of the Cauchy problem containing $\mathcal{A}$ can be made arbitrarily close to solutions of the original problem as $\alpha \to 0$. Since our approach is primarily via approximated solutions and numerical computations, we heuristically state that the more polynomials are used, the smaller $\alpha$ can be chosen to act regularizing, and thus, by the overall convergence of the scheme, the requirement is met empirically. To avoid adding terms to the equation or altering certain operators, we consider the possibility of filters. The following section shows how this can be achieved by implicit time stepping methods.
Explicit Versus Implicit Time Integration
To study the difference between explicit and implicit (or forward and backward) discrete marching schemes, we consider again the usual abstract homogeneous Cauchy problem on some Banach space

$$\partial_t u(t) = A u(t), \qquad u(0) = u_0,$$

and assume the solution can be expanded into a Fourier series, i.e.,

$$u(x,t) = \sum_{k \in \mathbb{Z}} \hat u_k(t)\, e^{i\langle k, x\rangle}.$$

Let the operator $A$ possess a symbol, then substitution of the ansatz yields, using Lemma 2,

$$\sum_{k \in \mathbb{Z}} \partial_t \hat u_k(t)\, e^{i\langle k, x\rangle} = \sum_{k \in \mathbb{Z}} \mathrm{sb}(A)\, \hat u_k(t)\, e^{i\langle k, x\rangle}.$$

In other words, the Fourier coefficients shall satisfy $\partial_t \hat u_k(t) = \mathrm{sb}(A)\, \hat u_k(t)$, $\forall k \in \mathbb{Z}$. Next, we will do the exact opposite of what is often called the method of lines and thus utilize the advantage of Fourier multipliers: we only discretize in time. Applying forward differences gives

$$\partial_t \hat u_k(t_m) \approx \frac{\hat u_k^{m+1} - \hat u_k^{m}}{\Delta t} = \mathrm{sb}(A)\, \hat u_k^{m} \quad\Rightarrow\quad \hat u_k^{m+1} = \bigl(1 + \Delta t\, \mathrm{sb}(A)\bigr)\, \hat u_k^{m},$$

whereas the backward differences yield

$$\partial_t \hat u_k(t_m) \approx \frac{\hat u_k^{m+1} - \hat u_k^{m}}{\Delta t} = \mathrm{sb}(A)\, \hat u_k^{m+1} \quad\Rightarrow\quad \hat u_k^{m+1} = \bigl(1 - \Delta t\, \mathrm{sb}(A)\bigr)^{-1} \hat u_k^{m}.$$
Note that replacing 1 with the identity and the symbol with its corresponding operator transfers the above relations back to the same result as if finite differences in time were applied directly to the Cauchy problem. As usual, we want to study the behavior of the absolute values of the Fourier coefficients over time regarding their summability (cf. (21)). Thus, denoting the multipliers $\tilde q_e = 1 + \Delta t\, \mathrm{sb}(A)$ and $\tilde q_i = (1 - \Delta t\, \mathrm{sb}(A))^{-1}$ for the explicit and implicit scheme, respectively, yields

$$|\hat u_k^{m+1}| = |q|\, |\hat u_k^{m}| \quad\text{or}\quad |\hat u_k^{m+1}| = |q|^{m+1}\, |\hat u_k^{0}|, \qquad \Re\, \tilde q =: q, \quad \forall k \in \mathbb{Z}.$$

Obviously, we gained a geometric sequence, where one has for

$$\left.\begin{array}{lll}
q = 1: & |\hat u_k^{m}| = |\hat u_k^{0}| & \ldots\ \text{constant} \\
q = -1: & \hat u_k^{m} = (-1)^m\, \hat u_k^{0} & \ldots\ \text{alternating} \\
|q| < 1: & |\hat u_k^{m+1}| < |\hat u_k^{m}| & \ldots\ \text{decay} \\
|q| > 1: & |\hat u_k^{m+1}| > |\hat u_k^{m}| & \ldots\ \text{growth}
\end{array}\right\} \quad \forall m. \tag{41}$$
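The case distinction (41) can be tried out on a single Fourier mode. The following is only a minimal sketch, assuming a real-valued symbol; the function names (`euler_multipliers`, `classify`, `march`) and the sample value of the symbol are illustrative assumptions and do not appear in the text.

```python
def euler_multipliers(symbol, dt):
    """Multipliers of one Fourier mode for a real-valued symbol sb(A)."""
    q_explicit = 1.0 + dt * symbol          # forward differences
    q_implicit = 1.0 / (1.0 - dt * symbol)  # backward differences
    return q_explicit, q_implicit


def classify(q, tol=1e-12):
    """Name the behavior of the coefficient following the cases in (41)."""
    if abs(abs(q) - 1.0) <= tol:
        return "constant" if q > 0 else "alternating"
    return "decay" if abs(q) < 1.0 else "growth"


def march(u0, q, steps):
    """Apply the multiplier repeatedly and record |u_k^m| for m = 0..steps."""
    magnitudes = [abs(u0)]
    u = u0
    for _ in range(steps):
        u = q * u
        magnitudes.append(abs(u))
    return magnitudes


if __name__ == "__main__":
    symbol = -4.0  # hypothetical well-posed mode (negative real symbol)
    for dt in (0.1, 1.0):
        q_e, q_i = euler_multipliers(symbol, dt)
        print(f"dt={dt}: explicit -> {classify(q_e)}, implicit -> {classify(q_i)}")
```

For the negative sample symbol, the explicit multiplier leaves the strip $\pm 1$ once the time step is too large, while the implicit one stays inside it; for a positive symbol both multipliers exceed 1 when $\Delta t\, \mathrm{sb}(A)$ is small.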
For the sake of presentability, we only consider symbols of the form $\Re\, \mathrm{sb}(A) = c|k|^a$, $k \in \mathbb{R}$. As stated in Lemma 3, for well-posedness of the Cauchy problem it is necessary and sufficient for the real part of the symbol to be bounded from above. In the example here, this reduces to the sign of $c$. Now say $c = 1$ and consider the operator $\Re\, A_{2D} \propto |k|^{9/4}$ from the right-hand side of (10), hence

$$q_e = 1 + \Delta t\, |k|^{9/4}, \qquad q_i = \frac{1}{1 - \Delta t\, |k|^{9/4}}. \tag{42}$$
By formally substituting $c = -1$ and thus obtaining a well-posed problem, we obtain the corresponding multipliers, which shall be denoted by $q_e^-$ and $q_i^-$. These multipliers are depicted in Fig. 5. It becomes absolutely clear from Fig. 5 (top) why an explicit scheme cannot work at all for the ill-posed problem, independently of how small $\Delta t$ is chosen. Since every value of $k$ defines a Fourier coefficient $\hat u_k$ and a multiplier $q(k)$, it is also easy to see that the more polynomials are used for a truncated Fourier series (or in the case here, Chebyshev series), the faster the absolute value of the unknown function $u$ grows. As for the well-posed situation, i.e., the solid lines in Fig. 5 corresponding to $q_e^- = 1 - \Delta t\, |k|^{9/4}$, the smaller the time step gets, the more coefficients lie within the strip $\pm 1$ and thus, their value decreases over time, cf. (41). The more important implications for the case studied can be drawn from the graphs at the bottom of Fig. 5, i.e., the implicit scheme. Here, the well-known unconditional stability of implicit schemes for well-posed problems is depicted by
Fig. 5 The multipliers from (42) as functions of $k$ for various $\Delta t$. Top: $q_e^-$ (solid, well-posed case $c = -1$) and $q_e$ (dashed, $c = 1$), bottom: $q_i^-$ (solid, $c = -1$) and $q_i$ (dashed, $c = 1$)
$q_i^-(k) = (1 + \Delta t\, |k|^{9/4})^{-1}$ (solid lines), which remain, independently of $\Delta t$, within the strip $\pm 1$ and thus, the coefficients do not grow for all times. Also, we have thus proved that an implicit scheme can regularize ill-posed evolution problems. Considering the dashed lines, it is obvious that the coefficients $\hat u_k$ for $k \gg 1$ decay for all values of $\Delta t$, but this does not imply unconditional stability, since the smaller $\Delta t$, the fewer multipliers $q_i$ have absolute value less than 1. On the other hand, for those $k$ where $q_i > 1$, decreasing $\Delta t$ means slowing down the growth, by having more and more $q_i(k)$ just slightly greater than 1. When
choosing instead $\Delta t$ too large, although damping more $\hat u_k$, those with associated $q_i(k) > 1$ are much more amplified. In general, we therefore state that the implicit time integration filters the fast growing parts of a Fourier (or other type of orthogonal) decomposition of the solution, provided the time step is neither too small nor too large. For the fully discretized system (40), the interval from which to choose $\Delta t$ depends on the number of polynomials appearing in the expansion. As a rule one can claim that the more polynomials are used, the larger the allowed interval for the time step gets. Ergo, implicit schemes for ill-posed problems are not unconditionally stable. Finally, the multipliers applied to the regularizing operators (34) are given as

$$q_e = 1 + \Delta t\, |k|^{9/4} \bigl(1 - \alpha |k|^{3/2}\bigr), \qquad q_i = \frac{1}{1 - \Delta t\, |k|^{9/4} \bigl(1 - \alpha |k|^{3/2}\bigr)}. \tag{43}$$
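The statements about which Fourier modes are amplified can be checked numerically from (42) and (43). The sketch below is an illustration under stated assumptions: integer wave numbers $k = 1, \dots, 50$ stand in for the retained modes, and the helper names (`symbol`, `q_explicit`, `q_implicit`, `amplified_modes`) are hypothetical, not taken from the text.

```python
def symbol(k, c=1.0, alpha=0.0):
    """Real symbol c*|k|^(9/4)*(1 - alpha*|k|^(3/2)).

    alpha = 0 gives the symbols behind (42): c = 1 ill-posed, c = -1 well-posed.
    c = 1 with alpha > 0 gives the regularized symbol behind (43).
    """
    return c * abs(k) ** 2.25 * (1.0 - alpha * abs(k) ** 1.5)


def q_explicit(k, dt, c=1.0, alpha=0.0):
    """Explicit (forward Euler) multiplier."""
    return 1.0 + dt * symbol(k, c, alpha)


def q_implicit(k, dt, c=1.0, alpha=0.0):
    """Implicit (backward Euler) multiplier; singular where dt*symbol equals 1."""
    return 1.0 / (1.0 - dt * symbol(k, c, alpha))


def amplified_modes(multiplier, modes, dt, **kwargs):
    """Return the wave numbers whose multiplier has absolute value above 1."""
    return [k for k in modes if abs(multiplier(k, dt, **kwargs)) > 1.0]


if __name__ == "__main__":
    modes = range(1, 51)  # assumed stand-in for N = 50 retained modes
    for dt in (0.1, 0.01):
        growing = amplified_modes(q_implicit, modes, dt)
        print(f"dt={dt}: implicit, ill-posed: amplified k = {growing}")
    print("implicit, well-posed:", amplified_modes(q_implicit, modes, 0.1, c=-1.0))
    print("explicit, ill-posed:", len(amplified_modes(q_explicit, modes, 0.01)), "of 50")
```

In this sketch the implicit multiplier of the ill-posed problem amplifies only the modes with $\Delta t\, |k|^{9/4} < 2$, so reducing the time step enlarges the set of growing modes, whereas the explicit multiplier exceeds 1 for every mode.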
As mentioned in Remark 20, in principle there are always regions of the regularized spectra which have positive real parts. For the multipliers in discretized time integration here, this means that there are $k$ such that $q(k) > 1$, i.e., growing coefficients. As indicated in Fig. 6, this can be alleviated by decreasing the time step, although one should not be misled by the graphs, since $\alpha = 1$ and hence the denominator in $q_i$ has no real zeros. Decreasing $\alpha$ in $q_e$ results in more $k$ where $q_e(k) > 0$ and in larger absolute values of these $q_e(k)$, but a remedy again is lowering $\Delta t$. As for the implicit case, the interplay between $\alpha$ and $\Delta t$ can lead to singularities in $q_i$, similar to the non-regularized situation (cf. Fig. 5). In general, decreasing $\alpha$ needs a decrease in $\Delta t$ as well, to have a non-negative $q_i$, see Fig. 7. All conclusions made above also hold for problems in more than one dimension in exactly the same way. For a detailed study of implicit time marching vs. quasi-reversibility (i.e., adding higher derivatives) methods, we refer to Aigner (2012).

Remark 26. By reversing the time in the original Cauchy problems one can also obtain an upper bound for the resulting real parts of the symbols. This is straightforwardly done by just changing the sign in (32) and is very well supported by using the numerical methods described above, explicit as well as implicit, with negative time steps. The most interesting consideration here is to go backward in time from some previously computed regularized solution at some $T > 0$. Then, by running the explicit Euler scheme (with small enough $\Delta t$ to be stable) backward in time without any regularization and using the solution $A(x, T)$ as an initial condition, one can observe that at all times $T - t$, the solution passes through every regularized solution at that time and eventually approaches the original initial condition. Therefore, we have found another way to provide some meaning to the filter acting in implicit schemes. That is, one can afford to dampen decomposition values of solutions if these solutions correspond to initial conditions with certain regularities.
Fig. 6 The multipliers for the regularized operators as given in (43) with $\alpha = 1$, as functions of $k$ for decreasing $\Delta t$. Top: $q_e$, bottom: $q_i$
3.3 Self-Similar Finite Time Blow-Up
The considerations presented so far for the time evolution did not include the perturbation $g$, cf. (8). For the planar case, e.g., Scheichl et al. (2008) used a vertical wall velocity $v_w(x,t)$, representing a blowing slot, with a continuous “switch on – switch off” behavior. For the three-dimensional problem we multiply $v_w$ by $p = a\, p(z)$, $a \in \mathbb{R}$, which has to decay to zero at infinity.
Fig. 7 The multiplier $q_i = q_i(\alpha, \Delta t)$ from (43) as a function of $k$. (a): $\alpha = \frac{1}{10}$, $\Delta t = \frac{1}{2}$, (b): $\alpha = \frac{5}{100}$, $\Delta t = \frac{1}{2}$, (c): $\alpha = \frac{5}{100}$, $\Delta t = \frac{1}{10}$
Now say the initial condition is set to be some steady state from the upper branch, $A_0 = A_{st}$, cf. Fig. 3. From the evidence on stability mentioned above and in Aigner (2012), one can claim that $A(x,t) = A_{st}(x)$, $\forall t$, with $g \equiv 0$ (and the numerical findings confirm this). So, by having a non-zero disturbance in the sense that

$$\int_0^T \|g(t)\|_{L^1}\, dt = \text{const.} \neq 0, \tag{44}$$
the solution moves away from the equilibrium. Taking $T$ less than the overall time for which the regularization holds reveals additional insight into the dynamics of the system. First, if the constant in (44) is too small, this means $A(\cdot, T)$ is still within the basin of attraction of $A_{st}$ and thus, the solution has to re-approach its equilibrium. Again, this is sufficiently confirmed by numerical computations, yielding additional evidence for the stability of the upper branch steady states. Second, strong enough perturbations pushing the solution outside the basin of attraction result in a finite time blow-up scenario, which we will study in the following. Note that the Cauchy problems here and the solutions presented are always to be understood in the regularized sense as established in the previous sections. Also, numerical results are gained with the highest resolution necessary to depict the sought characteristics, using the filter (i.e., direct implicit) method. Consider the stationary two-dimensional upper branch solution at a certain $\Gamma$, cf. Fig. 3(a), as an initial condition. By introducing the three-dimensional blowing slot
mentioned above, the flow, so far independent of $z$, will experience disturbances in the $z$ direction and is thus governed by the Cauchy problem (8).

Remark 27. As this treatise only considers locally three-dimensional flows (governed by the problem (8)), it is necessary to remark that Duck (1990) studied a more general case, i.e., globally (symmetric) three-dimensional flows, but without any perturbation present. Nevertheless, for the three-dimensional case Duck first showed some evidence that a finite time singularity can occur with $z$ dependency present. However, the matter of ill-posedness and regularization was completely neglected in this work.

Remark 28. Ball (1977) provides an existence and uniqueness theorem for (maximally defined mild) solutions on $[0, T)$ of such problems, where $A$ has to be the generator of a strongly continuous semigroup and a nonlinearity $F$ has to be locally Lipschitz. He shows further that if $T < \infty$ the solution becomes unbounded in the given norm. A textbook example for this would be the heat equation with quadratic nonlinearity or more general reaction-diffusion equations as mentioned in, e.g., Galaktionov and Vázquez (2002).

Including now the full perturbation $g$ in (8), Fig. 8 shows $A(x,z,t)$ for $0 \le t \le T = 2$, i.e., the evolution with the perturbation “switched on”. The resulting solution $A(x,z,T)$, now taken as an initial condition for the unperturbed Cauchy problem, has deviated sufficiently from the steady state as not to re-approach it for $t > T$. Instead the minimum formed in these first time steps will become more and more pronounced with larger negative values, where eventually the blow-up occurs in the point $(x_s, \pm z_s)$, cf. the $z$-symmetry of the solution, see Fig. 9. Plotting the minimum of $A$ versus the time, the existence of the finite time singularity becomes even more apparent, see Fig. 10.
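The finite time blow-up behind Remark 28 can be illustrated with the spatially homogeneous version of the quadratic nonlinearity, the ordinary differential equation $u' = u^2$, $u(0) = u_0 > 0$, whose exact solution $u(t) = u_0/(1 - u_0 t)$ becomes unbounded at $T = 1/u_0$. The sketch below is not the triple-deck problem (8); it is a toy model chosen for illustration, and the function names, step size, and threshold are assumptions.

```python
def exact_solution(t, u0):
    """Exact solution of u' = u^2, u(0) = u0, valid for t < 1/u0."""
    return u0 / (1.0 - u0 * t)


def blow_up_time(u0):
    """Exact finite blow-up time of u' = u^2 for u0 > 0."""
    return 1.0 / u0


def estimate_blow_up_time(u0, dt=1e-4, threshold=1e6, max_steps=1_000_000):
    """March explicit Euler until |u| exceeds a threshold; return that time.

    Returns None if the threshold is not reached within max_steps, which
    happens for initial data that do not blow up (e.g., u0 <= 0).
    """
    u, t = u0, 0.0
    for _ in range(max_steps):
        if abs(u) >= threshold:
            return t
        u += dt * u * u
        t += dt
    return None


if __name__ == "__main__":
    for u0 in (1.0, 2.0):
        print(u0, blow_up_time(u0), estimate_blow_up_time(u0))
```

Tracking when the numerical solution crosses a large threshold plays a role similar to plotting the minimum of $A$ against time: the steep final growth localizes the singular time, although the explicit scheme slightly overestimates it.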
In the three-dimensional set-up, solutions of the unperturbed problem are always symmetric with respect to $z$ and thus multiple, simultaneously occurring singularities in $z$ (at one common $x_s$) are possible (see Fig. 9, where the applied blowing device has two symmetric maxima, e.g., $p(z) = \frac{1}{1+(z-1)^2} + \frac{1}{1+(z+1)^2}$). With the results in Aigner (2012) and Aigner and Braun (2015) it is sufficiently shown that blow-up occurs at different points and different times. The usual question then is whether some generality behind the formation of the singularity can be found, i.e., a structure independent of $(x_s, z_s, t_s)$. As mentioned in the monograph by Barenblatt (1979), one shall investigate the problem and its solutions regarding the so-called self-similarity. This became a prominent technique to gain further insight into blow-up phenomena of time dependent problems, cf. the survey by Eggers and Fontelos (2009). In Aigner (2012) and Aigner and Braun (2015) more details on why to study the blow-up phenomenon in its various occurrences and characteristics, further ramifications and the therein found self-similar structure are presented.
Fig. 8 The solution $A(x,z,t)$ at $(t_1, t_2, t_3) = (1, 1.5, 2)$. Top: $A(x,0,t)$ at $z = 0$ (together with the steady state $A_{st}$), bottom: $A(0,z,t)$ at $x = 0$
4 Concluding Remarks
In this text, we studied the ill-posedness and the finite time singularity of the triple-deck description of marginally separated flows in a locally three-dimensional set-up. Other works have termed these characteristics the breakdown of the triple-deck and its asymptotic expansions. We would like to emphasize here that the ill-posedness of the Cauchy problem might not necessarily be associated with the non-uniform validity of the expansions of the flow and pressure field.
Fig. 9 The solution $A(x,z,t)$ of the unperturbed problem continuing Fig. 8 at $t \in \{2.306, 2.363, 2.386, 2.4, 2.405\}$. Top: at $z = 0$, bottom: at $x = 0$
We have demonstrated the ill-posedness to be an intrinsic property of the given Cauchy problem. Nevertheless, from the deduction of the problem, using the coordinate scalings from matched asymptotic expansion procedures, one can see that different scalings can lead to a different Cauchy problem, especially with respect to the time derivative term. Furthermore, as shown in Aigner and Braun (2015), by considering more expansion terms, higher derivatives of the unknown function A appear, which are proven to act regularizing. These derivatives can be related to the local streamline curvature. As a conclusion, one might assert curvature
Fig. 10 The evolution of the minimum of $A(x,z,t)$, shown as $\min_{x \in \mathbb{R}} A(x,0,t)$ versus $t$, with the results taken from Aigner (2012) and Aigner and Braun (2015)
effects to be non-negligible (or of similar order) when studying the time evolution of the local wall shear stress. To speak of an actual breakdown of the asymptotic structure one needs to track possible singularities which are connected to higher order terms in the expansions. Such singularities violate the principle of an asymptotic series (i.e., every expansion term needs to be asymptotically small compared to the previous one) indicating the non-uniform validity (with respect to all independent variables and parameters) of the expansion. In Aigner (2012), it has been sufficiently demonstrated that a singularity appears in the time evolution even though the Cauchy problem was regularized (using analytical and numerical methods). Additionally, general results for well-posed reaction-diffusion problems show finite time singularities to depend mainly on certain nonlinear terms in the equation and the initial conditions. To show the finite time blow-up to occur under certain conditions, as we have done here, is only one aspect of developing the theory of marginal separation. To embed the singularity into a more general concept, one needs to find an independent, intrinsic structure of the blow-up. The thus resulting self-similar blow-up profile was first mentioned in Duck (1990) for the three-dimensional case, where Aigner (2012) and Aigner and Braun (2015) first presented numerical computations to find this structure and also showed its generality in the sense that it is unique and independent of whether the underlying flow is considered locally or globally three-dimensional. Eventually, the blow-up induces new spatio-temporal coordinate scalings and a “next stage”, i.e., a nonlinear triple-deck problem for which the unique self-similar structure poses as an appropriate initial condition, see Aigner (2012) and Aigner and Braun (2015) for more details. 
Overall, the occurrences of singularities and the resulting shorter and faster spatio-temporal scales build a cascade, which one could associate with transition to turbulence from a laminar boundary layer flow.
References
Achleitner F, Hittmeir S, Schmeiser C (2011) On nonlinear conservation laws with a nonlocal diffusion term. J Differ Equ 250:2177–2196
Aigner M (2012) On finite time singularities in unsteady marginally separated flows. Doctoral thesis, Vienna University of Technology
Aigner M, Braun S (2015, to appear) On the self-similar blow-up in unsteady three-dimensional marginally separated flows
Ball JM (1977) Remarks on blow-up and nonexistence theorems for nonlinear evolution equations. Q J Math 28:473–486
Barenblatt GI (1979) Similarity, self-similarity, and intermediate asymptotics. Consultants Bureau, New York
Braun S, Kluwick A (2002) The effect of three-dimensional obstacles on marginally separated laminar boundary layer flows. J Fluid Mech 460:57–82
Braun S, Kluwick A (2004) Unsteady three-dimensional marginal separation caused by surface-mounted obstacles and/or local suction. J Fluid Mech 514:121–152
Brown SN, Stewartson K (1983) On an integral equation of marginal separation. SIAM J Appl Maths 43:1119–1126
Droniou J, Gallouët T, Vovelle J (2002) Global solution and smoothing effect for a non-local regularization of a hyperbolic equation. J Evol Equ 3:499–521
Duck PW (1990) Unsteady three-dimensional marginal separation, including breakdown. J Fluid Mech 220:85–98
Eckhaus W (1973) Matched asymptotic expansions and singular perturbations. North-Holland, Amsterdam
Eggers J, Fontelos MA (2009) The role of self-similarity in singularities of partial differential equations. Nonlinearity 22:R1–R44
Engel K-J, Nagel R (2000) One-parameter semigroups for linear evolution equations. Springer, New York
Fromme JA, Golberg MA (1979) Numerical solution of a class of integral equations arising in two-dimensional aerodynamics. In: Golberg MA (ed) Solution methods for integral equations. Plenum Press, New York, pp 109–146
Galaktionov VA, Vázquez JL (2002) The problem of blow-up in nonlinear parabolic equations. Discrete Contin Dyn Syst 8:399–433
Golberg MA (1979) A survey of numerical methods for integral equations. In: Golberg MA (ed) Solution methods for integral equations. Plenum Press, New York, pp 1–58
Gorenflo R, Vessella S (1991) Abel integral equations. Lecture notes in mathematics. Springer, Berlin/Heidelberg
Hadamard J (1923) Lectures on Cauchy’s problem in linear partial differential equations. Yale University Press, New Haven
Louis AK (1989) Inverse und schlecht gestellte Probleme. Teubner, Stuttgart
Mikhlin SG (1936) Singular integral equations with two independent variables (in Russian). Mat Sbor 1 4(43):535–550
Mikhlin SG (1965) Multidimensional singular integrals and integral equations. Pergamon Press, New York
Ortega JM, Rheinboldt WC (1966) On discretization and differentiation of operators with application to Newton’s method. SIAM J Numer Anal 3:143–156
Petrowsky IG (1937) Über das Cauchysche Problem für Systeme von partiellen Differentialgleichungen. Mat Sbor 2(44):815–870
Rodino L (1993) Linear partial differential operators in Gevrey spaces. World Scientific, Singapore
Ruban AI (1981) Asymptotic theory of short separation regions on the leading edge of a slender airfoil. Izv Akad Nauk SSSR Mekh Zhidk Gaza 1:42–51 (Engl. transl. Fluid Dyn 17:33–41)
Ruban AI (2010) Asymptotic theory of separated flows. In: Steinrück H (ed) Asymptotic methods in fluid mechanics: survey and recent advances, CISM, vol 523. Springer, Wien/New York, pp 311–408
Scheichl S, Braun S, Kluwick A (2008) On a similarity solution in the theory of unsteady marginal separation. Acta Mech 201:153–170
Stein EM (1970) Singular integrals and differentiability properties of functions. Princeton University Press, Princeton
Stewartson K, Smith FT, Kaups K (1982) Marginal separation. Stud Appl Maths 67:45–61
Sychev VV, Ruban AI, Sychev VV, Korolev GL (1998) Asymptotic theory of separated flows. Cambridge University Press, Cambridge
Tikhonov AN, Arsenin VY (1977) Solutions of ill-posed problems. Winston & Sons, Washington, DC
Turbulence Theory
Steffen Schön and Gaël Kermarrec
Contents
Introduction
1 Physical Description of the Tropospheric Turbulence
1.1 An Intuitive Approach of Turbulence
1.2 Troposphere
1.3 Scales of Turbulence
1.4 Navier-Stokes Equations
1.5 A Spectral Approach of Turbulence: The Concept of Eddies
2 Statistical Approach of Turbulence for Geodetic Applications
2.1 Introduction
2.2 Random Fields
2.3 Spectral Density Functions
2.4 Conclusions
3 Covariance Models
3.1 Introduction
3.2 Covariance: Definition
3.3 Previous Work on Physical Correlations
3.4 Treuhaft and Lanyi model (1987)
3.5 Schön and Brunner model (2008a)
3.6 Comparison Between the Schön and Brunner Model and the Treuhaft and Lanyi Model
4 Applications of the Covariance Matrix Models
4.1 Simulation of Slant Delay
4.2 Improving the Stochastic Model of GPS or VLBI by Taking into Consideration Physical Correlations
4.3 Wavelet and Turbulence: Covariance Analysis
5 Conclusions
References
S. Schön () • G. Kermarrec Institut für Erdmessung, Leibniz Universität Hannover, Hannover, Germany e-mail: [email protected]; [email protected] © Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_77
Introduction
The word turbulence comes from late Latin “turbulentia” which means “full of commotion.” It is defined as a “violent or unsteady movement of air or water, or of some other fluid” (Oxford Dictionary of English 2010). Thus, it is a process that dissipates or mixes. The antonyms are unity or homogeneity: they help us to understand more clearly what turbulence concretely means – turbulence mixes and disperses the medium in which it develops, and then it disappears once homogeneity returns. Turbulence has a strong impact on terrestrial and space geodetic observations. Indeed, fluctuations of temperature, humidity, and pressure generate changes of the refractive index and consequently variations in the measurements. Prominent examples are “tropospheric slant delay” variations in the case of GPS (Global Positioning System) and VLBI (Very Long Baseline Interferometry) or image fluctuations in theodolites or levels as well as beam wander in laser devices. In this chapter, we will focus on atmospheric turbulence, particularly on the one that takes place in the troposphere. After an introduction to turbulence by an intuitive definition of the Reynolds number, we will give a short overview of the troposphere and focus on the boundary layer. The production of turbulence will be described and the different scales of turbulence presented. The existence of a so-called spectral gap is an important step in understanding the statistical approach of turbulence which will be introduced in Sect. 2. A short description of the Navier-Stokes equations will give a physical background of turbulence by means of forces, fluxes, transport, and viscosity. The spectral approach, introduced by Kolmogorov (1941), describes the behavior of eddies or “swirls of motion,” thanks to the “energy cascade model.” The expression of the structure and power spectral density functions as well as power spectrum for turbulent fluctuations will be given based on this theory.
The cross-links between those statistical quantities as well as their respective advantages will be explained. A third part concentrates on the variance-covariance model developed by Treuhaft and Lanyi (1987) as well as the Schön and Brunner (2008a) model. The parameters of this last model will be described based on physical considerations, and the influence of the wind vector will be shown. In a last paragraph, some applications of these functions will be given such as the simulation of tropospheric slant delay, thanks to a case study based on the Schön and Brunner covariance model.
1 Physical Description of the Tropospheric Turbulence
1.1 An Intuitive Approach of Turbulence
Turbulent flows can be intuitively associated with eddies in a streaming current, whereas laminar flows are calm, like canal water. If water is replaced by a viscous substance, such as honey, a relationship between turbulence and viscosity
becomes evident. The Reynolds number $Re$, the ratio of inertial forces to viscous ones, therefore gives a good way to characterize turbulence:

$$Re = \frac{\text{inertial forces}}{\text{viscous forces}} = \frac{U \cdot U/L}{\nu U/L^2} = \frac{UL}{\nu},$$

with $L$ being a scale length [m], $U$ the velocity [m s$^{-1}$], and $\nu$ the kinematic viscosity [m$^2$ s$^{-1}$]. Typical values for the surface layer of the troposphere are $L = 100$ m, $U = 5$ m s$^{-1}$, $\nu = 1.5 \times 10^{-5}$ m$^2$ s$^{-1}$, giving a value of $Re > 10^7$ (Stull 1988, p. 93). For such large Reynolds numbers, acceleration and instabilities dominate. This is the case for the troposphere, where the Reynolds number can even be larger than $10^7$.
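As a quick check of the order of magnitude quoted above, the Reynolds number can be evaluated directly from the typical surface-layer values. This is only a small sketch; the function name `reynolds_number` is an illustrative choice.

```python
def reynolds_number(velocity, length, kinematic_viscosity):
    """Re = U * L / nu, with U in m/s, L in m and nu in m^2/s."""
    if kinematic_viscosity <= 0.0:
        raise ValueError("kinematic viscosity must be positive")
    return velocity * length / kinematic_viscosity


if __name__ == "__main__":
    # Typical surface-layer values quoted in the text (Stull 1988, p. 93).
    re = reynolds_number(velocity=5.0, length=100.0, kinematic_viscosity=1.5e-5)
    print(f"Re = {re:.3e}")
```

With these values the result is of the order of $3 \times 10^7$, consistent with the statement $Re > 10^7$.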
1.2 Troposphere
The troposphere is the layer of the atmosphere between 0 and 11 km altitude (Stull 1988, p. 2). It includes the Earth’s surface, where forcing and friction strongly influence the air flow and contribute to its turbulent character. The troposphere contains more than 80 % of the atmospheric mass and the main part of the water vapor, whose fluctuations have an impact on the propagation of microwave or optical signals. The tropopause at the top plays the role of a “cover” before the stratosphere begins. See, for example, Danielson et al. (2003) for more information on the structure of the atmosphere.
The Lower Part of the Troposphere: The Boundary Layer
Following Stull (1988), the boundary layer is “where we live”: “the part of the troposphere that is directly influenced by the presence of the earth’s surface and responds to the surface forcing with a timescale of about one hour or less” (Stull 1988, p. 2). Figure 1 gives a schematic representation of the three main parts of the boundary layer, also known as the planetary boundary layer (PBL).

The mixed layer is where heat, moisture, and momentum are uniformly mixed by turbulence. This layer is located above the surface layer and below the entrainment zone. Although wind shear can also generate mechanical turbulence within the mixed layer, the turbulence there is mainly convectively driven by heat transfer or radiative cooling (Stull 1988). In the atmospheric mixed layer, the potential temperature (the temperature which would result if the air were brought adiabatically to a standard pressure $p_0 = 1{,}000$ hPa) and the specific humidity (ratio of the mass of water vapor to the total mass of moist air) are nearly constant with height. Wind speeds are below the geostrophic value of about 8–10 m s$^{-1}$ and wind profiles are nearly logarithmic. Two forces act on the atmosphere in a rotating frame: the Coriolis effect and a force resulting from the pressure difference that moves the air particle in the direction of low pressure. Thus, the wind direction rotates as one moves away from the surface, and the Ekman spiral is a good approximate description of the wind vector in the boundary layer.
Fig. 1 Boundary layer representation adapted from Stull (1988)
For terrestrial geodetic measurements that take place during the day, the turbulent activity of the mixed layer is particularly important. The stable (nocturnal) boundary layer is created via radiational cooling after sunset as the surface cools. It is characterized by a strong static stability and weak and calm winds at the surface which increase to supergeostrophic speeds aloft (nocturnal jets). Due to the strong stability, turbulence is more sporadic and may occur as “bursts”. The residual layer which appears approximately 1/2 h before sunset does not come directly in contact with the ground and is therefore not a boundary layer. The thermals of the convectively mixed boundary layer shut off as the surface is cooling.
Production of Turbulence in the Boundary Layer
In the boundary layer, atmospheric turbulence is driven by four main components:
• The mechanical production of turbulence, which comes from the vertical variation of the wind vector. It represents a conversion of kinetic energy of the wind into turbulent kinetic energy. Mechanical turbulence is mainly isotropic and eddies are small (a few meters at most).
• The convective production, which is a conversion of the mean potential energy into turbulent kinetic energy by the mean work of the buoyancy force. As can be seen in Fig. 2, the buoyancy force is the net upward force exerted by a fluid of density $\rho_{fluid}$ on an immersed body of density $\rho$ with $\rho_{fluid} \neq \rho$. Following Archimedes’ principle, the buoyant force is vertical, opposite to the gravity force, and equal to the weight of the fluid the body displaces.
• The transport, which redistributes the turbulence in the boundary layer.
• The molecular dissipation, which corresponds to a conversion of kinetic energy into heat by viscous dissipation of turbulence.
Fig. 2 Buoyancy and gravity force
Thus, turbulence is not uniquely associated with large values of the Reynolds number. The instability of an air particle in terms of buoyancy plays an important role. When the atmosphere is stable, buoyancy tends to suppress turbulence, whereas wind shear tends to generate it (Stull 1988). The flux Richardson number $Ri_f$, the ratio of the buoyancy term to the mechanical production term of turbulence, was introduced to quantify this effect:
$$Ri_f = -\frac{\text{buoyancy force}}{\text{mechanical force}}$$
with mechanical force > 0. This ratio allows a characterization of the stability of the atmosphere:
• If $Ri_f < 0$, the atmosphere is statically unstable.
• If $Ri_f > 0$, the buoyancy term is negative and the atmosphere is stable.
If $Ri_f$ is smaller than a critical value $Ri_{f,\mathrm{crit}}$, turbulence can still be generated by mechanical production. For $Ri_f > Ri_{f,\mathrm{crit}}$, however, the flow becomes laminar. In the atmosphere, $Ri_{f,\mathrm{crit}}$ is taken between 0.25 and 1 depending on the turbulence being considered (3D or 2D). Because fluxes, and thus $Ri_f$, are difficult to estimate, the gradient or bulk Richardson number (Stull 1988, p. 176) is more often used.
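The stability classification above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_flux_richardson`, the default critical value of 0.25 (the lower end of the range quoted above), and the treatment of a vanishing Richardson number as neutral are assumptions, not part of the original text.

```python
def classify_flux_richardson(ri_f, ri_crit=0.25):
    """Classify the flow regime from the flux Richardson number.

    ri_crit is an assumed critical value; the text quotes 0.25 to 1.
    """
    if ri_f < 0.0:
        return "statically unstable"
    if ri_f == 0.0:
        # Assumption: a vanishing buoyancy term is labeled neutral.
        return "neutral"
    if ri_f < ri_crit:
        return "stable, mechanical production can maintain turbulence"
    return "stable, flow tends to become laminar"


for value in (-0.5, 0.1, 0.6):
    print(value, classify_flux_richardson(value))
```

In practice the gradient or bulk Richardson number would be fed to such a function, since the fluxes themselves are rarely available.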
1.3 Scales of Turbulence
Turbulence in the atmosphere occurs at different scales. The largest ones are responsible for the heat exchange between the equator and the high latitudes and are called Hadley cells (Laing 1991). The scales of depressions and anticyclones can be more than
S. Schön and G. Kermarrec
Fig. 3 Temperature spectrum in the boundary layer
1,000 km long. Finally, the smallest scale, found mainly near the ground, is called the Kolmogorov scale and is smaller than 1/10 mm. The largest scales, comparable to the size of the Earth, correspond to two-dimensional (2D) turbulence, whereas the microscales are more isotropic and are considered three-dimensional (3D). Figure 3 depicts the energy spectrum $E(\omega)$ in the boundary layer and highlights the coexistence of the two scales in the atmosphere. It presents an idealized picture of the temperature spectrum in the boundary layer (please refer to Van der Hoven (1957) for the wind speed spectrum). The classical representation of $\omega E(\omega)$ versus $\log(\omega)$ is drawn, where $\omega = 2\pi/T$ is the angular frequency and $T$ the period. The area under the curve between two angular frequencies $\omega_1$ and $\omega_2$ is proportional to the energy contained in the spectral band $[\omega_1, \omega_2]$ since $\int \omega E(\omega)\, d(\log \omega) = \int E(\omega)\, d\omega$. For easier interpretation, the period is also given on the abscissa. The first peak (1) is associated with 2D movements that take place above the boundary layer. The eddies have a size of 3,000–4,000 km and a period of 4 days. The second peak (2) highlights the existence of 3D isotropic turbulence, identical in all directions, with a mean short period of 3 s and a maximum size of 400 m, which characterizes turbulence in the boundary layer. The spectral gap, with periods between 30 minutes and a few hours, was shown experimentally by Gage (1979). The “valley” that separates 2D and 3D turbulence is of great importance for the statistical description of turbulence. Because of the many scales of turbulence coexisting in the troposphere, the behavior of each individual eddy cannot be computed. A statistical approach is therefore more appropriate to describe the turbulent behavior of parameters such as temperature, pressure, wind, moisture, or refractive index.
Thanks to the existence of a spectral gap, the parameters can be split into a mean, slowly varying part and a residual, fast-varying turbulent part. For a parameter $f$, this can be expressed mathematically by
$$f = \langle f \rangle + f',$$
where $\langle f \rangle$ is the mean value of $f$ and $f'$ are its turbulent fluctuations. In a similar way, the velocity vector given by $\mathbf{u} = [u_x, u_y, u_z]^T$ can be decomposed as $\mathbf{u} = \langle \mathbf{u} \rangle + \mathbf{u}'$. This formulation is known as “Reynolds decomposition.” Thus, the Navier-Stokes equations, which are briefly presented in the following paragraph, can be expressed for both the mean and the turbulent flows.
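As a minimal sketch of the Reynolds decomposition, the following Python snippet splits a short record into its mean and its fluctuations. The record is synthetic and hypothetical (a constant level plus a fast oscillation), and the mean is taken over the whole record, which assumes that the averaging window lies inside the spectral gap.

```python
import math


def reynolds_decompose(samples):
    """Split samples into their mean <f> and the fluctuations f' about it."""
    mean = sum(samples) / len(samples)
    fluctuations = [value - mean for value in samples]
    return mean, fluctuations


# Hypothetical temperature record [K]: constant level plus a fast oscillation.
temperature = [288.0 + 0.3 * math.sin(2.0 * math.pi * k / 8.0) for k in range(64)]
mean_t, fluct_t = reynolds_decompose(temperature)

print(mean_t)                       # mean part <f>
print(sum(fluct_t) / len(fluct_t))  # mean of f', zero up to rounding
```

By construction the fluctuations average to zero over the window, which is the property used when the equations are split into a mean and a turbulent set.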
1.4 Navier-Stokes Equations
Conservation Equations
The Navier-Stokes equations are differential equations that describe the motion of fluids, here the atmospheric particles. There are many ways to present these equations: as a system of differential equations, in Cartesian or spherical coordinates, for incompressible flow, for dry air, etc. All formulations are based on the conservation equations of mass, moisture, momentum, and heat and on the equation of state for an ideal gas. The following abbreviations will be used for the divergence operator and the Laplace operator, respectively:
$$\operatorname{div}(\mathbf{u}) = \frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y} + \frac{\partial u_z}{\partial z}, \qquad \Delta \mathbf{u} = \frac{\partial^2 \mathbf{u}}{\partial x^2} + \frac{\partial^2 \mathbf{u}}{\partial y^2} + \frac{\partial^2 \mathbf{u}}{\partial z^2}.$$
The material or Lagrangian derivative of $f$, taken with respect to a moving coordinate system, is used in fluid mechanics. It is defined as the sum of the partial (Eulerian) and advective terms:
$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \mathbf{u} \cdot \operatorname{grad}(f).$$
This derivative gives information about the temporal evolution of a field by following it in its movement. The advective term describes the transport of $f$ by the wind vector $\mathbf{u}$.
Equation of State
The equation of state of a gas reads
$$p = \rho R T,$$
with $p$ the pressure of dry air (in Pa when the SI values below are used; 1 mbar = 100 Pa), $R$ the gas constant for dry air ($R = 287$ J K$^{-1}$ kg$^{-1}$), $\rho$ the mass density of air (about 1.3 kg m$^{-3}$), and $T$ the absolute temperature in [K].
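The equation of state can be checked numerically. The sketch below assumes SI units (pressure in Pa) and uses a standard sea-level pressure of 101,325 Pa at 273.15 K as an illustrative input; these two values are assumptions and are not taken from the text.

```python
R_DRY_AIR = 287.0  # gas constant for dry air [J K^-1 kg^-1], value from the text


def air_density(pressure_pa, temperature_k):
    """Density of dry air [kg m^-3] from the ideal gas law p = rho * R * T."""
    return pressure_pa / (R_DRY_AIR * temperature_k)


rho = air_density(101325.0, 273.15)
print(round(rho, 2))  # about 1.29, consistent with the 1.3 kg m^-3 quoted above
```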
Conservation of Mass
The conservation of mass is expressed by
$$\frac{d\rho}{dt} = -\rho \operatorname{div}(\mathbf{u}).$$
For an incompressible flow, the so-called Boussinesq approximation holds and $\frac{d\rho}{dt} = 0$. This equation is usually simplified:
$$\operatorname{div}(\mathbf{u}) = 0.$$
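Conservation of Momentum
Newton’s second law in an ECEF (Earth-Centered, Earth-Fixed) frame reads
$$\rho \frac{d\mathbf{u}}{dt} = -\operatorname{grad}(p) + \rho \mathbf{g} + \frac{\mu}{3} \operatorname{grad}(\operatorname{div}(\mathbf{u})) + \mu \Delta \mathbf{u} - 2 \rho\, \boldsymbol{\Omega} \times \mathbf{u},$$
where $-\operatorname{grad}(p)$ is the pressure gradient force, $\rho \mathbf{g}$ the influence of the gravity field with $\mathbf{g}$ the gravity field vector ($|\mathbf{g}| \approx 9.81$ m s$^{-2}$), and $\frac{\mu}{3} \operatorname{grad}(\operatorname{div}(\mathbf{u})) + \mu \Delta \mathbf{u}$ are viscous stress forces with $\mu$ [kg m$^{-1}$ s$^{-1}$] the dynamic viscosity coefficient. Finally, $-2 \rho\, \boldsymbol{\Omega} \times \mathbf{u}$ is the Coriolis force due to the Earth’s rotation, with $|\boldsymbol{\Omega}| \approx 7.29 \times 10^{-5}$ s$^{-1}$ the norm of the rotation vector of the Earth and $\times$ the cross product of two vectors.
Conservation of Internal Energy: First Law of Thermodynamics
The conservation of internal energy is expressed by
$$\rho \frac{de_{int}}{dt} = -p \operatorname{div}(\mathbf{u}) + \kappa_\vartheta \Delta T + \rho \varepsilon + \frac{\delta Q}{dt},$$
where $e_{int} = C_v T$ is the internal energy, assuming that the air is an ideal gas at temperature $T$ [K]. The variation of internal energy depends on $p \operatorname{div}(\mathbf{u})$, a term describing the heat transport via adiabatic processes, and on $\kappa_\vartheta \Delta T$, the heat transport via molecular conduction, where $\kappa_\vartheta$ [kg m$^{-1}$ s$^{-1}$] is called the thermometric conductibility coefficient. The heat transport via molecular dissipation of kinetic energy is denoted $\varepsilon$, where $\varepsilon$ [m$^2$ s$^{-3}$] is the dissipation rate of kinetic energy into heat. Finally, $\frac{\delta Q}{dt}$ is a dissipation term via radiation. This equation can be simplified (De Moor 2006) and written in the following form:
$$\frac{d\theta}{dt} = \nu_\theta \Delta \theta,$$
where $\nu_\theta = 2 \times 10^{-5}$ m$^2$ s$^{-1}$ is the kinematic coefficient for air and $\theta$ the potential temperature.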
Potential Temperature and Virtual Potential Temperature
The potential temperature $\theta$ represents the temperature which would result if the air were brought adiabatically to a standard pressure level $p_0$:
$$\theta = T \left( \frac{p}{p_0} \right)^{-0.286}.$$
The virtual potential temperature $T_v$ is the temperature required for dry air in order to have the same pressure and the same density as a sample of moist air: $T_v = T(1 + 0.61 q)$, where $q$ is the mixing ratio of air (mass of water vapor to the mass of dry air). The virtual potential temperature is a conserved variable in unsaturated ascent. If $T_v$ is constant with height, the atmosphere is statically neutral. If it increases with height, the atmosphere is stable; if it decreases with height, the atmosphere is unstable.
Conservation of Specific Moisture
Finally, the conservation of specific moisture $q$ reads
$$\frac{dq}{dt} = \nu_q \Delta q,$$
where $\nu_q$ [kg m$^{-1}$ s$^{-1}$] is the water vapor diffusion coefficient, taken as constant. These equations can be complemented by taking into consideration the condensation of water vapor. Please refer to Stull (1988, chapter 3) for more details.
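The two temperature definitions given above translate directly into code. The following sketch is illustrative only: the function names are hypothetical, the default reference pressure of 1000 (in the same unit as the pressure, e.g., mbar) is an assumption, and the potential temperature is written in the equivalent form $T (p_0/p)^{0.286}$.

```python
def potential_temperature(temperature_k, pressure, reference_pressure=1000.0):
    """theta = T * (p0 / p)**0.286, with p and p0 given in the same unit."""
    return temperature_k * (reference_pressure / pressure) ** 0.286


def virtual_temperature(temperature_k, mixing_ratio):
    """T_v = T * (1 + 0.61 q), with q in kg of water vapor per kg of dry air."""
    return temperature_k * (1.0 + 0.61 * mixing_ratio)


# Air at 280 K and 850 mbar is warmer once brought adiabatically to 1000 mbar.
print(potential_temperature(280.0, 850.0))
# Moist air with q = 0.01 behaves like slightly warmer dry air.
print(virtual_temperature(300.0, 0.01))
```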
Approximations
The Navier-Stokes equations describe the physics of turbulence in the atmosphere and are important for understanding the fluxes, transport, and forces that govern the atmosphere. From these five equations, two sets are derived, one for the mean flow and the other for the turbulent flow, by using the relation $\mathbf{u} = \langle \mathbf{u} \rangle + \mathbf{u}'$. This complex set of equations cannot be solved directly. Indeed, as
$$\left\langle \frac{\partial \left( (\langle u_z \rangle + u_z')(\langle f \rangle + f') \right)}{\partial z} \right\rangle = \frac{\partial (\langle u_z \rangle \langle f \rangle)}{\partial z} + \frac{\partial \langle u_z' f' \rangle}{\partial z}$$
(third axiom of Reynolds), nonlinear advection terms $\langle u_z' f' \rangle$ appear, leaving the system unclosed without a “closure assumption” (Boussinesq 1877). This assumption postulates that the unknown turbulent moments can be expressed in terms of known ones. At first order, it gives, for instance, for the variable $f$: $\langle u_z' f' \rangle = -K_f \frac{\partial \langle f \rangle}{\partial z}$, with $K_f$ being the closure coefficient. Other approximations can be made by neglecting terms of small magnitude, such as the shallow motion approximation (Mahrt 1986) and the shallow convection approximation. The “Boussinesq” approximation states that the vertical scale is small compared to the effective height of the atmosphere; thus, the high frequencies can be neglected and the state of the atmosphere varies little from the hydrostatic and adiabatic one. The incompressibility of the atmosphere is the most important application of this approximation. Refer to Stull (1988, p. 80) for more details.
Derived from the Navier-Stokes equations, the energy cascade is another way to describe turbulence that was introduced by Kolmogorov (1941). We will focus on this approach in the next section.
Mean Turbulent Kinetic Energy
The TKE is the mean kinetic energy per unit mass associated with turbulence. It is often used to quantify the intensity of turbulence:
$$TKE = \frac{1}{2} \left\langle u_x'^2 + u_y'^2 + u_z'^2 \right\rangle,$$
where $u_x'$, $u_y'$, $u_z'$ denote the turbulent fluctuations of the velocity field. Thus, for $\mathbf{u}' = 0$, the flow is laminar. From the Navier-Stokes equations, the evolution of TKE with time is given in a simplified form by
$$\frac{\partial TKE}{\partial t} = T + M + F + \varepsilon,$$
where $T$ is a redistribution term of existing turbulence that neither creates nor destroys turbulence, and $M$ describes the transfer between the mean kinetic energy and the turbulent kinetic energy. It contributes to a homogenization of gradients by mixing. $F$ is a production term via the buoyancy forces. The diurnal variation of $F$ is important (unstable during the day and stable at night). Finally, $\varepsilon$ is a viscous dissipation term that acts to reduce turbulence.
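A minimal sketch of the TKE definition above, assuming that the fluctuations are obtained by removing the sample mean of each wind component over the record. The function name and the short wind records are hypothetical.

```python
def turbulent_kinetic_energy(ux, uy, uz):
    """TKE per unit mass [m^2 s^-2] from three wind component records [m s^-1]."""
    def variance(series):
        mean = sum(series) / len(series)
        # Mean squared fluctuation about the sample mean.
        return sum((value - mean) ** 2 for value in series) / len(series)

    return 0.5 * (variance(ux) + variance(uy) + variance(uz))


# Hypothetical records: only the x component fluctuates.
ux = [1.0, -1.0, 1.0, -1.0]
uy = [0.0, 0.0, 0.0, 0.0]
uz = [2.0, 2.0, 2.0, 2.0]
print(turbulent_kinetic_energy(ux, uy, uz))  # 0.5
print(turbulent_kinetic_energy(uz, uz, uz))  # 0.0: constant wind, no turbulence
```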
1.5 A Spectral Approach of Turbulence: The Concept of Eddies
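Energy Cascade
The turbulent kinetic energy can be expressed using the wavenumber spectrum (Wheelon 2001, p. 22):
$$TKE = \int_{\mathbb{R}^3} \Phi_v(\boldsymbol{\kappa})\, d^3\kappa,$$
where $\Phi_v(\boldsymbol{\kappa})$ is called the 3D velocity turbulence spectrum. It describes the mean energy of the wavenumber vector $\boldsymbol{\kappa} = (\kappa_x, \kappa_y, \kappa_z)^T$:
$$\Phi_v(\boldsymbol{\kappa}) = \frac{1}{(2\pi)^3} \left\langle \left| \int_{\mathbb{R}^3} \mathbf{u}'(\mathbf{x})\, e^{-i \boldsymbol{\kappa} \cdot \mathbf{x}}\, d^3x \right|^2 \right\rangle.$$
The kinetic energy of the turbulence can be interpreted as a superposition of Fourier modes of wave vector $\boldsymbol{\kappa}$, also called eddies of size $l = 2\pi/\kappa$, that all contribute to the global TKE. This energy is computed over all directions of $\boldsymbol{\kappa}$. Thus, it is possible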
Fig. 4 Energy cascade model (from Schön and Brunner 2008a). Turbulent energy / eddy instability versus wavenumber: energy input region (outer scale $L_0$, wavenumber $\kappa_0$), inertial subrange with cascading energy redistribution, and energy loss region (inner scale $l_0$, wavenumber $\kappa_s$) where the energy is dissipated
to define the energy spectrum $E(\kappa)$ as the integral of the wavenumber spectrum over a sphere of radius $\kappa = \|\boldsymbol{\kappa}\|$:
$$TKE = \int_0^\infty E(\kappa)\, d\kappa.$$
Only the information about the mean energy of scale $\kappa = \|\boldsymbol{\kappa}\|$ is retained. This description of the energy is only complete in the case of isotropy. Based on this assumption, Kolmogorov (1941) postulates by dimensional analysis, i.e., by inspecting the physical units of the quantities in the equations, that in a so-called “inertial subrange” $E(\kappa) \propto \varepsilon^{2/3} \kappa^{-5/3}$ holds. If [ ] stands for “dimension of,” we have $[TKE] = \mathrm{m^2\, s^{-2}}$, $[\varepsilon] = \mathrm{m^2\, s^{-3}}$, $[\kappa] = \mathrm{m^{-1}}$; thus,
$$[E(\kappa)] = \frac{[TKE]}{[\kappa]} = \mathrm{m^3\, s^{-2}}.$$
A dimensional analysis gives the following combination of units: $[\varepsilon^{2/3} \kappa^{-5/3}] = \mathrm{m^3\, s^{-2}}$; hence, $E(\kappa) = C \varepsilon^{2/3} \kappa^{-5/3}$, with $C$ being a dimensionless constant. Therefore, turbulent processes are often explained and illustrated by the energy cascade model (Fig. 4), consisting of three regimes, the energy input region, the
inertial subrange, and the energy dissipation region. All eddies coexist and are nested within one another:
• Energy input region: At large scales $L$ (small wavenumbers $\kappa = 2\pi/L$), a small part of the kinetic energy in the ambient wind field is converted into turbulent energy. This is the scale where turbulent energy is produced. Large eddies are highly elongated (Wheelon 2001) and contain considerable kinetic energy. The initial size of the eddies is called the outer scale length $L_0$.
• Inertial subrange: The eddies begin to break down as soon as they are created. The input energy is subsequently redistributed and divided to smaller and smaller scales (energy cascade). During this process, the size of the eddies is reduced and they become more and more symmetrical. No energy is created or dissipated in this range. The cascade continues until the Reynolds number is sufficiently small, where $Re \sim u_0 L_0 / \nu$ and $u_0$, $L_0$ are, respectively, the velocity and the outer scale length and $\nu$ the viscous coefficient.
• Energy loss region: The process ends when the size of the eddies is comparable to the inner scale length $l_0$. The remaining energy is dissipated by viscosity as heat. At the inner length scale, the Kolmogorov microscale can be defined as $\eta = \left( \nu^3 / \varepsilon \right)^{1/4}$. For atmospheric turbulence, this scale is on the order of a fraction of a mm.
For more details, please refer to Tatarskii (1971a), Wheelon (2001, chapter 2), or Kolmogorov (1941). The Kolmogorov theory, first developed for isotropic and homogeneous turbulence, has been shown to work under a wide range of conditions where neither isotropy nor homogeneity can be assumed (Kraichnan 1967).
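The dimensional argument behind the inertial subrange law and the Kolmogorov microscale can be reproduced with a short Python sketch. Writing $E(\kappa) \propto \varepsilon^{a} \kappa^{b}$ and matching the units m$^3$ s$^{-2}$ gives two linear conditions on $a$ and $b$. The function names are hypothetical, and the input values in the example are illustrative assumptions only.

```python
from fractions import Fraction


def kolmogorov_exponents():
    """Solve [eps**a * kappa**b] = m^3 s^-2 for the exponents a and b.

    With [eps] = m^2 s^-3 and [kappa] = m^-1:
      seconds: -3a = -2
      metres:   2a - b = 3
    """
    a = Fraction(2, 3)  # from -3a = -2
    b = 2 * a - 3       # from 2a - b = 3
    return a, b


def kolmogorov_microscale(nu, eps):
    """eta = (nu**3 / eps)**(1/4), nu in m^2 s^-1 and eps in m^2 s^-3."""
    return (nu ** 3 / eps) ** 0.25


a, b = kolmogorov_exponents()
print(a, b)  # 2/3 -5/3

# Illustrative values only: nu as quoted for air, eps chosen arbitrarily.
print(kolmogorov_microscale(2e-5, 8e-3))  # about 1e-3 m
```

The resulting microscale depends only weakly on the dissipation rate (power 1/4), which is why it stays around the millimeter level or below for a wide range of atmospheric conditions.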
Wavenumber Spectrum of Turbulence
The turbulent kinetic energy can be expressed using the wavenumber spectrum (Wheelon 2001, p. 22) as shown in section “Energy Cascade.” Derived from the power law dependency of the energy spectrum, Kolmogorov proposed the following 3D velocity turbulence spectrum:
$$\Phi_v(\boldsymbol{\kappa}) = \frac{0.033\, C_v^2}{\left( \kappa_x^2 + \kappa_y^2 + \kappa_z^2 \right)^{11/6}},$$
where $\Phi_v(\boldsymbol{\kappa})$ is the velocity turbulence spectrum valid for the inertial range $\frac{2\pi}{L_0} \ll \kappa \ll \frac{2\pi}{l_0}$, where $L_0$, $l_0$ are the outer and inner scale lengths of turbulence that bound the inertial range, respectively, and $C_v^2$ is called the velocity structure constant. This model was confirmed by measurements of passive scalars such as temperature or refractive index (Monin and Yaglom 1975). The velocity can therefore be replaced by $n$, the refractive index. The main problem of this model is that it gives infinite values for some quantities such as the mean square fluctuations of the refractive index, $\langle (n')^2 \rangle$. Therefore, the von Karman model was proposed for the energy input region
Fig. 5 Comparison of von Karman and Kolmogorov spectrum model
$$\Phi_n(\boldsymbol{\kappa}) = \frac{0.033\, C_n^2}{\left( \kappa_x^2 + \kappa_y^2 + \kappa_z^2 + \kappa_0^2 \right)^{11/6}},$$
where $C_n^2$ [m$^{-2/3}$] is the refractive index structure constant. This scaling parameter describes the intensity of turbulence. Its value differs between microwave frequencies, which are more influenced by the water vapor content, and optical frequencies, which are influenced by temperature fluctuations. This empirical model, which depends on the outer length scale through $\kappa_0$ and was shown to be valid for the inertial range as well, is easier to handle and provides finite estimates for all quantities of interest, particularly the structure function. A graphical representation of these spectra is depicted in Fig. 5. The blue line represents the logarithmic Kolmogorov energy spectrum versus the logarithmic wavenumber, and the red one, the von Karman model. The saturation property of the von Karman model for small logarithmic wavenumbers is highlighted by the nearly horizontal slope. Please refer to Voitsekhovich (1995) or Wheelon (2001, chapter 2) for a presentation of other models such as the Greenwood model or the exponential model. By spatially filtering the small scales and only dealing with scales that are at equilibrium in the energy cascade, the spectral analysis is a good way to “calculate” turbulence. This is the objective of the so-called large eddy simulation (Lesieur 2008)
where the large scales are calculated explicitly and the effect of small scales is modeled using a sub-grid scale model.
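The difference between the Kolmogorov and von Karman spectra discussed in the section on the wavenumber spectrum can be illustrated numerically. The sketch below evaluates both models as functions of the wavenumber magnitude; the structure constant, the outer scale length of 100 m, and the relation $\kappa_0 = 2\pi/L_0$ used to set the saturation wavenumber are illustrative assumptions.

```python
import math


def kolmogorov_spectrum(kappa, cn2):
    """0.033 * Cn^2 * kappa**(-11/3), with kappa the wavenumber magnitude."""
    return 0.033 * cn2 * kappa ** (-11.0 / 3.0)


def von_karman_spectrum(kappa, cn2, kappa0):
    """0.033 * Cn^2 * (kappa**2 + kappa0**2)**(-11/6)."""
    return 0.033 * cn2 * (kappa ** 2 + kappa0 ** 2) ** (-11.0 / 6.0)


cn2 = 1e-14                    # hypothetical structure constant [m^-2/3]
kappa0 = 2.0 * math.pi / 100.0  # assumed outer scale length of 100 m

for factor in (1e-3, 1.0, 1e3):
    kappa = factor * kappa0
    ratio = von_karman_spectrum(kappa, cn2, kappa0) / kolmogorov_spectrum(kappa, cn2)
    print(factor, ratio)
```

For wavenumbers well above $\kappa_0$ the ratio tends to one, so both models agree in the inertial range. For wavenumbers well below $\kappa_0$ the ratio tends to zero, because the Kolmogorov model diverges while the von Karman model saturates.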
2 Statistical Approach of Turbulence for Geodetic Applications
2.1 Introduction
In geodetic applications, the statistical approach of turbulence by means of structure functions, power spectral densities, or correlation and covariance functions is used. Indeed, when dealing with time series of GPS- or VLBI-derived tropospheric slant delays, residuals of least-squares adjustments, or phase differences, these functions are an easy way to describe or quantify turbulence effects, for instance by searching for a Kolmogorov power law dependency. In this section, the focus will be on the definition of stationary processes, processes with stationary increments, and the concept of local homogeneity. The Kolmogorov theory gives access to a scaling for the spatial structure function of the refractive index and allows us to derive the temporal and spatial structure functions for phase measurements with the “5/3 and 2/3” power law dependency. A brief link to semivariograms will be given. The introduction of the power spectral density will allow an interpretation of the results by means of low-pass filtering. We will conclude this statistical part by highlighting the connection between all these quantities and by briefly comparing turbulent time series with processes such as white noise or random walk, with which the reader may be more familiar.
2.2 Random Fields
Stationary Increments and Local Homogeneity
Stationary Increments
Time series of temperature, pressure, moisture, wind velocity, or refractive index used to describe atmospheric turbulence are not stationary. In the following, such a random process will be called $f$; it can be written as the sum of a mean part $\langle f \rangle$ and turbulent fluctuations $f'$: $f = \langle f \rangle + f'$ with $\langle f' \rangle = 0$. The mean value $\langle f \rangle$ is time dependent; thus, $f(t)$ is not stationary. However, the difference $f(t + \tau) - f(t)$, with $\tau$ being a time increment, is considered to be stationary over a large range of time increments. Such a process is called a process with stationary increments. It is defined (Ishimaru 1984, p. 512; Yaglom 1987) by the following properties:
the mean of $f(t + \tau) - f(t)$ is a function of $\tau$ only, and $D_f(\tau) = \langle [f(t + \tau) - f(t)]^2 \rangle$ is a function of $\tau$ only. $D_f(\tau)$ is called the structure function of the first increment or of the second order. The structure function can always be written by means of the covariance function:
$$D_f(\tau) = B_f(t + \tau, t + \tau) + B_f(t, t) - B_f(t + \tau, t) - B_f(t, t + \tau),$$
where the covariance $B_f$ is given by
$$B_f(t, t + \tau) = \left\langle [f(t) - \langle f(t) \rangle]\, [f(t + \tau) - \langle f(t + \tau) \rangle] \right\rangle.$$
The resulting expression is time dependent, which can be verified by taking a linear function $f = a + bt$, where $a$ and $b$ are random variables (Tatarskii 1971a, p. 14).
Stationary Process
A process whose statistical properties do not vary with time is defined as a stationary process and is a special case of a process with stationary increments. Its probability density function is independent of any time shift. Only in this case can the covariance function be expressed by the structure function: $D_f(\tau) = 2B_f(0) - 2B_f(\tau)$, or $B_f(\tau) = \frac{1}{2} [D_f(\infty) - D_f(\tau)]$, since $\langle [f(t)]^2 \rangle = \langle [f(t + \tau)]^2 \rangle = B_f(0)$. Thus, for stationary processes, the structure function converges or saturates at infinity and can be used equivalently to the covariance function. Given an unknown process which is not obviously stationary, its structure function can be used for a first characterization. Errors in $\langle f \rangle$ will not affect the results of the structure function estimation. In a strict sense, turbulence parameters cannot be considered stationary. However, from physical considerations (Williams et al. 1998) and for very large time increments, the amplitude of long-wavelength fluctuations should behave as white noise and converge to a given value. Thus, it is possible to compute their covariance function by the previous equation. For example, it was experimentally shown (Hogg et al. 1981) that for $\tau > 350$ s, the temporal structure function for Water Vapor Radiometer data saturates.
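An empirical structure function of the first increment can be estimated from a regularly sampled series as sketched below. The function name and the test series are hypothetical. The example uses a purely linear drift to illustrate two points made in the text: a constant error in the mean does not change the estimate, and a linear trend yields a structure function that grows quadratically with the lag instead of saturating.

```python
def structure_function(samples, lag):
    """Empirical D_f(lag): mean of [f(t + lag) - f(t)]**2, lag given in samples."""
    if not 0 < lag < len(samples):
        raise ValueError("lag must be between 1 and len(samples) - 1")
    diffs = [samples[i + lag] - samples[i] for i in range(len(samples) - lag)]
    return sum(d * d for d in diffs) / len(diffs)


# Hypothetical series: linear drift f = 2 + 0.5 t, and the same series with an offset.
trend = [2.0 + 0.5 * t for t in range(50)]
shifted = [value + 10.0 for value in trend]

for lag in (1, 2, 4):
    print(lag, structure_function(trend, lag), structure_function(shifted, lag))
```

For this drift the estimate equals $(0.5 \cdot \text{lag})^2$, so doubling the lag multiplies the structure function by four.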
Local Homogeneity and Isotropy
Equivalently to the time domain, a locally homogeneous and isotropic process can be defined by replacing $\tau$ by a spatial increment $\boldsymbol{\rho}$. If $\mathbf{r}$ is a 3D position vector, the spatial structure function $D_f(\boldsymbol{\rho}) = D_f(\rho) = \langle (f(\mathbf{r} + \boldsymbol{\rho}) - f(\mathbf{r}))^2 \rangle$ depends only on the magnitude $\rho$ of the separation distance and not on its direction. It can be shown (Ishimaru 1984, p. 516) that in the case of isotropy,
$$D_f(\rho) = 8\pi \int_0^\infty \left( 1 - \frac{\sin \kappa\rho}{\kappa\rho} \right) \Phi_f(\kappa)\, \kappa^2\, d\kappa,$$
where $\kappa = \|\boldsymbol{\kappa}\|$ is the norm of the wavenumber vector and $\Phi_f(\boldsymbol{\kappa}) = \Phi_f(|\boldsymbol{\kappa}|)$, $\Phi_f(\boldsymbol{\kappa})$ being the 3D spectrum: $\Phi_f(\boldsymbol{\kappa}) = \frac{1}{(2\pi)^3} \int B_f(\mathbf{r})\, e^{-i \boldsymbol{\kappa} \cdot \mathbf{r}}\, d\mathbf{r}$. If $f$ is locally homogeneous and isotropic, the spectrum may have a singularity at the origin of type $x^{-n}$ with $n < 5$ for $x \to 0$. This is required for the convergence of the integrals (Tatarskii 1971a).
Dealing with Anisotropy
The Kolmogorov theory postulates the isotropy of turbulence in the inertial range. This limitation can be weakened by the use of the so-called stretched wavenumber coordinates that take anisotropy into account to some extent (Wheelon 2001, p. 59). The wavenumber spectrum can be expressed, thanks to an affine transformation of the wavenumbers, as
$$\Phi_f(\boldsymbol{\kappa}) = abc\, \Phi_f\!\left( \sqrt{a^2 \kappa_x^2 + b^2 \kappa_y^2 + c^2 \kappa_z^2} \right),$$
with $a$, $b$, $c$ being stretching factors that describe the elongation of the eddies in the three dimensions.
Locally Inhomogeneous Field with Smoothly Varying Mean Characteristics
Inhomogeneity can be expressed (Tatarskii 1971a, p. 36) by the product of a slowly varying function $\Phi_f^0(\boldsymbol{\kappa})$, which describes the spectral distribution of turbulent fluctuations for the whole medium, and a term with quicker variations, $C_f^2\!\left( \frac{\mathbf{r}_1 + \mathbf{r}_2}{2} \right)$. This second term describes the intensity of the fluctuations in a given region of the medium:
$$\Phi_f\!\left( \boldsymbol{\kappa}, \frac{\mathbf{r}_1 + \mathbf{r}_2}{2} \right) = C_f^2\!\left( \frac{\mathbf{r}_1 + \mathbf{r}_2}{2} \right) \Phi_f^0(\boldsymbol{\kappa}),$$
where $\mathbf{r}_1$ and $\mathbf{r}_2$ denote two different position vectors as described in Fig. 6.
Fig. 6 Representation of the position vectors and the separation distance
Kolmogorov Theory
Following Kolmogorov (1941), the similarity hypothesis results in a power law dependency for the structure function:
$$D_f(\rho) = C_f^2 \rho^{2/3}$$
or, for the index of refractivity, $D_n(\rho) = C_n^2 \rho^{2/3}$, where $C_n^2$ is the structure constant of refractivity in [m$^{-2/3}$]. This popular 2/3 power law for the structure function is, surprisingly, experimentally applicable over a wide range of separations (Wheelon 2001, p. 31), even when the turbulence can no longer be assumed isotropic. Indeed, Kraichnan (1967) postulates that for 2D turbulent processes, energy should be transferred from the forced scales to larger ones at the constant rate $\varepsilon$: $E(\kappa) \propto \varepsilon^{2/3} \kappa^{-5/3}$; the second invariant, called enstrophy, $Z(\kappa) = \int \kappa^2 E(\kappa)\, d\kappa$, goes this time from the forced to the smaller scales at a constant rate. The Global Atmospheric Sampling Program (GASP) showed evidence for this law in the atmosphere at scales between 3 and 3,000 km, where isotropy cannot be assumed (Nastrom and Gage 1985).
Difference Between Structure Function of Nth Order and of Nth Increments
In the domain of turbulence, the study of the structure function of the first increment is often sufficient, the first increment being considered stationary. If this cannot be verified, random processes with stationary nth increments can be considered instead (Lindsey and Chi 1976):
$$\Delta^N f(t) = \sum_{k=0}^{N} (-1)^k \binom{N}{k} f(t + (N - k)\tau),$$
and the structure function of the nth increment reads
$$D_f^N(\tau) = E\left\{ \left[ \Delta^N (f(t)) \right]^2 \right\},$$
where $E$ denotes the mathematical expectation. Independently, Kolmogorov (1962) introduced the structure function of nth order. He postulates that $D_f^n(\rho) = \langle [f'(\mathbf{r}) - f'(\mathbf{r} + \boldsymbol{\rho})]^n \rangle \propto C_f^n \varepsilon^{n/3} \rho^{n/3}$. However, for high-order structure functions, this scaling law shows a clear discrepancy with measurements (Mathieu and Scott 2000). The phenomenon of intermittency, i.e., brief episodes of turbulence with intervening periods of relatively weak or unmeasurably small fluctuations, may be related to this behavior (Vincent and Meneguzzi 1991). Intermittency is currently a major field of research.
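The nth increment defined by the binomial sum above can be sketched as follows. The function name and the two test functions are hypothetical. The example illustrates why higher increments are useful for data with trends: the second increment removes a linear drift completely and reduces a quadratic one to a constant.

```python
from math import comb


def nth_increment(f, t, tau, order):
    """Sum over k of (-1)**k * C(order, k) * f(t + (order - k) * tau)."""
    return sum(
        (-1) ** k * comb(order, k) * f(t + (order - k) * tau)
        for k in range(order + 1)
    )


def linear(t):
    return 3.0 + 2.0 * t


def quadratic(t):
    return t * t


print(nth_increment(linear, 1.0, 0.5, 1))     # 1.0: first increment keeps the drift
print(nth_increment(linear, 1.0, 0.5, 2))     # 0.0: second increment removes it
print(nth_increment(quadratic, 5.0, 1.0, 2))  # 2.0: constant for a quadratic trend
```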
Phase Structure Function
Differences of phase observations are widely used in geodesy or astronomy in optical or microwave interferometers, like VLBI. Therefore, phase structure functions, sometimes called eikonal structure functions, are introduced. If $\varphi_1$, $\varphi_2$ denote the phase measurements for the previously defined position vectors $\mathbf{r}_1$ and $\mathbf{r}_2$, and assuming that the medium is homogeneous, i.e., $\langle \varphi_1^2 \rangle = \langle \varphi_2^2 \rangle = \langle \varphi^2 \rangle$, then the structure function reads
$$D_\varphi(\rho) = \left\langle (\varphi_1 - \varphi_2)^2 \right\rangle = 2 \left[ \langle \varphi^2 \rangle - \langle \varphi_1 \varphi_2 \rangle \right].$$
If the field is considered locally homogeneous, the phase structure function for spatial measurements is given as an integration of the refractivity spectrum along the lines of sight $ds_1$, $ds_2$:
$$D_\varphi(\rho) = k^2 \int_0^\infty ds_1 \int_0^\infty ds_2\, C_n^2(\mathbf{r}) \int d^3\kappa\, \Phi_n^0(\boldsymbol{\kappa})\, e^{i \boldsymbol{\kappa}^T (\mathbf{r}_1 - \mathbf{r}_2)} \left( 1 - e^{i \boldsymbol{\kappa}^T \boldsymbol{\rho}} \right),$$
where $\mathbf{r} = \frac{\mathbf{r}_1 + \mathbf{r}_2}{2}$ is the average slant range along the rays, $k$ the electromagnetic wavenumber, $\boldsymbol{\rho}$ the separation between the receivers, $\Phi_n^0$ the refractivity spectrum, and $C_n^2$ the structure constant. Assuming that the 2/3 power law dependency for the spatial structure function of the refractive index can be extended to the entire troposphere (Stotskii 1973) and after some simplifications (Wheelon 2001, p. 212), the phase structure function for atmospheric transmission can be expressed as follows:
$$\begin{cases} D_\varphi(\rho) \propto \rho^{5/3} \int_0^\infty C_n^2(z)\, c\, a^{-5/3}\, dz, & \text{if } \rho \ll \frac{a}{c} H, \\ D_\varphi(\rho) \propto \rho^{2/3} \int_0^\infty z\, C_n^2(z)\, c\, a^{-5/3}\, dz, & \text{if } \rho \gg \frac{a}{c} H, \\ D_\varphi(\infty) \approx \text{const}, & \text{else,} \end{cases}$$
with $H$ the tropospheric height, usually taken between 1,000 and 2,000 m. For long baselines, the atmosphere is seen as a “thin” atmosphere. Thus, the turbulence to be taken into account is 2D: eddies that influence the measurements are elongated in the horizontal direction ($c \ll a$) and not in the vertical direction. In contrast, for short baselines, the turbulence is seen to be more isotropic, due to smaller eddies ($a = b = c$) in a thick atmosphere.
Links to Semivariogram
Although the term structure function is preferred in geodetic applications, semivariogram and structure function of first order are synonymous when considering processes with stationary first increments. The general formulation of the semivariogram of the process $f$ is defined as (Cressie 1993)
$$\gamma(\mathbf{s}_1 - \mathbf{s}_2) = \frac{1}{2} \operatorname{var}[f(\mathbf{s}_1) - f(\mathbf{s}_2)],$$
where var denotes the variance. If $\mathbf{s}_1$ is replaced by $\mathbf{r} + \boldsymbol{\rho}$ and $\mathbf{s}_2$ by $\mathbf{r}$, with $\boldsymbol{\rho}$ an increment of $\mathbf{r}$, then the semivariogram for a process with stationary first increments reads
Fig. 7 Different semivariogram models (panels: linear, Gaussian, power, and exponential semivariogram model)
$$\gamma(\boldsymbol{\rho}) = \frac{1}{2} \operatorname{var}[f(\mathbf{r} + \boldsymbol{\rho}) - f(\mathbf{r})] = \frac{1}{2} E[(f(\mathbf{r} + \boldsymbol{\rho}) - f(\mathbf{r}))^2].$$
Thus, up to the factor 1/2, the semivariogram is equivalent to the structure function in this case. Semivariograms are a good visualization of spatial or temporal correlations. Figure 7 presents different semivariogram models (linear, spherical, Gaussian, exponential, or sine wave) that can be used in Kriging, a statistical technique used to interpolate the value of a random field at one point from known nearby values (Cressie 1993, p. 121). By definition, the semivariogram has the following properties: $\gamma(0) = 0$ and $\gamma(\boldsymbol{\rho}) = \gamma(-\boldsymbol{\rho}) \geq 0$.
Semivariograms are characterized by three main quantities (Cressie 1993):
• Nugget: Height of the jump of the semivariogram at the discontinuity at the origin. Although by definition the value of the semivariogram at the origin should be zero, a local variability between two close values can occur for small increments, e.g., due to measurement errors.
• Sill: Limit of the variogram at infinity. Indeed, very often, the semivariogram stops increasing for large $\rho$ and is stable around a given value, which is the a priori variance of the random process.
• Range: Distance at which the difference between the variogram and the sill becomes negligible.
Figure 8 summarizes the different terms used to characterize a semivariogram.
Fig. 8 Vocabulary for semivariograms (sill, nugget, and range versus distance increment)
As mentioned for the structure functions, the behavior of the variogram near 0 gives information about the continuity properties of the process. Moreover, if the process is considered to be Gaussian, the semivariogram will be a linear combination of independent chi-squared random variables (Cressie 1993). It should be noted that, as for the empirical autocorrelation, the empirical variogram or structure function can only be computed for temporal or spatial lags of up to 1/10 to 1/6 of the length of the data. For more details on semivariogram estimation, please refer to Cressie (1993, chapter 2).
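The vocabulary of nugget, sill, and range can be made concrete with a small sketch. The empirical estimator below is half the mean squared increment at a given lag. The exponential model is one common parametrization, assumed here for illustration: the factor 3 in the exponent is a convention (the "practical range", at which about 95% of the rise from nugget to sill is reached) and is not taken from the text. All function names are hypothetical.

```python
import math


def empirical_semivariogram(samples, lag):
    """Half the mean squared increment at the given lag (in samples)."""
    diffs = [samples[i + lag] - samples[i] for i in range(len(samples) - lag)]
    return 0.5 * sum(d * d for d in diffs) / len(diffs)


def exponential_semivariogram(h, nugget, sill, practical_range):
    """Exponential model with a jump (nugget) at the origin and a limit (sill)."""
    if h == 0.0:
        return 0.0  # gamma(0) = 0 by definition
    rise = 1.0 - math.exp(-3.0 * h / practical_range)  # assumed convention
    return nugget + (sill - nugget) * rise


alternating = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
print(empirical_semivariogram(alternating, 1))  # 0.5
print(empirical_semivariogram(alternating, 2))  # 0.0

for h in (0.0, 1.0, 30.0, 300.0):
    print(h, exponential_semivariogram(h, nugget=0.1, sill=1.0, practical_range=30.0))
```

In line with the remark above on the usable lags, the empirical estimator should only be evaluated for lags that are a small fraction of the record length.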
Taylor’s Frozen Hypothesis: The Temporal Structure Function
Taylor’s hypothesis of frozen turbulence is often implicitly or explicitly assumed in studies involving turbulence in geodesy. Taylor (1938) postulates that “If the velocity of the air stream which carries the eddies is very much greater than the turbulence velocity, one may assume that the sequence of changes in u at a fixed point are simply due to the passage of an unchanging pattern of turbulent motion over the point.” Thus, he assumed a “frozen” atmosphere during the measurement. For a random field $f$ advected by the mean wind $\langle \mathbf{u} \rangle$, it follows that $f(\mathbf{r}, t + \tau) = f(\mathbf{r} - \langle \mathbf{u} \rangle \tau, t)$, where the variable component of the wind velocity is ignored. As shown in Fig. 9, the frozen eddies are transported horizontally at the mean wind velocity. This approximation can be used both in the inertial and the dissipation range, where the eddies remain much smaller than the outer length scale. It performs better as the mean wind speed increases. The concept of a locally frozen random medium was later introduced by Tatarskii (1971b). It only considers that a small volume of the element remains frozen in time
Fig. 9 Concept of Taylor’s frozen hypothesis
and is used for problems that are sensitive to velocity fluctuations (Wheelon 2001, p. 248). Working under the classic Taylor’s frozen hypothesis, one can write $\rho = \langle u \rangle \tau$. As a consequence, the temporal structure function behaves as the spatial one with $\rho$ replaced by $\langle u \rangle \tau$. Thus, for the phase structure function of spatial measurements, if $0 \ll \langle u \rangle \tau \ll 1$, $D_\varphi(\tau) \propto \tau^{5/3}$.
Application of the Structure Function of the Second Increment: Allan Variance
The Allan variance (Allan 1987) was introduced to deal with trends or drifts in data, since in this case the temporal structure function increases quadratically with time. The Allan variance describes phase noise in terms of fractional frequency stability and is often used to characterize oscillator stability. It is proportional to the structure function of the second-order increment (Thompson et al. 2004):
$$A_f(\tau) \propto \frac{\left\langle [f(t + \tau) - 2f(t) + f(t - \tau)]^2 \right\rangle}{2\tau^2}.$$
The Allan variance can be written as a linear combination of the structure function of the first increment (Wheelon 2001): $A_f(\tau) \propto \frac{4D_f(\tau) - D_f(2\tau)}{2\tau^2}$. From the Kolmogorov theory, the Allan variance for turbulent phase fluctuations should exhibit a $\tau^{-1/3}$ and a $\tau^{-4/3}$ power law for the 3D and 2D case, respectively. Naudet (1996) showed that for tropospheric delays estimated from GPS carrier-phase observations, this scaling is more or less consistent with theory. The Allan variance can be used to compare the tropospheric fluctuations with other factors that limit precision, such as clock instabilities (Armstrong and
Sramek 1982). More details on AVAR (Allan VARiance) can be found in Barnes et al. (1971), Rutman (1978), Allan (1987), and Riley (2008).
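The power laws quoted for the Allan variance follow from its expression as a linear combination of the first-increment structure function, $4D_f(\tau) - D_f(2\tau)$ divided by a term proportional to $\tau^2$. The sketch below checks this numerically for the two Kolmogorov structure functions, $D \propto \tau^{5/3}$ (3D) and $D \propto \tau^{2/3}$ (2D). The function names and the normalization by $2\tau^2$ are assumptions made for illustration.

```python
import math


def allan_variance_from_structure(structure, tau):
    """(4 D(tau) - D(2 tau)) / (2 tau**2), up to a constant factor."""
    return (4.0 * structure(tau) - structure(2.0 * tau)) / (2.0 * tau ** 2)


def log_slope(func, tau1, tau2):
    """Slope of log(func) versus log(tau) between two lags."""
    return (math.log(func(tau2)) - math.log(func(tau1))) / (math.log(tau2) - math.log(tau1))


def d_3d(tau):
    return tau ** (5.0 / 3.0)  # 3D Kolmogorov structure function


def d_2d(tau):
    return tau ** (2.0 / 3.0)  # 2D structure function


slope_3d = log_slope(lambda tau: allan_variance_from_structure(d_3d, tau), 1.0, 100.0)
slope_2d = log_slope(lambda tau: allan_variance_from_structure(d_2d, tau), 1.0, 100.0)
print(slope_3d)  # close to -1/3
print(slope_2d)  # close to -4/3
```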
Structure Functions Limitation
Many investigations aim at estimating the power law dependency of tropospheric delay or phase time series in GPS or VLBI data in order to characterize turbulence. This objective can be reached with more or less success (Dravskikh and Finkelstein 1979; Armstrong and Sramek 1982; Lay 1997; Davis 2001; Stotskii and Stotskaya 2001; Schön and Brunner 2006; Stotskii et al. 2006). Indeed, many other effects can also be reflected in power laws (Nichols-Pagel et al. 2008). Moreover, the structure function estimates are highly correlated between adjacent time lags, which masks the variability of the power law exponent (Nichols-Pagel et al. 2008). The uncertainty of the estimated power law exponent cannot easily be assessed for structure functions, the sampling theory being quite complicated (Muzy et al. 1993). Consequently, spectral density functions are to some extent an alternative approach.
2.3 Spectral Density Functions
General Expression of the Spectral Density
The spectral density function $W_f(\omega)$ of the process $f$ can be defined from the temporal covariance function $B_f(\tau)$:

$$B_f(\tau) = \int_{-\infty}^{\infty} W_f(\omega)\, e^{i\omega\tau}\, d\omega,$$

with $\omega$ being the angular frequency. This integral converges only if, for $W_f(\omega) \propto \omega^{n}$ as $\omega \to 0$, $n > -1$. Using the Wiener-Khinchin theorem,

$$W_f(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} B_f(\tau)\, e^{-i\omega\tau}\, d\tau.$$

The temporal structure function $D_f(\tau)$ can be expressed in terms of $W_f(\omega)$ (Tatarskii 1971a):

$$D_f(\tau) = 2 \int_{-\infty}^{\infty} \left(1 - \cos(\omega\tau)\right) W_f(\omega)\, d\omega.$$

Convergence of this integral is ensured for $W_f(\omega) \propto \omega^{n}$ with $n > -3$ as $\omega \to 0$. Large eddies, corresponding to the low-frequency part of the spectrum, contain much more energy than the smaller isotropic ones. Thus, turbulent processes with stationary increments may have a spectral density function with a singularity at the origin. Because the convergence requirements on the spectral density are weaker for structure functions than for covariance functions, structure functions are preferred to covariance functions to describe turbulent processes.

Atmospheric Transmission
Tatarskii (1971a) showed that if the structure function of a process behaves like a power law with exponent $\mu$ and a constant $C_f^2$, $D_f(\tau) = C_f^2 |\tau|^{\mu}$, $0 < \mu < 2$, then the spectral density function has the following form:

$$W_f(\omega) = \frac{C_f^2}{2\pi}\, \Gamma(\mu + 1)\, \sin\!\left(\frac{\pi\mu}{2}\right) |\omega|^{-(\mu+1)},$$
where $\Gamma$ denotes the gamma function. Consequently, for a Kolmogorov power law dependency, the spectral density of phase behaves as $\omega^{-5/3}$ for 2D turbulence and $\omega^{-8/3}$ for 3D turbulence. A similar formula for terrestrial measurements, obtained by the use of stretched coordinates, can be found in Wheelon (2001, chapter 6). For short baselines and short averaging times, a $-8/3$ scaling can be assumed. However, for long baselines and long sampling intervals, Taylor’s frozen hypothesis has to be used carefully.

Spectral Density for Interferometric Measurements
The spectral density function for phase differences is related to the spectral density of phase via the relation

$$W_{\Delta\varphi}(\omega, \rho) = 4 \sin^2\!\left(\frac{\omega\rho}{2\langle u\rangle}\right) W_{\varphi}(\omega).$$

As previously, a threshold frequency $\omega_c = 2\pi\langle u\rangle / L_0$ depending on the mean wind speed and the outer scale length can be introduced, yielding

$$W_{\Delta\varphi} \propto \sin^2\!\left(\frac{\omega\rho}{2\langle u\rangle}\right) \omega^{-8/3}, \quad \omega > \omega_c; \qquad W_{\Delta\varphi} \propto \omega^{-2/3}, \quad \omega < \omega_c.$$

For a mean wind speed of 10 m s$^{-1}$, this gives a transition at approximately 0.1 Hz for a separation of 30 m (Wheelon 2001, p. 304).

Spectral Density and Structure Function: A Comparison
Following Tatarskii (1971a) or Nichols-Pagel et al. (2008), the spectral density has some advantages over the structure function. First of all, the spectral density function exists for both stationary processes and processes with stationary increments. Moreover, its physical meaning in terms of frequency filtering is easier to understand: the troposphere acts as a low-pass filter via the large eddies. When estimating power laws, it is important to have access to the covariance structure of the power law estimate (Davis 2001; Nichols-Pagel et al. 2008). For a spectral density function with a power law dependency, the log estimates can be written as the sum of a stochastic noise with known variance (approximately Gaussian) and a non-stochastic part, and the exponent can be estimated in a least-squares model.
However, due to the high correlation of structure function estimates, no such confidence interval can be obtained in that case.
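The least-squares estimation of a spectral power law mentioned above can be sketched as follows. This is only an illustration under simplifying assumptions: the spectral values are noise-free and generated from the Tatarskii form, and an unweighted ordinary least-squares fit in log-log space is used, whereas real periodogram estimates are noisy and would need the known noise variance as weights. The function names `tatarskii_spectrum` and `fit_power_law` are hypothetical.

```python
import math


def tatarskii_spectrum(omega, c2, mu):
    """Spectral density for D_f(tau) = c2 * |tau|**mu with 0 < mu < 2."""
    if not 0.0 < mu < 2.0:
        raise ValueError("mu must lie in (0, 2)")
    return (c2 / (2.0 * math.pi) * math.gamma(mu + 1.0)
            * math.sin(math.pi * mu / 2.0) * abs(omega) ** (-(mu + 1.0)))


def fit_power_law(omegas, values):
    """Ordinary least squares of log(values) on log(omegas).

    Returns (exponent, log_amplitude) of the fitted line.
    """
    xs = [math.log(w) for w in omegas]
    ys = [math.log(v) for v in values]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


if __name__ == "__main__":
    # Logarithmically spaced angular frequencies (arbitrary illustrative grid).
    omegas = [0.01 * 1.5 ** k for k in range(20)]
    for label, mu in (("2D", 2.0 / 3.0), ("3D", 5.0 / 3.0)):
        values = [tatarskii_spectrum(w, 1.0, mu) for w in omegas]
        exponent, _ = fit_power_law(omegas, values)
        print(f"{label}: fitted spectral exponent {exponent:.4f}")
```

With exact power-law input the fit returns $-(\mu + 1)$, i.e., $-5/3$ and $-8/3$ for the two Kolmogorov regimes.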
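Table 1 Power law for some common processes

Process | Spectral density $W_f(\omega)$ | Autocorrelation $B_\varphi(\tau)$
Random walk (Brownian noise) | $\propto \omega^{-2}$ | $\propto \tau$
White noise | $\propto \omega^{0} = \text{const}$ | $\text{const} \cdot \delta(\tau)$ ($\delta$: Dirac function)
Flicker noise | $\propto \omega^{-1}$ | Does not exist
Gauss-Markov process | $\propto \dfrac{2\sigma^2\beta}{\omega^2 + \beta^2}$ | $\sigma^2 e^{-\beta|\tau|}$

Table 2 Power law dependency

Function | Power law
Structure function | $a$
Power spectrum | $a + 2$
Allan variance | $a - 2$
Spectral density | $-(1 + a)$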
Moreover, tropospheric delays are often modeled as random walks in Kalman filtering. Davis (2001) showed that the structure function can wander off at longer lags for some random walk processes. Therefore, the spectral density function is considered more accurate and precise than the structure function.

Power Law Dependencies and Statistical Processes
Kolmogorov power law dependencies are often searched for in time series to characterize the signature of turbulence. However, some common processes also exhibit power laws (Wright 1996; Thompson et al. 2004). This can lead to a misinterpretation of the results when time series are estimated, for instance, with Kalman filtering as a random walk (Davis 2001). Table 1 summarizes some common stochastic processes in terms of power law processes. In addition, Table 2 gives the corresponding power law dependencies (cf. Allan (1987) or Thompson et al. (2004, p. 538)).
2.4 Conclusions
Figure 10 shows the typical temporal structure function (b) and Allan variance (c) of three different time series (a). It illustrates the relationship between the scalings given in Table 2. The magenta time series (time series 3) is the sum of two power law processes (time series 1 and time series 2, with structure function exponents of 3/2 and 0, respectively) and highlights the break in the statistical functions, as can be seen for the phase structure function of turbulent time series. Time series 2 is a white noise simulation. We have defined mathematically the different statistical quantities (structure functions, power spectrum, spectral density) usually used to quantify the effects of turbulence in geodesy. Figure 11 summarizes the power law dependencies for the functions previously described and helps to understand their relationships.
Fig. 10 Power law dependency of four different time series
3 Covariance Models
3.1 Introduction
With increasing data rates of VLBI and especially GNSS (Global Navigation Satellite System) observations (1 Hz), temporal correlations induced by clock frequency and refractive index variations are becoming more important. Structure and spectral density functions were described in Sect. 2 as possible tools to characterize turbulent fluctuations in time series. In this section, we focus on modeling physical correlations between observations by means of fully populated variance-covariance matrices, which can be incorporated in the stochastic model. After a definition of the covariance and a short review of existing work on physical correlations for GPS observations, we describe two covariance models used for spatial data analysis: the first one was developed by Treuhaft and Lanyi (1987) based on the Kolmogorov structure function, whereas the second one, by Schön and Brunner (2008a), was derived from the von Karman power spectrum of turbulence, taking anisotropy and inhomogeneity into consideration.
Fig. 11 Power law dependencies for turbulent process
3.2 Covariance: Definition
Geodetic data analysis by weighted least-squares adjustment or Kalman filtering requires knowledge of covariance functions for a reliable description of the stochastic model. However, using fully populated variance-covariance matrices (VCMs) can entail numerical and computational issues (El-Rabbany 1994; Howind et al. 1999). A priori VCMs for parameters of interest, such as tropospheric wet delays, can help to improve the estimation (Kleijer 2004). In the following, we concentrate on the estimation of covariance functions that can be used to model the impact of tropospheric fluctuations. The covariance function is defined (Tatarskii 1971) as

$$B_f(t_1, t_2) = \left\langle \left(f(t_1) - \langle f(t_1)\rangle\right) \left(f(t_2) - \langle f(t_2)\rangle\right) \right\rangle.$$

For stationary processes, this quantity depends only on the time difference $\tau = t_2 - t_1$. Thus, the statistical dependency between the fluctuations of $f$ at times $t$ and $t + \tau$ is given by $B_f(\tau) = \left\langle (f(t+\tau) - \langle f\rangle)(f(t) - \langle f\rangle) \right\rangle$. For a homogeneous medium, $t + \tau$ is replaced by $r + \rho$.
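For a stationary process, the covariance and the structure function are linked by $D_f(\tau) = 2\left[B_f(0) - B_f(\tau)\right]$. The sketch below illustrates this with sample estimators. It assumes, purely for illustration, that the series is treated as periodic (indices wrap around), because the identity then holds exactly for the sample estimates; the function names are hypothetical.

```python
import math


def circular_covariance(samples, lag):
    """Sample covariance B(lag) with periodic (wrap-around) indexing."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((samples[(k + lag) % n] - mean) * (samples[k] - mean)
               for k in range(n)) / n


def circular_structure_function(samples, lag):
    """Sample structure function D(lag) with periodic (wrap-around) indexing."""
    n = len(samples)
    return sum((samples[(k + lag) % n] - samples[k]) ** 2 for k in range(n)) / n


if __name__ == "__main__":
    # Deterministic test series (arbitrary illustrative choice).
    series = [math.sin(0.37 * k) + 0.5 * math.cos(1.1 * k) for k in range(200)]
    b0 = circular_covariance(series, 0)
    for lag in (1, 5, 17):
        d = circular_structure_function(series, lag)
        b = circular_covariance(series, lag)
        print(lag, d, 2.0 * (b0 - b))
```

With non-periodic estimators the two sides agree only approximately, since the sums then run over different subsets of the samples.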
3.3 Previous Work on Physical Correlations
Only a few studies on physical correlations between code observations are available, probably due to the rather large noise, with a standard deviation of 0.3–3 m depending on the code type and receiver. Considering GNSS, most studies have been published exclusively on GPS. However, the results can be transferred directly to other GNSS, such as the Russian GLONASS, the upcoming European Galileo, or the Chinese BeiDou. Mathematical correlations occur if new observables are formed from the initial measurements, like double differences in GNSS (Seeber 2003). Physical correlations have a temporal and/or a spatial nature. However, for space geodetic observations, the whole transmitter-medium-receiver geometry varies with time, so that a categorization into temporal and spatial correlations may be misleading since both effects are linked. In a strict sense, physical correlations are induced by random parts of satellite orbit errors, clock errors, and ionospheric or tropospheric propagation effects. Remaining systematic effects will obviously also yield “correlated” observations. We will, however, focus on the first-mentioned contribution.
Up to now, several concepts have been proposed to improve the initially very simple stochastic model of GPS carrier-phase observations. Both the heteroscedasticity of the undifferenced observations and the mathematical correlations of double differences are nowadays taken into account in high-end GPS analysis software (e.g., Beutler et al. 1987). However, despite many investigations (e.g., El-Rabbany 1994; Teunissen and Kleusberg 1998; Tiberius et al. 1999; Howind et al. 1999; Wang et al. 2002; Bischoff et al. 2005; Luo 2013), the incorporation of physical correlations in the stochastic model of GPS observables remains a major issue in GPS data processing. As a consequence, the parameters estimated from repeated observations are not as accurate as predicted by the associated formal variance-covariance matrix (VCM) of the parameters.
Thus, this matrix is said to be too optimistic and therefore not appropriate to evaluate the significance of GPS-derived results (El-Rabbany 1994; Jin et al. 2010). Moreover, the homogeneous stochastic model is inappropriate. Elevation-dependent models or models based on the carrier-to-noise density ratio (Hartinger and Brunner 1999; Brunner et al. 1999; Wieser and Brunner 2000) take the heteroscedasticity of the observations into account. However, the physical correlations between GPS observations (Schön and Brunner 2007) should be modeled better.
Fig. 12 Variation of the refractive index in the troposphere and induced path delay and correlations
El-Rabbany (1994) investigated physical correlations by fitting empirical correlation functions of exponential and of polynomial type to the autocorrelation data of double-differenced (DD) phase residuals for different baseline lengths. Howind (2005) analyzed the autocorrelation function of the ionosphere-free linear combination and developed an empirical correlation function of a modified exponential type. A more mathematical approach using an iterative whitening procedure to decorrelate the observations was also proposed by Wang et al. (2002) with different covariance models. Williams (2003) proposed a general form for power law noise model (colored noise) and derived the rate uncertainty for estimated components. Empirical approaches were studied by Leando and Santos (2006) who developed an empirical stochastic model based on the residual autocorrelation and showed that the height component biases can be significantly reduced. If two GPS signals travel through the “same” turbulent atmosphere, they will encounter the same eddies and thus the same variation of refractive index as shown in the following figure (Fig. 12). In the next section, we will focus on the two covariance functions proposed by Treuhaft and Lanyi (1987) and Schön and Brunner (2008a) to model physical correlations due to the tropospheric refractive index fluctuations, based on the turbulence theory.
3.4 Treuhaft and Lanyi model (1987)
Using the general expression of the tropospheric delay and assuming a homogeneous medium, Treuhaft and Lanyi developed a “slab model” for the troposphere that postulates uniform turbulent activity up to a given height $h$ and no turbulence above. They expressed the structure function for two phase measurements $\varphi_1$, $\varphi_2$ separated by a distance $\rho$ on the Earth’s surface, observed along rays with elevation $\varepsilon$ and azimuth $\alpha$, by directly integrating the structure function of the refractive index $D_n$ along the lines of sight $s_1$, $s_2$:

$$D_\varphi(\rho) = \frac{1}{\sin^2(\varepsilon)} \int_0^H\!\!\int_0^H \left\{ D_n\!\left( \left[ \rho^2 + 2(s_1 - s_2)\,\rho \cot\varepsilon \cos\alpha + \left(\frac{s_1 - s_2}{\sin\varepsilon}\right)^2 \right]^{1/2} \right) - D_n\!\left( \frac{|s_1 - s_2|}{\sin\varepsilon} \right) \right\} ds_1\, ds_2,$$

where $H$ is the effective height of the troposphere. This function has to be integrated numerically. Treuhaft and Lanyi showed that the 5/3 and 2/3 power laws hold for 3D and 2D turbulence, respectively. With its limitations in mind, Taylor’s frozen hypothesis can be used to express the temporal structure function. Based on the relation $B_f(\tau_1, \tau_2) = \frac{1}{2}\left[D_f(\infty) - D_f(\tau_1 - \tau_2)\right]$ for a stationary process and using the Kolmogorov structure function for the refractive index, Treuhaft and Lanyi (1987) determined the covariance between two tropospheric slant delays $\tau_1$, $\tau_2$:

$$B_\varphi(\tau_1, \tau_2) = \frac{1}{\sin(\varepsilon_1)\sin(\varepsilon_2)} \left[ H^2\sigma^2 - \frac{1}{2} \int_0^H\!\!\int_0^H D_n\!\left( \left| s_1(z) - s_2(z') - \langle u\rangle t \right| \right) dz\, dz' \right],$$

where $\langle u\rangle$ is the wind velocity, $s_1$ and $s_2$ denote the lines of sight, and $\sigma^2$ is the variance of the wet refractivity fluctuations, which is assumed independent of $s$ and can be expressed, assuming that the troposphere is completely uncorrelated as $\rho \to \infty$, as

$$\sigma^2 = \frac{1}{2} D_n(\infty).$$

However, the structure function for the refractive index was adapted from the Kolmogorov law $D_n(R) = C_n^2 R^{2/3}$ in order to obtain saturation at large lags, which avoids a divergence at infinity. Treuhaft and Lanyi postulated that for VLBI measurements,

$$D_n(R) = C_n^2\, \frac{R^{2/3}}{1 + \left(R/L\right)^{2/3}}.$$
The parameter $L$, chosen in a physically reasonable range, was set to 3,000 km, a value at which empirical structure functions for VLBI data saturate. Long-period fluctuations have little influence on VLBI parameter estimation, so the precision of the saturation scale value is sufficient. The geostrophic value of the wind speed was 8 m s$^{-1}$ and the structure constant $C_n = 2.4 \times 10^{-7}$ m$^{-1/3}$, a mean value at average midlatitudes. The effective tropospheric height for isotropic turbulence was assumed to be $H = 1$ km. All those parameters have to be consistent with each other for a good description of the turbulence; mean values seemed to be sufficient in a first approximation to reach good results for 3D RMS VLBI estimations (Romero-Wolf et al. 2012). The Treuhaft and Lanyi model was first developed for VLBI measurements assuming homogeneous and isotropic turbulence and a Taylor frozen medium. For the estimation of the dry tropospheric component, the given constants have to be adapted. This development can also be used for optical astronomical measurements by changing the values of the wind speed and of the saturation scale, which should be taken up to a few meters for measurements in the boundary layer (Treuhaft and Lowe 1995).

Simulation
In the following figure, we simulated the covariance versus time for one station. We computed the value of the covariance $B_\varphi(t_0, t_j)$ for one transmitter at a given elevation $\varepsilon(t_0)$ and azimuth $\alpha(t_0)$ at $t_i = t_0$, with $t_0 < t_j < t_{\mathrm{end}}$. The case with two stations would give the same results. The parameter $L$ was changed from 3,000 m (left) to 3,000 km (right). The wind speed plays an important role for the correlation length. In the second case (right plot, $L_0 = 3{,}000$ km), the behavior of the covariance versus time is close to $B_\varphi(t_i, t_j) = \beta / \left(\sin\varepsilon(t_i)\, \sin\varepsilon(t_j)\right)$, $\beta = \text{const}$ (Fig. 13). For $L = 3{,}000$ m and small values of the wind speed (2 m s$^{-1}$), the influence of the variation of the separation distance versus time can be seen, whereas for very strong wind the covariance decreases much faster with time.
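Since the slab-model structure function has to be integrated numerically, a small sketch may help to see what such an evaluation involves. It is not the authors' implementation: it assumes the saturating form $D_n(R) = C_n^2 R^{2/3} / (1 + (R/L)^{2/3})$ with the parameter values quoted in the text, uses a plain midpoint rule (a simple choice made here for brevity), and the function names `d_n`, `sigma2`, and `d_phi` are hypothetical.

```python
import math

C_N = 2.4e-7      # structure constant in m^(-1/3), value quoted in the text
L_SAT = 3.0e6     # saturation scale L = 3,000 km, in m
H_SLAB = 1.0e3    # effective tropospheric height H = 1 km, in m


def d_n(r, c_n=C_N, l_sat=L_SAT):
    """Saturating refractivity structure function D_n(R)."""
    r23 = abs(r) ** (2.0 / 3.0)
    return c_n ** 2 * r23 / (1.0 + r23 / l_sat ** (2.0 / 3.0))


def sigma2(c_n=C_N, l_sat=L_SAT):
    """Variance sigma^2 = D_n(infinity) / 2 = C_n^2 * L^(2/3) / 2."""
    return 0.5 * c_n ** 2 * l_sat ** (2.0 / 3.0)


def d_phi(rho, elevation, azimuth, height=H_SLAB, n=200):
    """Midpoint-rule approximation of the slab-model double integral.

    rho is the horizontal separation in m; elevation and azimuth are in radians.
    """
    sin_e = math.sin(elevation)
    cot_e = math.cos(elevation) / sin_e
    step = height / n
    nodes = [(k + 0.5) * step for k in range(n)]
    total = 0.0
    for s1 in nodes:
        for s2 in nodes:
            slant = (s1 - s2) / sin_e
            arg2 = (rho ** 2
                    + 2.0 * (s1 - s2) * rho * cot_e * math.cos(azimuth)
                    + slant ** 2)
            # Guard against tiny negative values caused by rounding.
            total += d_n(math.sqrt(max(arg2, 0.0))) - d_n(abs(slant))
    return total * step ** 2 / sin_e ** 2


if __name__ == "__main__":
    print("sigma^2 =", sigma2())
    for rho in (10.0, 100.0, 1000.0):
        print(rho, d_phi(rho, math.radians(30.0), math.radians(90.0)))
```

The midpoint rule is adequate for a qualitative look at the growth with $\rho$; an accurate estimate of the power law exponent would call for a finer grid or an adaptive quadrature.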
3.5 Schön and Brunner model (2008a)
Starting from the power spectrum expression for the phase covariance, Schön and Brunner (2008a) developed a second model describing the covariance between spatial measurements induced by refractivity fluctuations in the troposphere. Local inhomogeneities as well as anisotropy are modeled by means of stretched wavenumber coordinates. Following Wheelon (2001, p. 136), the most general case of the covariance between phase observations from station $A$ to transmitter $i$ at epoch $t_A$ and from station $B$ to transmitter $j$ at epoch $t_B$ induced by refractivity fluctuations reads

$$\left\langle \varphi_A^i(t_A),\, \varphi_B^j(t_B) \right\rangle = \int_0^\infty\!\!\int_0^\infty\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} \Phi_n\!\left(\boldsymbol{\kappa}, \frac{\mathbf{r}_1 + \mathbf{r}_2}{2}\right) e^{i \boldsymbol{\kappa}^T \mathbf{d}}\, d^3\kappa\; ds_A\, ds_B,$$
Fig. 13 Covariance versus time for different wind speeds and azimuth and different values of L
where $\Phi_n(\boldsymbol{\kappa})$ is the power spectrum of the refractive index and $\boldsymbol{\kappa}$ the 3D vector of wavenumbers. As the electromagnetic signals coming from satellites travel through the whole troposphere, they encounter eddies of different sizes, from large and elongated eddies to small and isotropic ones near the ground. Phase measurements in the microwave domain are mostly influenced by water vapor fluctuations in a medium that can be considered as locally homogeneous with smoothly varying mean characteristics (Tatarskii 1971). The power spectrum is expressed in spherical wavenumber coordinates $(\omega, \psi, q)$ by

$$\Phi_n\!\left(\boldsymbol{\kappa}, \frac{\mathbf{r}_1 + \mathbf{r}_2}{2}\right) = C_n^2\!\left(\frac{\mathbf{r}_1 + \mathbf{r}_2}{2}\right) \Phi_n^0(\omega, \psi, q).$$

$\Phi_n$ is the product of a term evaluated at the mean point ($C_n^2$, the structure constant) and a correlation term $\Phi_n^0$ that changes rapidly with the separation. Schön and Brunner (2008a) chose the von Karman spectrum and showed that the integral can be simplified by integrating over $\omega$, $\psi$, and $q$:

$$\left\langle \varphi_A^i(t_A),\, \varphi_B^j(t_B) \right\rangle = \frac{12}{5}\, \frac{0.033}{\Gamma\!\left(\frac{5}{6}\right)}\, \frac{\sqrt{\pi^3}\, \kappa_0^{-2/3}\, 2^{-1/3}}{\sin\varepsilon_A^i\, \sin\varepsilon_B^j}\, C_n^2 \int_0^H\!\!\int_0^H (\kappa_0 d)^{1/3}\, K_{1/3}(\kappa_0 d)\, dz_1\, dz_2,$$

where $\Gamma$ denotes the gamma function and $K$ the modified Bessel function of the second kind, also named the Macdonald function (Abramowitz and Segun 1972). $\varepsilon_A^i$ and $\varepsilon_B^j$ denote the elevations of the satellites at stations $A$ and $B$. This equation must be evaluated numerically. A double-integrated “Matérn kernel”

$$B(r) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\frac{\sqrt{2\nu}\, r}{l}\right)^{\nu} K_\nu\!\left(\frac{\sqrt{2\nu}\, r}{l}\right)$$

can be recognized, where $l$ is a scale length. It is plotted for different values of $\nu$ in the following figure (Fig. 14).

Fig. 14 Matérn kernel covariance for $\nu = 1/3$ and 5/2

A direct integration for the case $a = b = c = 1$ can be performed, and the variance simplifies to
˝
˛
2 D
p 1 2= 1
2 3 z2 3 0 3 2 =3 2 2 12 0:033 2 =3 1 C H f F ; 1 ; 3; 2; 1 ; 4 p 2 3 n 2 5 5 2 .sin "/2 3 3 6
2 2 2 27 2 =3 23 z =3 1 F2 56 ; 11 ; 7 ; z4 g; 80 6 3
Fig. 15 Geometrical interpretation of the separation distance d , after Schön and Brunner (2008a)
where $F$ denotes the hypergeometric function (Abramowitz and Segun 1972). The dimensionless argument $z$ is given by $z = \kappa_0 H p / \sin\varepsilon$, where the factor $p$, with $p^2 = \mathbf{r}^T R_v M R_v^T \mathbf{r}$, describes the impact of anisotropy on the variance. More details can be found in Schön and Brunner (2008a). We will now describe in more detail the physical meaning of the parameters involved in the computation of the phase covariances, namely, the separation distance, the tropospheric height, the structure constant, and the stretching parameters.
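The Matérn kernel that appears in the covariance integrand can be evaluated with the standard library alone. The sketch below is an illustration under stated assumptions, not the authors' code: the Macdonald function is computed from the integral representation $K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh(\nu t)\, dt$ with a trapezoidal rule, a choice made here to avoid external dependencies, and the names `bessel_k` and `matern` are hypothetical.

```python
import math


def bessel_k(nu, x, steps=4000):
    """Modified Bessel function of the second kind K_nu(x) for x > 0.

    Uses K_nu(x) = integral_0^inf exp(-x*cosh(t)) * cosh(nu*t) dt, truncated
    where the integrand has decayed by a factor of about exp(-60).
    """
    if x <= 0.0:
        raise ValueError("x must be positive")
    t_max = math.acosh(1.0 + 60.0 / x)
    h = t_max / steps
    total = 0.5 * (math.exp(-x)
                   + math.exp(-x * math.cosh(t_max)) * math.cosh(nu * t_max))
    for k in range(1, steps):
        t = k * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h


def matern(r, length, nu):
    """Matern correlation B(r) with scale length `length` and smoothness nu."""
    if r == 0.0:
        return 1.0  # limit of the kernel for r -> 0
    x = math.sqrt(2.0 * nu) * abs(r) / length
    return 2.0 ** (1.0 - nu) / math.gamma(nu) * x ** nu * bessel_k(nu, x)


if __name__ == "__main__":
    for nu in (1.0 / 3.0, 5.0 / 2.0):
        row = [round(matern(r, 1.0, nu), 4) for r in (0.0, 0.5, 1.0, 2.0, 4.0)]
        print(f"nu = {nu:.3f}:", row)
```

For $\nu = 1/2$ the kernel reduces to $e^{-r/l}$ and for $\nu = 5/2$ to $(1 + x + x^2/3)e^{-x}$ with $x = \sqrt{5}\, r/l$, which gives two convenient closed forms to check the numerical evaluation against.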
Separation Distance
For space geodetic measurements, the separation distance can be seen as the distance between two “rays” of a signal passing through the atmosphere. Its variation with time is the key to understanding the correlation and decorrelation of the phase observations or tropospheric delays. As a first step and for ease of visualization, we consider a constant height $h$. The following figure introduces the quantities that are used to express the separation distance exactly (Fig. 15). The distance $d$ depends on the wind vector $\langle \mathbf{u}\rangle = \langle u\rangle\, (\cos \mathrm{Azi}_{\mathrm{wind}},\ \sin \mathrm{Azi}_{\mathrm{wind}},\ 0)^T$ in a topocentric coordinate system, where $\mathrm{Azi}_{\mathrm{wind}}$ is the wind azimuth and $\langle u\rangle$ is the constant geostrophic wind speed. Thus, the components of the separation vector $\mathbf{d}$ can be expressed as follows:
cos ˛ i .t / dx D h cos Aziwind tan "iA.t ref/ C A ref
j
cos ˛B .t / j
tan "B .t /
hui t
cos ˛ i .t / dy D h sin Aziwind tan "iA.t ref/ C A ref
j
A ref
cos ˛B .t / j
tan "B .t /
sin ˛ i .t / C h sin Aziwind tan "iA.tref / C sin ˛ i .t / Ch cos Aziwind tan "iA .tref/ C
dz D 0
A ref
j
sin ˛B .t / j
tan "B .t / j
sin ˛B .t / j
tan "B .t /
Fig. 16 Separation distance versus time influence of the wind vector
$d(t, h) = \lVert \mathbf{d}(t, h) \rVert$, where $t_{\mathrm{ref}}$ is a reference epoch. Depending on the geometry and on the wind vector, $d(t, h)$ computed at height $h$ can have a minimum. The total differential of $d$ is given by

$$\mathrm{d}d = \frac{\partial d}{\partial h}\, \mathrm{d}h + \frac{\partial d}{\partial t}\, \mathrm{d}t.$$

By solving $\left.\partial d / \partial t\right|_h = 0$ for a given $h$, the epoch at which the separation distance has a minimum can be computed. In some cases, a strictly monotonic function is obtained, so that the minimum occurs at one border of the interval $[t_1, t_n]$. The exact mathematical form of the derivative is complicated and not very useful for understanding the behavior of the separation distance. In order to illustrate the basic principles of decorrelation, the following scenario was selected. We assumed a moving satellite during a time interval $[t_1, t_n]$. The distances between the intersection points of the lines of sight with a plane at a constant height $h$ describe the geometric effect on the variation of the separation distance. A second contribution, which generally dominates the overall separation distance, comes from the wind speed. Assuming Taylor’s hypothesis, this second contribution is visualized by the block arrows in Fig. 16. Finally, with $t_1$ as reference epoch, the overall separation distance is represented by the distance between the respective intersection point and the resulting black dot for $t_i$. Thus, the satellite geometry, the wind speed, and the wind direction as well as their geometric relationship play a key role for the variation of the separation distance. More details can be found in Schön and Brunner (2007). In our exemplary scenario, a continuous increase with time is depicted (Fig. 16). In order to highlight the effect of the wind azimuth on the minimum of the separation distance, we considered two receivers separated by 1,000 m in north-south direction. The stretching parameters are $a = b = 1$, $c = 0.01$, i.e., elongated eddies.
One satellite at 75° elevation and 250° azimuth is considered over 1,500 s (50 epochs at 30 s intervals). Fig. 17a shows that if the angle between the baseline vector and the wind vector is between 90° and 270°, a minimum of the separation distance will occur. Otherwise, the separation distance will continuously increase. Maximal correlations should correspond to the minimum of the separation distance.
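The role of the wind in producing a minimum of the separation distance can be illustrated with a strongly simplified sketch. It is not the formulation of Schön and Brunner: it assumes a fixed satellite direction (satellite motion is ignored), a north/east horizontal frame with the wind vector $\langle u\rangle(\cos \mathrm{Azi}_{\mathrm{wind}}, \sin \mathrm{Azi}_{\mathrm{wind}})$, and one particular sign convention in which the pierce point of ray A is advected by the wind and compared with the pierce point of ray B. All function names are hypothetical.

```python
import math


def pierce_point(receiver_xy, elevation, azimuth, height):
    """Horizontal (north, east) position where a ray crosses the plane z = height."""
    horizontal = height / math.tan(elevation)
    return (receiver_xy[0] + horizontal * math.cos(azimuth),
            receiver_xy[1] + horizontal * math.sin(azimuth))


def separation(point_a, point_b, wind_speed, wind_azimuth, dt):
    """Distance between ray A's pierce point, advected by the wind for dt
    seconds (frozen-flow assumption), and ray B's pierce point."""
    dx = point_a[0] + wind_speed * dt * math.cos(wind_azimuth) - point_b[0]
    dy = point_a[1] + wind_speed * dt * math.sin(wind_azimuth) - point_b[1]
    return math.hypot(dx, dy)


def epoch_of_minimum(point_a, point_b, wind_speed, wind_azimuth, epochs):
    """Epoch (from a list of candidate epochs) with the smallest separation."""
    return min(epochs,
               key=lambda t: separation(point_a, point_b, wind_speed, wind_azimuth, t))


if __name__ == "__main__":
    elev, azi, h = math.radians(75.0), math.radians(250.0), 1000.0
    p_a = pierce_point((0.0, 0.0), elev, azi, h)       # receiver A at the origin
    p_b = pierce_point((1000.0, 0.0), elev, azi, h)    # receiver B 1,000 m to the north
    epochs = range(0, 1501, 5)
    for wind_azi_deg in (0.0, 90.0, 180.0):
        t_min = epoch_of_minimum(p_a, p_b, 8.0, math.radians(wind_azi_deg), epochs)
        print(f"wind azimuth {wind_azi_deg:5.1f} deg: minimum at t = {t_min} s")
```

In this simplified setting the separation is $\lVert \mathbf{p}_A - \mathbf{p}_B + \langle\mathbf{u}\rangle t \rVert$, so an interior minimum exists only when the wind has a component that carries one pierce point towards the other; otherwise the distance grows monotonically from the first epoch.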
Fig. 17 Separation distance versus time for different wind vectors ((a) changing wind azimuth, (b) changing wind speed)
In Fig. 17a, the strong influence of the wind azimuth is illustrated. The evolution of the separation distance with and without the influence of the wind speed for a given wind azimuth is shown in Fig. 17b. When no wind is taken into consideration, the separation distance increases slowly and only the change in the geometry is visible, whereas for a wind velocity of 8 m s$^{-1}$, the separation distance increases much faster with time, and thus the covariance will decrease more rapidly.
Influence of the height at which the separation distance is computed
It should be noted that the results shown above depend on the chosen height $h$. To illustrate this impact, Fig. 18 shows an example of the influence of the height at which the separation distance is computed. The epoch at which the minimum of the separation distance occurs, for a given wind speed and wind azimuth, changes with this parameter. We chose here values of $h$ between 0 and 1,000 m.
Fig. 18 Separation distance versus h
Fig. 19 Values of separation distance between satellites at one station – GPS constellation
Finally, we give a short note on the typical range of values of the separation distance for observations at one station. Figure 19 shows which values of the separation distance can be found for an exemplary but typical GPS constellation. The separation distances $d(t = 0, H = 1{,}000\ \mathrm{m})$ were computed for all satellite pairs, with $\mathbf{r}_1 = \mathbf{r}_2 = (0, 0, H)^T$ and $\rho = 0$. For the ray intersection points on the horizontal plane at a height of $H = 1{,}000$ m, values between a few hundred meters (between PRN 4 and PRN 6) and 6,000 m (between PRN 12 and PRN 21) can be found. For baseline scenarios, the baseline length should be taken into account.

Fig. 20 Elongation of eddy in the stretched coordinate
Anisotropy
By means of the stretching parameters, the covariance model can account for the anisotropy of the atmosphere. Assuming that the horizontal parameters are $a = b = 1$, the parameter $c$ describes the vertical elongation of the eddies:

$$L_{0x} = aL_0, \quad L_{0y} = bL_0, \quad L_{0z} = cL_0.$$

Figure 20 is a schematic representation of an elongated eddy that can be found in the free atmosphere, where $L_{0x} = L_{0y}$ is the horizontal elongation and $L_{0z}$ the vertical one. For a given $L_0$:
• $a = b = c = 1$ represents isotropic, well-mixed 3D turbulence. The turbulence in the boundary layer can be considered isotropic near the ground.
• $c < 1$ represents eddies that are elongated in the horizontal direction. This should correspond to layered turbulence above the planetary boundary layer, in the loosely called free atmosphere, where the effect of the Earth’s surface friction on the air motion is negligible.
• $a = b = 1$, $c = 5$ would represent eddies with vertical expansion, as can be found in the mixed boundary layer under strong convection when warm air rises (De Moor 2006).
Tropospheric Height
The tropospheric height is assumed to have a value between 1,000 and 1,500 m. It corresponds to the part of the atmosphere where transport processes dominate. Between the ground and $H = 1{,}500$ m, the fluctuations of the refractive index strongly influence both geodetic and astronomical measurements. Following Stull (1988) or De Moor (2006), we can consider that the “loosely called” free atmosphere, where large elongated eddies predominate, starts at a height of 1,500 m. Because anisotropy is taken into consideration in the Schön and Brunner covariance model, the effective tropospheric height $H_{\mathrm{eff}} = \frac{a}{c} H$ has to be used. Thus, the elongation of the eddies through the parameter $c$ is quite important.

Outer Scale Length
With the concept of the energy cascade, the outer scale length can be defined as the size of the eddies that carry the maximum energy, i.e., the “biggest eddy.” In contrast, the inner scale length is the size of the smallest eddies, just before they dissipate by viscosity. The outer scale length is sometimes defined by the break point of the phase structure function power law dependency from a 5/3 to a 2/3 behavior (Stotskii 1973). Tatarskii (1971a) gives the value of $0.4\,h$ for the outer scale length near the surface, with $h$ being the height at which the outer scale length is computed. Experimental studies have shown that in the boundary layer, eddies can be very small and mainly isotropic ($a = b = c = 1$). Coulman (1990) and Coulman and Vernin (1991) showed that $L_0$ can have values of only a few centimeters, particularly in the lower part of the boundary layer. It is assumed that the outer scale length for isotropic turbulence, $L_{0x} = L_{0y} = L_{0z}$, “saturates” at values around 100–200 m for the isotropic eddies of the boundary layer (Fried 1967). Following Wheelon (2001), eddies become anisotropic as height increases and can be considered to be strongly elongated in the free atmosphere. The value of the outer scale length, measured with sounding radars or interferometers, can vary considerably, which makes it difficult to fix (Wheelon 2001). Moreover, it depends strongly on the kind of turbulence (2D, 3D) that is taken into consideration. At heights of $H = 1{,}500$–$2{,}000$ m, the horizontal length scale can be of the order of 6,000 m for a vertical length scale of 10–70 m. Since the value of the structure constant remains large in the free atmosphere and satellite signals travel through the whole troposphere, large elongated eddies should be taken into consideration. Thus, anisotropy is an important feature. Moreover, coherent values for the wind speed, the structure constant, and the outer scale length are essential. Furthermore, the values depend on the experiment that is performed, like VLBI, VLA (Very Large Array), zero, short, or long baselines, GPS, and terrestrial measurements.
Wind
The geostrophic wind can be defined as an unaccelerated horizontal wind blowing tangent to the isobars. It is a theoretical wind resulting from the balance between the Coriolis and pressure gradient forces. Information about the geostrophic wind vector is obtained from radiometer measurements, and a mean velocity of 8 m s$^{-1}$ should be a good approximation at the top of the boundary layer. The true wind is reduced by friction against the Earth or sea surface and is deflected towards the center of the low-pressure system. Near the ground, the wind has a logarithmic profile (Ekman spiral). Strong wind velocities can be found in the nocturnal jet (20 m s$^{-1}$; Banta et al. 2002). However, Fig. 21 gives a good schematic indication of the wind profile that can be found in the boundary layer at daytime. In the mixed layer, the wind speed is subgeostrophic and nearly constant with height, whereas in the free atmosphere, the wind reaches its geostrophic value of approximately 8 m s$^{-1}$. For correlations between two different satellites with a large separation distance at one station or for larger baseline lengths, 2D turbulence can be assumed and the geostrophic value of the wind taken. Figure 22 emphasizes once more the strong influence of the wind speed on the correlation time. For low wind velocity (2 m s$^{-1}$, Fig. 22a), the correlation lengths are larger than for strong wind (6 m s$^{-1}$, Fig. 22b). Thus, knowledge of the wind vector (azimuth and speed) is important for a good modeling of the covariance between measurements. Furthermore, we notice here that the simulated covariance matrices are not Toeplitz matrices. This reflects the non-stationarity of the time series.

Fig. 21 Wind speed profile in the boundary layer, adapted from Stull (1988)

Fig. 22 Changing the wind velocity for the same geometry, covariance one satellite-one station; simulations with $a = b = 1$, $c = 0.01$, $L_0 = 6{,}000$ m, $H = 1{,}000$ m, and wind azimuth = 0°. (a) wind speed 20 m s$^{-1}$ and (b) wind speed 2 m s$^{-1}$
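Why a changing observation geometry leads to covariance matrices that are not Toeplitz can be seen with a small sketch. It uses the limiting form $B_\varphi(t_i, t_j) = \beta / (\sin\varepsilon(t_i)\sin\varepsilon(t_j))$ quoted in Sect. 3.4 rather than the full model, and the linearly rising elevation is a hypothetical choice made only for the illustration, as are the function names.

```python
import math


def limiting_covariance(elevations, beta=1.0):
    """B(t_i, t_j) = beta / (sin(eps_i) * sin(eps_j)); elevations in radians."""
    return [[beta / (math.sin(ei) * math.sin(ej)) for ej in elevations]
            for ei in elevations]


def is_toeplitz(matrix, rel_tol=1e-12):
    """True if every diagonal of the square matrix is constant."""
    n = len(matrix)
    for i in range(1, n):
        for j in range(1, n):
            if not math.isclose(matrix[i][j], matrix[i - 1][j - 1], rel_tol=rel_tol):
                return False
    return True


if __name__ == "__main__":
    # Elevation rising linearly from 30 to 60 degrees over 10 epochs (hypothetical).
    rising = [math.radians(30.0 + 30.0 * k / 9.0) for k in range(10)]
    constant = [math.radians(45.0)] * 10
    print("rising elevation   ->", is_toeplitz(limiting_covariance(rising)))
    print("constant elevation ->", is_toeplitz(limiting_covariance(constant)))
```

A Toeplitz matrix depends only on the lag $i - j$, which is the matrix counterpart of stationarity; as soon as the elevation changes along the time series, the entries also depend on the absolute epoch.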
Taylor’s Frozen Hypothesis and VLBI
Taylor’s hypothesis supposes a frozen atmosphere and is a good approximation for short baselines. For long baselines such as those encountered in VLBI (more than 100 km), the frozen assumption is often made even for large eddies. Lin (1953) showed that Taylor’s frozen hypothesis is only strictly valid if the turbulence is not too strong and the mean shear is small. Thus, its applicability depends strongly on the scale of the turbulent structures that are investigated. Large atmospheric structures are assumed to be especially long-lived and are therefore good candidates for the frozen assumption. The application of Taylor’s hypothesis to large eddies should, however, only be valid when the mean wind speed remains greater than the turbulent velocity of the large eddies: $\langle u\rangle > u(L_0)$. Wright (1996) highlights this problem by mentioning that large eddies should probably move faster than small ones but with a speed on the order of the mean wind velocity. Moreover, the wind speed of strongly elongated eddies is not well defined (Wheelon 2001, p. 246) since the geostrophic wind speed can vary. However, Lay (1997) postulates that frozen turbulence should be a good approximation if the wind speed exceeds the speed of convective motions. Thus, taking the geostrophic wind value as well as mean values of the structure constant (Nilsson et al. 2007) is, in a first approximation, enough to improve the 3D RMS of VLBI significantly.
Structure Constant
The height dependence of this critical scaling parameter can be modeled following Tatarskii (1971a). Kleijer (2004) and Nilsson et al. (2007) implemented models in which the structure constant decreases exponentially with height, $C_n^2(z, z_0) = C_{n0}^2\, e^{-\frac{z+z_0}{H}}$, or is described by layers. A seasonal as well as diurnal site dependence has been shown by Naudet (1996) and Thomson et al. (1980). Moreover, the profile and values are quite different for microwave and optical frequencies (Wheelon 2001, p. 68), particularly at low altitude, where values for the optical regime can be more than 100 times smaller than for the microwave domain. More information on structure constant measurements can be found in Wheelon (2001) and an exhaustive list of references in Nilsson et al. (2007). In the variance-covariance models, the structure constant acts primarily as a scaling parameter. It describes the intensity of turbulence and is directly related to the rate of dissipation of the fluctuations of the index of refraction, $N_n$. It can be written as $C_n^2 = \beta\, N_n\, \varepsilon^{-1/3}$, where $\beta$ is a constant and $\varepsilon$ the energy dissipation (Ishimaru 1984). The refractive structure constant can be estimated from water vapor radiometer measurements, thermosondes, or radiosonde data. Following Treuhaft and Lanyi (1987), values between $10^{-13}$ and $10^{-14}$ m$^{-2/3}$ should be good approximations in the boundary layer and the first kilometers of the loosely called free atmosphere. Once more, this constant should be chosen consistently with other values such as wind speed or tropospheric height.
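The exponential height dependence of the structure constant can be illustrated with a minimal Python sketch. It assumes the profile $C_n^2(z, z_0) = C_{n0}^2 \exp(-(z+z_0)/H)$; the function name, the ground value of $10^{-13}$ m$^{-2/3}$, and the scale height of 2,000 m are illustrative assumptions, not values prescribed by the models cited above.

```python
import math


def cn2_profile(z, z0=0.0, cn2_ground=1e-13, scale_height=2000.0):
    """Structure constant C_n^2 in m^(-2/3) at height z (m).

    Assumed exponential profile: C_n0^2 * exp(-(z + z0) / H).
    cn2_ground (C_n0^2) and scale_height (H) are illustrative defaults.
    """
    return cn2_ground * math.exp(-(z + z0) / scale_height)


if __name__ == "__main__":
    for height in (0.0, 500.0, 1000.0, 2000.0):
        print(f"z = {height:6.0f} m  ->  C_n^2 = {cn2_profile(height):.3e} m^(-2/3)")
```

With these assumed defaults, the value drops by a factor of $e$ per scale height, so the chosen $H$ should be consistent with the tropospheric height used elsewhere in the covariance model.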
Turbulence Theory
3.6 Comparison Between the Schön and Brunner Model and the Treuhaft and Lanyi Model
The Treuhaft and Lanyi and the Schön and Brunner models are nearly equivalent in their formulation. However, the Schön and Brunner equation is a generalization of the model given in Treuhaft and Lanyi (1987). Using the small-argument expansion of the Macdonald function, the integral kernel of the Schön and Brunner model can be approximated as

$$(\kappa_0 d)^{1/3} K_{1/3}(\kappa_0 d) \approx \frac{\Gamma(1/3)}{2^{2/3}} - \frac{3\,\Gamma(2/3)}{2^{4/3}}\,(\kappa_0 d)^{2/3} = A_1 + A_2\, d^{2/3}.$$

Using the structure function representation, the integral has the form of Treuhaft and Lanyi (1987), with constants $A$ and $B$ that collect the numerical factor $0.033/\Gamma(5/6)$, the gamma-function terms of the expansion, the structure constant $C_n^2$, and the elevation-dependent term $1/(\sin\varepsilon_A^1 \sin\varepsilon_B^2)$.
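The small-argument approximation of the kernel can be checked numerically. The following Python sketch assumes that the kernel has the form $x^{1/3} K_{1/3}(x)$ with $x = \kappa_0 d$ and compares it with the two leading terms of its expansion, $\Gamma(1/3)/2^{2/3} - 3\,\Gamma(2/3)\,x^{2/3}/2^{4/3}$. The Macdonald function is evaluated from the ascending series of the modified Bessel function $I_\nu$; all function names are illustrative.

```python
import math

NU = 1.0 / 3.0


def bessel_i(nu, x, terms=30):
    """Modified Bessel function I_nu(x), x > 0, from its ascending power series."""
    return sum(
        (x / 2.0) ** (2 * k + nu) / (math.factorial(k) * math.gamma(k + nu + 1.0))
        for k in range(terms)
    )


def macdonald_k(nu, x):
    """Macdonald function K_nu(x) for non-integer order nu."""
    return math.pi / 2.0 * (bessel_i(-nu, x) - bessel_i(nu, x)) / math.sin(nu * math.pi)


def kernel_exact(x):
    """Assumed kernel x^(1/3) * K_(1/3)(x)."""
    return x ** NU * macdonald_k(NU, x)


def kernel_small_argument(x):
    """Two leading terms of the small-argument expansion of the kernel."""
    a1 = math.gamma(1.0 / 3.0) / 2.0 ** (2.0 / 3.0)
    a2 = -3.0 * math.gamma(2.0 / 3.0) / 2.0 ** (4.0 / 3.0)
    return a1 + a2 * x ** (2.0 / 3.0)


if __name__ == "__main__":
    for x in (1e-3, 1e-2, 1e-1, 1.0):
        print(f"x = {x:7.3f}  exact = {kernel_exact(x):.5f}  "
              f"approx = {kernel_small_argument(x):.5f}")
```

The agreement degrades as $x$ approaches 1, which is the regime in which the power-law approximation is no longer expected to hold.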
For arguments x0 d [...] 100 s, the AVAR shows a white noise power law dependency. The spectral density highlights the low-pass effect of the simulated matrix. At high frequencies, the noisy behavior is a result of the cosecant dependency of the covariance model (Davis 2001). The structure function, plotted on logarithmic scales, follows a 2/3 power law and saturates for large time lags (i.e., for $\kappa_0 \langle u \rangle \tau \to 1$, thus in our case for 150 s) at approximately twice the variance of the simulated slant delay (3 cm$^2$). The eigenvectors of the simulated covariance matrices (Fig. 24a) highlight the "Toeplitz-like" form of the matrices (Reddi 1984) and a periodic behavior, particularly for the eigenvectors corresponding to the largest eigenvalues. However, it should be noted that the covariance matrices are not Toeplitz matrices due to the
non-stationarity. The eigenvalues decrease exponentially, only the first 20 being relevant. In the following figure, the eigenvalues and the first 12, the 30th, and the 50th eigenvectors of the previous covariance matrix (Fig. 24) are plotted in order to highlight the periodic properties of the first eigenvectors, the last ones having a noisier behavior. Thus, the Schön and Brunner covariance matrices act as low-pass filters on the simulated random vectors, which is a direct consequence of the turbulence model used (Sect. 3).
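The structure function discussed above can be estimated from an equally sampled time series as the mean squared increment at each lag. The Python sketch below is a minimal illustration; the function name and the test series are assumptions made for this example. For a stationary series, the estimate is expected to level off near twice the variance at large lags.

```python
def structure_function(series, max_lag):
    """Empirical structure function D(k) = mean((x[t+k] - x[t])**2), k = 1..max_lag."""
    n = len(series)
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(series)")
    values = []
    for lag in range(1, max_lag + 1):
        # Squared increments over all pairs of samples separated by `lag` epochs.
        increments = [(series[t + lag] - series[t]) ** 2 for t in range(n - lag)]
        values.append(sum(increments) / len(increments))
    return values


if __name__ == "__main__":
    ramp = list(range(10))  # deterministic test series x[t] = t, so D(k) = k**2
    print(structure_function(ramp, 3))
```

Plotting the returned values against the lag on logarithmic axes gives the kind of diagram in which a 2/3 slope and the saturation level can be read off.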
4.2 Improving the Stochastic Model of GPS or VLBI by Taking into Consideration Physical Correlations
The previous studies of, e.g., Wang et al. (2002) and Koch et al. (2010) demonstrated the importance of an adequate estimation of the covariance matrix in least-squares models in order to obtain a more realistic description of the estimated parameters. For the upcoming high-rate GNSS data with sampling rates of up to 100 Hz, considering the physical correlations will become crucial. Thus, a good model of the covariance matrices, and hence of the observations, would improve the results. As one proposal, Schön and Brunner (2008b) developed the SIGMA-C model. It combines the covariance induced by tropospheric fluctuations described above with a receiver-antenna-dependent white noise component. The variance of this additional component is described by the $C/N_0$-based SIGMA-$\varepsilon$ variance model (Hartinger and Brunner 1999). Choosing two arbitrary stations $A$ and $B$, two arbitrary satellites $i$ and $j$, and two times of observation $t_A^i$, $t_B^j$, respectively, the elements $\sigma_C^2(A, B, i, j, t_A^i, t_B^j)$ of the resulting fully populated VCM $\Sigma_C$ can be represented as follows. The variances read

$$\sigma_C^2(X, l, t_X^l) = \sigma_\varepsilon^2(X)\, q_\varepsilon(X, l, t_X^l) + \sigma_T^2(X)\, q_T(X, l, t_X^l),$$

where $X$ stands for the stations $A$ and $B$, respectively. The covariances read

$$\sigma_C^2(A, B, i, j, t_A^i, t_B^j) = \sigma_T(A)\, \sigma_T(B)\, q_T(A, B, i, j, t_A^i, t_B^j),$$

where the subscript $T$ identifies the parts coming from the turbulence model and the subscript $\varepsilon$ those coming from the SIGMA-$\varepsilon$ model. Only the variances depend on both model parts, where $X$ stands for the considered station, $l$ for the satellite, and $t_X^l$ for the respective observation epoch. The cofactor of the SIGMA-$\varepsilon$ part is given by

$$q_\varepsilon(X, l, t_X^l) = 10^{-\frac{C/N_0(X, l, t_X^l)}{10}},$$

where $C/N_0(X, l, t_X^l)$ denotes the carrier-to-noise density ratio in dB-Hz of a signal from satellite $l$ measured by the receiver at station $X$ at time $t_X^l$ (cf. Hartinger and Brunner 1999). The variance component $\sigma_\varepsilon^2(X)$ depends on the equipment used at station $X$, especially the antenna-receiver combination. If this combination differs at the considered stations, then a different value for the variance component has to be used for each station. The second part of the variance is caused by refractivity fluctuations in the troposphere, as presented in Sect. 3.

Fig. 24 Eigenvalue decomposition of a covariance matrix
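The way the two parts of the SIGMA-C model combine can be sketched in a few lines of Python. The sketch assumes the SIGMA-$\varepsilon$ cofactor $10^{-(C/N_0)/10}$ and treats the turbulence cofactors $q_T$ as given inputs, since their computation requires the integral of the turbulence model. All function names and numerical values are hypothetical and serve only as an illustration.

```python
def q_epsilon(cn0_dbhz):
    """SIGMA-epsilon cofactor 10**(-(C/N0)/10) for a C/N0 value given in dB-Hz."""
    return 10.0 ** (-cn0_dbhz / 10.0)


def sigma_c_variance(var_eps, cn0_dbhz, var_turb, q_turb):
    """Diagonal VCM element: receiver noise part plus turbulence part."""
    return var_eps * q_epsilon(cn0_dbhz) + var_turb * q_turb


def sigma_c_covariance(sigma_turb_a, sigma_turb_b, q_turb_ab):
    """Off-diagonal VCM element: only the turbulence part contributes."""
    return sigma_turb_a * sigma_turb_b * q_turb_ab


if __name__ == "__main__":
    # Hypothetical values: equipment-dependent variance component, C/N0 of
    # 45 dB-Hz, unit turbulence variance components, and made-up cofactors.
    print(sigma_c_variance(var_eps=2.0, cn0_dbhz=45.0, var_turb=1.0, q_turb=0.5))
    print(sigma_c_covariance(sigma_turb_a=1.0, sigma_turb_b=1.0, q_turb_ab=0.3))
```

A weaker signal (lower $C/N_0$) increases the white noise contribution to the variance, while the covariances are unaffected by it.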
A detailed analysis of the data from a specially designed GPS test network was carried out in Schön and Brunner (2006, 2008b). To evaluate the model, the empirical auto- and cross-correlation functions were compared with the correlation patterns obtained by the SIGMA-C model. It was shown that, using the proposed reference values for the model parameters and the geostrophic wind direction and velocity, the SIGMA-C model adequately predicts the correlation patterns of GPS double differences (cf. Schön and Brunner 2008b). More analyses of real data are necessary to quantify the impact and improvements. First studies were made by Luo (2013) with a simplified covariance model based on the signal-to-noise ratio.
4.3 Wavelet and Turbulence: Covariance Analysis
Wavelets are already widely used in turbulence research since they can adequately deal with non-stationarity. Examples are the works of Farge (1992) on vortex extraction in turbulent flows, Hagelberg and Gamage (1994) on intermittent turbulence, Cornish and Bretherton (2006) on the Maximal Overlap Wavelet Statistical Analysis, Mahrt (1991) and Nichols-Pagel et al. (2008) on the comparison between structure functions and wavelet variance, and Whitcher et al. (2000) on the multiscale analysis of covariance using the discrete wavelet transform (the relationship between the Southern Oscillation Index time series and station pressure series was analyzed). Domingues et al. (2004) present a good overview of the possible applications of wavelet analysis in atmospheric science with relevant references. The wavelet-based estimators (Abry et al. 1995) are to some extent a better alternative to structure functions for the characterization of power law processes, such as those encountered in turbulence theory (Nichols-Pagel et al. 2008). In this case, the statistics of the wavelet-transformed variable, the "wavelet variance spectra," show maxima that are a measure of the scale of the events. Using the maximal overlap discrete wavelet transform (MODWT) wavelet coefficients, an unbiased estimator of the wavelet variance can be computed:

$$\hat{\nu}_X^2(\tau_j) = \frac{1}{M_j} \sum_{t=L_j-1}^{N-1} \tilde{W}_{j,t}^2,$$

where $\hat{\nu}_X^2$ is the estimator of the wavelet variance, $\tau_j = 2^{j-1}$ a unitless standardized scale, and $M_j = N - L_j + 1$. The quantities

$$\tilde{W}_{j,t} = \sum_{l=0}^{L_j-1} \tilde{h}_{j,l}\, X_{t-l \bmod N}, \quad t = 0, 1, \ldots, N-1,$$

are the MODWT wavelet coefficients, obtained by circular convolution of the process $\{X_t\}$ with a $j$th-level wavelet filter $\{\tilde{h}_{j,l}\}$, $l = 0, 1, \ldots, L_j - 1$.
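The unbiased wavelet variance estimator can be sketched in plain Python. As a simplifying assumption, the Haar wavelet is used here instead of the LA(16) filter employed in the cited studies, because its level-$j$ MODWT filter has a simple closed form of width $L_j = 2^j$. Only the coefficients with $t \geq L_j - 1$ enter the sum, so no circular wrap-around is needed. The function names are illustrative.

```python
def haar_modwt_filter(level):
    """Level-j Haar MODWT wavelet filter of width L_j = 2**level."""
    half = 2 ** (level - 1)
    weight = 1.0 / 2 ** level
    return [weight] * half + [-weight] * half


def wavelet_variance(series, level):
    """Unbiased MODWT wavelet variance estimate at scale tau_j = 2**(level - 1)."""
    n = len(series)
    filt = haar_modwt_filter(level)
    width = len(filt)              # L_j
    m_j = n - width + 1            # M_j = N - L_j + 1
    if m_j <= 0:
        raise ValueError("series too short for the requested level")
    total = 0.0
    for t in range(width - 1, n):  # coefficients unaffected by the boundary
        coeff = sum(filt[l] * series[t - l] for l in range(width))
        total += coeff ** 2
    return total / m_j


if __name__ == "__main__":
    alternating = [(-1.0) ** t for t in range(64)]
    for level in (1, 2, 3):
        print(level, wavelet_variance(alternating, level))
```

For the alternating test series, all variability sits at the smallest scale, so only the level-1 estimate is nonzero.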
Percival and Walden (2000) show that this quantity is related to the spectral density function $W_X(\cdot)$ of a process $X$:

$$\nu_X^2(\tau_j) \approx 2 \int_{1/2^{j+1}}^{1/2^{j}} W_X(f)\, df,$$

where $f$ denotes the frequency. Assuming a Kolmogorov behavior for the structure function, the power law can be estimated using least squares, and the precision of the estimate can be assessed with appropriate confidence intervals. Using the least asymmetric LA(16) wavelet based on Daubechies wavelets, Nichols-Pagel et al. (2008) analyzed turbulent aerothermal time series to demonstrate the superiority of wavelet variance estimates and spectral density for estimating the power law dependency. By analyzing the residuals of the power law estimates, they also showed that quasiperiodic noise, which is confined to small scales for the wavelet variance and to high frequencies for the spectral density, is spread over the whole range of separation distances for the structure function. Thus, the wavelet approach is a better alternative to the usual structure functions and should even be superior when dealing with non-Gaussian distributions.

Wavelet scalograms, i.e., a visual display of a wavelet transform with time on the abscissa, scale on the ordinate, and the coefficient as brightness, were used by Luo (2013) to study the residuals of GPS least-squares models. The Morlet wavelet was used as the mother wavelet in a continuous wavelet transform (CWT) to analyze the signal composition in the time and frequency domain. Effects such as multipath or sidereal stacking can be identified and filtered to obtain white-noise-like residuals after adequate ARMA modeling. Satirapod et al. (2001) decomposed the GPS residuals into a low-frequency bias and high-frequency noise terms in order to achieve a better ambiguity resolution.
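The least-squares estimation of a power law mentioned above can be illustrated by a straight-line fit in log-log space. The following Python sketch fits $\log y = \log c + \alpha \log x$ by ordinary least squares; the function name and the synthetic data are assumptions made for this example, and the confidence intervals discussed in the text are not computed here.

```python
import math


def fit_power_law(x_values, y_values):
    """Ordinary least-squares fit of y = c * x**alpha in log-log space.

    Returns the pair (alpha, c). All inputs must be strictly positive.
    """
    log_x = [math.log(x) for x in x_values]
    log_y = [math.log(y) for y in y_values]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    s_xx = sum((a - mean_x) ** 2 for a in log_x)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    alpha = s_xy / s_xx
    c = math.exp(mean_y - alpha * mean_x)
    return alpha, c


if __name__ == "__main__":
    lags = [1.0, 2.0, 4.0, 8.0, 16.0]
    # Synthetic, noise-free values following a 2/3 power law.
    values = [3.0 * lag ** (2.0 / 3.0) for lag in lags]
    print(fit_power_law(lags, values))
```

The same fit can be applied to structure function values, wavelet variances, or spectral densities; only the expected exponent differs between these representations.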
5 Conclusions
In this chapter, we have introduced a simplified description of turbulent processes based on the Navier-Stokes equations and on Kolmogorov's energy cascade model. The statistical concepts of turbulence, namely structure functions, spectral density, and covariances, were introduced. The Kolmogorov power law dependencies are often sought in geodetic time series such as tropospheric slant delays or residuals in order to characterize turbulence. Based on this theory, two covariance models were presented, and the parameters that have to be taken into account were described in depth. The concept of the separation distance, which is important for understanding the non-stationarity of the covariance models, was presented. Finally, possible applications of such models to the simulation of tropospheric slant delays have shown the potential of these covariance functions. An exemplary case study linked
all the statistical functions presented in this chapter. The potential of wavelets in comparison with spectral density or structure functions was briefly introduced. In the future, turbulent fluctuations will become even more important due to higher sampling rates, e.g., of the VLBI 2010 telescopes as well as of GNSS with up to 100 Hz. Also, the availability of high-rate products of the GPS services at intervals down to 5 s will make a better modeling of turbulent effects important. Thus, statistical studies of GNSS or VLBI time series by means of AVAR, spectral density, structure functions, or wavelets, as well as the improvement of the stochastic models by the use of better VCMs, will be of great interest in the coming years.

Acknowledgements The authors thank FK Brunner for his introduction to turbulence theory. The stay at his institute at TU Graz, Austria, was funded by a Feodor Lynen Fellowship of the Alexander von Humboldt Foundation, which is gratefully acknowledged. The first author thanks Dr. Markus Vennebusch for fruitful discussions and new developments at IfE Hannover. The German Research Foundation (DFG) is thanked for the financial support to study the subject in the projects SCHO1314/1-1, 1-2 as well as SCHO1314/3-1.
References
Abramowitz M, Segun IA (1972) Handbook of mathematical functions. Dover, New York
Abry P, Goncalves O, Flandrin P (1995) Wavelets, spectrum estimation and 1/f processes. In: Antoniadis A, Oppenheim G (eds) Wavelets and statistics. Lecture notes in statistics, vol 103. Springer, New York, pp 15–30
Allan DW (1987) Time and frequency (time domain) characterization, estimation, and prediction of precision clocks and oscillators. IEEE Trans Ultrason Ferroelectr Freq Control UFFC-34(6):647–654
Armstrong JW, Sramek RA (1982) Observations of tropospheric phase scintillations at 5 GHz on vertical paths. Radio Sci 17(6):1579–1586
Banta RM, Newsom RK, Lundquist JK (2002) Nocturnal low-level jet characteristics over Kansas during CASES-99. Bound Layer Meteorol 105:221–252
Barnes J, Chi AR, Cutler LS, Healey DJ, Leeson DB, McGunigal TE, Mullen JA, Smith WL, Sydnor RL, Vessot RF, Winkler GM (1971) Characterization of frequency stability. IEEE Trans Instrum Meas 20:105–120
Beutler G, Bauersima I, Gurtner W, Rothacher M (1987) Correlations between simultaneous GPS double difference carrier phase observations in the multistation mode: implementation considerations and first experiences. Manuscr Geod 12(1):40–44
Bevis M, Businger S, Chiswell S, Herring TA, Anthes RA, Rocken C, Ware RH (1994) GPS meteorology: mapping zenith wet delays onto precipitable water. J Appl Meteorol Climatol 33(3):379–386
Bischoff W, Heck B, Howind J, Teusch A (2005) A procedure for testing the assumption of homoscedasticity in least-squares residuals: a case study of GPS carrier phase observations. J Geod 78(7–8):397–404
Boussinesq J (1877) Essai sur la théorie des eaux courantes. Mémoires présentés par divers savants à l'Académie des Sciences XXIII, 1–680
Brunner FK, Hartinger H, Troyer L (1999) GPS signal diffraction modelling: the stochastic SIGMA-Δ model. J Geod 73(5):259–267
Coulman CE (1990) Atmospheric structure, turbulence and radioastronomical "seeing". In: Proceedings URSI/IAU symposium on radio astronomical seeing, Beijing/Oxford. International Academic Publishers/Pergamon Press, pp 11–20
Coulman CE, Vernin J (1991) Significance of anisotropy and the outer scale of turbulence for optical and radio seeing. Appl Opt 30(1):118–126
Cornish CR, Bretherton CS (2006) Maximal overlap wavelet statistical analysis with application to atmospheric turbulence. Bound Layer Meteorol 119(2):339–377
Cressie N (1993) Statistics for spatial data. Wiley, New York/Chichester/Toronto/Brisbane/Singapore
Danielson EW, Levin J, Abrams E (2003) Meteorology. McGraw Hill, Boston
Davis JL (2001) Atmospheric water-vapor signals in GPS data: synergies, correlations, signals and errors. Phys Chem Earth 26(6–8):513–522
De Moor G (2006) Couche limite atmosphérique et turbulence, les bases de la micrométéorologie dynamique. Météo-France, Cours et manuels no. 16, Toulouse
Domingues MO, Mendes O, Mendes da Costa A (2004) On wavelet techniques in atmospheric sciences. Adv Space Res 35(5):831–842
Dravskikh AF, Finkelstein AM (1979) Tropospheric limitations in phase and frequency coordinate measurements in astronomy. Astrophys Space Sci 60(2):251–265
Emardson TR, Jarlemark POJ (1999) Atmospheric modeling in GPS analysis and its effect on the estimated geodetic parameters. J Geod 73(6):322–331
El-Rabbany A (1994) The effect of physical correlations on the ambiguity resolution and accuracy estimation in GPS differential positioning. PhD thesis, Department of Geodesy and Geomatics Engineering, University of New Brunswick
Farge M (1992) Wavelet transforms and their applications to turbulence. Ann Rev Fluid Mech 24:395–457
Fried DL (1967) Propagation of a spherical wave in a turbulent medium. J Opt Soc Am 57(2):175–180
Gage KS (1979) Evidence for a k to the -5/3 law inertial range in mesoscale two-dimensional turbulence. J Atmos Sci 36:1950–1954
Hagelberg CR, Gamage NKK (1994) Structure-preserving wavelet decompositions of intermittent turbulence. Bound Layer Meteorol 70:217–246
Hartinger H, Brunner FK (1999) Variances of GPS phase observations: the SIGMA-ε model. GPS Solut 2(4):35–43
Hogg DC, Guiraud FO, Sweezy WB (1981) The short-term temporal spectrum of precipitable water vapor. Science 213(4512):1112–1113
Howind J (2005) Analyse des stochastischen Modells von GPS-Trägerphasenbeobachtungen. Deutsche Geodätische Kommission, Heft Nr. 584, Munich
Howind J, Kutterer H, Heck B (1999) Impact of temporal correlations on GPS-derived relative point positions. J Geod 73(5):246–258
Ishimaru A (1984) Wave propagation and scattering in random media, vol II. Academic, New York
Jin SG, Luo O, Ren C (2010) Effects of physical correlations on long-distance GPS positioning and zenith tropospheric delay estimates. Adv Space Res 46(2):190–195
Kleijer F (2004) Tropospheric modeling and filtering for precise GPS leveling. PhD thesis, Netherlands Geodetic Commission, Publications on Geodesy 56
Koch KR, Kuhlmann H, Schuh WD (2010) Approximating covariance matrices estimated in multivariate models by estimated auto- and cross-covariances. J Geod 84(6):383–397
Kolmogorov NA (1941) Dissipation of energy in the locally isotropic turbulence. Proc USSR Acad Sci 32:16–18 (Russian), translated into English by Kolmogorov, 8 July 1991: "The local structure of turbulence in incompressible viscous fluid for very large Reynolds numbers". Proc R Soc Lond Ser A Math Phys Sci 434(1980):15–17
Kolmogorov NA (1962) A refinement of previous hypotheses concerning the local structure of turbulence in a viscous incompressible fluid at high Reynolds number. J Fluid Mech 13:82–85
Kraichnan RH (1967) Inertial ranges in two-dimensional turbulence. Phys Fluids 10:1417–1423
Laing D (1991) The earth system: an introduction to earth science. Wm. C. Brown Publishers, Dubuque
Lay OP (1997) The temporal power spectrum of atmospheric fluctuations due to water vapor. Astron Astrophys Suppl Ser 122:535–545
Leandro R, Santos MC (2006) An empirical stochastic model for GPS. In: Rizos C (ed) International Association of Geodesy Symposia, IAG, IAPSO and IABO joint assembly "Dynamic Planet", Cairns, Australia, 22–26 August 2005. Springer, pp 179–185
Lesieur M (2008) Turbulence in fluids, 4th edn. Springer, Dordrecht
Lin CC (1953) On Taylor's hypothesis and the acceleration terms in the Navier-Stokes equation. Q Appl Math 10:295–306
Lindsey WC, Chi CM (1976) Theory of oscillator instability based upon structure functions. Proc IEEE 64(12):1652–1666
Luo X (2013) GPS stochastic modelling – signal quality measures and ARMA processes. Springer theses: recognizing outstanding Ph.D. research. Springer, Berlin/Heidelberg
Mahrt L (1986) On the shallow motion approximations. J Atmos Sci 43:1036–1044
Mahrt L (1991) Eddy asymmetry in the sheared heated boundary layer. J Atmos Sci 48(3):472–492
Mathieu J, Scott J (2000) An introduction to turbulent flow. Cambridge University Press, Cambridge
Monin AS, Yaglom AM (1975) Statistical fluid mechanics, vol 2. MIT, Cambridge
Muzy JF, Bacry E, Arneodo A (1993) Multifractal formalism for fractal signals: the structure-function approach versus the wavelet-transform modulus-maxima method. Phys Rev E 47(2):875–884
Naudet CJ (1996) Estimation of tropospheric fluctuations using GPS data. TDA progress report, pp 42–126
Nastrom GD, Gage KS (1985) A climatology of atmospheric wavenumber spectra of wind and temperature observed by commercial aircraft. J Atmos Sci 42(9):950–960
Nichols-Pagel GA, Percival DB, Reinhall PG, Riley JJ (2008) Should structure functions be used to estimate power laws in turbulence? A comparative study. Physica D Nonlinear Phenom 237(5):665–677
Nilsson T, Haas R (2010) Impact of atmospheric turbulence on geodetic very long baseline interferometry. J Geophys Res 115:B03407
Nilsson T, Haas R, Elgered G (2007) Simulations of atmospheric path delays using turbulence models. In: Böhm J, Pany A, Schuh H (eds) Proceedings of 18th European VLBI for Geodesy and Astrometry (EVGA) working meeting, Vienna University of Technology, Vienna, pp 175–180
Oxford Dictionary of English (2010) 2nd edn, revised. Oxford University Press (11 Aug 2010)
Percival DB, Walden AT (2000) Wavelet methods for time series analysis. Cambridge University Press, Cambridge
Reddi SS (1984) Eigenvector properties of Toeplitz matrices and their application to spectral analysis of time series. Signal Process 7:45–56
Riley WJ (2008) Handbook of frequency stability analysis. NIST special publication 1065
Ripley BD (1981) Spatial statistics. Wiley, p 252
Rutman J (1978) Characterization of phase and frequency instabilities in precision frequency sources: fifteen years of progress. Proc IEEE 66(9):1048–1075
Romero-Wolf A, Jacobs CS, Ratcliff JT (2012) Effects of tropospheric spatio-temporal correlated noise on the analysis of space geodetic data. IVS general meeting proceedings, Madrid, 5–8 Mar 2012
Satirapod C, Ogaja C, Wang J, Rizos C (2001) GPS analysis with the aid of wavelets. In: 5th international symposium on satellite navigation technology and applications, Canberra, 24–27 July, paper 39
Schön S, Brunner FK (2006) Modelling physical correlation of GPS phase observations: first results. In: Kahmen H, Chrzanowski A (eds) Proceedings of the 3rd IAG symposium on geodesy for geotechnical and structural engineering/12th FIG symposium on deformation measurements, Baden, 22–24 May 2006, pp PS-18.1–8
Schön S, Brunner FK (2007) Treatment of refractivity fluctuations by fully populated variance-covariance matrices. In: Proceedings of the 1st colloquium scientific and fundamental aspects of the Galileo programme, Toulouse, Oct
Schön S, Brunner FK (2008a) Atmospheric turbulence theory applied to GPS carrier-phase data. J Geod 82(1):47–57
Schön S, Brunner FK (2008b) A proposal for modeling physical correlations of GPS phase observations. J Geod 82(10):601–612
Seeber G (2003) Satellite geodesy. de Gruyter, Berlin
Stotskii A (1973) Concerning the fluctuation of characteristics of the Earth's troposphere. Radiophys Quantum Electron 16(5):620–622
Stotskii A, Stotskaya IM (2001) Structure analysis of wet path delays in IRIS-S experiments. In: Behrend D, Rius A (eds) Proceedings of the 15th working meeting on European VLBI, Barcelona, 7–8 Sept 2001, p 154
Stotskii A, Elgered KG, Stotskaya M (2006) Structure analysis of path delay variations in the neutral atmosphere. Astron Astrophys Trans J Eurasian Astron Soc 17(1):59–68
Stull RB (1988) An introduction to boundary layer meteorology. Springer, Dordrecht
Tatarskii VI (1971a) Wave propagation in a turbulent medium. McGraw-Hill, New York
Tatarskii VI (1971b) The effects of the turbulent atmosphere on wave propagation. National Technical Information Service, Springfield, VA 22161
Taylor GI (1938) The spectrum of turbulence. Proc R Soc Lond Ser A CLXIV:476–490
Teunissen PJG, Kleusberg A (1998) GPS for geodesy, 2nd edn. Springer, Berlin/Heidelberg
Thomson MC, Marler F, Allen K (1980) Measurement of the microwave structure constant profile. IEEE Trans Antennas Propag 28(2):278–280
Thompson AR, Moran JM, Swenson GW (2004) Interferometry and synthesis in radio astronomy. Wiley, Hoboken
Tiberius C, Jonkman N, Kenselaar F (1999) The stochastics of GPS observables. GPS World 10(2):49–54
Treuhaft RN, Lanyi GE (1987) The effect of the dynamic wet troposphere on radio interferometric measurements. Radio Sci 22(2):251–265
Treuhaft RN, Lowe ST (1995) Vertical scales of turbulence at the Mt Wilson observatory. In: JPL TRS 1992+BEACON eSpace at Jet Propulsion Laboratory, California Institute of Technology
Van der Hoven I (1957) Power spectrum of horizontal wind speed in the frequency range from 0.0007 to 900 cycles per hour. J Meteorol 14:160–164
Vennebusch M, Schön S (2012) Generation of slant tropospheric delay time series based on turbulence theory. In: Kenyon S, Pacino M, Marti U (eds) Proceedings of geodesy for planet earth – IAG 2009, Buenos Aires. International Association of Geodesy Symposia, vol 136. Springer, New York/Berlin/Heidelberg, pp 801–808
Vennebusch M, Schön S, Weinbach U (2011) Temporal and spatial stochastic behavior of high-frequency slant tropospheric delays from simulations and real GPS data. Adv Space Res 47(10):1681–1690
Vincent A, Meneguzzi M (1991) The spatial structure and statistical properties of homogeneous turbulence. J Fluid Mech 225:1–20
Voitsekhovich VV (1995) Outer scale of turbulence: comparison of different models. J Opt Soc Am A 12(6):1346–1353
Wang J, Satirapod C, Rizos C (2002) Stochastic assessment of GPS carrier phase measurements for precise static relative positioning. J Geod 76(2):95–104
Wheelon AD (2001) Electromagnetic scintillation part I geometrical optics. Cambridge University Press, Cambridge
Whitcher B, Guttorp P, Percival D (2000) Wavelet analysis of covariance with application to atmospheric time series. J Geophys Res Atmos 105:14941–14962
Wieser A, Brunner FK (2000) An extended weight model for GPS phase observations. Earth Planet Space 52:777–782
Williams S (2003) The effect of coloured noise on the uncertainties of rates estimated from geodetic time series. J Geod 76(9–10):483–494
Williams S, Bock Y, Fang P (1998) Integrated satellite interferometry: tropospheric noise, GPS estimates and implication for interferometric synthetic aperture radar products. J Geophys Res 103(B11):27051–27067
Wright MCH (1996) Atmospheric phase noise and aperture synthesis imaging at millimeter wavelengths. Publ Astron Soc Pac 108(724):520–534
Yaglom AM (1987) Correlation theory of stationary and related random functions I, II, 1st edn. Springer, New York/Berlin/Heidelberg
Forest Fire Spreading
Sarah Eberle, Willi Freeden, and Ulrich Matthes
Contents
1 Problem Statement and Status Quo
2 Physical Model, Development of the Equations, and Preliminary Tools
2.1 Physical Model
2.2 Development of the Equations
2.3 Preliminary Tools
3 Collocational Solution Theory
3.1 Space Discretization
3.2 Time Discretization
3.3 Stabilization
4 Data Analysis
4.1 Weather and Fuel Data
4.2 Parameter Influence on the Model
4.3 Parameter Studies
5 Numerical Example
6 Perspective and Future Application Fields
6.1 Climate Change and Fire Risk
6.2 Climate Change Projections for Rhineland-Palatinate
6.3 Future Application Fields
6.4 Forest Management Options
References
S. Eberle • W. Freeden
Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
U. Matthes
Rhineland-Palatinate Center of Excellence for Climate Change Impacts, Trippstadt, Germany
© Springer-Verlag Berlin Heidelberg 2015
W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_70
S. Eberle et al.
Abstract
Due to climate change, more and more woodlands will be endangered by forest fires in the future. It is therefore very important to understand how forest fires spread. In this contribution, we are interested in the interacting factors that influence forest fires. Our particular interest is the use of physical models, which consider heat and mass transfer mechanisms. As a result, we are led to a convection-diffusion-reaction problem that is nonstationary and nonlinear. Furthermore, we examine the different parameters involved in these equations, especially the meteorological and fuel data. Finally, we discuss some approaches to solving the fire spread problem numerically and present some simulations of forest fire spreading.
1 Problem Statement and Status Quo
In applied mathematics, forest fire models are usually based on a number of dynamical systems characterizing the three fundamental ingredients of fire, namely, oxygen, heat, and fuel. The presence of appropriate fuel, enough oxygen, and adequate heat results in the ignition and burning of a fire (cf. Pyne et al. 1996). In the absence of one of these ingredients, the fire stops burning. An essential aspect in modeling wildfires is their particular behavior. Following the standard approaches, fire behavior may be characterized by the magnitude, direction, and intensity of the fire spread, which depends strongly on the interaction of environmental conditions such as vegetation (fuel), weather, and topography. Understanding wildfires is important because of the sometimes disastrous consequences of large fires and the tremendous costs caused by ineffectual fire fighting (for a more detailed study, the reader is referred to Pyne 2004). The economic, social, and environmental impacts of forest fires demand significant studies in the modeling and simulation of fire prevention and fire fighting as well as the creation of risk scenarios. Moreover, global problems resulting from changes in climate and environment (see Mallet et al. 2009, for more details) require that, for the practically desired scenario of wind-aided fire spread, not only a computationally efficient off-line model (for precrisis planning and post-review) be available but also real-time guidance for evacuation and fire fighting during a crisis. Unfortunately, at the moment, scientific investigation has to focus mainly on a simplified semiempirical treatment of fire fronts, in which the front is idealized as an interface between expanses of burned and unburned vegetation, and it is limited by the difficulty of anticipating the vagaries of ignition events.
Reflecting these assumptions, the following models are currently used in mathematical application:

Fire risk models: These models qualify and quantify the fire danger at a given location within a certain time. Most of these models contain local weather patterns
Forest Fire Spreading
(like temperature, humidity, wind), vegetation features (fuel type, moisture level), and topographic information (profiles, elevation, slope) (see, e.g., Finney et al. 2011). The goal is an integrative single index of the fire risk. Deterministic and statistical models use fire danger indices based on vegetation patterns, fuel moisture conditioning, meteorological variables, the number of people involved, and the past history of fires.

Fire behavior models: Fire behavior models aim at the description of the propagation and the spread of fires under various relevant conditions. Fire spread models can be distinguished roughly as empirical, semiempirical, and physical. They usually model the fire growth in two- and three-dimensional space over time. Rothermel (1972) attempts a description of fuel including depth, loading, percentage of dead fuel, moisture of extinction, heat content, surface area to volume ratio, mineral content, silica content, and particle density. His approach aims at predicting the details of fire physics adequately. It should be noted that Anderson's two-dimensional concept (see Anderson 1983) assumes that the fire spread area is an ellipse in which the source lies at one focus and the major axis is parallel to the maximum spread direction, which is influenced by local wind and slope.

Fire effect models: These models are formulated to predict the effects of fire on various components of particular ecosystems (see, e.g., Flannigan et al. 2000). Temperature profiles in forests are studied to explain the impacts on soil processes. Expert systems attempt to imitate an actual fire event in order to validate the expected economic and financial cost, among other aspects.

In our work, we concentrate on the following aspects: we start with mathematical modeling of forest fire spreading based on a physical model, where we consider the physical processes (heat and mass transfer mechanisms) as well as the chemical processes (combustion).
In doing so, we are canonically led to a convection-diffusion-reaction problem. Afterwards, we come to the mathematical solution theory, where we focus on the numerical approximation of the aforementioned problem. Before we present some results of the numerical simulation of forest fire spreading, we investigate the required data and their influence on the model behavior. Finally, we conclude our work with the formulation of perspectives and future fields of application involving climate change and fire risk, in particular in its specification to the German state of Rhineland-Palatinate.
2
Physical Model, Development of the Equations, and Preliminary Tools
Before we deal with the chemical and physical background of forest fires, we give a short overview of the different model types which are used to describe the behavior of forest fire propagation. It is conventional to distinguish three different model types, thereby following a classification which can be found, for example, in Pastor (2003):
S. Eberle et al.
Empirical model: Empirical models do not describe physical processes. Instead, they use statistical methods. In consequence, these models can be applied only under the same prerequisites (e.g., for the same fuel types and weather conditions). One of the most famous models is the McArthur model (see McArther 1967).

Semiempirical model: In semiempirical models some physical basics are in use. In doing so, one has to be aware of the fact that the capability of simulating realistic conditions is limited (for more details, the reader is, e.g., referred to Rothermel 1972).

Physical model: Physical models are based on physical laws. These models have some advantages over the aforementioned models in the following sense:
• The application is only limited by the assumptions of the model itself and does not depend on a certain amount of experimental data.
• Physical models can also be applied, at least to some extent, if one is confronted with a lack of data for the actual forest fire behavior.

A survey on the different models (including the mathematical/physical features, which have been used so far) can be found in various publications, for example, in Pastor (2003) and Schöning (2009). In the following, different models from 1946 to 2000 are presented: Models which use a theoretical approach can be found in Emmons (1946), Fons (1946), Thomas (1967), Van Wagner (1967), Anderson (1969), Hottel et al. (1971), Pagni and Peterson (1973), Steward (1974), Telisin (1974), Konev and Sukhinin (1977), Cekirge (1978), Albini and Baughman (1979), Fujii et al. (1980), Grishin et al. (1983), Huang and Xie (1984), Albini (1986), De Mestre et al. (1989), Weber (1989), Weber (1991), Catchpole and Catchpole (1994), Croba et al. (1994), Dupuy (1997), Grishin (1997), Linn (1997), Larini et al. (1998), Santoni and Balbi (1998), and Margerit and Sero-Guillaime (1999). In recent years, some extended models have been developed, for example, by Ferragut et al. (2004) and Mandel et al. (2007). More empirical approaches are given in McArther (1967), Frandsen (1971), Rothermel (1972), Griffin and Allan (1984), Sneeuwjagt and Peet (1985), Burrows et al. (1991), Forestry Canada Fire Danger Group (1992), Marsden-Smedley and Catchpole (1995), Catchpole et al. (1998), Cheney et al. (1998), Fernandes (1998), McCaw (1998), Vega et al. (1998), Viegas et al. (1998), and Burrows (1999).

Our purpose is to base our studies on a physical model that is quite similar to the approach proposed by Asensio and Ferragut (2002). Within this context, we have to be aware of the fact that all models are simplifications that describe reality as closely as possible. In order to get a first impression of the factors which essentially influence the forest fire behavior, we illustrate the main ingredients of the conceptional model in Fig. 1, thereby following Weibel et al. (2010). The constituting components (listed in Fig. 1) will be analyzed later.
Fig. 1 Conceptional model of the interacting components influencing forest fire behavior
2.1
Physical Model
Based on the aforementioned considerations of the influencing factors of forest fire, we continue our study with the introduction of a physical model for forest fire spreading.
Combustion Process
Our point of departure is a short discussion of ignition and combustion of fuel (for comparison see, e.g., Joos 2006; Quintiere 2006). In order to characterize the combustion process, we have to observe the chemical components of wood and other forest vegetation. The main ingredients of wood and other plants are lignin and cellulose, which react with oxygen under formation of carbon dioxide and carbon monoxide. At a certain temperature, most of the reactions are exothermic, which means that heat is released into the surroundings. This is the reason why we start from the following simplified reaction equation (see, e.g., Quintiere 2006):

\[ \text{fuel} + \text{oxidant} \rightarrow \text{products} + \text{energy} \]
Heat and Mass Transfer Mechanisms
Propagation of fire happens through heat transfer between particles. To describe this phenomenon, we consider heat transfer as heat conduction and radiation and mass transfer as diffusion and convection, respectively:

Heat conduction: Heat conduction is understood as heat transfer within a body or between different bodies by direct contact. This aspect can be neglected for our consideration, since it only plays a small role for propagation. Nevertheless, it is important for the ignition.
Fig. 2 Graphical illustration of the propagation of a fire front
Absorption: The heat transfer is caused by electromagnetic energy, which is transformed into internal energy and absorbs thermal energy.

Diffusion: Diffusion is mass transport through the mixture of gases due to temperature differences. It can be formulated in accordance with Fick's law

\[ j_d = -D\,\nabla T, \tag{1} \]

where $j_d$ is the diffusive flow, $D$ the diffusion coefficient, and $T$ the temperature.

Convection: Mass transport happens due to streaming gases. It is given by the law

\[ j_c = v\,T, \tag{2} \]

where $j_c$ denotes the convective flow and $v$ the velocity of gas.

For the convenience of the reader, Fig. 2 gives some insight into the transfer mechanisms of a wildfire following Chandler et al. (1983). In accordance with the scheme of Fig. 2, the description of the outbreak of fire is proposed by Schöning (2009), thereby pointing out the following time phases of fire generation:
1. Ignition starts due to contact of the flame with the fuel depending on the kind of fuel and water content.
2. The first acceleration begins, i.e., the propagation and the growth of the flames.
3. A formation of a fire ring can be observed, where the fuel in the center is burnt out such that the flame temperature decreases.
4. Finally, if there is wind and, as the case may be, slope, the fire grows and wildfire begins.

Concerning the physical models used until now, we have to state that they are basically formulated under the following assumptions (similarly to Mandel et al. 2007):
Conservation of energy: No energy is lost. It is just transformed into other forms of energy. This holds true especially for the energy of the fuel, which is converted into thermal energy due to the reaction of the oxygen with the fuel.

Balance equation of fuel supply: During the combustion, the amount of fuel decreases. Thus, it may be assumed that the mass fraction of the fuel lies between 0 and 1, where 0 denotes the case that no fuel is left and 1 designates the case where the total amount of fuel is available (see, e.g., Torero and Simeoni 2010).

Fire reaction rate: For the description of the speed of the combustion, the Arrhenius law comes into play

\[ r = A \exp\left(-\frac{E_A}{RT}\right), \tag{3} \]

where $A$ is the reaction factor, $E_A$ the activation energy, and $R$ the universal gas constant. As noted before, the prediction of the fire reaction rate can be approached by reducing it to the determination of the heat transfer. Convection and diffusion transfer of the heat of the burning area to the unburnt area can be formalized mathematically. Hence, the energy balance of a fuel element before the burning connects the enthalpy of fuel with the energy input of the flames and the energy loss to the environment (cf. Weber 1991).
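To illustrate how strongly the Arrhenius law (3) reacts to temperature, the following minimal sketch evaluates the rate at two temperatures. The function name `arrhenius_rate` and the values chosen for the reaction factor and the activation energy are assumptions made only for this illustration; they are not taken from the chapter.

```python
import math

R_GAS = 8.314462618  # universal gas constant in J/(mol K)


def arrhenius_rate(temperature, reaction_factor, activation_energy):
    """Reaction rate r = A * exp(-E_A / (R * T)) from Eq. (3), T in kelvin."""
    if temperature <= 0.0:
        raise ValueError("temperature must be positive (kelvin)")
    return reaction_factor * math.exp(-activation_energy / (R_GAS * temperature))


# Hypothetical parameter values, chosen only to show the temperature sensitivity.
A = 1.0e9      # reaction factor in 1/s (assumed)
E_A = 8.3e4    # activation energy in J/mol (assumed)

rate_cold = arrhenius_rate(300.0, A, E_A)   # near ambient temperature
rate_hot = arrhenius_rate(1200.0, A, E_A)   # inside a flame zone
print(rate_cold, rate_hot)
```

With these assumed values the rate at 1200 K exceeds the rate at 300 K by many orders of magnitude, which is the reason why the reaction term is effectively switched on only above an ignition temperature.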
2.2
Development of the Equations
Now we come to the derivation of the convection-diffusion-reaction problem, which is based on the previously stated laws, i.e., the conservation of energy, the balance equation of fuel supply, and the fire reaction rate. We collect all material just for modeling purposes without going into a detailed study of mathematical prerequisites. The point of departure is the consideration of mass and fuel, where the conservation of mass is given by

\[ \int_\Omega \rho \, dV = m. \tag{4} \]

Here $\Omega$ is the domain of consideration, which is bounded and connected, $\rho$ the density, $dV$ the volume element, and $m$ the mass of fuel. Accordingly,

\[ \frac{d}{dt} \int_\Omega \rho \, dV = \frac{d}{dt} m = \dot{m} = 0, \tag{5} \]

where $\frac{d}{dt}$ denotes the total time derivative.
Furthermore, we introduce the mass change at the balance volume as

\[ \frac{\partial m_i}{\partial t} = \rho \, \frac{\partial Y_i}{\partial t} \, dV, \tag{6} \]

where $m_i$ is the mass element, $i$ the index of the mass elements, and $Y_i$ the mass fraction of fuel concerning the mass element. With this knowledge in mind, we can continue with the balance equation of fuel supply, where the change of the mass fraction obeys the following relation

\[ \frac{\partial Y}{\partial t} = -Y r, \tag{7} \]

where $r$ is the reaction rate. Next, we have a closer look at the conservation of energy, which can be characterized by the following features (for a more detailed explanation, see, e.g., the monograph by Ansorge and Sonar 2009):

(i) The thermal energy inside $\Omega$ is given by

\[ \int_\Omega \rho c T \, dV, \tag{8} \]

where $c$ denotes the heat capacity and $T$ the temperature.

(ii) The thermal energy flowing through the boundary $\partial\Omega$ is determined by

\[ \int_{\partial\Omega} j \cdot n \, dS, \tag{9} \]

where $j$ is the thermal flow, $n$ the outer (unit) normal field with respect to $\partial\Omega$, and $dS$ the surface element.

(iii) The thermal energy in $\Omega$ generated by the reaction $s$ in $\Omega$ at time $t$ is obtained by

\[ \int_\Omega s \, dV. \tag{10} \]
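At this stage, we are led to the following integral equation

\[ \frac{d}{dt} \int_\Omega \rho c T \, dV = -\int_{\partial\Omega} j \cdot n \, dS + \int_\Omega s \, dV. \tag{11} \]

By applying the Gauss theorem, we therefore end up with

\[ \frac{d}{dt} \int_\Omega \rho c T \, dV = -\int_\Omega \nabla \cdot j \, dV + \int_\Omega s \, dV. \tag{12} \]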
In consequence, we have

\[ \int_\Omega \left( \frac{\partial}{\partial t}(\rho c T) + v \cdot \nabla (\rho c T) \right) dV = -\int_\Omega \nabla \cdot j \, dV + \int_\Omega s \, dV; \tag{13} \]

thus, we obtain the integral identity

\[ \int_\Omega \left( \rho c \frac{\partial T}{\partial t} + \rho c \, v \cdot \nabla T \right) dV = -\int_\Omega \nabla \cdot j \, dV + \int_\Omega s \, dV. \tag{14} \]

Finally, from the integral representation (14), we can derive the partial differential equation specifying the conservation of energy as

\[ \rho c \frac{\partial T}{\partial t} + \rho c \, v \cdot \nabla T = -\nabla \cdot j + s. \tag{15} \]

In Eq. (15) we can understand the total flow $j_{\text{total}}$ in a separated way

\[ j_{\text{total}} = \rho c \, v T + j. \tag{16} \]

This canonically leads to a convective, diffusive, and radiative part

\[ j_{\text{total}} = j_c + j_d + j_r. \tag{17} \]

The radiative part $j_r$ is given by

\[ j_r = -q(T), \tag{18} \]

where $q(T)$ denotes the nonlinear heat flux. Furthermore, we need the well-known Fourier's law (see, e.g., Joos 2006). This law states that the (local) thermal flow through a material is equal to the product of the (local) heat conductivity and the (local) negative gradient of the temperature

\[ j_d = -k \nabla T. \tag{19} \]

Inserting the radiative part of the flow (18) and the diffusive part (19) in the conservation equation (15), we arrive at

\[ \rho c \frac{\partial T}{\partial t} + \rho c \, v \cdot \nabla T - \nabla \cdot (q + k \nabla T) = s. \tag{20} \]

The source term $s$ is represented by the released heat $Q$ and the cooling term

\[ s = Q - h (T - T_\infty), \tag{21} \]
where $T_\infty$ is the ambient temperature and $h$ the convection coefficient. Altogether, as already announced earlier, our approach leads to a similar equation as developed by Asensio and Ferragut (2002). Recapitulating all details, we end up with Eq. (22), which describes the change of the temperature

\[ \rho c \frac{\partial T}{\partial t} + \rho c \, v \cdot \nabla T - \nabla \cdot (q + k \nabla T) = Q - h (T - T_\infty), \tag{22} \]

containing the following constituents:

$\rho$        Density
$c$           Specific heat capacity
$T$           Temperature
$k$           Thermal conductivity
$t$           Time
$T_\infty$    Ambient temperature
$v$           Wind velocity
$q$           Nonlinear heat flux
$Q$           Released heat
$h$           Convection coefficient
After these preparations, our purpose is to turn to the released heat and the nonlinear heat flux (once again, our formulation essentially is in accordance with the concept due to Asensio and Ferragut (2002)). First we consider the nonlinear heat flux $q$ and use the Fourier law of the heat transfer to see that

\[ q = \frac{\dot{Q}}{dS}, \tag{23} \]

where $\dot{Q}$ denotes the change of the released heat. In addition, we make use of the Stefan-Boltzmann law

\[ \dot{Q} = p = \epsilon \sigma A T^4, \tag{24} \]

where $p$ denotes the radiated power, $\epsilon$ a constant, and $\sigma$ the Stefan-Boltzmann constant. Letting $\epsilon = 1$ we get

\[ p = \sigma A T^4, \tag{25} \]

such that

\[ q = \sigma \left( T^4(x + \delta) - T^4(x) \right). \tag{26} \]

By virtue of the linear Taylor expansion, we therefore arrive at the identity

\[ q = 4 \sigma \delta T^3 \nabla T. \tag{27} \]
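The step from (26) to (27) replaces a difference of fourth powers by its first-order Taylor term. The following sketch checks this numerically in one space dimension. The linear temperature profile, the value of the small length $\delta$, and the function names are assumptions chosen only for this check.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant in W/(m^2 K^4)


def flux_difference(temp, x, delta):
    """Eq. (26): sigma * (T(x + delta)^4 - T(x)^4) for a 1D profile T."""
    return SIGMA * (temp(x + delta) ** 4 - temp(x) ** 4)


def flux_linearized(temp, dtemp, x, delta):
    """Eq. (27) in one dimension: 4 * sigma * delta * T(x)^3 * T'(x)."""
    return 4.0 * SIGMA * delta * temp(x) ** 3 * dtemp(x)


# Hypothetical profile: 600 K at x = 0, rising by 50 K per metre.
def profile(x):
    return 600.0 + 50.0 * x


def profile_slope(x):
    return 50.0


delta = 0.01  # assumed small length in m
exact = flux_difference(profile, 0.0, delta)
approx = flux_linearized(profile, profile_slope, 0.0, delta)
relative_error = abs(exact - approx) / abs(exact)
print(exact, approx, relative_error)
```

For this profile the two expressions agree to within a fraction of a percent, and the deviation shrinks further as $\delta$ is reduced, as expected from a first-order expansion.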
In this context the released heat $Q$ can be specified by the identity

\[ Q = H S_f, \tag{28} \]

where $H$ denotes the combustion heat and $S_f$ the descent of educts (fuel). The descent of educts can be expressed by

\[ S_f = r Y. \tag{29} \]

For the fire reaction rate $r$, we have to take into account the Arrhenius equation (3). Moreover, our framework leads us to a reformulation of the released energy $Q$ in the form

\[ Q = H A \exp\left(-\frac{E_A}{RT}\right) Y. \tag{30} \]

If we plug Eq. (30) of the released energy $Q$ into Eqs. (22) and (7), we are finally led to the convection-diffusion-reaction problem, which can be formulated as follows:

\[ \rho c \frac{\partial T}{\partial t} = -\rho c \, v \cdot \nabla T + \nabla \cdot (q + k \nabla T) + H A Y \exp\left(-\frac{E_A}{RT}\right) - h (T - T_\infty), \tag{31} \]

\[ \frac{\partial Y}{\partial t} = -A Y \exp\left(-\frac{E_A}{RT}\right). \tag{32} \]

In order to observe the change between solid and gas phase, we make the following assumption (in accordance with, e.g., Mandel et al. 2007) to distinguish two different stages, thereby observing the fact that the reaction only takes place if the temperature of the fuel $T$ is greater than the ignition temperature $T_{ig}$. More concretely, we have

\[ \rho c \frac{\partial T}{\partial t} = \begin{cases} -\rho c \, v \cdot \nabla T + \nabla \cdot \left( (k + 4 \sigma \delta T^3) \nabla T \right) + H A Y \exp\left(-\frac{E_A}{RT}\right) - h (T - T_\infty), & T > T_{ig}, \\ -\rho c \, v \cdot \nabla T + \nabla \cdot \left( (k + 4 \sigma \delta T^3) \nabla T \right), & T \le T_{ig}, \end{cases} \tag{33} \]

\[ \frac{\partial Y}{\partial t} = \begin{cases} -A Y \exp\left(-\frac{E_A}{RT}\right), & T > T_{ig}, \\ 0, & T \le T_{ig}. \end{cases} \tag{34} \]

Obviously, these equations are coupled. Equation (33) describes the temperature of the fuel and Eq. (34) shows the behavior of the mass fraction of the fuel. We want to remark that this model does not include the wind generated by the fire itself.
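For a numerical realization, a simplified model is usually proposed, where we assume a constant diffusion coefficient such that

\[ \rho c \frac{\partial T}{\partial t} = \begin{cases} -\rho c \, v \cdot \nabla T + D \Delta T + H A Y \exp\left(-\frac{E_A}{RT}\right) - h (T - T_\infty), & T > T_{ig}, \\ -\rho c \, v \cdot \nabla T + D \Delta T, & T \le T_{ig}, \end{cases} \tag{35} \]

\[ \frac{\partial Y}{\partial t} = \begin{cases} -A Y \exp\left(-\frac{E_A}{RT}\right), & T > T_{ig}, \\ 0, & T \le T_{ig}. \end{cases} \tag{36} \]

In order to take the inherent slope into account, the convection-diffusion-reaction problem can be modified by considering a multivalued operator (for more details the reader is referred to the work by, e.g., Ferragut et al. 2004, 2005).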
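The case distinction in (35) and (36) can be sketched as a pointwise function that returns the reaction contributions only, i.e., without the transport terms. The function name `reaction_terms` and all numerical values below are hypothetical and serve only to show the switch at the ignition temperature.

```python
import math

R_GAS = 8.314462618  # universal gas constant in J/(mol K)


def reaction_terms(T, Y, A, E_A, H, h, T_inf, T_ig):
    """Reaction contributions to (35) and (36) at a single point.

    Returns (heat, dY_dt): the source that is added to the transport terms
    in (35) and the right-hand side of (36). Both vanish for T <= T_ig.
    """
    if T <= T_ig:
        return 0.0, 0.0
    burn = A * Y * math.exp(-E_A / (R_GAS * T))
    return H * burn - h * (T - T_inf), -burn


# Hypothetical parameter set (not taken from the chapter).
params = dict(A=1.0e3, E_A=8.3e4, H=1.0e8, h=1.0e4, T_inf=300.0, T_ig=500.0)

below = reaction_terms(450.0, 1.0, **params)   # fuel is present but not ignited
above = reaction_terms(900.0, 0.5, **params)   # burning fuel
print(below, above)
```

Below the ignition temperature both contributions are exactly zero, so in this model an unignited cell changes its temperature only through convection and diffusion.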
Comparison of Different Models
A survey on the different models (including the mathematical/physical features, which have been used so far) can be found in various publications, for example, in Pastor (2003) and Schöning (2009). It should be mentioned that the physical model we refer to in this chapter is based on a convection-diffusion-reaction formulation which will be studied numerically later on.
2.3
Preliminary Tools
Next, we are interested in the mathematical background of the Eqs. (35) and (36). It is helpful to start with the nondimensionalization.
Nondimensionalization
We adopt the nondimensionalization for the formulation of Eqs. (31) and (32) by Ferragut and Asensio (2004). In doing so, we also include the variable transform as proposed by Frank-Kamenetskii (as presented by, e.g., Joos 2006). Our consideration starts with fixing the initial conditions, i.e., the initial temperature $T_0 = T(0) \ge T_\infty$ and the initial fuel $Y_0 = Y(0) \ge 0$. Additionally, the characteristic parameters $t_0, l_0 > 0$ for time and length, respectively, are introduced as follows:

\[ t_0 = \frac{\varepsilon}{q A} \exp\left(\frac{1}{\varepsilon}\right) \quad \text{and} \quad l_0 = \sqrt{\frac{t_0 k}{\rho c}}. \]

In addition, the quantities arising in the nondimensionalization are as follows: the nondimensionalized position $\bar{x} = x / l_0$, the nondimensionalized time $\bar{t} = t / t_0$, the nondimensionalized temperature $\bar{u} = (T - T_\infty) / (\varepsilon T_\infty)$, the nondimensionalized mass fraction of fuel $\bar{y} = Y / Y_0$, and the nondimensionalized wind $w = (t_0 / l_0) \, v$, where the occurring parameters are understood as the inverse activation energy $\varepsilon$, the thermal flow $q$, the reaction factor $A$, the thermal conductivity $k$, the density $\rho$, and the specific heat capacity $c$. Observing all these manipulations in the convection-diffusion-reaction problem (see, e.g., Asensio and Ferragut 2002), we are led to the nonlinear equation system
\[ \frac{\partial \bar{u}}{\partial \bar{t}} = -w \cdot \nabla \bar{u} + \kappa \Delta \bar{u} + \bar{y} \exp\left( \frac{\bar{u}}{1 + \varepsilon \bar{u}} \right) - \alpha \bar{u}, \tag{37} \]

\[ \frac{\partial \bar{y}}{\partial \bar{t}} = -\frac{\varepsilon}{q} \, \bar{y} \exp\left( \frac{\bar{u}}{1 + \varepsilon \bar{u}} \right) \tag{38} \]

in $\Omega \times [0, t_{end}]$, where $\Omega \subset \mathbb{R}^2$ denotes the domain of consideration, which is canonically supposed to be bounded and connected. In addition, $\kappa$ is the thermal conductivity and $\alpha$ the convection coefficient. The essential questions which should be clarified are the existence and uniqueness of a solution of the coupled system (37) and (38). In fact, some extensive information can already be deduced from the literature. Usually, one deals with the convection-diffusion-reaction problem (37) and (38) in a more general setup by discussing the equations

\[ \frac{\partial \bar{u}}{\partial \bar{t}} + w \cdot \nabla \bar{u} - \nabla \cdot (\kappa \nabla \bar{u}) = f(\bar{u}, \bar{y}), \tag{39} \]

\[ \frac{\partial \bar{y}}{\partial \bar{t}} = g(\bar{u}, \bar{y}) \tag{40} \]

in $\Omega \times [0, t_{end}]$. Burmann and Ern (2005) verify the existence and uniqueness of the solution of the linear convection-diffusion-reaction equation, but, since we have to deal with a coupled system representing a nonlinear and time-dependent convection-diffusion-reaction equation, their problem covers only a special case of the aforementioned problem. Diez and Vrabie (1994) are concerned with a nonlinear reaction-diffusion-reaction model. Their proof of existence and uniqueness was adopted by Asensio and Ferragut (2002) in order to discuss the nonlinear convection-diffusion-reaction problem. Keeping all prework concerning a reasonable understanding of existence and uniqueness in mind, we attempt to formulate a numerical solution theory of the system (37) and (38) by collocational procedures.
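To give a first impression of the behavior of the nondimensional system (37) and (38), the following sketch integrates a one-dimensional periodic version with explicit finite differences (first-order upwind for convection, central differences for diffusion, explicit Euler in time). This is deliberately not the collocation method developed in this chapter, and all parameter values, the initial state, and the function name `simulate_1d` are assumptions made for illustration. The sketch assumes a positive diffusion coefficient.

```python
import numpy as np


def simulate_1d(n_cells=200, length=20.0, t_end=2.0, w=1.0,
                kappa=0.5, alpha=0.1, eps=0.3, q=1.0):
    """Explicit finite differences for (37)-(38) on a periodic line."""
    dx = length / n_cells
    x = (np.arange(n_cells) + 0.5) * dx

    # Assumed initial state: a warm spot on a completely unburnt fuel bed.
    u = 3.0 * np.exp(-(x - 0.25 * length) ** 2)
    y = np.ones(n_cells)

    # Step size limited by convection, diffusion, and the largest possible
    # reaction rate, since exp(u / (1 + eps * u)) < exp(1 / eps) for u >= 0.
    dt_limit = min(dx / max(abs(w), 1e-12),
                   dx ** 2 / (2.0 * kappa),
                   q / (eps * np.exp(1.0 / eps)))
    n_steps = int(np.ceil(t_end / (0.4 * dt_limit)))
    dt = t_end / n_steps

    for _ in range(n_steps):
        left, right = np.roll(u, 1), np.roll(u, -1)
        # First-order upwind difference for the convective term.
        du_dx = (u - left) / dx if w >= 0 else (right - u) / dx
        laplacian = (left - 2.0 * u + right) / dx ** 2
        burn = y * np.exp(u / (1.0 + eps * u))
        u_new = u + dt * (-w * du_dx + kappa * laplacian + burn - alpha * u)
        y = y - dt * (eps / q) * burn
        u = u_new
    return x, u, y


x, u, y = simulate_1d()
print(u.max(), y.min(), y.max())
```

Because (37) and (38) contain no ignition threshold, the fuel is consumed everywhere, only much faster near the warm spot. The step size restriction keeps the temperature and the fuel fraction nonnegative for the default parameters.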
Numerical Prework
Before we start with our collocational solution process, we are well advised to provide a short overview of the relevant numerical methods which have been used so far for the simulation of forest fire spreading. In doing so, we only give some insight into the numerical solution technique without any consideration of all the involved assumptions and conditions, which can be found in the quoted literature. Usually a weak formulation of the differential system was considered as the appropriate tool: Asensio and Ferragut (2002) used finite element methods, where the convective term was treated separately in the form of a so-called splitting method. More specifically, they started with a convective step and performed another step for the diffusive and reactive term afterwards. They subsequently applied the Godunov method for the convective term, an implicit Euler scheme for the reaction term, a finite element method for the diffusion, a Runge-Kutta scheme
for the time discretization, and finally an upwinding scheme. Perminov (2010) based his approach on a splitting method for the numerical solution of a convection-diffusion-reaction problem. In more detail, he considered a splitting in accordance with the physical processes. Mandel et al. (2007) applied a central finite difference scheme for the space discretization and an explicit Euler method for the time discretization and combined it with a data assimilation technique to get some initial conditions. Clearly, the weak formulation is the conventional way to solve a nonlinear convection-diffusion-reaction problem. Because of the nonlinearity, however, this concept poses some difficulties. Indeed, we are usually confronted with artificial oscillation phenomena, which appear in the numerical realization. One possibility to overcome this problem seems to be the application of a discontinuous Galerkin scheme (as proposed, e.g., for the Navier-Stokes equation by Fengler and Freeden 2005). To avoid any integration and thus achieve numerical efficiency and economy, we will present a stabilization technique by flux-corrected tools in the next section. As a matter of fact, artificial oscillations are caused by the Gibbs phenomenon and usually occur in the convection-dominated case. The Gibbs phenomenon appears in numerical approximations at discontinuities and causes over- and undershoots. In order to overcome the difficulty concerning the convection dominance, we investigate the so-called Péclet number in a more explicit way. According to its definition, the Péclet number Pe is the ratio of heat transport by convection to heat transport by conduction

\[ \mathrm{Pe} = \frac{l_0 v}{a} = \frac{l_0 v \rho c}{k} = \frac{l_0^2 \rho c \, w}{k t_0}, \tag{41} \]

where $l_0$ is the characteristic length, $w$ the (nondimensionalized) velocity, $a$ the thermal diffusivity, $\rho$ the density, $c$ the specific heat capacity, $k$ the thermal conductivity, and $t_0$ the characteristic time. Obviously, the most important parameter is the wind velocity. In fact, the wind influences the Péclet number strongly. We are interested in two different cases:

$\mathrm{Pe} > 1$, i.e., the convection-dominated case
$\mathrm{Pe} \le 1$, i.e., the diffusion-dominated case

As mentioned before, in order to obtain a solution avoiding oscillations caused by the Gibbs phenomenon, we have to stabilize the convection-diffusion-reaction part. In this respect it should be remarked that the tools for flux-corrected transport are a classical technique. In the meantime, some other approaches like essentially non-oscillatory (ENO) and weighted essentially non-oscillatory (WENO) schemes have been proposed and can be found, e.g., in Harten (1983) and Shu (1997). Furthermore, John and Novo (2012) compared flux-corrected transport tools with ENO/WENO schemes in the context of a linear convection-diffusion-reaction problem. For more details and numerical results, the reader is referred to Eberle (2015).
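The case distinction by means of the Péclet number (41) can be sketched in a few lines. The material values below are rough values for air and, like the function names, are assumptions for illustration only.

```python
def peclet_number(l0, v, rho, c, k):
    """Middle expression of Eq. (41): Pe = l0 * v * rho * c / k."""
    return l0 * v * rho * c / k


def transport_regime(pe):
    """Case distinction used in the text."""
    return "convection-dominated" if pe > 1.0 else "diffusion-dominated"


# Assumed values, roughly those of air near room temperature.
rho, c, k = 1.2, 1005.0, 0.026   # kg/m^3, J/(kg K), W/(m K)
l0 = 1.0                          # characteristic length in m

pe_windy = peclet_number(l0, 5.0, rho, c, k)     # wind speed 5 m/s
pe_calm = peclet_number(l0, 1.0e-6, rho, c, k)   # almost no wind
print(pe_windy, transport_regime(pe_windy))
print(pe_calm, transport_regime(pe_calm))
```

Under these assumptions even a moderate wind yields a Péclet number far above 1, which indicates why the convection-dominated case, and hence the stabilization, matters in practice.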
3
Collocational Solution Theory
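Next, we deal with the discrete numerical solution theory based on a collocational method for the space discretization and a Crank-Nicolson scheme for the time discretization instead of methods involving the weak formulation of the problem. In consistency with our theoretical concept, we have to discuss

\[ \rho c \frac{\partial T}{\partial t} = \begin{cases} -\rho c \, v \cdot \nabla T + D \Delta T + H A Y \exp\left(-\frac{E_A}{RT}\right) - h (T - T_\infty), & T > T_{ig}, \\ -\rho c \, v \cdot \nabla T + D \Delta T, & T \le T_{ig}, \end{cases} \tag{42} \]

\[ \frac{\partial Y}{\partial t} = \begin{cases} -A Y \exp\left(-\frac{E_A}{RT}\right), & T > T_{ig}, \\ 0, & T \le T_{ig} \end{cases} \tag{43} \]

in $\Omega \times [0, t_{end}]$ with initial conditions $T(0) = T_0$ and $Y(0) = Y_0$. However, for the sake of simplicity, we consider a more general case of the system (42) and (43)

\[ \rho c \frac{\partial T}{\partial t} = -\rho c \, v \cdot \nabla T + D \Delta T + H A Y \exp\left(-\frac{E_A}{RT}\right) - h (T - T_\infty), \tag{44} \]

\[ \frac{\partial Y}{\partial t} = -A Y \exp\left(-\frac{E_A}{RT}\right) \tag{45} \]

in $\Omega \times [0, t_{end}]$ with initial conditions $T(0) = T_0$ and $Y(0) = Y_0$. Since the coupled equations (44) and (45) depend on time and space, we have to perform both discretizations.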
3.1
Space Discretization
The collocational method for the space discretization consists of the specification of a finite-dimensional space of candidate solutions and a certain number of points lying in the domain $\Omega$. We select a solution in such a way that Eqs. (44) and (45) as well as the corresponding initial conditions are satisfied at the collocation points. In more detail, we introduce two types of grids, the grid $X \subset \mathbb{R}^2$ reflecting the collocation points and the grid $Z \subset \mathbb{R}^2$ relating to the trial functions. In fact, our decision is to choose radial basis functions in $\mathbb{R}^2$, which are at least twice continuously differentiable, as well as the same grid for the collocation points and the trial functions. First we want to handle the initial conditions. For that purpose, we use the following sum to determine the approximation of the initial temperature $u^0$

\[ u_i^0 = \sum_{j=1}^{N} a_j^0 \, \phi(x_i, z_j), \qquad x_i \in X, \ z_j \in Z, \ i = 1, \ldots, N, \tag{46} \]
where $\phi$ denotes the basis function and $a^0 = (a_1^0, \ldots, a_N^0)^T$ is the vector of coefficients $a_j^0$ of the temperature. The expression (46) yields the mass matrix $M$ with entries $m_{ij} = \phi(x_i, z_j)$. In shorthand notation, the approximation of the initial conditions leads to the equation system

\[ M a^0 = u^0. \tag{47} \]
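The system (47) can be sketched as follows, assuming the compactly supported Wendland function that the text introduces as (48). The grid, the support radius, the initial temperature field, and the function names are hypothetical choices for this illustration, and a dense solver is used for brevity although the matrix is sparse.

```python
import numpy as np


def wendland(r, support_radius):
    """Compactly supported radial basis function (1 - r/R)^4_+ (4 r/R + 1)."""
    s = np.clip(r / support_radius, 0.0, 1.0)
    return (1.0 - s) ** 4 * (4.0 * s + 1.0)


def mass_matrix(points, centers, support_radius):
    """Collocation matrix with entries m_ij = phi(||x_i - z_j||)."""
    diff = points[:, None, :] - centers[None, :, :]
    return wendland(np.linalg.norm(diff, axis=-1), support_radius)


# Hypothetical setup: a 10 x 10 grid on the unit square used both as
# collocation points X and as centers Z, as described in the text.
side = np.linspace(0.0, 1.0, 10)
X = np.array([(px, py) for px in side for py in side])
R = 0.35
M = mass_matrix(X, X, R)

# Assumed initial temperature: a Gaussian hot spot on an ambient level of 300.
u0 = 300.0 + 500.0 * np.exp(-40.0 * np.sum((X - 0.5) ** 2, axis=1))
a0 = np.linalg.solve(M, u0)   # Eq. (47): M a0 = u0
print(M.shape, np.count_nonzero(M) / M.size)
```

Since the same grid is used for collocation points and centers, the matrix is symmetric with unit diagonal, and the compact support makes a large share of its entries vanish.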
Clearly, the coefficients are needed for the space discretization within the time step scheme, which we will consider later. We use as the ansatz function for the temperature of the fuel $T$ a radial basis function, e.g., the function (see, e.g., Wendland 2005)

\[ \phi(r) = \begin{cases} \left( 1 - \frac{r}{R} \right)^4 \left( \frac{4r}{R} + 1 \right), & 0 \le r \le R, \\ 0, & r > R, \end{cases} \tag{48} \]

where, as usual, $r = \| x - z \|$, $x, z \in \mathbb{R}^2$. Obviously, the radial basis functions of type (48) possess a compact support. Of course, other kinds of locally supported radial basis functions can be used (e.g., Eskin 1981; Cui et al. 1992; Schaffeld 1988). Indeed, our comparison showed that the influence of the choice of locally supported radial basis functions within the collocation procedure is relatively small if the local supports are chosen correspondingly. Under this assumption, the collocation mass matrix contains the values of the function (48) for different distances between collocation points and centers. In conclusion, the structure of the mass matrix is symmetric and sparse. Furthermore, all entries are positive and the matrix is regular. After the approximation of the initial conditions, we use the following ansatz for the space discretization of the temperature $T$ in the convection-diffusion-reaction problem (44) and (45), i.e.,
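\[ u_i = \sum_{j=1}^{N} a_j \, \phi_j(x_i), \qquad x_i \in X, \ i = 1, \ldots, N, \tag{49} \]

where $\phi_j(x_i) = \phi(x_i, z_j)$ are the entries of the basis function as defined in Eq. (48). Altogether, we arrive at the following equations for the approximations $u$ and $y$ of the temperature $T$ and the mass fraction of the fuel $Y$, respectively:

\[ \rho c \frac{\partial u}{\partial t} = -\rho c \, v \cdot \nabla \sum_{j=1}^{N} a_j \phi_j + D \Delta \sum_{j=1}^{N} a_j \phi_j + H A y \exp\left(-\frac{E_A}{Ru}\right) - h \left( \sum_{j=1}^{N} a_j \phi_j - T_\infty \right), \tag{50} \]

\[ \frac{\partial y}{\partial t} = -A y \exp\left(-\frac{E_A}{Ru}\right). \tag{51} \]

Since the differential operators are linear, we are able to justify the superposition of the terms such that

\[ \rho c \frac{\partial u}{\partial t} = \sum_{j=1}^{N} \left( -\rho c \, v \cdot \nabla \phi_j + D \Delta \phi_j - h \phi_j \right) a_j + h T_\infty + H A y \exp\left(-\frac{E_A}{Ru}\right), \tag{52} \]

\[ \frac{\partial y}{\partial t} = -A y \exp\left(-\frac{E_A}{Ru}\right). \tag{53} \]

But, because of its nonlinearity, we have to consider the reaction part $\exp\left(-\frac{E_A}{Ru}\right)$ occurring in Eqs. (52) and (53) separately. This is the reason why we use an iterative approach, that is, we apply the approximation of the former time step in our time-stepping scheme.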
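The linear part of (52) requires the gradient and the Laplacian of the basis function at the collocation points. For the function (48) in two dimensions, differentiation with $s = r/R$ gives $\nabla_x \phi = -\frac{20}{R^2}(1-s)^3 (x - z)$ and $\Delta \phi = -\frac{20}{R^2}(1-s)^2(2 - 5s)$ inside the support. The following sketch assembles the corresponding collocation matrix; the grid, the parameter values, and the function names are assumptions for illustration.

```python
import numpy as np


def wendland(r, R):
    """Basis function (48)."""
    s = np.clip(r / R, 0.0, 1.0)
    return (1.0 - s) ** 4 * (4.0 * s + 1.0)


def wendland_gradient(x, z, R):
    """Gradient with respect to x of phi(||x - z||) in two dimensions."""
    d = x - z
    s = np.clip(np.linalg.norm(d, axis=-1) / R, 0.0, 1.0)
    factor = np.asarray(-20.0 / R ** 2 * (1.0 - s) ** 3)
    return factor[..., None] * d


def wendland_laplacian(x, z, R):
    """Two-dimensional Laplacian with respect to x of phi(||x - z||)."""
    s = np.clip(np.linalg.norm(x - z, axis=-1) / R, 0.0, 1.0)
    return -20.0 / R ** 2 * (1.0 - s) ** 2 * (2.0 - 5.0 * s)


def transport_matrix(X, Z, R, rho_c, v, D, h):
    """Entries -rho*c*v.grad(phi_j) + D*lap(phi_j) - h*phi_j at x_i, cf. (52)."""
    xi, zj = X[:, None, :], Z[None, :, :]
    grad = wendland_gradient(xi, zj, R)      # shape (N, N, 2)
    lap = wendland_laplacian(xi, zj, R)      # shape (N, N)
    phi = wendland(np.linalg.norm(xi - zj, axis=-1), R)
    return -rho_c * (grad @ v) + D * lap - h * phi


# Hypothetical 5 x 5 grid and parameter values.
side = np.linspace(0.0, 1.0, 5)
X = np.array([(px, py) for px in side for py in side])
K = transport_matrix(X, X, R=0.6, rho_c=1.0, v=np.array([1.0, 0.5]), D=0.1, h=0.01)
print(K.shape)
```

The nonlinear reaction term is not part of this matrix. In line with the text, it would be evaluated with the temperature of the previous time step and added to the right-hand side.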
3.2
Time Discretization
As Eqs. (44) and (45) also depend on time, we also need a time discretization. An appropriate concept seems to be a time-stepping scheme like the Crank-Nicolson scheme. Actually, our approach enables us to simulate a forest fire for the diffusion-dominated case appropriately. However, the convection-dominated problem still causes some trouble because of the occurrence of undesired oscillations. As compensation, some stabilization procedure within the temperature approximation is needed.
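A Crank-Nicolson step for a semi-discrete system of the form $M \, \frac{da}{dt} = K a + b$ can be sketched as follows, where the source $b$ is frozen at the old time level in the spirit of the lagged reaction term. The generic form of the system, the function name, and the scalar test problem are assumptions for illustration and do not reproduce the full forest fire discretization.

```python
import numpy as np


def crank_nicolson_step(M, K, a, source, dt):
    """One step of M da/dt = K a + source, with the source frozen at the old time level."""
    lhs = M - 0.5 * dt * K
    rhs = (M + 0.5 * dt * K) @ a + dt * source
    return np.linalg.solve(lhs, rhs)


# Scalar test problem da/dt = -a with a(0) = 1, whose exact solution is exp(-t).
M = np.array([[1.0]])
K = np.array([[-1.0]])
a = np.array([1.0])
dt, n_steps = 0.01, 100
for _ in range(n_steps):
    a = crank_nicolson_step(M, K, a, np.zeros(1), dt)
error = abs(a[0] - np.exp(-1.0))
print(a[0], error)
```

For the linear test problem the scheme is second-order accurate in the step size, whereas freezing a nonlinear source at the old time level generally reduces the formal order of that term to one.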
3.3
Stabilization
For the stabilization using tools of flux-corrected transport, we follow the procedure proposed by Kuzmin (2012) as well as John and Novo (2012). The basic idea is to add an artificial diffusion term in a first step that will be compensated later in a diffusion-away procedure (anti-diffusion). In this process, they introduced some design criteria which we adopted and modified for our problem. The construction is done in such a way that the scheme is positivity preserving and local extremum diminishing. To get more insight into the technique, we need some definitions and theorems concerning positivity preserving and local extremum diminishing which build the background for the flux-corrected transport tools. We start with some definitions by Kuzmin et al. (2012) (pages 147–149).
Definition 1 (Positivity preserving). The solution of a scalar transport equation is positivity preserving, if

\[ \min_{\partial\Omega} u \ge 0 \quad \Rightarrow \quad u \ge 0, \tag{54} \]

where $u$ is the solution of a scalar transport equation and $\partial\Omega$ is the boundary of the reference domain $\Omega$.

Because we work with algebraic systems, we need the definition of the so-called M-matrix (see, e.g., Möller 2008, page 24), which plays an important role within positivity preserving schemes.

Definition 2 (M-matrix). A square matrix $M = \{m_{ij}\}$ with $m_{ij} \in \mathbb{R}$ is called an M-matrix, if it is non-singular, its off-diagonal coefficients satisfy $m_{ij} \le 0$ for all $j \ne i$, and all entries of $M^{-1}$ satisfy $m_{ij}^{-1} \ge 0$.

This definition is needed for the following theorem, which tells us under which conditions a discrete numerical scheme is positivity preserving.

Theorem 3. A fully discrete numerical scheme is positivity preserving, if it can be written as an algebraic system of the form

\[ M u^{n+1} = K u^n, \tag{55} \]

where $M = \{m_{ij}\}$ is an M-matrix, $K = \{k_{ij}\}$ has no negative entries, and $u_i^n$, $u_i^{n+1}$ are the coefficients for $u$ at the time steps $n$ and $n+1$.

This setting gives rise to continue our consideration with the semi-discrete problem

\[ M \frac{du}{dt} = K u, \tag{56} \]

where $M = \{m_{ij}\}$ is the mass matrix and $K = \{k_{ij}\}$ the discrete transport operator, which leads to the identity

\[ \sum_j m_{ij} \frac{d u_j}{dt} = \sum_j k_{ij} u_j. \tag{57} \]
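The conditions of Definition 2 and Theorem 3 can be checked numerically for small matrices. The following sketch does so with a dense inverse, which is only sensible for illustration. The function name `is_m_matrix`, the tolerance, and the example matrices are assumptions.

```python
import numpy as np


def is_m_matrix(M, tol=1e-12):
    """Check Definition 2: non-singular, off-diagonals <= 0, inverse entrywise >= 0."""
    M = np.asarray(M, dtype=float)
    off_diagonal = M - np.diag(np.diag(M))
    if np.any(off_diagonal > tol):
        return False
    if np.linalg.matrix_rank(M) < M.shape[0]:
        return False
    return bool(np.all(np.linalg.inv(M) >= -tol))


# Example in the form (55): a diagonally dominant tridiagonal matrix on the
# left-hand side and the identity on the right-hand side.
M = np.array([[3.0, -1.0, 0.0],
              [-1.0, 3.0, -1.0],
              [0.0, -1.0, 3.0]])
K = np.eye(3)

u_old = np.array([0.0, 1.0, 0.0])        # nonnegative data at time step n
u_new = np.linalg.solve(M, K @ u_old)    # one step of M u^{n+1} = K u^n
print(is_m_matrix(M), u_new)
```

Since the inverse of an M-matrix has no negative entries and $K$ has none either, nonnegative data at step $n$ stay nonnegative at step $n+1$, which is the statement of Theorem 3.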
As mentioned before, we also want the numerical scheme to be local extremum diminishing, which means that no new extrema can form and existing extrema cannot grow. Therefore, we borrow the following definition from Möller (2008) (page 23):
Definition 4 (Local extremum diminishing). A space discretization of the form (57) with $m_{ii} > 0$ and $k_{ij} \ge 0$ for all $i$ and $j \ne i$ is called local extremum diminishing (LED).

Next, we go on with some helpful equipment available from Kuzmin et al. (2012).

Theorem 5. Suppose that (57) satisfies the properties

\[ m_{ii} > 0, \tag{58} \]

\[ m_{ij}, \ k_{ij} \ge 0, \qquad i \ne j. \tag{59} \]

Then the following a priori estimates hold true for the solution $u_i$:
(i) If $\sum_j k_{ij} = 0$ and $u_i \ge u_j$ for all $j \ne i$, then $\frac{d u_i}{dt} \le 0$.
(ii) If $u_j(0) \ge 0$ for all $j$, then $u_i(t) \ge 0$ for all $t > 0$ (positivity preservation).

The proof can be found in Kuzmin et al. (2012) (page 149). The details are omitted here. Nevertheless, it should be remarked that all conditions of Theorem 5 are fulfilled for our collocational method; thus, it can be applied in our framework. Next, we are interested in criteria for admissible time steps, where we are confronted with the following $\theta$-scheme:

\[ m_{ii} \frac{u_i^{n+1} - u_i^n}{\Delta t} = \theta \sum_{j \ne i} c_{ij}^{n+1} \left( u_j^{n+1} - u_i^{n+1} \right) + (1 - \theta) \sum_{j \ne i} c_{ij}^{n} \left( u_j^{n} - u_i^{n} \right). \tag{60} \]

Herein, $m_{ii}$ is the diagonal element of the mass matrix $M$; $u_i^n$ and $u_i^{n+1}$ are the coefficients for $u$ at the time steps $n$ and $n+1$; $\Delta t$ is the time step; $\theta$ is a suitable parameter, which leads to the different kinds of time-stepping schemes as, for example, the Euler scheme and the Crank-Nicolson scheme; and $c_{ij}$ is the coefficient of the space discretization. The largest admissible time step is bounded from above, if some diagonal coefficients $c_{ii}^{n+1}$ are strictly positive:

\[ \Delta t_{\max}^{\text{admis}} \ \ldots \tag{61} \]

where $c_{ii}^{n+1}$ denotes these diagonal coefficients and $0 \le \theta < 1$.

Now, we have the required background to construct the flux-corrected transport scheme as proposed by Kuzmin et al. (2012), which is designed in such a way that it is positivity preserving and local extremum diminishing. The following approach can also be found in the PhD thesis by Eberle (2015). Applied to our problem of forest fire spreading, we first have to add an artificial
S. Eberle et al.
diffusion in order to get the low-order problem. To this end, we calculate the lumped mass matrix $M_L = \{m^{L}_{ij}\}$ (Kuzmin et al. 2012, pages 154–155) instead of the mass matrix $M$ with
\[ m^{L}_{ii} = \sum_k m_{ik} \quad \text{for } i = j, \tag{62} \]
\[ m^{L}_{ij} = 0 \quad \text{for } i \ne j. \tag{63} \]
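Row-sum lumping as in (62) and (63) takes only a few lines. The following sketch assumes NumPy; the function name `lump_mass_matrix` and the small tridiagonal matrix are illustrative and do not come from the collocation method of this chapter.

```python
import numpy as np

def lump_mass_matrix(M):
    """Row-sum lumping, Eqs. (62)-(63): m^L_ii = sum_k m_ik and m^L_ij = 0 for i != j."""
    M = np.asarray(M, dtype=float)
    return np.diag(M.sum(axis=1))

# Hypothetical consistent mass matrix of a small 1D problem.
M = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]]) / 6.0
M_L = lump_mass_matrix(M)

print(np.diag(M_L))          # diagonal entries are the row sums of M
print(M.sum(), M_L.sum())    # the total mass is unchanged by lumping
```

Because $M_L$ is diagonal with positive entries, applying $M_L^{-1}$ reduces to a division by the row sums, which keeps the later update steps cheap.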
In this context, the operator $K^{H}$ of the high-order problem and the diffusion operator $D$ are introduced in order to get the low-order operator $K^{L} = K^{H} + D$. The operator $K^{H}$ of the high-order problem is given by
\[ k^{H}_{ij}(\varphi) = c\, v \cdot \nabla \varphi(x_i, z_j) + D\, \Delta \varphi(x_i, z_j) \tag{64} \]
and describes the convection and diffusion, where, as usual, $\varphi$ denotes the basis function. We introduce the diffusion operator in the form
\[ d_{ii} = -\sum_{k \ne i} d_{ik}, \tag{65} \]
\[ d_{ij} = d_{ji} = \max\{0, -k^{H}_{ij}, -k^{H}_{ji}\} \quad \text{for } i < j \tag{66} \]
by following the approach by Kuzmin et al. (2004) (page 10). The operator $K^{L}$ is constructed in such a way that it has zero row and column sums, since under this condition, Theorem 5 tells us that the scheme is local extremum diminishing. Furthermore, the operator $K^{L}$ is an M-matrix from which the positivity preserving property can be concluded (see Theorem 5). On the right-hand side of Eq. (44), we recognize the reaction part of our convection-diffusion-reaction problem, i.e.,
\[ q(u) = H A Y \exp\left( -\frac{E_A}{R T} \right) - h (u - T_\infty) \tag{67} \]
and denote by $q^{n}$ the reaction term at time step $n$. We follow the indicated procedure and approximate the coefficients $a$ by
\[ a^{n} = a^{n-1} - \frac{\Delta t_n}{2} M_L^{-1} \left( K^{L} a^{n-1} - q^{n-1} \right). \tag{68} \]
Next, a time discretization via the Crank-Nicolson scheme ($\theta = \frac{1}{2}$) for the low-order problem is performed as follows:
\[ \left( M_L - \theta \Delta t_n K^{L} \right) a^{n} = \left( M_L + (1 - \theta) \Delta t_n K^{L} \right) a^{n-1} + \Delta t_n q^{n}, \tag{69} \]
where $\theta$ is a parameter, $\Delta t_n$ is the time step size, $a^{n}$ are the coefficients of the temperature at the time step $n$, and $q^{n}$ contains the source term. Subsequently, the coefficients $\tilde{a}$ are approximated by calculating
\[ \tilde{a} = a^{n-1} - \frac{\Delta t_n}{2} M_L^{-1} \left( K^{L} a^{n-1} - q^{n-1} \right). \tag{70} \]
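The construction of the diffusion operator (65), (66) and one step of the low-order scheme (69) can be sketched as follows. This is a minimal illustration, assuming NumPy and a hypothetical high-order operator (central differences for pure convection on a periodic grid with lumped mass $h$); the names `artificial_diffusion` and `theta_step` are not from the chapter, and the reaction term is set to zero.

```python
import numpy as np

def artificial_diffusion(K_H):
    """Eqs. (65)-(66): d_ij = d_ji = max{0, -k_ij, -k_ji} (i != j), d_ii = -sum_{k != i} d_ik."""
    K_H = np.asarray(K_H, dtype=float)
    D = np.maximum(0.0, np.maximum(-K_H, -K_H.T))
    np.fill_diagonal(D, 0.0)
    np.fill_diagonal(D, -D.sum(axis=1))
    return D

def theta_step(M_L, K_L, a_old, q, dt, theta=0.5):
    """One step of Eq. (69): (M_L - theta*dt*K_L) a_new = (M_L + (1-theta)*dt*K_L) a_old + dt*q."""
    lhs = M_L - theta * dt * K_L
    rhs = (M_L + (1.0 - theta) * dt * K_L) @ a_old + dt * q
    return np.linalg.solve(lhs, rhs)

# Assumed high-order operator: central differences for convection with speed v.
n, v = 40, 1.0
h = 1.0 / n
K_H = np.zeros((n, n))
for i in range(n):
    K_H[i, (i + 1) % n] = -0.5 * v
    K_H[i, i - 1] = 0.5 * v

M_L = h * np.eye(n)
D = artificial_diffusion(K_H)
K_L = K_H + D                       # low-order operator, here the upwind scheme

a = np.where(np.arange(n) < n // 2, 1.0, 0.0)
q = np.zeros(n)                     # no reaction in this sketch
for _ in range(20):
    a = theta_step(M_L, K_L, a, q, dt=h / v)

print("range of the low-order solution:", a.min(), a.max())
```

For this example the added diffusion removes all negative off-diagonal entries of $K^{H}$, and with the chosen step size the Crank-Nicolson update keeps the step profile within its initial bounds.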
After these operations we apply Zalesak's algorithm, to be found in Zalesak (1979) (page 341), in order to modify the right-hand side of the problem. For that purpose, we are led to the residuum
\[ r_i = \sum_j r_{ij} = \sum_j \left[ m_{ij} \left( \tilde{a}_i^{n} - \tilde{a}_j^{n} \right) - m_{ij} \left( a_i^{n-1} - a_j^{n-1} \right) - \frac{\Delta t_n}{2} d_{ij} \left( \tilde{a}_i^{n} - \tilde{a}_j^{n} + a_i^{n-1} - a_j^{n-1} \right) \right], \tag{71} \]
where $i$ is the index of the collocation point $i$ and $j$ the index of the neighbor points $j$. The residuum $r$ and the weights $\alpha$ are used to calculate the modified right-hand side
\[ q^{*}_i(\tilde{a}^{n}, a^{n-1}) = \sum_j \alpha_{ij} r_{ij}, \tag{72} \]
where the parameters $\alpha \in [0, 1]$ are determined by Zalesak's algorithm. To this end, only the $N$ nearest neighbors are taken into account:
1. First we consider
\[ P_j^{+} = \sum_{i=1,\, i \ne j}^{N} \max\{0, r_{ij}\}, \tag{73} \]
\[ P_j^{-} = \sum_{i=1,\, i \ne j}^{N} \min\{0, r_{ij}\}. \tag{74} \]
$P_j^{+}$ and $P_j^{-}$ describe the input of the fluxes at point $j$ from all neighbor points $i$.
2. After that we calculate
\[ Q_j^{+} = \max\left\{ 0, \max_{i=1,\dots,N} \left( \tilde{a}_i - \tilde{a}_j \right) \right\}, \tag{75} \]
\[ Q_j^{-} = \min\left\{ 0, \min_{i=1,\dots,N} \left( \tilde{a}_i - \tilde{a}_j \right) \right\}. \tag{76} \]
Fig. 3 Flux limiter due to Zalesak (1D scheme)
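$Q_j^{+}$ and $Q_j^{-}$ are the distances of the value at the point $j$ to the maximum/minimum over its neighbor points $i$.
3. Now we use the values $P_j^{+}$, $P_j^{-}$, $Q_j^{+}$ and $Q_j^{-}$ to calculate
\[ R_j^{+} = \min\left\{ 1, \frac{m_j Q_j^{+}}{P_j^{+}} \right\}, \tag{77} \]
\[ R_j^{-} = \min\left\{ 1, \frac{m_j Q_j^{-}}{P_j^{-}} \right\}. \tag{78} \]
4. Next, we are in position to determine the weights
\[ \alpha_{ij} = \begin{cases} \min\{R_i^{+}, R_j^{-}\}, & r_{ij} > 0, \\ \min\{R_i^{-}, R_j^{+}\}, & \text{else}. \end{cases} \tag{79} \]
In order to get more insight into this limiter, we refer to the graphic in Fig. 3 for a one-dimensional problem. Finally, the collocation coefficients are determined for the modified problem in order to get the stabilized solution by solving the system
\[ a^{n} = a^{n-1} - \frac{\Delta t_n}{2} M_L^{-1} \left( K^{L} a^{n-1} - q^{n-1} - q^{*} \right). \tag{80} \]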
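Steps 1 to 4 of the limiter can be sketched in a few lines. The code below is an illustration and not the implementation used for the results of this chapter: it assumes NumPy, treats all other points as neighbors (a dense setting instead of the $N$ nearest neighbors), and uses the common convention that $r_{ij}$ is the flux received by point $i$ from point $j$ with $r_{ji} = -r_{ij}$. The function name `zalesak_weights` and the random test data are hypothetical.

```python
import numpy as np

def zalesak_weights(r, a_tilde, m_lumped):
    """Weights alpha_ij in [0, 1] for antisymmetric fluxes r_ij (flux into point i from point j)."""
    r = np.array(r, dtype=float)
    np.fill_diagonal(r, 0.0)
    n = len(a_tilde)
    # Step 1: sums of positive and negative fluxes into each point.
    P_plus = np.maximum(0.0, r).sum(axis=1)
    P_minus = np.minimum(0.0, r).sum(axis=1)
    # Step 2: distances to the local maximum and minimum of the predictor.
    diff = a_tilde[None, :] - a_tilde[:, None]      # diff[i, j] = a_tilde_j - a_tilde_i
    Q_plus = np.maximum(0.0, diff.max(axis=1))
    Q_minus = np.minimum(0.0, diff.min(axis=1))
    # Step 3: nodal correction factors (set to 1 where no flux has to be limited).
    R_plus = np.ones(n)
    R_minus = np.ones(n)
    pos = P_plus > 0.0
    neg = P_minus < 0.0
    R_plus[pos] = np.minimum(1.0, m_lumped[pos] * Q_plus[pos] / P_plus[pos])
    R_minus[neg] = np.minimum(1.0, m_lumped[neg] * Q_minus[neg] / P_minus[neg])
    # Step 4: weights as in Eq. (79).
    return np.where(r > 0.0,
                    np.minimum(R_plus[:, None], R_minus[None, :]),
                    np.minimum(R_minus[:, None], R_plus[None, :]))

rng = np.random.default_rng(0)
n = 6
a_tilde = rng.random(n)                 # hypothetical predictor values
F = rng.normal(size=(n, n))
r = F - F.T                             # antisymmetric fluxes
m = np.ones(n)                          # lumped masses

alpha = zalesak_weights(r, a_tilde, m)
a_new = a_tilde + (alpha * r).sum(axis=1) / m
print("bounds of predictor:", a_tilde.min(), a_tilde.max())
print("bounds after limited correction:", a_new.min(), a_new.max())
```

With these weights the limited correction cannot push a value beyond the maximum or below the minimum of the predictor over its neighbors, which is the property the stabilization relies on.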
Summarizing the constituents of our stabilization approach, we attain the following conclusion: the tool of flux-corrected transport as proposed by Kuzmin et al. (2004) and John and Novo (2012) can be adapted to the convection-diffusion-reaction problem, thereby using a collocation method in terms of certain locally supported radial basis functions for space discretization as well as a time-stepping scheme for time discretization. To realize suitable simulations based on the described procedure, we need some input data for the parameters which are characterized by the nature of the forest we are dealing with.
4 Data Analysis

Next, we deal with the observables that are of significance for our purposes. As we have seen, the set of input data plays an important role. Indeed, they have a great influence on the applied numerical techniques, as already pointed out when discussing the Péclet number.

4.1 Weather and Fuel Data

In order to classify our information, we split the data into two groups: weather data (as shown in Fig. 4) and fuel data (as depicted in Fig. 5). For the weather data, we have to take into account the air temperature and the wind velocity, as well as its direction. The set of fuel data is grouped into the tree species, ground vegetation, and dead wood, from which we can derive the density, specific heat capacity, and thermal conductivity. The relevant observables that are required in our context can be listed as follows: density $\rho$, specific heat capacity $c$, thermal conductivity $k$, released heat $Q$, and convection coefficient $h$.

Fig. 4 Categorization of required weather data (weather data: wind velocity, air temperature, ambient temperature)

Fig. 5 Categorization of fuel data (fuel data: tree species, ground vegetation, dead wood; derived quantities: density, specific heat capacity, thermal conductivity)
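For a simulation code, the observables listed above can be collected in two small containers. The following sketch is only one possible layout: the class and field names, the units, and all numerical values are assumptions made for illustration and are not measured data from this chapter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WeatherData:
    air_temperature: float        # K
    wind_speed: float             # m/s
    wind_direction: float         # degrees, direction convention left to the user

@dataclass(frozen=True)
class FuelData:
    density: float                # rho, kg/m^3
    specific_heat_capacity: float # c, J/(kg K)
    thermal_conductivity: float   # k, W/(m K)
    released_heat: float          # Q
    convection_coefficient: float # h

    def thermal_diffusivity(self) -> float:
        """k / (rho * c), the diffusion coefficient of the heat equation in m^2/s."""
        return self.thermal_conductivity / (self.density * self.specific_heat_capacity)

# Placeholder values, chosen only to make the example run.
weather = WeatherData(air_temperature=300.0, wind_speed=2.0, wind_direction=180.0)
fuel = FuelData(density=500.0, specific_heat_capacity=2000.0,
                thermal_conductivity=0.1, released_heat=1.0,
                convection_coefficient=1.0)
print(fuel.thermal_diffusivity())
```

Keeping weather and fuel data in separate immutable records mirrors the grouping of Figs. 4 and 5 and makes it easy to assign different fuel records to different parts of the domain.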
4.2 Parameter Influence on the Model

In order to gain better insight into the parameters influencing the model, we specify various situations for different forests (note that our considerations are essentially influenced by Séro-Guillaime and Margerit (2002a,b)):
1. In a forest with homogeneous material without wind, a circular spreading of fire can be observed.
2. If wind occurs in a forest with homogeneous material, the fire propagates in wind direction in a drop-shaped form.
3. For a forest with heterogeneous material, extensive calculations have to be done, both with and without wind.
These features demonstrate the importance as well as the difficulties of simulating forest fires. In addition, parameters like density and thermal conductivity depend on the water content of the fuel and differ between tree species. For example, the density of conifers is smaller than that of broadleaves. Besides density, thermal conductivity also plays an important role: its value for broadleaves is larger than that for conifers. For other forest plants, these properties can be considered in a similar way. Needless to say, the accurate simulation of forest fire spreading requires all the relevant weather data and fuel data.
4.3 Parameter Studies
Next, we study some relations between the different parameters and their effects on each other as well as on the temperature T and the mass fraction of the fuel Y . The behavior of the mass fraction of the fuel depending on the time t shows us that the mass fraction decreases very fast at the beginning and slower in a later stage (see Fig. 6). The reaction rate, which plays an important role in forest fire spreading, is illustrated in Fig. 7. We recognize that the reaction becomes faster with increasing temperature.
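Both observations can be reproduced qualitatively with an Arrhenius-type rate as it appears in the reaction term (67). The sketch below is a simplified illustration: it assumes that the fuel obeys $\frac{dY}{dt} = -Y A \exp(-E_A/(RT))$ at a fixed temperature $T$, and the values of the pre-exponential factor `A` and the activation energy `E_A` are hypothetical, not the input data of Asensio and Ferragut (2002).

```python
import math

R_GAS = 8.314462618   # universal gas constant in J/(mol K)

def reaction_rate(T, A, E_A):
    """Arrhenius rate A * exp(-E_A / (R * T)) for a temperature T in K."""
    return A * math.exp(-E_A / (R_GAS * T))

def fuel_mass_fraction(t, T, A, E_A, Y0=1.0):
    """Solution of dY/dt = -Y * reaction_rate(T) at fixed temperature T."""
    return Y0 * math.exp(-reaction_rate(T, A, E_A) * t)

A, E_A = 1.0, 2.0e4            # hypothetical parameters (1/s and J/mol)
for T in (300.0, 600.0, 1200.0):
    print(T, reaction_rate(T, A, E_A))

T_fixed = 800.0
for t in (0.0, 10.0, 20.0, 40.0):
    print(t, fuel_mass_fraction(t, T_fixed, A, E_A))
```

Under this assumption the rate grows monotonically with the temperature, and the mass fraction decays exponentially, so the loss per unit time is largest at the beginning and becomes smaller later on.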
Fig. 6 Exemplary behavior of mass fraction of fuel, shown as mass fraction of fuel (0–1) over time (0–100) (The collection of input data is taken from Asensio and Ferragut 2002)

Fig. 7 Exemplary reaction rate, shown as reaction (0–0.7) over temperature of fuel (300–1200) (The collection of input data is taken from Asensio and Ferragut 2002)

5 Numerical Example

We conclude our work with some first results of a forest fire simulation by use of the collocational method. To this end, we evaluate the behavior of the temperature of the fuel and the mass fraction after some time steps. In doing so, we restrict ourselves to the following situation:
• We assume that there is one ring-shaped fire source placed in the north of the area.
• Furthermore, we consider two different fuel types (one on the right-hand side and the other on the left-hand side).
• Additionally, we have the wind directed to the south.
In accordance with these assumptions, we observe an elliptical spreading. Moreover, the spreading of the fire is faster for the fuel of type 1. Other computations based on B-spline radial basis functions can be found in Eberle (2014).
Fig. 8 Temperature (given in K) and mass fraction of fuel (color scales: 300–750 K and 0–1)

Fig. 9 Temperature (given in K) and mass fraction of fuel (color scales: 300–750 K and 0–1)

Fig. 10 Temperature (given in K) and mass fraction of fuel (color scales: 300–750 K and 0–1)
6 Perspective and Future Application Fields

We conclude our work with an explanation of challenges and perspectives, in particular with respect to the German state Rhineland-Palatinate.

6.1 Climate Change and Fire Risk

The anthropogenic increase in greenhouse gas concentrations leads to further global warming (Latif 2012). Using “best estimate” climate sensitivity analysis, the global average temperature will increase by 2.0–6.1 °C above the preindustrial equilibrium of greenhouse gas concentrations until the end of this century (IPCC 2007). A further increase in mean annual temperature, lower precipitation in the growing season, and an increase of heat waves and drought periods will exacerbate the risk of forest fires worldwide (see, e.g., Archer and Rahmstorf 2010; Carvalho et al. 2011; Flannigan et al. 2000; IPCC 2007). Consequently, the length and the intensity of the fire season, the fire-prone area, and the probability of great fires are expected to increase (see, e.g., European Environment Agency 2012; Flannigan et al. 2000; Holsten et al. 2013), however with a significant regional variation (Camia et al. 2008). Compared to the reference period 1961–1990, the modeled fire risk until the end of the twenty-first century shows a clear increase for southeastern and southwestern Europe and a relatively strong increase for western Central Europe. Even in temperate mountain ranges, forest fires are likely to become a relevant disturbance factor (Schumacher and Bugmann 2006). As shown in Fig. 11, for Rhineland-Palatinate, where our test region is located, an increased fire risk of up to 100 % is projected under the assumed emission scenario conditions. However, the forest fire risk here will still be the second lowest of the risk classes occurring in Europe at the end of the twenty-first century. The risk for the occurrence of forest fires depends on fuel material, fuel characteristics, and environmental conditions.

Fig. 11 Fire risk is expressed by the Seasonal Severity Rating (SSR) by Joint Research Centre (JRC) (European Environment Agency 2012). Based on climate projections by the Regional Climate Model (RCM) RACMO2, driven by the Global Climate Model (GCM) ECHAM5 for the SRES emission scenario A1B, SSR is assessed. Left: projected change in SSR by 2071–2100 compared to 1961–1990 reference period; Right: projected annual average SSR in 2071–2100 (Source: http://www.eea.europa.eu/data-and-maps/figures/projected-meteorologicalforest-fire-danger)
6.2 Climate Change Projections for Rhineland-Palatinate

Before we have a closer look at the future perspectives of forest fire modeling, we want to deal with the current and possible future climatic conditions in Rhineland-Palatinate. Rhineland-Palatinate is among the regions in Germany that are most vulnerable to climate change. Within the last 130 years (from 1881 to 2010), the mean daily temperature has increased by about 1.3 °C (Klimawandelinformationssystem Rhineland-Palatinate, see www.kwis-rlp.de). The future climate development is projected by means of regional climate models that consider large-scale climate trends as well as locally observed meteorological values (Gerstengarbe et al. 1999). Under the emission scenario A1B, the regional climate model (RCM) WETTREG2010 projects an increase of the mean annual temperature of about 3.7 °C until the end of 2100 for Rhineland-Palatinate (see Fig. 12). Simultaneously, precipitation decreases by an average of 15–20 % during the growing season. Rare situations like the hot summer of 2003 will probably occur more frequently in the future. At the end of the twenty-first century, heat waves with temperatures above 30 °C and a minimum duration of 2–5 days will occur two to three times as often as in the reference period (see www.kwis-rlp).
6.3 Future Application Fields

Since the occurrence of forest fires is strongly influenced by climate, the question arises how the risk of forest fires will develop under changing climate conditions (Gerstengarbe et al. 1999). Simulations for Germany with the new emission scenario RCP 8.5 (Representative Concentration Pathways, see Moss et al. 2010) point to a 6–10 % higher risk for forest fires in Rhineland-Palatinate until 2050 compared to 1991–2010 (Lasch-Born et al. 2013). To explore the relationship between climate and forest fire, high-resolution climate data are required to visualize the spatial variability of relevant factors for fires, such as precipitation, temperature, and drought (Meyn et al. 2012). As outlined in more detail in chapter Geomathematics: Its Role, Its Aim, and Its Potential, models make it possible to evaluate the involved factors and their importance for forest fires. By using various forest fire models, similar to Weibel et al. (2010), the influence of weather, forest composition, human activity, and changes in legislation on the probability of forest fire origin can be examined. However, results show that the transferability of the models is limited, and the application for an assessment of the future risk of forest fire is even more difficult under changing climatic conditions. Therefore, fire spread models need to be developed and adapted to the circumstances and causes of fire (see, e.g., Weibel et al. 2010). In the assessment of future forest fire regimes, the behavior of the fire spreading over the landscape should be considered, where dynamic landscape models could play an important role (Weibel et al. 2010). In this respect, the present physically oriented tool for the simulation of fire spread behavior offers the advantage that the dynamic mechanisms of ignition and spread of forest fires are modeled.
In contrast to empirical models, the approach pursued here does not depend on the availability of empirical data and, hence, can be used, with some restrictions, even if one is confronted with a lack of data. Based on the given numerical example it has to be stated that, at the current state, we are able to simulate forest fire spreading for different homogeneous fuel types and diverse wind velocities and directions. For the simulation of fire behavior and fire risk in forest landscapes with their specific spatiotemporal pattern of different tree species, forest types, age classes, and stages, the physically oriented tool presented here should be evaluated under different weather, fuel, and site conditions (cf. Anderson 1982). Accordingly, some more specific future research fields can be listed as follows:
• Approximation of the initial conditions for fire spreading by the influence of heterogeneous material.
• Influence of the wood water content, in combination with the forest stand density and the mixture of different tree species.
• Influence of soil texture and soil water content on forest fires, which has hardly been studied so far (Wirth 2005).

The identified application fields refer to the uncertainty in modeling. As a prerequisite for the fire behavior simulation at selected points in the landscape, the predictive capacity and the uncertainty of each input parameter are required (Weibel et al. 2010). Concerning this, many of the input parameters for the fire behavior simulation should be modeled or ideally gathered in extensive field surveys. A challenge is the real-time modeling of the regionally specific variation of the fire risk, the ignition and spread of forest fires, and, finally, the effect of forest fires on the forest landscapes. Future fire regimes are largely determined by future climate conditions and forest landscape pattern. Therefore, it is recommended that the potential impacts of forest fires are simulated in combination with different climate projections and dynamic models of forest landscape development (cf. Moss et al. 2010; Van der Linden and Mitchell 2009; Weibel et al. 2010). Following this, the present physical modeling approach should be coupled with fire risk (e.g., Holsten et al. 2013) and fire effect models (see chapter Geomathematics: Its Role, Its Aim, and Its Potential), e.g., by using a Geographic Information System (GIS) (Schöning 2009). For decision makers and different stakeholders, a valuable tool would be an interactive Decision Support System (DSS) in which the regions most susceptible to forest fires are displayed and which supports the allocation of fire prevention and fire-fighting resources (see, e.g., Sebastián-Lopéz et al. 2008).

Fig. 12 Climate Projections for Rhineland-Palatinate (Germany) by the RCM WETTREG2010 (driven by GCM ECHAM5) for the SRES scenario A1B. Top: projected change in mean daily maximum temperature in meteorological summer for the period 2071–2100 compared to the reference period 1971–2000 (color scale: –0.5 to 5.5 °C); bottom: projected change in precipitation amount in meteorological summer for the period 2071–2100 compared to the reference period 1971–2000 (color scale: –45 to 75 %) (Source: CEC Potsdam GmbH i.A. des Umweltbundesamtes 2010)
6.4 Forest Management Options

For sustainable forest management, the question arises how forest ecosystems can be adapted to an increased risk of forest fires caused by climate change (cf. Gerstengarbe et al. 1999). In this respect the forests in Europe must not only be adapted to changes in mean climate but also to increased occurrence and variability of extreme weather events (Lindner et al. 2010). The adaptive capacity can be defined by the inherent adaptability of trees, forest ecosystems, and socioeconomic factors to climate change. To date, insights into the adaptive capacity and regional vulnerability of European forests to climate change are limited and require more research (Lindner et al. 2010). Strategies aiming at risk diversification on the landscape level, e.g., by changing tree species composition and combining different forest types (e.g., encompassing a higher proportion of broad-leaved forests or broad-leaved needle forests), could reduce the susceptibility of forests to fires (Gerstengarbe et al. 1999). Depending on the forest vulnerability and climate impact, an anticipatory and preventive risk management should take into account that, in specific forest fire-prone areas, the targeted multifunctionality of forests (wood production, nature conservation, recreation, and different forms of protection at the same time in one region) could be replaced in favor of monofunctional forests (Loustau et al. 2007), which have a lower vulnerability to forest fires. In general, continued research is required to explore the climate-fire association, the interaction between fire and other stress factors, and their impacts on vegetation (Flannigan et al. 2000).

Acknowledgements Sarah Eberle gratefully acknowledges the support of the Rhineland-Palatinate Center of Excellence for Climate Change Impacts (at the Forest Research Institute for Forest Ecology and Forestry, Trippstadt, Germany) within the scope of the project “Forest Fire Determination: Theory and Numerical Aspects” (P.I. Willi Freeden).
References

Albini FA, Baughman RG (1979) Estimating windspeeds for predicting wildland fire behaviour. Research paper INT–221, USDA Forest Service, Intermountain Forest and Range Experiment Station
Albini FA (1986) Wildland fire spread by radiation – a model including fuel cooling by natural convection. Combust Sci Technol 45:101–113
Allgöwer B, Schöning R (n.y.) Forest fire modelling with GIS in the Swiss National Park. http://www.ncgia.ucsb.edu/conf/SANTA_FE_CD-ROM/sf_papers/allgower_britta/allgower.html, 29 Jan 2013
Anderson HE (1969) Heat transfer and fire spread. Research paper INT–69, USDA Forest Service, Intermountain Forest and Range Experiment Station
Anderson HE (1982) USDA forest services, intermountain forest and range experiment station. General technical report INT-122
Anderson HE (1983) Predicting wind-driven wildland fire size and shape. Research paper INT-305
Ansorge R, Sonar T (2009) Mathematical models of fluid dynamics. Wiley-VCH, Berlin
Archer D, Rahmstorf S (2010) The climate crisis. An introductory guide to climate change. Cambridge University Press, Cambridge
Asensio I, Ferragut L (2002) On a wildland fire model with radiation. Int J Numer Methods Eng 54:137–157
Asensio I, Ferragut L, Simon J (2001) Modelling of convective phenomena in forest fire. Rev R Acad Cien 95(1):13–27
Asensio I, Ferragut L, Simon J (2005) A convection model for fire spread simulation. Appl Math Lett 18(6):673–677. Elsevier
Baehr HD, Stephan K (2010) Wärme- und Stoffübertragung. Springer
Burman E, Ern A (2005) Stabilized Galerkin approximation of convection-diffusion-reaction equations: discrete maximum principle and convergence. Math Comput 74:1637–1652
Burrows ND (1999) Fire behaviour in jarrah forest fuels. Part 1. Laboratory experiments. CALMScience 3:31–56
Burrows ND, Ward B, Robinson A (1991) Fire behaviour in spinifex fuels on the Gibson desert nature reserve, Western Australia. J Arid Environ 20:189–204
Camia A, Amatulli G, San-Miguel-Ayanz J (2008) Past and future trends of forest fire danger in Europe (JRC 46533, EUR 23427 EN). European Commission, Joint Research Centre
Carvalho AC, Carvalho A, Martins M, Marques C, Rocha A, Borrego C, Viegas DX, Miranda AI (2011) Fire weather risk assessment under climate change using a dynamical downscaling approach. Environ Modell Softw 26:1123–1133
Catchpole WR, Catchpole EA (1994) Short course on wildland fire modelling: Kursunterlagen. Coimbra
Catchpole WR, Catchpole EA, Rothermel RC, Morris GA, Butler BW, Latham DJ (1998) Rate of spread of free-burning fires in woody fuels in a wind tunnel. Combust Sci Technol 131:1–37
Cekirge HM (1978) Propagation of fire fronts in forest. Comput Math Appl 1978.4:325–32
Chandler C, Thomas P, Trabaud L, Williams D (1983) Fire in forestry. Wiley, New York
Cheney NP, Gould JS, Catchpole WR (1998) Prediction of fire spread in grasslands. Int J Wildland Fire 8(1):1–13
Croba D, Lalas D, Papadopoulos C, Tryfonopoulos D (1994) Numerical simulation of forest fire propagation in complex terrain. In: Viegas DX (ed) Proceedings of the second international conference on forest fire research, Coimbra, vol. 1275. University of Coimbra, Portugal, pp 491–500
Cumming SG (2001) Forest type and wildfire in the Alberta boreal mixedwood: What do fires burn? Ecol Appl 11:97–110
Cui J, Freeden W, Witte B (1992) Gleichmäßige Approximation mittels sphärischer Finite Elemente und ihre Anwendung in der Geodäsie. Zeitschrift für Vermessungswesen (ZfV) 117:266–278
De Mestre NJ, Catchpole EA, Anderson DH, Rothermel RC (1989) Uniform propagation of a planar fire front without wind. Combust Sci Technol 65(4–6):231–244
Diez JI, Vrabie I (1994) Existence for reaction diffusion systems. A compactness method approach. J Math Anal Appl 188:521–540
Dupuy JL (1997) Mieux comprendre et predire la propagation des feux de forets: experimentation, test et propagation de modèles. PhD thesis, Université Claude Bernard, Lyon I, Centre National de la Recherche Scientifique, Villeurbanne
Eberle S (2013) Modeling and simulation of forest fire spreading. Mathematics of planet earth. Proceedings of the 15th annual conference of the international association for mathematical geosciences. Madrid, pp 811–814
Eberle S (2015) Forest fire determination: theory and numerical aspects. PhD thesis. University of Kaiserslautern, Geomathematics Group
Emmons H (1964) Fire in the forest. Fire Res Abs Rev 5:163–178
European Environment Agency (2012) Climate change, impacts and vulnerability in Europe 2012. EEA report 12, http://www.eea.europa.eu/publications/climate-impacts-and-vulnerability2012, 17 Jan 2013. doi:10.2800/66071
Eskin GI (1981) Boundary value problems for elliptic pseudodifferential equations. American Mathematical Society Translation of Mathematical Monographs (from Russian original), vol 52. American Mathematical Society, Providence, R.I.
Fengler M, Freeden W (2005) A nonlinear Galerkin scheme involving vector and tensor spherical harmonics for solving the incompressible Navier-Stokes equation on the sphere. SIAM J Sci Comput 27:967–994
Fernandes PM (1998) Fire spread modelling in Portuguese shrubland. In: Viegas DX (ed) Proceedings of the 3rd international conference on forest fire research, University of Coimbra, Luso, pp 61–628
Ferragut L, Asensio I (2004) Simulacion de incendios forestales. Bol Soc Esp Mat Apl 27:7–28
Ferragut L, Asensio I, Monedero S (2004) Modelling slope, wind and moisture content effects on fire spread. European congress on computational methods in applied sciences and engineering. Finland
Ferragut L, Asensio I, Monedero S (2005) A numerical method for solving convection-reaction-diffusion multivalued equations in fire spread modelling. Adv Eng Softw 38(6):366–371. Elsevier
Finney MA, McHugh CW, Grenfell IC, Riley KL, Short C (2011) A simulation of probabilistic wildfire risk components for the continental United States. Stoch Environ Res Risk Assess 25:973–1000
Flannigan MD, Stocks BJ, Wotton BM (2000) Climate change and forest fires. Sci Total Environ 262:221–229
Fons WL (1946) Analysis of fire spread in light forest fuels. J Agric Res 72:93–121
Forestry Canada Fire Danger Group (1992) Development and structure of the Canadian forest fire behaviour prediction system. Canadian Department of Forestry. Inf Rep ST-X-3
Frandsen WH (1971) Fire spread through porous fuels from the conservation of energy. Combust Flame 16:9–16
Fujii N, Hasegawa J, Phallop L, Sakawa Y (1980) A nonstationary model of firespreading. Appl Math Model 41:76–180
Gerstengarbe FW, Werner PC, Lindner M, Bruschek G (1999) Estimation of future forest fire development in the state of Brandenburg. Int Forest Fire News 21:91–93
Griffin GF, Allan GE (1984) Fire behaviour. In: Saxon EC (ed) Anticipating the inevitable: a patch burn strategy for fire management at Uluru (Ayers Rock-Mt. Olga) National Park. CSIRO, Melbourne, pp 55–8
Grishin AM (1997) A mathematical model of forest fires and new methods of fighting them. Publishing House of the Tomsk State University, Tomsk
Grishin AM, Gruzin AD, Zverev VG (1983) Mathematical modelling of the spreading of high-level forest fires. Sov Phys Dokl 28:328–330
Harten A (1983) High resolution schemes for hyperbolic conservation laws. J Comput Phys 49:357–393
Holsten A, Dominic AR, Costa L, Kropp JP (2013) Evaluation of meteorological forest fire indices for German federal states. For Ecol Manag 287:123–131. doi:10.1016/j.foreco.2012.08.035
Hottel HC, Williams GC, Kwentus GK (1971) Fuel pre-heating in free-burning fires. 13th international symposium on combustion. Utah, pp 963–970
Huang CC, Xie Y (1984) Flame propagation along matchstick arrays on inclined base boards. Combust Sci Technol 42:1–12
IPCC (2007) Climate change 2007: Synthesis report, contribution of working groups I, II and III to the fourth assessment report of the intergovernmental panel on climate change [Core Writing Team, Pachauri RK, Reisinger A (eds)]. IPCC, Geneva
John V, Novo J (2012) On (essentially) non-oscillatory discretization of evolutionary convection-diffusion equations. J Comput Phys 231(4):1570–1586. Elsevier
Joos F (2006) Thermische Verbrennung. Springer, Heidelberg
Klimawandelinformationssystem Rhineland-Palatinate. Klimawandelinformationssystem kwis-rlp für Rheinland-Pfalz. http://www.kwis-rlp.de/klima-witterung.html, 04 Apr 2013
Konev EV, Sukhinin AI (1977) The analysis of flame spread through forest fuel. Combust Flame 28:217–223
Kuzmin D (2012) A guide to numerical methods for transport equations. Friedrich-Alexander-Universität Erlangen-Nürnberg
Kuzmin D, Möller M, Turek S (2004) High-resolution FEM-FCT schemes for multidimensional conservation laws. Comput Methods Appl Mech Eng 193(45–47):4915–4946
Kuzmin D, Löhner H, Turek S (2012) Flux-corrected transport. Scientific Computation. Springer, Berlin/New York
Larini M, Giraud F, Porterie B, Loraud JC (1998) A multiphase formulation for fire propagation in heterogeneous combustible media. Int J Heat Mass Transf 41(6/7):88–97
Lasch-Born P, Gutsch M, Reyer C, Suckow F (2013) Auswirkungen auf den Wald in Deutschland. In: Gerstengarbe F-W, Welzer H (eds) Zwei Grad mehr in Deutschland. Wie der Klimawandel unseren Alltag verändern wird. Das Szenario 2040. Fischer, pp 99–130
Latif M (2012) Globale Erwärmung. Ulmer, Stuttgart, 119 pp
Lindner M, Maroschek M, Netherer S, Kremer A, Barbati A, Garcia-Gonzalo J, Seidl R, Delzon S, Corona P, Kolström M, Lexer MJ, Marchetti M (2010) Climate change impacts, adaptive capacity, and vulnerability of European forest ecosystems. For Ecol Manag 259:698–709
Linn RR (1997) A transport model for prediction of wildfire behaviour. PhD thesis. New Mexico State University, Department of Mechanical Engineering, Las Cruces, New Mexico
Loustau D, Ogée J, Dufrêne E, Déqué M, Dupouey JL, Badeau V, Viovy N, Ciais P, Desprez-Loustau ML, Roques A, Chuine I, Mouillot F (2007) Impacts of climate change on temperate forests and interaction with management. In: Freer-Smith PH, Broadmeadow MSJ, Lynch JM (eds) Forestry & climate change. Cromwell Press Group, Trowbridge, pp 143–150
Mallet V, Keyes DE, Fendell FE (2009) Modeling wildland fire propagation with level set methods. Comput Math Appl 57(7):1089–1101. Elsevier
Mandel J, Beezley JD, Bennethum LS, Chakraborty S, Coen JL, Douglas CC, Hatcher J, Kim M, Vodacek A (2007) A dynamic data driven wildland fire model. In: Computational science – ICCS 2007. Lecture notes in computer sciences, vol 4487. Springer, Berlin/Heidelberg, pp 1042–1049
Marsden-Smedley JB, Catchpole WR (1995) Fire behaviour modelling in Tasmanian buttongrass moorlands. II. Fire behaviour. Int J Wildland Fire 5:215–28
Margerit J, Sero-Guillaume O (1999) Modelling forest fires. Inflame Internal report
McArthur AG (1967) Fire behaviour in eucalyptus forests. Forest Research Institute. Forest and Timber Bureau of Australia. Leaflet No. 107
McCaw WL (1998) Predicting fire spread in Western Australia mallee-heath shrubland. PhD thesis. University College UNSW, Canberra
Meyn A, Schmidtlein S, Taylor SW, Girardin MP, Thonicke K, Cramer W (2012) Precipitation-driven decrease in wildfires in British Columbia. Reg Environ Change. doi:10.1007/s10113012-0319-0
Möller M (2008) Adaptive high-resolution finite element schemes. PhD thesis, Technische Universität Dortmund
Moss RH, Edmonds JA, Hibbard KA, Manning MR, Rose SK, van Vuuren DP, Carter TR, Emori S, Kainuma M, Kram T, Meehl GA, Mitchell JFB, Nakicenovic N, Riahi K, Smith SJ, Stouffer RJ, Thomson AM, Weyant JP, Wilbanks TJ (2010) The next generation of scenarios for climate change research and assessment. Nature 463:747–756. doi:10.1038/nature08823
Pastor M (2003) Mathematical models and calculation system for study of wildland fire behavior. Prog Energy Combust Sci 29:139–153
Pagni J, Peterson G (1973) Flame spread through porous fuels. 14th international symposium on combustion. Pennsylvania, pp 1099–1107
Perminov V (2010) Numerical modeling of forest fire initiation and spread. In: 4th international conference, latest trends on applied mathematics, simulation, modeling. WSEAS Press, pp 242–248
Preisler HK, Weise DR (2006) Forest fire models. In: EL Shaarawi AH, Piegrosch W (eds) Encyclopedia of environmetrics, vol 2. Wiley
Pyne SJ (2004) Tending fire: coping with America’s wildland fires. Island Press, Washington D.C.
Pyne SJ, Andrews PL, Laven RD (1996) Introduction to wildland fire. Wiley, New York
Quintiere J (2006) Fundamentals of fire phenomena. Wiley, England
Rothermel RC (1972) A mathematical model for predicting fire spread on wildland fuels. USDA forest service research paper INT-115, p 40
Santoni PA, Balbi JH (1998) Modelling of two-dimensional flame spread across a sloping fuel bed. Fire Saf J 31(3):201–205
Schöning S (2009) Modellierung des potentiellen Waldbrandverhaltens mit einem geographischen Informationssystem. Geo-Processing Reihe
Forest Fire Spreading
1385
Schaffeld H (1988) Finite-Elemente-Methoden und ihre Anwendung zur Erstellung von Digitalen Geländemodellen. PhD-thesis. University of Kaiserslautern. Geomathematics Group. Schumacher S, Bugmann H (2006) The relative importance of climatic effects, wildfires and management for future forest landscape dynamics in the Swiss Alps. Glob Change Biol 12: 1435–1450 Sebastián-Lopéz A, Salvador-Civil R, Gonzalo-Jimenéz J,San-Miguel-Ayanz J (2008) Integration of socio-economic and environmental variables for modelling long-term fire danger in Southern Europe. Eur J Forest Res 127:149–163 Séro-Guillaime O, Margerit J (2002a) Modelling forest fires. Part I: a complete set of equations derived by extended irreversible thermodynamics. Int J Heat Mass Transf 45: 1705–1722 Séro-Guillaime O, Margerit J (2002b) Modelling forest fires. Part II: reduction to two-dimensional models and simulation of propagation. Int J Heat Mass Transf 45:1723–1737 Shu C-W (1997) Essentially non-oscillatory and weighted essentially non-oscillatory schemes for hyperbolic conservation laws. NASA/CR-97-206253. IACSE report no.97–65 Sneeuwjagt RJ, Peet G (1985) Forest fire behaviour tables for western Australia. Australia: Department of Conservation and Land Management Steward FR (1974) Fire Spread through a fuel bed. In: Blackshear PL (ed) Heat transfer in fires: thermophysics, social aspects, economic impact. Scripta, Washington, DC, pp 315–318 Telisin HP (1974) Flame radiation as a mechanism of fire spread in forests. Heat transfer in flames. Wiley, New York, pp 441–449 Thomas PH (1967) Some aspects of the growth and spread of fires in the open. Forestry 40: 129–164 Torero JL, Simeoni A (2010) Heat and mass transfer in fires: scaling laws, ignition of solid fuels and application to forest fire. Open Thermodyn J 4:145–155 Van der Linden P, Mitchell JFB (2009) ENSEMBLES: climate change and its impacts: summary of research and results from the ENSEMBLES project. Met Office Hadley Centre, Exeter. 
http:// ensembles-eu.metoffice.com/docs/Ensembles_final_report_Nov09.pdf, 28 Mar 2013 Van Wagner CE (1967) Calculations on forest fire spread by flame radiation. Report No.1185, Canadian Department of Forestry Vega JA, Cuinas P, Fontrubel T, Perez-Gorostiaga P, Fernandez C (1998) Predicting fire behaviour in Galicia (NW Spain) shrubland fuel complexes. In: Viegas DX (ed) Proceedings of the third international conference on forest fire research, University of Coimbra, Luso, pp 713–28 Viegas DX, Ribeiro PR, Maricato L (1998) An empirical model for the spread of a fireline inclined in relation to the slope gradient or to wind direction. In: Viegas DX (ed) Proceedings of the third international conference on forest fire research, vol 2718, University of Coimbra, Coimbra, pp 325–42 Weber RO (1989) Analytical models for fire spread due to radiation. Combust Flame 78:398–408 Weber RO (1991) Modelling fire spread through fuel beds. Prog Enery Combust Sci 17:67–82 Weber RO, Sidhu HS (2006) A dynamical systems model for fireline growth with suppression. ANZIAM J 47: C462–C474 Weibel P, Elkin C, Reineking B, Conedera M, and Bugmann H (2010) Waldbrandmodellierung – Möglichkeiten und Grenzen. Schweiz Z Forstwes 161 (2010) 11:433–441 Wendland H (2005) Scattered data approximation. Cambridge monographs on applied and computational mathematics. Cambridge University Press, Cambridge/New York Wirth C (2005) Fire regime and tree diversity in boreal forests: implications for the carbon cycle. In: Scherer-Lorenzen M, Körner C, Schulze E-D (eds) Forest diversity and function: temperate and boreal systems. Springer, Berlin/Heidelberg, pp 309–344 Zalesak ST (1979) Fully multidimensional flux-corrected transport algorithms for fluids. J Comput Phys 31(3):335–362. Elsevier
Phosphorus Cycles in Lakes and Rivers: Modeling, Analysis, and Simulation
Andreas Meister and Joachim Benz
Contents
1 Introduction
2 Mathematical Modeling
3 Numerical Method
3.1 Finite Volume Method
3.2 Positivity Preserving and Conservative Schemes
3.3 Practical Applications
4 Conclusion
Appendix: Ecological Model
References
Abstract
From spring through summer, many lakes are covered with thick layers of algae, which pose a serious problem for the fish stock and other important organisms and, ultimately, for the biological diversity of species as a whole. Consequently, the investigation of the cause-and-effect chain is an important task for the protection of the natural environment. Such situations are often driven by an oversupply of nutrients. Since phosphorus is the limiting nutrient for most algae growth processes, detailed knowledge of the phosphorus cycle is essential. In this context the chapter surveys our recent progress in modeling and numerical simulation of plankton spring bloom situations caused by eutrophication via phosphorus accumulation. Due to the underlying processes we employ the shallow water equations as the fluid dynamic part, coupled with additional equations describing biogeochemical processes of interest within both the water layer and the sediment. Depending on the model under consideration, one is faced with significant requirements such as positivity and conservativity in the context of stiff source terms. The numerical method used to simulate the dynamic part and the evolution of the phosphorus and different biomass concentrations is based on a second-order finite volume scheme extended by a specific formulation of the modified Patankar approach, so that the scheme is unconditionally positivity preserving as well as conservative in the presence of stiff transition terms. Besides a mathematical analysis, several test cases are shown which confirm both the theoretical results and the applicability of the complete numerical scheme. In particular, the flow field and phosphorus dynamics for the West Lake in Hangzhou, China, are computed using the mass and positivity preserving finite volume scheme stated above.

A. Meister, Department of Mathematics, Work-Group of Analysis and Applied Mathematics, University of Kassel, Kassel, Germany
J. Benz, Faculty of Organic Agricultural Sciences, Work-Group Data-Processing and Computer Facilities, University of Kassel, Witzenhausen, Germany
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_23
1 Introduction
The main objective of this chapter is to present a stable and accurate numerical method for a wide range of applications in the field of complex ecosystem models. We focus on the process of eutrophication, which is a serious problem because it gives rise to excessive algae blooms in eutrophic freshwater ecosystems. It is therefore crucial in ecological science to understand and predict the interrelations of the underlying dynamics. Even taking into account the systematic difficulties of modeling and simulation (Poethke 1994), elaborate modeling provides a very useful tool for deeper understanding and long-range prediction (Sagehashi et al. 2000). In the 1970s, computer-based modeling and simulation in ecological science, with a special focus on lake eutrophication, started to achieve good results. Deterministic models were proposed in Park et al. (1974), Jørgensen (1975), and Straškraba and Gnauk (1985). Since then, both deeper knowledge of the underlying principles and growing computational power have made the models under examination more sophisticated. Today's models place severe demands on the numerical schemes used. Models including matter cycles, in which the material, for example atoms, can only change its configuration and is thus neither created nor destroyed, require the algorithm to maintain the mass and the number of atoms in the system, respectively. Another computationally nontrivial but obvious demand is to ensure positivity for all material constituents under examination. Consequently, in recent years one can observe increasing interest in the design of positivity preserving schemes. A survey of positive advection transport methods can be found in Smolarkiewicz (2006). With respect to shallow water flows one is often faced with the transition
between dry and wet areas, so that the construction of positive schemes is widespread for this kind of application, see Berzins (2001), Audusse and Bristeau (2005), Chertock and Kurganov (2008), Ricchiuto and Bollermann (2009) and the references therein. Concerning stiff reaction terms, a modification of common Runge-Kutta schemes which ensures positivity was originally suggested by Patankar in the context of turbulent mixing (Patankar 1980). Unfortunately, the schemes obtained were not able to retain mass conservation, even though mass conservativity is a feature of the original Runge-Kutta schemes. The scheme was improved in Burchard et al. (2003), retaining positivity while also recovering the conservation of mass. First applications of this modified Patankar approach in combination with classical discretization schemes for the solution of advection-diffusion-reaction equations are presented in Burchard et al. (2005, 2006) with a specific focus on marine ecosystem dynamics in the North Sea and the Baltic Sea. Based on the same idea, a more complex form of conservation was achieved in a series of publications (Broekhuizen et al. 2008; Bruggeman et al. 2007). Although these schemes are applicable to more flexible definitions of conservation, they lose crucial numerical stability. A detailed overview of those schemes can be found in Zardo (2005). Because of its superior numerical stability, the modified Patankar ansatz is used here; the conservation of mass it ensures is also appropriate for the model under consideration.

One major and essential aspect of growth is nutrient supply. To predict the growth of algae reliably, one has to consider the limiting factor (Schwoerbel and Brendelberger 2005). The basis for the implemented ecological model is presented by Hongping and Jianyi (2002). As the model under consideration describes algae growth in the West Lake, China, with a ratio of more than 1:14 for phosphorus to nitrogen, the limiting nutrient is phosphorus (Lampert and Sommer 1999). Therefore it is sufficient to restrict the considered nourishment to phosphorus for modeling purposes. The model describes four different groups of algae species and zooplankton, with the particular feature that a specific phosphorus content is tracked for each of these organisms. Furthermore, phosphorus is represented in solute and organic form in the water body as well as in the sediment.

Besides the biological and chemical processes, the flow field of the lake itself is of major importance. The current leads to an unbalanced distribution of matter inside a lake and thus creates significant differences in phosphorus concentrations. The two-dimensional shallow water equations are an adequate formulation of shallow water flow. By means of sophisticated high-resolution schemes (Toro 2001; Vater 2004), one is able to simulate complicated phenomena such as a dam-break problem, see Stoker (1957) and Vázquez-Cendón (2007). The two-dimensional shallow water equations thus serve as the basic formulation for the flow phenomena of interest. The phosphorus cycle model together with the two-dimensional shallow water equations forms the complete set of convection-diffusion-reaction equations, which is a system of hyperbolic-parabolic partial differential equations.
2 Mathematical Modeling
Modeling ecological processes in lakes, urban rivers, or channels poses two main demands. First, a reliable mathematical model has to be formulated which includes the fundamental properties and effects of eutrophication, including local biochemical phenomena as well as transport processes due to convection and diffusion. Thus, besides the dynamics of biomass and phosphorus concentrations, one has to take account of the flow situation in the water body. Consequently, the equations governing the process of interest are given by a combination of the Saint-Venant system for the flow part and an advection-diffusion-reaction system for the biochemical part. Neglecting the influence of rain and evaporation, the underlying fluid dynamic part represented by the shallow water equations can be written, in the case of a constant bottom topography, in the form of the conservation law

$$ \partial_t \mathbf{u}_s + \sum_{m=1}^{2} \partial_{x_m} \mathbf{f}^c_{m,s}(\mathbf{u}_s) = \mathbf{0}. $$

Thereby, $\mathbf{u}_s = (H, \Phi v_1, \Phi v_2)^T$ and

$$ \mathbf{f}^c_{m,s}(\mathbf{u}_s) = \left( H v_m,\ \Phi v_m v_1 + \tfrac{1}{2}\delta_{1m}\Phi^2,\ \Phi v_m v_2 + \tfrac{1}{2}\delta_{2m}\Phi^2 \right)^T $$

are referred to as the vector of conserved quantities and the flux function, in which $H$ and $\mathbf{v} = (v_1, v_2)^T$ denote the water height and the velocity, respectively. Furthermore, $\Phi = gH$ represents the geopotential with the gravitational acceleration $g$, and the Kronecker delta is denoted by $\delta_{im}$.

The ecological part consists of processes which describe the behavior of biomass and organic phosphorus of four different groups of algae species (BA, PA) and zooplankton (BZ, PZ). To describe the complete phosphorus cycle, solute phosphorus in the water body (PS), organic phosphorus in the detritus (PD), and organic and inorganic phosphorus in the sediment (PEO, PEI) are considered, see Fig. 1. This model is based on the West Lake Model published in Hongping and Jianyi (2002). The dynamics of biomass demands positivity, and the cycle of phosphorus demands positivity as well as conservativity. The interactions between these ecological system elements are nonlinear; for example, growth is formulated as Michaelis-Menten kinetics. In addition, the particulate components (biomass, organic phosphorus, and detritus) in the water body are influenced by advective transport. The solute phosphorus in the water body is affected by advective and diffusive transport. For the elements inside the sediment no horizontal transport is considered. The vertical exchange between phosphorus in the sediment and in the water body is driven by diffusion. Consequently, the governing equations for the phosphorus cycle and the biomass dynamics (Fig. 1) represent a system of advection-diffusion-reaction equations of the form

$$ \partial_t \mathbf{u}_p + \sum_{m=1}^{2} \partial_{x_m} \mathbf{f}^c_{m,p}(\mathbf{u}_s, \mathbf{u}_p) = \sum_{m=1}^{2} \partial_{x_m} \mathbf{f}^\nu_{m,p}(\mathbf{u}_p) + \mathbf{q}_p(\mathbf{u}_s, \mathbf{u}_p) \tag{1} $$
Fig. 1 Phosphorus and biomass dynamics. Schematic of the biomass dynamics (positive, not mass conserving) of BA and BZ with the processes growth, grazing, assimilation, respiration, mortality, and sinking, and of the phosphorus dynamics (positive, mass conserving) of PA, PZ, PD, and PS in the water body and of PEI and PEO in the sediment with the processes uptake, grazing, assimilation, respiration, mortality, sinking, sedimentation, mineralization, and exchange
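where the components read $\mathbf{u}_p = (\mathrm{BA}, \mathrm{BZ}, \mathrm{PA}, \mathrm{PZ}, \mathrm{PD}, \mathrm{PS}, \mathrm{PEI}, \mathrm{PEO})^T$ and the convective and viscous fluxes are

$$ \mathbf{f}^c_{m,p}(\mathbf{u}_s, \mathbf{u}_p) = v_m\, (\mathrm{BA}, \mathrm{BZ}, \mathrm{PA}, \mathrm{PZ}, \mathrm{PD}, \mathrm{PS}, 0, 0)^T $$

and

$$ \mathbf{f}^\nu_{m,p}(\mathbf{u}_p) = \left( 0, \ldots, 0,\ \lambda_{\mathrm{PS}}\, \partial_{x_m} \mathrm{PS},\ \lambda_{\mathrm{PEI}}\, \partial_{x_m} \mathrm{PEI},\ 0 \right)^T, $$

respectively, with diffusion coefficients $\lambda_{\mathrm{PS}}$ and $\lambda_{\mathrm{PEI}}$. Note that BA and PA each denote a vector of four constituents. With respect to the spatial domain $D \subset \mathbb{R}^2$, which is assumed to be polygonally bounded, i.e., $\partial D = \bigcup_{k=1}^{n} \partial D_k$, the complete system consisting of the fluid and the biochemical part can be written as

$$ \partial_t \mathbf{U} + \sum_{m=1}^{2} \partial_{x_m} \mathbf{F}^c_m(\mathbf{U}) = \sum_{m=1}^{2} \partial_{x_m} \mathbf{F}^\nu_m(\mathbf{U}) + \mathbf{Q}(\mathbf{U}) \quad \text{in } D \times \mathbb{R}^+, \tag{2} $$

where $\mathbf{U} = (\mathbf{u}_s, \mathbf{u}_p)^T$, $\mathbf{F}^c_m = (\mathbf{f}^c_{m,s}, \mathbf{f}^c_{m,p})^T$, $\mathbf{F}^\nu_m = (\mathbf{0}, \mathbf{f}^\nu_{m,p})^T$, and $\mathbf{Q} = (\mathbf{0}, \mathbf{q}_p)^T$. A detailed description of the biochemical part will be presented in Sect. 3.3.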
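As an illustration of the fluid dynamic part, the convective flux of the shallow water equations in the geopotential form used in this section can be sketched in a few lines of Python. The function names and the numerical value of g are assumptions made for this sketch only and are not taken from the chapter.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def shallow_water_state(H, v1, v2, g=G):
    """Return u_s = (H, Phi*v1, Phi*v2) with the geopotential Phi = g*H."""
    phi = g * H
    return (H, phi * v1, phi * v2)


def convective_flux(u_s, m, g=G):
    """Return f^c_{m,s}(u_s) for the direction m in {1, 2}; requires H > 0."""
    H, phi_v1, phi_v2 = u_s
    phi = g * H
    v = (phi_v1 / phi, phi_v2 / phi)   # recover the velocity components
    vm = v[m - 1]
    d1 = 1.0 if m == 1 else 0.0        # Kronecker delta delta_{1m}
    d2 = 1.0 if m == 2 else 0.0        # Kronecker delta delta_{2m}
    return (
        H * vm,
        phi * vm * v[0] + 0.5 * d1 * phi ** 2,
        phi * vm * v[1] + 0.5 * d2 * phi ** 2,
    )
```

For water at rest the flux in direction m reduces to the pressure-like term one half of the squared geopotential in the m-th momentum component, which is a quick consistency check of the formula.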
3 Numerical Method
The discretization of the mathematical model is based on a conventional finite volume scheme. To satisfy the specific requirements of positivity with respect to the involved biogeochemical system, it is necessary to employ an appropriate method for the corresponding local transition terms. Thus, for the sake of simplicity, we first present the finite volume approach and investigate its applicability in the context of the fluid dynamic part of the governing equations. Thereafter, we discuss positivity preserving, conservative schemes for stiff and non-stiff ordinary differential equations. As for the finite volume approach, we discuss the properties of this numerical scheme not only theoretically but also by means of different test cases. Finally, we combine both parts to simulate the flow field and the phosphorus cycle of the West Lake as a comprehensive practical application.
3.1 Finite Volume Method
Finite volume schemes are formulated on arbitrary control volumes, and well-known successful time stepping schemes and spatial discretization techniques can easily be employed in the general framework. This approximation technique combines robustness and accuracy with the ability to treat complicated geometries. We start with the description of a state-of-the-art finite volume scheme. The development of the method is presented in various articles (Meister 1998, 2003; Meister and Oevermann 1998; Meister and Sonar 1998; Meister and Vömel 2001), which also prove the validity of the algorithm in the area of inviscid and viscous flow fields. Smooth solutions of the system of the shallow water equations exist in general only for a short time, and well-known phenomena like shock waves and contact discontinuities develop naturally. Hence we introduce the concept of weak solutions, which is the basis from which every finite volume scheme can be derived. A bounded set $\sigma \subset \mathbb{R}^2$ is said to be a control volume if Gauss' integral theorem is applicable to functions defined on $\sigma$. The mapping $\mathbf{U}$ is called a weak solution of the system (2) if

$$ \frac{d}{dt} \int_{\sigma} \mathbf{U}\, d\mathbf{x} = -\sum_{m=1}^{2} \int_{\partial\sigma} \mathbf{F}^c_m(\mathbf{U})\, n_m\, ds + \sum_{m=1}^{2} \int_{\partial\sigma} \mathbf{F}^\nu_m(\mathbf{U})\, n_m\, ds + \int_{\sigma} \mathbf{Q}(\mathbf{U})\, d\mathbf{x} \tag{3} $$

is valid on every control volume $\sigma \subset D$ with outer unit normal vector $\mathbf{n} = (n_1, n_2)^T$. Note that this formula can be derived from the system (2) by integrating over a control volume and applying Gauss' integral theorem. The solution class considered can be described by the function space $BV\left(\mathbb{R}^+_0; L^1 \cap L^\infty(\mathbb{R}^2; \mathbb{R}^{17})\right)$, i.e., the mapping $t \mapsto \mathbf{U}(\cdot, t)$ is of bounded variation and the image is an integrable function of the space variables which is bounded almost everywhere.

The numerical approximation of Eq. 3 requires an appropriate discretization of the space part $D$ as well as the time part $\mathbb{R}^+$. We start from an arbitrary conforming triangulation $T^h$ of the domain $D$, called the primary mesh, consisting of finitely many (say $\#T^h$) triangles $T_i$, $i = 1, \ldots, \#T^h$. Thereby, the parameter $h$ corresponds to a typical one-dimensional geometrical measure of the triangulation, for example the maximum diameter of the smallest enclosing circles of the triangles $T_i$, $i = 1, \ldots, \#T^h$. For a comprehensive definition of a primary grid, we refer to Sonar (1997a). Furthermore, $N^h$ denotes the index set of all nodes of the triangulation $T^h$ and is subdivided into $N^h = N^{h,D} \cup N^{h,\partial D}$, where $N^{h,D}$ is associated with the inner points and $N^{h,\partial D}$ includes the indices of the boundary points. We set $N := \#N^h$ and denote the three edges of the triangle $T$ by $e_{T,k}$, $k = 1, 2, 3$. Furthermore, we define

$$ E(i) = \{ e_{T,k} \mid k \in \{1,2,3\},\ T \in T^h,\ \text{node } \mathbf{x}_i \in e_{T,k} \}, $$
$$ V(i) = \{ T \mid \text{node } \mathbf{x}_i \text{ is vertex of } T \in T^h \}, $$
$$ C(T) = \{ i \mid i \in \{1, \ldots, N\},\ \text{node } \mathbf{x}_i \text{ is vertex of } T \}. $$

For the calculation of the triangulation we employ an algorithm developed by Friedrich (1993) which provides mostly isotropic grids. As we see in the following, the occurrence of second-order derivatives within the partial differential equation requires the evaluation of first-order derivatives on the boundary of each control volume used in the finite volume scheme.
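The index sets introduced above can be assembled directly from a list of triangles. The following Python sketch is only an illustration under simple assumptions: triangles are given as 3-tuples of node indices (these tuples play the role of C(T)), edges are stored as unordered node pairs, and a node is classified as a boundary node if it belongs to an edge that is used by exactly one triangle. The function name is hypothetical.

```python
from collections import defaultdict


def mesh_index_sets(triangles):
    """Build V(i), E(i) and the inner/boundary node sets of a triangulation.

    triangles: list of 3-tuples of node indices, i.e. the sets C(T).
    """
    V = defaultdict(set)           # V(i): triangles having node i as a vertex
    edge_count = defaultdict(int)  # number of triangles sharing each edge
    for t, nodes in enumerate(triangles):
        for k in range(3):
            V[nodes[k]].add(t)
            edge = frozenset((nodes[k], nodes[(k + 1) % 3]))
            edge_count[edge] += 1

    E = defaultdict(set)           # E(i): edges that contain node i
    for edge in edge_count:
        for i in edge:
            E[i].add(edge)

    boundary_nodes = set()         # nodes on edges used by one triangle only
    for edge, count in edge_count.items():
        if count == 1:
            boundary_nodes |= edge
    inner_nodes = set(V) - boundary_nodes
    return dict(V), dict(E), inner_nodes, boundary_nodes


# Usage: a square split into four triangles around the center node 4.
square = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
V, E, inner, boundary = mesh_index_sets(square)
```

In this small mesh the center node is the only inner node and is a vertex of all four triangles, whereas each corner node belongs to two triangles and three edges.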
Due to this fact it is advantageous to employ a box-type method, where the computation of a first-order derivative at the boundary of each box is straightforward. We define a discrete control volume $\sigma_i$ as the open subset of $\mathbb{R}^2$ that includes the node $\mathbf{x}_i = (x_{i1}, x_{i2})^T$ and is bounded by the straight lines connecting the midpoint of each edge $e_{T,k} \in E(i)$ with the point

$$ \mathbf{x}_s = (x_{s1}, x_{s2})^T = \sum_{i \in C(T)} \alpha_i \mathbf{x}_i $$

of the corresponding triangle $T$, see Fig. 2. If $\mathbf{x}_i$ lies on the boundary of the computational domain, the line connecting the midpoint of the boundary edge with the node itself is also part of $\partial\sigma_i$.

Fig. 2 General form of a control volume (left) and its boundary (right)

For the calculation of the weights we employ

$$ \alpha_i = \frac{1}{2 \sum_{m \in C(T)} |l_m|} \sum_{\substack{m \in C(T) \\ m \neq i}} |l_m| \quad \text{with } |l_i| = \|\mathbf{x}_j - \mathbf{x}_k\|_2 \ \text{ for } i, j, k \in C(T). \tag{4} $$

This definition has the advantage that the deformation of the control volume for distorted triangles is much smaller than the deformation obtained when $\mathbf{x}_s$ is chosen as the barycenter of the triangle $T$. The union $B^h$ of all boxes $\sigma_i$, $i \in N^h$, is called the secondary mesh. Let $N(i)$ denote the index set of all nodes neighboring node $\mathbf{x}_i$, i.e., those nodes $\mathbf{x}_j$, $j \neq i$, for which $\int_{\partial\sigma_i \cap \partial\sigma_j} 1\, ds \neq 0$ is valid. In general, for $j \in N(i)$, the boundary between the control volumes $\sigma_i$ and $\sigma_j$ consists of two line segments, which are denoted by $l^k_{ij}$, $k = 1, 2$. Furthermore, $\mathbf{n}^k_{ij}$, $k = 1, 2$, represent the accompanying unit normal vectors. Note that in the case of a boundary box $\sigma_i$ there exist two adjacent cells $\sigma_j$, $j \in N^{h,\partial D}$, such that $\partial\sigma_i \cap \partial\sigma_j$ consists of one line segment $l^k_{ij}$ only. In order to achieve a unique representation we interpret the missing line segment as having length zero. Introducing the cell average
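$$ \mathbf{U}_i(t) := \frac{1}{|\sigma_i|} \int_{\sigma_i} \mathbf{U}(\mathbf{x}, t)\, d\mathbf{x} $$

the integral form with regard to an inner box $\sigma_i$ can be written as

$$ \frac{d}{dt} \mathbf{U}_i(t) = -(L^c_i \mathbf{U})(t) + (L^\nu_i \mathbf{U})(t) + (Q_i \mathbf{U})(t) $$

with

$$ (L^p_i \mathbf{U})(t) := \frac{1}{|\sigma_i|} \sum_{j \in N(i)} \sum_{k=1}^{2} \int_{l^k_{ij}} \sum_{m=1}^{2} \mathbf{F}^p_m(\mathbf{U})\, n^k_{ij,m}\, ds, \quad p \in \{c, \nu\}, $$

and

$$ (Q_i \mathbf{U})(t) := \frac{1}{|\sigma_i|} \int_{\sigma_i} \mathbf{Q}(\mathbf{U})\, d\mathbf{x}. $$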
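Before the flux evaluation is discussed, the construction of the point x_s from the weights in Eq. (4) can be illustrated by a short Python sketch. It assumes that the three nodes of a triangle are given as coordinate pairs and that l_m denotes the edge opposite node m; the function name is hypothetical.

```python
import math


def box_corner_point(x_i, x_j, x_k):
    """Return the weights (alpha_i, alpha_j, alpha_k) of Eq. (4) and x_s."""
    nodes = (x_i, x_j, x_k)
    # |l_m|: length of the edge opposite node m
    lengths = [
        math.dist(nodes[(m + 1) % 3], nodes[(m + 2) % 3]) for m in range(3)
    ]
    total = sum(lengths)
    # alpha_m = (sum of the other two edge lengths) / (2 * perimeter)
    alphas = [(total - lengths[m]) / (2.0 * total) for m in range(3)]
    x_s = tuple(
        sum(a * p[d] for a, p in zip(alphas, nodes)) for d in range(2)
    )
    return alphas, x_s


# Usage: right triangle with legs of length one.
alphas, x_s = box_corner_point((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
```

The weights always sum to one, and for an equilateral triangle they all equal one third, so that x_s coincides with the barycenter in that special case. For the right triangle above, x_s is shifted slightly away from the barycenter.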
Note that in the case of a boundary box $\sigma_i$ the boundary conditions have to be considered additionally. Obviously, the evaluation of the line integrals becomes problematic when $\mathbf{U}$ is discontinuous across the line segment $l^k_{ij}$. For the solution of these Riemann problems we introduce the concept of a numerical flux function, which is often called a Riemann solver. Over the past more than 40 years, a huge variety of numerical flux functions has been developed for various balance laws, each with its own advantages and shortcomings. Such schemes can be found in numerous well-written textbooks (Ansorge and Sonar 2009; Hirsch 1988a,b; LeVeque 1990; Toro 1999). Utilizing the HLL scheme (Toro 1999), one obtains

$$ (L^c_i \mathbf{U})(t) \approx \frac{1}{|\sigma_i|} \sum_{j \in N(i)} \sum_{k=1}^{2} |l^k_{ij}|\, \mathbf{H}^{\mathrm{HLL}}\left( \hat{\mathbf{U}}_i(t), \hat{\mathbf{U}}_j(t); \mathbf{n}^k_{ij} \right). $$
Herein, $\hat{\mathbf{U}}_i$ and $\hat{\mathbf{U}}_j$ represent approximations of the conserved variables at the left- and right-hand side of the midpoint $\mathbf{x}^k_{ij}$ of the line segment $l^k_{ij}$. Using the cell averages, i.e., $\hat{\mathbf{U}}_i = \mathbf{U}_i$ and $\hat{\mathbf{U}}_j = \mathbf{U}_j$, leads to a spatially first-order accurate scheme. The central approach for the construction of higher-order finite volume methods is based on the recovery of improved approximations $\hat{\mathbf{U}}_i$, $\hat{\mathbf{U}}_j$ from the given cell averages. A detailed introduction and mathematical analysis of different reconstruction techniques is presented in Sonar (1997a). Within this study, we concentrate on the recovery based on linear polynomials calculated by means of a TVD reconstruction procedure using the Barth-Jespersen limiter (Barth and Jesperson 1989). Furthermore, the numerical evaluation of the viscous fluxes is performed in the sense of central differences. For each quantity appearing inside the flux $\mathbf{F}^\nu_m$, we calculate the unique linear distribution with respect to the triangle $T$ from the cell averages of the three corresponding boxes located at the nodes of the triangle. Thus, we can write

$$ (L^\nu_i \mathbf{U})(t) \approx \frac{1}{|\sigma_i|} \sum_{j \in N(i)} \sum_{k=1}^{2} |l^k_{ij}|\, \mathbf{H}^{\mathrm{central}}\left( \mathbf{U}_{T,1}(t), \mathbf{U}_{T,2}(t), \mathbf{U}_{T,3}(t); \mathbf{n}^k_{ij} \right), $$
where $l^k_{ij} \subset T$. Finally, the source term is simply expressed as

$$ (Q_i \mathbf{U})(t) \approx \mathbf{Q}(\mathbf{U}_i(t)). $$

Subsequent to the approximation of the spatial parts, we are faced with a huge system of ordinary differential equations of the form

$$ \frac{d}{dt} \mathbf{U}(t) = (L\mathbf{U})(t) + (Q\mathbf{U})(t), $$

where $\mathbf{U}$ represents the vector containing all cell averages and $L$, $Q$ denote the numerical approximation of the global terms (viscous and inviscid fluxes) and the local source terms, respectively. Concerning the decomposition of the right-hand side, we employ a standard Strang splitting

$$ \begin{aligned} \mathbf{U}^{(1)} &= \mathbf{U}^{n} + \Delta t\, \Psi(\Delta t, \mathbf{U}^{n}, Q), \\ \mathbf{U}^{(2)} &= \mathbf{U}^{(1)} + 2\Delta t\, \Theta(\Delta t, \mathbf{U}^{(1)}, L), \\ \mathbf{U}^{n+2} &= \mathbf{U}^{(2)} + \Delta t\, \Psi(\Delta t, \mathbf{U}^{(2)}, Q), \end{aligned} \tag{5} $$

which is of second order if both discretization schemes $\Theta$ and $\Psi$ are second-order accurate. Note that the steps associated with the operator $L$ include the convection and diffusion processes, whereas the other steps possess a local character due to the reaction terms. Thus, these local steps are carried out separately for each control volume. The discretization scheme $\Theta$ is simply realized by a second-order Runge-Kutta method. In the context of stiff reaction terms $Q$, the performance of the complete numerical method depends decisively on the properties of the incorporated scheme $\Psi$, which has to be stable and, if necessary, conservative and positivity preserving independent of the time step size used.
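The structure of the splitting (5) can be sketched for a scalar toy problem. In the following Python example both sub-steps use Heun's second-order Runge-Kutta method, which is an assumption made for illustration; in particular, it is not the positivity preserving scheme required for stiff reaction terms. The operators L and Q, the test equation y' = -y - y^2 and all function names are likewise hypothetical.

```python
import math


def heun_step(f, y, dt):
    """One step of Heun's second-order Runge-Kutta method for y' = f(y)."""
    k1 = f(y)
    k2 = f(y + dt * k1)
    return y + 0.5 * dt * (k1 + k2)


def strang_step(y, dt, L, Q):
    """Advance y from t^n to t^{n+2} = t^n + 2*dt following Eq. (5)."""
    y1 = heun_step(Q, y, dt)         # U^(1): local step over dt
    y2 = heun_step(L, y1, 2.0 * dt)  # U^(2): transport step over 2*dt
    return heun_step(Q, y2, dt)      # U^(n+2): local step over dt


def integrate(y0, t_end, n_steps, L, Q):
    """Apply n_steps Strang steps, each covering the interval 2*dt."""
    dt = t_end / (2 * n_steps)
    y = y0
    for _ in range(n_steps):
        y = strang_step(y, dt, L, Q)
    return y


def L(y):
    return -y        # stands in for the global (transport) part


def Q(y):
    return -y * y    # stands in for the local (reaction) part


def exact(y0, t):
    """Exact solution of y' = -y - y^2 with y(0) = y0."""
    return 1.0 / ((1.0 / y0 + 1.0) * math.exp(t) - 1.0)


def error(n_steps):
    return abs(integrate(1.0, 1.0, n_steps, L, Q) - exact(1.0, 1.0))
```

Comparing `error(10)` with `error(20)` gives an impression of the convergence behavior; for a second-order method the error is expected to shrink by roughly a factor of four when the step size is halved.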
Numerical Results for Shallow Water Flow

The example deals with a dam-break situation combined with a pillar in the flow. The evolution of the water level depicted in Fig. 3 shows a good resolution of the dominant unsteady flow features, namely shock-boundary and shock-shock interactions. The test case emphasizes the applicability and stability of the finite volume scheme explained above for the simulation of shallow water flow.
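A strongly simplified, one-dimensional analogue of such a dam-break computation can be sketched in Python. The sketch is first-order accurate, uses the standard variables (h, q = h v) instead of the geopotential form, applies an HLL flux with simple wave speed estimates, and models solid walls by mirrored ghost cells. All parameter values and names are assumptions for illustration and do not reproduce the two-dimensional test case with the pillar.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def flux(h, q):
    """Physical flux of the 1D shallow water equations for (h, q = h*v)."""
    v = q / h
    return (q, q * v + 0.5 * G * h * h)


def hll_flux(left, right):
    """HLL numerical flux with simple estimates of the wave speeds."""
    (hl, ql), (hr, qr) = left, right
    vl, vr = ql / hl, qr / hr
    cl, cr = math.sqrt(G * hl), math.sqrt(G * hr)
    s_l = min(vl - cl, vr - cr)
    s_r = max(vl + cl, vr + cr)
    fl, fr = flux(hl, ql), flux(hr, qr)
    if s_l >= 0.0:
        return fl
    if s_r <= 0.0:
        return fr
    return tuple(
        (s_r * a - s_l * b + s_l * s_r * (ur - ul)) / (s_r - s_l)
        for a, b, ul, ur in zip(fl, fr, left, right)
    )


def dam_break(n_cells=200, length=10.0, h_left=2.0, h_right=1.0,
              t_end=0.5, cfl=0.4):
    """First-order finite volume solution of a 1D dam break at time t_end."""
    dx = length / n_cells
    cells = [
        (h_left if (i + 0.5) * dx < 0.5 * length else h_right, 0.0)
        for i in range(n_cells)
    ]
    t = 0.0
    while t_end - t > 1e-12:
        s_max = max(abs(q / h) + math.sqrt(G * h) for h, q in cells)
        dt = min(cfl * dx / s_max, t_end - t)
        # solid walls: ghost cells with mirrored momentum
        ghosts = ([(cells[0][0], -cells[0][1])] + cells
                  + [(cells[-1][0], -cells[-1][1])])
        f = [hll_flux(ghosts[i], ghosts[i + 1]) for i in range(n_cells + 1)]
        cells = [
            (h - dt / dx * (f[i + 1][0] - f[i][0]),
             q - dt / dx * (f[i + 1][1] - f[i][1]))
            for i, (h, q) in enumerate(cells)
        ]
        t += dt
    return dx, cells


dx, cells = dam_break()
total_mass = sum(h for h, _ in cells) * dx
```

Since the mass flux through the walls vanishes, the total water volume is preserved up to rounding errors, which is the discrete counterpart of the conservation property of the finite volume formulation.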
Fig. 3 Instationary flow around a pillar (water level, color scale from 1.0 m to 2.0 m)

3.2 Positivity Preserving and Conservative Schemes
Concerning the remaining numerical step associated with the system of ordinary differential equations we present a survey on the modified Patankar-approach originally suggested and analyzed by Burchard et al. (2003) and its extension published in Benz et al. (2009). In order to describe the properties of the enclosed numerical scheme for stiff positive systems we make use of an arbitrary production-destruction equation. For $i \neq j$ we utilize the notation $d_{i,j}(\mathbf{c}) \geq 0$ as the rate at which the $i$th component transforms into the $j$th, while $p_{i,j}(\mathbf{c}) \geq 0$ represents the rate at which the $j$th constituent transforms into the $i$th. Clearly, $p_{i,j}(\mathbf{c})$ and $d_{i,j}(\mathbf{c})$ must satisfy $p_{i,j}(\mathbf{c}) = d_{j,i}(\mathbf{c})$. In addition to these transition terms, we consider for the $i$th constituent local production by $p_{i,i}(\mathbf{c}) \geq 0$ and similar local destruction by $d_{i,i}(\mathbf{c}) \geq 0$. Thus, the system we start to investigate here can be written as

$$ \frac{d}{dt} c_i = \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{i,j}(\mathbf{c}) - \sum_{\substack{j=1 \\ j \neq i}}^{I} d_{i,j}(\mathbf{c}) + p_{i,i}(\mathbf{c}) - d_{i,i}(\mathbf{c}), \quad i = 1, \ldots, I, \tag{6} $$

where $\mathbf{c} = \mathbf{c}(t) = (c_1(t), \ldots, c_I(t))^T$ denotes the vector of the $I$ constituents. This system has to be solved under the initial conditions

$$ \mathbf{c}^0 = \mathbf{c}(t = 0) > \mathbf{0}. \tag{7} $$

Definition 1. The system (6) is called fully conservative, if the production and destruction terms satisfy $p_{i,j}(\mathbf{c}) = d_{j,i}(\mathbf{c})$ for $j \neq i$, $j, i \in \{1, \ldots, I\}$, and $p_{i,i}(\mathbf{c}) = d_{i,i}(\mathbf{c}) = 0$ for $i \in \{1, \ldots, I\}$.
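The notation of the production-destruction system (6) and the condition of Definition 1 can be made concrete with a small Python sketch. The linear two-constituent system c1' = c2 - a c1, c2' = a c1 - c2 with a = 5 is an illustrative choice and is not taken from the ecological model of this chapter; the function names are hypothetical.

```python
def linear_pd_terms(c, a=5.0):
    """Production and destruction rates of the illustrative system
    c1' = c2 - a*c1, c2' = a*c1 - c2.

    p[i][j] is the rate at which constituent j transforms into i,
    d[i][j] is the rate at which constituent i transforms into j.
    """
    c1, c2 = c
    p = [[0.0, c2], [a * c1, 0.0]]
    d = [[0.0, a * c1], [c2, 0.0]]
    return p, d


def is_fully_conservative(p, d, tol=1e-14):
    """Check Definition 1: p_ij = d_ji for i != j and p_ii = d_ii = 0."""
    n = len(p)
    off_diagonal = all(
        abs(p[i][j] - d[j][i]) <= tol
        for i in range(n) for j in range(n) if i != j
    )
    diagonal = all(p[i][i] == 0.0 and d[i][i] == 0.0 for i in range(n))
    return off_diagonal and diagonal


def rhs(p, d):
    """Right-hand side of Eq. (6) for every constituent."""
    n = len(p)
    return [sum(p[i][j] - d[i][j] for j in range(n)) for i in range(n)]


p, d = linear_pd_terms((0.9, 0.1))
derivatives = rhs(p, d)
```

For a fully conservative system the components of the right-hand side sum to zero, so the total mass c1 + c2 is constant in time.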
Furthermore, we always consider ecosystems where the constituents are by nature positive. From a mathematical point of view one can easily prove by means of a simple contradiction argument, that for non negative initial conditions ci .0/ 0; i D 1; : : :; I , the condition di;j .c/ ! 0 for ci ! 0 for all j 2 f1; : : :; I g, guarantees that the quantities ci .t/; i D 1; : : :; I remain non negative for all t 2 RC . Consequently, the properties mentioned above have to be maintained by the discretization scheme which means that no gain or loss of mass should occur for numerical reasons and that the concentration of all constituents have to remain positive independent of the time step size used. Based on the following standard formulation of a discretization scheme as cnC1 D cn C tˆ.cn ; cnC1 I t/ we introduce some notations and definitions which are used in the subsequent parts of the chapter. Definition 2. For a given discretization scheme ˆ we call e D cnC1 c t nC1 with cnC1 D c.t n / C tˆ c.t n /; c t nC1 I t the local discretization error vector, where c.t/ represents the exact solution of the initial value problem (6), (7) and t D t nC1 t n . Moreover, we write M D O.t p /
as t ! 0;
p 2 N0 ;
in terms of mi;j D O.t p / as t ! 0; p 2 N0 for all elements mi;j ; i D 1; : : :; r; j D 1; : : :; k of the matrix M 2 Rr k It is worth mentioning that the production and destruction terms pi;j; di;j; i; j D 1; : : :; I , are considered to be sufficiently smooth and we require the solution c of the initial value problem (6) and (7) to be sufficiently differentiable. In the following, we always consider the case of a vanishing time step t and thus the supplement t ! 0 will be neglected for simplification and we use the expression accuracy in the sense of consistency. Definition 3. A discretization scheme ˆ is called • consistent of order p with respect to the ordinary differential equation (6), if e D O.t pC1 /;
Phosphorus Cycles in Lakes and Rivers: Modeling, Analysis, and Simulation
• unconditionally positive, if $\mathbf{c}^{n+1} > 0$ for any given $\mathbf{c}^n := \mathbf{c}(t^n) > 0$ and any arbitrarily large time step $\Delta t \ge 0$, independent of the specific definition of the production and destruction terms within the ordinary differential equation (6),
• conservative, if
\[ \sum_{i=1}^{I} \left( c_i^{n+1} - c_i^n \right) = 0 \tag{8} \]
for any fully conservative ordinary differential equation (6) and $\mathbf{c}^n := \mathbf{c}(t^n)$.

At first, we consider the well-known forward Euler scheme
\[ c_i^{n+1} = c_i^n + \Delta t \Bigg( \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{i,j}(\mathbf{c}^n) - \sum_{\substack{j=1 \\ j \neq i}}^{I} d_{i,j}(\mathbf{c}^n) + p_{i,i}(\mathbf{c}^n) - d_{i,i}(\mathbf{c}^n) \Bigg) \tag{9} \]
for $i = 1, \dots, I$. We obtain the following result concerning the positivity and conservativity of the numerical method.

Theorem 1. The forward Euler scheme (9) is conservative but not unconditionally positivity preserving.

Proof. By means of the abbreviations
\[ P_i = \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{i,j} \quad \text{and} \quad D_i = \sum_{\substack{j=1 \\ j \neq i}}^{I} d_{i,j} \]
and utilizing the properties of the production and destruction rates we deduce
\[ \sum_{i=1}^{I} \left( c_i^{n+1} - c_i^n \right) = \Delta t \underbrace{\sum_{i=1}^{I} \left( P_i(\mathbf{c}^n) - D_i(\mathbf{c}^n) \right)}_{=0} + \Delta t \sum_{i=1}^{I} \big( \underbrace{p_{i,i}(\mathbf{c}^n)}_{=0} - \underbrace{d_{i,i}(\mathbf{c}^n)}_{=0} \big) = 0, \]
which proves that the method is conservative. Since the property of being unconditionally positive is independent of the production and destruction terms, we consider a fully conservative system with nonvanishing right-hand side. Thus, there exists at least one index $i \in \{1, \dots, I\}$ such that $P_i(\mathbf{c}^n) - D_i(\mathbf{c}^n) < 0$ for a given $\mathbf{c}^n > 0$. Thus, using
\[ \Delta t > \frac{c_i^n}{D_i(\mathbf{c}^n) - P_i(\mathbf{c}^n)} \]
yields
\[ c_i^{n+1} = c_i^n + \Delta t \left( P_i(\mathbf{c}^n) - D_i(\mathbf{c}^n) \right) < 0, \]
which proves the statement.
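The statement of Theorem 1 can be checked numerically. The following minimal sketch applies one forward Euler step to a two-component fully conservative system; the particular rates ($p_{1,2} = d_{2,1} = c_2$, $p_{2,1} = d_{1,2} = 5c_1$), the initial state and the function name `euler_step` are illustrative assumptions, not part of the scheme definition.

```python
def euler_step(c, dt):
    """One forward Euler step for c1' = c2 - 5*c1, c2' = 5*c1 - c2."""
    c1, c2 = c
    P1, D1 = c2, 5.0 * c1          # production / destruction of constituent 1
    P2, D2 = D1, P1                # fully conservative: p21 = d12, d21 = p12
    return (c1 + dt * (P1 - D1), c2 + dt * (P2 - D2))


c0 = (0.9, 0.1)                                # assumed positive initial state
dt_crit = c0[0] / (5.0 * c0[0] - c0[1])        # c_1^n / (D_1 - P_1) from the proof
below = euler_step(c0, 0.5 * dt_crit)          # step size below the threshold
above = euler_step(c0, 0.25)                   # step size above the threshold

print("critical step size:", dt_crit)
print("dt below threshold:", below, "sum =", sum(below))
print("dt above threshold:", above, "sum =", sum(above))
```

In both cases the sum of the components is unchanged, while the first component becomes negative as soon as the step size exceeds the threshold used in the proof.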
To overcome this disadvantage, Patankar (1980) suggested weighting the destruction terms $d_{i,j}(\mathbf{c})$ and $d_{i,i}(\mathbf{c})$ by the factor $c_i^{n+1}/c_i^n$.
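Theorem 2. The Patankar scheme
\[ c_i^{n+1} = c_i^n + \Delta t \Bigg( \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{i,j}(\mathbf{c}^n) - \sum_{\substack{j=1 \\ j \neq i}}^{I} d_{i,j}(\mathbf{c}^n) \frac{c_i^{n+1}}{c_i^n} + p_{i,i}(\mathbf{c}^n) - d_{i,i}(\mathbf{c}^n) \frac{c_i^{n+1}}{c_i^n} \Bigg) \]
for $i = 1, \dots, I$ is unconditionally positivity preserving but not conservative.

Proof. We simply rewrite the Patankar scheme in the form
\[ \underbrace{\Bigg( 1 + \frac{\Delta t}{c_i^n} \Bigg( \sum_{\substack{j=1 \\ j \neq i}}^{I} d_{i,j}(\mathbf{c}^n) + d_{i,i}(\mathbf{c}^n) \Bigg) \Bigg)}_{\ge 1} c_i^{n+1} = \underbrace{c_i^n + \Delta t \Bigg( \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{i,j}(\mathbf{c}^n) + p_{i,i}(\mathbf{c}^n) \Bigg)}_{\ge c_i^n > 0} \]
for $i = 1, \dots, I$. Thus, we can immediately conclude the positivity of the method. However, even in the case of the simple system
\[ c_1'(t) = c_2(t) - 2 c_1(t), \qquad c_2'(t) = 2 c_1(t) - c_2(t) \]
with initial conditions $\mathbf{c}^0 = \mathbf{c}(t = 0) = (1, 1)^T$ one gets
\[ c_1^1 = \frac{1 + \Delta t}{1 + 2 \Delta t}, \qquad c_2^1 = \frac{1 + 2 \Delta t}{1 + \Delta t} \]
such that
\[ \sum_{i=1}^{2} \left( c_i^1 - c_i^0 \right) = \frac{\Delta t^2}{1 + 3 \Delta t + 2 \Delta t^2} \neq 0 \quad \text{for all } \Delta t > 0. \]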
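The counterexample in the proof of Theorem 2 is easy to reproduce. The sketch below performs one Patankar step for the two-component system with $\mathbf{c}^0 = (1, 1)^T$ and compares the result with the closed-form expressions given above; the function names `patankar_step` and `mass_defect` are chosen here for illustration only.

```python
def patankar_step(c, dt):
    """One Patankar step for c1' = c2 - 2*c1, c2' = 2*c1 - c2."""
    c1, c2 = c
    p = (c2, 2.0 * c1)        # production terms p12, p21
    d = (2.0 * c1, c2)        # destruction terms d12, d21
    # (1 + dt*d_i/c_i) * c_i_new = c_i + dt*p_i, solved for c_i_new
    return tuple((ci + dt * pi) / (1.0 + dt * di / ci)
                 for ci, pi, di in zip(c, p, d))


def mass_defect(dt):
    """Change of total mass after one step started from c0 = (1, 1)."""
    c1, c2 = patankar_step((1.0, 1.0), dt)
    return c1 + c2 - 2.0


for dt in (0.1, 1.0, 10.0):
    predicted = dt**2 / (1.0 + 3.0 * dt + 2.0 * dt**2)
    print(dt, patankar_step((1.0, 1.0), dt), mass_defect(dt), predicted)
```

Both components stay positive for every step size, whereas the total mass grows by exactly the amount stated in the proof.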
Theorem 2 shows that the so-called Patankar trick cures the positivity problem, but the method violates conservativity because production and destruction terms are handled in different ways. Inspired by the Patankar trick, Burchard et al. (2003) introduced a modified Patankar approach in which source and sink terms are treated in the same way. However, this procedure can be applied directly only to conservative systems. Thus, an extension of this modified Patankar scheme that takes account of additional nonconservative reaction terms, as they appear within the biomass dynamics, has been presented in Benz et al. (2009). With respect to the Euler scheme as the underlying basic solver, this extended modified Patankar approach can be written in the form
\[ c_i^{n+1} = c_i^n + \Delta t \Bigg( \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{i,j}(\mathbf{c}^n) \frac{c_j^{n+1}}{c_j^n} - \sum_{\substack{j=1 \\ j \neq i}}^{I} d_{i,j}(\mathbf{c}^n) \frac{c_i^{n+1}}{c_i^n} + \left( p_{i,i}(\mathbf{c}^n) - d_{i,i}(\mathbf{c}^n) \right) \omega_i^n \Bigg) \tag{10} \]
for $i = 1, \dots, I$, where
\[ \omega_i^n = \begin{cases} \dfrac{c_i^{n+1}}{c_i^n}, & \text{if } p_{i,i}(\mathbf{c}^n) - d_{i,i}(\mathbf{c}^n) < 0, \\ 1, & \text{otherwise.} \end{cases} \]
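Theorem 3. The extended modified Patankar scheme (10) is conservative in the sense of Definition 3.

Proof. It is easily seen by straightforward calculations that
\[ \sum_{i=1}^{I} \left( c_i^{n+1} - c_i^n \right) = \Delta t \sum_{i=1}^{I} \Bigg( \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{i,j}(\mathbf{c}^n) \frac{c_j^{n+1}}{c_j^n} - \sum_{\substack{j=1 \\ j \neq i}}^{I} d_{i,j}(\mathbf{c}^n) \frac{c_i^{n+1}}{c_i^n} \Bigg) + \Delta t \sum_{i=1}^{I} \underbrace{\left( p_{i,i}(\mathbf{c}^n) - d_{i,i}(\mathbf{c}^n) \right)}_{=0} \omega_i^n \]
\[ = \Delta t \sum_{\substack{i,j=1 \\ j \neq i}}^{I} \underbrace{\left( p_{i,j}(\mathbf{c}^n) - d_{j,i}(\mathbf{c}^n) \right)}_{=0} \frac{c_j^{n+1}}{c_j^n} = 0, \]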
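which proves the statement.

Theorem 4. The extended modified Patankar scheme (10) applied to the system of differential equations (6) is unconditionally positivity preserving.

Proof. The Patankar-type approach (10) can be written in the form
\[ A \mathbf{c}^{n+1} = \mathbf{b}^n, \tag{11} \]
where $A = (a_{i,j}) \in \mathbb{R}^{I \times I}$ with
\[ a_{i,i} = 1 + \frac{\Delta t}{c_i^n} \Bigg( \max\{0,\, d_{i,i}(\mathbf{c}^n) - p_{i,i}(\mathbf{c}^n)\} + \sum_{\substack{k=1 \\ k \neq i}}^{I} d_{i,k}(\mathbf{c}^n) \Bigg) > 0, \quad i = 1, \dots, I, \]
\[ a_{i,j} = -\Delta t\, \frac{p_{i,j}(\mathbf{c}^n)}{c_j^n} \le 0, \quad i, j = 1, \dots, I, \quad i \neq j, \]
and
\[ b_i^n = c_i^n + \Delta t \max\{0,\, p_{i,i}(\mathbf{c}^n) - d_{i,i}(\mathbf{c}^n)\} \ge c_i^n > 0, \quad i = 1, \dots, I. \]
Hence, for $i = 1, \dots, I$ we have
\[ |a_{i,i}| > \Delta t \sum_{\substack{k=1 \\ k \neq i}}^{I} \frac{d_{i,k}(\mathbf{c}^n)}{c_i^n} = \Delta t \sum_{\substack{k=1 \\ k \neq i}}^{I} \frac{p_{k,i}(\mathbf{c}^n)}{c_i^n} = \sum_{\substack{k=1 \\ k \neq i}}^{I} (-a_{k,i}) = \sum_{\substack{k=1 \\ k \neq i}}^{I} |a_{k,i}| \]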
which directly shows that the point Jacobi matrix $B = I - D^{-1} A^T$, defined by means of the diagonal matrix $D^{-1} = \operatorname{diag}\{a_{1,1}^{-1}, \dots, a_{I,I}^{-1}\}$, satisfies $\rho(B) \le \|B\|_\infty < 1$. Thus, the matrix $B$ is convergent. Since the matrix $A$ contains only nonpositive off-diagonal elements and positive diagonal entries, we can conclude with a standard statement from numerical linear algebra that $A^T$ and therefore $A$ are M-matrices. Thus, $A^{-1}$ exists and is nonnegative, i.e., $A^{-1} \ge 0$. This fact implies that $\mathbf{c}^{n+1} = A^{-1} \mathbf{b}^n \ge A^{-1} \mathbf{c}^n > 0$, since at least one entry per row of the matrix $A^{-1}$ is positive.

The Patankar scheme as well as the modified and extended modified version can be interpreted as a perturbed Euler scheme due to the incorporation of the weights. Thus, it is not obvious that these schemes are still first-order accurate. Fortunately, the order of accuracy of the underlying Euler scheme carries over to each variant discussed above. Similar to the error analysis presented in Burchard et al. (2003) for the modified Patankar scheme, we will now prove this important property for the extended formulation.

Theorem 5. The extended modified Patankar scheme (10) is first-order accurate in the sense of the local discretization error.

Proof. Since the time step inside the extended modified Patankar scheme (10) is performed by the solution of a linear system of equations, it is advantageous to investigate the entries of the inverse matrix $A^{-1} = (\tilde a_{i,j}) \in \mathbb{R}^{I \times I}$. Introducing $\mathbf{e} = (1, \dots, 1)^T \in \mathbb{R}^I$ one can easily verify $\mathbf{e}^T A \ge \mathbf{e}^T$. Within the proof of Theorem 4 we have seen that $A$ is regular and $A^{-1} \ge 0$. Thus, we obtain $\mathbf{e}^T \ge \mathbf{e}^T A^{-1}$, which yields
\[ 0 \le \tilde a_{i,j} \le 1, \quad i, j = 1, \dots, I, \tag{12} \]
independent of the time step size $\Delta t > 0$ used. Starting from the formulation of the time step in the form of the linear system (11) one can conclude
\[ \frac{c_i^{n+1}}{c_i^n} = \sum_{j=1}^{I} \underbrace{\tilde a_{i,j}}_{\in [0,1]} \frac{b_j^n}{c_i^n} = O(1), \quad i = 1, \dots, I, \]
from the estimate (12). Introducing this property into the extended modified Patankar scheme (10) and setting $\mathbf{c}^n := \mathbf{c}(t^n)$ for simplification leads to
\[ c_i^{n+1} - c_i^n = \Delta t \underbrace{\Bigg( \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{i,j}(\mathbf{c}^n) \frac{c_j^{n+1}}{c_j^n} - \sum_{\substack{j=1 \\ j \neq i}}^{I} d_{i,j}(\mathbf{c}^n) \frac{c_i^{n+1}}{c_i^n} + \left( p_{i,i}(\mathbf{c}^n) - d_{i,i}(\mathbf{c}^n) \right) \omega_i^n \Bigg)}_{=O(1)} = O(\Delta t). \]
Thus we obtain
\[ \frac{c_i^{n+1}}{c_i^n} - 1 = \frac{c_i^{n+1} - c_i^n}{c_i^n} = O(\Delta t) \tag{13} \]
for $i = 1, \dots, I$. A similar result is valid for the weight $\omega_i^n$ due to
\[ \omega_i^n - 1 = \begin{cases} \dfrac{c_i^{n+1} - c_i^n}{c_i^n} = O(\Delta t), & \text{if } p_{i,i}(\mathbf{c}^n) - d_{i,i}(\mathbf{c}^n) < 0, \\ 0 = O(\Delta t), & \text{otherwise.} \end{cases} \tag{14} \]
A combination of (13) and (14) with a simple Taylor series expansion yields
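\[ c_i(t^{n+1}) = c_i(t^n) + \Delta t\, \frac{d c_i}{dt}(t^n) + O(\Delta t^2) \]
\[ = c_i(t^n) + \Delta t \Bigg( \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{i,j}(\mathbf{c}(t^n)) - \sum_{\substack{j=1 \\ j \neq i}}^{I} d_{i,j}(\mathbf{c}(t^n)) + p_{i,i}(\mathbf{c}(t^n)) - d_{i,i}(\mathbf{c}(t^n)) \Bigg) + O(\Delta t^2) \]
\[ = c_i^n + \Delta t \Bigg( \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{i,j}(\mathbf{c}^n) \frac{c_j^{n+1}}{c_j^n} - \sum_{\substack{j=1 \\ j \neq i}}^{I} d_{i,j}(\mathbf{c}^n) \frac{c_i^{n+1}}{c_i^n} + \left( p_{i,i}(\mathbf{c}^n) - d_{i,i}(\mathbf{c}^n) \right) \omega_i^n \Bigg) \]
\[ \quad - \Delta t \underbrace{\Bigg( \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{i,j}(\mathbf{c}^n) \frac{c_j^{n+1} - c_j^n}{c_j^n} - \sum_{\substack{j=1 \\ j \neq i}}^{I} d_{i,j}(\mathbf{c}^n) \frac{c_i^{n+1} - c_i^n}{c_i^n} \Bigg)}_{=O(\Delta t)} - \Delta t \underbrace{\left( p_{i,i}(\mathbf{c}^n) - d_{i,i}(\mathbf{c}^n) \right) \left( \omega_i^n - 1 \right)}_{=O(\Delta t)} + O(\Delta t^2) \]
\[ = c_i^{n+1} + O(\Delta t^2), \]
which completes the proof.

In order to increase the accuracy, one can easily integrate the idea described in the context of the extended modified Patankar scheme into a standard second-order Runge-Kutta method. Similar to the proof presented in Burchard et al. (2003), it can be shown that such an extension is second-order accurate, unconditionally positivity preserving and conservative in the sense of Definition 3.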
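Theorems 3 to 5 can be illustrated by assembling the linear system (11) explicitly. The following sketch is written under stated assumptions: the off-diagonal production terms are passed as a matrix `p` with `d[i][j] = p[j][i]` (full conservativity), the diagonal net terms $p_{i,i} - d_{i,i}$ are passed as a list, and the small elimination routine, the function names and the two-component test problem are illustrative choices rather than part of the original method description.

```python
import math


def solve(A, b):
    """Gaussian elimination without pivoting; adequate here because A is
    strictly diagonally dominant by columns (see the proof of Theorem 4)."""
    n = len(b)
    A = [row[:] for row in A]
    b = b[:]
    for k in range(n):
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= f * A[k][j]
            b[i] -= f * b[k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x


def mpe_step(c, dt, prod, diag):
    """One extended modified Patankar step: build A and b of (11), then solve."""
    n = len(c)
    p = prod(c)                      # p[i][j], i != j
    r = diag(c)                      # r[i] = p_ii - d_ii
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for i in range(n):
        dest = sum(p[k][i] for k in range(n) if k != i)   # sum_k d_ik = sum_k p_ki
        A[i][i] = 1.0 + dt / c[i] * (max(0.0, -r[i]) + dest)
        for j in range(n):
            if j != i:
                A[i][j] = -dt * p[i][j] / c[j]
        b[i] = c[i] + dt * max(0.0, r[i])
    return solve(A, b)


def prod15(c):
    """Production matrix of c1' = c2 - 5*c1, c2' = 5*c1 - c2."""
    return [[0.0, c[1]], [5.0 * c[0], 0.0]]


def diag0(c):
    """No diagonal (nonconservative) terms in this test problem."""
    return [0.0, 0.0]


def exact15(t):
    c1 = (1.0 + 4.4 * math.exp(-6.0 * t)) / 6.0
    return [c1, 1.0 - c1]


def local_error(dt):
    c = mpe_step(exact15(0.0), dt, prod15, diag0)
    return abs(c[0] - exact15(dt)[0])


big = mpe_step([0.9, 0.1], 1.0e3, prod15, diag0)      # very large time step
ratio = local_error(1.0e-3) / local_error(5.0e-4)     # close to 4 for O(dt^2)
print("large step:", big, "sum =", sum(big))
print("local error ratio:", ratio)
```

Halving the step size reduces the local error by a factor close to four, which is consistent with a local error of order $\Delta t^2$, while positivity and the total mass are retained even for a very large step.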
Numerical Results for Positive Ordinary Differential Equations

The first test case is a simple linear system of ordinary differential equations. The results are taken from the original paper by Burchard et al. (2003). The system can be written as
\[ \frac{d c_1}{dt} = c_2 - 5 c_1 \quad \text{and} \quad \frac{d c_2}{dt} = 5 c_1 - c_2 \tag{15} \]
with initial values $c_1(0) = 0.9$ and $c_2(0) = 0.1$. The analytic solution is $c_1(t) = (1 + 4.4 \exp(-6t))/6$ and $c_2(t) = 1 - c_1(t)$. Using the step size $\Delta t = 0.25$ one obtains negative concentrations for the standard forward Euler scheme, whereas the modified Patankar approach still preserves the positivity of the solution. Furthermore, both schemes are conservative, which can be seen from the horizontal line representing the sum of both concentrations (Fig. 4).

Fig. 4 Numerical approximation ($\Delta t = 0.25$) and analytical solution of the simple linear system with the forward Euler scheme (left) and the (extended) modified Patankar scheme (right). Both panels show the concentrations $c_1$, $c_2$ and $c_1 + c_2$ over time $t$.

As a second test case we consider the simplified conservative biochemical system
\[ \frac{d}{dt} \begin{pmatrix} PA \\ PEI \\ PEO \\ PS \end{pmatrix} = \begin{pmatrix} uptba - setpa - resp \\ minpe - exchp \\ setpa - minpe \\ exchp - uptba + resp \end{pmatrix}, \]
where uptba, setpa, resp, exchp, and minpe denote the phosphorus uptake, the phosphorus of settling phytoplankton, the phosphorus due to respiration, the exchange between water and sediment, and the mineralization of organic phosphorus, respectively. The results obtained by the standard forward Euler scheme, the Patankar scheme and the modified Patankar scheme for a constant time step size $\Delta t = 1/3$ d are compared with a high-resolution numerical result using the time step size $\Delta t = 1/100$ d. Figures 5 and 6 confirm the analytic statements concerning the three schemes used. The modified Patankar scheme is positivity preserving and conservative, while the Patankar method suffers from nonconservativity and the standard Euler approach yields meaningless negative values for both the solute phosphorus concentration in the water body and the phosphorus within the biomass. Note that the conservativity can be observed by $\Delta(PS + PA + PEI + PEO) := (PS + PA + PEI + PEO)^n - (PS + PA + PEI + PEO)^0$.
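The first test case (15) can be reproduced in a few lines. The sketch below integrates the system with $\Delta t = 0.25$ by the forward Euler scheme and by the modified Patankar scheme, whose $2 \times 2$ linear system is solved by Cramer's rule. The number of steps (seven, so that the final time is 1.75) and all function names are assumptions made for this illustration.

```python
import math

DT, STEPS = 0.25, 7          # step size from the text; the step count is assumed


def euler(c):
    """Forward Euler step for c1' = c2 - 5*c1, c2' = 5*c1 - c2."""
    c1, c2 = c
    flux = DT * (c2 - 5.0 * c1)
    return (c1 + flux, c2 - flux)


def modified_patankar(c):
    """Modified Patankar step: solve A c_new = c with Cramer's rule."""
    c1, c2 = c
    p12, p21 = c2, 5.0 * c1            # p12 = d21, p21 = d12
    a11 = 1.0 + DT * p21 / c1          # 1 + dt * d12 / c1
    a12 = -DT * p12 / c2
    a21 = -DT * p21 / c1
    a22 = 1.0 + DT * p12 / c2          # 1 + dt * d21 / c2
    det = a11 * a22 - a12 * a21
    return ((a22 * c1 - a12 * c2) / det, (a11 * c2 - a21 * c1) / det)


def exact(t):
    c1 = (1.0 + 4.4 * math.exp(-6.0 * t)) / 6.0
    return (c1, 1.0 - c1)


def run(step):
    traj = [(0.9, 0.1)]
    for _ in range(STEPS):
        traj.append(step(traj[-1]))
    return traj


euler_traj = run(euler)
mp_traj = run(modified_patankar)
for n, (e, m) in enumerate(zip(euler_traj, mp_traj)):
    print(n * DT, e, m, exact(n * DT))
```

The Euler trajectory leaves the positive range in its first step, whereas the modified Patankar trajectory stays positive; in both trajectories the sum $c_1 + c_2$ remains equal to one.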
3.3 Practical Applications
The ecological model presented here is an enhancement of the phosphorus cycle model proposed in Hongping and Jianyi (2002). All considered processes are depicted in Fig. 2. For the sake of simplicity, only one algal biomass (BA) and, correspondingly, only one phosphorus content (PA) are included in Fig. 2. The remaining three biomasses and the corresponding phosphorus contents can easily be included since their evolution is quite alike. The examined constituents of the biological system are summarized with their abbreviations and units in Table 1.

Fig. 5 Results of the forward Euler scheme (left) and the Patankar scheme (right) concerning the second test case. Curves shown: PSref, PAref, $\Delta(PS + PA + PEO + PEI)$, PS and PA over 350 days.

Fig. 6 Results of the modified Patankar scheme (left) concerning the second test case. Distribution of sediment temperature (TE), water temperature (T) and light intensity (I) over the course of the year (right) with respect to the West Lake.

Table 1 Constituents of the phosphorus cycle

| Description | Abbreviation | Unit |
|---|---|---|
| Solute phosphorus/PO4 | PS | g/m3 |
| Phosphorus in detritus | PD | g/m3 |
| Inorganic and solute phosphorus in sediment | PEI | g/m3 |
| Organic phosphorus in sediment | PEO | g/m3 |
| Biomass of zooplankton and its content of phosphorus | BZ, PZ | g/m3, g/m3 |
| Biomasses of four different groups of algae species with their respective content of phosphorus | $BA_{A-D}$, $PA_{A-D}$ | g/m3, g/m3 |
The impact of the current on the different kinds of biomass and phosphorus constituents within the governing equations (2) is taken into account as follows. Regarding the biochemical system, the flux $f^{c}_{m,p}$ describes the passive advective transport with the existing current for all constituents of the phosphorus cycle that are in the water body. Furthermore, the additional flux term $f_{m,p}$ includes diffusive effects for the solute components PS and PEI with appropriate diffusion coefficients $Diff_{PS}$ and $Diff_{PEI}$. The expansion of the right-hand side $q_p(\mathbf{U})$ is given by
\[ q_p(\mathbf{U}) = \begin{pmatrix}
exchp - \sum_i uptba_i + minpd + zresp + \sum_i resp_i \\
-minpd - setpd + zmorp \\
-exchp + minpe \\
-minpe + setpd + \sum_i (setpa_i + gsinkp_i) \\
uptba_A - resp_A - setpa_A - gsinkp_A - assimp_A \\
uptba_B - resp_B - setpa_B - gsinkp_B - assimp_B \\
uptba_C - resp_C - setpa_C - gsinkp_C - assimp_C \\
uptba_D - resp_D - setpa_D - gsinkp_D - assimp_D \\
-zresp - zmorp + \sum_i assimp_i \\
\sum_i assim_i - zres - zmor \\
growth_A - res_A - sink_A - graz_A \\
growth_B - res_B - sink_B - graz_B \\
growth_C - res_C - sink_C - graz_C \\
growth_D - res_D - sink_D - graz_D
\end{pmatrix}. \]
A detailed description of the terms is presented in the Appendix (section "Appendix: Ecological Model"). To evaluate the above equations, the temperature of the water $T$ and of the sediment $TE$ must be supplied. Based on the air temperature stated in Weather Hangzhou (2008a,b,c), these temperatures are solutions of a one-dimensional heat equation; the resulting distribution is shown in Fig. 6. The figure also shows the assumed light intensity over the course of 1 year. For the rain $q$, annual time series from the same source of information are used as well. Most of the constants are adopted from Hongping and Jianyi (2002). All constants used are assembled in Table 2, "Appendix: Ecological Model."

Even though the model described above is based on Hongping and Jianyi (2002), it has many important new features. An extremely important fact is that the underlying flow field of the water and the biological and chemical equations are coupled and solved simultaneously. This gives a much more detailed image of the modeled processes in the lake than any prediction made under the assumption of evenly distributed matter can hope for. Another major difference between the model proposed in Hongping and Jianyi (2002) and our model is the splitting of the phosphorus in the sediment PE into organic phosphorus in the sediment PEO and inorganic phosphorus in the sediment PEI.
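The phosphorus-carrying components of $q_p(\mathbf{U})$ sum to zero, which is the property the conservative time integration relies on. The sketch below checks this with arbitrary process rates. The assignment of the rows to the constituents PS, PD, PEI, PEO, $PA_A$ to $PA_D$ and PZ is inferred from the process names and is an assumption, as are the dictionary layout, the function names and the random test values.

```python
import random

GROUPS = "ABCD"          # the four algae groups


def phosphorus_rhs(r):
    """Phosphorus-carrying rows of q_p(U), keyed by (assumed) constituent."""
    q = {
        "PS": r["exchp"] - sum(r["uptba"]) + r["minpd"] + r["zresp"] + sum(r["resp"]),
        "PD": -r["minpd"] - r["setpd"] + r["zmorp"],
        "PEI": -r["exchp"] + r["minpe"],
        "PEO": -r["minpe"] + r["setpd"]
               + sum(s + g for s, g in zip(r["setpa"], r["gsinkp"])),
        "PZ": -r["zresp"] - r["zmorp"] + sum(r["assimp"]),
    }
    for k, g in enumerate(GROUPS):
        q["PA_" + g] = (r["uptba"][k] - r["resp"][k] - r["setpa"][k]
                        - r["gsinkp"][k] - r["assimp"][k])
    return q


def random_rates(rng):
    """Arbitrary process rates; only exchp is allowed to change sign."""
    scalar = ["minpd", "minpe", "setpd", "zresp", "zmorp"]
    vector = ["uptba", "resp", "setpa", "gsinkp", "assimp"]
    r = {name: rng.uniform(0.0, 1.0) for name in scalar}
    r["exchp"] = rng.uniform(-1.0, 1.0)
    r.update({name: [rng.uniform(0.0, 1.0) for _ in GROUPS] for name in vector})
    return r


rng = random.Random(0)
rhs = phosphorus_rhs(random_rates(rng))
total = sum(rhs.values())
print("sum of phosphorus source terms:", total)
```

Every process appears once as a gain and once as a loss, so the total vanishes up to rounding regardless of the rates chosen.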
Table 2 Description and numerical values for all constants

| Constant | Value | Unit | Description |
|---|---|---|---|
| $F_{min}$ | 0.05 | g/m3 | Min feeding concentration of phytoplankton |
| $F_s$ | 0.25 | g/m3 | Menten feeding constant phytoplankton |
| $GR_{max}$ | 0.09 | 1/d | Max graze rate of zooplankton |
| $R_i$ | 0.18, 1, 1, 1 | – | Preference factor for grazing i |
| $DPS_i$ | 0.027, 0.016, 0.016, 0.018 | g/m3 | Menten constant for i phosphorus uptake |
| $TOP_i$ | 30, 23, 20, 21 | °C | Optimal temperature for growth i |
| $K_1$ | 1.5 | 1/d | Extinction coefficient of water |
| $K_2$ | 1 | 1/d | Extinction coefficient of phytoplankton |
| $LOP_i$ | 310, 340, 350, 310 | Lux | Optimal light radiation for growth i |
| $growthmax_i$ | 3.3, 2.35, 2.43, 2.37 | 1/d | Max growth rate i |
| $RO_i$ | 0.007, 0.003, 0.003, 0.003 | 1/d | Respiration rate i |
| $Kset_i$ | 0.025, 0.016, 0.016, 0.017 | m/d | Velocity of sedimentation i |
| $Pinmin_i$ | 0.015, 0.020, 0.015, 0.015 | g/m3 | Min content of phosphorus in i |
| $PUPmax_i$ | 0.005, 0.005, 0.005, 0.005 | 1/d | Max phosphorus uptake rate of i |
| $Pinmax_i$ | 0.07, 0.1, 0.07, 0.07 | g/m3 | Max content of phosphorus in i |
| $mor$ | 0.0004 | 1/d | Death rate of zooplankton |
| $ZRO$ | 0.03 | 1/d | Respiration rate of zooplankton |
| $ZRM$ | 0.04 | – | Respiration multiplier of zooplankton |
| $K_{ex}$ | 0.03 | 1/d | Phosphorus exchange coefficient |
| $K_{m1}$ | 0.01 | 1/d | Mineralization rate of PD |
| $S_{m1}$ | 0.8 | – | Temp. coeff of mineralization rate PD |
| $K_{m2}$ | 0.178 | 1/d | Mineralization rate of PE |
| $S_{m2}$ | 1.08 | – | Temp. coeff of mineralization rate PE |
| $V_s$ | 0.125 | m/d | Sedimentation rate |
| $rd$ | 0.38 | – | Percentage of organic phosphorus in detritus |
| $utili_i$ | 0.8, 0.8, 0.8, 0.8 | – | Usage of feed material |
| $TCOEF$ | 0.38 | 1/°C | Q10 coefficient |
A temperature-driven mineralization process from sedimented matter to solute phosphorus is introduced. As only the second form of phosphorus can contribute to the phosphorus supply in the water body, this offers a better reproduction of the phosphorus accumulation in the sediment during the winter period. The fixation of phosphorus in the sediment is also an important process from the ecological point of view. Nevertheless, because the pH and the O2 concentration are not available, the fixation is not considered in this model.

The algal biomass taken up by grazing is not assimilated completely. The model by Hongping and Jianyi assumes a complete assimilation. Our model assumes that 80 % of the grazed biomass is gained by the zooplankton biomass and that 20 % is lost.

A critical mathematical and also biological requirement is that all processes (with one exception: exchp) are positive. This is the mathematical formulation of the fact that processes are not reversible. For example, zooplankton grazing cannot produce a mass gain for the grazed algae, and respiration cannot end up with the algae gathering phosphorus from the water body. This should be modeled accordingly. Nevertheless, the original model allows sign changes, for example in $graz_i$, $uptba_i$ or $assim_i$. It also allows uptakes greater than the maximum uptake rate. These flaws are corrected by using equations that follow those proposed in Hongping and Jianyi (2002) but include the proper constraints. An example of this is the determination of $PC_i$.

Since no measured data are available for a comparison with the numerical simulation, the results are valid only under the assumptions included in the model. To demonstrate the applicability of the scheme to the discussed model we show various numerical results. The first test case presents the course of 1 year after the computation of several years for the whole phosphorus cycle model. In Fig. 7, the changes of the phosphorus of one algae group and its biomass together with the solute phosphorus in the water body and the phosphorus in the sediment are presented. The scheme gives reasonable results for the change of the phosphorus content over the seasons, as can be seen in Fig. 7. The phosphorus concentration PEO increases in spring. With rising temperature the bacteria in the sediment start to convert the organically bound phosphorus, which was accumulated over the winter, into inorganic solute phosphorus. Through the diffusive exchange between the solute phosphorus in water and sediment, the solute phosphorus in the water increases and is directly consumed by the starting algae growth BA. At the end of the year, the phosphorus concentrations PEO and PS increase due to the death of the algae BA. PEO is increased by the dead biological mass, PS through lower consumption and the exchange between PEI and PS.

As a second test case, shown in Figs. 8 and 9, solute phosphorus inflow has been examined. It was assumed that all boundaries of the West Lake except the small channels at the south end (lower left part of the figure) and in the north eastern corner (upper right part of the figure) are fixed walls, i.e., no particles can pass these boundaries. The southern boundary is considered the inflow, the north eastern boundary the outflow.
Fig. 7 Course of 1 year for phosphorus in water and sediment with one biomass algae. Left axis: phosphorus in g/m3 (PE_I, PE_O, PS); right axis: biomass of algae in g/m3 (BA); horizontal axis: months January to December.

Fig. 8 Absolute value of the velocity $v_{abs} = \|\mathbf{v}\|$ in the West Lake (axes X and Y from 0 to 4000)
For a clear visualization of the proportions of the current, high inflow velocities have been assumed. One can observe the curvature of the current in the lake after the straight inflow, following the structure of the lake. Then the current splits into a western and an eastern part. The west flowing current circles back into the lake behind the three islands and through the small openings into the nearly cut off western part of the lake. The eastern directed part of the flow pours out of the West Lake.

Fig. 9 Solute phosphorus floating into the West Lake (color scale: PS; axes X and Y from 0 to 4000)

The West Lake was assumed to contain only traces of phosphorus PS at the beginning of the calculation. This allows the undisturbed observation of the distribution of the inflowing phosphorus. One can see clearly how the phosphorus is distributed. High concentrations of phosphorus are found directly along the strong currents, in the inflow area and also in the triangular dead area along the south eastern boundary of the lake. One can also note that, owing to the absence of a strong current into the nearly cut off western part of the lake, the phosphorus enters this part very slowly.
4 Conclusion
A complex phosphorus cycle containing four different groups of algae species and their respective phosphorus content was presented. Furthermore, a higher order finite volume scheme was introduced. Through various numerical results the scheme has proven to be able to solve simultaneously a real-life fluid dynamics problem and a sophisticated system of positive and conservative biological and chemical equations describing the phosphorus cycle. The task is accomplished via a splitting ansatz combining higher order flux solving methods with problem-adapted ordinary differential equation techniques, the extended modified Patankar ansatz. Constructed in this way, the scheme preserves the important characteristics of conservativity and positivity necessary to obtain meaningful numerical results. Thus state-of-the-art modeling and numerical techniques have been successfully combined in the context of eutrophication modeling for a lake.

Acknowledgements First of all, we would like to express our thanks to W. Freeden for initiating this Handbook of Geomathematics and giving us the possibility to participate. Several mathematical parts of this chapter were developed in joint work with H. Burchard and E. Deleersnijder. A. Meister is grateful for this productive cooperation. Furthermore, A. Meister would like to thank Th. Sonar for many years of fruitful collaboration in the field of computational fluid dynamics.
Appendix: Ecological Model

In order to give a deeper insight into the structure of the source term $q_p(\mathbf{u}_s, \mathbf{u}_p)$, we now concentrate on the specific expressions for the biomass of the $i$th algae group:

1. Temperature impact on growth of algae group i: $TT_i = \dfrac{T}{TOP_i}\, e^{\frac{TOP_i - T}{TOP_i}}$.
2. Phosphorus availability factor for algae group i: $PP_i = \dfrac{PS}{DPS_i + PS}$.
3. Light extinction: $K = K_1 + K_2 \sum_j BA_j$.
4. Depth averaged available light intensity: $L = I\, \dfrac{1 - e^{-KH}}{KH}$.
5. Light intensity impact on growth of algae group i: $LL_i = \dfrac{L}{LOP_i}\, e^{\frac{LOP_i - L}{LOP_i}}$.
6. Growth of algae i: $growth_i = growthmax_i \cdot TT_i \cdot PP_i \cdot LL_i \cdot BA_i$.
7. Respiration loss of biomass of algae group i: $res_i = RO_i\, e^{TCOEF \cdot T}\, BA_i$.
8. Sinking, i.e., dying, of biomass of algae group i: $sink_i = BA_i\, \dfrac{Kset_i}{H}$.
9. Total food preference weighted biomass algae: $F = \sum_j R_j BA_j$.
10. Available-food factor for zooplankton: $FF = \begin{cases} 0.0, & F_{min} > F, \\ \dfrac{F - F_{min}}{F_s + F - F_{min}}, & F_{min} \le F. \end{cases}$
11. Grazing of biomass algae i through zooplankton: $graz_i = GR_{max} \cdot FF \cdot \dfrac{R_i BA_i}{F} \cdot BZ$.

The biomass zooplankton dynamics are given by

1. Biomass gain for zooplankton through grazing algae i: $assim_i = utili_i \cdot graz_i$.
2. Respiration of zooplankton: $zres = ZRO\, e^{TCOEF \cdot T}\, BZ + ZRM \sum_j graz_j$.
3. Death rate of zooplankton: $zmor = mor \cdot BZ$.

The phosphorus dynamics are given as:

1. Diffusive exchange between the solute phosphorus in sediment and water: $exchp = K_{ex} (PEI - PS)$.
2. Mineralization of PD: $minpd = K_{m1}\, S_{m1}^{T - 20}\, PD$.
3. Mineralization from PEO to PEI: $minpe = K_{m2}\, S_{m2}^{TE - 20}\, PEO$.
4. Sedimentation of PD: $setpd = \dfrac{PD}{H} (1 - rd)\, V_s$.

The exchange dynamics between phosphorus in living cells and the phosphorus without associated biomass are:

1. Phosphorus fraction from biomass of algae group i: $PCON_i = \dfrac{PA_i}{BA_i}$.
2. Influence of the phosphorus already contained in the biomass of algae i for its growth: $PC_i = \begin{cases} \dots \\ \dfrac{Pinmax_i - PCON_i}{Pinmax_i - Pinmin_i}, & \text{otherwise.} \end{cases}$
3. Phosphorus uptake from algae group i: $uptba_i = PUPmax_i \cdot PC_i \cdot BA_i \cdot PP_i$.
4. Respiration loss of phosphorus of algae i: $resp_i = res_i \cdot PCON_i$.
5. Sinking, i.e., dying, of phosphorus of algae group i: $setpa_i = sink_i \cdot PCON_i$.
6. Phosphorus loss of algae group i through zooplankton grazing: $grazp_i = graz_i \cdot PCON_i$.
7. Biomass loss of algae group i through zooplankton grazing not assimilated by zooplankton: $gsink_i = (1 - utili_i)\, graz_i$.
8. PD gain through zooplankton grazing algae group i: $gsinkp_i = gsink_i \cdot PCON_i$.
9. Phosphorus gain for zooplankton through grazing algae group i: $assimp_i = assim_i \cdot PCON_i$.
10. Phosphorus fraction from biomass of zooplankton: $PCON_z = \dfrac{PZ}{BZ}$.
11. Respiration amount from zooplankton: $zresp = zres \cdot PCON_z$.
12. PZ loss through mortality: $zmorp = zmor \cdot PCON_z$.
13. PZ loss through respiration: $zresp = zres \cdot PCON_z$.
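The limitation factors of the algae growth term translate directly into small functions. The sketch below implements the optimum curve shared by $TT_i$ and $LL_i$, the phosphorus availability factor, the depth averaged light and the available-food factor. The function names are illustrative; the parameters $TOP = 30$, $DPS = 0.027$, $LOP = 310$, $growthmax = 3.3$, $F_{min} = 0.05$ and $F_s = 0.25$ are taken from Table 2, whereas the state values ($T$, $PS$, $BA$, $I$, $K$, $H$) are hypothetical numbers chosen only to exercise the formulas.

```python
import math


def optimum_curve(x, x_opt):
    """(x / x_opt) * exp((x_opt - x) / x_opt), used for TT_i and LL_i."""
    return x / x_opt * math.exp((x_opt - x) / x_opt)


def phosphorus_factor(PS, DPS):
    """Phosphorus availability factor PP_i = PS / (DPS_i + PS)."""
    return PS / (DPS + PS)


def mean_light(I, K, H):
    """Depth averaged light L = I * (1 - exp(-K*H)) / (K*H)."""
    return I * (1.0 - math.exp(-K * H)) / (K * H)


def food_factor(F, Fmin, Fs):
    """Available-food factor FF for zooplankton."""
    return 0.0 if F < Fmin else (F - Fmin) / (Fs + F - Fmin)


def growth(BA, T, PS, L, growthmax, TOP, DPS, LOP):
    """growth_i = growthmax_i * TT_i * PP_i * LL_i * BA_i."""
    return (growthmax * optimum_curve(T, TOP) * phosphorus_factor(PS, DPS)
            * optimum_curve(L, LOP) * BA)


# Hypothetical state: 20 degrees C, PS = 0.002 g/m3, BA = 0.1 g/m3, surface light 600 Lux
L = mean_light(I=600.0, K=1.5, H=2.0)
g = growth(BA=0.1, T=20.0, PS=0.002, L=L,
           growthmax=3.3, TOP=30.0, DPS=0.027, LOP=310.0)
print("depth averaged light:", L)
print("growth rate of group A:", g)
print("food factor below / above threshold:",
      food_factor(0.04, 0.05, 0.25), food_factor(0.30, 0.05, 0.25))
```

The optimum curve attains its maximum value of one at the optimal temperature or light intensity, so the growth term never exceeds $growthmax_i \cdot BA_i$.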
References

Ansorge R, Sonar Th (2009) Mathematical models of fluid dynamics. Wiley-VCH, New York
Audusse E, Bristeau M-O (2005) A well-balanced positivity preserving second-order scheme for shallow water flows on unstructured meshes. J Comput Phys 206:311–333
Barth TJ, Jespersen DC (1989) The design and application of upwind schemes on unstructured meshes. AIAA paper 89-0366
Benz J, Meister A, Zardo PA (2009) A conservative, positivity preserving scheme for advection-diffusion-reaction equations in biochemical applications. In: Tadmor E, Liu J-G, Tzavaras AE (eds) Hyperbolic problems. Proceedings of symposia in applied mathematics. American Mathematical Society, Providence
Berzins M (2001) Modified mass matrices and positivity preservation for hyperbolic and parabolic PDEs. Commun Numer Methods Eng 9:659–666
Broekhuizen N, Rickard GJ, Bruggeman J, Meister A (2008) An improved and generalized second order, unconditionally positive, mass conserving integration scheme for biochemical systems. Appl Numer Math 58:319–340
Bruggeman J, Burchard H, Kooi BW, Sommeijer B (2007) A second-order, unconditionally positive, mass-conserving integration scheme for biochemical systems. Appl Numer Math 57:36–58
Burchard H, Deleersnijder E, Meister A (2003) A high-order conservative Patankar-type discretisation for stiff systems of production-destruction equations. Appl Numer Math 47:1–30
Burchard H, Deleersnijder E, Meister A (2005) Application of modified Patankar schemes to stiff biogeochemical models of the water column. Ocean Dyn 55(3–4):326–337
Burchard H, Bolding K, Kühn W, Meister A, Neumann T, Umlauf L (2006) Description of a flexible and extendable physical-biogeochemical model system for the water column. J Mar Syst 61:180–211
Chertock A, Kurganov A (2008) A second-order positivity preserving central upwind scheme for Chemotaxis and Haptotaxis models. Numer Math 111:169–205
Friedrich O (1993) A new method for generating inner points of triangulations in two dimensions. Comput Methods Appl Mech Eng 104:77–86
Hirsch C (1988a) Numerical computation of internal and external flows, vol 1. Wiley, New York
Hirsch C (1988b) Numerical computation of internal and external flows, vol 2. Wiley, New York
Hongping P, Jianyi M (2002) Study on the algal dynamic model for West Lake, Hangzhou. Ecol Model 148:67–77
Jørgensen SE (1975) A eutrophication model for a lake. Ecol Model 2:147–165
Lampert W, Sommer U (1999) Limnoökologie. Georg Thieme, Stuttgart
LeVeque RJ (1990) Numerical methods for conservation laws. Birkhäuser, Boston
Meister A (1998) Comparison of different Krylov subspace methods embedded in an implicit finite volume scheme for the computation of viscous and inviscid flow fields on unstructured grids. J Comput Phys 140:311–345
Meister A (2003) Viscous flow fields at all speeds: analysis and numerical simulation. J Appl Math Phys 54:1010–1049
Meister A, Oevermann M (1998) An implicit finite volume approach of the k-ε turbulence model on unstructured grids. ZAMM 78(11):743–757
Meister A, Sonar Th (1998) Finite-volume schemes for compressible fluid flow. Surv Math Ind 8:1–36
Meister A, Vömel C (2001) Efficient preconditioning of linear systems arising from the discretization of hyperbolic conservation laws. Adv Comput Math 14(1):49–73
Park RA et al (1974) A generalized model for simulating lake ecosystems. Simulation 21:33–50
Patankar SV (1980) Numerical heat transfer and fluid flows. McGraw-Hill, New York
Poethke H-J (1994) Analysieren, Verstehen und Prognostizieren. PhD thesis, Johannes Gutenberg-Universität Mainz, Mainz
Ricchiuto M, Bollermann A (2009) Stabilized residual distribution for shallow water simulations. J Comput Phys 228(4):1071–1115
Sagehashi M, Sakoda A, Suzuki M (2000) A predictive model of long-term stability after biomanipulation of shallow lakes. Water Res 34(16):4014–4028
Schwoerbel J, Brendelberger H (2005) Einführung in die Limnologie. Elsevier/Spektrum Akademischer, Munich
Smolarkiewicz PK (2006) Multidimensional positive definite advection transport algorithm: an overview. Int J Numer Methods Fluids 50:1123–1144
Sonar Th (1997a) On the construction of essentially non-oscillatory finite volume approximations to hyperbolic conservation laws on general triangulations: polynomial recovery, accuracy, and stencil selection. Comput Methods Appl Mech Eng 140:157–181
Sonar Th (1997b) Mehrdimensionale ENO-Verfahren. Teubner, Stuttgart
Stoker JJ (1957) Water waves. Interscience Publisher, New York
Straškraba M, Gnauk A (1985) Freshwater ecosystems. Elsevier, Amsterdam
Toro EF (1999) Riemann solvers and numerical methods for fluid dynamics. Springer, Berlin
Toro EF (2001) Shock-capturing methods for free-surface shallow flows. Wiley, New York
Vater S (2004) A new projection method for the zero Froude number shallow water equations. Master's thesis, Freie Universität Berlin, Berlin
Vázquez-Cendón M-E (2007) Depth averaged modelling of turbulent shallow water flow with wet-dry fronts. Arch Comput Methods Eng 14(3):303–341
Weather Hangzhou (2008a) Internet, May 23. http://www.chinatoday.com.cn/english/chinatours/hangzhou.htm
Weather Hangzhou (2008b) Internet, May 23. http://www.ilec.or.jp/database/asi/asi-53.html
Weather Hangzhou (2008c) Internet, May 23. http://www.chinatoday.com.cn/english/chinatours/hangzhou.htm
Zardo PA (2005) Konservative und positive Verfahren für autonome gewöhnliche Differentialgleichungssysteme. Master's thesis, University of Kassel
Model-Based Visualization of Instationary Geo-Data with Application to Volcano Ash Data Martin Baumann, Jochen Förstner, Vincent Heuveline, Jonas Kratzke, Sebastian Ritterbusch, Bernhard Vogel, and Heike Vogel
Contents
1 Introduction
2 Concept of Reduced Model for Visualization
2.1 Linear Interpolation Reconstruction
2.2 Model-Based Reconstruction
3 Scenario of Volcano Ash Data
3.1 The Simulation in the Model System COSMO-ART
3.2 Description of the Model Output
4 Low-Fidelity Model for the Dispersion of a Volcano Plume
4.1 Conversion of the Mesh Structure
4.2 The Continuous Model
4.3 Discretization
Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-642-54551-1_87) contains supplementary material, which is available to authorized users.
Martin Baumann, present affiliation: Heidelberg University Computing Centre (URZ), Heidelberg, Germany
Vincent Heuveline, present affiliation: Engineering Mathematics and Computing Lab (EMCL), Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
Jonas Kratzke, present affiliation: Engineering Mathematics and Computing Lab (EMCL), Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
M. Baumann • J. Kratzke • S. Ritterbusch, Engineering Mathematics and Computing Lab (EMCL), Karlsruhe Institute of Technology, Karlsruhe, Germany, e-mail: [email protected]; [email protected]; [email protected]
J. Förstner, German Weather Service (DWD), Offenbach, Germany, e-mail: [email protected]
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_87
4.4 Stability
4.5 Discrete Volcanic Particle Injection Model
4.6 Implementational Aspects
5 Numerical Results
5.1 Qualitative Comparison
5.2 Quantitative Comparison
5.3 Sources of Errors and Extensions
6 Conclusion
References
Abstract
Driven by today's supercomputers, larger and larger sets of data are created during numerical simulations of geoscientific applications. Such data often describes instationary processes in three-dimensional domains in terms of multidimensional data. Due to limited computer resources, it might be impossible or impractical to store all data created during one simulation, which is why several data reduction techniques are often applied (e.g., only every nth time step is stored). Intuitive scientific visualization techniques can help to better understand the structures described by transient data. Adequate reconstruction techniques for the time dimension are needed since standard techniques (e.g., linear interpolation) are insufficient for many applications. We describe a general formalism for a wide class of reconstruction techniques and address aspects of quality characteristics. We propose an approach that is able to take arbitrary physical processes into account to enhance the quality of the reconstruction. For the eruption of the volcano Eyjafjallajökull in Iceland in the spring of 2010, we describe a suitable reduced model and use it for model-based visualization. The original data was created during a COSMO-ART simulation. We discuss the reconstruction errors, related computational costs, and possible extensions. A comparison with linear interpolation clearly motivates the proposed model-based reconstruction approach.
V. Heuveline Engineering Mathematics and Computing Lab (EMCL), Karlsruhe Institute of Technology, Karlsruhe, Germany Institute for Applied and Numerical Mathematics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany University of Heidelberg, Interdisciplinary Center for Scientific Computing, Engineering Mathematics and Computing Lab, Heidelberg, Germany e-mail: [email protected]; [email protected] B. Vogel • H. Vogel Institute for Meteorology and Climate Research (IMK), Karlsruhe Institute of Technology, Eggenstein-Leopoldshafen, Germany e-mail: [email protected]; [email protected]
1 Introduction
Nowadays, geoscientific phenomena such as global warming and the greenhouse effect are of public interest, and much research is done in these fields. Related questions cut across several applied sciences such as meteorology, geochemistry, and geomorphology, amongst others. Therefore, a great diversity of models and corresponding solution procedures is used, and large data sets are typically created, often several gigabytes of data or considerably more. The ever-growing computing power available to researchers allows for more complex models, higher accuracy in the computations, and also the creation of more data. Most often, the analysis of such data requires a complex workflow, and gaining an intuitive and deep understanding of the implicitly described features of interest is non-trivial. Tailored imaging tools can facilitate this step of cognition by filtering existing data and displaying only the important subsets with adequate techniques of scientific visualization (Bonneau et al. 2006). Most often, the investigated data is multi-dimensional (i.e., multiple properties such as temperature and wind are given) and defined on a three-dimensional domain. Transient physical processes are typically represented by a description of the spatial state at a selection of points in time within the time horizon. A common approach to keep the total amount of data economically justifiable is to store the state of the system only at a few points in time, e.g., one description per hour. By this data reduction, a large portion of the data is eliminated irrecoverably unless the same calculations are repeated. If the time evolution of some process is described implicitly by a data sequence, the corresponding time increments and number of time steps must fit the timescale of the inspected feature.
If more data is needed (e.g., for comprehensible animated visualizations), additional data sets must be reconstructed at intermediate time steps. Simple reconstruction methods are based on interpolation techniques such as polynomial or spline interpolation. In this case, the interpolation is calculated based on the information given in the data only. In Fig. 1, an example of a rotating particle is sketched. The linear interpolation of the particle's trajectory shown in the middle panel is piecewise linear and only a very rough approximation of the physically correct, circular trajectory (see left panel). This is due to a very coarse time resolution consisting of three time steps during one circular motion. In this article, we describe an approach to data reconstruction that incorporates a physical model in addition to the available data. By considering such a model, a priori knowledge can be exploited, in contrast to pure interpolation techniques. In the right panel of Fig. 1, the reconstructed trajectory of the particle assuming a reduced model is shown. The trajectory is almost circular, but the end positions of the reconstructions indicated by green circles are not located at the given particle positions (red circle). This gap between the reconstructed final state and the given data arises from the fact that the deployed reduced model is only an approximation of the original model. This discrepancy can be reduced by different techniques which will be addressed later.

Fig. 1 Trajectory of a particle (red circle) and positions at time $t = 0, 1, 2, 3$ (left), the linear interpolation of the trajectory based on the three positions at points in time $t = 0, 1, 2, 3$ (middle), and the trajectory calculated using an efficient, simplified model (right)

While standard interpolation techniques can be applied universally, the model-based reconstruction approach requires an adequate model. The quality and also the related computational costs of the reconstruction can be controlled by the specific choice of this model. We describe the general reconstruction approach based on models that are given by means of partial differential equations. Many phenomena in the geosciences can be modeled in terms of instationary partial differential equations, which makes this approach quite universal. High quality can be achieved only if the applied reduced physical model is adequate for the phenomenon included in the data. As a proof of concept, we investigate the scenario of the eruption of the volcano Eyjafjallajökull in Iceland in April 2010.
A large amount of volcanic ash was injected into the atmosphere and was transported rapidly towards Europe. High-fidelity simulations of this scenario were calculated with the online coupled model system COSMO-ART (Vogel et al. 2009), which will be described in Sect. 3. During the numerical simulation run, data files containing the wind and the distribution of six different ash species were stored in 1-h steps. We propose a simplified physical model for the reconstruction of the evolution of the ash distributions between these existing data sets. In Sect. 4 we describe the details of the applied reduced model and the simplifications made. With this system the reconstruction can be computed very efficiently on a desktop computer. The results discussed in Sect. 5 clearly motivate the proposed approach of a problem-dependent reconstruction model.
2 Concept of Reduced Model for Visualization
In this section, we give an abstract description of the data reconstruction task for visualization of instationary data and discuss related quality characteristics. We demonstrate that the standard approach of linear interpolation fits into this formalism and motivate the use of a reconstruction approach that makes use of a physical model in addition to the given data.
We assume that a physical process denoted by $u : [0, T] \to X$ should be visualized for the interval $[0, T]$. Here, $X$ denotes some arbitrary space in which the state of the system can be described (e.g., $X = \mathbb{R}$ in case of a scalar-valued solution such as temperature). The physical model that exactly describes the aforementioned process is denoted by $F$. In this case, the exact model $F$, the initial state $u(0)$, and the solution $u$ fulfill the following relation:

\[ F[u(0)](t) = u(t) \quad \forall t \in [0, T]. \tag{1} \]

This equation states that model $F$ propagates the state of the system from $u(0)$ at time $t = 0$ through time such that at time $t$ the determined state equals $u(t)$. A very general description of an approximative model can be given in a similar fashion. The approximative model $\Phi$ fulfills the relation

\[ \Phi(t) \approx u(t) \quad \forall t \in [0, T]. \tag{2} \]
The definition of the approximative model $\Phi$ can include several parameters (e.g., states $u(t_i)$ at some points in time $t_i$, physical parameters such as a viscosity parameter, and so forth). The approximative model $\Phi$ will be exploited as an interpolation between known states of the exact model $F$. The usage of a physical model equation to determine the interpolation motivates the term model-based reconstruction. Important aspects of an approximation are its accuracy, robustness and stability, parameter dependence, coupling, and computational effort. We denote the interpolation operator for the solution $u$ between time steps $t_{i-1}$ and $t_i$ depending on states $u(t_{i-1})$ and $u(t_i)$ by $\Phi[t_{i-1}, t_i, u(t_{i-1}), u(t_i)](t)$. Then, the accuracy of the approximative model can be analyzed using a suitable norm of the deviation $\|\Phi[t_{i-1}, t_i, u(t_{i-1}), u(t_i)] - u\|$. If the solution is only available for specific states, a variant is to analyze at interpolated states such as $\|\Phi[t_{i-1}, t_{i+1}, u(t_{i-1}), u(t_{i+1})](t_i) - u(t_i)\|$. Since the approximate models are iterated for each interpolation interval, for example in $[t_{i-1}, t_i)$, additional continuity conditions such as $\Phi[t_{i-1}, t_i, u(t_{i-1}), u(t_i)](t_{i-1}) = u(t_{i-1})$ and $\Phi[t_{i-1}, t_i, u(t_{i-1}), u(t_i)](t_i) = u(t_i)$ seem desirable, but should not be overrated, since they are always achievable using simple interpolation schemes that can decrease the overall accuracy. This motivates the use of approximation schemes based on mathematical models for a simplified physical model, which are expected to outperform general interpolation concepts. This expectation rests on providing more information to the visualization than traditional approaches that only exploit data states. This additional information is given by the simplified physical model, which now links the visualization to the numerical simulation. The aspect of robustness is mostly defined by the numerical method and numerical parameters chosen for the approximation.
We expect the approximative model to be solvable and stable for any given valid or slightly disturbed state data, yielding valid results. Instationary boundary conditions or other external influencing parameters might need special treatment to improve the robustness of the computed approximative model, as will be discussed in the following text. The approximative models need state information for the computation. For simplified physical models, we generally expect that only partial information of the full state is needed. This data austerity decreases the amount of data that is to be managed for visualization, but in general we cannot expect continuity if the final state information remains unused. This approach is used for fast forward schemes. This is no great loss, since the results can be amended using simple linear interpolation schemes late in the interval. Commonly, the amount of coupling within the interpolation scheme is an important issue for the computational effort, and especially for a potential speed-up using parallelization. While a linear interpolation can indeed provide a coupling from the starting state to the end, this is of course a trivial approach only applicable to slowly changing phenomena, as we will see in the following, where we give examples of models that are covered by this abstract formulation.
2.1 Linear Interpolation Reconstruction
A reconstruction based on linear interpolation is given by

\[ \Phi_{\mathrm{lin.intp.}}[t_{i-1}, t_i, u(t_{i-1}), u(t_i)](t) := u(t_{i-1}) + \frac{u(t_i) - u(t_{i-1})}{t_i - t_{i-1}}\,(t - t_{i-1}). \tag{3} \]
This function describes for any $t \in [t_{i-1}, t_i]$ the linear interpolation between the two states $u(t_{i-1})$ and $u(t_i)$. By definition, it is guaranteed that the reconstructed data matches the two states at the corresponding points in time. In the following, we will sometimes omit the arguments in square brackets for better readability. As illustrated by Fig. 2, the linear interpolation yields acceptable results for slowly changing phenomena, but does not perform well in general. This limits the use of this general approach to either slowly changing data or an amendment of near-accurate results, as we expect to gain from approximate models.

Fig. 2 Illustration of the effect of linear interpolation on isosurfaces for scalar data. The correct physical process moves between the blue states, with the intermediate state shown in red. The green state denotes the interpolated intermediate state
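The behavior of Eq. (3) for a fast process can be illustrated with a few lines of Python. The following is only a sketch: the rotating particle mirrors the introductory example, while the function names `phi_lin` and `circle` and the choice of three snapshots per revolution are assumptions made for this illustration. It also evaluates the leave-one-out deviation $\|\Phi[t_{i-1}, t_{i+1}, \ldots](t_i) - u(t_i)\|$ mentioned in Sect. 2.

```python
import math

def phi_lin(t_a, t_b, u_a, u_b, t):
    """Linear interpolation between state u_a (at t_a) and u_b (at t_b), cf. Eq. (3)."""
    w = (t - t_a) / (t_b - t_a)
    return tuple(a + w * (b - a) for a, b in zip(u_a, u_b))

def circle(t, period=3.0):
    """Hypothetical exact process: a particle moving on the unit circle."""
    angle = 2.0 * math.pi * t / period
    return (math.cos(angle), math.sin(angle))

# The interpolation reproduces the stored state at the interval start exactly.
start = phi_lin(0.0, 1.0, circle(0.0), circle(1.0), 0.0)

# Halfway between two snapshots the interpolated particle leaves the circle.
mid = phi_lin(0.0, 1.0, circle(0.0), circle(1.0), 0.5)
radius_mid = math.hypot(*mid)  # 0.5 instead of 1 for three snapshots per revolution

# Leave-one-out deviation: reconstruct t = 1 from the snapshots at t = 0 and t = 2.
guess = phi_lin(0.0, 2.0, circle(0.0), circle(2.0), 1.0)
loo_error = math.dist(guess, circle(1.0))
```

With snapshots this coarse, the interpolated position at mid-interval lies at half the true radius, which is the kind of distortion sketched in the middle panel of Fig. 1.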
2.2 Model-Based Reconstruction
Many physical processes can be considered as dynamical systems, described by means of an initial state and an evolution law, i.e., initial-value problems. Often such evolution laws are given by partial differential equations (PDE). In that case, the problem formulation on a spatial domain $\Omega$ and time interval $[t_{\mathrm{start}}, t_{\mathrm{end}}]$ with $t_{\mathrm{start}} < t_{\mathrm{end}}$ has the form

\[ \begin{aligned} F(u(t, x)) &= f(t, x), & (t, x) &\in [t_{\mathrm{start}}, t_{\mathrm{end}}] \times \Omega, \\ u(t_{\mathrm{start}}, x) &= U(x), & x &\in \Omega, \end{aligned} \tag{4} \]

with additional boundary conditions. The differential operator $F$, the external force term $f$, and the initial state $U$ are defined according to the considered scenario. We assume the problem to be well-posed and apply a discretization method by means of a time-stepping scheme, given on a partitioning $t_{\mathrm{start}} = t^{(0)} < t^{(1)} < \cdots < t^{(N)} = t_{\mathrm{end}}$ of the time interval. In that case, an approximation $u^{(i)}$ of the solution at time $t^{(i)}$ can be calculated successively by means of a corresponding solution operator $A^{(i)}$ for any $i = 1, \ldots, N$:

\[ u^{(0)} := U, \qquad u^{(i)} := A^{(i)}(u^{(i-1)}) \quad \forall i = 1, \ldots, N. \tag{5} \]
This is a standard approach for the solution of a parabolic problem and is typically combined with a discretization in space by means of a finite element, finite difference, or finite volume method. This concept is deployed in model-based reconstruction where for any point in time $t^{(i)} \in [t_{\mathrm{start}}, t_{\mathrm{end}}]$, the reconstructed state of the process is given by

\[ \Phi_{F,f}[U](t^{(i)}) := u^{(i)}, \tag{6} \]

for any $i = 1, \ldots, N$. The sequence of approximations depends only on the initial state $U$ and the solution operator, which does not take any future states into account. Therefore, this reconstruction approach does not ensure that a given final state of the reconstruction interval is reached. For any two states corresponding to the points in time $t_{i-1}$ and $t_i$ of some given data, a partitioning can be inscribed and approximate solutions can be calculated as previously described. Since the underlying PDE and its discretization can be arbitrarily chosen, this reconstruction approach is very generic and can be applied to many problems. In the next section, we will give an example based on the convection-diffusion problem.
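The successive application of a solution operator in Eq. (5) can be sketched for the rotating particle of Fig. 1. This is a minimal illustration under stated assumptions: the reduced model (explicit Euler for a rigid rotation), the helper names `reconstruct` and `make_rotation_step`, and the number of sub-steps are hypothetical choices, not taken from the chapter.

```python
import math

def reconstruct(U, step, n_steps):
    """Return [u^(0), ..., u^(N)] with u^(0) = U and u^(i) = step(u^(i-1)), cf. Eq. (5)."""
    states = [U]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states

def make_rotation_step(omega, dt):
    """Hypothetical reduced model: explicit Euler for x' = -omega * y, y' = omega * x."""
    def step(u):
        x, y = u
        return (x - dt * omega * y, y + dt * omega * x)
    return step

omega = 2.0 * math.pi / 3.0   # one revolution in three time units
N = 100                       # sub-steps between two stored snapshots
states = reconstruct((1.0, 0.0), make_rotation_step(omega, 1.0 / N), N)

# Only the initial state enters the reconstruction, so the final reconstructed
# state does not coincide exactly with the stored snapshot at t = 1.
given_next = (math.cos(omega), math.sin(omega))
gap = math.dist(states[-1], given_next)
```

The remaining gap is small but nonzero here, which corresponds to the offset between the green and red circles in the right panel of Fig. 1.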
3 Scenario of Volcano Ash Data
In this section, we present the scenario of the eruption of the volcano Eyjafjallajökull in Iceland in April 2010. The details related to the model system that was used to calculate the high-fidelity simulation are given. Subsequently, the resulting output data of the volcanic ash that is the starting point of the model-based reconstruction is described.
3.1 The Simulation in the Model System COSMO-ART
The COSMO model is the operational weather forecast model of the German Weather Service DWD (Deutscher Wetterdienst). It is a non-hydrostatic regional model and is based on the thermo-hydrodynamical equations describing compressible flow in a moist atmosphere. Details about the dynamical core and the numerical scheme can be found in Steppeler et al. (2003) and Baldauf et al. (2011). COSMO-ART (Vogel et al. 2009; Bangert et al. 2012) is an extension of COSMO, where ART stands for Aerosols and Reactive Trace gases. It is a comprehensive model system to simulate the spatial and temporal distributions of reactive gaseous and particulate matter. The model system is mainly used to quantify the feedback processes between aerosols and the state of the atmosphere on the continental to the regional scale with two-way interactions between different atmospheric processes. The model system treats secondary aerosols as well as directly emitted components like soot, mineral dust, sea salt, volcanic ash, and biological material. Secondary aerosol particles are formed from the gas phase. Therefore, a complete gas phase mechanism is included. Modules for the emissions of biogenic precursors of aerosols, mineral dust, sea salt, biomass burning aerosol, and pollen grains are included. For the treatment of secondary organic aerosol (SOA) chemistry, the volatility basis set (VBS) was included (Athanasopoulou et al. 2013). Wet scavenging and in-cloud chemistry are taken into account (Knote and Brunner 2013). Processes such as emissions, coagulation, condensation (including the explicit treatment of soot aging), deposition, washout, and sedimentation are taken into account. In order to simulate the interaction of the aerosol particles with radiation and the feedback of this process with the atmospheric variables, the optical properties of the simulated particles are parameterized based on detailed Mie calculations.
New methods to efficiently calculate the photolysis frequencies and the radiative fluxes based on the actual aerosol load were developed based on the GRAALS radiation scheme (Ritter and Geleyn 1992) and were implemented in COSMO-ART. To simulate the impact of the various aerosol particles on the cloud microphysics and precipitation, COSMO-ART was coupled with the two-moment cloud microphysics scheme of Seifert and Beheng (2006) by using comprehensive parameterizations for aerosol activation and ice nucleation.
The advantage of COSMO-ART with respect to other models is that identical numerical schemes and parameterizations are used for identical physical processes such as advection and turbulent diffusion. This avoids truncation errors and model inconsistencies. COSMO is verified operationally by DWD. The model system can be embedded by one-way nesting into individual global scale models such as the GME model or the IFS model. All components of the model system are coupled online with time steps on the order of tens of seconds. Nesting of COSMO-ART within COSMO-ART is possible. Typical horizontal grid sizes vary between 2.8 and 28 km. For the simulation of the volcanic ash, the model domain is consistent with the domain covered by the operational weather forecast of Deutscher Wetterdienst for Europe. This means 665 × 657 × 40 grid points. The horizontal resolution is 0.0625°; the vertical resolution ranges from 20 m close to the surface to several hundred meters at the top of the domain at 20 km height. The time step is 40 s. The reference simulation was performed for 120 h. The volcano emissions were represented by 6 classes of particles with a diameter between 1 and 30 μm. Details about the parameterization of the source height and the source strength can be found in Vogel et al. (2013). Sinks for the ash particles are wet and dry deposition as well as sedimentation. The initial and boundary conditions for the meteorological variables were taken from the operational runs of the GME. Since the output of one time step is on the order of 1 GB, it is restricted to hourly output. The numerical simulation of this scenario was calculated on the HP XC3000 computer system hosted at the Steinbuch Centre for Computing (SCC) at the Karlsruhe Institute of Technology (KIT). On this machine, the calculation using 64 CPUs (Intel Xeon Processor E5540, 2.53 GHz, quad-core) takes about 16 h.
3.2 Description of the Model Output
The output data of the COSMO-ART simulation that is stored for visualization purposes contains six ash particle concentrations and the wind field

\[ \rho : \Omega \times I \to \mathbb{R}^6_+, \qquad \vec{v} : \Omega \times I \to \mathbb{R}^3, \tag{7} \]

given in the atmospheric domain $\Omega$ over a time period of 5 days $I = [0, 120]$ in units of hours. There is one snapshot given at $t_n = n$ $(n = 0, \ldots, 120)$, with a 1-h stepping. The atmospheric domain is given as a discrete grid in a geographical coordinate system in units of longitudinal and latitudinal degrees from 20°00′0″S, 18°00′0″W to 21°00′0″N, 23°30′0″E. The height above sea level is given in pressure levels. For values near the earth's surface, the pressure levels are aligned to the orography. The ash and wind fields are given on this grid in the GRIB data format (Wor 2003). Figure 3 shows a visualization of the ash plume developing over Europe. The evaluation and interpretation of the model output is usually done using two-dimensional horizontal or vertical cross sections. However, due to the huge amount of data and due to the time dependency of the atmospheric processes, only a small fraction of the data can actually be inspected. This limits the understanding of the interaction of the atmospheric processes with the ash plume. With three-dimensional visualizations as shown in Fig. 3, complex spatial structures can be grasped intuitively. For instationary processes, new methods for displaying the data within a reasonable time frame are urgently needed. Such a method for the reconstruction of the time evolution is described in the following.

Fig. 3 Snapshots of the volcano ash cloud after 1, 2, 3, and 4 days of development. Visualization of the COSMO-ART simulation data using ParaView (Henderson 2007). Ash particle concentrations are represented by isosurfaces, the wind field as arrows at an average height of 4.8 km; the vertical axis is scaled by a factor of 75
4 Low-Fidelity Model for the Dispersion of a Volcano Plume
In the previous section, a model for the evolution and dispersion of the Eyjafjallajökull plume was presented. The density distributions of different particle species and the wind fields calculated using this model were exported into files, one file per hour. In the following, we present a low-fidelity model for the reconstruction of the ash plume dispersion from this hourly data. While in the introductory example (see Fig. 1) the trajectory of one single particle was reconstructed, we are interested in the distribution of the particle densities and apply a partial differential equation to describe its development. Firstly, we describe the structure of the COSMO-ART output data, which is the starting point of the model-based reconstruction. Subsequently, we give details of this method related to this scenario, including the physical model for the dominating processes.
4.1 Conversion of the Mesh Structure
For simplicity, we transform the grid structure described in Sect. 3.2 into a structured grid with rectilinear Cartesian coordinates. The horizontal components of the geographical coordinates $\varphi_{ij}$ in units of degrees are converted to plane Cartesian coordinates $\vec{x}_{ij} := R_{\mathrm{earth}}\,\pi\,\varphi_{ij}/180^\circ$ in units of kilometers. Regarding the vertical dimension, each pressure level $k$ is assigned to its average height $z(k)$, see Fig. 4. This results in the following discrete representation $\hat{\Omega}$ of the domain $\Omega = [0, 4610] \times [0, 4550] \times [0, 22.2]$ in units of kilometers:

\[ \hat{\Omega} := \bigcup_{k=0}^{39} \Omega_k, \qquad \Omega_k := \{ (x, y, z(k)) : (x, y) = \vec{x}_{ij},\ i \in \{0, \ldots, 664\},\ j \in \{0, \ldots, 656\} \}. \]
We use a corresponding data file format of the Visualization Toolkit (VTK) project (Avila 2004) since this can easily be opened in many standard visualization tools.
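The conversion from degrees to plane coordinates described above amounts to an arc-length computation and can be checked with a short sketch. The Earth radius of 6371 km and the function names are assumptions for this illustration; the number of grid points and the spacing of 0.0625° are taken from Sect. 3.1.

```python
import math

R_EARTH_KM = 6371.0  # mean Earth radius; assumed value, not stated in the text

def degrees_to_km(angle_deg, radius_km=R_EARTH_KM):
    """Arc length on a sphere of the given radius for an angle in degrees."""
    return radius_km * math.pi * angle_deg / 180.0

def grid_coordinates(n_points, spacing_deg):
    """Plane coordinates (km) of n_points equally spaced grid nodes, starting at 0."""
    return [degrees_to_km(i * spacing_deg) for i in range(n_points)]

x = grid_coordinates(665, 0.0625)  # first horizontal direction
y = grid_coordinates(657, 0.0625)  # second horizontal direction
extent_x, extent_y = x[-1], y[-1]  # both close to 4600 km
```

Under these assumptions the extents agree with the stated domain of roughly 4610 km by 4550 km up to the rounding of the radius.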
4.2 The Continuous Model
Fig. 4 Average height above sea level of each pressure level

Fig. 5 Magnitude of horizontal wind (left) and vertical wind (right) in units of m/s at pressure level 20 with an average height of 4.3 km on April 18, 2010, at 10 pm

In this section, we describe a simplified continuous model that we use in the following to reconstruct the motion of the volcano ash distribution. This model is represented by a parabolic partial differential equation capturing effects of advection and diffusion. One major simplification is that the three-dimensional domain $\Omega$ is replaced by a set of horizontal slices $\Omega_k$ which are regarded independently. This makes it possible to calculate the additional snapshots very efficiently on a workstation computer instead of a high-performance parallel computer. The dispersion of the volcano plume is mainly driven by advection due to the wind. Assuming small vertical wind, the reduced model accounts for the horizontal wind only. Figure 5 shows the horizontal and vertical winds on a representative vertical layer. The horizontal wind has values of more than 10 m/s almost everywhere, in some regions even about 60 m/s, while the vertical wind is comparably small. The different scales of the horizontal and vertical wind motivate neglecting the vertical wind component and regarding the ash dispersion in independent horizontal layers. As a consequence, the influence of gravity is ignored, which will be reflected in the numerical results as described later. The reduced model contains artificial diffusion, which is included not only to represent molecular diffusion, but also to represent mixing effects due to physical processes that are not resolved, such as turbulence and the omitted vertical advection. From a numerical point of view, the problem is more stable due to the higher diffusion. It must be noted that the correct level of diffusion is not known a priori. Instead, it is a model parameter on which the approximation quality and also the related computational costs depend. For the numerical tests described later, we determined a good choice for this parameter by the solution of an optimization problem. The volcano as the only source of particles deserves particular attention. It spreads ash particles into the atmosphere continuously in time, which is described indirectly by the given COSMO-ART output data.
For the reduced model, the effect of the volcano eruption should be considered. On the one hand, the reconstructed ash distributions should correspond to the given data as closely as possible; on the other hand, the particle distribution should be governed by the model equation. One way to include this effect in the reduced model would be by means of a source term for the ash. Since no additional information related to the volcano should be used for the reconstruction, we include a localized interpolation and smoothing step into the discrete model. The resulting reduced model including effects of advection and diffusion and an abstract force term is stated in the following. The evolution of the vector of six ash particle species, denoted by $\rho$, in each horizontal layer is described by a two-dimensional convection-diffusion problem. For the reconstruction in the time interval $I_n$, the particle density $\rho_n$ is initialized by the respective snapshot at time $t_n$. The partial differential equation for each horizontal level $k$ and time interval $I_n$ has the form:

\[ \begin{aligned} \partial_t \rho + \hat{\vec{v}} \cdot \nabla \rho - \nu \Delta \rho &= f && \text{in } \Omega_k \times I_n, \\ \rho &= 0 && \text{on } \partial\Omega_k \times I_n, \\ \rho(t_n) &= \rho_n && \text{in } \Omega_k, \end{aligned} \tag{8} \]

with the artificial viscosity $\nu$. The zero boundary conditions can be justified by the vanishing ash densities at the domain boundary at all times in the COSMO-ART data. The wind field $\hat{\vec{v}}$ is calculated from the snapshots by linear interpolation in time

\[ \hat{\vec{v}}(t) = \Phi_{\mathrm{lin.intp.}}[t_n, t_{n+1}, \vec{v}(t_n), \vec{v}(t_{n+1})](t). \tag{9} \]

4.3 Discretization
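In this section, a standard finite difference discretization of problem (8) based on the explicit Euler scheme is described. The resulting algorithm allows for efficient calculation of the particle concentrations at intermediate points in time between $t_n$ and $t_{n+1}$. The numerical scheme is easy to implement and leads to a fast algorithm. Details can be found in Hindmarsh et al. (1984). The Laplace operator and the convection term are each approximated by a central difference quotient. The time derivative is approximated by a forward difference quotient. For a given time step size $\delta t > 0$, the following scheme has to be solved for each point in time $\tau_m^n = t_n + m\,\delta t$ with $m = 1, 2, \ldots, M$, where $M := (t_{n+1} - t_n)/\delta t$:

\[
\begin{aligned}
\tilde\rho_{\mathrm{lfm}}(\vec x_{i,j}, \tau_{m+1}^n) = {} & \rho_{\mathrm{lfm}}(\vec x_{i,j}, \tau_m^n) \\
& - \frac{\delta t}{2h}\, \hat v_1(\vec x_{i,j}, \tau_m^n) \big[ \rho_{\mathrm{lfm}}(\vec x_{i+1,j}, \tau_m^n) - \rho_{\mathrm{lfm}}(\vec x_{i-1,j}, \tau_m^n) \big] \\
& - \frac{\delta t}{2h}\, \hat v_2(\vec x_{i,j}, \tau_m^n) \big[ \rho_{\mathrm{lfm}}(\vec x_{i,j+1}, \tau_m^n) - \rho_{\mathrm{lfm}}(\vec x_{i,j-1}, \tau_m^n) \big] \\
& + \frac{\varepsilon\,\delta t}{h^2} \big[ \rho_{\mathrm{lfm}}(\vec x_{i\pm1,j}, \tau_m^n) + \rho_{\mathrm{lfm}}(\vec x_{i,j\pm1}, \tau_m^n) - 4\rho_{\mathrm{lfm}}(\vec x_{i,j}, \tau_m^n) \big].
\end{aligned}
\tag{10}
\]

Here $h$ denotes the grid spacing, and a term with index $i \pm 1$ or $j \pm 1$ stands for the sum of the values at both neighboring nodes.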
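The update in Eq. (10) can be illustrated with a short NumPy sketch. It is not the authors' C++ implementation: the function name `euler_step`, the array layout `[i, j]`, and the usual central-difference factor 1/(2h) for the convection term are assumptions of this example, and the force term is taken to be zero.

```python
import numpy as np

def euler_step(rho, v1, v2, eps, h, dt):
    """One explicit Euler step of the convection-diffusion scheme with f = 0.

    rho, v1, v2 are 2D arrays indexed [i, j] on a uniform grid with spacing h.
    Boundary nodes are kept at zero, as in the zero boundary condition.
    """
    new = np.zeros_like(rho)
    c = rho[1:-1, 1:-1]
    # Central difference quotients for the convection term
    ddx = (rho[2:, 1:-1] - rho[:-2, 1:-1]) / (2.0 * h)
    ddy = (rho[1:-1, 2:] - rho[1:-1, :-2]) / (2.0 * h)
    # Five-point stencil for the Laplace operator
    lap = (rho[2:, 1:-1] + rho[:-2, 1:-1]
           + rho[1:-1, 2:] + rho[1:-1, :-2] - 4.0 * c) / h**2
    new[1:-1, 1:-1] = c + dt * (-v1[1:-1, 1:-1] * ddx
                                - v2[1:-1, 1:-1] * ddy
                                + eps * lap)
    return new
```

With zero wind, a single unit peak in the interior loses the fraction 4 eps dt / h^2 of its value to its four neighbors in one step, which is a quick sanity check for the diffusion part.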
No force term is considered in this scheme, i.e., $f \equiv 0$, since the effect of the volcano eruption is accounted for in a subsequent interpolation and smoothing step described later. The densities at all boundary nodes are fixed to zero and are not changed at any stage of the numerical simulation.
4.4
Stability
The stability of the numerical scheme (10), i.e., the explicit Euler method for problem (8), can be studied by means of a von Neumann analysis, see e.g., Hindmarsh et al. (1984). Two stability conditions restricting the time step size can be obtained:

\[
\delta t \le \frac{h^2}{4\varepsilon} \quad \text{and} \quad \delta t \le \frac{4\varepsilon}{\|\vec v\|^2}. \tag{11}
\]
Although these conditions guarantee stability of the discrete solution, they do not guarantee that the ash concentrations remain non-negative over the simulation time. Unphysical negative values may occur if the Péclet number $P = \|\vec v\| h / (2\varepsilon)$ is greater than one. $P$ does not depend on the time step size $\delta t$, but on the velocity field $\vec v$ and on the grid spacing $h$, which in our case should not be changed (e.g., by grid refinement). Therefore, the Péclet number can only be changed by means of the viscosity $\varepsilon$. A viscosity high enough to guarantee a Péclet number smaller than one would lead to a very strong mixing effect. This mixing would be much stronger than needed and would lead to nonphysical, overemphasized smoothing of the particle densities. Therefore, we apply a post-processing procedure in which negative particle concentration values are set to zero.
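As a hedged illustration of the two time step restrictions, the Péclet number, and the clipping of negative values, consider the following sketch. The helper names are hypothetical, the norm of the wind is taken as its maximum over the grid (the text does not specify this), and the grid spacing of 5 km is an invented example value.

```python
import numpy as np

def stable_time_step(eps, h, v1, v2):
    """Largest time step allowed by both stability conditions."""
    vmax2 = float(np.max(v1**2 + v2**2))   # max of |v|^2 over the grid (assumption)
    dt_diffusion = h**2 / (4.0 * eps)
    dt_advection = 4.0 * eps / vmax2 if vmax2 > 0.0 else np.inf
    return min(dt_diffusion, dt_advection)

def peclet_number(eps, h, v1, v2):
    """Cell Peclet number |v| h / (2 eps), using the maximum wind speed."""
    vmax = float(np.sqrt(np.max(v1**2 + v2**2)))
    return vmax * h / (2.0 * eps)

def clip_negative(rho):
    """Post-processing: raise negative concentrations to zero."""
    return np.maximum(rho, 0.0)

# Hypothetical values: eps in km^2/s, h in km, wind in km/s
v1 = np.full((4, 4), 0.06)
v2 = np.zeros((4, 4))
dt = stable_time_step(0.005, 5.0, v1, v2)
P = peclet_number(0.005, 5.0, v1, v2)
rho = clip_negative(np.array([[-0.1, 0.2], [0.0, -1e-3]]))
```

For these example values the advective condition is the restrictive one, and the Péclet number is far above one, which is the situation in which the clipping step becomes necessary.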
4.5
Discrete Volcanic Particle Injection Model
We mimic the source of ash due to the volcano eruption by adopting the time-interpolated particle concentrations in each time step within a small neighborhood of the volcano. This interpolated data is integrated into the discrete solution calculated by the low-fidelity model with a smooth spatial transition. In the horizontal layer $k$, the linear interpolation in time of the particle concentration at the volcano's position $\vec x_V^k \in \Omega_k$ is given by

\[
\hat\rho(\vec x_V, t) = \Phi_{\text{lin.intp.}}\big[t_n, t_{n+1}; \rho(t_n), \rho(t_{n+1})\big](\vec x_V, t). \tag{12}
\]
Setting the interpolated particle concentration only at the volcano's position would lead to high gradients and numerical oscillations in the surrounding domain. For a smooth and stabilizing transition, we use a weighted linear combination of $\hat\rho$ and the particle concentration $\tilde\rho_{\mathrm{lfm}}$ determined by the low-fidelity model. The weighting coefficients $w_{ij}$ represent a discretized Gaussian bell function on an $11 \times 11$ stencil. This stencil is located at the volcano position and covers a sub-domain denoted by $\tilde\Omega_k$. At the grid points $\vec x_{ij} \in \tilde\Omega_k$, the weights are defined by $w_{ij} := \exp\!\big(-\tfrac{8}{47} |\vec x_{ij} - \vec x_V|^2\big) \in [0, 1]$ and tend to zero at the boundary of the stencil, see Fig. 6. In each time step, after the solution has been updated according to Eq. (10), this solution is modified by

\[
\rho_{\mathrm{lfm}}(\vec x_{ij}, \tau_m^n) := w_{ij}\, \hat\rho(\vec x_{ij}, \tau_m^n) + (1 - w_{ij})\, \tilde\rho_{\mathrm{lfm}}(\vec x_{ij}, \tau_m^n) \quad \text{for } \vec x_{ij} \in \tilde\Omega_k. \tag{13}
\]

Fig. 6 Discrete Gaussian used as interpolation coefficients, represented as a stencil of size $11 \times 11$. Each pixel refers to one grid point in the discrete domain $\tilde\Omega_k$

4.6
Implementational Aspects
We implemented a discrete model of the previously described scenario in C++. The conversion of the original COSMO-ART data from the GRIB format to a structured VTK format was done in a preprocessing step. The workflow for the reconstruction of the particle concentration between any two successive time steps $t_n$ and $t_{n+1}$ is listed in Algorithm 1.

Algorithm 1 Workflow of the implementation
  Read initial snapshot $\rho(t_n)$ using VTK library
  Determine highest stable time step size due to Eq. (11)
  Set up stencil for discrete scheme
  for $m = 1, \ldots, N$ do
    Update ash particle concentration $\rho_{\mathrm{lfm}}(\tau_m^n)$ in every horizontal layer
    if $m \bmod N_{\mathrm{vis}} = 0$ then
      Visualize $\rho_{\mathrm{lfm}}(\tau_m^n)$
    end if
  end for
  Calculate error $\rho_{\mathrm{lfm}}(\tau_N^n) - \rho(t_{n+1})$ if required
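The update step of Algorithm 1 consists of the finite difference step followed by the Gaussian-weighted injection around the volcano described in Sect. 4.5. The injection part could look roughly like the sketch below. The function names, the measurement of distances in grid-index units, and the decay rate `alpha = 0.17` are assumptions made for illustration; the stencil is assumed to lie entirely inside the layer.

```python
import numpy as np

def gaussian_weights(size=11, alpha=0.17):
    """Discrete Gaussian w_ij = exp(-alpha * d^2) on a size x size stencil.

    d is the distance to the stencil center in grid-index units (assumption).
    """
    r = size // 2
    di, dj = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1),
                         indexing="ij")
    return np.exp(-alpha * (di**2 + dj**2))

def inject(rho_lfm, rho_hat, iv, jv, w):
    """Blend time-interpolated data rho_hat into the model solution rho_lfm.

    (iv, jv) is the grid index of the volcano; outside the stencil the
    model solution is returned unchanged.
    """
    r = w.shape[0] // 2
    sl = (slice(iv - r, iv + r + 1), slice(jv - r, jv + r + 1))
    out = rho_lfm.copy()
    out[sl] = w * rho_hat[sl] + (1.0 - w) * rho_lfm[sl]
    return out
```

At the volcano node the weight is one, so the interpolated data is adopted exactly there, while toward the edge of the stencil the weights decay and the model solution dominates.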
For this scenario, in each snapshot of the original data at least one half of the domain has vanishing particle concentrations. Taking this fact into account, the computational costs for the reconstruction of the densities can be reduced significantly. The computational cost scales linearly with the number of nodes that have to be updated in each time step. In our implementation, we determined the smallest rectangle in each horizontal layer that contains all non-zero particle concentrations within the initial and the target snapshot (1 h later). We applied the numerical scheme only to the nodes in these sub-domains, which reduced the computational costs to a fraction of the original costs. In particular for the first intervals $I_n$ with $n < 50$, the ash particles are localized very strongly. The presented low-fidelity model is an approximation of the original COSMO-ART model, and therefore even its exact solution carries some error, as discussed in the next section. It is thus justifiable to calculate only approximate solutions with moderate accuracy in order to increase the performance further. In numerical tests we verified that the reconstructed data based on single precision computations differ by errors on the order of 0.01 % from those computed with double precision. This motivates the use of high-performance hardware for the data reconstruction (e.g., GPUs) that achieves its highest performance in single precision.
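The restriction to the smallest rectangle containing all non-zero concentrations can be sketched as follows. This is only an illustration with a hypothetical function name; the actual implementation is in C++ and may treat the sub-domain differently.

```python
import numpy as np

def active_box(rho_start, rho_target):
    """Smallest index rectangle containing all non-zero values of both snapshots.

    Returns (i0, i1, j0, j1) with exclusive upper bounds, or None if both
    snapshots vanish everywhere in this layer.
    """
    mask = (rho_start != 0) | (rho_target != 0)
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1
```

The time-stepping scheme would then be applied only to the slice `rho[i0:i1, j0:j1]` of each layer, so the cost per step scales with the area of the box rather than with the full layer.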
5
Numerical Results
In this section, we investigate the results of numerical test series with respect to the quality and also the related computational costs. We compare data reconstructions calculated by linear interpolation and by the low-fidelity model approach both qualitatively and quantitatively.
5.1
Qualitative Comparison
We consider three successive time steps and evaluate the reconstructions over the 2-h interval. The intermediate time step (after 1 h) serves for error evaluation only. In Fig. 7a, the reconstruction error of a linear 2-h interpolation is shown. The white line sketches the structure of the ash distribution as given in the original data at time step $n = 97$. At that time, the linear interpolation corresponds to the average of the two snapshots and therefore contains features of both snapshots at $n = 96$ and $n = 98$. The physical evolution process is not captured correctly. High particle concentrations are reconstructed only at places where both the initial and the final snapshot have such high concentrations. In contrast, the reconstruction by means of the low-fidelity model can structurally reproduce the evolution process, see Fig. 7b. A simulation started at $n = 96$ indicates good agreement with the original data at $n = 97$ and even at $n = 98$.

Fig. 7 Comparison of the reconstructed and linearly interpolated particle concentration of the ash species with the lightest particles (weight $1.0 \times 10^{-6}\,\mu\mathrm{g}$) at the mean height of 9.8 km. The white line represents high particle concentrations as described in the original data. For the reconstruction, a viscosity of $\varepsilon = 0.01\ \mathrm{km^2/s}$ was applied. (a) Linear interpolation of the ash particle concentrations of time steps $n = 96$ and $n = 98$. (b) Result of the low-fidelity simulation from time step $n = 96$ to $n = 98$ plotted for $n = 97$ and $n = 98$

The aforementioned property of the interpolation approach, that the highest function values are not necessarily preserved, has disadvantages, as the following 3D visualization shows: Fig. 8 shows an ash distribution reconstructed by the interpolation approach and also by the low-fidelity model. The ash particle concentrations are indicated by means of an isosurface visualization. In the original data at $n = 80$ (left) and at $n = 81$ (right), the two small separated ash clouds can clearly be seen. In the model-based visualization, the isosurfaces move continuously from the start position to the target position. This is indicated by the visualization after half of the interval time, $n = 80.5$, in the lower panel. In contrast, the concentration values computed by the linear interpolation at that time fall below the iso value for the visualization. This physically incorrect artifact leads to a vanishing ash cloud in the upper panel of Fig. 8.

Fig. 8 Isosurface visualization of the ash particle concentrations. The linear interpolation at $n = 80.5$ does not correctly capture the ash transport since only one of the two ash clouds is visualized

As additional material, a 3D animation of this scene ("Comparison Volcano Ash Distribution") can be found in the online version of this chapter (doi:10.1007/978-3-642-54551-1_87). It shows a comparison between the original non-interpolated data, the data reconstructed by linear interpolation, and the data reconstructed using the low-fidelity model. The ash particle concentrations in the animation are represented by isosurfaces, similar to Fig. 3. The wind is indicated by colored arrows at an average height of 4.8 km above sea level. For clarity, the vertical axis is scaled by a factor of 75 relative to the original height above sea level. The linear interpolation in time gives the impression of a pulsating ash dispersion, arising from the artifact described in the previous paragraph. The model-based visualization shows a flowing transition from one snapshot to the next, with the exception of a small correction at the end of each reconstruction interval. For the animation and Fig. 8, a viscosity of $\varepsilon = 0.005\ \mathrm{km^2/s}$ was applied, which seems to be a good choice as described in the following section.
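The vanishing-cloud artifact of linear interpolation discussed in this section can be reproduced with a one-dimensional toy example. The Gaussian profile, the positions, and the iso value below are invented for illustration and are not taken from the COSMO-ART data.

```python
import numpy as np

def bump(x, center, width=1.0):
    """Gaussian-shaped stand-in for a compact ash cloud (hypothetical profile)."""
    return np.exp(-((x - center) / width) ** 2)

x = np.linspace(0.0, 20.0, 401)
start = bump(x, 5.0)        # cloud in the first snapshot
target = bump(x, 15.0)      # same cloud, transported to a new position

midpoint_interp = 0.5 * (start + target)    # linear interpolation at half time
midpoint_transport = bump(x, 10.0)          # pure transport at constant speed

iso = 0.6                                    # hypothetical iso value
print(midpoint_interp.max() < iso)           # interpolated peak falls below the iso value
print(midpoint_transport.max() >= iso)       # transported peak stays above it
```

Because the two clouds do not overlap, the interpolated field at half time has two peaks of about half the original height, so an isosurface drawn at a level above one half disappears and reappears, which matches the pulsating impression described above.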
5.2
Quantitative Comparison
For the quantitative examination of the reconstruction quality, we introduce an error measure which represents the error of all particle species $s = 1, \ldots, 6$ and the snapshots $n = 8, \ldots, 120$ by means of one scalar value. The first snapshots are neglected since they contain no ash particles. The error measure is defined as the mean value of the relative $L^2$-error $E_n(\rho^s_{\mathrm{mode}}) := \|\rho^s_{\mathrm{mode}}(t_n) - \rho^s_{\mathrm{data}}(t_n)\| / \|\rho^s_{\mathrm{data}}(t_n)\|$ and has the form:

\[
E_{\mathrm{mode}} := \frac{1}{113 \cdot 6} \sum_{n=8}^{120} \sum_{s=1}^{6} E_n(\rho^s_{\mathrm{mode}}). \tag{14}
\]
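A minimal sketch of this error measure is given below. It assumes that the fields are stored as NumPy arrays of shape (snapshots, species, ny, nx) on a uniform grid, so that the cell area cancels in the relative norm; the function names are hypothetical.

```python
import numpy as np

def relative_l2_error(rho_mode, rho_data):
    """Relative discrete L2 error of one species at one snapshot."""
    return np.linalg.norm(rho_mode - rho_data) / np.linalg.norm(rho_data)

def mean_error(mode, data):
    """Mean relative error over all snapshots and species.

    mode, data: arrays of shape (snapshots, species, ny, nx).
    """
    errors = [relative_l2_error(m, d)
              for mode_n, data_n in zip(mode, data)
              for m, d in zip(mode_n, data_n)]
    return float(np.mean(errors))
```

Snapshots without any ash would lead to a division by zero, which is consistent with the first snapshots being excluded from the average.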
This error measure is the basis of the following quantitative evaluation. We again expand the simulation time period from 1 to 2 h such that the error of the linear interpolation, $E_{\mathrm{interpolation}}$, evaluated at the intermediate time stamp, can be computed. The low-fidelity simulation is initialized at the time step $t_n$, and the convection field as well as the particle concentrations for the volcano model are determined by linear interpolation between $t_n$ and $t_{n+2}$. Thus, we compute the error with respect to the original data at $t_{n+1}$ (denoted by $E_{\mathrm{lfm1}}$) and at $t_{n+2}$ (denoted by $E_{\mathrm{lfm2}}$). It is reasonable that the linear interpolation has its greatest error at the intermediate time stamp $t_{n+1}$. For the low-fidelity reconstruction, one can assume that the error grows in time and is therefore maximal at the end of the considered interval at $t_{n+2}$.

Fig. 9 Validation set-up: the solid lines show the errors of the interpolation and the low-fidelity reconstruction. The dashed line represents the average computational costs in seconds for the reconstruction of the evolution from $t_n$ to $t_{n+2}$

Figure 9 shows the results for this artificial set-up. The overall error of the interpolation amounts to $E_{\mathrm{interpolation}} \approx 0.43$. Both the intermediate error $E_{\mathrm{lfm1}}$ and the final error $E_{\mathrm{lfm2}}$ of the low-fidelity model are smaller, as long as the viscosity is sufficiently small. For high viscosities, the resulting mixing effects are too strong for the model to correctly describe the evolution of the ash plume. With decreasing viscosity, the results gain the required sharpness, and the error at the intermediate time falls to $E_{\mathrm{lfm1}} \approx 0.27$, while the error at the final time step reaches a slightly higher value of $E_{\mathrm{lfm2}} \approx 0.30$. Here the error curves indicate the existence of a minimum, where the mixing effect seems to capture the physics best. To conclude for this scenario, the low-fidelity model with a suitable choice for the viscosity is superior to the linear interpolation.

Next we regard the set-up for the actual reconstruction of the ash particle concentration within each time interval. Here, the simulations are supposed to reconstruct the evolution from one time step $t_n$ to the next $t_{n+1}$. In contrast to the validation set-up described above, the data of the convection field and the volcano model is now interpolated between one snapshot and the following one. Figure 10 indicates that the best reconstruction, with an error of $E_{\mathrm{lfm}} \approx 0.19$, can be expected for a viscosity of $\varepsilon = 0.005\ \mathrm{km^2/s}$. This was the parameter of our choice for the final reconstruction of the evolution. Within the range $0.003\ \mathrm{km^2/s} \le \varepsilon \le 0.01\ \mathrm{km^2/s}$, the low-fidelity simulations show errors of similar size.

Fig. 10 Reconstruction set-up: the continuous lines represent the errors of the interpolation and the low-fidelity simulation. The dashed line shows the average computational costs in seconds for the reconstruction of the evolution from $t_n$ to $t_{n+1}$

However, the computational costs increase with decreasing viscosity. This effect is explained by the direct connection between the viscosity and the stability condition on the time step size, see Eq. (11). Choosing the highest stable time step size, we get an expression for the number of needed iterations, $N = T \|\vec v\|^2 / (4\varepsilon)$, i.e., the computational costs are inversely proportional to the viscosity. The dashed lines in Figs. 9 and 10 refer to the computational costs by means of the average computing time in seconds and show an approximately linear relation on the logarithmic scales, in correspondence with the formal relation. These results were obtained in sequentially run simulations on a desktop workstation with an Intel® Core™ i7-3770K processor (3.50 GHz, quad-core). Hence, depending on the hardware available for the data reconstruction, a compromise has to be found between accuracy and costs.
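The relation between viscosity and iteration count can be made concrete with a short worked example. Writing $\varepsilon$ for the viscosity and assuming that the advective condition of Eq. (11) is the restrictive one, we take as purely illustrative values an interval length of $T = 1\ \mathrm{h}$, a wind speed of 60 m/s (the upper end of the range mentioned for the horizontal wind), and $\varepsilon = 0.005\ \mathrm{km^2/s}$:

```latex
\begin{align*}
\delta t_{\max} &= \frac{4\varepsilon}{\|\vec v\|^2}
  = \frac{4 \cdot 0.005\ \mathrm{km^2/s}}{(0.06\ \mathrm{km/s})^2}
  \approx 5.6\ \mathrm{s}, \\
N &= \frac{T}{\delta t_{\max}} = \frac{T\,\|\vec v\|^2}{4\varepsilon}
  = \frac{3600\ \mathrm{s} \cdot 0.0036\ \mathrm{km^2/s^2}}{0.02\ \mathrm{km^2/s}}
  = 648, \\
\log N &= \log\frac{T\,\|\vec v\|^2}{4} - \log\varepsilon .
\end{align*}
```

The last line shows why the cost curve appears as a straight line with slope $-1$ on doubly logarithmic axes: halving the viscosity doubles the number of time steps.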
5.3
Sources of Errors and Extensions
As described previously, the orography-following mesh structure of the original data, given in a geographical coordinate system, is converted to a regular Cartesian mesh in a preprocessing step. This conversion involves an error due to the different mesh structures, which leads to a smoothing of the data. This interpolation error was accepted since it allows for simple mesh data structures that facilitate the implementation of the numerical scheme. One major simplification is the reduction of the three-dimensional domain to a set of independent two-dimensional slices. This implies that the downward movement of particles due to their weight is not considered. Figure 11 shows a plot of the reconstruction error for the different ash particle species. It can clearly be seen that the error is larger for the ash distribution of the heaviest particles. This might be caused by the disregarded gravity force; accounting for it might allow an error reduction of approximately 5 % for the largest particles. Although the vertical wind is small compared to the horizontal wind, it is not zero, as shown in Fig. 5, and would transport ash particles in the vertical direction. The mixing effects in the atmosphere also contribute in the vertical direction. The consideration of the three-dimensional effects of advection and diffusion would require a fully coupled three-dimensional discrete model. This would lead to much higher computational costs compared to the presented layer approach.

Fig. 11 Error of reconstructed concentration data for the different species of ash particles, calculated with a viscosity of $\varepsilon = 0.005\ \mathrm{km^2/s}$. The heaviest particle species corresponds to the largest reconstruction error

A fundamental component of the proposed reconstruction approach is the model equation describing the physical processes considered during the calculation. We used a 2D version of a convection-diffusion model to account for the transport due to wind and diffusive mixing. In Sect. 3.1 we described the COSMO-ART model used to simulate the evolution of the volcano plume. Besides the ash particle densities and the wind field, several additional quantities (e.g., aerosols, soot, mineral dust, sea salt) were considered to account for atmospheric processes that might be important for this scenario. These quantities and the corresponding equations represent possible extensions for the low-fidelity model.
Important components of this full description need to be identified, which typically requires deep knowledge of the models. The quality and performance of the model-based reconstruction method essentially depend on the considered model variant, the simplifications made, and the discretization. Reconstructions based on the fully coupled three-dimensional problem are computationally expensive in general but might be applied efficiently by means of model reduction approaches. One simple data reduction strategy is to neglect a portion of the available grid points, e.g., to use only every $n$th grid point in each space direction. This results in a data reduction by a factor of $n^3$. However, reduced spatial resolution is in general related to a strong loss of detail and quality (e.g., the original COSMO-ART data was reduced by this technique with respect to the time structure). For many applications, very efficient reduced models can be defined by means of an orthogonal basis of the space spanned by the available snapshots. Using the proper orthogonal decomposition (POD) method (Kunisch and Volkwein 1999, 2002, 2008), many problems with a very high number of unknowns can be approximated very accurately with fewer than 100 unknowns. An important aspect is that the costly calculations (the determination of the reduced model) need to be done only once, which can be done on a powerful computer system. For this type of model reduction, a model equation describing the physical behavior is required as well. For the reconstruction of the volcano ash plume based on POD techniques, the proposed convection-diffusion model, even with 3D effects of advection and diffusion, could be deployed.

The starting point of the low-fidelity model described in Sect. 4 is a parabolic partial differential equation which accounts only for the initial state of each interval. However, it is not guaranteed that the calculated data at the end of the reconstruction interval will correspond to the given original data at that time. In general there will be some error due to the use of the low-fidelity model, as discussed in Sect. 5. This error can be minimized by fitting existing model parameters (e.g., the viscosity $\varepsilon$) such that the jump at the end of the reconstruction interval is as small as possible. Of course this non-smooth behavior can be eliminated by some post-processing operation of smoothing or interpolation, but these approaches again lead to additional errors. Such problems can be avoided by an alternative problem formulation that takes both states, at the beginning and at the end of the reconstruction interval, into account. This can be achieved by adding mixing effects related to the time structure, modeled by a temporal viscosity (e.g., a term of the form $\varepsilon_t \partial_t^2$ with viscosity $\varepsilon_t > 0$), which leads to an elliptic problem formulated in the space-time domain. Such a system can no longer be solved by means of time-stepping schemes as presented before, since all unknowns within the reconstruction interval are fully coupled. Such problems are typically tackled by means of sparse linear equation systems and related solution approaches (e.g., direct or iterative linear system solvers). Although the computational effort to solve such a problem might be higher than for the described time-stepping scheme, this approach can be desirable in many applications since the reconstructed data will match all states given in the original data. To reduce the costs, adequate model reductions (e.g., POD techniques) might provide a remedy.
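The snapshot-based POD reduction mentioned in this section can be sketched with a singular value decomposition. This is a generic illustration, not the method of the cited references in detail: the function names and the energy-based truncation criterion are assumptions of this example, and each snapshot is assumed to be flattened into one column of the input matrix.

```python
import numpy as np

def pod_basis(snapshots, energy=0.999):
    """Orthonormal POD basis of a snapshot matrix (columns = flattened snapshots).

    Keeps the fewest modes whose squared singular values capture the
    fraction `energy` of the total (hypothetical truncation criterion).
    """
    u, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    r = min(int(np.searchsorted(cumulative, energy)) + 1, s.size)
    return u[:, :r]

def project(basis, field):
    """Reduced coefficients and reconstruction of a flattened field."""
    coeffs = basis.T @ field
    return coeffs, basis @ coeffs
```

The expensive decomposition is computed once from the available snapshots; afterwards every field is represented by the few coefficients returned by `project`, which is where the reduction to a small number of unknowns comes from.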
6
Conclusion
In this article, we investigated methods for the reconstruction of time-dependent processes which are described at a finite number of points in time only. We gave a general description of such reconstruction methods. We discussed linear interpolation as one of the standard approaches and proposed a model-based reconstruction method. In addition to the given data, the latter takes a physical model into account that approximately describes the underlying processes. For the eruption of the volcano Eyjafjallajökull in Iceland in the spring of 2010, we proposed such a low-fidelity model based on the convection-diffusion equation. We described the discrete model in detail and presented reconstructions of the evolution of the volcano plume by means of six species of ash particles. The reconstructions based on linear interpolation and on the model-based approach were compared to the original data, calculated by a COSMO-ART model. Although the low-fidelity model was only a rough approximation of the original one, it led to much smaller reconstruction errors than linear interpolation. Such results motivate the use of corresponding low-fidelity models in various fields of application for purposes of scientific visualization or post-processing in general. Finally, some possible extensions of the model for the volcano scenario were discussed. The definition of an adequate low-fidelity model may be non-trivial since detailed knowledge of the physics, the discretization, and software programming is needed. The better the reconstruction model suits the original problem, the less data is needed to achieve a high reconstruction quality. In that sense, powerful reconstruction approaches allow for substantial reductions of the amount of data needed to represent simulation results. While the original numerical simulations are usually conducted on large parallel computer systems, the data reconstruction might be calculated even on a standard desktop computer.
Such efficient reconstruction techniques are essential for interactive visualization purposes. The specific selection of a model equation is certainly a key point since it determines the physical processes that will be considered during the reconstruction. On the one hand, it must be able to describe the relevant features given in the data at an appropriate spatial and temporal scale; on the other hand, the model should not be too complex with respect to both software engineering and computational costs. In the various fields of science and engineering, a great diversity of physical features is considered, for which many reconstruction strategies are required. Reduced models for standard processes (such as the convection-diffusion equation) could be defined universally, at least for certain problem classes. For highest efficiency, the reduced model might require adaptation to the particular application, or at least existing model parameters might need to be adjusted. Through the freedom to choose an arbitrary low-fidelity model, the user can control the quality of the reconstruction and also the related computational costs. This makes it possible to define high-performance methods for applications in which real-time availability or interactivity of the reconstructed result is relevant. Such methods could be integrated into visualization systems such as Amira (Kon 2009), EnSight (Com 2006), or ParaView (Henderson 2007). In the case of ParaView, an extensive variety of filters exists that can be included in the visualization pipeline to manipulate the data. In particular, an often-used filter for linear time interpolation exists. Model-based reconstruction methods could be integrated by simply adding an additional filter, which can easily be done since ParaView is an open-source development. If high quality rather than high performance of the reconstruction is needed for some point in time, a more complex model (in the extreme case the original model) could be applied for reconstruction.

In recent years, numerical methods have been developed which make use of different physical model equations at the same time, see e.g., Oden and Prudhomme (2002), Braack and Ern (2003), Bales et al. (2009). This is often realized by a posteriori estimation of the error related to the applied models and its control by switching between a selection of available physical models of different complexities, known under the term model adaptivity. Such developments motivate us to no longer think of the numerical simulation and the reconstruction as two separate steps. Instead, a model system can be seen as some abstract mechanism that describes the physics in a given scenario up to some (needed) accuracy. From that perspective, a desired visualization of some process serves as the initiating impulse that triggers a numerical simulation of the corresponding scene. Visualization is then no longer a post-processing step of the simulation, but part of the model system. Such a combined simulation and visualization system would allow for physics-aware visualizations of zoomed views based on more complex physical models instead of purely interpolated images. The discretization (spatial mesh and time partitioning) as well as the complexity of the physical model could be adjusted automatically by means of mathematical error estimators to guarantee high accuracy of the result.
References
Amira 5 User's Guide. Konrad-Zuse-Zentrum für Informationstechnik Berlin (ZIB) and Visage Imaging (2009). http://www.amira.com
Athanasopoulou E, Vogel H, Vogel B, Tsimpidi AP, Pandis SN, Knote C, Fountoukis C (2013) Modeling the meteorological and chemical effects of secondary organic aerosols during an EUCAARI campaign. Atmos Chem Phys 13(2):625–645. doi:10.5194/acp-13-625-2013
Avila LS (2004) The VTK user's guide. Kitware. ISBN:1-930934-13-0
Baldauf M, Seifert A, Förstner J, Majewski D, Raschendorfer M, Reinhardt T (2011) Operational convective-scale numerical weather prediction with the COSMO model: description and sensitivities. Mon Weather Rev. doi:10.1175/MWR-D-10-05013.1. e-View
Bales P, Kolb O, Lang J (2009) Hierarchical modelling and model adaptivity for gas flow on networks. Volume 5544 of Lecture notes in computer science. Springer, pp 337–346. ISBN:978-3-642-01969-2
Bangert M, Nenes A, Vogel B, Vogel H, Barahona D, Karydis VA, Kumar P, Kottmeier C, Blahak U (2012) Saharan dust event impacts on cloud formation and radiation over Western Europe. Atmos Chem Phys 12(9):4045–4063. doi:10.5194/acp-12-4045-2012
Bonneau GP, Ertl T, Nielson G (2006) Scientific visualization: the visual extraction of knowledge from data. Mathematics and visualization. Springer, Heidelberg
Braack M, Ern A (2003) A posteriori control of modeling errors and discretization errors. Multiscale Model Simul 1(2):221–238
EnSight User Manual. Computational Engineering International, Inc., 2166 N. Salem Street, Suite 101, Apex, NC 27523 (2006). http://www.ensight.com
Henderson A (2007) ParaView guide, a parallel visualization application. Kitware Inc. http://www.paraview.org/
Hindmarsh AC, Gresho PM, Griffiths DF (1984) The stability of explicit Euler time-integration for certain finite difference approximations of the multi-dimensional advection-diffusion equation. Int J Numer Methods Fluids 4(9):853–897. ISSN:1097-0363, doi:10.1002/fld.1650040905
Introduction to GRIB. World Meteorological Organization, June 2003
Knote C, Brunner D (2013) An advanced scheme for wet scavenging and liquid-phase chemistry in a regional online-coupled chemistry transport model. Atmos Chem Phys 13(3):1177–1192. doi:10.5194/acp-13-1177-2013
Kunisch K, Volkwein S (1999) Control of the Burgers equation by a reduced-order approach using proper orthogonal decomposition. J Optim Theory Appl 102(2):345–371. ISSN:0022-3239, doi:10.1023/A:1021732508059
Kunisch K, Volkwein S (2002) Galerkin proper orthogonal decomposition methods for a general equation in fluid dynamics. SIAM J Numer Anal 40(2):492–515
Kunisch K, Volkwein S (2008) Optimal snapshot location for computing POD basis functions. SFB-report, 2008-008
Oden JT, Prudhomme S (2002) Estimation of modeling error in computational mechanics. J Comput Phys 182(2):496–515. ISSN:0021-9991, doi:10.1006/jcph.2002.7183
Ritter B, Geleyn JF (1992) A comprehensive radiation scheme for numerical weather prediction models with potential applications in climate simulations. Mon Weather Rev 120(2):303–325. doi:10.1175/1520-0493(1992)120<0303:ACRSFN>2.0.CO;2
Seifert A, Beheng KD (2006) A two-moment cloud microphysics parameterization for mixed-phase clouds. Part 1: model description. Meteorol Atmos Phys 92:45–66. ISSN:0177-7971, doi:10.1007/s00703-005-0112-4
Steppeler J, Doms G, Schättler U, Bitzer HW, Gassmann A, Damrath U, Gregoric G (2003) Meso-gamma scale forecasts using the nonhydrostatic model LM. Meteorol Atmos Phys 82:75–96. ISSN:0177-7971, doi:10.1007/s00703-001-0592-9
Vogel B, Vogel H, Bäumer D, Bangert M, Lundgren K, Rinke R, Stanelle T (2009) The comprehensive model system COSMO-ART – radiative impact of aerosol on the state of the atmosphere on the regional scale. Atmos Chem Phys 9(22):8661–8680. doi:10.5194/acp-9-8661-2009
Vogel H, Förstner J, Vogel B, Hanish Th, Mühr B, Schättler U, Schad T (2013, submitted) Simulation of the dispersion of the Eyjafjallajökull plume over Europe with COSMO-ART in the operational mode. Atmos Chem Phys Discuss 13(5):13439–13463
Modeling of Fluid Transport in Geothermal Research
Jörg Renner and Holger Steeb
Contents
1 Introduction
2 Fundamental Laws: A Primer
2.1 Kinematics
2.2 Conservation Laws
2.3 Constitutive Equations
2.4 Field Equations
3 Stationary Flow Processes: Transport
3.1 Hydraulic Conduits with Simple Geometries
3.2 Complex Geometries: From Conduit Networks to Porous Media
4 Fundamental Laws: A Second Step
4.1 Balance of Momentum of the Mixture
4.2 Balance of Mass of the Mixture
4.3 Mixture Theory Versus Biot's Theory of Linear Poro-elasticity
4.4 Governing Set of Equations
4.5 Constitutive Equations for the Stresses: Biot's Case
4.6 Overview of the Governing Equations
5 Transient Flow Processes: Storage
5.1 Homogeneous Poro-elastic Media: Low-Frequency Pumping Tests
5.2 Transient Flow in Deformable Hydraulic Conduits with Simple Geometries
5.3 Complex Geometries: Mesoscopic Loss Phenomena
5.4 Cryer's Problem
5.5 Interlayer Flow or Mesoscopic Loss Phenomena
6 Concluding Remarks
References
J. Renner
Experimentelle Geophysik - Institut für Geologie, Mineralogie und Geophysik, Ruhr-Universität Bochum, Bochum, Germany
e-mail: [email protected]

H. Steeb
Kontinuumsmechanik - Mechanik, Ruhr-Universität Bochum, Bochum, Germany
e-mail: [email protected]

© Springer-Verlag Berlin Heidelberg 2015
W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_81
J. Renner and H. Steeb
Abstract
Extraction of heat stored in the rocks in Earth’s upper crust requires a fluid to which the heat can be transferred and that can migrate through the conduits in the rock. Flow of these fluids requires gradients in fluid pressure. Often, a linear relation between flow rate and pressure gradient offers a sufficiently precise description. Fluid pressure in turn couples the flow with the deformation of the solid host. We derive and analyze the governing equations underlying these hydro-mechanical, coupled problems. Our presentation addresses simple conduits and porous media. The latter are tackled from a continuum perspective, specifically mixture theory. Analytical and numerical solutions of the differential equations are analyzed for scenarios from well testing to propagation of elastic waves. Well testing constitutes the traditional tool for hydraulic characterization of geothermal reservoirs. Evaluation of attenuation of elastic waves may allow for hydraulic characterization and monitoring relying on surveys.
1 Introduction
Fluid flow entirely inside the earth, for example, convection in the outer liquid core on a spatial scale of $10^{6}$ m (Gubbins 2001), and between its interior and its surface, for example, the rise of magma at mid-ocean ridges or water exchange between the world’s oceans and oceanic lithosphere or the continental-scale groundwater flow on a spatial scale of $10^{5}$ m (e.g., Garven 1995; Müntener 2011; Stein and Stein 1994; Stein et al. 1995), is crucial for matter and heat transport and thus for the very dynamics that so strongly control plate tectonics and ultimately the environment provided for life (e.g., Ingebritsen and Manning 2003). On a smaller scale ($10^{1 \ldots 4}$ m), fluid flow is also of eminent importance for provision of liquid resources (e.g., Doré and Sinding-Larsen 1996; Gleeson et al. 2012; May 2010) and temporary or permanent underground storage of fluids (e.g., Daniel 1993). Thus, it is not surprising that a large number of theoretical and numerical studies regarding geoscientific fluid-flow problems have been presented in the past, addressing fundamental questions as well as engineering applications. Employed methods range from analytic techniques to numerical analyses on supercomputers. For geothermal research, fluid flow is relevant in two prominent ways. First and foremost, the power that can potentially be provided at the surface in electrical or thermal form simply reads

$$P \propto \frac{\Delta V^{f}}{\Delta t}\,\Delta\theta\,c, \tag{1}$$

i.e., it scales with the product of the (volume) flow rate of fluid, $\Delta V^{f}/\Delta t$ [m$^{3}$ s$^{-1}$]; the temperature drop experienced by the produced fluid during its usage for energy provision, $\Delta\theta$ [K]; and the specific heat of the fluid, $c$ [J m$^{-3}$ K$^{-1}$]. The temperature difference is essentially controlled by the reservoir temperature and the technical
Modeling of Fluid Transport in Geothermal Research
installations at the surface transforming the heat stored in the fluid into electricity and/or transferring the heat of the produced fluid to a secondary fluid circulating in a heating system. Reservoir temperature depends on the depth reached by the production well and the specific geological situation. In the absence of exceptional hydrothermal activity, a depth of about 3–5 km is required to reach a reservoir temperature of close to 150 °C, the prerequisite for operating a turbine with steam to produce electricity. Specific heat is a material parameter characterizing the working fluid, likely water or steam. Typically the reservoir will contain liquid water, and steam will only be generated when the phase boundary is crossed during depressurization on the fluid’s way to the surface. Thus, the flow rate constitutes the central variable parameter defining a successful and economically feasible operation. Central questions such as “How does the flow rate evolve over the lifetime of a power plant?” or “How can the flow rate be affected by pumping protocols before (stimulation) and during plant operation?” can only be answered based on a substantial understanding of the fluid flow in the reservoir. Second, the controlled initiation of fluid flow in a potential geothermal reservoir is a mandatory prerequisite for its hydraulic characterization. A key question in this context is “How can the long-term flow rate be reliably predicted from short-term hydraulic experiments employing minimal perturbations of the reservoir pressure?” Besides the conventional approach of conducting pumping tests in wells reaching the reservoir, the transmission of seismic waves through the reservoir provides a means of initiating fluid movement, and we therefore consider the relation between fluid flow and propagation of elastic waves a relevant aspect of fluid flow in geothermal research.
However, we restrict ourselves to those highly dispersive wave phenomena that are characterized by a relative movement between the fluid and the solid. We consider it necessary to analyze pressure diffusion and hydromechanical coupling, too, under the overarching theme of fluid flow. The analysis of hydraulic well testing, in particular of fractured reservoirs, likely mandates accounting for hydromechanical coupling, i.e., coupling of changes in pore fluid pressure and stress state of the solid framework. The mobilization of fluid during the passage of a seismic wave is a coupled phenomenon by itself. Volume flux (flow rate per unit area) and fluid pressure are inseparable field variables in poro-elasticity. Yet, pressure diffusion may reach spatial extents quite different from the ones characteristic for the transport of substantial amounts of fluids. Small fluid volumes may have a significant effect on fluid-pressure levels that in turn may alter the local stress state of the reservoir to the extent that inelastic deformation, e.g., seismic activity, is induced or triggered (see McGarr et al. 2002, for terminology and examples). Seismicity, i.e., inelastic failure of underground volumes documented by the occurrence of earthquakes, associated with pumping operations (to some degree wanted and necessary for a successful stimulation but unwanted during continuous plant operation) will however not be tackled in detail. As in most geoscientific problem fields, the analyses of data records associated with subsurface fluid flow require the iterative development of subsurface models. Since data in the form of time series of, for example, flow rate and
pressure are typically only available at a limited number of points in space, one has to rely on concepts such as hydraulically equivalent scenarios or materials with a limited number of structural parameters and explicit physical properties. Laboratory experiments on rock samples may plainly serve the purpose of material characterization or may aim at the investigation of analogue models on miniature scale. The transport of heat by fluids in motion, the very prerequisite for geothermal energy provision, exhibits parallels to the transport of solutes. Thus, tracer tests play a crucial role in the identification of preferential flow paths. The contribution is structured in a way that aims at serving a likely disparate readership, ranging from mathematicians eager to be informed about the model concepts employed in geothermal research to geoscientists hoping to expand their understanding of the theoretical concepts required to solve fluid flow problems. We provide two rather formal introductions of the fundamental relations (second and fourth sections). The reader who is unfamiliar with the formalities of continuum mechanics may happily skip these sections (possibly apart from comments on notation at the start of the second section) and come back to them only after reading the third and fifth sections, which are intended to increase complexity slowly. In the third section, we first analyze steady-state flow in hydraulic conduits with simple geometry and then proceed to complex networks. The subsequent presentation of aspects of transient flow in the fifth section similarly evolves from individual conduits with simple geometries to homogeneous poro-elastic media to heterogeneous media. Each subsection is headed by a “tool box” comprising the basic equations involved in the sequel. Working through these formal preambles in the light of the subsequent examples may facilitate appreciating the introductions of the fundamental relations in retrospect.
We conclude with a collection of notes on current research developments and related fields. Our presentation of basic relations leans heavily on mixture theory. We restrict ourselves, however, to a single fluid phase dispersed in a solid phase since liquid-gas mixtures are not typical for deep geothermal reservoirs.
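As a rough numerical illustration of the scaling in Eq. (1), the following minimal sketch reads the proportionality as an equality, i.e., it ignores any conversion efficiency. The function name, the operating point, and the volumetric heat capacity of liquid water (about 4.18 × 10^6 J m^-3 K^-1) are assumptions made for illustration only and are not taken from the chapter.

```python
def thermal_power(flow_rate_m3_s, temperature_drop_K, vol_heat_capacity_J_m3K=4.18e6):
    """Thermal power [W] carried by the produced fluid, Eq. (1) read as an equality.

    The default volumetric heat capacity is an assumed round value for liquid water.
    """
    return flow_rate_m3_s * temperature_drop_K * vol_heat_capacity_J_m3K


# Hypothetical operating point: 50 L/s produced and cooled by 80 K at the surface.
power_W = thermal_power(0.05, 80.0)
print(f"thermal power: {power_W / 1e6:.1f} MW")
```

Because the expression is linear in the flow rate, halving the flow rate halves the available power, which is why the flow rate is singled out as the central operational variable.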
2 Fundamental Laws: A Primer
We present the basic equations of (linear) continuum mechanics. Focus is set on the governing kinematic relations, balance equations, and constitutive equations for single-phase solids or fluids. The aim is to recapitulate basic physical principles and discuss briefly resulting linear field equations, e.g., Lamé-Navier equations for linear elastic solids or the Navier-Stokes equations for Newtonian fluids.
2.1 Kinematics
Position Vectors and Displacements of a Material Point The positions of a material point P in its deformed (current) and its undeformed (reference) configurations are given by the position vectors x and X, respectively.
The displacement vector of the material point is introduced as the difference between $\mathbf{x}$ and $\mathbf{X}$, i.e.,

$$\mathbf{u} = \mathbf{x} - \mathbf{X}. \tag{2}$$

Calculating the first and second time derivatives of the position vector $\mathbf{x}$, we obtain the velocity vector $\mathbf{v} = \dot{\mathbf{x}}$ and the acceleration $\mathbf{a} = \ddot{\mathbf{x}}$. Here, we indicate the material or substantial time derivative with the “dot” operator. By applying the chain rule, the material or substantial time derivative for an arbitrary scalar field function $\psi(\mathbf{x}, t)$ depending on the current position $\mathbf{x}$ reads

$$\frac{\mathrm{d}}{\mathrm{d}t}\,\psi(\mathbf{x}, t) = \dot{\psi}(\mathbf{x}, t) = \partial_{\mathbf{x}}\psi(\mathbf{x}, t)\cdot\mathbf{v} + \partial_t\psi(\mathbf{x}, t). \tag{3}$$

Throughout this chapter we employ the frequently used short notation for partial derivatives indicating the variable(s) with respect to which to differentiate as index on $\partial$, i.e., $\partial_{\mathbf{x}}\psi(\mathbf{x}, t) := \partial\psi(\mathbf{x}, t)/\partial\mathbf{x}$.
Velocity Gradient and Linear Strains
The spatial velocity gradient is obtained from the partial derivative of the velocity with respect to the position vector in the current configuration, i.e., $\mathbf{L} = \partial_{\mathbf{x}}\mathbf{v} = \mathrm{grad}\,\mathbf{v}$. Note that $\mathbf{L} = L_{ij}\,\mathbf{e}_i \otimes \mathbf{e}_j$ is a general second-order tensor which can be additively split into a symmetric and a skew-symmetric part. We apply Einstein's summation convention if an index appears twice in a term. The symmetric and skew-symmetric parts of the velocity gradient are given by

$$\mathbf{L} = \mathbf{D} + \mathbf{W} \quad\text{with}\quad \mathbf{D} = \tfrac{1}{2}\left(\mathbf{L} + \mathbf{L}^{T}\right) \quad\text{and}\quad \mathbf{W} = \tfrac{1}{2}\left(\mathbf{L} - \mathbf{L}^{T}\right), \tag{4}$$
where $\mathbf{D}$ represents the strain-rate tensor and $\mathbf{W}$ the spin or rate-of-rotation tensor. To interpret simple homogeneous experiments (simple shear, pure compression, tension, torsion tests), we split the strain-rate tensor, i.e., the symmetric part of the velocity gradient, into a traceless part, the deviator, and an additional term characterizing the volumetric deformation, i.e.,

$$\mathbf{D} = \mathrm{dev}(\mathbf{D}) + \mathrm{vol}(\mathbf{D}) \quad\text{with}\quad \mathrm{vol}(\mathbf{D}) = \tfrac{1}{3}\,\mathrm{tr}(\mathbf{D})\,\mathbf{I} = \tfrac{1}{3}\,(\mathrm{div}\,\mathbf{v})\,\mathbf{I}. \tag{5}$$
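The second-order unity tensor is denoted as $\mathbf{I} = \delta_{ij}\,\mathbf{e}_i \otimes \mathbf{e}_j$, with the Kronecker symbol $\delta_{ij} = 1$ for $i = j$ and $\delta_{ij} = 0$ for $i \neq j$. From the gradient of the displacement vector $\mathrm{grad}\,\mathbf{u}$, we define the linear strain measure

$$\boldsymbol{\varepsilon} = \tfrac{1}{2}\left(\mathrm{grad}\,\mathbf{u} + \mathrm{grad}^{T}\mathbf{u}\right). \tag{6}$$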
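The additive splits used in this subsection can be checked numerically. The sketch below is illustrative only: the helper names and the sample velocity gradient are hypothetical, and the volumetric part is taken as one third of the trace times the unity tensor, so that the deviator is traceless.

```python
import numpy as np


def sym_skew(L):
    """Split a second-order tensor into symmetric and skew-symmetric parts, cf. Eq. (4)."""
    return 0.5 * (L + L.T), 0.5 * (L - L.T)


def vol_dev(A):
    """Split a symmetric tensor into volumetric and deviatoric (traceless) parts."""
    vol = np.trace(A) / 3.0 * np.eye(3)
    return vol, A - vol


# Hypothetical homogeneous velocity gradient: simple shear plus uniform dilation [1/s].
L = np.array([[1e-3, 2e-3, 0.0],
              [0.0, 1e-3, 0.0],
              [0.0, 0.0, 1e-3]])
D, W = sym_skew(L)          # strain-rate tensor and spin tensor
vol_D, dev_D = vol_dev(D)   # volumetric and deviatoric strain rate
print("div v     =", np.trace(D))
print("tr(dev D) =", np.trace(dev_D))
```

The same two helpers apply unchanged to the displacement gradient, in which case the symmetric part is the linear strain tensor of Eq. (6).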
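Again, the second-order (symmetric) linear strain tensor can be split into a volumetric and a deviatoric part, i.e.,

$$\boldsymbol{\varepsilon} = \mathrm{dev}(\boldsymbol{\varepsilon}) + \mathrm{vol}(\boldsymbol{\varepsilon}) \quad\text{with}\quad \mathrm{vol}(\boldsymbol{\varepsilon}) = \tfrac{1}{3}\,\mathrm{tr}(\boldsymbol{\varepsilon})\,\mathbf{I}. \tag{7}$$

2.2 Conservation Laws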
We introduce mechanical and thermal balance relations of a deformable material body B. In the present discussion, we restrict ourselves to classical continua, i.e., the Cauchy stress tensor is axiomatically introduced as a symmetric tensor. Therefore, the balance of moment of momentum is a priori fulfilled. Furthermore, we do not intend to discuss the formulation of thermodynamically consistent constitutive relations since we consider a discussion of the balance of entropy out of the scope of this handbook contribution. Theoretical details of modern material theory related to constitutive modeling of solids, fluids, and multiphase materials can be found, e.g., in Haupt (2000), Hutter and Jöhnk (2004), and Bower (2010).
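Balance of Mass
Axiom: The total mass $M(\mathcal{B}, t)$ of a material body $\mathcal{B}$ given as

$$M = \int_{\mathcal{B}} \mathrm{d}m = \int_{\mathcal{B}} \rho\,\mathrm{d}v \tag{8}$$

is constant during its deformation, i.e.,

$$M = \text{const.} \quad\Longleftrightarrow\quad \frac{\mathrm{d}}{\mathrm{d}t}(M) = 0. \tag{9}$$

The density $\rho$ is given as the ratio of mass $\mathrm{d}m$ and volume $\mathrm{d}v$ in a certain representative volume element (RVE) of size $\mathrm{d}v$.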
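Balance of Linear Momentum
Axiom: Linear momentum $\mathbf{J}(\mathcal{B}, t) = \int_{\mathcal{B}} \dot{\mathbf{x}}\,\mathrm{d}m = \int_{\mathcal{B}} \rho\,\dot{\mathbf{x}}\,\mathrm{d}v$ of a deformable material body $\mathcal{B}$ is changed by the sum $\mathbf{F} = \mathbf{F}(\partial\mathcal{B}, t) + \mathbf{F}(\mathcal{B}, t) = \mathbf{F}_{\partial\mathcal{B}} + \mathbf{F}_{\mathcal{B}}$ of near-field or contact forces $\mathbf{F}_{\partial\mathcal{B}} = \int_{\partial\mathcal{B}} \mathbf{t}\,\mathrm{d}a$ and far-field or body forces $\mathbf{F}_{\mathcal{B}} = \int_{\mathcal{B}} \rho\,\mathbf{b}\,\mathrm{d}v$, i.e.,

$$\frac{\mathrm{d}}{\mathrm{d}t}(\mathbf{J}) = \mathbf{F}. \tag{10}$$

Surface tractions $\mathbf{t}$ are related to the Cauchy stress tensor $\mathbf{T}$ by Cauchy’s theorem $\mathbf{t} = \mathbf{T}\cdot\mathbf{n}$. The vector $\mathbf{n}$ is the outward normal vector of $\partial\mathcal{B}$.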
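Balance of Energy
Axiom: The sum $E = E_{\mathrm{kin}} + E_{\mathrm{int}}$ of kinetic energy $E_{\mathrm{kin}} = \int_{\mathcal{B}} \tfrac{1}{2}\,\rho\,\dot{\mathbf{x}}\cdot\dot{\mathbf{x}}\,\mathrm{d}v$ and internal energy $E_{\mathrm{int}} = \int_{\mathcal{B}} \rho\,\varepsilon\,\mathrm{d}v$ ($\varepsilon$ denotes the specific internal energy) of a deformable material body $\mathcal{B}$ is changed by the sum $P = P_{\mathrm{mech}} + P_{\mathrm{therm}}$ of mechanical power $P_{\mathrm{mech}} = \int_{\partial\mathcal{B}} \dot{\mathbf{x}}\cdot\mathbf{t}\,\mathrm{d}a + \int_{\mathcal{B}} \rho\,\dot{\mathbf{x}}\cdot\mathbf{b}\,\mathrm{d}v$ and thermal power $P_{\mathrm{therm}} = \int_{\partial\mathcal{B}} q\,\mathrm{d}a + \int_{\mathcal{B}} \rho\,r\,\mathrm{d}v$, i.e.,

$$\frac{\mathrm{d}}{\mathrm{d}t}(E) = P = P_{\mathrm{mech}} + P_{\mathrm{therm}}. \tag{11}$$

Here, $q$ denotes the heat flux across the body’s surface, and $r$ is the specific heat production. Furthermore, the heat flux is related to the heat flux vector $\mathbf{q}$ and the outward surface normal vector $\mathbf{n}$ by Cauchy’s heat-flux theorem $q = -\mathbf{q}\cdot\mathbf{n}$.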
2.3 Constitutive Equations
Constitutive relations have to be formulated to close the system of equations, i.e., the number of unknown field variables has to match the number of kinematic, balance, and constitutive equations. In particular, kinematic and dynamic quantities have to be related as an expression of a substance’s rheology, the way a specific material deforms and flows. The first fundamental characterization of fluids concerns their density $\rho^{fR} = \mathrm{d}m^{f}/\mathrm{d}v^{f}$, i.e., the ratio between the mass element of the fluid $\mathrm{d}m^{f}$ and the volume $\mathrm{d}v^{f}$ that it occupies. The effective weight of the fluid calculates as $\gamma^{fR} = \rho^{fR}\,g$, where $g$ denotes gravitational acceleration. The density $\rho^{fR}$ of an incompressible fluid remains constant during the flow process (or equivalently $\mathrm{tr}\,\mathbf{D} = \mathrm{div}\,\mathbf{v} = 0$, see Eq. 5). Compressible fluids change their density in flows; their pressure is calculated from an equation of state, i.e., $p = p(\rho^{fR}, \theta)$. Prominent examples for equations of state are Boyle-Mariotte’s law (ideal gas), van der Waals’ law, or the versatile Muskat’s law (e.g., Bear 1972):

$$\rho^{fR} = \rho_0^{fR}\left(\frac{p}{p_0}\right)^{n}\exp\left[\beta^{f}\,(p - p_0)\right]. \tag{12}$$

For incompressible liquids, $n = 0$, $\beta^{f} = 0$ is proposed, and for compressible liquids $n = 0$, $\beta^{f} \neq 0$. For gases, Muskat distinguishes between isothermal cases ($n = 1$, $\beta^{f} = 0$) and adiabatic processes ($n = c_v/c_p$, $\beta^{f} = 0$), where $c_v$ and $c_p$ denote specific heat capacity at constant volume and pressure, respectively. The fluid pressure has unit [N/m$^{2}$] = [Pa] = $10^{-5}$ [bar] = $10^{1}$ [dynes/cm$^{2}$], but it is also frequently expressed by the height of a continuous water column, 1 [m H$_2$O] = 0.09807 [bar] $\approx$ 0.1 [bar]. Throughout this presentation, we follow a notation according to which $\beta^{\alpha} = \partial_p \ln \rho^{\alpha R} = -\partial_p \ln v^{\alpha}$ denotes a compressibility [Pa$^{-1}$] and $K^{\alpha} = 1/\beta^{\alpha}$ denotes a bulk modulus [Pa]. Superscripts to these properties indicate the particular material of interest, for example, $\beta^{f}$ and $K^{f}$ represent fluid properties. Subscripts in contrast indicate that these parameters express the pressure
sensitivity of a physical property other than volume, for example, $\beta_{\phi} = \partial_p \ln \phi$ and $K_{\phi}$ denote relative changes in porosity $\phi$. In many cases, it is warranted to treat the fluid as barotropic, i.e., fluid density is a function of pressure alone, $\rho^{fR} = \rho^{fR}(p)$, cf. (12).
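Muskat's law (12) is easily evaluated for the limiting cases listed above. The following minimal sketch assumes illustrative parameter values: a water-like liquid with a compressibility of 4.5 × 10^-10 Pa^-1 and an air-like gas with a reference density of 1.2 kg/m^3. Neither value nor the function name is taken from the chapter.

```python
import math


def muskat_density(p, rho0, p0, n=0.0, beta_f=0.0):
    """Fluid density from Eq. (12): rho0 * (p / p0)**n * exp(beta_f * (p - p0))."""
    return rho0 * (p / p0) ** n * math.exp(beta_f * (p - p0))


# Assumed reference state: density [kg/m^3] at atmospheric pressure [Pa].
rho0, p0 = 1000.0, 1.0e5
p = 30.0e6  # reservoir-scale pressure [Pa], hypothetical

# Compressible liquid: n = 0, beta_f != 0 (assumed water-like compressibility).
rho_liquid = muskat_density(p, rho0, p0, n=0.0, beta_f=4.5e-10)
# Isothermal ideal gas: n = 1, beta_f = 0; density doubles when pressure doubles.
rho_gas = muskat_density(2.0e5, 1.2, p0, n=1.0, beta_f=0.0)
# Incompressible liquid: n = 0, beta_f = 0; density stays at its reference value.
rho_incompressible = muskat_density(p, rho0, p0)

print(rho_liquid, rho_gas, rho_incompressible)
```

For the liquid, the density increase over a pressure rise of roughly 30 MPa stays in the percent range, which is the basis for treating liquid water as only weakly compressible.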
Viscous Fluids
Flow of real fluids is a dissipative process and requires continuous operation of forces. A velocity imposed on a fluid’s surface, e.g., by the action of a tangential or shear force, is only partly transmitted into the bulk of the fluid by momentum interaction between neighboring fluid particles. The velocity perturbation decays with distance from the surface. The penetration depth depends on the specific fluid and is quantified by its viscosity. As a result of the balance of entropy, the total stress tensor $\mathbf{T}^{f}$ is split into an equilibrium and a nonequilibrium part:

$$\mathbf{T}^{f} = \mathbf{T}^{f}_{\mathrm{eq}} + \mathbf{T}^{f}_{\mathrm{neq}} = \mathbf{T}^{f}_{E} - p\,\mathbf{I}. \tag{13}$$

The deviatoric part of the equilibrium stresses is zero ($\mathrm{dev}(\mathbf{T}^{f}_{\mathrm{eq}}) = \mathbf{0}$), and the volumetric part of the nonequilibrium stresses is also zero ($\mathrm{vol}(\mathbf{T}^{f}_{\mathrm{neq}}) = \mathbf{0}$). The nonequilibrium or extra stress part $\mathbf{T}^{f}_{E}$ is given by

$$\mathbf{T}^{f}_{E} = 2\,\eta^{fR}\,\mathrm{dev}(\mathbf{D}), \tag{14}$$

introducing formally the dynamic viscosity $\eta^{fR}$ [kg/(m s)] or [N s/m$^{2}$]. Thus, dynamic viscosity constitutes the proportionality constant between shear stress, i.e., the off-diagonal elements of the stress (or the extra stress) tensor, and the component of the velocity gradient normal to the direction of the shear stress, i.e., in the direction of the normal vector of the surface under consideration. Performing a rheological shear test, for example, in a plate-plate viscosimeter, we could obtain the relation between the shear stress component $\tau$ and the shear rate $\dot{\gamma}$ of a linear viscous fluid as $\tau = \eta^{fR}\,\dot{\gamma}$. In a three-dimensional setting, this linear relation is expressed by Eq. (14). Linear viscous behavior, i.e., a constant viscosity, is commonly addressed as Newtonian behavior. The dynamic viscosity of water at room temperature $\theta = 293$ K is given by $\eta^{fR} = 1.00 \times 10^{-3}$ N s/m$^{2}$. The dynamic viscosity is also still given in “centi-Poise” [cP] with [cP] = $10^{-3}$ [Pa s] = $10^{-3}$ [N s/m$^{2}$]. The kinematic viscosity $\nu^{fR} = \eta^{fR}/\rho_0^{fR}$ is expressed in [m$^{2}$/s] or in the unit “stokes” [St], with [m$^{2}$/s] = $10^{4}$ [St]. Kinematic viscosity can be considered a diffusivity of momentum linking characteristic length and time scales of momentum transfer between fluid particles.
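Linear Elastic Solids
For a linear elastic solid, the extra stress part consists of a volumetric and a deviatoric part given by

$$\mathbf{T}^{s}_{E} = \mathrm{vol}(\mathbf{T}^{s}_{E}) + \mathrm{dev}(\mathbf{T}^{s}_{E}) = 3\,K\,\mathrm{vol}(\boldsymbol{\varepsilon}_{s}) + 2\,G\,\mathrm{dev}(\boldsymbol{\varepsilon}_{s}) \tag{15}$$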
with bulk modulus $K$ and shear modulus $G$. For this presentation, we adopt the sign convention from continuum mechanics. Thus, compressive (total and effective) stresses (leading to a decrease in volume) are negative. Pore pressure and confining pressure are positive. Note that in many fields in geosciences, compressive stresses are treated as positive.
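The constitutive relation (15) and the sign convention can be illustrated with a short numerical sketch. The moduli below are assumed, granite-like round numbers, the function name is hypothetical, and the volumetric part of the strain is taken as one third of its trace times the unity tensor.

```python
import numpy as np


def elastic_extra_stress(strain, K, G):
    """Extra stress of an isotropic linear elastic solid, Eq. (15)."""
    vol = np.trace(strain) / 3.0 * np.eye(3)   # volumetric part of the strain
    dev = strain - vol                          # deviatoric (traceless) part
    return 3.0 * K * vol + 2.0 * G * dev


# Assumed moduli [Pa] and a uniform volumetric compaction of 0.1 %.
K, G = 40.0e9, 30.0e9
strain = -1.0e-3 / 3.0 * np.eye(3)
stress = elastic_extra_stress(strain, K, G)
print(stress[0, 0])   # negative value: compressive stress in this sign convention
```

A purely volumetric strain produces an isotropic stress of magnitude K times the volumetric strain, here a compressive (negative) stress of 40 MPa.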
2.4 Field Equations
Navier-Stokes Equations
To derive the Navier-Stokes equations, i.e., the set of field equations or partial differential equations (PDEs) expressed in partial derivatives of the primary variable $\mathbf{v}$, we insert the introduced kinematic relation (5) and the constitutive expression for a Newtonian fluid (14) into the balance of momentum in local form. This approach leads to

$$\partial_t(\rho^{fR}\,\mathbf{v}) + \rho^{fR}\,\mathrm{grad}\,\mathbf{v}\cdot\mathbf{v} - \eta^{fR}\,\mathrm{div}\,\mathrm{grad}\,\mathbf{v} + \mathrm{grad}\,p = \rho^{fR}\,\mathbf{b}, \tag{16}$$

a set of three partial differential equations commonly addressed as the Navier-Stokes equations. This set of PDEs has to be supplemented by an equation of state for the pressure if the fluid is compressible, e.g., (12), or a kinematic constraint if the fluid is incompressible, i.e.,

$$\mathrm{div}\,\mathbf{v} = 0, \tag{17}$$

known as the continuity equation. The constraint $\rho^{fR}(\mathbf{x}, t) = \rho^{fR}(\mathbf{x}, t_0) =: \rho_0^{fR}$ is a consequence of mass conservation (8) and (9). In dimensionless form, the Navier-Stokes equations read

$$\mathrm{St}\,\partial_{t^{\star}}(\mathbf{v}^{\star}) - \frac{1}{\mathrm{Re}}\,\mathrm{div}^{\star}\,\mathrm{grad}^{\star}\,\mathbf{v}^{\star} + \mathrm{grad}^{\star}\,\mathbf{v}^{\star}\cdot\mathbf{v}^{\star} + \mathrm{grad}^{\star}\,p^{\star} = \mathbf{b}^{\star}, \tag{18}$$

where we introduced the Reynolds number Re and the Strouhal number St

$$\mathrm{Re} = \frac{\tilde{l}\,\tilde{v}\,\rho^{fR}}{\eta^{fR}} \quad\text{and}\quad \mathrm{St} = \frac{\tilde{l}}{\tilde{t}\,\tilde{v}} \tag{19}$$

describing the ratio of inertia to viscous forces and the ratio of local to convective (or advective) acceleration, respectively (e.g., Truckenbrodt 1996, p. 128). In the definition of Re and St (19), we introduced characteristic quantities of the problem (indicated with a tilde), like length $\tilde{l}$, velocity $\tilde{v}$, and time $\tilde{t}$. Only very few exact analytic solutions of the Navier-Stokes equations are known today for rather specific cases. Among these are the ones for stationary flow in simple conduits that we present in the next section. With the help of computational methods (finite volumes, finite elements, finite differences, or smoothed particle hydrodynamics), numerical
solutions of the Navier-Stokes equations can be obtained for various problems. Especially for laminar flow conditions, computational fluid dynamics (CFD) has extended the theoretical knowledge about the Navier-Stokes equations within the last decades.
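The two dimensionless groups of Eq. (19) are simple to evaluate once characteristic scales are chosen. The sketch below uses assumed scales for water creeping through a narrow conduit; the numerical values and function names are hypothetical and serve only to show the orders of magnitude involved.

```python
def reynolds(length, velocity, density, viscosity):
    """Reynolds number Re = l * v * rho / eta, Eq. (19)."""
    return length * velocity * density / viscosity


def strouhal(length, time, velocity):
    """Strouhal number St = l / (t * v), Eq. (19)."""
    return length / (time * velocity)


# Assumed characteristic scales: 0.1 mm wide conduit, 1 mm/s flow velocity,
# water properties, and pressure conditions changing over about one hour.
Re = reynolds(length=1.0e-4, velocity=1.0e-3, density=1000.0, viscosity=1.0e-3)
St = strouhal(length=1.0e-4, time=3600.0, velocity=1.0e-3)
print(f"Re = {Re:.3g}, St = {St:.3g}")
```

With Re well below one, the factor 1/Re in Eq. (18) makes the viscous term dominate the convective term, which is the regime in which the creeping-flow approximation is commonly adopted.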
Lamé-Navier Equations
Another set of fundamental field equations is given by the Lamé-Navier equations that describe the linear propagation of elastic waves in solids or, if inertia terms vanish ($\rho^{sR}\,\ddot{\mathbf{u}} = \mathbf{0}$), the deformation behavior of elastic solids under the influence of near-field and far-field forces. The derivation of these field equations follows the technique described for the Navier-Stokes equations. We insert the kinematic relations for the linear strains (6) and (7) and the generalized Hooke’s law (15) into the local form of the balance of momentum to get

$$\rho^{sR}\,\ddot{\mathbf{u}} = \mu\,\mathrm{div}\,\mathrm{grad}\,\mathbf{u} + (\lambda + \mu)\,\mathrm{grad}\,\mathrm{div}\,\mathbf{u} + \rho^{sR}\,(1 - \mathrm{div}\,\mathbf{u})\,\mathbf{b}. \tag{20}$$
From Eq. (20), we find the velocity $c_P$ of a longitudinal wave mode called the compressional wave or P-wave and the velocity $c_S$ of a transversal wave mode called the shear wave or S-wave. The standard notation uses the subscript P for “primary wave,” the faster of the two waves, and the subscript S for “secondary wave.” In the Lamé-Navier equations (20), we used the properties of linear elastic isotropic materials. As introduced in Eq. (15), the stress tensor depends on two material parameters, bulk modulus $K$ and shear modulus $G$, that relate to the two Lamé parameters $\lambda$ and $\mu$ according to

$$\rho^{sR}\,c_P^{2} = K + \tfrac{4}{3}\,G = \lambda + 2\mu, \qquad \lambda = K - \tfrac{2}{3}\,\mu, \qquad\text{and}\qquad \rho^{sR}\,c_S^{2} = \mu = G. \tag{21}$$
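The relations collected in Eq. (21) translate directly into wave speeds. The following sketch uses assumed, granite-like values for the moduli and the density; the numbers and function names are illustrative and not taken from the chapter.

```python
import math


def wave_speeds(K, G, density):
    """P- and S-wave speeds from Eq. (21): rho*cP^2 = K + 4G/3 and rho*cS^2 = G."""
    c_p = math.sqrt((K + 4.0 * G / 3.0) / density)
    c_s = math.sqrt(G / density)
    return c_p, c_s


def lame_parameters(K, G):
    """Lame parameters (lambda, mu) from bulk and shear modulus, Eq. (21)."""
    return K - 2.0 * G / 3.0, G


# Assumed values: K = 40 GPa, G = 30 GPa, density 2650 kg/m^3.
K, G, rho = 40.0e9, 30.0e9, 2650.0
c_p, c_s = wave_speeds(K, G, rho)
lam, mu = lame_parameters(K, G)
print(f"cP = {c_p:.0f} m/s, cS = {c_s:.0f} m/s, lambda = {lam:.3g} Pa")
```

Since K + 4G/3 always exceeds G for positive moduli, the P-wave is the faster of the two modes, in line with its designation as the primary wave.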
3 Stationary Flow Processes: Transport
Stationary flow processes are entirely dominated by the geometry of the pathway provided to the moving fluid, the conduit(s), and the fluid’s intrinsic resistance to flow. The solid constituting the conduit can safely be treated as non-deformable in many applications since the conduit geometry remains fixed at steady state. In principle, the fluid is heated or cooled during compression or dilation, and thus flow rate depends on the heat exchange between the fluid and its environment. We neglect this type of hydrothermal coupling in the following. In geothermal applications, the volume changes of the involved fluids are typically moderate, and heat flow is dominated by the thermal characteristics of the reservoir. The temperature differences between the pumped fluid and the reservoir will typically largely exceed any temperature variations due to volume changes of the fluid.
3.1 Hydraulic Conduits with Simple Geometries
Basic Equations I: Stokes Equation
The classic flow problems to be presented next consider a fluid volume and its movement within a conduit of simple geometry in response to an imposed stationary pressure difference. The solid comprising the conduit is initially treated as rigid, i.e., undeformable. This constitutive assumption ensures that the boundaries for the fluid flow remain fixed in space and time. According to fundamental observations, fluids flow from regions where they experience high pressure to regions where they experience low pressure. Spatial pressure variations drive fluid flow in a direction toward pressure equilibration. Given the flow is sufficiently slow, i.e., inertia forces can be neglected, an incompressible fluid volume experiences essentially normal or pressure forces due to pressure differences between its opposing surfaces and tangential or (viscous) shear forces due to velocity differences of its surface relative to neighboring fluid volumes. At steady state, the fluid volume moves with constant velocity, and thus the total force is zero. Evaluating the force balance and performing the transition to infinitesimal volumes leads to the Stokes equation for creeping flow in the absence of body forces:

$$-\eta^{fR}\,\mathrm{div}\,\mathrm{grad}\,\mathbf{v} + \mathrm{grad}\,p = \mathbf{0}, \tag{22}$$

where the gradient of pressure represents the imbalance of normal forces across the fluid volume and the imbalance of tangential forces is converted to an expression for velocity using the constitutive relation (14). Equation (22) is supplemented by an equation of state for compressible fluids, e.g., (12), or the continuity equation in case of incompressible liquids (17). This partial differential equation allows for analytic solutions in case of specific conduit geometries and boundary conditions. Stokes equation is the specific form of the full Navier-Stokes equation (16) for stationary conditions ($\partial_t(\rho^{fR}\,\mathbf{v}) = \mathbf{0}$) in the absence of body forces ($\mathbf{b} = \mathbf{0}$) when the nonlinear term $\rho^{fR}\,\mathrm{grad}\,\mathbf{v}\cdot\mathbf{v}$ is negligible.
Law of Hagen-Poiseuille: Tubes
The flow of fluids through tubes is of obvious fundamental technical interest given that the transport of fluids as well as their local provision and discharge are predominantly realized by employing man-made tube networks. However, tubes also constitute one of the fundamental idealized pore geometries of the pore space of rocks (e.g., Bernabé et al. 1982) or more generally porous materials. Exploiting the cylindrical symmetry for a tube with its axis oriented along the direction of $\mathbf{e}_z$ at $z = 0$ (Fig. 1) simplifies Stokes equation (22) to

$$\frac{1}{\eta^{fR}}\,\partial_z p = \frac{1}{r}\,\partial_r\left(r\,\partial_r v\right). \tag{23}$$

Fig. 1 Schematic of the radial velocity distribution $v(r)$ for flow in a cylindrical tube of radius $R$ and length $L$ between inlet pressure $p_{\mathrm{in}}$ and outlet pressure $p_{\mathrm{out}}$ (coordinates $r$ and $z$)

Accounting for two dynamic boundary conditions, specifying the pressures at the inlet and outlet ($p_{\mathrm{in}}$, $p_{\mathrm{out}}$), and two kinematic boundary conditions, $v(r = R) = 0$, i.e., no-slip condition at the tube’s inner wall, and $\partial_r v|_{r=0} = 0$ owing to axial symmetry, yields the well-known quadratic velocity profile

$$v(r) = -\frac{1}{4\,\eta^{fR}}\,\partial_z p\,\left(R^{2} - r^{2}\right) = \frac{1}{4\,\eta^{fR}}\,\frac{\Delta p}{L}\,\left(R^{2} - r^{2}\right) \tag{24}$$

($\Delta p := p_{\mathrm{in}} - p_{\mathrm{out}}$) and upon integration the law of Hagen-Poiseuille

$$Q_z = \partial_t V^{f} = -\frac{\pi R^{4}}{8\,\eta^{fR}}\,\partial_z p = \frac{\pi R^{4}}{8\,\eta^{fR}}\,\frac{\Delta p}{L}. \tag{25}$$

The minus sign in front of the pressure gradient emphasizes that the fluid yield is “downhill” relative to the direction of the pressure gradient that indicates the direction and magnitude of the steepest increase in fluid pressure. For given boundary conditions, i.e., prescribed pressures at the opposing ends, the conduit geometry and the fluid’s properties expressed by tube radius and viscosity, respectively, determine jointly the volume-flow rate. The fourth-power dependence of volume-flow rate on tube radius bears eminent consequences for hydraulically preferential paths in networks composed of tubes with varying geometrical characteristics (e.g., Bernabé and Bruderer 1998). The inverse dependence on fluid viscosity is of equal importance considering that viscosities of geoscientifically relevant fluids differ by orders of magnitude from $10^{-5}$ Pa s for gases to $10^{-3}$ Pa s for water, to $10^{-3}$ (light) and up to 100 Pa s (heavy) for oil, and to $10^{8}$ Pa s for magmas. The viscosity dependence of flow rate accounts for the necessity to operate turbines with gases and for the enhancement of productivity of heavy-oil reservoirs upon heating.
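The fourth-power dependence on tube radius is readily demonstrated numerically. The sketch below evaluates the velocity profile (24) and the law of Hagen-Poiseuille (25) with the pressure drop defined as inlet minus outlet pressure, so that a positive value drives flow in the positive axial direction. The tube dimensions and the water viscosity are assumed example values, and the function names are hypothetical.

```python
import math


def poiseuille_velocity(r, radius, dp, length, viscosity):
    """Axial velocity at radial position r, Eq. (24), with dp = p_in - p_out."""
    return (radius**2 - r**2) * dp / (4.0 * viscosity * length)


def poiseuille_flow_rate(radius, dp, length, viscosity):
    """Volume-flow rate through a tube, Eq. (25), with dp = p_in - p_out."""
    return math.pi * radius**4 * dp / (8.0 * viscosity * length)


# Assumed example: water (1e-3 Pa s) in a 1 m long tube under a 1 kPa pressure drop.
eta, length, dp = 1.0e-3, 1.0, 1.0e3
q_small = poiseuille_flow_rate(1.0e-3, dp, length, eta)   # radius 1 mm
q_large = poiseuille_flow_rate(2.0e-3, dp, length, eta)   # radius 2 mm
v_max = poiseuille_velocity(0.0, 1.0e-3, dp, length, eta)  # centerline velocity

print("flow-rate ratio for doubled radius:", q_large / q_small)
print("mean / maximum velocity:", q_small / (math.pi * 1.0e-3**2) / v_max)
```

Doubling the radius raises the flow rate by a factor of sixteen, and the mean velocity equals half the centerline velocity, as expected for a parabolic profile.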
Cubic Law: Slits
The flow of fluids through slits is also of obvious fundamental technical interest, for example, in lubrication. Slits also constitute one of the idealized pore geometries used to model pore networks of rocks (e.g., Bernabé et al. 1982). Employing a Cartesian coordinate system and analyzing Stokes equation (22) for flow in direction of $\mathbf{e}_z$ in a slit of width (aperture) $w$ between parallel plates with a normal vector in the direction of $\mathbf{e}_x$ and located at $x = \pm w/2$ (Fig. 2) yields

$$\frac{1}{\eta^{fR}}\,\partial_z p = \partial_{xx} v. \tag{26}$$

Fig. 2 Schematic of the velocity distribution $v(x)$ for flow between parallel plates of aperture $w$, breadth $B$, and length $L$ between inlet pressure $p_{\mathrm{in}}$ and outlet pressure $p_{\mathrm{out}}$ (coordinates $x$, $y$, and $z$)

Solving the partial differential equation for dynamic (prescribed in- and outlet pressure as above) and kinematic boundary conditions (the no-slip condition at the plates’ surfaces $v(x = \pm w/2) = 0$ and the requirement from symmetry $\partial_x v|_{x=0} = 0$) leads to a quadratic velocity profile

$$v(x) = -\frac{1}{2\,\eta^{fR}}\left(\frac{w^{2}}{4} - x^{2}\right)\partial_z p = \frac{1}{2\,\eta^{fR}}\left(\frac{w^{2}}{4} - x^{2}\right)\frac{\Delta p}{L} \tag{27}$$

(with $\Delta p := p_{\mathrm{in}} - p_{\mathrm{out}}$ as for the tube) closely resembling the one for a tube (24). Upon integration over the slit’s cross section, one arrives at the famous cubic law

$$Q_z = \partial_t V^{f} = \frac{B\,w^{3}}{12\,\eta^{fR}}\,\frac{\Delta p}{L}. \tag{28}$$

The slit’s extension in the direction of $\mathbf{e}_y$ is denoted $B$. As the radius for tubes, the aperture is the critical geometrical parameter due to its third-power relation to flow rate. Even creeping flow between plates becomes immediately more complicated than presented here when altering the boundary conditions only slightly. For example, a steady-state solution does not exist for a point source between the plates (a zero-order model for a fracture or joint intersected by a borehole) unless a constant-pressure boundary condition is enforced at some finite distance from the source.
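The step from the velocity profile (27) to the cubic law (28) can be verified numerically by integrating the profile over the slit's cross section. The sketch below uses a simple midpoint rule and assumed example values (water viscosity, 0.1 mm aperture); the pressure drop is defined as inlet minus outlet pressure, and all names are hypothetical.

```python
def slit_velocity(x, aperture, dp, length, viscosity):
    """Velocity at distance x from the mid-plane, Eq. (27), with dp = p_in - p_out."""
    return (aperture**2 / 4.0 - x**2) * dp / (2.0 * viscosity * length)


def slit_flow_rate(aperture, breadth, dp, length, viscosity):
    """Volume-flow rate between parallel plates, Eq. (28)."""
    return breadth * aperture**3 * dp / (12.0 * viscosity * length)


# Assumed example values: aperture [m], breadth [m], pressure drop [Pa],
# length [m], and viscosity [Pa s].
w, B, dp, L, eta = 1.0e-4, 1.0, 1.0e3, 1.0, 1.0e-3

# Midpoint-rule integration of the profile across the aperture, times the breadth.
n = 10000
dx = w / n
q_numeric = B * dx * sum(
    slit_velocity(-w / 2.0 + (i + 0.5) * dx, w, dp, L, eta) for i in range(n)
)
q_cubic = slit_flow_rate(w, B, dp, L, eta)
print(q_numeric, q_cubic)
```

The two numbers agree to within the discretization error of the quadrature, and halving the aperture in `slit_flow_rate` reduces the flow rate by a factor of eight.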
Extended Cubic Law: Rock Fractures
Real rock fractures deviate from the simple model of parallel plates by their surface roughness that also allows the opposing surfaces to be in physical contact at a finite number of asperities without hindering fluid transport per se. Surface roughness perturbs the simple geometry of flow lines along the slit. Flow is redirected around local protrusions, and yield is thus reduced for given boundary conditions. Yet, the simple cubic law maintains its validity when using an effective hydraulic aperture for rough fractures and fractures in mechanical contact as demonstrated by the
seminal work of Witherspoon et al. (1980). However, ignoring the out-of-plane flow components apparently may lead to overestimated transport properties of fractures. At large separations, the topography of the opposing surfaces has little effect. The larger the protrusion of individual asperities relative to the average distance between the surfaces, the larger the disagreement between the real flow rate and the prediction by the parallel-plate model. A significant number of experimental and numerical investigations aimed at determining empirical factors, so-called friction factors f > 1, that account for roughness and multiply the geometrical factor of 12 in the denominator of (28) (e.g., Brown 1987; Renshaw 1995). Specifically, Renshaw (1995) proposed a modification that employs standard deviation of aperture and average aperture as statistical measures of the roughness. In numerical simulations of rough fractures, the local cubic law approach is commonly employed since results correspond closely to solving the full flow problem (e.g., Brush and Thomson 2003). At small separations and for fracture halves in contact, the flow is tortuous, tending to be channeled through high-aperture regions (Witherspoon et al. 1980). As the fractional contact area increases, the friction factor increases (e.g., Piggott and Elsworth 1992; Zimmerman et al. 1992). Eventually, the channel network along the slit becomes subject of percolation issues (e.g., Nolte et al. 1992). Fracture geometry is sensitive to loading. Normal loads prominently change average aperture. The resistance to normal displacement is typically expressed by normal stiffness. The nonlinear asperity deformation (Hertzian contacts) often causes a load dependence of stiffness. Increasing normal load reduces the yield of fluid volume for given boundary conditions in pressure. The correlation between the mechanical and hydraulic properties of fractures was investigated in some detail by Pyrak-Nolte and Morris (2000). 
Sisavath et al. (2003) presented a simple model for deviations from the cubic law for a fracture undergoing dilation or closure. Tangential loads may cause dilatant shear displacements that increase average aperture (e.g., Yeo et al. 1998). Furthermore, shear displacements may give rise to anisotropy in flow beyond that caused by intrinsic structures of the surfaces (e.g., Auradou et al. 2006; Giacomini et al. 2008). To date, the critical issue for quantitative predictions remains the inclusion of surface roughness measurements in micromechanical concepts from which hydraulic and mechanical properties can be derived simultaneously (e.g., Olsson and Brown 1993).
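The effect of a friction factor on the parallel-plate prediction can be illustrated with a minimal sketch. It assumes the standard form of the cubic law for a slit of aperture h and width w, Q = -(w h^3 / (12 f eta)) dp/dx, with f = 1 for smooth plates; the function name and all numerical values are illustrative and not taken from the text.

```python
def parallel_plate_flow_rate(aperture, width, viscosity, pressure_gradient,
                             friction_factor=1.0):
    """Volumetric flow rate (m^3/s) through a slit from the cubic law.

    Assumed form: Q = -(width * aperture**3) / (12 * f * viscosity) * dp/dx.
    friction_factor = 1 corresponds to smooth parallel plates; values > 1
    multiply the geometrical factor 12 to mimic roughness.
    """
    if friction_factor < 1.0:
        raise ValueError("friction factor is expected to be >= 1")
    return (-(width * aperture ** 3)
            / (12.0 * friction_factor * viscosity) * pressure_gradient)


# Illustrative values: 0.1 mm aperture, 1 m width, water-like viscosity,
# pressure decreasing by 1 kPa per metre in the flow direction.
q_smooth = parallel_plate_flow_rate(1e-4, 1.0, 1e-3, -1e3)
q_rough = parallel_plate_flow_rate(1e-4, 1.0, 1e-3, -1e3, friction_factor=1.5)
print(q_smooth, q_rough)
```

Because the flow rate scales with the cube of the aperture, halving the aperture reduces the flow by a factor of eight, whereas the friction factor only enters linearly.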
3.2
Complex Geometries: From Conduit Networks to Porous Media
The governing equations of (macroscale) single-phase fluid flow in porous media can be developed on the basis of (a) averaging procedures of the microscale flow processes in individual conduits (pores) described by the Navier-Stokes field equations (Howes and Whitaker 1985); (b) empirical (transport) equations on the macroscale disregarding the morphology of the porous network, cf. classic Darcy’s law; and (c) the continuum-mixture theory that may be regarded as an extension
Modeling of Fluid Transport in Geothermal Research
of the classic continuum theory (Hassanizadeh and Gray 1979a,b; Bowen 1982; Coussy 1995; Ehlers and Bluhm 2002). We will focus on continuum-mixture theory after presenting some basic aspects of microscale approaches to hydraulic properties. Treating pore space as a three-dimensional network composed of individual hydraulic conduits varying in shape and size is conceptually appealing but leaves one with the formidable task of finding a finite number of simple descriptors: 1. of the statistics of the geometrical characteristics of the hydraulic conduits that in the best of cases are accessible to “direct” determination, and 2. of the network “design” or topology. One such approach, often associated with the keyword “hydraulic radius,” is also subsumed under “Kozeny-Carman models” in honor of early contributors (e.g., Schopper 1982). Hydraulic radius is considered the central geometrical characteristic of the pore space. Typically, it is quantified from the ratio between pore volume and pore surface. Pore volume is readily measured in the laboratory (by either fluid saturation or determination of bulk and component density), while pore surface is accessible only by indirect methods such as nitrogen adsorption. In Kozeny-Carman models, tortuosity, a measure of flow path length relative to sample length, quantifies the topology of the pore space and is also closely related to measures of connectivity between pores. The controlled design of networks is a basic step in the quest for representative parameters. While numerical approaches have dominated research in this direction since the advent of fast personal computers, analogue models composed of electrical resistors were employed in early work (Greenberg and Brace 1969; Rink and Schopper 1968), and synthetic porous media, for example, prepared from well-sorted powders, are still in use today (e.g., Blair et al. 1996; Mok et al. 2002; Revil and Cathles 1999).
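The hydraulic-radius concept described above can be sketched in a few lines. The code assumes one common textbook form of the Kozeny-Carman relation, k = phi * r_hyd^2 / (c * tau^2), with shape factor c = 2 for circular tubes and tortuosity tau = 1 for straight conduits; other forms exist in the literature, and the function names are hypothetical.

```python
import math


def hydraulic_radius(pore_volume, pore_surface):
    """Hydraulic radius as the ratio of pore volume to pore surface (m)."""
    return pore_volume / pore_surface


def kozeny_carman_permeability(porosity, r_hyd, tortuosity=1.0,
                               shape_factor=2.0):
    """Permeability (m^2) from an assumed Kozeny-Carman form.

    k = porosity * r_hyd**2 / (shape_factor * tortuosity**2)
    """
    return porosity * r_hyd ** 2 / (shape_factor * tortuosity ** 2)


# Consistency check against a bundle of straight circular tubes of radius r,
# for which Hagen-Poiseuille flow gives k = porosity * r**2 / 8.
r, length, porosity = 1e-5, 1e-2, 0.1
r_hyd = hydraulic_radius(math.pi * r ** 2 * length, 2 * math.pi * r * length)
k = kozeny_carman_permeability(porosity, r_hyd)
print(r_hyd, k)
```

For the tube bundle the hydraulic radius equals r/2, so the sketch reproduces the tube-bundle permeability; tortuosity values above one reduce the estimate quadratically.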
Network modeling relying on tubes has proven very valuable, for example, for understanding the role of heterogeneity (e.g., Bernabé and Bruderer 1998; Madden 1976). Heterogeneity of pore space actually has two aspects: the variability in geometry of conduits and the variability of connectivity of conduits in the network. The range of networks associated with a certain number and arrangement of a certain ensemble of pores is specifically a subject of percolation theory (Berkowitz and Ewing 1998). A continuous line of work on numerical network modeling has been presented by Bernabé and coworkers (Bernabé 1995; Bernabé and Bruderer 1998). Recently, coordination number was suggested to suffice as a complementary parameter to hydraulic radius for permeability modeling (Bernabé et al. 2010, 2011). Extended micromechanical modeling suggests that three geometrical parameters, namely pore radius, length, and separation, rather than a single characteristic length scale are required for sound models of hydraulic parameters (e.g., Gavrilenko and Gueguen 1989). These models invoke penny-shaped cracks or cylindrical tubes as basic geometries for pores (e.g., Matthews et al. 1993). Many of the theoretical
J. Renner and H. Steeb
developments happened in parallel for the two closely linked transport properties, hydraulic and electric (e.g., Avellaneda and Torquato 1991; Madden 1976; Schopper 1982). Investigations into the length scale relating hydraulic and electric transport properties revealed that the subtle differences between electrical and fluid fields are affected by spatial randomness of the pore space that may not be easily realized in numerical models that, for example, employ periodic structures (Martys and Garboczi 1992). Analytic models complemented the experimental and numerical work (e.g., Torquato 2002). A peculiarity of hydraulic properties of natural rock volumes appears to be a distinct scale dependence related to some hierarchical organization of the hydraulic conduits (e.g., Neuman 2008). This notion rests on direct experimental observations (e.g., Tidwell and Wilson 1999) but also on the indirect comparison of results from experiments involving flow on different spatial scales (e.g., Brace 1980; Hünges et al. 1997; Ingebritsen and Manning 1999). The variability of hydraulic transport properties with spatial scale can be explained in the context of percolation theory (e.g., Hunt 2005). A range of further methods (bounding, heuristic methods, deterministic methods, stochastic methods, etc.) have been invoked for the calculation of equivalent hydraulic properties of heterogeneous porous media (see, e.g., review by Renard and de Marsily 1997). Specifically, the scale dependence was associated with the systematic spatial variation of local percolation probabilities in a statistical treatment employing “local porosity theory” (Hilfer 2002). Geometrical details of the pore space like pore size and specific surface area and their distributions are not accounted for in mixture theory. Continuum-mixture theory is motivated by the idea that various (miscible or even nonmiscible) constituents coexist in a representative volume element (Fig. 3). 
Neglecting small-scale details (e.g., morphology of pore space) and upscaling physical properties of the observed constituents and their interaction, one reaches a model of superimposed continua. Note that each constituent follows its own motion; therefore kinematic quantities and balance relations have to be individually formulated for each constituent. In contrast to classic continua, interaction mechanisms between the constituents (phases) have to be accounted for in balance relations.
Basic Equations II: Biphasic Mixtures
In mixture theory, the description of flow processes of a viscous pore fluid (phase $\varphi^{f}$) in a porous skeleton (phase $\varphi^{s}$) with a complex pore geometry requires the usage of effective, i.e., smeared out, flow properties of the RVE with given representative volume $\mathrm{d}v$. The effective flow properties are based on homogenized, coarse-grained morphological quantities, like porosity $\phi := \mathrm{d}v^{f}/\mathrm{d}v$ or (more general) volume fractions $n^{\alpha} := \mathrm{d}v^{\alpha}/\mathrm{d}v$ of the solid and the fluid constituent, $\alpha \in \{s, f\}$. The mass $\mathrm{d}m^{\alpha}$ of a constituent in the RVE related to the total volume of the RVE defines the partial density $\mathrm{d}m^{\alpha}/\mathrm{d}v =: \rho^{\alpha}$. Consequently, the effective (or true) density $\rho^{\alpha R}$ follows from the definition of the volume fractions $n^{\alpha}$ and the definition of the partial density, i.e., $\rho^{\alpha R} = \mathrm{d}m^{\alpha}/\mathrm{d}v^{\alpha} = \rho^{\alpha}/n^{\alpha}$. Note that porosity constitutes the only quantity of the pore space preserved in classic macroscopic continuum-mixture theories. Morphological details, like specific surface areas, and geometrical details of the pore channels like averaged diameter or distribution of diameters or grain size distribution are not taken into account. To extend the classic mixture theory to multiphase flow in porous media, it was attempted to account for specific surface area as additional microscopic information (Hassanizadeh and Gray 1979a,b). As a consequence of the consideration of two constituents, i.e., the porous solid skeleton and the viscous pore fluid, we have to extend the conservation laws. Neglecting mass exchange, i.e., reactions by which fluid mass can be transformed into solid mass or vice versa, the extended formulation of the partial conservation laws for phase $\varphi^{\alpha}$ should account for an exchange of momentum and also an exchange of energy.

Fig. 3 Representative volume element (RVE) and related scales (panels: effective geological model, macro-scale; heterogeneous meso-scale; discrete grain scale, micro-scale; length scales $L$, $l$, $\lambda$). Quantities like effective densities $\rho^{\alpha R}$ are defined on mesoscale
Kinematics
The relative (seepage) velocity is introduced as the difference between the velocity of the fluid and the velocity of the solid, $\mathbf{w}_f = \mathbf{v}_f - \mathbf{v}_s$. Furthermore, the filter velocity or Darcy’s velocity is given by $\mathbf{q}_f = \phi\,\mathbf{w}_f$.
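The definitions of volume fractions, partial and effective densities, and seepage and filter velocity can be collected in a short sketch. It assumes a fully saturated RVE (solid fraction equals one minus porosity) and one-dimensional velocities; the function name and the sample numbers are illustrative only.

```python
def mixture_state(dv, dv_fluid, dm_fluid, dm_solid, v_fluid, v_solid):
    """Homogenized quantities of a saturated biphasic RVE (1-D velocities)."""
    porosity = dv_fluid / dv              # phi = dv^f / dv
    n_solid = 1.0 - porosity              # saturation condition (assumed)
    rho_f = dm_fluid / dv                 # partial density of the fluid
    rho_s = dm_solid / dv                 # partial density of the solid
    w_f = v_fluid - v_solid               # seepage velocity
    return {
        "porosity": porosity,
        "rho_f": rho_f,
        "rho_fR": rho_f / porosity,       # effective (true) fluid density
        "rho_s": rho_s,
        "rho_sR": rho_s / n_solid,        # effective (true) solid density
        "w_f": w_f,
        "q_f": porosity * w_f,            # filter (Darcy) velocity
    }


# Illustrative RVE of 1 cm^3 with 20 % pore space filled with water.
state = mixture_state(dv=1e-6, dv_fluid=2e-7, dm_fluid=2e-4,
                      dm_solid=2.12e-3, v_fluid=1e-4, v_solid=0.0)
print(state)
```

The filter velocity is smaller than the seepage velocity by the factor porosity, which is why travel times of fluid particles are governed by the seepage velocity rather than by Darcy’s velocity.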
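Partial Balance of Momentum of a Biphasic Mixture
The momentum of each phase $\varphi^{\alpha}$,
\[ \mathbf{J}^{\alpha} = \int_{\mathcal{B}} \dot{\mathbf{x}}_{\alpha}\, \mathrm{d}m^{\alpha} = \int_{\mathcal{B}} \dot{\mathbf{x}}_{\alpha}\, \rho^{\alpha}\, \mathrm{d}v, \tag{29} \]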
is now complemented by a momentum interaction (compared to 33). The momentum interaction terms (e.g., drag forces) account for the exchange of momentum between the solid and the fluid phase. One important example of momentum interaction is related to flow of a viscous fluid through pores. Assuming no-slip boundary conditions, velocity is zero at the pore wall. In accordance with this Dirichlet condition, the reaction forces associated with viscous flow constitute the interaction terms, i.e.,
\[ \hat{\mathbf{P}}^{\alpha} = \int_{\mathcal{B}} \hat{\mathbf{p}}^{\alpha}\, \mathrm{d}v, \quad \text{with} \quad \sum_{\alpha} \hat{\mathbf{p}}^{\alpha} = \hat{\mathbf{p}}^{s} + \hat{\mathbf{p}}^{f} = \mathbf{0}, \tag{30} \]
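where $\hat{\mathbf{p}}^{\alpha}$ denotes local momentum interaction. Finally, the momentum of a constituent is changed by the sum of body forces and contact forces, $\mathbf{F}^{\alpha}$, and interaction forces, $\hat{\mathbf{P}}^{\alpha}$,
\[ (\mathbf{J}^{\alpha})'_{\alpha} := \partial_t \mathbf{J}^{\alpha} + \vec{v}_{\alpha} \cdot \operatorname{grad} \mathbf{J}^{\alpha} = \mathbf{F}^{\alpha} + \hat{\mathbf{P}}^{\alpha}. \tag{31} \]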
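Here, we represent the material or substantial derivative with respect to the partial velocities by terms in parentheses that are primed and subscripted to indicate the partial velocity to be used for the advection term. Near-field and far-field forces $\mathbf{F}^{\alpha} = \mathbf{F}^{\alpha}_{\partial\mathcal{B}} + \mathbf{F}^{\alpha}_{\mathcal{B}}$ are defined in analogy to single-phase continua (see presentation of Eq. 10):
\[ \mathbf{F}^{\alpha}_{\partial\mathcal{B}} = \int_{\partial\mathcal{B}} \mathbf{t}^{\alpha}\, \mathrm{d}a, \qquad \mathbf{F}^{\alpha}_{\mathcal{B}} = \int_{\mathcal{B}} \rho^{\alpha}\, \mathbf{b}^{\alpha}\, \mathrm{d}v. \tag{32} \]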
The constraint in Eq. (30) links the partial balance of momentum to the balance of momentum of the mixture when Eq. (31) is summed up for all constituents $\varphi^{\alpha}$.

Constitutive Equations
The extended momentum balance (31) requires formulating a further constitutive equation for the momentum interaction $\hat{\mathbf{p}}^{f}$. Based on thermodynamic considerations (cf. Ehlers and Bluhm 2002; Hutter and Schneider 2009), we split the momentum interaction into an equilibrium and nonequilibrium contribution, i.e., $\hat{\mathbf{p}}^{f} = \hat{\mathbf{p}}^{f}_{\mathrm{eq}} + \hat{\mathbf{p}}^{f}_{\mathrm{neq}}$. Regarding the entropy principle at thermodynamic equilibrium, we obtain $\hat{\mathbf{p}}^{f}_{\mathrm{eq}} = p\, \operatorname{grad} \phi$. Thus, RVEs with conical pore shapes exhibit a static (no-flow) solid-fluid interaction term. Close to thermodynamic equilibrium, the nonequilibrium momentum exchange is proportional to the seepage velocity, i.e.,
\[ \hat{\mathbf{p}}^{f}_{\mathrm{neq}} \propto \mathbf{w}_f \quad \text{or} \quad \hat{\mathbf{p}}^{f}_{\mathrm{neq}} = -\frac{\phi^{2}\, \gamma^{fR}}{k^{f}}\, \mathbf{w}_f = -\frac{\phi^{2}\, \eta^{fR}}{k^{s}}\, \mathbf{w}_f. \tag{33} \]
The proportionality factor is determined from experimental observations. Here, $k^{s}$ and $k^{f}$ denote intrinsic permeability (unit [m$^2$] or Darcy, 1 D $\approx 10^{-12}$ m$^2$) and Darcy’s permeability or the hydraulic conductivity (unit [m/s]), respectively. These two transport measures are related by
\[ k^{s} = \frac{\eta^{fR}}{\rho^{fR}\, g}\, k^{f} = \frac{\eta^{fR}}{\gamma^{fR}}\, k^{f}. \tag{34} \]
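Relation (34) is a frequent source of unit errors in practice, so a small conversion sketch may help. It assumes that the effective weight is the product of effective fluid density and gravitational acceleration; the function names are hypothetical, and the water properties are round illustrative numbers.

```python
G = 9.81                 # gravitational acceleration, m/s^2
DARCY = 9.869233e-13     # 1 Darcy expressed in m^2


def hydraulic_conductivity(k_s, rho_fR, eta_fR, g=G):
    """Hydraulic conductivity k^f (m/s) from intrinsic permeability k^s (m^2)."""
    return k_s * rho_fR * g / eta_fR


def intrinsic_permeability(k_f, rho_fR, eta_fR, g=G):
    """Intrinsic permeability k^s (m^2) from hydraulic conductivity k^f (m/s)."""
    return k_f * eta_fR / (rho_fR * g)


# Water-like fluid (assumed values): density 1000 kg/m^3, viscosity 1e-3 Pa s.
k_f = hydraulic_conductivity(DARCY, rho_fR=1000.0, eta_fR=1e-3)
print(k_f)
```

Intrinsic permeability is a property of the pore space alone, whereas hydraulic conductivity also depends on the fluid, so the same rock has different conductivities for water, brine, or gas.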
Linear Darcy’s Flow
Next, we discuss a mixture which consists of a rigid (undeformable) porous skeleton that is fully saturated with a viscous pore fluid. Chemical reactions, e.g., dissolution and precipitation, are disregarded. Furthermore, the system is assumed to be under isothermal conditions. Localizing the partial balance of momentum of the fluid phase, $\alpha = f$ (Eqs. 30–31), yields
\[ \rho^{f}\, \ddot{\mathbf{x}}_f - \operatorname{div} \mathbf{T}^{f} = \rho^{f}\, \mathbf{b} + \hat{\mathbf{p}}^{f}. \tag{35} \]
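Disregarding inertia forces ($\rho^{f}\, \ddot{\mathbf{x}}_f \approx \mathbf{0}$) and viscous shear stresses of the fluid ($\mathbf{T}^{f} \approx -\phi\, p\, \mathbf{I}$, cf. Hassanizadeh and Gray 1987) and combining the resulting equation with the constitutive equation for the exchange of momentum (33), we obtain Darcy’s law
\[ \mathbf{q}_f = -k^{f} \left( \frac{1}{\rho^{fR}\, g}\, \operatorname{grad} p - \frac{\mathbf{b}}{g} \right). \tag{36} \]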
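The step from the fluid momentum balance to Darcy’s law can be spelled out as a short worked derivation. It is a sketch that uses only the relations stated in this section, together with the assumptions $\rho^{f} = \phi\, \rho^{fR}$ and $\gamma^{fR} = \rho^{fR} g$; the equilibrium part $p\, \operatorname{grad} \phi$ of the momentum interaction cancels against the corresponding part of $\operatorname{div} \mathbf{T}^{f}$.

```latex
\begin{align*}
-\operatorname{div}\mathbf{T}^{f}
  &= \rho^{f}\,\mathbf{b} + \hat{\mathbf{p}}^{f}
  && \text{(35) without inertia} \\
\phi\,\operatorname{grad}p + p\,\operatorname{grad}\phi
  &= \phi\,\rho^{fR}\,\mathbf{b} + p\,\operatorname{grad}\phi
     - \frac{\phi^{2}\,\gamma^{fR}}{k^{f}}\,\mathbf{w}_f
  && \mathbf{T}^{f} = -\phi\,p\,\mathbf{I},\ \text{(33)} \\
\phi\,\mathbf{w}_f
  &= -\frac{k^{f}}{\gamma^{fR}}
     \left(\operatorname{grad}p - \rho^{fR}\,\mathbf{b}\right) \\
\mathbf{q}_f
  &= -k^{f}\left(\frac{1}{\rho^{fR}\,g}\,\operatorname{grad}p
     - \frac{\mathbf{b}}{g}\right)
  && \mathbf{q}_f = \phi\,\mathbf{w}_f
\end{align*}
```

The cancellation of $p\, \operatorname{grad} \phi$ is the reason why a spatially varying porosity does not by itself drive flow in this formulation.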
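Regarding one-dimensional flow processes in the direction of $\mathbf{e}_z$, with $\mathbf{b} = \mathbf{g} = -g\, \mathbf{e}_z$, we specifically obtain
\[ q_f = -k^{f} \left( \frac{1}{\rho^{fR}\, g}\, \partial_z p + 1 \right) = -k^{f}\, i, \tag{37} \]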
where we introduced the hydraulic gradient $i$. Taking a closer look at Eq. (36), we observe that the directions of Darcy’s velocity $\mathbf{q}_f$ and of the “driving force” (right-hand side of (36)) are identical. For porous media with an anisotropic pore space, this equation can be generalized by replacing the scalar permeability $k^{f}$ with a second-order permeability tensor $\mathbf{K}^{f} = K^{f}_{ij}\, \mathbf{e}_i \otimes \mathbf{e}_j$,
\[ \mathbf{q}_f = -\mathbf{K}^{f} \left( \frac{1}{\rho^{fR}\, g}\, \operatorname{grad} p - \frac{\mathbf{b}}{g} \right). \tag{38} \]
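The consequence of a tensorial permeability, namely that Darcy’s velocity is in general no longer parallel to the driving force, can be seen in a minimal sketch of Eq. (38). The tensor entries, the pressure gradient, and the function name are illustrative assumptions; the pressure is chosen hydrostatic in the vertical direction so that only a horizontal gradient drives flow.

```python
def darcy_flux(K, grad_p, rho_fR, b, g=9.81):
    """Darcy velocity q = -K (grad p / (rho g) - b / g) for a 3x3 tensor K."""
    drive = [grad_p[i] / (rho_fR * g) - b[i] / g for i in range(3)]
    return [-sum(K[i][j] * drive[j] for j in range(3)) for i in range(3)]


rho, g = 1000.0, 9.81
# Hypothetical hydraulic conductivity tensor (m/s) with off-diagonal terms.
K = [[1.0e-5, 5.0e-6, 0.0],
     [5.0e-6, 2.0e-5, 0.0],
     [0.0,    0.0,    1.0e-6]]
# Hydraulic gradient of 0.01 along x; hydrostatic pressure along z.
grad_p = [0.01 * rho * g, 0.0, -rho * g]
b = [0.0, 0.0, -g]

q = darcy_flux(K, grad_p, rho, b)
print(q)
```

Although the driving force points purely along x, the flux acquires a y component through the off-diagonal entry of the tensor.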
Thus, the second-order permeability tensor transforms, in the sense of an affine mapping, the driving forces to Darcy’s velocity. Let us emphasize here that we have introduced Darcy’s law from the balance of momentum and a (linear) constitutive relation for the nonequilibrium momentum exchange between the solid and fluid phase. Thus, Darcy’s law does not constitute a linear constitutive law but rather a result of a linear constitutive assumption. Comparing Darcy’s law with Hagen-Poiseuille’s law (25) and the cubic law (28), we discover that both relations, though obtained in quite different ways, represent linear relationships between a total discharge in a pipe ($Q_z$) or a local seepage velocity ($\mathbf{q}_f$) and a driving force given by a pressure difference between two spatial positions ($\Delta p$ or $\operatorname{grad} p$).
Nonlinear Extensions
While formally often treated alike, the nonlinear extensions of the transport laws for simple conduits, like straight tubes with circular cross section or perfectly smooth and parallel slits, on the one hand, and real fractures with rough surfaces and porous media with complex pore structure, on the other hand, actually have to account for two different phenomena. Hagen-Poiseuille’s law and the cubic law are exact linear equations that break down suddenly when inertia terms become relevant at high flow rates. Then, flow lines exhibit substantial complexity due to the occurrence of spontaneous eddies even in simple conduits (e.g., Costa et al. 1999). In real fractures and porous media, the presence of obstacles and the intrinsically winding pores enforce curved flow lines at all rates, but changes in the dominating fluid path may occur with increasing flow velocity and the increasing role of inertia effects (e.g., Andrade et al. 1997). A linear effective transport law thus requires nonlinear convective terms in the Navier-Stokes equations (16) to be negligible. The transition to turbulent flow has enormous consequences for the efficiency of fluid-volume transport and advective heat transport. The onset of turbulence and thus the range of validity of Darcy’s law are conventionally characterized by the magnitude of the dimensionless Reynolds number Re, the ratio of inertial to viscous forces. This approach stems from the stability analysis for straight tubes and slits, though real tortuous conduits require estimating the magnitude of convective terms, too. Explicit formulations of Re lead to $\mathrm{Re} = q_f\, \tilde{l}\, \rho^{fR} / \eta^{fR} = q_f\, \tilde{l} / \nu^{f}$, where $\tilde{l}$ denotes a characteristic length scale of the flow problem and $\nu^{f}$ the kinematic viscosity of the fluid. For porous media, the hydraulic radius $r_{\mathrm{hyd}} = \sqrt{k^{s}/\phi}$ is often employed as a measure of mean pore size in the calculation of Re. Fractures are characterized by their effective aperture.
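A quick order-of-magnitude check of the Reynolds number for a porous medium can be sketched as follows, using the square root of permeability over porosity as the characteristic length, as described above. The function names and the sample values are assumptions for illustration.

```python
import math


def hydraulic_radius_from_permeability(k_s, porosity):
    """Characteristic pore size r_hyd = sqrt(k^s / phi) in metres."""
    return math.sqrt(k_s / porosity)


def reynolds_number(q_f, length, rho_fR, eta_fR):
    """Re = q_f * l * rho / eta for filter velocity q_f and length scale l."""
    return q_f * length * rho_fR / eta_fR


# Illustrative sandstone-like values with water as pore fluid.
r_hyd = hydraulic_radius_from_permeability(k_s=1e-12, porosity=0.25)
re = reynolds_number(q_f=1e-4, length=r_hyd, rho_fR=1000.0, eta_fR=1e-3)
print(r_hyd, re)
```

With these numbers Re is several orders of magnitude below one, which is typical of matrix flow and illustrates why nonlinear corrections mostly matter for fractures and near-well regions with high velocities.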
In turbulent flow in tubes, momentum exchange in the transverse direction causes the radial velocity distribution to be significantly more uniform than the parabolic distribution characteristic for laminar flow. The profile is almost flat, pluglike, with steep flanks. Despite the locally chaotic nature of the flow, analytic velocity profiles can be derived for steady turbulent flow in tubes with circular cross section (e.g., Munson et al. 2006) since the eddies reach a statistical steady state when the flow is sufficiently developed (e.g., Smits and Marusic 2013). The boundary layer remains devoid of eddies due to the local dominance of viscous forces. Relatively recent
work related the characteristics of turbulent flow in tubes to nonlinear traveling waves (Hof et al. 2004). In irregular tubes and slits (rough fractures) or granular media, “deterministic” eddies occur at predictable sites associated with obstacles, protrusions, and the like that may actually cause a sensitivity of transport to flow direction (Cardenas et al. 2009). The influence of inertial forces on the bulk flow rate across fractures however remains small when Re < 1 is met among other kinematic criteria (Brush and Thomson 2003). Numerical simulations of high-velocity flow in a self-affine channel indicated that the effective permeability is dominated by the narrowest constrictions at low velocity (Skjetne et al. 1999); the effective tube thickness decreases with increasing Reynolds number. Similarly, accounting for a reduction in effective aperture, extended parallel-plate models comprise a Re dependence of the friction factor f (e.g., Phipps 1981; Nazridoust et al. 2006). Numerical work that focused on modeling the results gained from analogue models of rock fractures yields a transitional regime for Re between 1 and 10 (in which the non-Darcy pressure drop varies with the cube of the flow rate) and a fully turbulent regime for Re larger than 20, in which the non-Darcy pressure drop is quadratic in the flow rate (Zimmerman et al. 2005). Such quadratic extensions of Darcy’s law are commonly addressed as Forchheimer equation (Skjetne and Auriault 1999) with the additional proportionality constant for the quadratic term in flow velocity termed “inertial resistance,” “inertial permeability,” or “turbulence factor” (Sen 1987).
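A quadratic (Forchheimer-type) extension of Darcy’s law can be inverted in closed form for the flux. The sketch below assumes the commonly written one-dimensional form -dp/dx = (eta/k) q + beta * rho * q^2, where beta plays the role of the “inertial resistance” mentioned above; the function name, the coefficient value, and the other numbers are hypothetical.

```python
import math


def forchheimer_flux(pressure_drop_per_length, k_s, eta, rho, beta):
    """Positive root q of (eta/k_s) q + beta*rho*q**2 = -dp/dx.

    pressure_drop_per_length is -dp/dx >= 0 in Pa/m; beta has unit 1/m.
    beta = 0 recovers linear Darcy flow.
    """
    if pressure_drop_per_length < 0.0:
        raise ValueError("expects a non-negative pressure drop per length")
    a = eta / k_s
    c = beta * rho
    # Numerically stable form of the quadratic formula (avoids cancellation).
    return 2.0 * pressure_drop_per_length / (
        a + math.sqrt(a * a + 4.0 * c * pressure_drop_per_length))


drop = 1e5  # Pa/m
q_darcy = forchheimer_flux(drop, k_s=1e-10, eta=1e-3, rho=1000.0, beta=0.0)
q_forch = forchheimer_flux(drop, k_s=1e-10, eta=1e-3, rho=1000.0, beta=1e5)
print(q_darcy, q_forch)
```

The inertial term always reduces the flux relative to the Darcy prediction for the same pressure drop, and the reduction grows with the applied gradient.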
4
Fundamental Laws: A Second Step
The set of governing partial differential equations for a biphasic porous medium (solid skeleton $\varphi^{s}$ and pore fluid $\varphi^{f}$) is described in closed form by the balance of momentum of a mixture and the balance of mass/volume of the mixture. This set of equations is of great practical value since the controllable conditions in physical, i.e., laboratory and field, experiments can be applied in numerical analysis as boundary conditions and initial conditions.
4.1
Balance of Momentum of the Mixture
Neglecting inertia forces, the balance of momentum of a biphasic mixture (31) reads in local form as
\[ -\operatorname{div} \mathbf{T} = \rho\, \vec{b}. \tag{39} \]
The total stress of the mixture contains a solid and a fluid contribution, $\mathbf{T} = \mathbf{T}^{s} + \mathbf{T}^{f}$. The balance of momentum is unaffected by the compressibility of the individual constituents $\varphi^{\alpha}$.
4.2
Balance of Mass of the Mixture
The partial balances of mass for both constituents $\varphi^{\alpha}$ are given as
\[ (\rho^{\alpha})'_{\alpha} + \rho^{\alpha}\, \operatorname{div} \vec{v}_{\alpha} = 0. \tag{40} \]
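Recalling the definition of relative velocity, $\vec{w}_f = \vec{v}_f - \vec{v}_s$, the partial mass balances are reformulated in the seepage velocity and the velocity of the skeleton (cf. Ehlers 2002)
\[ (\rho^{f})'_{s} + \operatorname{div}(\rho^{f}\, \vec{w}_f) + \rho^{f}\, \operatorname{div} \vec{v}_s = 0, \qquad (\rho^{s})'_{s} + \rho^{s}\, \operatorname{div} \vec{v}_s = 0. \tag{41} \]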
This formulation of the basic mass balances in primary kinematic variables $\{\vec{w}_f, \vec{v}_s\}$ can easily be linked to experimentally motivated boundary conditions and can also be used to formulate numerical methods, e.g., finite element methods (cf. Zienkiewicz et al. 1999). Equation (41) can be formally linearized around a reference state $\mathcal{B}_0$ expressed by a set of state variables $x(t_0) =: x_0 = (\vec{w}_{f,0}, \vec{v}_{s,0}, \rho^{fR}_0, \phi_0)^{T}$ where initial porosity $\phi_0$ and initial effective density of the fluid $\rho^{fR}_0$ represent given material properties. In most linearized poro-elastic applications, this set can be simplified to $x_0 = (\vec{0}, \vec{0}, \rho^{fR}_0, \phi_0)^{T}$. While hardly explicitly stated, this simplification also implicitly underlies linear poro-elasticity. Thus, we will also use the latter set of state variables $x_0$ to obtain the linearized balance of mass
\[ \phi_0\, \partial_t \rho^{fR} + \rho^{fR}_0\, \partial_t \phi + \phi_0\, \rho^{fR}_0\, \operatorname{div} \vec{w}_f + \phi_0\, \rho^{fR}_0\, \operatorname{div} \vec{v}_s = 0 \tag{42} \]
that forms the basis for the derivation of the diffusion equation (storage equation) in linear poro-elasticity.
4.3
Mixture Theory Versus Biot’s Theory of Linear Poro-elasticity
The derivation of Biot’s linear theory of poro-elasticity on the basis of continuum-mixture theory has been the subject of controversy in the literature (e.g., Wilmański 2006; Gurevich 2007). Introducing subclasses of Biot’s model including Terzaghi’s consolidation theory (Terzaghi 1923; Verruijt 2010) and a so-called hybrid model based on the “rigid grain assumption” (cf. Steeb 2010) shows the fundamental basis of poro-elasticity. Let us first discuss a compressible poro-elastic medium (compressible grains and compressible skeleton) saturated with a compressible pore fluid. Counting the number of equations and the number of unknowns, we observe that we need to introduce one further (scalar) equation in order to close the system. This type of equation could be in general a balance equation or a constitutive
equation for porosity. Especially for nonlinear formulations, the (general) form of this equation was and still is under scientific discussion (cf. Wilmański 1998, 2006, section 2.3). One possibility is to introduce an evolution equation for porosity in the form (cf. Spanos 2010, p. 34)
\[ \partial_t \phi = a\, \operatorname{div} \vec{v}_s + b\, \operatorname{div} \vec{v}_f \quad \text{or} \quad \phi = \phi_0 + a\, \operatorname{div} \vec{u}_s + b\, \operatorname{div} \vec{u}_f \tag{43} \]
where $a$ and $b$ are dimensionless (material) coefficients. In principle, the relations of $a$ and $b$ to the set of (experimentally) observable material parameters, comprising bulk modulus of the grains $K^{s}$, bulk modulus of the (dry and empty) skeleton $K$, bulk modulus of the fluid $K^{f}$, and the initial porosity in the undeformed/reference configuration $\phi_0$, can and have to be found by “Gedankenexperiment/thought experiment” (cf. Biot and Willis 1957). In the framework of mixture theory, it appears natural to choose $\operatorname{div} \vec{v}_s$ and $\operatorname{div} \vec{v}_f$ as the two fundamental kinematic variables to which to relate porosity. However, other authors have chosen to rest their formulations on the two natural dynamic variables, mean stress and fluid pressure (e.g., Rice and Cleary 1976), or mixes of dynamic and kinematic variables. The total mean stress (often denoted as confining pressure $p^{c}$ from the perspective of laboratory investigations) is given as $\sigma_m := \operatorname{tr} \mathbf{T}/3 = -p^{c}$. Biot (1962) formulated his basic constitutive equations of linear poro-elasticity as
\[ \begin{pmatrix} \sigma_m \\ p \end{pmatrix} = \begin{pmatrix} K + \alpha^{2} M & -\alpha M \\ -\alpha M & M \end{pmatrix} \begin{pmatrix} \operatorname{div} \mathbf{u}_s \\ \zeta \end{pmatrix} \tag{44} \]
introducing two material parameters, $\alpha$ and $M$, and the fluid increment $\zeta = \phi\, \operatorname{div}(\vec{u}_s - \vec{u}_f)$ instead of the volumetric strain of the fluid as the second kinematic variable. On the one hand, fluid increment is the most convenient field variable for calculations of any quantities associated with the amount of transported fluid, for example, heat transported by advection, since $\partial_t \zeta = -\phi_0\, \operatorname{div} \mathbf{w}_f = -\operatorname{div} \mathbf{q}$. On the other hand, $\zeta$ cannot be measured directly in laboratory or field experiments. From Eq. (44), we readily identify $M$ as a storage modulus or $1/M$ as a specific storage capacity, i.e., a measure of change in fluid volume in a bulk volume upon a change in fluid pressure for fixed volumetric strain of the skeleton. Furthermore, we find the illustrative relation $\sigma_m + \alpha\, p = K \operatorname{div} \mathbf{u}_s$ that allows us to identify $\alpha$, the so-called Biot-Willis parameter, as the weighting factor in the effective pressure law for bulk volume deformation. Note that $\alpha > \phi_0$ (see comprehensive discussion on bounds of poroelastic parameters by Zimmerman et al. 1986). Skempton coefficient $B$, a further parameter with a simple interpretation, quantifies the ratio between a change in fluid pressure and a change in mean stress for undrained conditions ($\zeta = 0$). According to Eq. (44), the set {$K$, $M$, $\alpha$} constitutes a set of elastic parameters alternative to {$K$, $K^{f}$, $K^{s}$} (typically complemented by the shear modulus of the frame, $G$, and $\phi_0$ to specify the necessary four elastic parameters and the relative fraction of phases).
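Here, we restrict ourselves to presenting the specific relations between $1/M$ and $\alpha$ and the fundamental bulk moduli:
\[ \frac{1}{M} = \frac{\phi_0}{K^{f}} + \frac{\alpha - \phi_0}{K^{s}} = \frac{\phi_0}{K^{f}} + \frac{1 - \phi_0}{K^{s}} - \frac{K}{(K^{s})^{2}} \quad \text{and} \quad \alpha = 1 - \frac{K}{K^{s}}. \tag{45} \]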
A full discussion of the choice of set and of the relations between the various parameters is beyond the scope of the current presentation. The interested reader is referred to Biot and Willis (1957), Rice and Cleary (1976), Stoll (1989), Kümpel (1991), Detournay and Cheng (1993), Wang (2000), Pride (2005), and Verruijt (2010).
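For orientation, the relations in Eq. (45) can be evaluated numerically. The sketch below also derives the Skempton coefficient from Eq. (44) by setting the fluid increment to zero, which gives B = alpha*M / (K + alpha^2 M) under the sign conventions used there; this expression, the function name, and the sandstone-like moduli are stated as assumptions for illustration.

```python
def biot_parameters(K, K_s, K_f, phi0):
    """Biot-Willis parameter, storage modulus, undrained modulus, Skempton B."""
    alpha = 1.0 - K / K_s
    inv_M = phi0 / K_f + (alpha - phi0) / K_s
    M = 1.0 / inv_M
    K_u = K + alpha ** 2 * M      # upper-left entry of the matrix in (44)
    B = alpha * M / K_u           # pressure change per confining-pressure change
    return alpha, M, K_u, B


# Illustrative moduli in Pa: dry frame, grains, and water; porosity 20 %.
alpha, M, K_u, B = biot_parameters(K=8e9, K_s=36e9, K_f=2.2e9, phi0=0.2)
print(alpha, M, K_u, B)
```

For these values the Biot-Willis parameter exceeds the porosity, and the Skempton coefficient lies between zero and one, in line with the bounds quoted above.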
4.4
Governing Set of Equations
To derive the governing equation of poro-elasticity, we insert Darcy’s law (36), the barotropic equation of state for the pressure
\[ (\rho^{fR})'_{s} = \frac{\rho^{fR}}{K^{f}}\, (p)'_{s}, \tag{46} \]
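and the evolution equation of porosity (43) into the linearized balance of mass of the fluid (42) to arrive at
\[ \frac{\phi_0^{2}}{(b + \phi_0)\, K^{f}}\, \partial_t p - \frac{k^{s}}{\eta^{fR}}\, \operatorname{div} \operatorname{grad} p + \left( \phi_0 + \frac{a\, \phi_0}{b + \phi_0} \right) \operatorname{div} \vec{v}_s = 0. \tag{47} \]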
The mathematical structure of this equation is identical to the storage equation in poro-elasticity (e.g., Verruijt 2010, Eq. 4.80), reading in its well-known compact form
\[ \frac{1}{M}\, \partial_t p - \frac{k^{s}}{\eta^{fR}}\, \operatorname{div} \operatorname{grad} p + \alpha\, \operatorname{div} \vec{v}_s = 0. \tag{48} \]
By simple comparison of coefficients in Eqs. (47) and (48), the still unknown material parameters $a$ and $b$ are readily identified as
\[ a = \frac{M\, \phi_0\, (\alpha - \phi_0)}{K^{f}} \quad \text{and} \quad b = \phi_0\, \frac{M\, \phi_0 - K^{f}}{K^{f}}. \tag{49} \]
According to (48), pressure evolution $p(\vec{x}, t)$ is coupled with the rate of volumetric deformation of the solid constituent as expressed by (44) and subject to the partial differential equation resulting from the conservation of momentum (39). Only for special initial-boundary-value problems do these equations decouple. One-dimensional consolidation is one such case for which we obtain an uncoupled pressure-diffusion
equation. This case is of importance in geomechanical applications and can be used as a benchmark solution to verify numerical codes (cf. analytic solution in Verruijt 2010).
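As an illustration of such a benchmark, the classic series solution of the one-dimensional pressure-diffusion problem can be coded in a few lines. The sketch assumes a layer of thickness h with an impermeable base at z = 0, a drained top at z = h, a uniform initial excess pressure p0, and a constant diffusivity c_v that lumps permeability, viscosity, and storage; the function name and parameters are hypothetical and not taken from the cited reference.

```python
import math


def consolidation_pressure(z, t, h, c_v, p0=1.0, n_terms=200):
    """Excess pore pressure p(z, t) for one-dimensional consolidation.

    Solves dp/dt = c_v d2p/dz2 with p(z, 0) = p0, dp/dz = 0 at z = 0,
    and p = 0 at z = h, using the Fourier series solution.
    """
    total = 0.0
    for k in range(n_terms):
        m = 2 * k + 1
        total += ((-1) ** k / m
                  * math.cos(m * math.pi * z / (2.0 * h))
                  * math.exp(-(m * math.pi / (2.0 * h)) ** 2 * c_v * t))
    return 4.0 * p0 / math.pi * total


# Dimensionless example: h = 1 and c_v = 1, so t equals c_v * t / h**2.
p_base_early = consolidation_pressure(0.0, 0.1, 1.0, 1.0)
p_base_late = consolidation_pressure(0.0, 1.0, 1.0, 1.0)
p_top = consolidation_pressure(1.0, 0.1, 1.0, 1.0)
print(p_base_early, p_base_late, p_top)
```

A numerical code can be verified by comparing its nodal pressures against this function at a few depths and times; the series converges slowly only for very small times.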
Subclass: Rigid Grain Assumption
A subclass of Biot’s poro-elasticity model is obtained for a material with incompressible solid constituent, i.e., $\rho^{sR} = \rho^{sR}_0$. This assumption is often called the “rigid grain assumption,” as single grains composing, e.g., a porous sandstone matrix are assumed to be incompressible. For this subclass the evolution of porosity can be directly determined from the balance of mass of the solid, without introducing a porosity evolution equation like (43). The rigid grain assumption reduces the balance of mass of the solid constituent to a relation between the volume fractions:
\[ n^{s}_0 = (\det \mathbf{F}_s)\, n^{s} \quad \text{with} \quad \mathbf{F}_s = \frac{\partial \vec{x}}{\partial \vec{X}_s}. \tag{50} \]
Taking into account the linearized determinant of the deformation gradient (Jacobian)
\[ \operatorname{lin}(\det \mathbf{F}_s) = \operatorname{lin}(J_s) = \operatorname{div} \vec{u}_s + 1, \tag{51} \]
the volume fractions in the undeformed state are linked to the volume fractions in the deformed state by
\[ n^{s}_0 = (\operatorname{div} \vec{u}_s + 1)\, n^{s}. \tag{52} \]
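With the saturation condition for both states, $\phi = 1 - n^{s}$ and $\phi_0 = 1 - n^{s}_0$, we finally get
\[ \partial_t \phi = (1 - \phi)\, \operatorname{div} \vec{v}_s \quad \text{and} \quad \phi = \phi_0 + (1 - \phi)\, \operatorname{div} \vec{u}_s. \tag{53} \]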
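The purely geometric character of the porosity change under the rigid grain assumption can be made concrete with a short sketch. It solves the linearized volume-fraction relation together with the saturation condition for the current porosity and compares the result with the first-order approximation phi0 + (1 - phi0) div u; the function names and the strain value are illustrative assumptions.

```python
def porosity_rigid_grains(phi0, div_u):
    """Porosity from (1 - phi0) = (1 + div u)(1 - phi), solved for phi."""
    return 1.0 - (1.0 - phi0) / (1.0 + div_u)


def porosity_first_order(phi0, div_u):
    """First-order approximation phi0 + (1 - phi0) * div u."""
    return phi0 + (1.0 - phi0) * div_u


phi0, div_u = 0.2, -0.01           # 1 % volumetric compaction of the skeleton
phi = porosity_rigid_grains(phi0, div_u)
phi_lin = porosity_first_order(phi0, div_u)
print(phi, phi_lin)
```

No material parameter enters either expression, and for small volumetric strains the two values differ only at second order.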
Comparing the evolution of porosity given for the hybrid model (53) with the one obtained for the more general compressible poro-elastic case (43), we observe that (53) does not contain any material parameter that has to be determined. Any change in porosity (pore space) is geometrically caused by a volumetric deformation of the skeleton. Equation (53) can also be derived applying the rigid grain assumption, i.e., $\rho^{sR} = \rho^{sR}_0$, that is identical to an incompressibility constraint for the bulk modulus, i.e., $K^{s} \to \infty$, to (43). Calculating the limit cases of the coefficients $a$ and $b$, we obtain
\[ \lim_{K^{s} \to \infty} \alpha = 1, \quad \lim_{K^{s} \to \infty} M = K^{f}/\phi_0, \quad \lim_{K^{s} \to \infty} a = 1 - \phi_0, \quad \text{and} \quad \lim_{K^{s} \to \infty} b = 0, \tag{54} \]
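reducing (43) to (53). Accounting for these limits of the material parameters, the storage equation of the hybrid model is given by
\[ \frac{\phi_0}{K^{f}}\, \partial_t p - \frac{k^{s}}{\eta^{fR}}\, \operatorname{div} \operatorname{grad} p + \operatorname{div} \vec{v}_s = 0 \tag{55} \]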
or
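\[ \left( \frac{1}{K} + \frac{\phi_0}{K^{f}} \right) \partial_t p - \frac{k^{s}}{\eta^{fR}}\, \operatorname{div} \operatorname{grad} p + \frac{1}{K}\, \partial_t \sigma_m = 0. \tag{56} \]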
Subclass: Terzaghi’s Consolidation Equations
Historically (cf. remarks in de Boer 2000), Terzaghi developed a model to describe the coupled hydromechanical behavior of consolidation with a further assumption in addition to the rigid grain assumption. He assumed that the fluid is also incompressible, $\rho^{fR} = \rho^{fR}_0$ or, equivalently, $K^{f} \to \infty$. For this additional assumption, the linearized balance of mass (volume) of the fluid is given as
\[ \partial_t \phi + \phi_0\, \operatorname{div} \vec{w}_f + \phi_0\, \operatorname{div} \vec{v}_s = 0. \tag{57} \]
Inserting the balance of mass/volume of the solid phase (52) and describing the evolution of porosity using $K^{s} \to \infty$ in (57), we obtain
\[ \phi_0\, \operatorname{div} \vec{w}_f + \operatorname{div} \vec{v}_s = 0. \tag{58} \]
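Insertion of Darcy’s law into (58) gives the storage equation of Terzaghi’s case
\[ -\frac{k^{s}}{\eta^{fR}}\, \operatorname{div} \operatorname{grad} p + \operatorname{div} \vec{v}_s = 0 \tag{59} \]
or, expressing $\operatorname{div} \vec{v}_s$ through $\sigma_m + p = K \operatorname{div} \mathbf{u}_s$ (i.e., $\alpha = 1$),
\[ \frac{1}{K}\, \partial_t p - \frac{k^{s}}{\eta^{fR}}\, \operatorname{div} \operatorname{grad} p + \frac{1}{K}\, \partial_t \sigma_m = 0. \tag{60} \]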
These equations can also be derived by modifying the first term in the hybrid model (55) according to Terzaghi’s assumption on fluid behavior. The storativity $1/M = \phi_0/K^{f}$ of the hybrid model, already reduced in comparison to the full Biot theory, further simplifies for $K^{f} \to \infty$ to
\[ \lim_{K^{f} \to \infty} \frac{1}{M} = 0 \tag{61} \]
exactly giving (59). Thus, we observe that all subclasses of Biot’s storage equation can be derived in a straightforward way using techniques of continuum-mixture theory.
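The hierarchy of the three subclasses can be summarized by their storage coefficients. The following sketch returns the coefficient of the pressure rate in the respective storage equation; the model labels, the function name, and the moduli are illustrative assumptions.

```python
import math


def storage_coefficient(model, phi0, K_f, K=None, K_s=None):
    """Coefficient 1/M of the pressure rate for the three subclasses."""
    if model == "biot":
        alpha = 1.0 - K / K_s
        return phi0 / K_f + (alpha - phi0) / K_s
    if model == "hybrid":          # rigid grains, compressible fluid
        return phi0 / K_f
    if model == "terzaghi":        # rigid grains, incompressible fluid
        return 0.0
    raise ValueError("unknown model: " + model)


s_biot = storage_coefficient("biot", 0.2, 2.2e9, K=8e9, K_s=36e9)
s_hybrid = storage_coefficient("hybrid", 0.2, 2.2e9)
s_terzaghi = storage_coefficient("terzaghi", 0.2, 2.2e9)
print(s_biot, s_hybrid, s_terzaghi)

# The limits are recovered by letting the moduli tend to infinity.
print(storage_coefficient("biot", 0.2, 2.2e9, K=8e9, K_s=math.inf))
print(storage_coefficient("hybrid", 0.2, math.inf))
```

Each additional incompressibility assumption removes one contribution to storage, until the pressure equation of Terzaghi’s case carries no storage term of its own.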
In addition to the (scalar) balance of mass of the mixture that leads to the storage equations in the discussed situations, the (vectorial) balance of momentum of the mixture (39) is the governing balance equation describing the flow of a compressible/incompressible fluid through a deformable porous frame/skeleton/material.
4.5
Constitutive Equations for the Stresses: Biot’s Case
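The introduced storage equation for the discussed subclasses is complemented by the balance of momentum of the mixture, i.e., the equilibrium condition for the fluid-filled solid. To derive the governing partial differential equation, we have to introduce constitutive equations for the stresses. The partial stresses for the solid and the fluid phase (see discussion following Eq. 35) are given for Biot’s poro-elastic case including grain deformations, i.e.,
\[ \mathbf{T} = \mathbf{T}^{s} + \mathbf{T}^{f} \quad \text{with} \quad \mathbf{T}^{s} = \mathbf{T}^{s}_{E} - (1 - \phi)\, p\, \mathbf{I} \quad \text{and} \quad \mathbf{T}^{f} = -\phi\, p\, \mathbf{I}. \tag{62} \]
In Eq. (62), we introduced the so-called Cauchy extra stress tensor $\mathbf{T}^{s}_{E}(\vec{x}, t)$ of the solid constituent related to the macroscopic strain field $\boldsymbol{\varepsilon}_s$, i.e.,
\[ \mathbf{T}^{s}_{E} = \underbrace{3 K\, \operatorname{vol}(\boldsymbol{\varepsilon}_s) + 2 G\, \operatorname{dev}(\boldsymbol{\varepsilon}_s)}_{\text{skeleton deformation}} + \underbrace{(1 - \alpha)\, p\, \mathbf{I}}_{\text{grain deformation}}. \tag{63} \]
Combining equations (62) and (63), we obtain the effective stress principle
\[ \mathbf{T} = \mathbf{T}^{s}_{E} - \alpha\, p\, \mathbf{I}, \tag{64} \]
in which $\mathbf{T}^{s}_{E}$ is understood as the skeleton-deformation part of (63), $3 K\, \operatorname{vol}(\boldsymbol{\varepsilon}_s) + 2 G\, \operatorname{dev}(\boldsymbol{\varepsilon}_s)$. Note again that the Biot-Willis parameter $\alpha = 1 - K/K^{s}$ accounts for the compressibility of the grains.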
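The effective stress principle can be evaluated for a given strain state in a few lines. The sketch assumes a symmetric small-strain tensor, tension-positive stresses, vol(eps) = (tr eps / 3) I, and dev(eps) = eps - vol(eps), and it computes the total stress as the skeleton-deformation stress minus alpha times the pore pressure; the function name and the numbers are illustrative.

```python
def total_stress(strain, p, K, G, alpha):
    """Total stress T = 3K vol(eps) + 2G dev(eps) - alpha p I (3x3 lists)."""
    trace = sum(strain[i][i] for i in range(3))
    T = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            vol = trace / 3.0 * delta          # volumetric part of the strain
            dev = strain[i][j] - vol           # deviatoric part of the strain
            T[i][j] = 3.0 * K * vol + 2.0 * G * dev - alpha * p * delta
    return T


# Uniaxial shortening of 0.03 % combined with 1 MPa pore pressure.
eps = [[0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0],
       [0.0, 0.0, -3e-4]]
T = total_stress(eps, p=1e6, K=8e9, G=6e9, alpha=0.78)
mean_stress = (T[0][0] + T[1][1] + T[2][2]) / 3.0
print(T, mean_stress)
```

Taking the trace reproduces the relation between mean stress, pore pressure, and volumetric strain of the skeleton stated in Sect. 4.3.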
4.6
Overview of the Governing Equations
Summarizing, we obtain a set of linear partial differential equations comprising one set of equations for the displacement field $\vec{u}_s$ given by the local form of the balance of momentum of the mixture. These equations are supplemented by the storage equations discussed above. The solution variables are the pore pressure and the displacement of the solid skeleton, i.e., $\{\vec{u}_s, p\}$ (Table 1). The equations can be approached using Dirichlet boundary conditions for solid displacement and pore pressure as well as Neumann boundary conditions for the fluxes, i.e., the total stress vector and the seepage velocity. The presented displacement-pressure $\{\vec{u}_s, p\}$ formulation of Biot’s theory is explained in detail in Zienkiewicz and Shiomi (1984, Sect. 7).
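Table 1 Governing set of equations for a biphasic model consisting of a solid skeleton and a pore fluid for three subclasses regarding the deformability of the solid and the compressibility of the fluid. Note that $\alpha = 1 - K/K^s$ ($K$, bulk modulus of the dry frame, and $K^s$, bulk modulus of the grains). Furthermore, $1/M = \phi/K^f + (\alpha - \phi)/K^s$. For $K^s \to \infty$: $\alpha = 1$ and $1/M = \phi/K^f$

Equations in the domain, i.e., $\forall\,\vec{x} \in \mathcal{B}$

| | Biot's model | Hybrid model (rigid grain assumption) | Incompressible model (rigid grains and incompr. fluid) |
| Balance of momentum | $-\mathrm{div}\,(\mathbf{T}^s_E - \alpha\,p\,\mathbf{I}) = \rho\,\vec{b}$ | $-\mathrm{div}\,(\mathbf{T}^s_E - p\,\mathbf{I}) = \rho\,\vec{b}$ | $-\mathrm{div}\,(\mathbf{T}^s_E - p\,\mathbf{I}) = \rho\,\vec{b}$ |
| Storage equation | $\dfrac{\dot{p}}{M} - \dfrac{k^f}{\gamma^{fR}}\,\mathrm{div}\,\mathrm{grad}\,p + \alpha\,\mathrm{div}\,\vec{v}_s = 0$ | $\dfrac{\dot{p}}{M} - \dfrac{k^f}{\gamma^{fR}}\,\mathrm{div}\,\mathrm{grad}\,p + \mathrm{div}\,\vec{v}_s = 0$ | $-\dfrac{k^f}{\gamma^{fR}}\,\mathrm{div}\,\mathrm{grad}\,p + \mathrm{div}\,\vec{v}_s = 0$ |

Boundary conditions, i.e., $\forall\,\vec{x} \in \partial\mathcal{B}$ (all three models)

$\vec{u}_s = \bar{\vec{u}}_s$ on $\Gamma^s_D$; $p = \bar{p}$ on $\Gamma^f_D$; $\mathbf{T}\,\vec{n} = \bar{\vec{t}}$ on $\Gamma^s_N$; $\vec{w}_f \cdot \vec{n} = \bar{w}_f$ on $\Gamma^f_N$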
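The parameter definitions in the caption of Table 1 can be checked numerically. The following is a minimal sketch, not part of the original chapter: the function name `biot_parameters` and all input values (sandstone-like moduli and porosity) are assumptions chosen for illustration only.

```python
def biot_parameters(K, K_s, K_f, phi):
    """Return (alpha, M) using alpha = 1 - K/K_s and 1/M = phi/K_f + (alpha - phi)/K_s."""
    alpha = 1.0 - K / K_s
    inv_M = phi / K_f + (alpha - phi) / K_s
    return alpha, 1.0 / inv_M


# Hypothetical input values in Pa (dry frame, grains, fluid) and porosity.
K, K_s, K_f, phi = 10e9, 40e9, 2.2e9, 0.2

alpha, M = biot_parameters(K, K_s, K_f, phi)
print(f"Biot's model:  alpha = {alpha:.3f}, M = {M / 1e9:.2f} GPa")

# Rigid-grain limit K_s -> infinity (hybrid model): alpha -> 1 and 1/M -> phi/K_f.
alpha_rigid, M_rigid = biot_parameters(K, float("inf"), K_f, phi)
print(f"Rigid grains:  alpha = {alpha_rigid:.3f}, M = {M_rigid / 1e9:.2f} GPa")
```

In this sketch the rigid-grain limit is obtained simply by passing an infinite grain modulus, which reproduces the limiting values stated in the caption.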
5
Transient Flow Processes: Storage
Stationary flow processes are controlled by transport properties, i.e., permeability or transmissivity (see Table 2 for relations among the conventionally used hydraulic parameters), that combine geometrical characteristics of the hydraulic conduits and the fluid’s inherent resistance to flow expressed by viscosity. Transient flows give rise to a whole set of new phenomena associated with storage properties. The compressibility of the fluid is an intrinsic contributor to storage properties since any change in pressure in a specified control volume of a flow requires gain or loss of fluid mass, i.e., a change in the fluid mass stored in that control volume. In addition, the response of the conduit to fluid-pressure changes, its deformability, contributes to the storage capacity. This sensitivity of conduit shape to the internal pressure distribution is an expression of hydromechanical coupling and is notably nonlocal in the sense that the deformation is not restricted to the site of force application, as can be readily rationalized when drawing the parallel to splitting wood. The deformation of the wood due to the intruding axe blade (fluid) extends to well ahead of the tip of the blade. The tip of the progressing fracture reaches the base of the wood chunk splitting it apart without the blade ever being in contact with the fracture tip. Reversing the roles of fluid pressure and solid deformation as cause and effect, any deformation of conduits due to external forces will clearly also affect the fluid pressure (Skempton effect, e.g., Detournay and Cheng 1993). In geoscientific applications, it is often simply assumed that the hydromechanical coupling can be neglected to a certain degree, citing several orders of magnitude difference in the compressibilities of minerals and relevant pore fluids to support this notion. Yet, in particular in porous rocks serving as reservoir for liquid and gaseous
Modeling of Fluid Transport in Geothermal Research
Table 2 Typical pairs of hydraulic parameters. In hydrogeology, it is common practice to relate changes in stored fluid volume to changes in head [m] rather than to changes in pressure. In the relations below, $h$ denotes the height of the conduit. For the slit between parallel plates, $h = w$

For a material point

| Transport property | Storage property |
| (Intrinsic) permeability $k^s$ [m$^2$] | Specific storage capacity $s$ [Pa$^{-1}$] |
| Hydraulic conductivity (Darcy's permeability) $k^f = \rho^{fR} g\,k^s/\eta^{fR}$ [m s$^{-1}$] | |

For a structure or an individual conduit

| Transport property | Storage property |
| Transmissivity $T = \rho^{fR} g\,h\,k^s/\eta^{fR}$ [m$^2$ s$^{-1}$] | Storage factor $S = \rho^{fR} g\,h\,s$ [-] |
| Transmissibility $\hat{T} = h\,k^s$ [m$^3$] | Storativity $\hat{S} = h\,s$ [m Pa$^{-1}$] |
resources, the deformation associated with pore-fluid-pressure perturbations is controlled by their effective elastic moduli (drained or skeleton modulus $K$) that may not differ significantly from that of rather incompressible fluids encountered in reservoirs for liquid resources. Rather than constituting an academic problem, hydromechanical coupling is evidenced by a range of phenomena at the earth's surface or at shallow depth. Enhanced activity of warm water springs along fault traces has been observed after earthquakes (e.g., Sibson et al. 1975). Continuous monitoring of wells revealed a systematic correlation of level changes with seismic activity (Brodsky et al. 2003; Roeloffs 1998) or with barometric loading of the fluid column (Rojstaczer and Agnew 1989). In turn, seismicity is believed to be triggered by pore-fluid-pressure changes related to natural precipitation (e.g., Hainzl et al. 2006) but also to anthropogenic surface activities, such as charging surface reservoirs (e.g., Chen and Doolen 1998; Simpson et al. 1988) or injection of fluid into the subsurface (e.g., Majer et al. 2007; Zoback and Harjes 1997). Though the full account of the nonlocal character of deformability requires more sophisticated approaches, specific storage capacity has been treated as a single-valued material parameter simply comprising the fluid's and the solid's local compressibilities. The classic Biot approach is devoid of any length scale on RVE level, and thus fundamental questions remain regarding correct upscaling of processes on the pore scale to the representative volume element of the macroscopic theory. For example, flow waves are reflected at pore bifurcations introducing a second length scale, mean pore length, into the problem in addition to pore diameter.
Likewise, the nonlocal effect of pore pressure variations on the deformation of pores possibly requires a further length scale, the extent of the influence zone of pore pressure perturbations. It also calls for simultaneously considering the propagation of displacement pulses in solids, which travel at a speed close to that of elastic body waves, and the slow diffusion of pore pressure.
In the following, we first look into some aspects of pressure diffusion with and without hydromechanical coupling relevant for hydraulic well testing. Homogeneous reservoirs are addressed as well as those containing prominent planar conduits. We conclude our selection of examples for fluid flow problems in geothermal research by presenting work on mesoscopic loss phenomena in complex conduit networks.
5.1
Homogeneous Poro-elastic media: Low-Frequency Pumping Tests
Pumping tests in wells and deep boreholes penetrating fluid reservoirs have been the primary means for determining hydraulic properties of the subsurface for at least 50 years. Analyses of tests have mostly relied on the linear diffusion equation, a special case of the storage equations derived above (e.g., Butler 1997; Horne 1996; Matthews and Russell 1967). In the majority of approaches, hydromechanical coupling is not fully tackled but only accounted for by the specific choice of storage parameters.
Basic Equations III: Pressure Diffusion
Two aspects of the derivation and the use of the linearized field equation (48) demand further consideration, (a) treatments of the coupling term and (b) pressure dependence of the involved material properties. Obviously, these two aspects are intimately related. Pressure dependence of storage capacity and permeability both result from deformation of the solid skeleton associated with the change in fluid pressure.

Relation Between the Coupling Term and Storage Parameters
A rigorous treatment of the coupling is quite challenging, and previous work focused mainly on outlining examples for which the coupling can be neglected (e.g., Detournay and Cheng 1993). Obviously, in cases where the bulk volume remains constant, i.e., $\mathrm{div}\,\vec{v}_s = 0$, the two equations (39) and (48) decouple, and fluid pressure obeys a linear diffusion equation

$$\partial_t p = D|_{\mathrm{div}\,\vec{u}_s}\,\mathrm{div}\,\mathrm{grad}\,p, \tag{65}$$

with a hydraulic diffusivity $D|_{\mathrm{div}\,\vec{u}_s} := k^s M/\eta^{fR}$. The inverse of the storage modulus $M$ is often addressed as specific storage capacity in the context of pressure diffusion. In our notation, we emphasize the corresponding boundary conditions by a subscript for this parameter, i.e., here $s_{\mathrm{div}\,\vec{u}_s} := 1/M$. Constant bulk strain may be approximated during pumping experiments in confined aquifers (see, e.g., discussion by Wang 2000). Laboratory tests, however, are typically performed under hydrostatic conditions at constant pressure $p_c$. For these conditions, called unconstrained in hydrogeology practice, we have to relate
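the two kinematic variables $\mathrm{div}\,\vec{u}_s$ and $\mathrm{div}\,\vec{u}_f$ representing the volumetric strains of the bulk aggregate and the fluid, respectively, to the two dynamic variables mean stress $\sigma_m = \tfrac{1}{3}\,\mathrm{tr}\,\mathbf{T}$ and fluid pressure $p$. Accounting for $\zeta = \phi_0\,\mathrm{div}(\vec{u}_s - \vec{u}_f)$ allows us to transform (44) to compressibility laws for our preferred kinematic variables (compare Detournay and Cheng 1993, Eqs. 15a, b):

$$\begin{pmatrix} \sigma_m \\ p \end{pmatrix} = \begin{pmatrix} K + \alpha^2 M - \alpha\,\phi_0 M & \alpha\,\phi_0 M \\ -M(\alpha - \phi_0) & -\phi_0 M \end{pmatrix} \begin{pmatrix} \mathrm{div}\,\vec{u}_s \\ \mathrm{div}\,\vec{u}_f \end{pmatrix} \tag{66}$$

and

$$\begin{pmatrix} \mathrm{div}\,\vec{u}_s \\ \mathrm{div}\,\vec{u}_f \end{pmatrix} = \begin{pmatrix} \dfrac{1}{K} & \dfrac{\alpha}{K} \\[2ex] \dfrac{1}{K}\left(1 - \dfrac{\alpha}{\phi_0}\right) & \dfrac{\alpha}{K} - \dfrac{\alpha^2}{\phi_0 K} - \dfrac{1}{\phi_0 M} \end{pmatrix} \begin{pmatrix} \sigma_m \\ p \end{pmatrix}. \tag{67}$$

Equation (67) gives us a relation between bulk volume-strain rate and temporal changes in the dynamic variables, $\mathrm{div}\,\vec{v}_s = (\partial_t \sigma_m + \alpha\,\partial_t p)/K$, which allows us to manipulate (48) to

$$\left(\frac{\alpha^2}{K} + \frac{1}{M}\right)\partial_t p - \frac{k^s}{\eta^{fR}}\,\mathrm{div}\,\mathrm{grad}\,p + \frac{\alpha}{K}\,\partial_t \sigma_m = 0. \tag{68}$$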
For conditions of constant mean stress, Eq. (68) again decouples from the deformation problem and degenerates to a linear diffusion equation. The associated hydraulic diffusivity differs from $D|_{\mathrm{div}\,\vec{u}_s}$ of (65) owing to the different specific storage capacity, $s_{\sigma_m} = 1/M + \alpha^2/K > s_{\mathrm{div}\,\vec{u}_s}$. The relation among the two measures of the relative change in fluid volume upon a change in fluid pressure in a porous medium at the specified conditions appeals to intuition since the less the skeleton is restricted to deform, the larger the possible storage. For hydrostatic conditions, the formulation $s_{\sigma_m} = \phi_0\,(\beta_f + \beta_{pp})$ is illustrative, where $\phi_0\,\beta_{pp} = 1/K - (1 + \phi_0)/K^s$ with the compressibility of pore volume with respect to pore-fluid-pressure variations $\beta_{pp} = \partial_p \ln V_p|_{p_c}$ of Zimmerman et al. (1986). This formulation readily illustrates the balance of the various contributors to fluid storage, fluid compressibility ($K^f$), skeleton compressibility ($K$), and grain compressibility ($K^s$). Laboratory experiments on rock samples documented cases with ratios of $s_{\sigma_m}/\beta_f \approx 1 \ldots 1{,}000$ (e.g., Fischer and Paterson 1992; Song and Renner 2007, 2008) demonstrating that the approximation $s \approx \beta_f$, probably well justified in many hydrogeological applications, fails when rocks with large pore compliance and/or heterogeneous structures are involved. In dynamic variables, the porosity evolution (43) assumes the simple form

$$\partial_t \phi = \frac{\alpha - \phi_0}{K}\,(\partial_t \sigma_m + \partial_t p). \tag{69}$$
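The two specific storage capacities and the identity $1/M + \alpha^2/K = \phi_0(\beta_f + \beta_{pp})$ can be verified with a few lines of arithmetic. The sketch below is an illustration only: the function name `storage_capacities`, the moduli, the permeability, and the viscosity are hypothetical values, and the diffusivity under constant mean stress is taken as $k^s/(\eta^{fR} s_{\sigma_m})$, as implied by Eq. (68) for $\partial_t \sigma_m = 0$.

```python
def storage_capacities(K, K_s, K_f, phi):
    """Specific storage capacity at constant bulk strain and at constant mean stress."""
    alpha = 1.0 - K / K_s
    inv_M = phi / K_f + (alpha - phi) / K_s
    s_strain = inv_M                       # s at constant bulk volumetric strain, 1/M
    s_stress = inv_M + alpha**2 / K        # s at constant mean stress, 1/M + alpha^2/K
    beta_f = 1.0 / K_f                     # fluid compressibility
    beta_pp = (1.0 / K - (1.0 + phi) / K_s) / phi  # pore-volume compressibility
    s_check = phi * (beta_f + beta_pp)     # alternative form of s at constant mean stress
    return s_strain, s_stress, s_check


# Hypothetical values: moduli in Pa, porosity, permeability in m^2, viscosity in Pa s.
K, K_s, K_f, phi = 10e9, 40e9, 2.2e9, 0.2
k_s, eta = 1e-15, 1e-3

s_strain, s_stress, s_check = storage_capacities(K, K_s, K_f, phi)
D_strain = k_s / (eta * s_strain)  # diffusivity of Eq. (65), equal to k_s * M / eta
D_stress = k_s / (eta * s_stress)  # diffusivity for constant mean stress

print(f"s (constant strain) = {s_strain:.3e} 1/Pa, D = {D_strain:.3e} m^2/s")
print(f"s (constant stress) = {s_stress:.3e} 1/Pa, D = {D_stress:.3e} m^2/s")
print(f"phi*(beta_f + beta_pp) = {s_check:.3e} 1/Pa")
```

For any admissible input the constant-stress storage capacity exceeds the constant-strain value, so the corresponding diffusivity is smaller.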
Notably, the weighting factor for the effective pressure is identical to 1 in contrast to the effective pressure relation for bulk volumetric strain for which it is $\alpha < 1$ (see Eq. 67).

Approximating Fractures as Thin Layers of Porous Media
At least basic features of flow in planar conduits can also be investigated employing the linear diffusion equation and neglecting the details of the velocity profile normal to the conduit's extension. Flow boundary conditions are readily and probably sufficiently approximated by using the average velocity of the quadratic Poiseuille flow relation (27). Thus, the treatment of the coupling term for a homogeneous medium at conditions of constant mean stress is often also transferred to problems involving planar conduits, i.e., fractures. The characterization of storage then comprises the sensitivity of the fluid volume in the fracture to changes in fluid pressure, $s = \mathrm{d}\ln V_f/\mathrm{d}p$. For planar conduits, the normal stiffness dominates the changes in their geometry with a change in pressure and leads to a storativity component $S_{\mathrm{coup}} = \mathrm{d}\delta_m/\mathrm{d}p$ where $\delta_m$ denotes the mechanical aperture of the conduit (e.g., Ortiz et al. 2011). As for the homogeneous porous medium, the total storativity (or the total specific storage capacity, Table 2) then has two contributions, one from the compressibility of the fluid and one from the volume change of the conduit, $S_F = \delta_m \beta_f + \mathrm{d}\delta_m/\mathrm{d}p$ ($s_F = \beta_f + \mathrm{d}\ln\delta_m/\mathrm{d}p$). Inherent to the above approach of formulating storage properties of fractures is the assumption that the fluid pressure is constant in the conduit, and therefore a unique relation exists between its aperture and the current pressure. Since the pressure evolution in the conduit is the very aim of the analysis, this assumption obviously constitutes a significant simplification.
For an order of magnitude consideration, the analytic solution for an ellipsoidal fracture (radius $R$, aperture $\delta$, aspect ratio $2R/\delta$) with constant fluid pressure may be employed that gives $\mathrm{d}\delta_m/\mathrm{d}p = R/\tilde{E}$ ($\tilde{E} := E/[4(1 - \nu^2)]$ with Young's modulus $E$ and Poisson's ratio $\nu$) and allows us to formulate a "constant-pressure reference" storativity $S_{\mathrm{ref}} \approx \delta/K_f + R/\tilde{E}$. For rather incompressible fluids like water and a fracture with an aspect ratio of $10^4$, the storativity contribution by the approximated coupling typically exceeds the one by the compressibility of the fluid by three orders of magnitude, while for gas the two will contribute roughly equally. The constant-mean-stress approximation of the coupling term transforms the governing equation into a linear but inhomogeneous diffusion equation that explicitly reads

$$\partial_t p = D_F\,\mathrm{div}\,\mathrm{grad}\,p + \frac{q_F(x, t)}{h\,S_F}, \tag{70}$$

for a single vertical fracture of half length $x_F$ and height $h$ where $D_F = T_F/(S_F\,\eta^{fR})$ denotes the hydraulic diffusivity of the fracture, $T_F$ the fracture conductivity, and $S_F$ the total fracture storativity (for relations among the various hydraulic properties used in practice, see Table 2). For a fracture embedded in a porous matrix, $q_F(x, t)$ represents the exchange of fluid between fracture and matrix. The latter
couples the diffusion equation for the fracture with the diffusion equation for the matrix (48). Interestingly, the governing equation for the fracture degenerates to an inhomogeneous stationary Laplace equation when the storativity of the fracture can be neglected ($S_F \to 0$), i.e., the flow in the fracture is close to stationary. The time dependence of the pressure in the fracture is then solely controlled by the coupling with the matrix flow. In a similar way, the time-dependent deformation of the solid controls the pressure transients in Terzaghi's consolidation problem (see Table 1).

Accounting for Pressure-Dependent Parameters
During the derivation of the linear field equation (48), we disregarded a pressure sensitivity of permeability though laboratory (and field) tests have long documented the significant sensitivity of permeability to effective pressure (e.g., Bernabé 1987; Berryman 1992; Gangi 1978). Here, we introduce (compare Yilmaz et al. 1994)

$$\hat{\beta}_k = \partial_p \ln k^s \quad\text{and}\quad \hat{\beta}_\phi = \partial_p \ln \phi \tag{71}$$

to express relative changes in permeability and porosity due to changes in fluid pressure. The hat indicates that the conditions for which the partial derivatives are evaluated remain unspecified (e.g., for constant mean stress or constant bulk volumetric strain). The permeability compliance $\hat{\beta}_k$ is related to permeability modulus $K_k := 1/\hat{\beta}_k$ that assumes values as small as tens of MPa (e.g., Brace et al. 1968; Renner et al. 2000; Song and Renner 2008). Alternatively, the pressure dependence of permeability can be related to its dependence on porosity and thus to pore compressibility $\hat{\beta}_\phi$ (corresponding to $\beta_\phi = (\alpha - \phi_0)/(\phi_0 K)$ in (69) obeying $\beta_\phi = \beta_{pp} - \alpha/K < \beta_{pp}$), i.e., $\partial_p k^s = \hat{\beta}_\phi\,\phi\,\partial_\phi k^s$. Typically power-law relations are found, i.e., $\partial_{\ln\phi} \ln k^s = n$ with exponents ranging from $n \approx 1$ to 10 (see review by Bernabé et al. 2003). In addition, nonconstant pore compressibility leads to pressure-dependent storage parameters. From a micromechanical perspective, nonlinearity stems from (Hertzian) contact mechanics and may thus be encountered in granular and fractured media as well as individual planar conduits with rough surfaces. Pressure-dependent hydraulic properties were previously employed in diffusion equations (e.g., Ortiz et al. 2011; Shapiro and Dinske 2009; Yeung et al. 1993) without rigorously analyzing whether the linearized equation may indeed be extended in this way or whether further nonlinear terms then have to be accounted for. To tackle these questions, we start from the general form of mass conservation for the fluid

$$\rho^{fR}\,\partial_t \phi + \phi\,\partial_t \rho^{fR} + \mathrm{div}\,(\phi\,\rho^{fR}\,\vec{v}_f) = 0 \tag{72}$$

and use the porosity evolution (43), the constitutive relation for a barotropic fluid (46), and Darcy's law (37) to arrive at
Kf ks 1 1 ˇOk .gradp/2 C 0Df(43)g.p/C fR ˇO 1
M
M M
ˇO 1C ˇf
! vEs gradp (73)
where {(43)}.p/ represents the linearized storage equation (43) that now may be used with pressure-dependent parameters. The further summands correspond to a quadratic and a convective term. Notably, nonlinear extensions of Darcy’s law are mandated when dealing with strongly compressible fluids or inertia forces reach relevant levels (as outlined above), but we neglect such nonlinearities in the current treatment. The pressure dependence of viscosity of liquids is typically negligible in applications, while that of gases might become relevant. Here, a pressure dependence of fluid viscosity is neglected but could easily be included in the analysis, too. The two compressibilities (ˇO and ˇOk ) determining the weighting of the quadratic term may typically exceed mineral compressibilities by 3–4 orders of magnitude. Whether one of the two dominates the nonlinear behavior has to be analyzed for a given problem. From a practical perspective, it seems most pragmatic to determine the importance of the quadratic and the convective term simply by comparing numerical results gained with and without respecting them in the calculations. In particular, nonlinear effects due to pressure-dependent properties will be most significant close to a fluid source where pressure gradients intrinsically reach their maximum values.
Pressure Transient Analysis Based on the Linear Diffusion Equation
Employing linear diffusion equations rests on a number of substantial assumptions: (1) homogeneity of considered materials, (2) material parameters do not depend on fluid pressure (or stress), (3) pressure variations are tracked only to linear order, and (4) decoupling is warranted or coupling is sufficiently accounted for by the used storage parameter. Nevertheless, analyzing solutions of pressure-diffusion equations is extremely instructive when potential limitations are kept in mind.

Response of an Infinite Homogeneous Isotropic Medium to a Pulse Test
We present one particular solution of the linear homogeneous diffusion equation (65) for fluid pressure in a homogeneous medium, since it provides ready insight into the specifics of pressure diffusion, as, for example, associated with hydraulic stimulation. Not belittling the sophistication of advanced pumping protocols, to zero order any pumping operation can be considered a pulse, i.e., a perturbation of fluid pressure in a well for a finite duration. A treatment of pulse testing was probably first presented by Johnson et al. (1966). We consider an infinite line source with a constant flow rate aligned with the z-axis of a cylindrical coordinate system and a homogeneous and isotropic infinite reservoir. The diffusion equation for radial symmetry reads

$$\frac{1}{D}\,\partial_t p = \partial_{rr} p + \frac{1}{r}\,\partial_r p. \tag{74}$$
In the classic solution approach, this partial differential equation is transformed into an ordinary differential equation using $\varsigma = r^2/4Dt$ (Boltzmann transformation) that is then solved by integrating twice. Solutions are self-similar in the transformation variable giving rise to the well-known hydraulic scaling relation between the characteristic distances $\tilde{l}$ and the characteristic duration $\tilde{t}$ of diffusion processes, $D \sim \tilde{l}^2/\tilde{t}$. Accounting for initial and boundary conditions as appropriate for many practical applications, i.e.,
• homogeneous pressure everywhere in the reservoir before pumping commences,
• the pressure at infinite distance from the well remains at the initial level, and
• the flow rate of the source and the pressure gradient at the wall of the well are related by Darcy's law,
gives a change in pressure relative to the reference pressure of

$$\Delta p(r, t) = \frac{Q_f\,\eta_f}{4\pi k h}\,\mathrm{Ei}(\tau/t) \tag{75}$$

where the exponential integral function

$$\mathrm{Ei}(x) = \int_x^\infty \frac{e^{-a}}{a}\,\mathrm{d}a \tag{76}$$

and $\tau = r^2/4D$. The series expansion gives $\mathrm{Ei}(x) \simeq -\gamma - \ln x$ for $x \ll 1$ with the Euler-Mascheroni constant $\gamma \simeq 0.57721$. After an injection that started at time $t_0$ and ended at $t_1$, the pressure change obeys

$$\Delta p(r, t) = \frac{Q_f\,\eta_f}{4\pi k h}\left[\mathrm{Ei}\!\left(\frac{\tau}{t - t_0}\right) - \mathrm{Ei}\!\left(\frac{\tau}{t - t_1}\right)\right], \tag{77}$$

exhibiting a maximum at time $t_{\max}$ (Fig. 4) constrained by

$$\tau(t_{\max}) = \frac{(t_{\max} - t_0)(t_{\max} - t_1)}{t_1 - t_0}\,\ln\frac{t_{\max} - t_0}{t_{\max} - t_1}. \tag{78}$$
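The pulse response of Eqs. (77) and (78) is easy to explore numerically. The sketch below is a minimal, standard-library-only illustration and not the authors' implementation: the function names are hypothetical, the pressure is returned in units of the prefactor $Q_f \eta_f/(4\pi k h)$, and the exponential integral of Eq. (76) is evaluated with a power series for small arguments and a continued fraction for large ones (a common textbook scheme, assumed here for convenience).

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp_integral(x):
    """Ei(x) of Eq. (76): integral of exp(-a)/a from x to infinity, for x > 0."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    if x < 1.0:
        # Power series: -gamma - ln x + sum_{k>=1} (-1)^(k+1) x^k / (k * k!)
        total, term = 0.0, 1.0
        for k in range(1, 60):
            term *= -x / k        # term equals (-x)^k / k!
            total -= term / k
        return -EULER_GAMMA - math.log(x) + total
    # Continued fraction evaluated with the modified Lentz algorithm
    b = x + 1.0
    c = 1.0e300
    d = 1.0 / b
    h = d
    for i in range(1, 200):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return h * math.exp(-x)


def pulse_pressure(t, tau, t0, t1):
    """Bracketed term of Eq. (77); before shut-in only the first term contributes."""
    if t <= t0:
        return 0.0
    dp = exp_integral(tau / (t - t0))
    if t > t1:
        dp -= exp_integral(tau / (t - t1))
    return dp


def time_of_maximum(tau, t0, t1):
    """Solve Eq. (78) for t_max > t1 by bisection (the residual increases with t)."""
    def residual(t):
        return (t - t0) * (t - t1) / (t1 - t0) * math.log((t - t0) / (t - t1)) - tau

    lo = t1 + 1e-12 * (t1 - t0)
    hi = t1 + (t1 - t0)
    while residual(hi) < 0.0:
        hi = t1 + 2.0 * (hi - t1)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: pumping from t0 = 0 to t1 = 1 (arbitrary time unit), tau = 1.
t_max = time_of_maximum(1.0, 0.0, 1.0)
print(f"t_max = {t_max:.4f}, normalized pressure = {pulse_pressure(t_max, 1.0, 0.0, 1.0):.4f}")
```

Evaluating `pulse_pressure` slightly before and after the returned `t_max` gives a quick check that Eq. (78) indeed locates the maximum of Eq. (77) at the monitoring distance encoded in `tau`.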
Pulse or interference testing uses a monitoring well in which $\Delta p_{\max}$ and $t_{\max}$ are determined allowing one to readily estimate permeability and specific storage capacity (or transmissivity and storativity) of the penetrated reservoir. The maximum in monitoring pressure occurs with a significant delay relative to the termination of the pumping at a specific distance (Fig. 4a, b). For example, an increase corresponding to about 5 % of the maximum pressure in the pumping well occurs at a distance of more than three times the characteristic penetration depth after a duration of ten times the pumping time. For any given time, pressure monotonically decays with increasing distance from the source (Fig. 4c). The
[Figure 4, panels a to d: normalized pressure difference plotted against normalized time and against normalized distance on linear and logarithmic axes; legends list normalized times 1.1, 3.5, and 10.0.]
Fig. 4 Examples of the solution of the linear diffusion equation for an infinite line source (Eqs. 75 and 77). Normalized time is given as multiples of the pulse length, $\Delta t$, i.e., the duration of the pumping at constant flow rate. Pressure is normalized to the maximum pressure observed in the pumping well. Normalized distance is given in multiples of the characteristic penetration depth $r_p = \sqrt{D\,\Delta t}$ of a perturbation of characteristic time $\Delta t$ in a homogeneous underground with hydraulic diffusivity $D$. At a specific distance, pressure exhibits a maximum with significant delay relative to the termination of the pumping (a, b), while pressure is a monotonically decreasing function of distance at a given time (c). The maximum in pressure decays in magnitude with increasing distance (d)
maximum in pressure stays above fractions of $10^{-4}$ of the pressure amplitude in the pumping well for normalized distances of more than 20 (Fig. 4d). These fractions correspond to tens of kPa for typical injection pressures on the order of tens of MPa and are thus within the resolution limit of standard pressure gauges. The temporal evolution of source locations of microseismic events observed during pumping operations correlates well with the predicted outward movement of the pressure maximum (Shapiro and Müller 1999). Indeed, the seismic activity associated with pumping operations has led to the notion that earth's upper crust is naturally close
to failure conditions (Townend and Zoback 2000). The characteristics of the decay of the imposed pulse are the foundation for the practitioner’s wisdom that it takes about ten times the pumping duration before the reservoir can be considered back at equilibrium.
Flow in Fractures Penetrating Permeable Reservoirs
Transient fluid flow in fractures or faults embedded in a permeable environment and intersecting a well plays an important role for the production of geothermal energy, especially from artificial fracture systems, so-called hot dry rock (HDR) or enhanced geothermal systems (EGS). Fluid exchange between a fracture or a dominating flow path and its hydraulically conductive environment constitutes a rather complex problem for mathematical treatment. A first analytic solution was presented in the context of well testing by Cinco-Ley et al. (1978) who simplified the flow field as a superposition of two fields of unidirectional flow, one in a fracture aligned with the line source, the well, and one in the rock matrix, the latter perpendicular to the fracture plane. The basic understanding of the characteristics of the solutions of the coupled set of diffusion equations (65) and (70) for specific imposed boundary conditions (traditionally constant injection or flow rate for a specific time) permits the practitioner to readily determine whether flow occurs in a homogeneous formation (radial flow); in a single, vertical fracture (fracture linear flow); or simultaneously in both (bilinear flow, formation linear flow) by analyzing the relations between the well pressure ($p_w$) and/or its derivative ($\mathrm{d}p_w/\mathrm{d}\ln\Delta t$) and the time elapsed since pumping commenced ($\Delta t$) in diagnostic log-log plots (see, e.g., Bourdarot 1998; Bourdet 2002; Bourdet et al. 1989; Da Prat 1990; Horne 1996; Houzé et al. 2007; Poe and Economides 2000). Fracture linear flow ($p_w \propto \Delta t^{1/2}$) inevitably occurs in fractures in an impermeable matrix but is also often briefly observed for a fracture in a permeable rock early on after commencing production or injection.
During bilinear flow ($p_w \propto \Delta t^{1/4}$), characteristic for fractures and matrices with finite conductivity, close to linear flow occurs along the fracture and concomitantly in the matrix (e.g., Cinco-Ley and Samaniego 1981; Ortiz et al. 2013). In contrast, formation linear flow ($p_w \propto \Delta t^{1/2}$) is found for highly permeable fractures along which pressure gradients are negligible, so that significant pressure gradients arise only in the matrix. The fracture surface essentially serves as an extension of the rock's surface to which the boundary condition imposed in the well is applied. The above regimes are all transient for fractures of finite length, and pseudo-radial flow ($p_w \propto \ln\Delta t$) is inevitably reached after long pumping times. Neglecting hydromechanical coupling, Weir (1999) presented a comprehensive theoretical analysis of the flow regimes and their duration for "fractured blocks" based on Laplace transform of the governing equations providing fundamental scaling relations. Storage properties of a fracture can be neglected in instances where the pressure is fairly constant along the fracture. In general, however, coupling between the flow and the mechanical state has to be accounted for. So far, this challenging problem has not been addressed with the same generality as
the uncoupled hydraulic aspects (but see Murdoch and Germanovich 2006, for horizontal fractures).
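The diagnostic use of log-log plots described above amounts to reading off the slope of the pressure change versus elapsed time. The following minimal sketch estimates that slope by least squares on synthetic data and maps it to the half-slope (linear flow) and quarter-slope (bilinear flow) regimes. The function names, the tolerance, and the synthetic data are hypothetical; practical well-test interpretation additionally relies on the pressure derivative and on the succession of regimes.

```python
import math


def loglog_slope(times, pressures):
    """Least-squares slope of ln(pressure change) versus ln(elapsed time)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(p) for p in pressures]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def classify_regime(slope, tol=0.05):
    """Map a log-log slope to a flow regime; the tolerance is an arbitrary choice."""
    if abs(slope - 0.5) < tol:
        return "linear flow (fracture or formation)"
    if abs(slope - 0.25) < tol:
        return "bilinear flow"
    return "other (e.g., radial or transitional)"


# Synthetic pressure changes following pure power laws (arbitrary units).
times = [10.0 * 2**i for i in range(8)]
bilinear = [3.0 * t**0.25 for t in times]
linear = [0.5 * t**0.5 for t in times]

for name, data in (("bilinear", bilinear), ("linear", linear)):
    slope = loglog_slope(times, data)
    print(f"{name}: slope = {slope:.3f} -> {classify_regime(slope)}")
```

A slope alone cannot distinguish fracture linear flow from formation linear flow, since both follow the half-slope; that distinction requires the timing of the regime and independent information on fracture conductivity.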
Non-Darcian Flow in Well Testing
Flow in narrow fractures may become turbulent when production rates are high (e.g., Kohl et al. 1997; Ortiz et al. 2013; Sen 1987, 1989; Wen et al. 2006) or when proppants are introduced in the wake of stimulation that tend to keep the fracture open but constitute obstacles in the flow path (e.g., Dvorkin et al. 1992). Furthermore, nonlinearity arises due to strong curvature of flow lines at the well entry (e.g., Chen and Jiao 1999). Previous work employed the Forchheimer equation to model non-Darcian flow in fractured rocks toward a well either analytically (Sen 1987) or numerically (Kohl et al. 1997). The latter actually modeled multi-rate flow experiments conducted in the open-hole depth interval of a well bore at the HDR test site Soultz. Semi-analytical solutions for non-Darcian flow in a single vertical fracture toward a well were obtained using the Boltzmann transform and a power-law relation between hydraulic gradient and specific discharge (Wen et al. 2006). Heterogeneity of the subsurface may cause eddy-like flow patterns also requiring employment of a nonlinear transport law (Hemker and Bakker 2006). The scales of spatial correlations and their effect on the response of pumping tests are a matter of ongoing research (e.g., Neuman et al. 2004).
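As a rough illustration of when the nonlinear term matters, the sketch below solves a commonly quoted one-dimensional form of the Forchheimer equation, $-\mathrm{d}p/\mathrm{d}x = (\eta/k)\,q + \beta\rho\,q^2$, for the specific discharge $q$ and compares it with the Darcy value. This specific form, the symbol $\beta$ for the inertial coefficient, the function name, and all numerical values (loosely representing a proppant-filled fracture) are assumptions made for this example and are not taken from the chapter.

```python
import math


def forchheimer_flux(grad_p, k, eta, rho, beta):
    """Specific discharge q >= 0 solving grad_p = (eta/k)*q + beta*rho*q**2.

    grad_p is the magnitude of the pressure gradient (-dp/dx >= 0) in Pa/m.
    """
    a = eta / k
    b = beta * rho
    if b == 0.0:
        return grad_p / a  # Darcy limit
    # Numerically stable positive root of b*q^2 + a*q - grad_p = 0
    return 2.0 * grad_p / (a + math.sqrt(a * a + 4.0 * b * grad_p))


# Hypothetical values: permeability (m^2), viscosity (Pa s), density (kg/m^3),
# inertial coefficient (1/m).
k, eta, rho, beta = 1e-10, 1e-3, 1000.0, 1e5

for grad_p in (1e2, 1e4, 1e6):
    q = forchheimer_flux(grad_p, k, eta, rho, beta)
    q_darcy = grad_p * k / eta
    print(f"gradient {grad_p:8.1e} Pa/m: q = {q:.3e} m/s, q/q_Darcy = {q / q_darcy:.3f}")
```

The ratio to the Darcy discharge stays close to one for small gradients and drops as the gradient grows, which is the qualitative behavior that motivates nonlinear transport laws near wells at high production rates.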
5.2
Transient Flow in Deformable Hydraulic Conduits with Simple Geometries
Some field observations, such as counterintuitive pressure increase at the rims of reservoirs produced over extended time spans, known as Noordbergum effect (e.g., Berg 2009; Detournay and Cheng 1988; Hsieh 1996; Kim and Parizek 1997; Rodrigues 1983), or inverse water-level fluctuations in the monitoring wells surrounding a production or injection well (e.g., Gellasch et al. 2013; Slack et al. 2013; Vinci et al. 2013), are expressions of hydromechanical coupling that is not covered by linear, homogeneous diffusion equations. On the level of the linearized treatment, the coupling term constitutes a variable source term for diffusion that itself is the solution of a partial differential equation. When considering individual fluid conduits, e.g., joints and fractures or faults, their geometric features become central for the hydromechanical response (e.g., Murdoch and Germanovich 2006). In particular, the pressure-sensitive and stress-sensitive aperture of joints controls their transmissivity and their stiffness (closely linked to storativity) (e.g., Bandis et al. 1983; Barton et al. 1985), thus constituting the key element of the coupling (e.g., Murdoch and Germanovich 2012). Below we derive the governing equations for a deformable fracture in detail to demonstrate the procedure and to highlight their characteristics that go beyond linear storage equations.
Basic Equations IV: Radial Flow in a Deformable Ellipsoidal Conduit
The full treatment of flow of compressible fluids in deformable conduits requires the solution of the Navier-Stokes equations (16) with boundaries for the fluid flow whose current geometry varies with fluid pressure and stress (e.g., Al-Yaarubi et al. 2005). Here, we use instead the simple model of an ellipsoidal conduit to demonstrate the difference in governing equations when hydromechanical coupling is included with respect to the diffusion equations introduced above (see also Vinci et al. 2013). At first glance, an ellipsoidal conduit appears to be a fairly unrealistic model for natural fractures with rough surfaces that are partly in contact when the normal stress on the fracture exceeds the fluid pressure on it. Yet, it is the very characteristic of real fractures to maintain finite hydraulic conductivity that one may link to an effective hydraulic aperture as suggested by the cubic law (28) at least from a phenomenological perspective (Witherspoon et al. 1980). This effective hydraulic aperture is determined by the surface characteristics of the fracture planes and acting external forces and is modulated by variations in fluid pressure during the transient flow. This response in aperture to external forces and pressure along the fracture can be addressed by a "bulk" or a "structure" stiffness that neglects the complexity of the local deformations of contacts. Thus, prescribing an ellipsoidal shape simply prescribes a specific "bulk" or "structural" stiffness parameter. A geometry of the model fracture different from the chosen ellipsoidal shape will undoubtedly affect the details of the pressure transients, but at this point, we aim solely at demonstrating the principal consequences of an account of purely elastic hydromechanical coupling.
Conservation of mass of the fluid completely filling an ellipsoidal conduit of volume $V_{\mathrm{frac}}$, occupying different volumes $V(t)$ in space during the flow, is satisfied when

$$\frac{\mathrm{d}}{\mathrm{d}t}\int_{V(t) \subset V_{\mathrm{frac}}} \mathrm{d}m_f = \frac{\mathrm{d}}{\mathrm{d}t}\int_{V(t) \subset V_{\mathrm{frac}}} \rho^{fR}\,\mathrm{d}v = 0. \tag{79}$$

Exploiting the symmetry, we specifically consider hollow cylinders (inner radius $r_i$, outer radius $r_o$) with end faces determined by the oblate shape of the ellipsoid to arrive at a line integral:

$$\frac{\mathrm{d}}{\mathrm{d}t}\int_{r_i}^{r_o} 2\pi\,\rho^{fR}\,\delta(r, t)\,r\,\mathrm{d}r = 0 \tag{80}$$

where aperture $\delta$ describes the fracture dimension normal to the radial direction. We focus on ellipsoids with large aspect ratios (i.e., a diameter-aperture ratio $2R/\delta \gg 1$) and are thus permitted to treat velocity as purely radial. Applying Reynolds transport theorem (e.g., Holzapfel 2005) to compute the material time derivative of the line integral and transferring back to the current configuration yields
Zro
fR ır
d ln ı 1 dr d ln fR C C C @r vf dt dt r dt
dr D 0 :
(81)
ri
Since we are free in the choice of the hollow cylinder, the integrand has to vanish everywhere. Thus, the local form of fluid mass balance in an ellipsoidal fracture reads
$$\partial_t \ln(\rho^{fR}\delta) + w_f\,\partial_r \ln(\rho^{fR}\delta) + \partial_r w_f + \frac{w_f}{r} = 0, \tag{82}$$
after writing out the material derivatives and approximating $v_f = w_f$, i.e., assuming that the velocity of the solid phase is negligible but for its effect on aperture evolution. The local form can also be found using the localization theorem (e.g., Gonzalez and Stuart 2008). Considering the large aspect ratio of the fracture and the small curvature of the fracture surface, we approximate the flow in the ellipsoidal conduit by Poiseuille’s law for flow between two parallel plates (27) as a specific representation of Darcy’s law and thus the balance of momentum to arrive at
$$w_{fr} = -\frac{\delta^2(r,t)}{12\,\eta^{fR}}\,\partial_r p. \tag{83}$$
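The insertion of the Poiseuille-type flux into the local mass balance can be traced term by term. The following sketch assumes a barotropic fluid with constant compressibility $\beta_f = \partial \ln\rho^{fR}/\partial p$ and constant viscosity $\eta^{fR}$ (both assumptions made here for illustration), and it assumes the mass balance in the form $\partial_t \ln(\rho^{fR}\delta) + w_f\,\partial_r \ln(\rho^{fR}\delta) + \partial_r w_f + w_f/r = 0$.

```latex
\begin{align*}
\partial_t \ln(\rho^{fR}\delta)
  &= \beta_f\,\partial_t p + \frac{\partial_t \delta}{\delta}, \\
w_f\,\partial_r \ln(\rho^{fR}\delta)
  &= -\frac{\delta^2}{12\,\eta^{fR}}\,\beta_f\,(\partial_r p)^2
     - \frac{\delta\,\partial_r\delta}{12\,\eta^{fR}}\,\partial_r p, \\
\partial_r w_f + \frac{w_f}{r} = \frac{1}{r}\,\partial_r (r\,w_f)
  &= -\frac{\delta^2}{12\,\eta^{fR}}\,\frac{1}{r}\,\partial_r (r\,\partial_r p)
     - \frac{2\,\delta\,\partial_r\delta}{12\,\eta^{fR}}\,\partial_r p .
\end{align*}
```

Summing the three lines and dividing by $\beta_f$ merges the two contributions proportional to $\partial_r\delta\,\partial_r p$ into a single convection term with prefactor $\delta/(4\,\eta^{fR}\beta_f)$, while the aperture rate $\partial_t\delta/(\beta_f\,\delta)$ moves to the right-hand side as a source term.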
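Upon insertion of (83) into (82), a prototype one-dimensional convection-diffusion equation is gained:
$$\partial_t p - \frac{\delta^2}{12\,\eta^{fR}\beta_f}\,\frac{1}{r}\,\partial_r(r\,\partial_r p) - \frac{\delta}{4\,\eta^{fR}\beta_f}\,\partial_r\delta\,\partial_r p - \frac{\delta^2}{12\,\eta^{fR}}\,(\partial_r p)^2 = -\frac{1}{\beta_f\,\delta}\,\partial_t\delta \tag{84}$$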
comprising a time-dependent term, a diffusion term, a convection term, a quadratic term, and a term at the right-hand side, related to the fracture-deformation velocity. The terms that find their correspondence to the ones in (48) reflect linear diffusion in radial symmetry with a source term due to temporal changes in aperture. The quadratic term and the convection term (due to the tapering of the conduit) exemplify what is “lost” during the linearization applied when deriving (48). To close the problem, we need to specify the relation between local aperture and fluid-pressure distribution in the conduit. For an ellipsoid, an analytic solution of the associated deformation problem is available (see, e.g., p. 489 of Sneddon 1995):
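$$\delta(\bar r = r/R) = \frac{2R}{\tilde E}\int_{\bar r}^{1} \frac{\xi}{\sqrt{\xi^2 - \bar r^2}} \left( \int_0^1 \frac{\eta\,p(\eta\,\xi)}{\sqrt{1-\eta^2}}\,d\eta \right) d\xi. \tag{85}$$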
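The nonlocal character of this relation can be explored numerically. The sketch below is a minimal illustration, assuming the standard form of Sneddon's solution for a pressurized penny-shaped conduit, $\delta(\bar r) = (2R/\tilde E)\int_{\bar r}^{1} \xi/\sqrt{\xi^2-\bar r^2}\,\int_0^1 \eta\,p(\eta\xi)/\sqrt{1-\eta^2}\,d\eta\,d\xi$ with $\tilde E = \pi E/[4(1-\nu^2)]$ and pressure given as a function of normalized radius. The function names, the quadrature, and the load `p0` are choices made here, not taken from the chapter.

```python
import math


def effective_modulus(E, nu):
    """E_tilde = pi*E / (4*(1 - nu^2)), the stiffness scale assumed above."""
    return math.pi * E / (4.0 * (1.0 - nu ** 2))


def aperture(r_bar, pressure, R, E, nu, n=200):
    """Aperture at normalized radius r_bar = r/R for a given pressure profile.

    `pressure` maps a normalized radius in [0, 1] to fluid pressure in Pa.
    Both integrals are regularized by substitution before applying the
    midpoint rule: u = sqrt(xi^2 - r_bar^2) and eta = sin(theta).
    """
    u_max = math.sqrt(max(0.0, 1.0 - r_bar ** 2))
    du = u_max / n
    dtheta = 0.5 * math.pi / n
    outer = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        xi = math.sqrt(u * u + r_bar * r_bar)
        inner = 0.0
        for j in range(n):
            s = math.sin((j + 0.5) * dtheta)
            inner += s * pressure(xi * s) * dtheta
        outer += inner * du
    return 2.0 * R / effective_modulus(E, nu) * outer


if __name__ == "__main__":
    # R, E, nu follow the worked example in the text; p0 is a hypothetical load.
    R, E, nu, p0 = 100.0, 10.6e9, 0.34, 1.0e6
    for r_bar in (0.0, 0.5, 0.9):
        print(r_bar, aperture(r_bar, lambda x: p0, R, E, nu))
```

For a uniform pressure the double integral collapses to the elliptical profile $2R\,p\,\sqrt{1-\bar r^2}/\tilde E$, which provides a convenient check of the quadrature; passing a non-uniform `pressure` function shows how a local pressure change alters the aperture everywhere.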
Modeling of Fluid Transport in Geothermal Research
This integral equation illustrates the nonlocal character of deformation; applying a pressure (normal stress) to the conduit surface at any position causes a displacement of the entire surface. Implicitly, we neglect differences between hydraulic and mechanical aperture. This approach seems fairly justified since changes in the two conceptually different apertures with pressure are of importance rather than their absolute values. Absolute values of these effective geometrical measures differ quite likely, but their sensitivity to pressure is probably similar as documented by the good correspondence between normal stiffness and pressure sensitivity of hydraulic conductivity of fractures (e.g., Pyrak-Nolte and Morris 2000).
Assessing the Practical Relevance of Hydromechanical Coupling for Flow in a Deformable Ellipsoidal Conduit For illustration, we present numerical results for an ellipsoidal conduit with radius $R = 100$ m and aperture $\delta(r_w, t_0) = 5\times10^{-4}$ m in a rock with $E = 10.6$ GPa and $\nu = 0.34$ subjected to a constant injection rate of $4\times10^{-4}$ m$^3$ s$^{-1}$ starting at $t = 0.1$ s. Calculations were performed with various combinations of terms switched on or off and also for a range of storage properties. For long pumping duration, all calculations yield the same slope of the pressure transient since eventually a flat pressure distribution is reached in all cases and continued injection simply lifts the pressure level everywhere in the ellipsoidal conduit when embedded in a non-permeable environment. At a distance $r = 50$ m from the injection well, the fully coupled treatment yields the peculiar inverse pressure response as a first deviation from the prescribed initial pressure (Fig. 5). In the chosen example configuration, the quadratic term is minute. The convective term is larger, but its modest effect on the pressure transient in comparison to the large differences between results for diffusion equations without coupling and the coupled calculations documents that the flow model is not overly sensitive to the exact geometry of the conduit. A rigid fracture responds almost instantaneously to the injection, and pressure exhibits only a short period of adjusting to the long-time behavior due to the finite fluid compressibility. The series of calculations using different values for specific storage capacity demonstrates that in principle the pressure transient of the full solution can be approximated (apart from the inverse pressure response) when choosing a large enough capacity in the linear diffusion equation (Fig. 5).
In comparison to the “constant-pressure reference” storage capacity $s_{\mathrm{ref}} = S_{\mathrm{ref}}/\delta = 1/K_f + R/(\delta\,\tilde E)$, however, we need too large a capacity to match the strong delay in response exhibited by the fully coupled solution. For the used elastic parameters and the prescribed fracture geometry, we estimate $s_{\mathrm{ref}} \approx 7\times10^{4}\,\beta_f$, while $s > 10^{6}\,\beta_f$ would be determined from matching the pressure record by an analysis based on the linear diffusion equation alone. The fracture appears significantly more deformable when its nonlocal deformation is neglected. To analyze the practical implications of the various physical effects covered by (84) in a more general way, we reformulate this governing equation in nondimensional variables by introducing characteristic scales. Employing characteristic
[Figure 5: log-log plot of pressure [Pa] versus time [s]; curves labeled “fully coupled”, “no convection”, and “linear diffusion” (rigid, $10^1$, $10^3$, $10^6$).]
Fig. 5 Pressure transient at a distance $r = 50$ m from the injection for an ellipsoidal fracture with radius $R = 100$ m and aperture $\delta(r_w, t_0) = 5\times10^{-4}$ m. Water was considered as the pumped fluid, with a density of 1,000 kg/m$^3$, a compressibility of $4.2\times10^{-10}$ Pa$^{-1}$, and a viscosity of $10^{-3}$ Pa s. The red lines representing calculations based on linear diffusion equations are labeled according to the chosen specific storage capacity value: “rigid” corresponds to storage due to the compressibility $\beta_f$ of the fluid alone; the numbers give the multiples of $\beta_f$ employed. The black lines represent results for the coupled problem either with all terms of Eq. (86) or with the convective term switched off (no convection) in the calculations. In this example, the effect of the quadratic term is too small to be resolved graphically
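values (marked by a tilde) allows us to define nondimensional variables (marked by a star): $\delta = \delta^\star\tilde\delta$, $r = r^\star\tilde r$, $t = t^\star\tilde t$, and $p = p^\star\tilde p$. Fracture radius is chosen as characteristic length scale, i.e., $\tilde r = R$, and the analytic solution for a fracture deformed by a constant pressure, $\tilde\delta = 4(1-\nu^2)\,\tilde p\,R/(\pi E) := \tilde p\,R/\tilde E$ (see, e.g., Valko and Economides 1995), is used as characteristic aperture, allowing us to replace the characteristic value for pressure by the one for the aperture. The characteristic time $\tilde t$ is represented by the time it takes in reality to apply the boundary conditions (i.e., the time needed until the pump reaches a prescribed pressure or flow rate). Finally, we define the characteristic (dimensionless) fracture aperture as $\tilde\gamma = 2R/\tilde\delta$, i.e., the diameter-to-aperture ratio. In the introduced dimensionless quantities, Eq. (84) reads
$$\tilde\gamma^{2} A\,\partial_{t^\star} p^\star - \frac{\delta^{\star 2}}{r^\star}\,\partial_{r^\star}\!\left(r^\star\,\partial_{r^\star} p^\star\right) - 3\,\delta^\star\,\partial_{r^\star}\delta^\star\,\partial_{r^\star} p^\star - \frac{B}{\tilde\gamma}\,\delta^{\star 2}\left(\partial_{r^\star} p^\star\right)^2 = -\tilde\gamma^{3} C\,\frac{1}{\delta^\star}\,\partial_{t^\star}\delta^\star \tag{86}$$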
with dimensionless numbers $A = 3\,\eta^{fR}\beta_f/\tilde t$, $B = 2\,\tilde E\,\beta_f$, and $C = 3\,\eta^{fR}/(2\,\tilde E\,\tilde t)$ scaling pressure transients, nonlinear diffusion, and the transients caused by hydromechanical coupling relative to diffusion-convection, respectively. With order-of-magnitude estimates of $\tilde t \approx 10^{0} \ldots 10^{2}$ s, $\tilde E \approx 10^{10}$ Pa, and $\eta^{fR} \approx 10^{-3}$ ($10^{-5}$) Pa s, $\beta_f \approx 10^{-10}$ ($10^{-6}$) Pa$^{-1}$ for water (gas), we find that the quadratic term is negligible for dimensionless apertures $2R/\tilde\delta$ exceeding $10^{1}$ ($10^{4}$). The occurrence of significant pressure transients and/or coupling terms requires dimensionless apertures in excess of about $10^{4}$. The convection term may reach some significance at the tip of the conduit given sufficiently large curvature. Dimensionless apertures of $10^{4}$ are rather typical from a field perspective for fractured and jointed media since even a feature with a substantial hydraulic width of up to 1 mm, for example, then has to have a length on the order of only 10 m.
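The order-of-magnitude argument can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration, assuming the dimensionless groups $A = 3\eta^{fR}\beta_f/\tilde t$, $B = 2\tilde E\beta_f$, $C = 3\eta^{fR}/(2\tilde E\tilde t)$ and the scalings in which the diameter-to-aperture ratio $2R/\tilde\delta$ enters squared with $A$, inversely with $B$, and cubed with $C$. The function names and the sampled ratios are choices made here.

```python
def dimensionless_numbers(eta, beta_f, e_tilde, t_char):
    """Return (A, B, C) for viscosity eta [Pa s], compressibility beta_f [1/Pa],
    stiffness e_tilde [Pa], and characteristic time t_char [s]."""
    a = 3.0 * eta * beta_f / t_char
    b = 2.0 * e_tilde * beta_f
    c = 3.0 * eta / (2.0 * e_tilde * t_char)
    return a, b, c


def scaled_coefficients(ratio, eta, beta_f, e_tilde, t_char):
    """Weights of the transient, quadratic, and coupling terms relative to the
    diffusion term for a given diameter-to-aperture ratio 2R/delta."""
    a, b, c = dimensionless_numbers(eta, beta_f, e_tilde, t_char)
    return {
        "transient": ratio ** 2 * a,
        "quadratic": b / ratio,
        "coupling": ratio ** 3 * c,
    }


if __name__ == "__main__":
    # Order-of-magnitude fluid properties quoted in the text.
    fluids = {
        "water": {"eta": 1e-3, "beta_f": 1e-10},
        "gas": {"eta": 1e-5, "beta_f": 1e-6},
    }
    for name, props in fluids.items():
        for ratio in (1e1, 1e4, 1e6):
            weights = scaled_coefficients(ratio, e_tilde=1e10, t_char=1.0, **props)
            print(name, ratio, weights)
```

Scanning the ratio over several decades makes the different powers visible: the quadratic weight decays with increasing ratio, whereas the coupling weight grows with its cube and therefore takes over first among the time-dependent terms.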
Oscillatory Flow and Dynamic Permeability As a special case of transient flow that is rather accessible to analytic solution, oscillatory flow has received quite a bit of attention. Studies were motivated by actual periodic excitation in various applications (such as the blood system) but also by the perspective of finding causal transient solutions by integration of the partial harmonic solutions when linearity of the underlying differential equations permits such a superposition. Periodic fluid-pressure perturbations have been used in the laboratory (e.g., Fischer and Paterson 1992; Kranz et al. 1990; Song and Renner 2006) and in the field (Rasmussen et al. 2003; Renner and Messar 2006) to derive hydraulic properties from the analyses of the diffusion equation when inertia effects are thought to be subordinate. Periodic pumping tests constitute a particularly useful extension of pulse testing (e.g., Black and Kipp 1981; Fokker et al. 2012, 2013; Kuo 1972; Marschall and Barczewski 1989). The flow rate is varied a number of times with a fixed frequency and amplitude. Advantages compared to single-pulse testing emerge for heterogeneous reservoirs since a variation in frequency allows the operator to “screen” through the underground to varying penetration depths (Renner and Messar 2006; Song and Renner 2007; Fokker et al. 2013). Also, the net-zero balance of fluid is advantageous when dealing with contaminated fluids. A large amount of mostly theoretical and numerical work on flow in simple conduits and regular porous media investigated harmonics with frequencies that require treatment of inertia forces. The classic problem of oscillatory flow of a compressible fluid through a deformable pipe is associated with quite a range of phenomena depending on material properties of the solid and the fluid and the frequency of the oscillations. 
The viscous forces dominating the stationary creeping flow (22) may be exceeded by inertia forces of either fluid or solid leading to elastic waves propagating in the fluid and/or the tube that may interfere in complex ways. Early studies by prominent scientists serve as an indication of the eminent technical importance of transient flow in tubes (e.g., Kirchhoff 1868; Rayleigh 1896). The number of subsequent theoretical treatments of this problem is impressive (e.g., Bernabé 2009; Korneev 2010; Tijdeman 1975; Womersley
1955), yet experimental studies are comparatively scarce (but see Auriault et al. 1985). Significant application potential goes way beyond the mentioned modeling of pore space as a network composed of conduits with close to tubular geometry. Applications encompass problems as diverse as the propagation of blood in the bifurcated blood-vein system (e.g., Zamir 2000) or Stoneley or tube waves in kilometer-deep boreholes (e.g., Norris 1989). Radial profiles of axial velocity for harmonic flow through a pipe assume plug-flow characteristics but with overshoots near the pipe wall (e.g., Kurzeja et al. 2013; Zhou and Sheng 1989) that are not observed for steady turbulent flow. The time-dependent velocity profiles scale with the Womersley number $\mathrm{Wo} = \omega R^2/\nu^{fR}$, where $\nu^{fR} = \eta^{fR}/\rho^{fR}$ denotes the kinematic viscosity, representing the ratio of transient inertial force and viscous force. Velocity profiles for oscillating radial flow between parallel plates exhibit similar features (e.g., Elkouh 1975). As the frequency of the applied pore fluid gradient increases, less pore fluid can be brought into motion due to inertia. The increasing dominance of inertial effects leads to an apparent decrease of permeability. Introducing a complex-valued dynamic permeability constitutes a compact account of the switch from viscosity-controlled diffusion to inertia-controlled wave propagation with increasing frequency (e.g., Charlaix et al. 1988; Johnson et al. 1987; Sheng and Zhou 1988). The associated generalized or dynamic Darcy’s law was previously justified from first principles (e.g., Zhou and Sheng 1989), using a homogenization process (e.g., Auriault et al. 1985; Boutin and Geindreau 2008), or other averaging techniques (e.g., Smeulders et al. 1992). Here, we give an account from the perspective of mixture theory. We start from the balance of momentum of the fluid phase (35) in which now the inertia forces cannot be neglected nor will we assume the solid to be rigid.
Furthermore, we distinguish between fluid volume that is free to flow through the pore network and fluid volume that is essentially stagnant, for example, in dead-end pores, and therefore moves in phase with the solid. The stagnant fluid volume is thought to constitute added mass for the solid. Thus the balance equation reads
$$\alpha_1\,\phi\,\rho^{fR}\,\ddot{\mathbf{x}}_f + (1-\alpha_1)\,\phi\,\rho^{fR}\,\ddot{\mathbf{x}}_s - \mathrm{div}\,\mathbf{T}^f = \phi\,\rho^{fR}\,\mathbf{b} + \hat{\mathbf{p}}^f \tag{87}$$
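where $0 \le \alpha_1 \le 1$ denotes the fraction of mobile fluid. Inserting the relations of stress and momentum exchange for a linear viscous fluid (33, 62) into (87), we arrive at
$$\alpha_1\,\phi\,\rho^{fR}\,\dot{\mathbf{w}}_f + \phi\,\rho^{fR}\,\ddot{\mathbf{x}}_s + \phi\,\mathrm{grad}\,p^f = -\frac{\eta^{fR}\phi^2}{k^s}\,\mathbf{w}_f + \phi\,\rho^{fR}\,\mathbf{b}. \tag{88}$$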
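For harmonic fields, e.g., $\mathbf{w}_f = \bar{\mathbf{w}}_f \exp(-i\omega t)$, the complex amplitudes (written here without the overbar) obey a relation closely resembling Darcy’s law (36)
$$\mathbf{q}_f = \phi\,\mathbf{w}_f = \frac{k(\omega)}{\eta^{fR}}\left(-\mathrm{grad}\,p^f + \rho^{fR}\,\mathbf{b} + \omega^2\rho^{fR}\,\mathbf{x}_s\right). \tag{89}$$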
As a further manifestation of the hydromechanical coupling associated with transient flow problems, the deformation of the solid skeleton adds a third driving force for oscillatory fluid flow in addition to pressure forces and body forces acting on the fluid. The complex-valued dynamic permeability
$$k(\omega) := \frac{k^s}{1 - i\,\omega\,\dfrac{\alpha_1\,\rho^{fR}\,k^s}{\phi\,\eta^{fR}}} =: \frac{k^s}{1 - i\,\dfrac{\omega}{\omega_c}} \tag{90}$$
has a real-valued low-frequency limit that coincides with $k^s$, expressing that flow and pressure gradient remain in phase for infinitely slow changes (i.e., its imaginary part is zero). At high frequencies, the real part of the dynamic permeability is inversely proportional to frequency to a power of 3/2, while its imaginary part is inversely proportional to frequency. The transition from the plateau to the power-law decrease occurs over a narrow frequency interval characterized by the critical frequency $\omega_c = \phi\,\eta^{fR}/(\alpha_1\,\rho^{fR}\,k^s)$. In some derivations, instead of $\phi/k^s$, the inverse square of a characteristic length scale of the pore space, $\tilde l$, is used, for example, hydraulic radius, or pore radius if pore space is constituted by cylindrical pores. The frequency scaling of dynamic permeability initially determined for straight tubes was suggested to be rather universal and fairly insensitive to corrugations of the tubes (Cortis et al. 2003) or details of the microstructure of synthetic and numerically modeled porous media (Chapman and Higdon 1992; Charlaix et al. 1988; Smeulders et al. 1992; Zhou and Sheng 1989) though randomness of the pore space appeared a crucial parameter (Zhou and Sheng 1989). Dispersion of pore size was found to lead to discrepancies between the scaling relation and numerical results (Achdou and Avellaneda 1992). The transition from dynamic frequency-dependent to static constant permeability marks the crossover of two length scales, the characteristic pore radius and the viscous skin depth $\delta_v = \sqrt{2\,\eta^{fR}/(\rho^{fR}\omega)}$, the characteristic size of the boundary layer at the pore wall in which transport is controlled by viscous forces. Thus, the transition frequency corresponds to a critical skin depth of $\delta_v^2 = \alpha_1\,\tilde l^2$ that gives the characteristic pore scale only for $\alpha_1 = O(1)$. Here introduced from the perspective of added mass, $\alpha_1$ was first called “3D dynamic viscosity factor” by Biot (1956b) and is considered to be inversely proportional to the tortuosity of the pore space (see Berryman 2003, for a detailed discussion). When connectivity between the pores becomes poor and the percolation threshold is approached, tortuosity may exceed $O(1)$ by far. Thus, one may expect heterogeneity to significantly affect the transition, an aspect so far hardly covered in the above-cited experimental and numerical investigations that employed rather regular porous media.
Indeed, increasing heterogeneity in numerical network models was shown to alter the transition characteristics (Bernabé 1997). In experimental rock mechanics, frequency dependence is investigated coming from the low-frequency end (Fischer and Paterson 1992; Gutman and Davidson 1975; Song and Renner 2006). The observed decrease in permeability with
increasing frequency is accompanied by an amplitude dependence of permeability that may serve as an indication of inertia effects. However, the deviation from the low-frequency plateau occurs at frequencies that are orders of magnitude different from estimates based on viscous-skin-depth scaling when relying on “characteristic” pore radii. Furthermore, the expected power-law relation with an exponent of 3/2 is not observed. Not accounting for heterogeneity of the pore network and neglecting hydromechanical coupling are likely responsible for the discrepancy between observations and theoretical predictions. Obviously, more work is needed to understand to what extent currently available experimental observations on rock samples at in situ conditions reflect features of inertia-related dynamic permeability. It was previously attempted to transform dynamic Darcy’s law (89) back into the time domain (e.g., Masson and Pride 2010). Apart from the involvement of cumbersome convolution integrals, this procedure raises the question to what extent it is warranted to extend the relation derived for oscillatory flow to general transient flows. Likely, transient solutions of the full set of the Navier-Stokes equations should be directly investigated in the time domain (e.g., Auriault 1980; Carcione and Cavallini 1993; Simon et al. 1984). The derivation of an extended, dynamic Darcy’s law by adding an acceleration term scaled with a relaxation time (Rehbinder 1989) may also prove suitable for general transient solutions under certain conditions. The relaxation time corresponds to the inverse of the critical frequency for dynamic permeability. In the limit of low pressures, a damped wave equation results for the fluid pressure for which transient and oscillatory solutions were presented and subsequently used to experimentally determine the relaxation time (Rehbinder 1992).
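The single-relaxation form of the dynamic permeability in Eq. (90) and the associated scales can be evaluated directly. The sketch below is a minimal illustration, assuming $k(\omega) = k^s/(1 - i\omega/\omega_c)$ with $\omega_c = \phi\,\eta^{fR}/(\alpha_1\rho^{fR}k^s)$ and the viscous skin depth $\delta_v = \sqrt{2\eta^{fR}/(\rho^{fR}\omega)}$. The function names and the sample property values (porosity, permeability) are hypothetical choices made here.

```python
import math


def critical_frequency(phi, eta, alpha1, rho, k_s):
    """Critical angular frequency omega_c [rad/s]; its inverse is the relaxation time."""
    return phi * eta / (alpha1 * rho * k_s)


def dynamic_permeability(omega, k_s, omega_c):
    """Complex permeability k(omega) for the exp(-i*omega*t) time convention."""
    return k_s / (1.0 - 1j * omega / omega_c)


def viscous_skin_depth(omega, eta, rho):
    """Thickness of the viscous boundary layer at angular frequency omega."""
    return math.sqrt(2.0 * eta / (rho * omega))


if __name__ == "__main__":
    # Hypothetical water-saturated rock: 10 % porosity, 1e-12 m^2 permeability.
    phi, eta, alpha1, rho, k_s = 0.1, 1e-3, 1.0, 1000.0, 1e-12
    omega_c = critical_frequency(phi, eta, alpha1, rho, k_s)
    print("omega_c [rad/s]:", omega_c, "relaxation time [s]:", 1.0 / omega_c)
    for factor in (1e-3, 1.0, 1e3):
        k = dynamic_permeability(factor * omega_c, k_s, omega_c)
        print(factor, k.real, k.imag, abs(k))
    print("skin depth at omega_c [m]:", viscous_skin_depth(omega_c, eta, rho))
```

At the critical frequency the real and imaginary parts of this particular expression are equal, which is one way to locate the transition in measured or simulated spectra; the skin depth evaluated there equals $\sqrt{2\alpha_1 k^s/\phi}$ under the stated assumptions.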
5.3 Complex Geometries: Mesoscopic Loss Phenomena
As discussed in the previous sections, the mathematical framework for hydromechanics of homogeneous, isotropic poro-elastic media is constituted by two coupled partial differential equations, one representing conservation of momentum of the fluid-saturated porous medium and one deriving from conservation of partial momentum of the fluid phase. Unconditional decoupling is found only when these equations are formulated using the nonmeasurable quantity fluid increment introduced by Biot. Certain assumptions on relations among the physical properties of the involved fluid and solid may allow for an approximate decoupled description, such as homogeneous diffusion equations for pore fluid pressure. The dominance of decoupled diffusion processes also requires inertia terms to be negligible, i.e., flow processes have to be sufficiently slow. Elastic waves employed in geoscience to analyze the subsurface, in contrast, cover a large range in frequencies depending on the particular application, the source characteristics, etc. Accordingly, a physical description is needed that includes the “full” range of physical processes (from wave propagation to diffusion). The
passage of seismic waves through porous media, for example, causes changes in pore pressure that in turn may – depending on the time scale of deformation, i.e., the wave’s frequency – induce fluid flow, an intrinsically dissipative mechanism leading to attenuation of the wave. Since Biot’s seminal work (Biot 1956a), it was shown in several treatments how the diffusive fluid flow can be equivalently described as a highly dissipative acoustic wave in the pore fluid, often called the slow P-wave or Biot’s wave (e.g., Pride 2005; Smeulders 2005). The treatment of these waves, including attenuation and dispersion phenomena, rests on a (homogeneous) macroscopic analysis that employs microscopic aspects of the pores only in the first steps through the involved model approximations. It is now widely recognized that Biot’s poro-elastic approach cannot account for attenuation in the low-frequency range ( 0 for $t \to \infty$. Secondly, the results permit the determination of effective time-dependent properties of the investigated configuration. Regarding the configuration as a heterogeneous unit cell (RVE), the time-dependent evolution of pore pressure and solid displacements in the RVE leads to effective material properties that evolve in time. In the light of such an interpretation of an RVE in a heterogeneous
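[Figure 8: left panel, normalized phase velocity $c_p/c_p^0$ versus $\omega$ [1/s], rising from the Gassmann-Wood limit $c_p^0$ to the Gassmann-Hill limit $c_p^\infty$; right panel, inverse quality factor $1/Q$ versus $\omega$ [1/s], with reference slopes $\propto \omega$ and $\propto 1/\sqrt{\omega}$.]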
Fig. 8 Effective frequency-dependent phase velocity ($c_p/c_p^0$) and inverse quality factor ($1/Q$) of the interlayer problem related to geometry in Fig. 6 (right). A strong dispersive behavior of the velocity and attenuation can be observed. For the quasi-static case ($\omega \to 0$), velocity matches the Gassmann-Wood average $c_p^0$. For frequencies $\omega \to \infty$ velocity matches the Gassmann-Hill limit $c_p^\infty$. Intrinsic attenuation $1/Q$ matches the expected slopes in the low- and high-frequency range depicted with $\propto \omega$ and $\propto 1/\sqrt{\omega}$, respectively
poro-elastic medium, various investigations have been performed in the past that are often called “mesoscopic loss” or “interlayer flow” problems (e.g., Pride et al. 2004; Quintal et al. 2011; White et al. 1975) owing to the focus of interest on viscous loss phenomena associated with such mesoscopic flow phenomena. In analytic or numerical investigations, effective properties can be easily calculated according to the boundary conditions discussed. The total stress $\bar t$ applied at the outer undrained boundary $r = R_{\mathrm{out}}$ (see (92)) leads to a measurable (time-dependent) displacement at this boundary. The observations are similar to a quasi-static classic creep problem, but the time dependence is caused by nonlocal pressure diffusion within the spheres rather than time-dependent deformation of the solid grains (cf. Quintal et al. 2011). From the results of the quasi-static analysis, we determine the time-dependent and volumetrically averaged stresses $\langle T_{rr}\rangle(t)$ and strains $\langle \varepsilon_{rr}\rangle(t)$. Applying a discrete Fourier transform to the averaged quantities, we obtain the complex and frequency-dependent values $\bar T_{rr}(\omega)$ and $\bar\varepsilon_{rr}(\omega)$. A complex modulus can then be calculated from
$$H(\omega) = \frac{\bar T_{rr}(\omega)}{\bar\varepsilon_{rr}(\omega)}. \tag{95}$$
The modulus $H$ of heterogeneous fluid-saturated media can be calculated for the two limit cases $\omega \to 0$ and $\omega \to \infty$ applying the bounds of Gassmann-Wood and Gassmann-Hill (e.g., Mavko et al. 2003; Quintal et al. 2011). The inverse of the quality factor $Q$, a measure of the intrinsic attenuation of the
RVE, is obtained from the imaginary and real part of the complex modulus $H(\omega)$:
$$\frac{1}{Q(\omega)} = \frac{\mathrm{Im}(H(\omega))}{\mathrm{Re}(H(\omega))} = \tan\delta. \tag{96}$$
In the field of rheology, $1/Q = \tan\delta$ is often called the “loss factor” determining the lag between stresses and strains expressed by the phase angle $\delta$. The real part of the complex modulus measures the stored energy of the RVE (storage modulus). As such effective properties are often used in geophysical applications, we calculate a “primary” phase velocity $c_p(\omega)$ from the real part of $H(\omega)$
$$c_p(\omega) = \left[\mathrm{Re}\!\left(\frac{1}{c(\omega)}\right)\right]^{-1}, \quad \text{with} \quad c(\omega) = \sqrt{\frac{H(\omega)}{\rho_{\mathrm{RVE}}}}; \tag{97}$$
here, $\rho_{\mathrm{RVE}}$ is the averaged density of the RVE. We observe a strong dispersive behavior of the effective phase velocity $c_p(\omega)$ (normalized by $c_p^0 = c_p(\omega = 0)$) converging to the Gassmann-Hill limit for $\omega \to \infty$ (Fig. 8 left). The intrinsic attenuation, expressed by the inverse quality factor $1/Q(\omega)$, peaks at around $\omega = 10$ s$^{-1}$ (Fig. 8 right). This peak in attenuation is significantly shifted to lower frequencies compared to attenuation peaks in classic homogeneous poro-elastic media. Furthermore, the slope of $1/Q$ in the low- and the high-frequency range cannot be simply matched by viscoelastic models indicating that broadly distributed “relaxation times” are inherent to diffusive mechanisms. Mesoscopic loss is one physical source of attenuation observed in heterogeneous porous media (e.g., Mavko et al. 2003; Pride et al. 2004), and in principle determining attenuation characteristics provides a means of “hydraulic structure identification.” It will have to be seen to what extent the inherent “nonuniqueness” hampers profitable application.
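The post-processing from a complex modulus to attenuation and phase velocity can be sketched in a few lines. The example below is a minimal illustration, assuming $1/Q = \mathrm{Im}(H)/\mathrm{Re}(H)$ and $c_p = [\mathrm{Re}(1/c)]^{-1}$ with $c = \sqrt{H/\rho_{\mathrm{RVE}}}$. The single-relaxation (Zener-type) modulus is a hypothetical stand-in used only to exercise the formulas; as noted in the text, it does not reproduce the slopes of the mesoscopic-loss problem, and all numerical values are invented for illustration.

```python
import cmath


def inverse_quality_factor(modulus):
    """Loss factor 1/Q = Im(H) / Re(H) of a complex modulus H."""
    return modulus.imag / modulus.real


def phase_velocity(modulus, rho_rve):
    """Phase velocity c_p = 1 / Re(1/c) with complex velocity c = sqrt(H / rho)."""
    c = cmath.sqrt(modulus / rho_rve)
    return 1.0 / (1.0 / c).real


def zener_modulus(omega, h_low, h_high, tau):
    """Hypothetical single-relaxation modulus between a low- and a
    high-frequency limit; a stand-in for H obtained from an RVE analysis."""
    x = 1j * omega * tau
    return h_low + (h_high - h_low) * x / (1.0 + x)


if __name__ == "__main__":
    h_low, h_high, tau, rho = 20e9, 24e9, 0.1, 2500.0  # illustrative values
    c0 = phase_velocity(complex(h_low), rho)
    for omega in (0.1, 1.0, 10.0, 100.0, 1000.0):
        h = zener_modulus(omega, h_low, h_high, tau)
        print(omega, phase_velocity(h, rho) / c0, inverse_quality_factor(h))
```

In an actual workflow the list of moduli would come from the ratio of the Fourier-transformed averaged stress and strain histories rather than from a closed-form model; the two helper functions apply unchanged to such data.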
6 Concluding Remarks
Up to now, the sets of partial differential equations describing hydromechanical problems appear rather inaccessible to analytic solutions. Thus, a range of numerical methods have been developed and employed addressing either poro-elastic media (see review by Carcione et al. 2010) or individual conduits (e.g., Zimmerman et al. 2005). Wave phenomena received considerable attention, notably Biot’s slow wave (e.g., Bokov and Ionov 2002; Ferrazzini and Aki 1987; Korneev 2008, 2011; White 1965), investigating bulk waves as well as waves along conduits. Injection-induced seismicity is one crucial field beyond elasticity (see recent reviews
by Ellsworth 2013; Evans et al. 2012). Key topics are understanding the source mechanics and the coupling between failure and evolution of hydraulic properties (e.g., Cappa and Rutqvist 2011). Further challenges, partly related to evolution during failure, remain the scale dependence of hydraulic properties and the degree and characterization of their heterogeneity (e.g., Neuman 2008). Smart pumping protocols and geophysical surveys are needed for optimum characterization; the ambiguity in model parameters may be partly offset by the need for the identification of changes in system characteristics rather than constraining absolute properties. Albeit crucial – since the fluid transport is the very prerequisite for transporting heat to the surface – understanding subsurface fluid flow remains just one of several steps along the way of modeling deep geothermal reservoirs to the extent that energy provision could be reliably predicted. Chemical processes and – obviously not to be overlooked – heat transport are intimately coupled to the fluid transport (e.g., Kümpel 2003). The tools for and the community performing thermo-hydromechanical-chemical modeling are growing steadily. Apparently, three-dimensional modeling is important to capture heat diffusion and advective-conductive heat transfer (e.g., Kolditz 1995a,b). The role of fractures in heat transport continuously experiences particular attention (e.g., McDermott et al. 2009; Mohais et al. 2011; Xiang and Zhang 2012). Massive collection of direct temperature data will probably remain the exception for deep reservoirs, but tracer tests may serve as surrogates (e.g., Qian et al. 2011; Read et al. 2013).
Since nonlinear transport characteristics (i.e., non-Fickian, non-Fourier in addition to non-Darcian) are also encountered for solutes and heat (Neuman and Zhang 1990; Amiri and Vafai 1994; Bruderer and Bernabé 2001; Geiger and Emmanuel 2010), a lot of research work is still required in the context of geothermal energy provision.
References Achdou Y, Avellaneda M (1992) Influence of pore roughness and pore-size dispersion in estimating the permeability of a porous medium from electrical measurements. Phys Fluids A 4:2651–2673 Al-Yaarubi AH, Pain CC, Grattoni CA, Zimmerman RW (2005) Navier-Stokes simulations of fluid flow through a rock fracture. In: Faybishenko B, Witherspoon P, Gale J (eds) Dynamics of fluids and transport in fractured rocks. Geophysical monograph, vol 162. American Geophysical Union, Washington, DC, pp 55–64 Amiri A, Vafai K (1994) Analysis of dispersion effects and non-thermal equilibrium, non-darcian, variable porosity incompressible flow through porous media. Int J Heat Transf 37(6):939–954 Andrade J, Almeida M, Filho JM, Havlin S, Suki B, Stanley H (1997) Fluid flow through porous media: the role of stagnant zones. Phys Rev Lett 82:3901–3904 Auradou H, Drazer G, Boschan A, Hulin J, Koplik J (2006) Flow channeling in a single fracture induced by shear displacement. Geothermics 35:576–588 Auriault J (1980) Dynamic behaviour of a porous medium saturated by a newtonian fluid. Int J Eng Sci 18:775–785 Auriault JL, Borne L, Chambon R (1985) Dynamics of porous saturated media, checking of the generalized law of darcy. J Acoust Soc Am 77(5):1641–1650 Avellaneda M, Torquato S (1991) Rigorous link between fluid permeability, electrical conductivity, and relaxation times for transport in porous media. Phys Fluids A 3:2529–2540
Bandis S, Lumsden A, Barton N (1983) Fundamentals of rock joint deformation. Int J Rock Mech Min Sci Geomech Abstr 20(6):249–268 Barton N, Bandis S, Bakhtar K (1985) Strength, deformation and conductivity coupling of rock joints. Int J Rock Mech Min Sci Geomech Abstr 22(3):121–140 Bear J (1972) Dynamics of fluids in porous media. American Elsevier Publishing Co., New York. Reprinted by Dover, New York, 1988 Berg SJ, Hsieh PA, Illman WA (2011) Estimating hydraulic parameters when poroelastic effects are significant. Ground Water. 49(6):815–29. doi: 10.1111/j.1745–6584.2010.00781.x Berg SJ (2009) The Noordbergum effect and its impact on the estimation of hydraulic parameters. AGU spring meeting abstracts, pp 200–207 Berkowitz B, Ewing R (1998) Percolation theory and network modeling – applications in soil physics. Surv Geophys 19:23–72 Bernabé Y (1987) The effective pressure law for permeability during pore pressure and confining pressure cycling of several crystalline rocks. J Geophys Res 92(B1):649–657 Bernabé Y (1995) The transport properties of networks of cracks and pores. J Geophys Res 10(B3):4231–4241. doi:10.1029/94JB02986 Bernabé Y (1997) The frequency dependence of harmonic fluid flow through networks of cracks and pores. Pure Appl Geophys 149:489–506 Bernabé Y (2009) Oscillating flow of a compressible fluid through deformable pipes and pipe networks: wave propagation phenomena. Pure Appl Geophys 166:1–26. doi:10.1007/s00024009-0484-3 Bernabé Y, Bruderer C (1998) Effect of the variance of pore size distribution on the transport properties of heterogeneous networks. J Geophys Res 103(B1):513–525. doi:10.1029/97JB02486 Bernabé Y, Brace W, Evans B (1982) Permeability, porosity and pore geometry of hot-pressed calcite. Mech Mater 1:173–183 Bernabé Y, Mok U, Evans B (2003) Permeability-porosity relationships in rocks subjected to various evolution processes. 
Pure Appl Geophys 160:937–960 Bernabé Y, Li M, Maineult A (2010) Permeability and pore connectivity: a new model based on network simulations. J Geophys Res 115:B10203. doi:10.1029/2010JB007444 Bernabé Y, Zamora M, Li M, Maineult A, Tang Y (2011) Pore connectivity, permeability, and electrical formation factor: a new model and comparison to experimental data. J Geophys Res 116:B11204. doi:10.1029/2011JB008543 Berryman J (1992) Effective stress for transport properties of inhomogeneous porous rock. J Geophys Res 97(B12):17409–17424 Berryman J (2003) Dynamic permeability in poroelasticity. Tech. rep., Stanford University, Stanford Exploration Project, Report 113 Biot MA (1956a) Theory of propagation of elastic waves in a fluid-saturated porous solid. I. Lowfrequency range. J Acoust Soc Am 29:168–191 Biot MA (1956b) Theory of propagation of elastic waves in a fluid-saturated porous solid. II. Highfrequency range. J Acoust Soc Am 29:168–191 Biot MA (1962) Mechanics of deformation and acoustic propagation in porous media. J Appl Phys 33:1482–1498 Biot MA, Willis DG (1957) The elastic coefficients of the theory of consolidation. J Appl Mech 24:594–601 Black JH, Kipp KL (1981) Determination of hydrogeological parameters using sinusoidal pressure test: a theoretical appraisal. Water Resour Res 17(3):686–692 Blair S, Berge P, Berryman J (1996) Using two-point correlation functions to characterize microgeometry and estimate permeabilities of sandstone and porous glass. J Geophys Res 101:20359–20375 Bokov P, Ionov A (2002) Tube-wave propagation in a fluid-filled borehole generated by a single point force applied to the surrounding formation. J Acoust Soc Am 112(6):2634–2644. doi:10.1121/1.1508781 Bourdarot G (1998) Well testing: interpretation methods. Editions Technip, Paris and Institut francais du pètrole, Rueil-Malmaison
Bourdet D (2002) Well test analysis: the use of advanced interpretation models. Elsevier, Amsterdam Bourdet D, Ayoub J, Pirard Y (1989) Use of pressure derivative in well test interpretation. SPE Form Eval (12777):293–302 Boutin C, Geindreau C (2008) Estimates and bounds of dynamic permeability of granular media. J Acoust Soc Am 124(6):3576–3593. doi:10.1121/1.2999050 Bowen RM (1982) Compressible porous media models by use of the theory of mixtures. Int J Eng Sci 20:697–735 Bower AF (2010) Applied mechanics of solids. CRC/Taylor & Francis Group, Boca Raton Brace W (1980) Permeability of crystalline and argillaceous rocks. Int J Rock Mech Min Sci 17:241–251 Brace W, Walsh J, Frangos W (1968) Permeability of granite under high pressure. J Geophys Res 73:2225–2236 Brodsky EE, Roeloffs E, Woodcock D, Gall I, Manga M (2003) A mechanism for sustained groundwater pressure changes induced by distant earthquakes. J Geophys Res 108(2390). doi:10.1029/2002JB002321 Brown SR (1987) Fluid flow through rock joints: the effect of surface roughness. J Geophys Res 92(B2):1337–1347. doi:10.1029/JB092iB02p01337 Bruderer C, Bernabé Y (2001) Network modeling of dispersion: transition from Taylor dispersion in homogeneous networks to mechanical dispersion in very heterogeneous ones. Water Resour Res 37(4):897–908. doi:10.1029/2000WR900362 Brush D, Thomson N (2003) Fluid flow in synthetic rough-walled fractures: Navier-Stokes, Stokes, and local cubic law simulations. Water Resour Res 39:1085. doi:10.1029/2002WR001346 Butler JJ (1997) The design, performance, and analysis of slug tests. CRC/LLC, Boca Raton. Florida, US Cappa F, Rutqvist J (2011) Modeling of coupled deformation and permeability evolution during fault reactivation induced by deep underground injection of CO2. Int J Greenh Gas Control 5(2):336–346 Carcione JM, Cavallini F (1993) Energy balance and fundamental relations in anisotropicviscoelastic media. 
Wave Motion 18:11–20 Carcione J, Morency C, Santos J (2010) Computational poroelasticity – a review. Geophysics 75(5):75A229–75A243. doi:10.1190/1.3474602 Cardenas M, Slottke D, Ketcham R, Sharp J (2009) Effects of inertia and directionality on flow and transport in a rough asymmetric fracture. J Geophys Res 114:B06204 Chapman A, Higdon J (1992) Oscillatory Stokes flow in periodic porous media. Phys Fluids A 4:2099–2116. doi:10.1063/1.858507 Chapman M, Liu M, Li X (2006) The influence of fluid-sensitive dispersion and attenuation on AVO analysis. Geophys J Int 167:89–105 Charlaix E, Kushnik A, Stokes J (1988) Experimental study of dynamic permeability in porous media. Phys Rev Lett 61:1595–1598 Chen S, Doolen GD (1998) Lattice Boltzmann method for fluid flow. Ann Rev Fluid Mech 30:329–364 Chen C, Jiao J (1999) Numerical simulation of pumping tests in multilayer wells with non-Darcian flow in the wellbore. Groundwater 37(3):465–474 Cinco-Ley H, Samaniego-V F (1981) Transient pressure analysis for fractured wells. J Pet Tech 33(9):1749–1766, SPE 7490 Cinco-Ley H, Samaniego VF, Dominguez AN (1978) Transient pressure behavior for a well with a finite-conductivity vertical fracture. Soc Pet Eng J 18(4):253–264, SPE 6014 Cortis A, Smeulders DMJ, Guermond JL, Lafarge D (2003) Influence of pore roughness on highfrequency permeability. Phys Fluids 15:1766–1775 Costa JAU, Almeida M, Makse H, Stanley H (1999) Inertial effects on fluid flow through disordered porous media. Phys Rev Lett 82(26):5249–5252. doi:10.1103/PhysRevLett.82.5249 Coussy O (1995) Mechanics of porous continua. Wiley, Chichester
Cryer CW (1963) A comparison of the three-dimensinal consolidation theories of Biot and Terzaghi. Q J Mech Appl Math 16(4):401–412 Da Prat G (1990) Well test analysis for fractured reservoir evaluation. No. 27 in Developments in Petroleum Science. Elsevier, Amsterdam/New York Daniel D (1993) Geotechnical practice for waste disposal. Chapman & Hall, London/New York de Boer R (2000) Theory of porous media. Springer, Berlin/New York Detournay E, Cheng AD (1988) Poroelastic response of a borehole in a non-hydrostatic stress field. Int J Rock Mech Min Sci Geomech Abstr 25(3):171–182 Detournay E, Cheng AHD (1993) Fundamentals of poroelasticity. In: Fairhurst C (ed) Comprehensive rock engineering: principles, practice and projects. Analysis and design method, vol II. Pergamon, Oxford/New York, pp 113–171 Doré A, Sinding-Larsen R (eds) (1996) Quantification and prediction of hydrocarbon resources. Norwegian petroleum society special publications, vol 6. Elsevier, Amsterdam/New York Dvorkin J, Nur A (1993) Dynamic poroelasticity: a unified model with the squirt and the Biot mechanisms. Geophysics 58:524–533 Dvorkin J, Mavko G, Nur A (1992) The dynamics of viscous compressible fluid in a fracture. Geophysics 57(5):720–726 Ehlers W (2002) Foundations of multiphasic and porous materials. In: Ehlers W, Bluhm J (eds) Porous media: theory, experiments and numerical applications. Springer, Berlin, pp 3–86 Ehlers W, Bluhm J (eds) (2002) Porous media. Springer, Berlin Elkouh A (1975) Oscillating radial flow between parallel plates. Appl Sci Res 30:401–417 Ellsworth W (2013) Injection-induced earthquakes (Review). Science 341(6142). doi:10.1126/science.1225942 Evans K, Zappone A, Kraft T, Deichmann N, Moia F (2012) A survey of the induced seismic responses to fluid injection in geothermal and CO2 reservoirs in Europe. Geothermics 41:30–54. 
doi:10.1016/j.geothermics.2011.08.002 Ferrazzini V, Aki K (1987) Slow waves trapped in a fluid-filled infinite crack: implications for volcanic tremor. J Geophys Res 92:9215–9223 Fischer G, Paterson M (1992) Measurement of permeability and storage capacity in rocks during deformation at high temperature and pressure. In: Evans B, Wong TF (eds) Fault mechanics and transport properties of rocks. Academic, London, pp 213–252 Fokker P, Renner J, Verga F (2012) Applications of harmonic pulse testing to field cases. SPE 154048–MS Fokker P, Renner J, Verga F (2013) Numerical modeling of periodic pumping tests in wells penetrating a heterogeneous aquifer. Am J Environ Sci 9:1–13 Gangi A (1978) Variation of whole and fractured porous rock permeability with confining pressure. Int J Rock Mech Min Sci 15:249–257 Garven G (1995) Continental-scale groundwater flow and geological processes. Ann Rev Earth Planet Sci 23:89–117 Gavrilenko P, Gueguen Y (1989) Pressure dependence of permeability: a model for cracked rocks. Geophys J Int 98:159–172. doi:10.1111/j.1365-246X.1989.tb05521.x Geiger S, Emmanuel S (2010) Non-fourier thermal transport in fractured geological media. Water Resour Res 46:W07504. doi:10.1029/2009WR008671 Gellasch C, Wang H, Bradbury K, Bahr J, Lande L (2013) Reverse water-level fluctuations associated with fracture connectivity. Groundwater Epub ahead of print:1–13. doi:10.1111/gwat.12040 Giacomini A, Buzzi O, Ferrero A, Migliazza M, Giani G (2008) Numerical study of flow anisotropy within a single natural rock joint. Int J Rock Mech Min Sci 45:47–58 Gleeson T, Wada Y, Bierkens M, van Beek L (2012) Water balance of global aquifers revealed by groundwater footprint. Nature 488:197–200. doi:10.1038/nature11295 Gonzalez O, Stuart AM (2008) A first course in continuum mechanics. Cambridge University Press, Cambridge Greenberg R, Brace W (1969) Archie’s law for rocks modeled by simple networks. J Geophys Res 74:2099–2102
Gubbins D (2001) The Rayleigh number for convection in the Earth’s core. Phys Earth Planet Inter 128:3–12 Gurevich B (2007) Comparison of the low-frequency predictions of Biot’s and de Boer’s poroelasticity theories with Gassmann’s equation. Appl Phys Lett 91:091919 Gutman R, Davidson J (1975) Darcy’s law for oscillatory flow. Chem Eng Sci 30(1):85–95 Hainzl S, Kraft T, Wassermann J, Igel H, Schmedes E (2006) Evidence for rainfall-triggered earthquake activity. Geophys Res Lett 33(L19303). doi:10.1029/2006GL027642 Hassanizadeh SM, Gray WG (1979a) General conservation equations for multi-phase systems: 1. Averaging procedure. Adv Water Resour 2:131–144 Hassanizadeh SM, Gray WG (1979b) General conservation equations for multi-phase systems: 2. Mass, momenta, energy, and entropy equations. Adv Water Resour 2:191–203 Hassanizadeh SM, Gray WG (1987) High velocity flow in porous media. Transp Porous Med 2:521–531 Haupt P (2000) Continuum mechanics and theory of materials. Springer, Berlin Hemker K, Bakker M (2006) Analytical solutions for whirling groundwater flow in two-dimensional heterogeneous anisotropic aquifers. Water Resour Res 42:W12419. doi:10.1029/2006WR004901 Hilfer R (2002) Review on scale dependent characterization of the microstructure of porous media. Transp Porous Media 46:373–390 Hof B, van Doorne C, Nieuwstadt JWF, Faisst H, Eckhardt B, Wedin H, Kerswell R, Waleffe F (2004) Experimental observation of nonlinear traveling waves in turbulent pipe flow. Science 305:1594–1598. doi:10.1126/science.1100393 Holzapfel GA (2005) Nonlinear solid mechanics. Wiley, Chichester/New York Horne RN (1996) Modern well test analysis: a computer-aided approach, 2nd edn. Petro Way, Palo Alto Houzé O, Viturat D, Fjaere OS (2007) Dynamic flow analysis – the theory and practice of pressure transient and production analysis and the use of data from permanent downhole gauges. V. 4.02. 
Available at http://www.kappaeng.com/ Howes FA, Whitaker S (1985) The spatial averaging theorem revisited. J Chem Eng Sci 40:1387–1392 Hsieh PA (1996) Deformation-induced changes in hydraulic head during ground-water withdrawal. Ground Water 34(6):1082–1089 Hünges E, Erzinger J, Kück J, Engeser B, Kessels W (1997) The permeable crust: geohydraulic properties down to 9101 m depth. J Geophys Res 102(B8):18255–18265. doi:10.1029/96JB03442 Hunt A (2005) Percolation theory for flow in porous media. Lect Notes Phys 674:173–186 Hutter K, Jöhnk K (2004) Continuum methods and physical modeling. Springer, Berlin Hutter K, Schneider L (2009) Solid-fluid mixtures of frictional materials in geophysical and geotechnical context. Springer, Berlin Ingebritsen S, Manning C (1999) Geological implications of a permeability-depth curve for the continental crust. Geology 27:1107–1110 Ingebritsen SE, Manning CE (2003) Implications of crustal permeability for fluid movement between terrestrial fluid reservoirs. J Geochem Explor 78–79:1–6 Johnson C, Greenkorn R, Woods E (1966) Pulse-testing: a new method for describing reservoir flow properties between wells. J Pet Technol 18(12):1599–1604. doi:10.2118/1517-PA Johnson D, Koplik J, Dashen R (1987) Theory of dynamic permeability and tortuosity in fluid saturated porous media. J Fluid Mech 176:379–402 Kim JM, Parizek RR (1997) Numerical simulation of the noordbergum effect resulting from groundwater pumping in a layered aquifer system. J Hydrol 202:231–243 Kirchhoff G (1868) Über den Einfluss der Wärmeleitung in einem Gase auf die Schallbewegung. Poggendorfer Annalen 134:177–193 Kohl T, Evans K, Hopkirk R, Jung R, Rybach L (1997) Observation and simulation of non-Darcian flow transients in fractured rock. Water Resour Res 33(3):407–418. doi:10.1029/96WR03495
Kolditz O (1995a) Modelling flow and heat transfer in fractured rocks: conceptual model of a 3-D deterministic fracture network. Geothermics 24(3):451–470. doi:10.1016/03756505(95)00020-Q Kolditz O (1995b) Modelling flow and heat transfer in fractured rocks: dimensional effect of matrix heat diffusion. Geothermics 24(3):421–437. doi:10.1016/0375-6505(95)00018-L Korneev V (2008) Slow waves in fractures filled with viscous fluid. Geophysics 73(1):N1–N7. doi:10.1190/1.2802174 Korneev V (2010) Low-frequency fluid waves in fractures and pipes. Geophysics 75(6):N97–N107. doi:10.1190/1.3484155 Korneev V (2011) Krauklis wave in a stack of alternating fluid-elastic layers. Geophysics 76(6):N47–N53. doi:10.1190/GEO2011-0086.1 Kranz R, Saltzman J, Blacic J (1990) Hydraulic diffusivity measurements on laboratory rock samples using an oscillating pore pressure method. Int J Rock Mech Min Sci Geomech Abstr 27:345–352 Kümpel HJ (1991) Poroelasticity: parameters reviewed. Geophys J Int 105:783–799 Kümpel HJ (ed) (2003) Thermo-hydro-mechanical coupling in fractured rock. Topical volume of pure applied geophysics. Birkhäuser, Basel/Boston. Reprinted from vol 160, no 5–6 Kuo C (1972) Determination of reservoir properties from sinusoidal and multirate flow tests in one or more wells. SPE J 12:499–507. doi:10.2118/3632-PA Kurzeja P, Steeb H, Renner J (2013) The critical frequency in biphasic media: beyond Biot’s approach. Poromechanics V:2361–2370. doi:10.1061/9780784412992.276 Madden T (1976) Random networks and mixing laws. Geophysics 41:1104–1125 Majer EL, Baria R, Stark M, Oates S, Bommer J, Smith B, Asanuma H (2007) Induced seismicity associated with enhanced geothermal systems. Geothermics 36(3):185–222 Mandel J (1953) Consolidation des sols. Géotechnique 7:287–299 Marschall P, Barczewski B (1989) The analysis of slug tests in the frequency domain. Water Resour Res 25(11):2388–2396. 
doi:10.1029/WR025i011p02388 Martys N, Garboczi E (1992) Length scales relating the fluid permeability and electrical conductivity in random two-dimensional model porous media. Phys Rev B 46(10):6080–6090 Masson YJ, Pride SR (2007) Poroelastic finite difference modeling of seismic attenuation and dispersion due to mesoscopic-scale heterogeneity. J Geophys Res 112:B03204 Masson Y, Pride S (2010) Finite-difference modeling of Biot’s poroelastic equations across all frequencies. Geophysics 75(2):N33–N41. doi:10.1190/1.3332589 Matthews C, Russell DG (1967) Pressure buildup and flow tests in wells. SPE monograph series, vol 1. Society of petroleum engineers of AIME, New York Matthews G, Moss A, Spearing M, Voland F (1993) Network calculation of mercury intrusion and absolute permeability in sandstone and other porous media. Powder Technol 76:95–107 Mavko G, Mukerji T, Dvorkin J (2003) The rock physics handbook. Tools for seismic analysis in porous media. Cambridge University Press, Cambridge May L (2010) Water resources engineering, 2nd edn. Wiley, New York McDermott C, Walsh R, Mettier R, Kosakowski G, Kolditz O (2009) Hybrid analytical and finite element numerical modeling of mass and heat transport in fractured rocks with matrix diffusion. Comput Geosci 13(3):349–361 McGarr A, Simpson D, Seeber L (2002) 40 case histories of induced and triggered seismicity. In: Lee WHK, Hiroo Kanamori PCJ, Kisslinger C (eds) International handbook of earthquake and engineering seismology. International geophysics, vol 81, part A. Academic, Amsterdam/Boston, pp 647–661. doi:10.1016/S0074-6142(02)80243-1 Mehrabian A, Abousleiman YN (2009) The dilative intake of poroelastic inclusions an alternative to the mandel-cryer effect. Acta Geotech 4:249–259 Mohais R, Xu C, Dowd P (2011) Fluid flow and heat transfer within a single horizontal fracture in an enhanced geothermal system. J Heat Transf 133(11):112603. 
doi:10.1115/1.4004369 Mok U, Bernabé Y, Evans B (2002) Permeability, porosity and pore geometry of chemically altered porous silica glass. J Geophys Res 107(B1):2015. doi:10.1029/2001JB000247
Munson B, Young D, Okiishi T (2006) Fundamentals of fluid mechanics. Wiley, New York Müntener O (2011) Serpentine and serpentinization: a link between planet formation and life. Geology 38:959–960. doi:10.1130/focus102010.1 Murdoch LC, Germanovich LN (2006) Analysis of a deformable fracture in permeable material. Int J Numer Anal Methods Geomech 30:529–561 Murdoch L, Germanovich L (2012) Storage change in a flat-lying fracture during well tests. Water Resour Res 48:W12528. doi:10.1029/2011WR011571 Nazridoust K, Ahmadi G, Smith D (2006) A new friction factor correlation for laminar, singlephase flows through rock fractures. J Hydrol 329:315–328. doi:10.1016/j.jhydrol.2006.02.032 Neuman S (2008) Multiscale relationships between fracture length, aperture, density and permeability. Geophys Res Lett 35:L22402. doi:10.1029/2008GL035622 Neuman S, Zhang YK (1990) A quasi-linear theory of non-Fickian and Fickian subsurface dispersion: 1. Theoretical analysis with application to isotropic media. Water Resour Res 26(5):887–902. doi:10.1029/WR026i005p00887 Neuman S, Guadagnini A, Riva M (2004) Type-curve estimation of statistical heterogeneity. Water Resour Res 40:W04201. doi:10.1029/2003WR002405 Nolte D, Pyrak-Nolte L, Cook N (1992) The fractal geometry of flow paths in natural fractures in rock and the approach to percolation. Pure Appl Geophys 131:111–138 Norris A (1989) The tube wave as a Biot slow wave. Geophysics 52:694–696 Olsson W, Brown S (1993) Hydromechanical response of a fracture undergoing compression and shear. Int J Rock Mech Min Sci 30(7):845–851. doi:10.1016/0148-9062(93)90034-B Ortiz A, Renner J, Jung R (2011) Hydromechanical analyses of the hydraulic stimulation of borehole Basel 1. Geophys J Int 185:1266–1287. doi:10.1111/j.1365-246X.2011.05005.x Ortiz A, Renner J, Jung R (2013) Two-dimensional numerical investigations on the termination of bilinear flow in fractures. Solid Earth 4:331–345. 
doi:10.5194/se-4-331-2013 Phipps L (1981) On transitions in radial flow between closely-spaced parallel plates. J Phys D Appl Phys 14:L197–L201 Piggott A, Elsworth D (1992) Analytical models for flow through obstructed domains. J Geophys Res 97(B2):2085–2093 Poe BD, Economides MJ (2000) Post-treatment evaluation and fractured well performance. In: Economides MJ, Nolte KG (eds) Reservoir stimulation, chap 12. Wiley, Chichester/New York Pride SR (2005) Relationships between seismic and hydrological properties. In: Rubin Y, Hubbard S (eds) Hydrogeophysics, chap 9. Springer, Dordrecht Pride S, Harris J, Johnson DL, Mateeva A, Nihei K, Nowack R, Rector J, Spetzler H, Wu R, Yamomoto T, Berryman J, Fehler M (2003) Permeability dependence of seismic amplitudes. Lead Edge 2(6):518–525 Pride SR, Berryman JG, Harris JM (2004) Seismic attenuation due to wave-induced flow. J Geophys Res 109:B01201. doi:10.1029/2003JB002639 Pyrak-Nolte LJ, Morris JP (2000) Single fractures under normal stress: the relation between fracture specific stiffness and fluid flow. Int J Rock Mech Min Sci 37:245–262 Qian J, ZChen, Zhan H, Luo S (2011) Solute transport in a filled single fracture under non-Darcian flow. Int J Rock Mech Min Sci 48:132–140 Quintal B, Steeb H, Frehner M, Schmalholz S (2011) Quasi-static finite element modeling of seismic attenuation and dispersion due to wave-induced fluid flow in poroelastic media. J Geophys Res 116:B01201 Quintal B, Steeb H, Frehner M, Schmalholz S, Saenger E (2012) Pore fluid effects on s-wave attenuation caused by wave-induced fluid flow. Geophysics 77:L13–L23 Rasmussen T, Haborak K, Young M (2003) Estimating aquifer hydraulic properties using sinusoidal pumping at the Savannah River site, South Carolina, USA. Hydrogeol J 11:466–482 Rayleigh L (1896) Theory of sound, vol II. 
Macmillan, London Read T, Bour O, Bense V, Borgne TL, Goderniaux P, Klepikova M, Hochreutener R, Lavenant N, Boschero V (2013) Characterizing groundwater flow and heat transport in fractured
rock using fiber-optic distributed temperature sensing. Geophys Res Lett 40:2055–2059. doi:10.1002/grl.50397 Rehbinder G (1989) Darcyan flow with relaxation effect. Appl Sci Res 46:45–72 Rehbinder G (1992) Measurement of the relaxation time in Darcy flow. Transp Porous Media 8(3):263–275 Renard P, de Marsily G (1997) Calculating equivalent permeability: a review. Adv Water Resour 20(5–6):253–278 Renner J, Messar M (2006) Periodic pumping tests. Geophys J Int 167:479–493 Renner J, Hettkamp T, Rummel F (2000) Rock mechanical characterization of an argillaceous host rock of a potential radioactive waste repository. Rock Mech Rock Eng 33(3):153–178 Renshaw C (1995) On the relationship between mechanical and hydraulic apertures in roughwalled fractures. J Geophys Res 100(B12):24629–24636 Revil A, Cathles LM (1999) Permeability of shaly sands. Water Resour Res 35(3):651–662 Rice JR, Cleary MP (1976) Some basic stress diffusion solutions for fluid-saturated elastic porous media with compressible constituents. Rev Geophys Space Phys 14:227–241 Rink M, Schopper JR (1968) Computations of network models of porous media. Geophys Prospect 16:277–294 Rodrigues JD (1983) The Noordbergum effect and characterization of aquitards at the Rio Maior mining project. Ground Water 21(2):200–207 Roeloffs E (1998) Persistent water level changes in a well near Parkfield, California, due to local and distant earthquakes. J Geophys Res 103:869–889 Rojstaczer S, Agnew D (1989) The influence of formation properties on the response of water levels in wells to earth tides and atmospheric loading. J Geophys Res 94:12403–12411 Schopper J (1982) Porosity and permeability. In: Angenheister G (ed) Landolt-Börnstein: Zahlenwerte und Funktionen aus Naturwissenschaften und Technik, vol V, 1a. Springer, Berlin, pp 184–304 Sen Z (1987) Non-Darcian flow in fractured rocks with a linear flow pattern. J Hydrol 92:43–57 Sen Z (1989) Non-linear flow toward wells. 
J Hydrol Eng 115(2):193–209 Shapiro S, Dinske C (2009) Fluid-induced seismicity: pressure diffusion and hydraulic fracturing. Geophys Prospect 57(2):301–310 Shapiro SA, Müller TM (1999) Seismic signatures of permeability in heterogeneous porous media. Geophysics 64:99–103 Sheng P, Zhou MY (1988) Dynamic permeability in porous media. Phys Rev Lett 61(14):1591–1594. doi:10.1103/PhysRevLett.61.1591 Sibson R, Moore J, Rankin A (1975) Seismic pumping – a hydrothermal fluid transport mechanism. J Geol Soc Lond 131:653–659 Simon BR, Zienkiewicz OC, Paul DK (1984) An analytical solution for the transient response of saturated porous elastic solids. Int J Numer Anal Met 8:381–398 Simpson DW, Leith WS, Scholz CH (1988) Two types of reservoir-induced seismicity. Bull Seism Soc Am 178(6):2025–2040 Sisavath S, Al-Yaarubi A, Pain C, Zimmerman R (2003) A simple model for deviations from the cubic law for a fracture undergoing dilation or closure. Pure Appl Geophys 160:1009–1022 Skjetne E, Auriault J (1999) New insights on steady, non-linear flow in porous media. Eur J Mech B/Fluids 13(1):131–145 Skjetne E, Hansen A, Gudmundsson J (1999) High-velocity flow in a rough fracture. J Fluid Mech 383:1–28. doi:10.1017/S0022112098002444 Slack T, Murdoch L, Germanovich L, Hisz D (2013) Reverse water-level change during interference slug tests in fractured rock. Water Resour Res 49:1552–1567. doi:10.1002/wrcr.20095 Smeulders D, Eggels R, van Dongen M (1992) Dynamic permeability: reformulation of theory and new experimental and numerical data. J Fluid Mech 245:211–227. doi:10.1111/j.1365246X.2007.03339.x Smeulders DMJ (2005) Experimental evidence for slow compressional waves. J Eng Meth-ASCE 131:908–917
Smits A, Marusic I (2013) Wall-bounded turbulence. Phys Today 66(9):25–30. doi:10.1063/PT.3.2114 Sneddon IN (1995) Fourier transforms. Dover, New York Song I, Renner J (2006) Experimental investigation into the scale dependence of fluid transport in heterogeneous rocks. Pure Appl Geophys 163:2103–2123. doi:10.1007/s00024-006-0121-3 Song I, Renner J (2007) Analysis of oscillatory fluid flow through rock samples. Geophys J Int 175:195–204. doi:10.1111/j.1365-246X.2007.03339.x Song I, Renner J (2008) Hydromechanical properties of Fontainebleau sandstone: Experimental determination and micromechanical modeling. J Geophys Res 113:B09211. doi:10.1029/ 2007JB005055 Spanos TJT (2010) The thermophysics of porous media. Taylor & Francis, Boca Raton Steeb H (2010) Ultrasound propagation in cancellous bone. Arch Appl Mech 80:489–502 Stein C, Stein S (1994) Constraints on hydrothermal flux through the oceanic lithosphere from global heat flow. J Geophys Res 99:3081–3095 Stein C, Stein S, Pelayo A (1995) Heat flow and hydrothermal circulation. In: Humphris S, Mullineaux L, Zierenberg R, Thomson R (eds) Physical, chemical, biological and geological interactions within hydrothermal systems. AGU monographs. American Geophysical Union, Washington, DC, pp 425–445 Stoll RD (1989) Sediment acoustics. Lecture notes in earth sciences. Springer, Berlin Terzaghi K (1923) Die Berechnung der Durchlässigkeitsziffer des Tones aus dem Verlauf der hydromechanischen Spannungserscheinungen. Sitzungsberichte der Akademie der Wissenschaften in Wien, mathematisch-naturwissenschaftliche Klasse 132:125–138 Tidwell V, Wilson J (1999) Permeability upscaling measured on a block of Berea sandstone: Results and interpretation. Math Geol 31(7):749–786 Tijdeman L (1975) On the propagation of sound waves in cylindrical tubes. J Sound Vib 39(1):1–33 Torquato S (2002) Random heterogeneous materials: microstructure and macroscopic properties, vol 16. 
Springer, New York Townend J, Zoback M (2000) How faulting keeps the crust strong. Geology 28(5):399–402 Truckenbrodt EA (1996) Fluidmechanik: Band 1: Grundlagen und elementare Strömungsvorgänge dichtebeständiger Fluide (German Edition). Springer, Berlin Valko P, Economides M (1995) Hydraulic fracture mechanics. Wiley, Chichester Verruijt A (2010) An introduction to soil dynamics. Springer, Berlin Vinci C, Renner J, Steeb H (2013) A hybrid-dimensional approach for an efficient numerical modeling of the hydro-mechanics of fractures. Water Resour Res 50: 1616–1635. doi:10.1002/2013WR014154 Wang HF (2000) Theory of linear poroelasticity with applications to geomechanics and hydrogeology. Princeton University Press, Princeton/Oxford Weir G (1999) Single-phase flow regimes in a discrete fracture model. Water Resour Res 35:65–73 Wen Z, Huang G, Zhan H (2006) Non-Darcian flow in a single confined vertical fracture toward a well. J Hydrol 330:698–708 White J (1965) Seismic waves: radiation, transmission, and attenuation. McGraw-Hill, New York White JE, Mikhaylova NG, Lyakhovitskiy FM (1975) Low-frequency seismic waves in fluidsaturated layered rocks. Earth Phys 10:44–52 Wilma´nski K (1998) A thermodynamic model of compressible porous materials with the balance equation of porosity. Transp Porous Med 32:21–47 Wilma´nski K (2006) A few remarks on Biot’s model and linear acoustics of poroelastic saturated materials. Soil Dyn Earthq Eng 26:509–536 Witherspoon P, Wang J, Iwai K, Gale J (1980) Validity of cubic law for fluid-flow in a deformable rock fracture. Water Resour Res 16:1016–1024 Womersley G (1955) Method for the calculation of velocity, rate of flow and viscous drag in arteries when the pressure gradient is known. J Physiol 127(3):553–563 Xiang Y, Zhang Y (2012) Two-dimensional integral equation solution of advective-conductive heat transfer in sparsely fractured water-saturated rocks with heat source. Int J Geomech 12(2):168–175
Yeo I, de Freitas M, Zimmerman R (1998) Effect of shear displacement on the aperture and permeability of a rock fracture. Int J Rock Mech Min Sci 35:1051–1070 Yeung K, Chakrabarty C, Zhang X (1993) An approximate analytical study of aquifers with pressure-sensitive formation permeability. Water Resour Res 29(10):3495–3501 Yilmaz O, Nolen-Hoeksema R, Nur A (1994) Pore pressure profiles in fractured and compliant rocks. Geophys Prospect 42:693–714 Zamir M (2000) The physics of pulsatile flow. Springer, New York Zhou MY, Sheng P (1989) First-principles calculations of dynamic permeability in porous media. Phys Rev B 39(16):12027–12039. doi:10.1103/PhysRevB.39.12027 Zienkiewicz OC, Shiomi T (1984) Dynamic behaviour of saturated porous media; the generalized Biot formulation and its numerical solution. Int J Numer Anal Met 8:71–96 Zienkiewicz OC, Chan HC, Pastor M, Schrefler BA, Shiomin T (1999) Computational geomechanics with special reference to earthquake engineering. Wiley, Chichester Zimmerman R, Somerton W, King M (1986) Compressibility of porous rocks. J Geophys Res 91:12765–12777 Zimmerman R, Chen D, Cook N (1992) The effect of contact area on the permeability of fractures. J Hydrol 139:79–96 Zimmerman R, Al-Yaarubi A, Pain C, Grattoni C (2005) Nonlinear regimes of fluid flow in rock fractures. Int J Rock Mech Min Sci 41:1A27 Zoback MD, Harjes H (1997) Injection-induced earthquakes and crustal stress at 9 km depth at the KTB deep drilling site, Germany. J Geophys Res 102:18477–18492
Fractional Diffusion and Wave Propagation
Yuri Luchko
Contents
1 Introduction
2 Continuous Time Random Walk Model
2.1 Brownian Motion
2.2 CTRW Model
2.3 Advection-Diffusion Equation
2.4 Fractional Diffusion-Wave Equations
3 Time-Fractional Diffusion-Wave Equations
3.1 Generalized Time-Fractional Diffusion Equation
3.2 Fractional Diffusion-Wave Equation
4 Fractional Wave Equation
4.1 Analysis of the Fractional Wave Equation
4.2 Fundamental Solution as a pdf
4.3 Extrema Points, Gravity, and “Mass” Centers of the Fundamental Solution and Location of Its Energy
4.4 Velocities of the Damped Waves
4.5 Discussion of the Obtained Results and Plots
5 Conclusions and Open Problems
References
Abstract
This chapter gives a short overview of current research on the application of partial differential equations of arbitrary (not necessarily integer) order to modeling anomalous transport processes (diffusion, heat transfer, and wave propagation) in nonhomogeneous media. On the microscopic level, these processes are described by the continuous time random walk (CTRW) model, which is the starting point for deriving deterministic equations for the time- and space-averaged quantities that characterize the transport processes on the macroscopic level. In this work, the deterministic models are derived in the form of partial differential equations of fractional order. In particular, a generalized time-fractional diffusion equation and a time- and space-fractional wave equation are introduced and analyzed in detail. Finally, some open questions and directions for further work are suggested.
Y. Luchko, Department of Mathematics, Physics, and Chemistry, Beuth Technical University of Applied Sciences Berlin, Berlin, Germany, e-mail: [email protected]
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_60
1 Introduction
In many geological applications, e.g., modeling of geothermal energy extraction (Luchko and Punzi 2011), stability and seismicity of fractal fault systems (Gudehus and Touplikiotis 2012), or propagation of damped waves (Luchko 2013), one has to deal with transport processes that take place in highly inhomogeneous media and are subject to external and internal forces acting on different time and space scales. This raises the question of how reliable the standard models for transport processes are in these complex environments. In particular, even though Fourier's and Fick's laws are still the standard tools for modeling transport processes on the macro-level, they often fail to capture the behavior of systems with anomalous components and phenomena, the so-called anomalous transport processes (see, e.g., Geiger and Emmanuel 2010). Within the last few decades, anomalous transport processes that do not follow Gaussian statistics have been observed and confirmed in several application areas, including the natural sciences, biology, the geological sciences, and medicine. This has stimulated intensive research on techniques and approaches for their adequate modeling. In this chapter, we consider one powerful approach, namely, the continuous time random walk (CTRW) model combined with fractional dynamics on the macro-level, and apply it to modeling anomalous diffusion, heat transport, and wave propagation in heterogeneous media. Models for anomalous transport processes in the form of time- and/or space-fractional partial differential equations have attracted particular attention and have been introduced and analyzed by a number of researchers since the 1980s. In particular, this kind of phenomenon is known to occur in inhomogeneous media that combine characteristics of solid-like materials that exhibit wave propagation and fluid-like materials that support diffusion processes.
In addition to the physical motivation and setting up of the models, mathematical analysis of the models and an overview of the numerical methods for their solution, some plots, and interpretation of the obtained results are presented in this chapter. In particular, we deal with the generalized time-fractional diffusion equation that of course can be employed to describe the anomalous heat conduction, too. To investigate this equation, a maximum principle well known for the elliptic and parabolic type PDEs is extended to the initial-boundary-value problems for the generalized diffusion equation of the fractional order. Then the Fourier spectral
method is applied to obtain solutions to these problems in explicit form. Another important equation that is considered in this chapter is a fractional generalization of the wave equation that describes propagation of damped waves in inhomogeneous media. In contrast to the fractional diffusion-wave equation, the fractional wave equation contains fractional derivatives of the same order $\alpha$, $1 \le \alpha \le 2$, both in space and in time. We show that this feature is a decisive factor for inheriting some crucial characteristics of the wave equation like a constant propagation velocity of both the maximum of its fundamental solution and its gravity and mass centers. Moreover, the first, the second, and the Smith centrovelocities of the damped waves described by the fractional wave equation are constant and depend just on the equation order $\alpha$. The fundamental solution of the fractional wave equation is determined and shown to be a spatial probability density function evolving in time that possesses finite moments up to the order $\alpha$. To illustrate analytical findings, results of numerical calculations and some plots are presented.
2
Continuous Time Random Walk Model
The continuous time random walk (CTRW) model was first introduced in Montroll and Weiss (1965) to model transport processes that show anomalous behavior. The main idea behind a CTRW is first to interpret a transport process on the microlevel as a flow of many parcels. If one assumes that the parcels are independent of each other, then their state and behavior can be described in terms of the probability $P(x,t)\,\Delta V\,\Delta t$ of an individual parcel to be located inside the volume $\Delta V$ within the time interval $\Delta t$. The function $P(x,t)$ is an unknown probability density function (pdf) that satisfies the so-called master equation and is connected with the jump pdf that characterizes the transport process and is supposed to be known. Imposing certain conditions on the jump pdf, the master equation can be transformed to deterministic differential or integrodifferential equations that the pdf $P(x,t)$ has to satisfy at least on the large time and space scales. In turn, the macrocharacteristics of the transport process, like the concentration $c(x,t)$ of the substance or its temperature $T(x,t)$ at a certain place $x$ and a certain time instant $t$, averaged over the time interval $\Delta t$, $t \in \Delta t$, and the space volume $\Delta V$, $x \in \Delta V$, are proportional to $P(x,t)$ and consequently governed by the same equations. This means that in the framework of the CTRW model, the key role is played by the pdf $P(x,t)$ that describes a random walk of an individual parcel within the transport process. In what follows, we consider the random walk of an individual parcel and analyze its characteristics. For notational simplicity, we focus on one-dimensional random walks. The multidimensional version follows the same steps. For more details regarding the CTRW models and their applications for modeling of anomalous transport processes, see, e.g., Berkowitz et al. (2002), Emmanuel and Berkowitz (2007), Fulger et al. (2008), Gorenflo and Mainardi (2009), Luchko (2012a), and Metzler and Klafter (2004).
2.1
Brownian Motion
In the framework of the well-known random walk model for the Brownian motion, the random walker jumps at each time step $t = t_0,\ t_0 + \Delta t,\ t_0 + 2\Delta t, \dots$ in a randomly selected direction, thereby covering the distance $\Delta x$, the lattice constant. Denoting by $P(x,t)\,\Delta x$ the probability that the random walker is located between $x$ and $x + \Delta x$ at the time $t$, the formula of total probability easily leads to the master equation
$$P(x, t + \Delta t) = \frac{1}{2} P(x + \Delta x, t) + \frac{1}{2} P(x - \Delta x, t). \tag{1}$$
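For the one-dimensional Brownian motion, we substitute the Taylor expansions
$$P(x, t + \Delta t) = P(x,t) + \Delta t\,\frac{\partial P}{\partial t} + O\big((\Delta t)^2\big) \quad \text{as } \Delta t \to 0,$$
$$P(x \pm \Delta x, t) = P(x,t) \pm \Delta x\,\frac{\partial P}{\partial x} + \frac{(\Delta x)^2}{2}\,\frac{\partial^2 P}{\partial x^2} + O\big((\Delta x)^3\big) \quad \text{as } \Delta x \to 0$$
into the master equation (1) and get the formula
$$\frac{\partial P}{\partial t} = \frac{(\Delta x)^2}{2\Delta t}\,\frac{\partial^2 P}{\partial x^2} + O(\Delta t) + \Delta x\; O\!\left(\frac{(\Delta x)^2}{\Delta t}\right) \quad \text{as } \Delta t \to 0,\ \Delta x \to 0. \tag{2}$$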
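In the continuum limit $\Delta t \to 0$ and $\Delta x \to 0$, this equation becomes the standard diffusion equation
$$\frac{\partial P}{\partial t} = d_1\,\frac{\partial^2 P}{\partial x^2} \tag{3}$$
under the condition that the diffusion coefficient
$$d_1 = \lim_{\Delta x \to 0,\ \Delta t \to 0} \frac{(\Delta x)^2}{2\Delta t}$$
is finite. Of course, the same procedure leads to the two- or three-dimensional diffusion equations for the two- or three-dimensional Brownian motion, respectively:
$$\frac{\partial P}{\partial t} = d_1\,\Delta P, \qquad \Delta := \sum_{i=1}^{n} \frac{\partial^2}{\partial x_i^2}, \quad n = 2, 3. \tag{4}$$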
When the random walker is located at the starting point $x = 0$, $x \in \mathbb{R}^n$, $n = 1, 2, 3$, at the time $t = 0$, then the initial condition
$$P(x, 0) = \prod_{i=1}^{n} \delta(x_i), \quad n = 1, 2, 3, \tag{5}$$
with the Dirac $\delta$-function has to be added to the model. The solution
$$P(x,t) = \frac{1}{\sqrt{4\pi d_1 t}}\,\exp\!\left(-\frac{x^2}{4 d_1 t}\right) \tag{6}$$
of the one-dimensional diffusion equation (3) with the initial condition (5) can be easily obtained, e.g., with the Laplace and Fourier integral transforms technique. The pdf (6) is a spatial Gaussian distribution at any time point $t > 0$ with the mean value $\mu = 0$ and the standard deviation $\sigma = \sqrt{2 d_1 t}$, which means that the mean squared displacement of a parcel that participates in the transport process is given by $\sigma^2(t) = 2 d_1 t$. It is important to mention that the central limit theorem ensures the same behavior of the pdf $P(x,t)$ on the large time and space scales in the case when the waiting time $\Delta t$ is not fixed, but the pdf that describes a distribution of the waiting times between two successive jumps possesses a finite mean value $\Delta t$.
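The linear growth of the mean squared displacement can be checked by iterating the master equation (1) directly on a lattice. The following Python sketch is an illustration only: the function names and the values of the lattice constant and time step are assumptions chosen for the example, not part of the model.

```python
def evolve_master_equation(n_steps):
    """Iterate the master equation (1), starting from a unit mass at x = 0.

    The lattice index j stands for the position x = j*dx. Half of the mass at
    each site is moved to each neighbour, which is equivalent to (1) because
    the jumps are symmetric. Returns a dict {j: probability}.
    """
    p = {0: 1.0}
    for _ in range(n_steps):
        new_p = {}
        for j, mass in p.items():
            new_p[j - 1] = new_p.get(j - 1, 0.0) + 0.5 * mass
            new_p[j + 1] = new_p.get(j + 1, 0.0) + 0.5 * mass
        p = new_p
    return p


def mean_squared_displacement(p, dx):
    """Second moment of the lattice distribution p with lattice constant dx."""
    return sum(mass * (j * dx) ** 2 for j, mass in p.items())


dx, dt, n_steps = 0.1, 0.005, 200      # illustrative values
d1 = dx ** 2 / (2 * dt)                # diffusion coefficient from the limit formula
t = n_steps * dt
p = evolve_master_equation(n_steps)
msd = mean_squared_displacement(p, dx)
print(msd, 2 * d1 * t)                 # the two numbers agree up to rounding
```

On the lattice the agreement is exact for every number of steps, since each jump adds $(\Delta x)^2$ to the variance; the continuum limit is only needed to pass from (1) to the differential equation (3).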
2.2
CTRW Model
In contrast to the random walk model for the Brownian motion, the CTRW model is based on the idea that the lengths of the jumps and the waiting times between two successive jumps are governed by a joint pdf $\psi(x,t)$ that is referred to as the jump pdf. From $\psi(x,t)$, the jump length pdf
$$\lambda(x) = \int_0^{\infty} \psi(x,t)\,dt \tag{7}$$
and the waiting time pdf
$$w(t) = \int_{-\infty}^{\infty} \psi(x,t)\,dx \tag{8}$$
can be deduced. The main characteristics of the CTRW models are the characteristic waiting time
$$T = \int_0^{\infty} w(t)\,t\,dt \tag{9}$$
and the jump length variance
$$\Sigma^2 = \int_{-\infty}^{\infty} \lambda(x)\,x^2\,dx. \tag{10}$$
They can be finite or infinite and this makes the difference between the CTRW models. Usually, the following different cases are distinguished:
• Both $T$ and $\Sigma^2$ are finite: Brownian motion (diffusion equation as a deterministic model)
• $T$ diverges, $\Sigma^2$ is finite: Sub-diffusion (time-fractional diffusion equation as a deterministic model)
• $T$ is finite, $\Sigma^2$ diverges: Markovian Levy flights (space-fractional diffusion equation as a deterministic model)
• Both $T$ and $\Sigma^2$ are infinite: Non-Markovian Levy flights (time-space-fractional diffusion equation as a deterministic model)
It is known that the master equations for the CTRW model can be formulated in the form of some integral equations of the convolution type (see, e.g., Metzler and Klafter 2000). Below we give a short summary of how to derive these equations. Let us denote by $\eta(x,t)$ the probability of the event that at the time instant $t$ a parcel just arrives at the point $x$. According to the law of total probability, $\eta(x,t)$ satisfies the equation
$$\eta(x,t) = \int_{-\infty}^{+\infty} dx' \int_0^{t} \eta(x',t')\,\psi(x - x', t - t')\,dt' + P_0(x)\,\delta(t), \tag{11}$$
$\psi(x,t)$ being the jump pdf that is supposed to be known. The pdf $P(x,t)$ that governs the event that at the time instant $t$ the parcel is located at the position $x$ is given by
$$P(x,t) = \int_0^{t} \eta(x,t')\,\Psi(t - t')\,dt', \tag{12}$$
where
$$\Psi(t) = 1 - \int_0^{t} w(t')\,dt' \tag{13}$$
is the probability of no jump event within the time interval $[0,t]$ and $w(t)$ is the waiting time pdf. The integral equations (11)–(13) determine the one-point probability density function that is an important part of the mathematical model for the CTRW but of course not enough to fully characterize the underlying stochastic process (see, e.g., Germano et al. 2009 for more details). Let us now transform Eqs. (11)–(13) into the frequency domain by applying the Fourier and the Laplace transforms. Applying the well-known convolution theorems for the Fourier and the Laplace transforms and solving the transformed equations for the unknown Fourier and Laplace transformed pdf $\hat{\tilde P}(\kappa,s)$, we get the formula
$$\hat{\tilde P}(\kappa,s) = \frac{1 - \tilde w(s)}{s}\;\frac{\hat P_0(\kappa)}{1 - \hat{\tilde\psi}(\kappa,s)}, \tag{14}$$
where $\hat P_0(\kappa)$ denotes the Fourier transform of the initial condition $P_0(x) := P(x,0)$. It is worth mentioning that a purely probabilistic proof of Eq. (14) is given in Germano et al. (2009). We remind the readers that the Fourier transform is defined by
$$\hat f(\kappa) = \mathcal{F}\{f(x); \kappa\} = \int_{-\infty}^{+\infty} e^{+i\kappa x} f(x)\,dx, \quad \kappa \in \mathbb{R},$$
and the Laplace transform by
$$\tilde f(s) = \mathcal{L}\{f(t); s\} = \int_0^{\infty} e^{-st} f(t)\,dt, \quad s \in \mathbb{C}.$$
2.3
Advection-Diffusion Equation
Until now, no particular assumptions on the densities $\lambda(x)$ and $w(t)$, apart from their integrability, have been made. In what follows, we assume for simplicity that the jump lengths and waiting times are independent random variables and the jump pdf $\psi(x,t)$ can be written in the decoupled form $\psi(x,t) = \lambda(x)\,w(t)$. In this case the equation $\hat{\tilde\psi}(\kappa,s) = \hat\lambda(\kappa)\,\tilde w(s)$ holds true. Straightforward calculations show that if $w(t)$ possesses a finite mean and $\lambda(x)$ a finite variance, then Eq. (14) can be transformed to the standard advection-diffusion equation for large $t$ and $|x|$ and therefore describes the Brownian motion on the large time and space scales. Indeed, in this case the Fourier transform of $\lambda(x)$ has the asymptotic behavior
$$\hat\lambda(\kappa) \approx 1 + i\kappa M_1 - \frac{M_2}{2}\,\kappa^2 \quad \text{as } \kappa \to 0, \tag{15}$$
where $M_1$ and $M_2$ are the first and the second moments of the jump length pdf $\lambda(x)$, respectively. Substituting this expression into (14), we get
$$\hat{\tilde P}(\kappa,s) \approx \frac{1 - \tilde w(s)}{s}\;\frac{\hat P_0(\kappa)}{1 - \tilde w(s)\left(1 + i\kappa M_1 - \frac{M_2}{2}\kappa^2\right)} \quad \text{as } \kappa \to 0,$$
that is equivalent to
$$s\,\hat{\tilde P}(\kappa,s) - \hat P_0(\kappa) \approx \frac{s\,\tilde w(s)}{1 - \tilde w(s)}\left(i\kappa M_1 - \frac{M_2}{2}\,\kappa^2\right)\hat{\tilde P}(\kappa,s) \quad \text{as } \kappa \to 0.$$
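Applying the inverse Fourier transform and making use of the differentiation theorem for the Fourier transform,
$$\mathcal{F}\left\{\frac{\partial}{\partial x} f(x,t); \kappa\right\} = -i\kappa\,\mathcal{F}\{f(x,t); \kappa\}, \qquad \mathcal{F}\left\{\frac{\partial^2}{\partial x^2} f(x,t); \kappa\right\} = -\kappa^2\,\mathcal{F}\{f(x,t); \kappa\},$$
one gets
$$s\,\tilde P(x,s) - P_0(x) \approx \frac{s\,\tilde w(s)}{1 - \tilde w(s)}\left(-v\,\frac{\partial}{\partial x}\tilde P(x,s) + d_1\,\frac{\partial^2}{\partial x^2}\tilde P(x,s)\right) \quad \text{as } |x| \to \infty, \tag{16}$$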
where $v = M_1$ and $d_1 = M_2/2$ can be interpreted as the velocity and the diffusion coefficient, respectively. When the waiting time pdf $w(t)$ possesses a finite mean (for an in-depth treatment of this problem, we refer to Emmanuel and Berkowitz (2007) and Geiger and Emmanuel (2010), where some methods for determining the mean value of $w(t)$ in the case of heat transport in porous media were discussed), the asymptotics of its Laplace transform $\tilde w$ can be represented in the form
$$\tilde w(s) \approx 1 - \tau s \quad \text{as } s \to 0, \tag{17}$$
where $\tau$ denotes the first moment of the pdf $w(t)$. Substituting (17) into Eq. (16), for which $s\,\tilde w(s)/(1 - \tilde w(s)) \to 1/\tau$ as $s \to 0$, and applying the inverse Laplace transform and the differentiation theorem for the Laplace transform, $\mathcal{L}\left\{\frac{\partial}{\partial t} f(x,t); s\right\} = s\,\mathcal{L}\{f(x,t); s\} - f(x,0)$, we obtain an initial-value problem $P(x,0) = P_0(x)$ for the standard advection-diffusion equation
$$\frac{\partial}{\partial t} P(x,t) = -v\,\frac{\partial}{\partial x} P(x,t) + d_1\,\frac{\partial^2}{\partial x^2} P(x,t) \quad \text{as } t \to \infty,\ |x| \to \infty \tag{18}$$
on the large time and space scales, where the constant factor $1/\tau$ has been absorbed into the coefficients $v$ and $d_1$.
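The step from Eq. (16) to Eq. (18) rests on the behavior of the factor $s\,\tilde w(s)/(1 - \tilde w(s))$ for small $s$. The following Python sketch evaluates this factor for two waiting time pdfs with the same finite mean $\tau$. Both pdfs, their closed-form Laplace transforms, and all function names are assumptions chosen for the illustration; the chapter does not single them out at this point.

```python
def laplace_exponential(s, tau):
    """Laplace transform of the exponential pdf w(t) = exp(-t/tau)/tau (mean tau)."""
    return 1.0 / (1.0 + tau * s)


def laplace_gamma2(s, tau):
    """Laplace transform of the gamma pdf w(t) = (2/tau)**2 * t * exp(-2*t/tau) (mean tau)."""
    return 1.0 / (1.0 + 0.5 * tau * s) ** 2


def kernel_factor(w_tilde, s, tau):
    """The factor s*w(s)/(1 - w(s)) that multiplies the right-hand side of Eq. (16)."""
    w = w_tilde(s, tau)
    return s * w / (1.0 - w)


tau = 2.0
for s in (1e-1, 1e-2, 1e-3):
    # both columns approach 1/tau = 0.5 as s decreases
    print(s, kernel_factor(laplace_exponential, s, tau), kernel_factor(laplace_gamma2, s, tau))
```

For the exponential pdf the factor equals $1/\tau$ for every $s$, whereas for the gamma pdf it only tends to $1/\tau$ as $s \to 0$. This is consistent with the statement that Eq. (18) holds on the large time scale whenever the mean waiting time is finite.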
2.4
Fractional Diffusion-Wave Equations
With the CTRW model, it is possible to go beyond this standard framework and to explore other kinds of transport processes including the anomalous transport. Let us first assume that the mean value of the waiting time pdf $w(t)$ is not finite. As an example, a particular long-tailed waiting time pdf with the asymptotic behavior
$$w(t) \approx A_\alpha\,(\tau/t)^{1+\alpha}, \quad t \to +\infty, \quad 0 < \alpha < 1 \tag{19}$$
is considered. Its asymptotics in the Laplace domain can be easily determined by the so-called Tauberian theorem and is as follows:
$$\tilde w(s) \approx 1 - (\tau s)^{\alpha}, \quad s \to 0.$$
It is important to mention that the specific form of $w(t)$ is of minor importance. In particular, the so-called Mittag-Leffler waiting time pdf
$$w(t) = -\frac{d}{dt} E_\alpha(-t^\alpha), \qquad E_\alpha(x) := \sum_{k=0}^{\infty} \frac{x^k}{\Gamma(\alpha k + 1)}$$
can be taken without loss of generality. The Laplace transform of the Mittag-Leffler pdf can be evaluated in explicit form
$$\tilde w(s) = \frac{1}{1 + s^\alpha}$$
and has the desired asymptotics. Together with the Gaussian jump length pdf
$$\lambda(x) = \left(4\pi\sigma^2\right)^{-1/2} \exp\!\left(-x^2/(4\sigma^2)\right), \qquad \Sigma^2 = 2\sigma^2,$$
with the Fourier transform in the form $\hat\lambda(\kappa) \approx 1 - \sigma^2\kappa^2$, $\kappa \to 0$, the asymptotics of the Fourier-Laplace transform of the pdf $P(x,t)$ becomes
$$\hat{\tilde P}(\kappa,s) \approx \frac{\hat P_0(\kappa)/s}{1 + d_\alpha\,s^{-\alpha}\kappa^2}, \quad s \to 0,\ \kappa \to 0. \tag{20}$$
Using the Tauberian theorems for the Laplace and Fourier transforms, the last equation can be transformed for large $t$ and $|x|$ to a time-fractional partial differential equation. Namely, after multiplication with the denominator of the right-hand side, Eq. (20) becomes
$$\left(1 + d_\alpha\,s^{-\alpha}\kappa^2\right)\hat{\tilde P}(\kappa,s) \approx \hat P_0(\kappa)/s, \quad s \to 0,\ \kappa \to 0. \tag{21}$$
Making use of the differentiation theorem for the Fourier transform and employing the integration rule
$$\mathcal{L}\{(I^\alpha f)(t); s\} = s^{-\alpha}\,\tilde f(s)$$
for the Riemann-Liouville fractional integral $I^\alpha$ defined by
$$(I^\alpha f)(t) := \frac{1}{\Gamma(\alpha)} \int_0^{t} f(\tau)\,(t - \tau)^{\alpha - 1}\,d\tau, \quad \alpha > 0, \qquad (I^0 f)(t) = f(t), \tag{22}$$
Eq. (21) can be rewritten in the form of the fractional integral equation
$$P(x,t) - P_0(x) = d_\alpha \left(I^\alpha\,\frac{\partial^2}{\partial x^2} P(x,\cdot)\right)(t) \tag{23}$$
for large $t$ and $|x|$. Application of a fractional differential operator $D_t^\alpha$ to Eq. (23) transforms it for large $t$ and $|x|$ to the initial-value problem $P(x,0) = P_0(x)$ for the so-called time-fractional diffusion equation
$$\left(D_t^\alpha P\right)(t) = d_\alpha\,\frac{\partial^2 P}{\partial x^2}, \quad 0 < \alpha < 1. \tag{24}$$
In what follows, the fractional derivative $D_t^\alpha$ ($n - 1 < \alpha \le n$, $n \in \mathbb{N}$) is taken in the Caputo sense:
$$\left(D_t^\alpha f\right)(t) := \left(I^{n-\alpha} f^{(n)}\right)(t), \tag{25}$$
$I^\alpha$ being the Riemann-Liouville fractional integral (22). For the theory and applications of this fractional derivative and for other forms of the fractional derivatives, we refer the readers to Samko et al. (1993) and Diethelm (2010). It is worth mentioning that the integrodifferential nature of the fractional differential operator in Eq. (24) ensures the non-Markovian nature of the sub-diffusive process we are dealing with. Indeed, calculating the Laplace transform of the mean squared displacement via the relation
$$\tilde{\overline{x^2}}(s) = \lim_{\kappa \to 0} \left(-\frac{d^2}{d\kappa^2}\,\hat{\tilde P}(\kappa,s)\right)$$
and applying the inverse Laplace transform, the formula
$$\overline{x^2}(t) = \frac{2 d_\alpha}{\Gamma(1 + \alpha)}\,t^\alpha$$
for the mean squared displacement in time is obtained. As we see, in contrast to the case of the Brownian motion, the mean squared displacement does not linearly depend on the time $t$, but is a power function with the exponent $\alpha$. Mathematical theory of the general time-fractional diffusion equation is presented in the third section.

Now we discuss the case when the characteristic waiting time $T$ is finite, but the jump length variance $\Sigma^2$ is infinite. Again, a specific form of the pdf $\lambda(x)$ is of minor importance, so that without loss of generality, we can, e.g., consider one of the Levy-stable pdfs with the Fourier transform given by the formula
$$\hat\lambda(\kappa) = \exp\!\left(-\sigma^\beta |\kappa|^\beta\right) \approx 1 - \sigma^\beta |\kappa|^\beta, \quad 1 < \beta < 2, \quad |\kappa| \to 0. \tag{26}$$
In the spatial domain, we get the asymptotic formula $\lambda(x) \approx A_\beta\,\sigma^{-\beta}\,|x|^{-1-\beta}$, $|x| \to \infty$, that shows the "long tails" of the pdf $\lambda(x)$. For the Poissonian waiting time pdf $w(t) = \tau^{-1}\exp(-t/\tau)$, $\tau > 0$, with the Laplace transform of the form $\tilde w(s) \approx 1 - \tau s$, $s \to 0$, the asymptotics of the Fourier-Laplace transform of the pdf $P(x,t)$ can be written in the form
$$\hat{\tilde P}(\kappa,s) \approx \frac{1}{s + c_\beta\,|\kappa|^\beta}, \quad s \to 0,\ |\kappa| \to 0. \tag{27}$$
By inverting the Laplace and Fourier transforms in Eq. (27), an initial-value problem $P(x,0) = P_0(x)$ for the space-fractional diffusion equation
$$\frac{\partial P}{\partial t} = c_\beta\,R_x^\beta P(x,t) \tag{28}$$
is obtained for large $t$ and $|x|$, where $R_x^\beta$ is the Riesz fractional derivative defined for a sufficiently well-behaved function $f$ and $0 < \beta \le 2$ as a pseudo-differential operator with the symbol $-|\kappa|^\beta$ (see, e.g., Samko et al. 1993; Mainardi et al. 2001):
$$\mathcal{F}\left\{R_x^\beta f(x); \kappa\right\} = -|\kappa|^\beta\,\mathcal{F}\{f(x); \kappa\}. \tag{29}$$
In the one-dimensional case, the Riesz fractional derivative (29) with $0 < \beta < 2$ can be represented as a hypersingular integral (see Samko et al. 1993 for the case $\beta \ne 1$ and Gorenflo and Mainardi 2001 for the general case):
$$R_x^\beta f(x) = \frac{1}{\pi}\,\Gamma(1 + \beta)\,\sin(\beta\pi/2) \int_0^{\infty} \frac{f(x + \xi) - 2 f(x) + f(x - \xi)}{\xi^{\beta + 1}}\,d\xi. \tag{30}$$
For $\beta = 1$, the relation (30) can be interpreted in terms of the Hilbert transform
$$R_x^1 f(x) = -\frac{1}{\pi}\,\frac{d}{dx} \int_{-\infty}^{+\infty} \frac{f(\xi)}{x - \xi}\,d\xi, \tag{31}$$
where the integral is understood in the sense of the Cauchy principal value as first noted in Feller (1952) and then revisited and stated more precisely in Gorenflo and Mainardi (2001).

In the case when both $T$ and $\Sigma^2$ diverge, we employ in the CTRW model, e.g., the long-tailed pdf (19) as the waiting time pdf $w(t)$ and the Levy-stable pdf (26) as the jump length pdf $\lambda(x)$. Proceeding in the same way as above, an initial-value problem $P(x,0) = P_0(x)$ for the time- and space-fractional diffusion equation
$$\left(D_t^\alpha P\right)(t) = s_{\alpha,\beta}\,\left(R_x^\beta P\right)(x) \tag{32}$$
with the Caputo fractional derivative $D_t^\alpha$ in time and the Riesz fractional derivative $R_x^\beta$ in space is deduced from the CTRW model on the large time and space scales. For the mathematical and numerical analysis of the time- and space-fractional diffusion equations of type (32), we refer to Mainardi et al. (2001), where an even more general equation with the Riesz-Feller space-fractional derivative was investigated in detail in the one-dimensional case, and to Hanyga (2002), where the multidimensional time-space fractional diffusion equations were treated.

Let us now mention one important particular case of the time- and space-fractional diffusion equation that is referred to as the neutral-fractional diffusion equation (Mainardi et al. 2001; Metzler and Nonnenmacher 2002) or the fractional wave equation (Luchko 2013). This equation is obtained from (32) when we set $\alpha = \beta$, i.e., when the orders of the fractional derivatives in time and in space are the same. From the viewpoint of the CTRW model, this condition means that the asymptotics of the waiting time pdf $w(t)$ and the jump length pdf $\lambda(x)$ are the same on large time and space scales. In other words, the waiting times and jump lengths are adapted to each other in such a way that the corresponding CTRW model describes an anomalous wave propagation rather than anomalous diffusion. Mathematical theory of the fractional wave equation is presented in Sect. 4.1.

Further important models that generalize the well-known conventional transport equations are the fractional diffusion-advection equation (anomalous diffusion with an additional velocity field) and the fractional Fokker-Planck equation (anomalous diffusion in the presence of an external field). Of course, as in the conventional case, the multidimensional generalizations, equations of fractional order with nonconstant coefficients, and nonlinear fractional differential equations appear in the corresponding models and are worth investigating.
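The mean squared displacement formula of the sub-diffusive case can be cross-checked numerically. For a point source, $\hat P_0(\kappa) = 1$, differentiating (20) twice with respect to $\kappa$ gives $2 d_\alpha s^{-\alpha-1}$ as the Laplace transform of the mean squared displacement, and this should coincide with the Laplace transform of $2 d_\alpha t^\alpha/\Gamma(1+\alpha)$. The Python sketch below compares both sides with a plain midpoint rule; the function names, the quadrature, and the parameter values are assumptions made for the illustration.

```python
import math


def msd_subdiffusion(t, alpha, d_alpha):
    """Mean squared displacement 2*d_alpha*t**alpha / Gamma(1 + alpha)."""
    return 2.0 * d_alpha * t ** alpha / math.gamma(1.0 + alpha)


def laplace_transform(f, s, t_max, n):
    """Midpoint-rule approximation of the integral of exp(-s*t)*f(t) over (0, t_max)."""
    h = t_max / n
    return h * sum(math.exp(-s * (k + 0.5) * h) * f((k + 0.5) * h) for k in range(n))


alpha, d_alpha, s = 0.6, 1.0, 1.5      # illustrative values
numeric = laplace_transform(lambda t: msd_subdiffusion(t, alpha, d_alpha), s, 40.0, 200000)
exact = 2.0 * d_alpha * s ** (-alpha - 1.0)
print(numeric, exact)                  # the two values should nearly coincide
```

For $\alpha = 1$ the function `msd_subdiffusion` reduces to the linear law $2 d_1 t$ of the Brownian motion.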
3
Time-Fractional Diffusion-Wave Equations
3.1
Generalized Time-Fractional Diffusion Equation
Motivated by the models derived in the previous section, we consider in this subsection the $n$-dimensional generalized time-fractional diffusion equation (GTFDE)
$$\left(D_t^\alpha u\right)(t) = -L(u) + F(x,t), \quad 0 < \alpha \le 1, \tag{33}$$
where $u = u(x,t)$, $(x,t) \in \Omega_T := G \times (0,T)$, $G \subset \mathbb{R}^n$, is the unknown function, the operator $L$ is given by
$$L(u) = -\operatorname{div}(p(x)\,\operatorname{grad} u) + q(x)\,u$$
with the coefficients $p$ and $q$ that satisfy the conditions
$$p \in C^1(\bar G), \quad q \in C(\bar G), \quad p(x) > 0, \quad q(x) \ge 0, \quad x \in \bar G, \tag{34}$$
the fractional derivative $D_t^\alpha$ is defined in the Caputo sense (see (25)), and the domain $G$ with the boundary $S$ is open and bounded in $\mathbb{R}^n$. The operator $L$ is a linear elliptic type differential operator of the second order
$$L(u) = -\sum_{k=1}^{n} \left(p(x)\,\frac{\partial^2 u}{\partial x_k^2} + \frac{\partial p}{\partial x_k}\,\frac{\partial u}{\partial x_k}\right) + q(x)\,u,$$
that can be represented in the form
$$-L(u) = p(x)\,\Delta u + (\operatorname{grad} p, \operatorname{grad} u) - q(x)\,u, \tag{35}$$
$\Delta$ being the Laplace operator. For $\alpha = 1$, Eq. (33) is reduced to a linear second-order parabolic PDE. The theory of this equation is well known, so that the main focus in this section is on the case $0 < \alpha < 1$.

Of course, in applications, the transport processes that we model with the time-fractional diffusion equations take place in some bounded domains, so that mainly the initial-boundary-value problems for these equations are worth studying from the viewpoint of applications. In this section, the initial-boundary-value problem
$$u\big|_{t=0} = u_0(x), \quad x \in \bar G, \tag{36}$$
$$u\big|_{S} = v(x,t), \quad (x,t) \in S \times [0,T] \tag{37}$$
for Eq. (33) is considered. A function $u = u(x,t)$ defined in the domain $\bar\Omega_T := \bar G \times [0,T]$ is called a solution to the problem (33), (36), (37) if it belongs to the space $C(\bar\Omega_T) \cap W^1((0,T]) \cap C^2(G)$ and satisfies both Eq. (33) and the initial and boundary conditions (36)–(37). By $W^1((0,T])$, the space of the functions $f \in C^1((0,T])$ such that $f' \in L((0,T))$ is denoted. If the problem (33), (36), (37) possesses a solution, then the functions $F$, $u_0$, and $v$ given in the problem have to belong to the spaces $C(\Omega_T)$, $C(\bar G)$, and $C(S \times [0,T])$, respectively. In the further discussions, these inclusions are always supposed to be valid. The presentation of the results in this section follows Luchko (2009a,b, 2010, 2011a,b, 2012a,b), and the readers are advised to consult these papers for the proofs and more details.
Uniqueness of the Solution
First, we investigate uniqueness of the solution to the problem (33), (36), (37). The main component of the uniqueness proof is an appropriate maximum principle for Eq. (33). In its turn, the proof of the maximum principle uses an extremum principle for the Caputo fractional derivative that is formulated in the following theorem.

Theorem 1 (Luchko 2009b). Let a function $f \in W^1((0,T]) \cap C([0,T])$ attain its maximum over the interval $[0,T]$ at the point $\tau = t_0$, $t_0 \in (0,T]$. Then the Caputo fractional derivative of the function $f$ is nonnegative at the point $t_0$ for any $\alpha$, $0 < \alpha < 1$:
$$\left(D_t^\alpha f\right)(t_0) \ge 0, \quad 0 < \alpha < 1. \tag{38}$$
Let us mention that recently a stronger estimate for the Caputo derivative of a function $f \in C^1[0,1]$ at the maximum point $t_0$ was proved in Al-Refai (2012), namely,
$$\left(D_t^\alpha f\right)(t_0) \ge \frac{t_0^{-\alpha}}{\Gamma(1 - \alpha)}\,\big(f(t_0) - f(0)\big) \ge 0, \quad 0 < \alpha < 1.$$
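The extremum principle and the sharper lower bound can be illustrated numerically. The Python sketch below approximates the Caputo derivative (25) with the so-called L1 discretization, in which $f$ is replaced by its piecewise linear interpolant. This scheme is not discussed in the chapter and is used here only as an assumption-laden illustration, as are the test function $f(t) = 2t - t^2$ on $[0,1]$ (maximum at $t_0 = 1$) and all names in the code.

```python
import math


def caputo_l1(f, t, alpha, n):
    """L1 approximation of the Caputo derivative of order 0 < alpha < 1 at time t.

    The interval [0, t] is split into n equal steps and f is replaced by its
    piecewise linear interpolant, for which the integral in (25) is explicit.
    """
    h = t / n
    total = 0.0
    for k in range(n):
        # weight of the increment f(t_{k+1}) - f(t_k), written with j = n - 1 - k
        j = n - 1 - k
        weight = (j + 1) ** (1.0 - alpha) - j ** (1.0 - alpha)
        total += weight * (f((k + 1) * h) - f(k * h))
    return total * h ** (-alpha) / math.gamma(2.0 - alpha)


def f(t):
    """Hypothetical test function with its maximum over [0, 1] at t0 = 1."""
    return 2.0 * t - t * t


alpha, t0 = 0.5, 1.0
approx = caputo_l1(f, t0, alpha, 2000)
# closed form for this f at t0 = 1, obtained from the Caputo derivatives of t and t**2
exact = 2.0 * (1.0 - alpha) / math.gamma(3.0 - alpha)
lower_bound = t0 ** (-alpha) / math.gamma(1.0 - alpha) * (f(t0) - f(0.0))
print(approx, exact, lower_bound)
```

In this example the computed derivative is positive and lies above the lower bound, in line with both estimates; the check covers a single function and is of course no substitute for the proofs.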
The extremum principle for the Caputo fractional derivative is used to prove a maximum principle for the generalized time-fractional diffusion equation (33) that is formulated in the same way as the one for the parabolic type PDEs.

Theorem 2 (Luchko 2009b). Let a function $u \in C(\bar\Omega_T) \cap W^1((0,T]) \cap C^2(G)$ be a solution of the generalized time-fractional diffusion equation (33) and $F(x,t) \le 0$, $(x,t) \in \Omega_T$. Then either $u(x,t) \le 0$, $\forall (x,t) \in \bar\Omega_T$, or the function $u$ attains its positive maximum on the part $S_G^T := (\bar G \times \{0\}) \cup (S \times [0,T])$ of the boundary of the domain $\Omega_T$, i.e.,
$$u(x,t) \le \max\Big\{0, \max_{(x,t) \in S_G^T} u(x,t)\Big\}, \quad \forall (x,t) \in \bar\Omega_T. \tag{39}$$
Similarly to the case of the partial differential equations of parabolic type ($\alpha = 1$), an appropriate minimum principle is valid, too. The maximum and minimum principles can be applied to show that the problem (33), (36)–(37) possesses at most one solution and this solution, if it exists, continuously depends on the data given in the problem.

Theorem 3 (Luchko 2009b). The initial-boundary-value problem (36)–(37) for the GTFDE (33) possesses at most one solution. This solution continuously depends on the data given in the problem in the sense that if
$$\|F - \tilde F\|_{C(\bar\Omega_T)} \le \epsilon, \quad \|u_0 - \tilde u_0\|_{C(\bar G)} \le \epsilon_0, \quad \|v - \tilde v\|_{C(S \times [0,T])} \le \epsilon_1,$$
and $u$ and $\tilde u$ are the solutions to the problem (33), (36)–(37) with the source functions $F$ and $\tilde F$, the initial conditions $u_0$ and $\tilde u_0$, and the boundary conditions $v$ and $\tilde v$, respectively, then the norm estimate
$$\|u - \tilde u\|_{C(\bar\Omega_T)} \le \max\{\epsilon_0, \epsilon_1\} + \frac{T^\alpha}{\Gamma(1 + \alpha)}\,\epsilon \tag{40}$$
for the solutions $u$ and $\tilde u$ holds true.

Because the problem under consideration is a linear one, the uniqueness of the solution immediately follows from the fact that the homogeneous problem (33), (36)–(37), i.e., the problem with $F \equiv 0$, $u_0 \equiv 0$, and $v \equiv 0$, has only one solution, namely, $u(x,t) \equiv 0$, $(x,t) \in \bar\Omega_T$.
Existence of the Solution
To tackle the existence problem, the notion of a generalized solution is first introduced following Vladimirov (1971), where the case $\alpha = 1$ was considered.

Definition 1 (Luchko 2010). Let $F_k \in C(\bar\Omega_T)$, $u_{0k} \in C(\bar G)$, and $v_k \in C(S \times [0,T])$, $k = 1, 2, \dots$ be sequences of functions that satisfy the following conditions:
(1) There exist functions $F$, $u_0$, and $v$, such that
$$\|F_k - F\|_{C(\bar\Omega_T)} \to 0 \quad \text{as } k \to \infty, \tag{41}$$
$$\|u_{0k} - u_0\|_{C(\bar G)} \to 0 \quad \text{as } k \to \infty, \tag{42}$$
$$\|v_k - v\|_{C(S \times [0,T])} \to 0 \quad \text{as } k \to \infty. \tag{43}$$
(2) For any $k = 1, 2, \dots$, there exists a solution $u_k$ of the initial-boundary-value problem
$$u_k\big|_{t=0} = u_{0k}(x), \quad x \in \bar G, \tag{44}$$
$$u_k\big|_{S} = v_k(x,t), \quad (x,t) \in S \times [0,T] \tag{45}$$
for the generalized time-fractional diffusion equation
$$\left(D_t^\alpha u_k\right)(t) = -L(u_k) + F_k(x,t). \tag{46}$$
Suppose there exists a function $u \in C(\bar\Omega_T)$ such that
$$\|u_k - u\|_{C(\bar\Omega_T)} \to 0 \quad \text{as } k \to \infty. \tag{47}$$
The function $u$ is called a generalized solution of the problem (33), (36)–(37).

The generalized solution of the problem (33), (36)–(37) is a continuous function, not a generalized one. Still, the generalized solution is not required to be from the functional space $C(\bar\Omega_T) \cap W^1((0,T]) \cap C^2(G)$, to which the solution has to belong. It follows from Definition 1 that if the problem (33), (36)–(37) possesses a solution, then this solution is a generalized solution of the problem, too. In this sense, Definition 1 extends the notion of the solution of the problem (33), (36)–(37). This extension is needed to get some existence results. But of course one does not want to lose the uniqueness of the solution. Let us now discuss some properties of the generalized solution including its uniqueness. If the problem (33), (36), (37) possesses a generalized solution, then the functions $F$, $u_0$, and $v$ given in the problem have to belong to the spaces $C(\bar\Omega_T)$, $C(\bar G)$, and $C(S \times [0,T])$, respectively. In the further discussions, these inclusions are always supposed to be valid.

First, we show that the sequence $u_k$, $k = 1, 2, \dots$ defined by the relations (41)–(46) of Definition 1 is always uniformly convergent in $\bar\Omega_T$, i.e., there always exists a function $u \in C(\bar\Omega_T)$ that satisfies the property (47). Indeed, applying the estimate (40) from Theorem 3 to the functions $u_k$ and $u_p$ that are solutions of the corresponding initial-boundary-value problems (44) and (45) for Eq. (46), one gets the inequality
$$\|u_k - u_p\|_{C(\bar\Omega_T)} \le \max\left\{\|u_{0k} - u_{0p}\|_{C(\bar G)},\ \|v_k - v_p\|_{C(S \times [0,T])}\right\} + \frac{T^\alpha}{\Gamma(1 + \alpha)}\,\|F_k - F_p\|_{C(\bar\Omega_T)},$$
that, together with the relations (41)–(43), means that $u_k$, $k = 1, 2, \dots$ is a Cauchy sequence in $C(\bar\Omega_T)$ that converges to a function $u \in C(\bar\Omega_T)$. Moreover, the following important uniqueness theorem holds true.

Theorem 4 (Luchko 2010). The problem (33), (36)–(37) possesses at most one generalized solution in the sense of Definition 1.
The generalized solution, if it exists, continuously depends on the data given in the problem in the sense of the estimate (40). In contrast to the situation with the solution to the problem (33), (36)–(37), existence of the generalized solution can be shown under some standard restrictions on the problem data and the boundary $S$ of the domain $G$. In this section, existence of the solution of the problem
$$\left(D_t^\alpha u\right)(t) = -L(u), \tag{48}$$
$$u\big|_{t=0} = u_0(x), \quad x \in \bar G, \tag{49}$$
$$u\big|_{S} = 0, \quad (x,t) \in S \times [0,T] \tag{50}$$
is considered to demonstrate the technique that can be used with the appropriate standard modifications in the general case, too. The generalized solution of the problem (48)–(50) can be constructed in an analytical form by using the Fourier method of separation of variables. Let us look for a particular solution $u$ of Eq. (48) in the form
$$u(x,t) = T(t)\,X(x), \quad (x,t) \in \bar\Omega_T, \tag{51}$$
that satisfies the boundary condition (50). Substituting (51) into Eq. (48) and separating the variables, we get the equation
$$\frac{\left(D_t^\alpha T\right)(t)}{T(t)} = -\frac{L(X)}{X(x)} = -\lambda, \tag{52}$$
$\lambda$ being a constant not depending on the variables $t$ and $x$. The last equation, together with the boundary condition (50), is equivalent to the fractional differential equation
$$\left(D_t^\alpha T\right)(t) + \lambda\,T(t) = 0 \tag{53}$$
and the eigenvalue problem
$$L(X) = \lambda X, \tag{54}$$
$$X\big|_{S} = 0, \quad x \in S \tag{55}$$
for the operator $L$. Due to the condition (34), the operator $L$ is a positive definite and self-adjoint linear operator. The theory of the eigenvalue problems for such operators is well known (see, e.g., Vladimirov 1971). In particular, the eigenvalue problem (54)–(55) has a countable number of positive eigenvalues $0 < \lambda_1 \le \lambda_2 \le \dots$ of finite multiplicity, and if the boundary $S$ of $G$ is a smooth surface, any function $f \in M_L$ can be represented through its Fourier series in the form
$$f(x) = \sum_{i=1}^{\infty} (f, X_i)\,X_i(x), \tag{56}$$
where $(f, g)$ denotes the scalar product of two functions in $L^2(G)$ and $X_i \in M_L$ are the eigenfunctions corresponding to the eigenvalues $\lambda_i$:
$$L(X_i) = \lambda_i X_i, \quad i = 1, 2, \dots. \tag{57}$$
By $M_L$, the space of the functions $f$ that satisfy the boundary condition (55) and the inclusions $f \in C^1(\bar\Omega_T) \cap C^2(G)$, $L(f) \in L^2(G)$ is denoted. The solution of the fractional differential equation (53) with $\lambda = \lambda_i$, $i = 1, 2, \dots$ has the form (see, e.g., Luchko 1999; Luchko and Gorenflo 1999)
$$T_i(t) = c_i\,E_\alpha(-\lambda_i t^\alpha), \tag{58}$$
$E_\alpha$ being the Mittag-Leffler function defined by
$$E_\alpha(z) := \sum_{k=0}^{\infty} \frac{z^k}{\Gamma(\alpha k + 1)}. \tag{59}$$
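For moderate real arguments, the series (59) can be summed directly. The Python sketch below is a minimal illustration under that restriction; the function name, the stopping rule, and the default tolerances are assumptions. For large negative arguments the alternating terms cancel catastrophically, so this approach should not be used there.

```python
import math


def mittag_leffler(alpha, z, tol=1e-15, max_terms=500):
    """Truncated power series (59) for real z; only reliable for moderate |z|."""
    if z == 0.0:
        return 1.0
    total = 0.0
    for k in range(max_terms):
        # |z|**k / Gamma(alpha*k + 1), evaluated via logarithms to avoid overflow
        magnitude = math.exp(k * math.log(abs(z)) - math.lgamma(alpha * k + 1.0))
        sign = -1.0 if (z < 0.0 and k % 2 == 1) else 1.0
        total += sign * magnitude
        if k > 0 and magnitude < tol:
            break
    return total


# Known special cases: E_1(z) = exp(z), E_2(-x**2) = cos(x), E_{1/2}(-x) = exp(x**2)*erfc(x)
print(mittag_leffler(1.0, -2.0), math.exp(-2.0))
print(mittag_leffler(2.0, -4.0), math.cos(2.0))
print(mittag_leffler(0.5, -1.0), math.exp(1.0) * math.erfc(1.0))
```

Each printed pair should agree closely, which gives a quick sanity check of the series before it is used in (58).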
Any of the functions
$$u_i(x,t) = c_i\,E_\alpha(-\lambda_i t^\alpha)\,X_i(x), \quad i = 1, 2, \dots \tag{60}$$
and thus the finite sums
$$u_k(x,t) = \sum_{i=1}^{k} c_i\,E_\alpha(-\lambda_i t^\alpha)\,X_i(x), \quad k = 1, 2, \dots \tag{61}$$
satisfy both Eq. (48) and the boundary condition (50). To construct a function that satisfies the initial condition (49), too, the notion of a formal solution is introduced.

Definition 2 (Luchko 2010). The Fourier series of the form
$$u(x,t) = \sum_{i=1}^{\infty} (u_0, X_i)\,E_\alpha(-\lambda_i t^\alpha)\,X_i(x), \tag{62}$$
$X_i$, $i = 1, 2, \dots$ being the eigenfunctions corresponding to the eigenvalues $\lambda_i$ of the eigenvalue problem (54)–(55), is called a formal solution to the problem (48)–(50).

Under certain conditions, the formal solution (62) can be proved to be the generalized solution of the problem (48)–(50).

Theorem 5 (Luchko 2010). Let the initial condition $u_0$ be from the space $M_L$. Then the formal solution (62) of the problem (48)–(50) is its generalized solution.

Indeed, it can be easily verified that the functions $u_k$, $k = 1, 2, \dots$ defined by (61) with $c_i = (u_0, X_i)$ are solutions of the problem (48)–(50) with the initial conditions
$$u_{0k}(x) = \sum_{i=1}^{k} (u_0, X_i)\,X_i(x) \tag{63}$$
instead of $u_0$. Because the function $u_0$ is from the functional space $M_L$, its Fourier series converges uniformly to the function $u_0$, so that
$$\|u_{0k} - u_0\|_{C(\bar G)} \to 0 \quad \text{as } k \to \infty.$$
To prove the theorem, one only needs to show that the sequence $u_k$, $k = 1, 2, \dots$ of the partial sums (61) converges uniformly on $\bar\Omega_T$. But this statement immediately follows from the estimate (see, e.g., Haubold et al. 2011)
$$|E_\alpha(-x)| \le \frac{M}{1 + x} \le M, \quad 0 \le x, \quad 0 < \alpha < 1 \tag{64}$$
for the Mittag-Leffler function (59) and the fact that the Fourier series $\sum_{i=1}^{\infty} (u_0, X_i)\,X_i(x)$ of the function $u_0 \in M_L$ uniformly converges on $\bar\Omega_T$.
In some cases, the generalized solution (62) can be shown to be the solution of the initial-boundary-value problem for the generalized time-fractional diffusion equation, too. One important example is given by the following theorem.

Theorem 6 (Luchko 2012b). Let an open domain $G$ be a one-dimensional interval $(0, l)$, $u_0 \in M_L$, and $L(u_0) \in M_L$. Then the unique solution of the initial-boundary-value problem
$$u\big|_{t=0} = u_0(x), \quad 0 \le x \le l, \qquad u(0,t) = u(l,t) = 0, \quad 0 \le t \le T$$
for the one-dimensional generalized time-fractional diffusion equation
$$\left(D_t^\alpha u\right)(t) = \frac{\partial}{\partial x}\left(p(x)\,\frac{\partial u}{\partial x}\right) - q(x)\,u$$
is a continuously differentiable function with respect to the time variable on the interval $(0,T)$ that is given by the formula (62).
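The spectral representation (62) can be made concrete in the simplest one-dimensional setting. The Python sketch below assumes $p \equiv 1$, $q \equiv 0$, and $G = (0, \pi)$, so that the eigenvalues are $\lambda_i = i^2$ and the eigenfunctions are proportional to $\sin(i x)$. These choices, the initial condition, the quadrature, and all function names are assumptions made for the illustration. Only the first two modes are summed, because the naive series for the Mittag-Leffler function loses accuracy for large negative arguments.

```python
import math


def mittag_leffler(alpha, z, tol=1e-15, max_terms=500):
    """Truncated power series (59) for real z; only reliable for moderate |z|."""
    if z == 0.0:
        return 1.0
    total = 0.0
    for k in range(max_terms):
        magnitude = math.exp(k * math.log(abs(z)) - math.lgamma(alpha * k + 1.0))
        sign = -1.0 if (z < 0.0 and k % 2 == 1) else 1.0
        total += sign * magnitude
        if k > 0 and magnitude < tol:
            break
    return total


def sine_coefficient(u0, i, n=2000):
    """Coefficient of sin(i*x) in the sine series of u0 on (0, pi), midpoint rule."""
    h = math.pi / n
    integral = h * sum(u0((k + 0.5) * h) * math.sin(i * (k + 0.5) * h) for k in range(n))
    return 2.0 / math.pi * integral


def spectral_solution(u0, x, t, alpha, n_modes=2):
    """Partial sum of (62) for L(u) = -u_xx on (0, pi) with zero boundary values.

    Keep i**2 * t**alpha small: the series in mittag_leffler is unreliable otherwise.
    """
    return sum(
        sine_coefficient(u0, i) * mittag_leffler(alpha, -(i ** 2) * t ** alpha) * math.sin(i * x)
        for i in range(1, n_modes + 1)
    )


def u0(x):
    """Hypothetical initial condition consisting of the first two eigenmodes."""
    return math.sin(x) + 0.3 * math.sin(2.0 * x)


print(spectral_solution(u0, 1.0, 0.0, 0.5), u0(1.0))          # initial condition is reproduced
print(spectral_solution(u0, math.pi / 2, 1.0, 0.5))           # sub-diffusive decay of the first mode
print(spectral_solution(u0, math.pi / 2, 1.0, 1.0))           # classical decay exp(-t) for alpha = 1
```

At $x = \pi/2$ the second mode vanishes, so the last two lines show the decay factor $E_\alpha(-t^\alpha)$ of the first mode alone. For $\alpha = 1/2$ it decays more slowly at $t = 1$ than the exponential obtained for $\alpha = 1$.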
3.2 Fractional Diffusion-Wave Equation
Of course, for fractional differential equations with constant coefficients, techniques other than the spectral method presented in the previous section, such as the method of integral transforms, can be applied to deduce explicit formulas for solutions of the initial-, boundary-, or initial-boundary-value problems for these equations. As an example, we consider in this section the initial-value problem (the Cauchy problem)
Y. Luchko
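$$\begin{cases} u(x,0) = f(x), & x \in \mathbb{R}, \\[4pt] \dfrac{\partial u}{\partial t}(x,0) = 0, & x \in \mathbb{R}, \ \text{if } 1 < \alpha \le 2 \end{cases} \tag{65}$$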
for the time-fractional diffusion-wave equation of order $\alpha$ with $1 \le \alpha \le 2$, namely,

$$(D_t^\alpha u)(t) = \frac{\partial^2 u}{\partial x^2}, \quad x \in \mathbb{R},\ t \in \mathbb{R}_+. \tag{66}$$
This equation interpolates between the diffusion equation ($\alpha = 1$) and the wave equation ($\alpha = 2$) and was considered in detail, e.g., in Mainardi (1994, 1996a,b) (see also Luchko et al. 2013 for more recent results). To simplify the notation in the formulas, we set $\nu = \alpha/2$, so that $1/2 \le \nu \le 1$ for $1 \le \alpha \le 2$. The Green function $G_c(x,t;\nu)$ of the problem under consideration is its solution with the initial condition $f(x) = \delta(x)$, $\delta$ being the Dirac $\delta$-function. The solution of the Cauchy problem (65) is obtained via the Green function in the form

$$u(x,t;\nu) = \int_{-\infty}^{+\infty} G_c(x - \xi, t; \nu)\, f(\xi)\, d\xi.$$

For the diffusion equation ($\nu = 1/2$), the Green function is well known and is given by

$$G_c(x,t;1/2) = G_c^d(x,t) = \frac{t^{-1/2}}{2\sqrt{\pi}}\, e^{-x^2/(4t)}, \tag{67}$$

whereas for the wave equation ($\nu = 1$), we get the representation

$$G_c(x,t;1) = G_c^w(x,t) = \frac{1}{2}\left(\delta(x - t) + \delta(x + t)\right). \tag{68}$$
Following Mainardi (1994, 1996a,b) and Luchko et al. (2013), several representations of the Green function $G_c$ in the form of integrals and series as well as some methods for its numerical evaluation are presented and discussed in this subsection. We start by transforming the Cauchy problem for Eq. (66) into the Laplace-Fourier domain using the known formula (see, e.g., Podlubny 1999)

$$\mathcal{L}\{(D_t^\alpha f)(t); s\} = s^\alpha \mathcal{L}\{f(t); s\} - \sum_{k=0}^{n-1} f^{(k)}(0^+)\, s^{\alpha-1-k}, \quad n-1 < \alpha \le n,\ n \in \mathbb{N} \tag{69}$$

for the Laplace transform of the Caputo fractional derivative. This formula, together with the standard formulas for the Fourier transform of the second derivative and of the Dirac $\delta$-function, leads to the representation

$$\hat{\tilde{G}}_c(\kappa, s; \nu) = \frac{s^{2\nu-1}}{s^{2\nu} + \kappa^2}, \quad \nu = \alpha/2 \tag{70}$$

of the Laplace-Fourier transform $\hat{\tilde{G}}_c$ of the Green function $G_c$. Using the Laplace transform formula (see, e.g., Podlubny 1999)

$$\mathcal{L}\{E_\alpha(-t^\alpha); s\} = \frac{s^{\alpha-1}}{s^\alpha + 1}$$

and applying to the right-hand side of the formula (70) first the inverse Laplace transform and then the inverse Fourier transform, we obtain the integral representation

$$G_c(x,t;\nu) = \frac{1}{\pi} \int_0^{\infty} E_{2\nu}\!\left(-\kappa^2 t^{2\nu}\right) \cos(x\kappa)\, d\kappa, \tag{71}$$
if we take into consideration the fact that the Green function of the Cauchy problem is an even function of $x$, which follows from the formula (70). In Mainardi (1994), another representation of the Green function was obtained by applying to the right-hand side of the formula (70) first the inverse Fourier transform and then the inverse Laplace transform:

$$2\nu x\, G_c(x,t;\nu) = F_\nu(r) = \nu r\, M_\nu(r), \tag{72}$$

where $r = x/t^\nu$ is the similarity variable and

$$F_\nu(r) := \frac{1}{2\pi i} \int_{Ha} e^{\sigma - r\sigma^\nu}\, d\sigma, \qquad M_\nu(r) := \frac{1}{2\pi i} \int_{Ha} e^{\sigma - r\sigma^\nu}\, \frac{d\sigma}{\sigma^{1-\nu}}$$

are the two auxiliary functions nowadays often referred to as the Mainardi functions and $Ha$ denotes the Hankel integration path. Let us note that the form of the similarity variable can be explained by the Lie group analysis of the time-fractional diffusion-wave equation (66). In Buckwar and Luchko (1998), Luchko and Gorenflo (1998), and Gorenflo et al. (2000b), symmetry groups of scaling transformations for the time- and space-fractional partial differential equations have been constructed. In particular, it has been proved in Buckwar and Luchko (1998) that the only invariant of the symmetry group $T_\lambda$ of scaling transformations of the time-fractional diffusion-wave equation (66) has the form $\eta(x,t,u) = x/t^\nu$, which explains the form of the scaling variable. Using the well-known representation of the Wright function, which reads (in our notation) for $z \in \mathbb{C}$

$$W_{\lambda,\mu}(z) := \frac{1}{2\pi i} \int_{Ha} e^{\sigma + z\sigma^{-\lambda}}\, \frac{d\sigma}{\sigma^\mu} = \sum_{n=0}^{\infty} \frac{z^n}{n!\,\Gamma(\lambda n + \mu)}, \tag{73}$$
where $\lambda > -1$ and $\mu > 0$, we recognize that the auxiliary functions $F_\nu$ and $M_\nu$ are related to the Wright function according to the formula

$$F_\nu(z) = W_{-\nu,0}(-z) = \nu z\, M_\nu(z), \qquad M_\nu(z) = W_{-\nu,1-\nu}(-z). \tag{74}$$

The relation (74) along with (73) provides the series representations of the Mainardi functions and thus of the Green function $G_c$ (for $x > 0$):

$$G_c(x,t;\nu) = \frac{1}{2\nu x}\, F_\nu(r) = \frac{1}{2t^\nu}\, M_\nu(r) = \frac{1}{2t^\nu} \sum_{n=0}^{\infty} \frac{(-x/t^\nu)^n}{n!\,\Gamma(-\nu n + 1 - \nu)}. \tag{75}$$

Let us now briefly introduce some algorithms for numerical evaluation of the Green function $G_c$. Because $G_c$ is a particular case of the Wright function (see formula (74)), one can of course use the algorithms for the numerical evaluation of the Wright function suggested in Luchko (2008) to evaluate the Green function $G_c$. Another approach to numerical evaluation of $G_c$, which we employed to produce the plots in Figs. 1 and 2, uses the integral representation (71). To evaluate the Mittag-Leffler function $E_\alpha$ in (71), we applied the algorithms suggested in Gorenflo et al. (2002) and the MATLAB programs that implement these algorithms and are available from Matlab File Exchange (2005).

In Fig. 1, several plots of the Green function $G_c(x;\nu) := G_c(x,1;\nu)$ for different values of the parameter $\nu$ ($\nu = \alpha/2$) are presented. It can be seen that for $x \ge 0$ each Green function has a single maximum and that the location of the maximum point changes with the value of $\nu$. For a detailed discussion of the maximum location, maximum value, and the propagation velocity of the maximum point, we refer to Luchko et al. (2013).

Fig. 1 Green function $G_c(x;\nu) := G_c(x,1;\nu)$: plots for several different values of $\nu$ (curves for $\nu = 0.5,\ 0.65,\ 0.9,\ 1$; horizontal axis $x$, vertical axis $G_c(x;\nu)$)

In Fig. 2, the Green function $G_c(x,t;\nu)$ is plotted for $\nu = 0.875$ ($\alpha = 1.75$) from different perspectives. The plots show that both the location of the maximum and the maximum value depend on time $t > 0$: whereas the maximum value decreases with time (Fig. 2, right), the $x$-coordinate of the maximum location increases (Fig. 2, left). Surprisingly, the product of the maximum location and the maximum value of $G_c(x,t;\nu)$ does not depend on time $t > 0$ and is just a function of the parameter $\nu$ (see Luchko et al. 2013 for more details).

Fig. 2 Green function $G_c(x,t;\nu)$: plots for $\nu = 0.875$ from different perspectives
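The series representation (75) lends itself to a quick numerical sanity check. The sketch below is not the algorithm of Luchko (2008) or Gorenflo et al. (2002); it simply sums the series directly, under the assumption that $1/\Gamma$ is taken as zero at the poles of the gamma function, and compares the result for $\nu = 1/2$ with the closed form (67). The helper names `rgamma`, `green_series`, and `green_diffusion` are illustrative.

```python
import math


def rgamma(x):
    """Reciprocal gamma function 1/Gamma(x), set to 0 at the poles x = 0, -1, -2, ..."""
    if x <= 0.0 and x == math.floor(x):
        return 0.0
    return 1.0 / math.gamma(x)


def green_series(x, t, nu, terms=120):
    """Direct summation of the series (75) for x > 0, t > 0 and moderate r = x / t**nu."""
    r = x / t ** nu
    total = 0.0
    for n in range(terms):
        total += (-r) ** n / math.factorial(n) * rgamma(-nu * n + 1.0 - nu)
    return total / (2.0 * t ** nu)


def green_diffusion(x, t):
    """Closed form (67) of the Green function for nu = 1/2 (diffusion equation)."""
    return t ** -0.5 / (2.0 * math.sqrt(math.pi)) * math.exp(-x * x / (4.0 * t))


# For nu = 1/2 the odd terms of the series vanish and the rest sums to the Gaussian (67).
difference = green_series(1.0, 1.0, 0.5) - green_diffusion(1.0, 1.0)
```

Direct summation is adequate only for moderate values of the similarity variable; for large $r$ the alternating terms cancel heavily, which is one reason dedicated algorithms are used in practice.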
4 Fractional Wave Equation
In this section, a fractional generalization of the wave equation that describes propagation of damped waves is considered. In contrast to the fractional diffusion-wave equation considered in the previous section, the fractional wave equation contains fractional derivatives of the same order $\alpha$, $1 \le \alpha \le 2$, both in space and in time. We show that the fractional wave equation inherits some crucial characteristics of the wave equation like a constant propagation velocity of both the maximum of its fundamental solution and its gravity and “mass” centers. Moreover, the first, the second, and the Smith centrovelocities of the damped waves described by the fractional wave equation are constant and depend just on the equation order $\alpha$. The fundamental solution of the one-dimensional fractional wave equation is obtained in explicit form and shown to be a spatial probability density function evolving in time, all of whose moments of order less than $\alpha$ are finite. To illustrate the analytical findings, results of numerical calculations and plots are presented.

From the mathematical viewpoint, the one-dimensional fractional wave equation we deal with in this section was introduced for the first time in Gorenflo et al. (2000a), where this equation was called the neutral-fractional diffusion equation. In Mainardi et al. (2001), a time-space fractional diffusion-wave equation with the Riesz-Feller derivative of order $\alpha \in (0, 2]$ and skewness $\theta$ has been investigated in detail. A particular case of this equation that for $\theta = 0$ corresponds to our fractional wave equation has been briefly mentioned in Mainardi et al. (2001). In Metzler and Nonnenmacher (2002), a fundamental solution to the neutral-fractional diffusion equation was deduced and analyzed in terms of the Fox H-function. For a detailed treatment of the one-dimensional fractional wave equation, we refer to the recent paper Luchko (2013).

In applications, fractional wave equations of different types have been employed, e.g., for modeling the dynamics of sand and fissured rock under seismic excitation in Gudehus and Touplikiotis (2012) and for describing causal elastic waves with a frequency power-law attenuation in Näsholm and Holm (2013). As has been shown, e.g., in Szabo and Wu (2000), elastic wave attenuation in complex media such as biological tissue, polymers, rocks, and rubber often follows a frequency power law, and thus, such elastic waves can be modeled with the fractional wave equations.
4.1 Analysis of the Fractional Wave Equation
Problem Formulation

In this section, we consider the fractional wave equation in the form

$$D_t^\alpha u(x,t) = C_\alpha R_x^\alpha u(x,t), \quad x \in \mathbb{R}^n,\ t \in \mathbb{R}_+,\ 1 \le \alpha \le 2, \tag{76}$$

where $u = u(x,t)$ is a real field variable, $R_x^\alpha$ is the Riesz space-fractional derivative (29) of order $\alpha$, and $D_t^\alpha$ is the Caputo time-fractional derivative (25) of order $\alpha$. The Caputo and the Riesz fractional derivatives were introduced and briefly discussed in the second section. Here we just note that the Riesz fractional derivative is a symmetric operator with respect to the space variable $x$. Because of the relation $-|\kappa|^\alpha = -(\kappa^2)^{\alpha/2}$, it can be formally interpreted as

$$R_x^\alpha = -(-\Delta)^{\alpha/2},$$

i.e., as a power of the self-adjoint and positive definite operator $-\Delta$, $\Delta$ being the Laplace operator. To make the analysis of Eq. (76) simpler and clearer, in the further discussions we focus on the model one-dimensional fractional wave equation

$$D_t^\alpha u(x,t) = R_x^\alpha u(x,t), \quad x \in \mathbb{R},\ t \in \mathbb{R}_+,\ 1 \le \alpha \le 2. \tag{77}$$

In (77), all quantities are supposed to be dimensionless, so that the coefficient at the Riesz space-fractional derivative can be taken to be equal to one without loss of generality. As we have mentioned in the second section, in the one-dimensional case, the Riesz fractional derivative (29) can be represented as the hypersingular integral (30) that for $\alpha = 1$ can be rewritten via the Hilbert transform (31).
This means that for $\alpha = 1$ Eq. (77) can be represented in the form

$$\frac{\partial u}{\partial t}(x,t) = -\frac{1}{\pi}\, \frac{d}{dx} \int_{-\infty}^{+\infty} \frac{u(\xi,t)}{x - \xi}\, d\xi \tag{78}$$

that we call a modified convection equation and that is of course different from the standard convection equation. For $\alpha = 2$, Eq. (77) reduces to the one-dimensional wave equation. In what follows, we focus on the case $1 \le \alpha < 2$ because the case $\alpha = 2$ (wave equation) is well studied in the literature. For Eq. (77), the initial-value problem

$$u(x,0) = \varphi(x), \quad \frac{\partial u}{\partial t}(x,0) = 0, \quad x \in \mathbb{R} \tag{79}$$

is considered for $1 < \alpha < 2$. If $\alpha = 1$, the second initial condition in (79) is omitted. In this section, we are mostly interested in the behavior and properties of the fundamental solution (Green function) $G_\alpha$ of Eq. (77), i.e., its solution with the initial condition $\varphi(x) = \delta(x)$, $\delta$ being the Dirac delta function.
Fundamental Solution of the Fractional Wave Equation

We start our analysis by applying the Fourier transform with respect to the space variable $x$ to Eq. (77) with $1 < \alpha < 2$ and to the initial conditions (79) with $\varphi(x) = \delta(x)$. Using the definition of the Riesz fractional derivative, for the Fourier transform $\hat G_\alpha$, we get the initial-value problem

$$\hat G_\alpha(\kappa, 0) = 1, \quad \frac{\partial \hat G_\alpha}{\partial t}(\kappa, 0) = 0 \tag{80}$$

for the fractional differential equation

$$(D^\alpha \hat G_\alpha)(t) + |\kappa|^\alpha\, \hat G_\alpha(\kappa, t) = 0. \tag{81}$$

The unique solution of (80), (81) is given by the expression (see, e.g., Luchko 1999)

$$\hat G_\alpha(\kappa, t) = E_\alpha(-|\kappa|^\alpha t^\alpha)$$

in terms of the Mittag-Leffler function (59). The well-known formula (see, e.g., Podlubny 1999)

$$E_\alpha(-x) = -\sum_{k=1}^{m} \frac{(-x)^{-k}}{\Gamma(1 - \alpha k)} + O\!\left(|x|^{-1-m}\right), \quad m \in \mathbb{N},\ x \to +\infty \tag{82}$$

for the asymptotics of the Mittag-Leffler function, valid for $0 < \alpha < 2$, shows that $\hat G_\alpha$ belongs to $L_1(\mathbb{R})$ with respect to $\kappa$ for $1 < \alpha < 2$. Therefore, we can apply the inverse Fourier transform and get the representation

$$G_\alpha(x,t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-i\kappa x}\, E_\alpha(-|\kappa|^\alpha t^\alpha)\, d\kappa, \quad x \in \mathbb{R},\ t > 0 \tag{83}$$

for the Green function $G_\alpha$. The last formula shows that the fundamental solution $G_\alpha$ is an even function in $x$, i.e.,

$$G_\alpha(-x,t) = G_\alpha(x,t), \quad x \in \mathbb{R},\ t > 0 \tag{84}$$

and (83) can be rewritten as the cosine Fourier transform:

$$G_\alpha(x,t) = \frac{1}{\pi} \int_0^{\infty} \cos(\kappa x)\, E_\alpha(-\kappa^\alpha t^\alpha)\, d\kappa, \quad x \in \mathbb{R},\ t > 0. \tag{85}$$
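Remarkably, the fundamental solution $G_\alpha$ can be represented in terms of elementary functions for every $\alpha$, $1 < \alpha < 2$. To show this, the technique of the Mellin integral transform was applied in Luchko (2013) to rewrite the integral (85) as a particular case of the Fox H-function:

$$G_\alpha(x,t) = \frac{1}{\alpha x}\, \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{\Gamma\!\left(\frac{s}{\alpha}\right) \Gamma\!\left(1 - \frac{s}{\alpha}\right)}{\Gamma\!\left(1 - \frac{s}{2}\right) \Gamma\!\left(\frac{s}{2}\right)} \left(\frac{t}{x}\right)^{-s} ds, \quad 0 < \gamma < \alpha \tag{86}$$

that can be represented in the form

$$G_\alpha(x,t) = \frac{1}{\alpha x}\, \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{\sin(\pi s/2)}{\sin(\pi s/\alpha)} \left(\frac{t}{x}\right)^{-s} ds, \quad 0 < \gamma < \alpha \tag{87}$$

using the duplication and reflection formulas for the Euler gamma function $\Gamma$. From (86) or (87), a useful representation

$$G_\alpha(x,t) = \frac{1}{x}\, L_\alpha(t/x), \quad x > 0,\ t > 0 \tag{88}$$

of the fundamental solution $G_\alpha$ in terms of the auxiliary function $L_\alpha$ defined by

$$L_\alpha(x) = \frac{1}{\alpha}\, \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{\sin(\pi s/2)}{\sin(\pi s/\alpha)}\, x^{-s}\, ds, \quad 0 < \gamma < \alpha \tag{89}$$

can be obtained. It follows from the representation (89) (or from the formulas (84) and (88)) that $L_\alpha$ is an odd function, i.e.,

$$L_\alpha(-x) = -L_\alpha(x), \quad x \in \mathbb{R}. \tag{90}$$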
Moreover, the important formula

$$L_\alpha(x) = L_\alpha(1/x), \quad x \ne 0 \tag{91}$$

can be obtained from the representation (89) by the substitution $s = -s_1$ in the integral on the right-hand side of (89). From (88) and (91) and the fact that $G_\alpha(-x,t) = G_\alpha(x,t) = G_\alpha(|x|,t)$ for all $x \ne 0$, $t > 0$, the similarity properties of the fundamental solution

$$G_\alpha(x,t) = \frac{1}{|x|}\, G_\alpha(1, t/|x|) = \frac{1}{|x|}\, G_\alpha(1, |x|/t) = \frac{1}{t}\, G_\alpha(x/t, 1) = \frac{t}{x^2}\, G_\alpha(t/x, 1) \tag{92}$$

are deduced. It is worthwhile to stress the remarkable fact that two of these similarity properties hold with the variable $x$ fixed to 1, the other two with the variable $t$ fixed to 1. This fact reflects the property that in the fractional wave equation (77) we deal with in this section, the time-fractional derivative and the space-fractional derivative are of the same order $\alpha$. The correctness of the four similarity properties (92) can also be checked directly using the final formula (97) for the fundamental solution $G_\alpha$ of the fractional wave equation. Because the auxiliary function $L_\alpha$ is defined in (89) as an inverse Mellin transform, its Mellin transform is given by the formula

$$L_\alpha^*(s) = \int_0^{\infty} L_\alpha(x)\, x^{s-1}\, dx = \frac{1}{\alpha}\, \frac{\sin(\pi s/2)}{\sin(\pi s/\alpha)}, \quad 0 < \Re(s) < \alpha \tag{93}$$

…

$$G_\alpha(x,t) = \frac{1}{\pi}\, \frac{|x|^{\alpha-1}\, t^\alpha \sin(\pi\alpha/2)}{t^{2\alpha} + 2|x|^\alpha t^\alpha \cos(\pi\alpha/2) + |x|^{2\alpha}}, \quad t > 0,\ x \in \mathbb{R},\ 1 < \alpha < 2. \tag{97}$$
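The closed form (97) is easy to evaluate, and the similarity properties (92) can be checked against it directly. The following sketch assumes nothing beyond formula (97) itself; the function names `g_alpha` and `similarity_forms` are illustrative.

```python
import math


def g_alpha(x, t, alpha):
    """Closed form (97) of the fundamental solution for t > 0 and 1 <= alpha < 2."""
    r = abs(x)
    s = math.sin(math.pi * alpha / 2.0)
    c = math.cos(math.pi * alpha / 2.0)
    numerator = r ** (alpha - 1.0) * t ** alpha * s
    denominator = t ** (2.0 * alpha) + 2.0 * r ** alpha * t ** alpha * c + r ** (2.0 * alpha)
    return numerator / (math.pi * denominator)


def similarity_forms(x, t, alpha):
    """The four right-hand sides of (92), in the order in which they appear there."""
    ax = abs(x)
    return (
        g_alpha(1.0, t / ax, alpha) / ax,
        g_alpha(1.0, ax / t, alpha) / ax,
        g_alpha(x / t, 1.0, alpha) / t,
        t / (x * x) * g_alpha(t / x, 1.0, alpha),
    )


# Compare the direct evaluation with the four similarity forms at one sample point.
reference = g_alpha(0.8, 0.3, 1.5)
forms = similarity_forms(0.8, 0.3, 1.5)
```

Evaluating `g_alpha` with `alpha = 1.0` gives the Cauchy kernel $t / (\pi (t^2 + x^2))$ up to rounding, which provides an independent check of the implementation.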
Fundamental Solution as a pdf
We begin with a remark that the formula (97) is valid for $\alpha = 1$ (modified convection equation (78)), too, which can be proved by direct calculations. In this case we get the well-known Cauchy kernel

$$G_1(x,t) = \frac{1}{\pi}\, \frac{t}{t^2 + x^2} \tag{98}$$

that is a spatial probability density function evolving in time. For $1 < \alpha < 2$, the Green function (97) is a spatial probability density function evolving in time, too. Indeed, the function (97) is evidently nonnegative for all $t > 0$. Furthermore,

$$\int_{-\infty}^{\infty} G_\alpha(x,t)\, dx = (\mathcal{F}\, G_\alpha(x,t))(0) = \hat G_\alpha(0,t) = E_\alpha(-|\kappa|^\alpha t^\alpha)\Big|_{\kappa=0} = 1 \tag{99}$$

for all $t > 0$ and $1 < \alpha < 2$ according to the formula (82). Thus, $G_\alpha$ given by (97) is a spatial probability density function evolving in time that can be considered to be a fractional generalization of the Cauchy kernel (98) for the case of an arbitrary index $\alpha$, $1 \le \alpha < 2$. Now let us study some properties of the fundamental solution (97) as a pdf. Because $G_\alpha$ is an even function, we consider the function

$$G_\alpha^+(r,t) = G_\alpha(|x|,t) = \frac{1}{\pi}\, \frac{r^{\alpha-1}\, t^\alpha \sin(\pi\alpha/2)}{t^{2\alpha} + 2 r^\alpha t^\alpha \cos(\pi\alpha/2) + r^{2\alpha}}, \quad t > 0,\ 1 < \alpha < 2$$

with $r = |x| \ge 0$. It is easy to see that $G_\alpha^+$ behaves like a power function in $r$ both at $r = 0$ and at $r = +\infty$ for a fixed $t > 0$:

$$G_\alpha^+(r,t) \approx \begin{cases} r^{\alpha-1}, & r \to 0, \\ r^{-\alpha-1}, & r \to +\infty. \end{cases} \tag{100}$$
This means that the pdf $G_\alpha$ possesses finite moments of order $\gamma$ for $0 \le \gamma < \alpha$, but the moment of order $\alpha$ is infinite. In particular, the mean value of $G_\alpha$ (its first moment) exists for all $\alpha > 1$ (we note that the Cauchy kernel does not possess a mean value). Let us now evaluate the moments of the one-sided fractional Cauchy kernel $G_\alpha^+$ for a fixed $t > 0$. To do this, we refer to the representation (88) of $G_\alpha^+$ in terms of the auxiliary function $L_\alpha$ that is given by the formula (see (90) and (97))

$$L_\alpha(x) = \frac{1}{\pi}\, \frac{\operatorname{sign}(x)\, |x|^\alpha \sin(\pi\alpha/2)}{|x|^{2\alpha} + 2|x|^\alpha \cos(\pi\alpha/2) + 1}, \quad x \in \mathbb{R},\ 1 < \alpha < 2. \tag{101}$$

Taking into account this formula, the function

$$C_\alpha(x) := \frac{L_\alpha(x)}{x}$$

can be interpreted as a fractional Cauchy pdf of the order $\alpha$. Indeed, $C_\alpha(x)$ is evidently nonnegative for all $x \in \mathbb{R}$, and the property

$$\int_{-\infty}^{+\infty} C_\alpha(x)\, dx = 1$$

is valid because of the formulas (102) and (103). Of course, for $\alpha = 1$, the pdf $C_\alpha(x)$ coincides with the Cauchy pdf. The moment of the order $\gamma$, $0 \le \gamma < \alpha$, of $G_\alpha^+$ can be represented in terms of the Mellin integral transform of $L_\alpha$ that is known (see the formula (93)) and thus evaluated:

$$\int_0^{\infty} G_\alpha^+(r,t)\, r^\gamma\, dr = t^\gamma \int_0^{\infty} L_\alpha(\tau)\, \tau^{-\gamma-1}\, d\tau = \frac{t^\gamma}{\alpha}\, \frac{\sin(\pi\gamma/2)}{\sin(\pi\gamma/\alpha)}. \tag{102}$$

In particular, we get the formula

$$\int_0^{\infty} G_\alpha^+(r,t)\, dr = \frac{1}{2} \tag{103}$$

that is in accordance with (99) because $G_\alpha$ is an even function in $x$. We mention also the important formula

$$\int_0^{\infty} r\, G_\alpha^+(r,t)\, dr = \frac{t}{\alpha \sin(\pi/\alpha)}, \quad 1 < \alpha < 2. \tag{104}$$

… $> 0$ for $x \ne 0$, so that $x = 0$ is a minimum point of the fundamental solution $G_\alpha$ for any $t > 0$. Because $G_\alpha$ is an even function in $x$, we again consider the function $G_\alpha(|x|,t)$ that was denoted by $G_\alpha^+(r,t)$ with $r = |x|$. To determine the maximum locations of $G_\alpha^+$ for fixed values of $t$ and $\alpha$, we solve the equation

$$\frac{\partial G_\alpha^+}{\partial r}(r,t) = 0$$

that turns out to be equivalent to the quadratic equation

$$(\alpha + 1) \left(\frac{r^\alpha}{t^\alpha}\right)^2 + 2\cos(\pi\alpha/2)\, \frac{r^\alpha}{t^\alpha} - (\alpha - 1) = 0$$

with solutions given by

$$\frac{r^\alpha}{t^\alpha} = \frac{-\cos(\pi\alpha/2) \pm \sqrt{\alpha^2 - \sin^2(\pi\alpha/2)}}{\alpha + 1}.$$

Since we are interested in the nonnegative solutions, the only candidate for this role is the point

$$\frac{r^\alpha}{t^\alpha} = c_\alpha, \quad c_\alpha := \frac{-\cos(\pi\alpha/2) + \sqrt{\alpha^2 - \sin^2(\pi\alpha/2)}}{\alpha + 1}. \tag{105}$$

Because $\frac{\partial G_\alpha^+}{\partial r}(r,t)$ is positive for $\frac{r^\alpha}{t^\alpha} < c_\alpha$ and negative for $\frac{r^\alpha}{t^\alpha} > c_\alpha$, we conclude that the point

$$r_\alpha^\star(t) = v_p(\alpha)\, t, \quad v_p(\alpha) := (c_\alpha)^{1/\alpha} \tag{106}$$

with $c_\alpha$ given by (105) is the only maximum point of the one-sided fractional Cauchy kernel $G_\alpha^+$. Of course, this point and the point $-r_\alpha^\star(t) < 0$ are maximum points of $G_\alpha$ because $G_\alpha$ is an even function in $x$. To determine the maximum value of the function $G_\alpha$ that coincides with the maximum value of $G_\alpha^+$ and is denoted by $G_\alpha^\star(t)$, we substitute the point $r = r_\alpha^\star(t)$ given by (106) into the function $G_\alpha^+$ and get the formula

$$G_\alpha^\star(t) = \frac{1}{t}\, m_\alpha, \quad m_\alpha := G_\alpha^\star(1) = \frac{1}{\pi\, v_p(\alpha)}\, \frac{c_\alpha \sin(\pi\alpha/2)}{1 + 2c_\alpha \cos(\pi\alpha/2) + c_\alpha^2}, \tag{107}$$

where $v_p(\alpha)$ and $c_\alpha$ are defined as in the formulas (105) and (106). Of course, we can also use the relation (88) and obtain the formula

$$G_\alpha^\star(t) = \frac{1}{t\, v_p(\alpha)}\, L_\alpha(v_p(\alpha)) \tag{108}$$

via the auxiliary function $L_\alpha$. It follows from the formulas (106) and (107) (or (108)) that for a fixed value of $\alpha$, $1 < \alpha < 2$, the product $p_\alpha$ of the maximum value $G_\alpha^\star(t)$ and the maximum locations $\pm r_\alpha^\star(t)$ is time independent:

$$p_\alpha = \pm r_\alpha^\star(t)\, G_\alpha^\star(t) = \pm \frac{1}{\pi}\, \frac{c_\alpha \sin(\pi\alpha/2)}{1 + 2c_\alpha \cos(\pi\alpha/2) + c_\alpha^2} = \pm L_\alpha\!\left(v_p(\alpha)\right). \tag{109}$$
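The formulas (105)–(109) can be verified numerically against the closed form of $G_\alpha^+$. The sketch below is only an illustration under the formulas stated above; the function names (`c_alpha`, `v_p`, `g_plus`, `max_location`, `max_value`) are hypothetical and chosen for readability.

```python
import math


def c_alpha(alpha):
    """Nonnegative root (105) of (alpha+1) y^2 + 2 cos(pi alpha/2) y - (alpha-1) = 0."""
    c = math.cos(math.pi * alpha / 2.0)
    s = math.sin(math.pi * alpha / 2.0)
    return (-c + math.sqrt(alpha * alpha - s * s)) / (alpha + 1.0)


def v_p(alpha):
    """Velocity of the maximum location, v_p(alpha) = c_alpha^(1/alpha), see (106)."""
    return c_alpha(alpha) ** (1.0 / alpha)


def g_plus(r, t, alpha):
    """One-sided fundamental solution G_alpha^+(r, t) for r >= 0, t > 0."""
    s = math.sin(math.pi * alpha / 2.0)
    c = math.cos(math.pi * alpha / 2.0)
    den = t ** (2.0 * alpha) + 2.0 * r ** alpha * t ** alpha * c + r ** (2.0 * alpha)
    return r ** (alpha - 1.0) * t ** alpha * s / (math.pi * den)


def max_location(t, alpha):
    """Maximum location r*(t) = v_p(alpha) t from (106)."""
    return v_p(alpha) * t


def max_value(t, alpha):
    """Maximum value G*(t) = m_alpha / t from (107)."""
    ca = c_alpha(alpha)
    s = math.sin(math.pi * alpha / 2.0)
    c = math.cos(math.pi * alpha / 2.0)
    m = ca * s / (math.pi * v_p(alpha) * (1.0 + 2.0 * ca * c + ca * ca))
    return m / t


# Product of maximum location and maximum value at three time instants, see (109).
products = [max_location(t, 1.5) * max_value(t, 1.5) for t in (0.1, 0.2, 0.3)]
```

The entries of `products` agree up to rounding, which is the time independence stated in (109).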
For $\alpha = 1$, the maximum location of the Green function $G_1(x,t) = \frac{1}{\pi}\, \frac{t}{t^2 + x^2}$ does not move with time and is evidently at the point $x = 0$ for any $t > 0$, i.e., $c_1 = v_p(1) = p_1 = 0$, which is in accordance with the formulas (105), (106), and (109).

Now we calculate some physical characteristics of the damped waves that are described by the fundamental solution $G_\alpha$. Because $G_\alpha$ consists in fact of two symmetric branches that move in opposite directions, we again consider only one of them, say, $G_\alpha^+$ that is the restriction of $G_\alpha$ to $x \ge 0$. The location of the gravity center $r_\alpha^g(t)$ of $G_\alpha^+$ is defined by the formula

$$r_\alpha^g(t) = \frac{\int_0^{\infty} r\, G_\alpha^+(r,t)\, dr}{\int_0^{\infty} G_\alpha^+(r,t)\, dr}. \tag{110}$$

For $1 < \alpha < 2$, the formulas (103) and (104) lead to the following result:

$$r_\alpha^g(t) = \frac{2t}{\alpha \sin(\pi/\alpha)}. \tag{111}$$
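The normalization (103) and the gravity center (111) can be cross-checked by numerical quadrature. The sketch below assumes a plain midpoint rule after the substitution $r = u/(1-u)$, which maps the half-line to the unit interval; this choice is made for simplicity and is not taken from the chapter. Because the integrand of the first moment has an integrable singularity at $u = 1$, the result is only accurate to a few digits.

```python
import math


def g_plus(r, t, alpha):
    """One-sided fundamental solution G_alpha^+(r, t) for r >= 0, t > 0."""
    s = math.sin(math.pi * alpha / 2.0)
    c = math.cos(math.pi * alpha / 2.0)
    den = t ** (2.0 * alpha) + 2.0 * r ** alpha * t ** alpha * c + r ** (2.0 * alpha)
    return r ** (alpha - 1.0) * t ** alpha * s / (math.pi * den)


def integrate_half_line(f, n=100000):
    """Midpoint rule for the integral of f over (0, inf) after substituting r = u/(1-u)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        r = u / (1.0 - u)
        total += f(r) / (1.0 - u) ** 2
    return total * h


def gravity_center(t, alpha, n=100000):
    """Return (r_g, mass): the ratio (110) and the denominator integral from (103)."""
    mass = integrate_half_line(lambda r: g_plus(r, t, alpha), n)
    first = integrate_half_line(lambda r: r * g_plus(r, t, alpha), n)
    return first / mass, mass


def gravity_center_exact(t, alpha):
    """Closed form (111)."""
    return 2.0 * t / (alpha * math.sin(math.pi / alpha))
```

Calling `gravity_center(0.3, 1.5)` and comparing with `gravity_center_exact(0.3, 1.5)` illustrates the linear growth of the gravity center in time; a more careful quadrature would be needed for $\alpha$ close to 1, where the first moment converges slowly.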
If $\alpha = 1$, the mean value of $G_1^+$ does not exist and thus the gravity center $r_1^g(t)$ of $G_1^+$ is located at $+\infty$ for any $t > 0$. The “mass” center $r_\alpha^m(t)$ of $G_\alpha^+$ is determined by the formula (Gurwich 2001)

$$r_\alpha^m(t) = \frac{\int_0^{\infty} r \left(G_\alpha^+(r,t)\right)^2 dr}{\int_0^{\infty} \left(G_\alpha^+(r,t)\right)^2 dr}. \tag{112}$$

Substituting the representation (88) into (112) and transforming the obtained integrals, we get the formula

$$r_\alpha^m(t) = v_m(\alpha)\, t, \quad v_m(\alpha) = \frac{\int_0^{\infty} \tau^{-1} L_\alpha^2(\tau)\, d\tau}{\int_0^{\infty} L_\alpha^2(\tau)\, d\tau}, \tag{113}$$

where the function $L_\alpha$ is defined by (101). The formula (113) (as well as the formula (118)) includes some integrals of the form

$$I_\alpha(\beta) = \int_0^{\infty} \tau^\beta L_\alpha^2(\tau)\, d\tau, \quad -2\alpha - 1 < \beta < 2\alpha - 1 \tag{114}$$

that in general cannot be expressed via known elementary or special functions. Remarkably, there exists an explicit formula for the integrals (114) in the case $\alpha = 1$, namely (Prudnikov et al. 1986),

$$\int_0^{\infty} \tau^\beta L_1^2(\tau)\, d\tau = \frac{1}{4\pi}\, \frac{1 + \beta}{\cos(\pi\beta/2)}, \quad -3 < \beta < 1. \tag{115}$$

It follows from this formula that the “mass” center $r_1^m$ of $G_1^+$ can be represented by the simple formula

$$r_1^m(t) = \frac{2}{\pi}\, t. \tag{116}$$
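Since the integrals (114) have no closed form in general, the pulse velocity $v_m(\alpha)$ has to be computed numerically. The following sketch uses a midpoint rule after the substitution $\tau = u/(1-u)$; this quadrature is an assumption made here for illustration, and the function names are hypothetical. The explicit case (115) and the value $2/\pi$ from (116) serve as reference points.

```python
import math


def l_alpha(x, alpha):
    """Auxiliary function (101), evaluated for x > 0."""
    s = math.sin(math.pi * alpha / 2.0)
    c = math.cos(math.pi * alpha / 2.0)
    xa = x ** alpha
    return xa * s / (math.pi * (xa * xa + 2.0 * xa * c + 1.0))


def moment_integral(beta, alpha, n=20000):
    """I_alpha(beta) from (114), midpoint rule after the substitution tau = u/(1-u)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        tau = u / (1.0 - u)
        total += tau ** beta * l_alpha(tau, alpha) ** 2 / (1.0 - u) ** 2
    return total * h


def pulse_velocity(alpha, n=20000):
    """v_m(alpha) = I_alpha(-1) / I_alpha(0), see (113)."""
    return moment_integral(-1.0, alpha, n) / moment_integral(0.0, alpha, n)


def explicit_integral_alpha_one(beta):
    """Right-hand side of (115), valid for -3 < beta < 1 with beta != -1."""
    return (1.0 + beta) / (4.0 * math.pi * math.cos(math.pi * beta / 2.0))
```

Evaluating `pulse_velocity(1.0)` reproduces the factor $2/\pi$ in (116) to quadrature accuracy, and `pulse_velocity(alpha)` can be tabulated for $1 < \alpha < 2$ in the same way.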
In the general case, we just note that the symmetry relation

$$I_\alpha(\beta) = I_\alpha(-\beta - 2), \quad -2\alpha - 1 < \beta < 2\alpha - 1$$

holds true because of the formula (91). Finally we mention that the location of energy of the damped wave $G_\alpha^+$, defined as the time corresponding to the centroid of the function $G_\alpha^+$ in the time domain, is given by the formula (Carcione et al. 2010)

$$t_\alpha^c(r) = \frac{\int_0^{\infty} t \left(G_\alpha^+(r,t)\right)^2 dt}{\int_0^{\infty} \left(G_\alpha^+(r,t)\right)^2 dt}. \tag{117}$$

For $1 < \alpha < 2$, both integrals on the right-hand side of (117) converge and the finite location of energy can be represented in the form

$$t_\alpha^c(r) = \frac{r}{v_c(\alpha)}, \quad v_c(\alpha) = \frac{\int_0^{\infty} L_\alpha^2(\tau)\, d\tau}{\int_0^{\infty} \tau\, L_\alpha^2(\tau)\, d\tau}, \tag{118}$$

where the function $L_\alpha$ is defined by (101). For $\alpha = 1$, the integral $\int_0^{\infty} \tau\, L_\alpha^2(\tau)\, d\tau$ diverges, so that $v_c(\alpha)$ tends to 0 as $\alpha \to 1$.

4.4 Velocities of the Damped Waves
It is well known (see, e.g., Smith 1970; Bloch 1977; Groesen and Mainardi 1989, 1990; Gurwich 2001; Carcione et al. 2010) that several different definitions of the wave velocities, and in particular of the light velocity, can be introduced. For the damped waves that are described by the fractional wave equation (77), we evaluate the propagation velocity of the maximum of its fundamental solution $G_\alpha$ that can be interpreted as the phase velocity, the propagation velocity of the gravity center of $G_\alpha$, the velocity of its “mass” center or the pulse velocity, and three different kinds of its centrovelocity. It turns out that all these velocities are constant in time and depend just on the order $\alpha$ of the fractional wave equation. Whereas four out of six velocities are different from each other, the first centrovelocity coincides with the Smith centrovelocity, and the second centrovelocity is the same as the pulse velocity.

We start with the phase velocity and determine it using the formula (106), which leads to the result that the maximum locations of the fundamental solution $G_\alpha$ propagate with the constant velocities $v_p(\alpha)$ that are given by the expression

$$v_p(\alpha) := \pm \frac{dr_\alpha^\star(t)}{dt} = \pm \left(\frac{-\cos(\pi\alpha/2) + \sqrt{\alpha^2 - \sin^2(\pi\alpha/2)}}{\alpha + 1}\right)^{1/\alpha}. \tag{119}$$

For $\alpha = 1$ (modified convection equation (78)), the propagation velocity of the maximum of $G_\alpha$ is equal to zero (the maximum point stays at $x = 0$), whereas for $\alpha = 2$ (wave equation), the maximum points propagate with the constant velocity $\pm 1$. To determine the propagation velocity $v_g(\alpha)$ of the gravity center of $G_\alpha$, we employ the formula (111) and get the following result:

$$v_g(\alpha) := \frac{dr_\alpha^g(t)}{dt} = \frac{2}{\alpha \sin(\pi/\alpha)}. \tag{120}$$

$v_g(\alpha)$ is thus time independent and determined by the order $\alpha$ of the fractional wave equation. Evidently, $v_g(2) = 1$ and $v_g(\alpha) \to +\infty$ as $\alpha \to 1 + 0$. The velocity $v_m(\alpha)$ of the “mass” center of $G_\alpha$ or its pulse velocity (Gurwich 2001) is obtained from the formula (113) and is equal to

$$v_m(\alpha) := \frac{dr_\alpha^m(t)}{dt} = \frac{\int_0^{\infty} \tau^{-1} L_\alpha^2(\tau)\, d\tau}{\int_0^{\infty} L_\alpha^2(\tau)\, d\tau}, \tag{121}$$

where the function $L_\alpha$ is defined by (101). For $\alpha = 1$, the pulse velocity is equal to $2/\pi \approx 0.64$ (see the formula (116)). Following Carcione et al. (2010) we define the second centrovelocity $v_2(\alpha)$ as the mean pulse velocity computed from the time 0 to the time $t$. It follows from (113) and (121) that for the damped wave that is described by the fundamental solution of the fractional wave equation, the second centrovelocity is equal to its pulse velocity $v_m(\alpha)$:

$$v_2(\alpha) := \frac{r_\alpha^m(t)}{t} = v_m(\alpha) = \frac{\int_0^{\infty} \tau^{-1} L_\alpha^2(\tau)\, d\tau}{\int_0^{\infty} L_\alpha^2(\tau)\, d\tau}. \tag{122}$$

The Smith centrovelocity $v_c(\alpha)$ (Smith 1970) of the damped waves describes the motion of the first moment of their energy distribution and can be evaluated in explicit form using the formula (118):

$$v_c(\alpha) := \left(\frac{dt_\alpha^c(r)}{dr}\right)^{-1} = \frac{\int_0^{\infty} L_\alpha^2(\tau)\, d\tau}{\int_0^{\infty} \tau\, L_\alpha^2(\tau)\, d\tau}, \tag{123}$$

where the function $L_\alpha$ is defined by (101). Because the integral $\int_0^{\infty} \tau\, L_\alpha^2(\tau)\, d\tau$ diverges for $\alpha = 1$, the Smith centrovelocity tends to 0 as $\alpha \to 1$. Finally, we evaluate the first centrovelocity $v_1(\alpha)$ that is defined as the mean centrovelocity from 0 to $x$ (Carcione et al. 2010). It follows from (118) and (123) that for the damped wave $G_\alpha$ the first centrovelocity is equal to the Smith centrovelocity $v_c(\alpha)$:

$$v_1(\alpha) := \frac{r}{t_\alpha^c(r)} = v_c(\alpha) = \frac{\int_0^{\infty} L_\alpha^2(\tau)\, d\tau}{\int_0^{\infty} \tau\, L_\alpha^2(\tau)\, d\tau}. \tag{124}$$

As we have seen, all velocities introduced above are constant in time and depend just on the order $\alpha$ of the fractional wave equation. The phase velocity, the velocity of the gravity center of $G_\alpha$, the pulse velocity, and the Smith centrovelocity are different from each other, whereas the first centrovelocity coincides with the Smith centrovelocity and the second centrovelocity is the same as the pulse velocity. For the physical interpretation and meaning of the velocities that were determined above, we refer to, e.g., Bloch (1977), Groesen and Mainardi (1989, 1990), Gurwich (2001), and Carcione et al. (2010).
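The two velocities with closed-form expressions, (119) and (120), can be tabulated directly. The sketch below only evaluates these two formulas; the clamping to zero inside `phase_velocity` is an implementation detail added here to absorb rounding at $\alpha = 1$, and the function names are illustrative.

```python
import math


def phase_velocity(alpha):
    """Positive branch of v_p(alpha) from (119), for 1 <= alpha <= 2."""
    c = math.cos(math.pi * alpha / 2.0)
    s = math.sin(math.pi * alpha / 2.0)
    root = math.sqrt(max(alpha * alpha - s * s, 0.0))  # guard against rounding
    c_alpha = max((-c + root) / (alpha + 1.0), 0.0)    # guard against rounding at alpha = 1
    return c_alpha ** (1.0 / alpha)


def gravity_center_velocity(alpha):
    """v_g(alpha) from (120), for 1 < alpha <= 2 (unbounded as alpha -> 1 + 0)."""
    return 2.0 / (alpha * math.sin(math.pi / alpha))


# Table of (alpha, v_p, v_g) for a few orders between the two limiting cases.
table = [(a, phase_velocity(a), gravity_center_velocity(a))
         for a in (1.01, 1.1, 1.5, 1.9, 2.0)]
```

For $\alpha = 2$ both entries equal 1 up to rounding, in line with the wave-equation limit, while near $\alpha = 1$ the phase velocity approaches 0 and the gravity center velocity grows without bound.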
4.5 Discussion of the Obtained Results and Plots
To start with, let us consider the evolution of the fundamental solution $G_\alpha$ in time for some characteristic values of $\alpha$. In Fig. 3, the plots of $G_\alpha$ for $\alpha = 1.01,\ 1.1,\ 1.5$, and $1.9$ are presented. As we can see, in all cases the maximum location of $G_\alpha$ moves in time according to the formula (106), whereas the maximum value decreases according to the formula (107). The behavior of $G_\alpha$ can thus be interpreted as propagation of damped waves whose amplitude decreases with time. This phenomenon can be very clearly recognized on the 3D plot presented in Fig. 4. Of course, because of the nonlocal character of the fractional derivatives in the fractional wave equation, the solutions to this equation show some properties of diffusion processes, too. In particular, the fundamental solution $G_\alpha$ is positive for all $x \ne 0$ at any small time instant $t > 0$, which means that a disturbance of the initial conditions spreads infinitely fast and Eq. (33) is nonrelativistic like the classical diffusion equation. But in contrast to the diffusion equation, the maximum location of the fundamental solution $G_\alpha$, its gravity and “mass” centers, and the location of its energy all propagate with finite constant velocities like the fundamental solution of the wave equation.

Fig. 3 Fundamental solution $G_\alpha$: plots for $\alpha = 1.01$ (1st line, left), $\alpha = 1.1$ (1st line, right), $\alpha = 1.5$ (2nd line, left), and $\alpha = 1.9$ (2nd line, right) for $-0.5 \le x \le 0.5$ and $t = 0.1,\ 0.2,\ 0.3$

Fig. 4 Plot of $G_\alpha$ for $\alpha = 1.5$, $-0.5 \le x \le 0.5$, and $0 < t \le 0.3$

The plots of the propagation velocity $v_p$ of the maximum location of the fundamental solution $G_\alpha$ (phase velocity), the velocity $v_g$ of its gravity center, its pulse velocity $v_m$, and its centrovelocity $v_c$ are presented in Fig. 5. As expected, $v_p = v_c = 0$, $v_m = 2/\pi \approx 0.64$ for $\alpha = 1$ (modified convection equation), and all velocities smoothly approach the value 1 as $\alpha \to 2$ (wave equation). For $1 < \alpha < 2$, $v_p$, $v_m$, and $v_c$ monotonically increase, whereas $v_g$ monotonically decreases. It is interesting to note that for all velocities $v = v(\alpha)$, the property $\frac{dv(\alpha)}{d\alpha}(2 - 0) = 0$ holds true, i.e., in a small neighborhood of the point $\alpha = 2$, the velocities of $G_\alpha$ are nearly the same as those of the fundamental solution of the wave equation. The velocity $v_g$ of the gravity center of $G_\alpha$ tends to $+\infty$ for $\alpha \to 1 + 0$ and $t > 0$ (modified convection equation) because the first moment of the Cauchy kernel (98) does not exist. It is interesting to note that for all $\alpha$, $1 < \alpha < 2$, the velocities $v_p,\ v_g,\ v_m,\ v_c$ are different from each other and fulfill the inequalities $v_c(\alpha) < v_p(\alpha) < v_m(\alpha) < v_g(\alpha)$. For $\alpha = 2$, all velocities are equal to 1.

Fig. 5 Plots of the gravity center velocity $v_g(\alpha)$, the pulse velocity $v_m(\alpha)$, the phase velocity $v_p(\alpha)$, and the centrovelocity $v_c(\alpha)$ for $1 \le \alpha \le 2$
5 Conclusions and Open Problems
In this chapter, anomalous transport processes have first been modeled with continuous time random walks (CTRW) on the microlevel. On the macrolevel, the CTRW models were reduced to deterministic fractional diffusion-wave equations on large time and space scales and under some suitable assumptions posed on the jump pdf. Equations of this kind have already been successfully employed, e.g., for modeling of geothermal energy extraction (Luchko and Punzi 2011), stability and seismicity of fractal fault systems (Gudehus and Touplikiotis 2012), and propagation of damped waves (Luchko 2013), which shows their potential importance for different geomathematical applications. In this chapter, some important types of partial differential equations of fractional order including the generalized time-fractional diffusion equation, the time-fractional diffusion-wave equation, and the time- and space-fractional wave equation were treated. For these equations, both the initial-boundary-value problems with the Dirichlet boundary conditions and the Cauchy initial-value problems have been posed and investigated. Of course, the same method can be applied to the initial-boundary-value problems with the Neumann, Robin, or mixed boundary conditions.

For the generalized time-fractional diffusion equation, a powerful maximum principle has been established. It enables us to obtain information regarding solutions and their a priori estimates without any explicit knowledge of the form of the solutions themselves and thus is a valuable tool in scientific research. In this connection we mention an important problem that is still waiting for its solution, namely, the extension of the maximum principle to the space- and time-space-fractional partial differential equations. These equations are nowadays actively employed in modeling of relevant complex phenomena like anomalous diffusion in inhomogeneous and porous media, Levy processes and Levy flights, and the so-called fractional kinetics, and they deserve a detailed treatment from the mathematical viewpoint.

In the last section of the chapter, a fractional wave equation with fractional derivatives of order $\alpha$, $1 \le \alpha \le 2$, both in space and in time was introduced and analyzed. We showed that the fractional wave equation inherits some crucial characteristics of the wave equation like the constant propagation velocities of the maximum of its fundamental solution, its gravity and “mass” centers, and its energy location. Because the maximum value of the fundamental solution $G_\alpha$ (wave amplitude) decreases with time whereas its location moves with a constant velocity, solutions to the fractional wave equation can be interpreted as damped waves. Moreover, $G_\alpha$, which turns out to be expressed in terms of elementary functions for all values of $\alpha$, $1 \le \alpha < 2$, can be interpreted as a spatial pdf evolving in time, all of whose moments of order less than $\alpha$ are finite. In connection with the fractional wave equation, an important problem for further research would be the determination of other velocities like the group velocity or the ratio-of-units velocity (see, e.g., Bloch 1977; Gurwich 2001) for the damped waves described by the fractional wave equation. Finally, fractional wave equations with nonconstant coefficients as well as the qualitative behavior of solutions to nonlinear fractional wave equations would be worth considering from the mathematical viewpoint and employing as models in suitable applications.
References Al-Refai M (2012) On the fractional derivatives at extreme points. Electron J Qual Theory Differ Equ 55:1–5 Berkowitz B, Klafter J, Metzler R, Scher H (2002) Physical pictures of transport in heterogeneous media: advection-dispersion, random walk and fractional derivative formulations. Water Resour Res 38:1191–1203 Bloch SC (1977) Eighth velocity of light. Am J Phys 45:538–549 Buckwar E, Luchko Yu (1998) Invariance of a partial differential equation of fractional order under the Lie group of scaling transformations. J Math Anal Appl 227:81–97 Carcione JM, Gei D, Treitel S (2010) The velocity of energy through a dissipative medium. Geophysics 75:T37–T47 Diethelm K (2010) The analysis of fractional differential equations. Springer, Berlin Emmanuel S, Berkowitz B (2007) Continuous time random walks and heat transfer in porous media. Transp Porous Media 67:413–430 Feller W (1952) On a generalization of Marcel Riesz’ potentials and the semi-groups generated by them. Meddelanden Lunds Universitets Matematiska Seminarium (Comm. Sém. Mathém. Université de Lund), Tome suppl. dédié à M. Riesz: 73–81 Fulger D, Scalas E, Germano G (2008) Monte Carlo simulation of uncoupled continuous time random walks yielding a stochastic solution of the space-time fractional diffusion equation. Phys Rev E 77:021122
Geiger S, Emmanuel S (2010) Non-Fourier thermal transport in fractured geological media. Water Resour Res 46:W07504 Germano G, Politi M, Scalas E, Schilling RL (2009) Stochastic calculus for uncoupled continuoustime random walks. Phys Rev E 79:066102 Gorenflo R, Mainardi F (2001) Random walk models approximating symmetric space-fractional diffusion processes. In: Elschner J, Gohberg I, Silbermann B (eds) Problems in mathematical physics. Birkhäuser Verlag, Boston/Basel/Berlin Gorenflo R, Mainardi F (2009) Some recent advances in theory and simulation of fractional diffusion processes. J Comput Appl Math 229:400–415 Gorenflo R, Iskenderov A, Luchko Yu (2000a) Mapping between solutions of fractional diffusionwave equations. Fract Calc Appl Anal 3:75–86 Gorenflo R, Luchko Yu, Mainardi F (2000b) Wright functions as scale-invariant solutions of the diffusion-wave equation. J Comput Appl Math 118:175–191 Gorenflo R, Loutchko J, Luchko Yu (2002) Computation of the Mittag-Leffler function and its derivatives. Fract Calc Appl Anal 5:491–518 Groesen E, Mainardi F (1989) Energy propagation in dissipative systems, Part I: centrovelocity for linear systems. Wave Motion 11:201–209 Groesen E, Mainardi F (1990) Balance laws and centrovelocity in dissipative systems. J Math Phys 30:2136–2140 Gudehus G, Touplikiotis A (2012) Clasmatic seismodynamics – oxymoron or pleonasm? Soil Dyn Earthq Eng 38:1–14 Gurwich I (2001) On the pulse velocity in absorbing and nonlinear media and parallels with the quantum mechanics. Prog Electromagn Res 33:69–96 Hanyga A (2002) Multi-dimensional solutions of space-time-fractional diffusion equations. Proc R Soc Lond A 458:429-450 Haubold J, Mathai AM, Saxena RK (2011) Mittag-Leffler functions and their applications. J Appl Math 2011:298628 Luchko Yu (1999) Operational method in fractional calculus. Fract Calc Appl Anal 2:463–489 Luchko Yu (2008) Algorithms for evaluation of the Wright function for the real arguments’ values. 
Fract Calc Appl Anal 11:57–75 Luchko Yu (2009a) Boundary value problems for the generalized time-fractional diffusion equation of distributed order. Fract Calc Appl Anal 12:409–422 Luchko Yu (2009b) Maximum principle for the generalized time-fractional diffusion equation. J Math Anal Appl 351:218–223 Luchko Yu (2010) Some uniqueness and existence results for the initial-boundary-value problems for the generalized time-fractional diffusion equation. Comput Math Appl 59:1766–1772 Luchko Yu (2011a) Initial-boundary-value problems for the generalized multi-term time-fractional diffusion equation. J Math Anal Appl 374:538–548 Luchko Yu (2011b) Maximum principle and its application for the time-fractional diffusion equations. Fract Calc Appl Anal 14:110–124 Luchko Yu (2012a) Anomalous diffusion: models, their analysis, and interpretation. In: Rogosin S, Koroleva A (eds) Advances in applied analysis. Series: trends in mathematics. Birkhäuser Verlag, Boston/Basel/Berlin Luchko Yu (2012b) Initial-boundary-value problems for the one-dimensional time-fractional diffusion equation. Fract Calc Appl Anal 15:141–160 Luchko Yu (2013) Fractional wave equation and damped waves. J Math Phys 54:031505 Luchko Yu, Gorenflo R (1998) Scale-invariant solutions of a partial differential equation of fractional order. Fract Calc Appl Anal 1: 63–78 Luchko Yu, Gorenflo R (1999) An operational method for solving fractional differential equations with the Caputo derivatives. Acta Math Vietnam 24:207–233 Luchko Yu, Punzi A (2011) Modeling anomalous heat transport in geothermal reservoirs via fractional diffusion equations. Int J Geomath 1:257–276
Luchko Yu, Mainardi F, Povstenko Yu (2013) Propagation speed of the maximum of the fundamental solution to the fractional diffusion-wave equation. Comput Math Appl 66:774– 784 Mainardi F (1994) On the initial-value problem for the fractional diffusion-wave equation. In: Rionero S, Ruggeri T (eds) Waves and stability in continuous media. World Scientific, Singapore Mainardi F (1996a) Fractional relaxation-oscillation and fractional diffusion-wave phenomena. Chaos Solitons Fractals 7:1461–1477 Mainardi F (1996b) The fundamental solutions for the fractional diffusion-wave equation. Appl Math Lett 9:23–28 Mainardi F, Luchko Yu, Pagnini G (2001) The fundamental solution of the space-time fractional diffusion equation. Fract Calc Appl Anal 4:153–192. E-print http://arxiv.org/abs/cond-mat/ 0702419 Marichev OI (1983) Handbook of integral transforms of higher transcendental functions, theory and algorithmic tables. Ellis Horwood, Chichester Matlab File Exchange (2005) Matlab-Code that calculates the Mittag-Leffler function with desired accuracy. Available for download at www.mathworks.com/matlabcentral/fileexchange/8738mittag-leffler-function Metzler R, Klafter J (2000) The random walk’s guide to anomalous diffusion: a fractional dynamics approach. Phys Rep 339:1–77 Metzler R, Klafter J (2004) The restaurant at the end of the random walk: recent developments in the description of anomalous transport by fractional dynamics. J Phys A 37:161–208 Metzler R, Nonnenmacher TF (2002) Space- and time-fractional diffusion and wave equations, fractional Fokker-Planck equations, and physical motivation. Chem Phys 284: 67-90 Montroll E, Weiss, G (1965) Random walks on lattices. J Math Phys 6:167 Näsholm SP, Holm S (2013) On a fractional Zener elastic wave equation. Fract Calc Appl Anal 16:26–50 Podlubny I (1999) Fractional differential equations. Academic, San Diego Prudnikov AP, Brychkov YA, Marichev OI (1986) Integrals and series. Vol 1: Elementary functions. 
Gordon and Breach, New York Samko SG, Kilbas AA, Marichev OI (1993) Fractional integrals and derivatives: theory and applications. Gordon and Breach, Yverdon Smith RL (1970) The velocities of light. Am J Phys 38:978–984 Szabo TL, Wu J (2000) A model for longitudinal and shear wave propagation in viscoelastic media. J Acoust Soc Am 107:2437–2446 Vladimirov VS (1971) Equations of the mathematical physics. Nauka, Moscow
Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives Matthias Augustin, Mathias Bauer, Christian Blick, Sarah Eberle, Willi Freeden, Christian Gerhards, Maxim Ilyasov, René Kahnt, Matthias Klug, Sandra Möhringer, Thomas Neu, Helga Nutz, Isabel Michel née Ostermann, and Alessandro Punzi
Contents
1 Introduction
2 Potential Methods
2.1 Classical Gravimetry Problem
2.2 Spline Method for Data Supplementation of Gravitational Measurement Information
2.3 Gravitational Signatures of Hotspots/Mantle Plumes
2.4 Gravito-Magneto Combined Inversion
3 Seismic Processing
3.1 Seismic Recording and Data Retrieval
3.2 Seismic Post-processing in a Multiscale Framework by Means of Surface-Layer Potentials
4 Fluid and Heat Flow Models
4.1 Basic Physical Background
4.2 Fluid Flow in Hydrothermal Reservoirs
4.3 Heat Transport in a Porous Medium
4.4 Flow Models for Petrothermal Reservoirs
4.5 Heat Transport in a Fissured Medium
M. Augustin • C. Blick • S. Eberle • W. Freeden • C. Gerhards • M. Ilyasov • M. Klug • S. Möhringer • H. Nutz • A. Punzi
Geomathematics Group, University of Kaiserslautern, Kaiserslautern, Germany
M. Bauer
CBM GmbH, Bexbach, Germany
R. Kahnt
G.E.O.S. Ingenieurgesellschaft mbH, Freiberg, Germany
T. Neu
Tiefe Geothermie Saar GmbH, Saarbrücken, Germany
I.M. née Ostermann
Fraunhofer ITWM, Kaiserslautern, Germany
© Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_22
5 Poroelastic Stress Field Modeling
6 Opportunities, Challenges, and Perspectives
References
Abstract
Modeling geothermal reservoirs is a key issue of a successful geothermal energy development. After over 40 years of study, many models have been proposed and applied to hundreds of sites worldwide. Nevertheless, with increasing computational capabilities, new efficient methods become available. The aim of this paper is to present recent progress on potential methods and seismic (post-)processing, as well as fluid and thermal flow simulations for porous and fractured subsurface systems. Commonly used procedures in industrial energy exploration and production such as forward modeling, seismic migration, and inversion methods together with continuum and discrete flow models for reservoir monitoring and management are explained, and some numerical examples are presented. The paper ends with the description of future fields of studies and points out opportunities, perspectives, and challenges.
1 Introduction
Temperature increases with depth in the Earth at an average of 25 °C/km. If the average surface temperature is assumed to be around 15 °C, the temperature at a depth of about 3 km is around 90 °C. All in all, in accordance with the standard geophysical estimates, the total heat content of the Earth (reckoned at an average surface temperature of 15 °C) is approximately $12.6 \times 10^{24}$ MJ, where the heat content of the crust amounts to $5.4 \times 10^{21}$ MJ. Nowadays, only a fraction of the heat content can be utilized, depending on geological conditions. Favorable are areas that transfer heat from deep zones to the surface. Spatial variations of thermal energy within the deep crust and mantle of the Earth give rise to concentrations of thermal energy near the surface of the Earth, such that certain locations can be used as an energy resource. Heat is transferred from the deeper parts of the Earth by conduction through rocks, by the movement of hot deep material towards the surface, particularly when associated with recent volcanism, and by circulation of water, e.g., in active fault zones. Figure 1 illustrates some scenarios of interest for geothermal exploitation. Much of the geothermal exploration occurring worldwide is focused on the geologic concept of plate tectonics since most of the current thermal activities are located near plate boundaries. The brittle and moving plates of the lithosphere (crust and upper mantle) are driven by convection of plastic rocks beneath the lithosphere (see Geothermal Energy Association 2011, for more details). Convection causes the crustal plates to break and move away in opposite directions from zones of upwelling hot material. Magma moving upward into a zone of separation carries substantial amounts of thermal energy with it. A particular source of elevated heat flow and volcanism are plumes and hotspots.
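The depth estimate quoted at the beginning of this section follows from a linear geotherm. As a minimal sketch (the function name and its default arguments are illustrative choices, the numerical values are the averages stated above), the arithmetic reads:

```python
def temperature_at_depth(depth_km, surface_temp_c=15.0, gradient_c_per_km=25.0):
    """Linear geotherm: surface temperature plus average gradient times depth."""
    return surface_temp_c + gradient_c_per_km * depth_km


# 15 degC at the surface and 25 degC/km give the value quoted for 3 km depth.
print(temperature_at_depth(3.0))  # 90.0
```

Such a linear law is only an average; the spatial variations discussed above are exactly what makes particular locations attractive.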
Fig. 1 Schematic representation of geothermally relevant zones (following Saemundsson 2009)
Several important geothermal systems are associated with recent volcanism caused by hotspots, e.g., geothermal fields of Iceland, Yellowstone, and the Azores. Areas of the world with high geothermal potential are shown in Fig. 2. Innovative techniques of exploration, geothermal drilling, electric power generation, heat pumps, etc. may open new frontiers in the near future and probably allow utilization of larger fractions of the untapped resources. In fact, the heat in place is not the problem. The question is how much can be taken out and at what price. It is our belief that a geoscientific consortium including geomathematics as a key technology can contribute significantly to find a more appropriate answer for all types of geothermal energy systems, i.e., deep as well as near-surface geothermal systems (see Fig. 3). Indeed, the complexity of the entire energy production chain makes modeling of geothermal reservoirs a difficult challenge involving different scientific disciplines. For the geothermal energy production system to be efficient, it is of essential importance not only to have a deep understanding of the geologic, thermal, and mechanical configuration of the potential site but also to be able to predict possible consequences of this invasive procedure, to satisfy mathematical requirements, and to execute large numerical computations. For this reason, geologists, engineers, physicists, and mathematicians must share their expertise and work together in order to provide an efficient solution for the ambitious task “geothermal energy”. The gained geothermal energy can be employed directly for the heat market or it can be used indirectly for electricity generation. Countries with active volcanism such as Iceland have a longstanding tradition of industrial use of geothermal energy
Fig. 2 Major tectonic plates of the Earth (top), areas of the Earth with potential for generating geothermal energy (bottom) (due to Hammons 2011)
gained by reservoirs with high enthalpy (>180 °C), which are easily accessible due to their shallow depths (cf. Georgsson and Friedleifsson 2009). Nevertheless, the absence of these sources does not exclude substantial geothermal potential in other regions or countries (for more details on the geothermal potential in Germany, see, e.g., Jung 2007; Schulz 2009) in the shape of deep reservoirs with low enthalpy (1 km), through mostly vertical fractures, to extract the heat from the rocks. (3) Sedimentary geothermal systems are probably the most common type worldwide. The systems in this category are characterized by medium-temperature, high-flow-rate geothermal reservoirs in large-basin sedimentary deposits. Geothermal systems can also be classified based on depth. One distinguishes between near-surface geothermal energy and deep geothermal energy. The former is mainly used for heating in private homes. The required facilities are, e.g., geothermal collectors or thermo-active pipes reaching down to a depth of at most 400 m. The use of deep geothermal energy, on the other hand, requires boreholes with a depth between 2 and 5 km. Basically, there are three different types of systems for deep geothermal power production (see Fig. 3): thermowells (deep heat exchangers), hydrothermal reservoirs, and petrothermal reservoirs. In a thermowell, the heat transfer medium circulates in a closed cycle within a U-pipe or a coaxial heat exchanger. Therefore,
Table 1 Classification of geothermal systems on the basis of temperature, enthalpy, and physical state (based on the work of Bödvarsson 1964, Axelsson and Gunnlaugsson 2000, and Sanyal 2005)

Classification by temperature:
Low-temperature (LT) systems with reservoir temperature at 1 km depth below 150 °C, often characterized by hot or boiling springs
Medium-temperature (MT) systems with reservoir temperature at 1 km depth between 150 and 200 °C
High-temperature (HT) systems with reservoir temperature at 1 km depth above 200 °C, characterized by fumaroles, steam vents, mud pools, and highly altered ground

Classification by enthalpy:
Low-enthalpy geothermal systems with reservoir fluid enthalpies less than 800 kJ/kg, corresponding to temperatures less than about 180 °C
High-enthalpy geothermal systems with reservoir fluid enthalpies greater than 800 kJ/kg

Classification by physical state:
Liquid-dominated geothermal reservoirs with water temperature at, or below, the boiling point at the prevailing pressure, such that the water phase controls the pressure in the reservoir. Some steam may be present
Two-phase geothermal reservoirs where steam and water co-exist and the temperature and pressure follow the boiling point curve
Vapor-dominated geothermal systems where temperature is at, or above, the boiling point at the prevailing pressure and the steam phase controls the pressure in the reservoir. Some liquid water may be present
only one borehole has to be drilled. Although this has the advantage of avoiding contact with the groundwater, the relatively low productivity shifts the focus to open systems, namely, hydrothermal and petrothermal reservoirs. The concept behind hydrothermal systems is to let thermal water found in deep reservoirs circulate between two or three drilled deep wells through a previously existing aquifer. Typically, these reservoirs consist of a porous medium layer heated from below by a hot stratum of impermeable material. By contrast, in petrothermal systems (also called Hot Dry Rock (HDR) Systems or Enhanced Geothermal Systems (EGS)), the water flows through fractured hot rock, the porosity of which can be enhanced by hydraulic stimulation. In the latter case, water is artificially pumped into the reservoir. In this work, we essentially focus on hydro- and petrothermal systems. The key issues that geomathematicians have to face are the detection of potential reservoirs along with the surrounding subsurface structure, including temperature, capacity, and hydraulic characteristics of aquifers, and the development of a comprehensive model describing the dynamics of the production process, in particular concerning flow, temperature, and composition of the fluid. Recent events such as earthquakes (see also Phillips et al. 2002) show that another fundamental mathematical problem is the understanding of inner stress conformation and dynamics. Geothermal exploration methods include a broad range of disciplines: geology, geophysics, geochemistry, geoengineering, and, of course, geomathematics.
Fig. 4 Column model characterizing deep geothermal systems implemented by the Geomathematics Group at the University of Kaiserslautern
Exploration involves not only identifying hot geothermal bodies but also cost-efficient regions to drill. A key technology is the detection of geophysical features, such as fault patterns and their accompanying fractures, karst occurrence, or water transfer rates in the area of interest in the deeper underground. This is done, for example, by migration of seismic data. Gravimetry studies use changes in the density to characterize subsurface properties. In particular, subsurface fault lines are identifiable with gravitational methods. Magnetotelluric data allow the detection of resistivity anomalies associated with geothermal structures, including faults, and enable the estimation of geothermal reservoir temperatures at various depths. Geomagnetism offers the possibility to detect at which depth the Curie temperature for certain materials is reached, hence providing valuable information on future plant productivity. Based on the knowledge gained by seismic modeling (location, orientation, and aperture of cracks), a mathematical description of the stress field prior to production can be provided. Due to the danger of fluid flow in a highly stressed system, it is also of crucial importance to understand the evolution of the stress during the production process. Another core issue is modeling the actual underground flow of water, which must take into account several aspects such as thermal flow, chemical flow, evolution of the pressure gradient, and possible consequences for the highly stressed rock conformation. This means solving coupled equations for elasticity as well as for two-, three-, or multiphase fluid, heat, and mass flow in porous or fractured media (see, e.g., Pruess 1990, for flow problems). All in all, from the viewpoint of geomathematics, we are required to model and simulate both the necessary (thermal, mechanical, hydraulic, etc.)
parameters and the processes occurring during exploration, construction, and operational phases by using faulty or incomplete (measuring) data.
In the Geomathematics Group at the University of Kaiserslautern, a column model was developed to handle deep geothermal reservoirs. It consists of the following four areas (columns): potential methods (gravitation/geomagnetics), seismic exploration, transport processes, and stress field (see Fig. 4), with various related publications, e.g., Freeden et al. (2003), Freeden and Schreiner (2006), Freeden and Wolf (2009), Freeden (2011), Freeden and Nutz (2011, 2014), Ilyasov (2011), Luchko and Punzi (2011), Ostermann (2011a,b), Gerhards (2011, 2012, 2014), Augustin (2012), Freeden and Blick (2013), Freeden and Gerhards (2013), Freeden and Gutting (2013), and Bauer et al. (2014a,b). In this work, we essentially follow the Kaiserslautern model. Accordingly, we first give a short description of geopotential methods. Then, we go on to seismic data retrieval, post-processing, and decorrelation of signatures. Different types of geothermal reservoirs will be discussed subsequently, followed by a description of the involved transport processes. The simulation of the stress field is investigated. Finally, we present an outlook on the future of the geothermal situation, i.e., opportunities, challenges, as well as perspectives.
2 Potential Methods
In order to minimize the geothermal exploration risk, it is advisable to consult potential methods first, using gravimeter and/or magnetometer data (in accordance with the first column of the Kaiserslautern model in Fig. 4).
2.1 Classical Gravimetry Problem
The inversion of Newton's Law of Gravitation, i.e., the determination of the integral density function from information of the external gravitational potential, is known as the gravimetry problem. To be more precise, let $\mathcal{B} \subset \mathbb{R}^3$ be (a region of) the Earth. We are interested in the density function $\rho: \mathcal{B} \to \mathbb{R}$, which we want to reconstruct from (information of) the gravitational potential $P$ in $\mathbb{R}^3 \setminus \mathcal{B}$. The gravitational potential is given by the integral

$$P(x) = T[\rho](x) = \int_{\mathcal{B}} \Phi(\nabla^2; \|x - y\|)\, \rho(y)\, dV(y), \tag{1}$$

with the volume element $dV(y)$ and the Laplace operator $\nabla^2$. Please note that we assume here that the gravitational constant can be set equal to 1. The kernel function, given by

$$\Phi(\nabla^2; \|x - y\|) = \frac{1}{4\pi \|x - y\|}, \quad x, y \in \mathbb{R}^3, \quad x \neq y, \tag{2}$$
is the fundamental solution to the Laplace operator in $\mathbb{R}^3$ (see, e.g., Freeden and Gerhards 2013). Here, $\|\cdot\|$ denotes the Euclidean norm of a vector in $\mathbb{R}^3$. It is well known that in the classification due to Hadamard, the gravimetry problem violates all criteria of well-posedness, viz., uniqueness, existence, and stability (for more details, the reader is referred to, e.g., the Habilitation thesis Michel (2002) as well as Freeden and Michel (2004) and Michel and Fokas (2008)). We merely summarize the essential results:

(1) The potential $P$ is harmonic in $\mathbb{R}^3 \setminus \mathcal{B}$. In accordance with the so-called Picard condition known in the theory of inverse problems (e.g., Engl et al. 1996), a solution exists only if $P$ belongs to (an appropriate subset in) the space of harmonic functions. However, it should be pointed out that this observation does not cause a numerical problem since in practice the information on $P$ is only finite dimensional. In particular, an approximation by an appropriate harmonic function is a canonical ingredient of any practical method.

(2) The most serious problem is the non-uniqueness of the solution: the associated Fredholm integral operator $T$ is of the first kind and has a kernel (null space) that is known to coincide with the $L^2(\mathcal{B})$-orthogonal complement of the closed linear subspace of all harmonic functions in $\mathcal{B}$ (see, e.g., Michel 2002; Freeden and Gerhards 2013). The orthogonal complement, i.e., the class of anharmonic functions on $\mathcal{B}$, is known to be infinite dimensional. In fact, the problem of non-uniqueness has been discussed extensively in the literature. This problem can be overcome by imposing some reasonable additional conditions on the density. A suitable condition, suggested by the mathematical structure of the Newton potential operator $T$, is to simply require that the density is harmonic. The approximate calculation of the harmonic density has already been implemented and covered in several papers, whereas the problem of determining the anharmonic part still seems to remain a great challenge. Due to the lack of an appropriate physical interpretation of the harmonic part of the density, various alternative variants have been discussed in the literature (see, e.g., Freeden and Michel 2004; Michel and Fokas 2008, and the references therein). In general, gravitational data yield significant information only about the uppermost part of the Earth's interior, which is not laterally homogeneous. This is the reason why it results in serious difficulties.

(3) Restricting the operator to harmonic densities leads to an injective mapping that has an unbounded inverse, implying an unstable solution.

Apart from the inversion of the Newton potential (see, e.g., Ernstson and Alt 2013, for an overview on the conventional methods in geothermal modeling), the decorrelation of features occurring in the density function $\rho: \mathcal{B} \to \mathbb{R}$ plays an important role in geothermal practice. Thus, we introduce a wavelet approach that supplies detail information that is useful for the interpretation of the different features. As we shall see, the resulting post-processing method as proposed here can either be implemented for the density function itself or, more troublesome, be based on gravitational data. That is because the Newton potential (1) is directly related to
the density via a differential equation in the interior of the Earth, viz., the Poisson equation, at least if Hölder continuity is assumed for $\rho: \mathcal{B} \to \mathbb{R}$ (see, e.g., Freeden and Gerhards 2013, for more details). We obtain

$$\rho(x) = -\nabla_x^2 P(x) = -\nabla_x^2 \int_{\mathcal{B}} \Phi(\nabla^2; \|x - y\|)\, \rho(y)\, dV(y), \quad x \in \mathcal{B}. \tag{3}$$
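The Newton potential (1) and the non-uniqueness of the gravimetry problem can be illustrated numerically for a ball. The following is a minimal sketch under stated assumptions: the body is the unit ball, the density is radially symmetric, and the function name `exterior_potential` as well as the Gauss-Legendre quadrature are illustrative choices rather than the method of this chapter. A homogeneous ball of radius $R$ and density $\rho_0$ has the exterior potential $\rho_0 R^3/(3\|x\|)$ for the kernel (2), whereas the density $1 - 5r^2/3$ has zero total mass on every family of concentric spheres weighted by harmonic functions (by the mean value property) and therefore generates no exterior potential at all.

```python
import numpy as np


def exterior_potential(rho, radius, d, n=40):
    """Potential (1) at distance d > radius from the center of a ball with
    radially symmetric density rho(r) and kernel 1 / (4 pi |x - y|).

    In spherical coordinates the azimuthal integral contributes 2 pi, so
    P = 0.5 * int_0^R int_{-1}^{1} rho(r) r^2 / sqrt(d^2 + r^2 - 2 d r u) du dr.
    """
    g, w = np.polynomial.legendre.leggauss(n)   # nodes/weights on [-1, 1]
    r = 0.5 * radius * (g + 1.0)                # radial nodes on [0, radius]
    wr = 0.5 * radius * w
    R, U = np.meshgrid(r, g, indexing="ij")     # U = cos(polar angle)
    dist = np.sqrt(d**2 + R**2 - 2.0 * d * R * U)
    return 0.5 * np.sum(wr[:, None] * w[None, :] * rho(R) * R**2 / dist)


d = 2.0
p_homogeneous = exterior_potential(lambda r: np.ones_like(r), 1.0, d)
p_anharmonic = exterior_potential(lambda r: 1.0 - 5.0 * r**2 / 3.0, 1.0, d)
print(p_homogeneous, 1.0 / (3.0 * d))   # quadrature vs. closed form
print(p_anharmonic)                      # vanishes up to rounding
```

Adding any multiple of the second density to the first leaves the exterior potential unchanged, which is the non-uniqueness described in item (2) above in its simplest form.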
Indeed, the Poisson differential equation (3) enables us to develop specific localizing scaling and wavelet functions for post-processing procedures from the density function itself and/or from gravitational data. Concerning the first variant, i.e., efficient post-processing of available density information, we use a regularization technique of the Newton potential (1) by approximating the fundamental solution $\Phi(\nabla^2; \|\cdot\|)$ by a (one-dimensional) linear Taylor expansion $\Phi_\tau(\nabla^2; \|\cdot\|)$ given by

$$\Phi_\tau(\nabla^2; \|x - y\|) = \begin{cases} \dfrac{1}{4\pi} \dfrac{1}{\|x - y\|}, & \|x - y\| \geq \tau, \\[2mm] \dfrac{1}{8\pi\tau} \left( 3 - \dfrac{1}{\tau^2} \|x - y\|^2 \right), & \|x - y\| < \tau. \end{cases} \tag{4}$$
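Two properties of the regularized kernel (4) are easy to check numerically: it matches $1/(4\pi r)$ continuously at $r = \tau$, and inside the ball of radius $\tau$ its negative Laplacian is the constant $3/(4\pi\tau^3)$. The sketch below is only an illustration; the name `phi_reg`, the value of `tau`, and the finite-difference step are assumptions made for this example.

```python
import numpy as np


def phi_reg(r, tau):
    """Regularized fundamental solution (4) as a function of r = |x - y|."""
    r = np.asarray(r, dtype=float)
    # np.maximum keeps the outer branch finite where it is not selected.
    outer = 1.0 / (4.0 * np.pi * np.maximum(r, tau))
    inner = (3.0 - (r / tau) ** 2) / (8.0 * np.pi * tau)
    return np.where(r >= tau, outer, inner)


tau = 0.1
# Continuity at r = tau: both branches give 1 / (4 pi tau).
print(phi_reg(tau, tau), phi_reg(tau * (1.0 - 1e-9), tau))

# Radial Laplacian  f'' + (2 / r) f'  by central differences inside the ball.
r0, h = 0.5 * tau, 1e-4
d1 = (phi_reg(r0 + h, tau) - phi_reg(r0 - h, tau)) / (2.0 * h)
d2 = (phi_reg(r0 + h, tau) - 2.0 * phi_reg(r0, tau) + phi_reg(r0 - h, tau)) / h**2
neg_laplacian = -(d2 + 2.0 * d1 / r0)
print(neg_laplacian, 3.0 / (4.0 * np.pi * tau**3))
```

The constant obtained in the second check is the height of a normalized indicator function of the ball of radius $\tau$, which is why smoothing with this kernel amounts to local averaging of the density.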
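It is readily seen that the regularization $P_\tau$ of the potential $P$ in (1) given by

$$P_\tau(x) = T_\tau[\rho](x) = \int_{\mathcal{B}} \Phi_\tau(\nabla^2; \|x - y\|)\, \rho(y)\, dV(y) \tag{5}$$

satisfies the asymptotic relation

$$\sup_{x \in \mathcal{B}} |P(x) - P_\tau(x)| = O(\tau^2), \quad \tau \to 0, \tag{6}$$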
i.e., $P_\tau$ approximates $P$ with order $\tau^2$. $\{\Phi_\tau\}_{\tau > 0}$ is called the family of (scale continuous) scaling functions, while $\{W_\tau\}_{\tau > 0}$ given via the (scale continuous) differential expression

$$W_\tau = -\tau \frac{d}{d\tau} \Phi_\tau \tag{7}$$

is called the family of (scale continuous) wavelet functions. We omitted the reference to the Laplace operator here and will do so further on. The transition from scale continuity to scale discretization (see Freeden and Blick 2013; Freeden and Gerhards 2013; Freeden and Schreiner 2009) is provided by a positive, monotonically decreasing sequence $\{\tau_j\}_{j \in \mathbb{N}}$ of scale parameters such that $\lim_{j \to \infty} \tau_j = 0$. In fact, it is not difficult to see that the family of (scale discrete) wavelet functions $\{\Psi_{\tau_j}\}_{j \in \mathbb{N}}$ defined by

$$\Psi_{\tau_j} = \int_{\tau_{j+1}}^{\tau_j} W_\tau\, \frac{1}{\tau}\, d\tau \tag{8}$$

satisfies the (scale discrete) difference equation

$$\Psi_{\tau_j} = -\int_{\tau_{j+1}}^{\tau_j} \tau \frac{d}{d\tau} \Phi_\tau\, \frac{1}{\tau}\, d\tau = \Phi_{\tau_{j+1}} - \Phi_{\tau_j}, \quad j \in \mathbb{N}. \tag{9}$$
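Obviously, we have

$$P_{\tau_J}(x) = \int_{\mathcal{B}} \Phi_{\tau_j}(\|x - y\|)\, \rho(y)\, dV(y) + \sum_{l=j}^{J-1} \int_{\mathcal{B}} \Psi_{\tau_l}(\|x - y\|)\, \rho(y)\, dV(y), \tag{10}$$

$j, J \in \mathbb{N}$, $J > j$, with

$$\lim_{J \to \infty} \sup_{x \in \mathcal{B}} |P(x) - P_{\tau_J}(x)| = 0. \tag{11}$$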
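The decomposition (10) rests on the telescoping identity $\Phi_{\tau_j} + \sum_{l=j}^{J-1} \Psi_{\tau_l} = \Phi_{\tau_J}$ for the kernels, with each $\Psi_{\tau_l} = \Phi_{\tau_{l+1}} - \Phi_{\tau_l}$ vanishing for $r \geq \tau_l$. A minimal numerical sketch follows; the dyadic scale sequence $\tau_j = 2^{-j}$ and the helper names `phi_reg` and `psi` are assumptions made for this illustration, not prescribed by the text.

```python
import numpy as np


def phi_reg(r, tau):
    """Regularized fundamental solution (4) as a function of r = |x - y|."""
    r = np.asarray(r, dtype=float)
    outer = 1.0 / (4.0 * np.pi * np.maximum(r, tau))
    inner = (3.0 - (r / tau) ** 2) / (8.0 * np.pi * tau)
    return np.where(r >= tau, outer, inner)


def psi(r, tau_coarse, tau_fine):
    """Scale discrete wavelet (9): fine scaling function minus coarse one."""
    return phi_reg(r, tau_fine) - phi_reg(r, tau_coarse)


taus = [2.0 ** (-j) for j in range(6)]   # hypothetical dyadic scales
r = np.linspace(0.0, 2.0, 401)

# Telescoping sum behind (10): coarse scaling function plus wavelet details.
recon = phi_reg(r, taus[0]) + sum(psi(r, taus[l], taus[l + 1]) for l in range(5))
print(np.max(np.abs(recon - phi_reg(r, taus[5]))))

# Compact support: the wavelet vanishes outside the ball of the coarser radius.
outside = r[r >= taus[1]]
print(np.max(np.abs(psi(outside, taus[1], taus[2]))))
```

Because each wavelet is supported in a ball of radius $\tau_l$, the detail integrals in (10) only involve density values close to the evaluation point.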
Of course, this convergence result is also valid for (5) if we regard $\tau \to 0$. It should be noted that all wavelet functions defined by (9) have a compact support, which means that the evaluation of the integrals can be restricted economically to the support of the functions. Moreover, we are able to introduce low-pass filters $L_{\tau_j}$ and band-pass filters $B_{\tau_j}$ for the density in the form

$$\rho_{\tau_j}(x) = L_{\tau_j}[\rho](x) = -\int_{\mathcal{B}} \nabla_x^2 \Phi_{\tau_j}(\|x - y\|)\, \rho(y)\, dV(y) \tag{12}$$

and

$$B_{\tau_j}[\rho](x) = -\int_{\mathcal{B}} \nabla_x^2 \Psi_{\tau_j}(\|x - y\|)\, \rho(y)\, dV(y). \tag{13}$$
In connection with the differential equation (see, e.g., Freeden and Gerhards 2013)

$$-\nabla_x^2 \Phi_{\tau_j}(\|x - y\|) = H_{\tau_j}(\|x - y\|), \tag{14}$$

we indeed arrive at the Haar scaling function $H_{\tau_j}(\|\cdot\|)$ given by

$$H_{\tau_j}(\|x - y\|) = \begin{cases} 0, & \|x - y\| \geq \tau_j, \\[1mm] \dfrac{3}{4\pi \tau_j^3}, & \|x - y\| < \tau_j, \end{cases} \tag{15}$$

and the Haar wavelet function

$$K_{\tau_j}(\|x - y\|) = H_{\tau_{j+1}}(\|x - y\|) - H_{\tau_j}(\|x - y\|), \tag{16}$$

which are well known in constructive approximation. Higher-order Taylor approximations of the fundamental solution of the Laplace operator and the resulting smoothed Haar kernels are considered in the PhD thesis Möhringer (2014). If $\rho$ is assumed to be Hölder continuous on $\mathcal{B}$, the low-pass filter converges in the sense that

$$\lim_{J \to \infty} L_{\tau_J}[\rho](x) = \rho(x), \quad x \in \mathcal{B}, \tag{17}$$
where LJ Œ .x/ is expressed using a sum of band-pass filters Z LJ Œ .x/ D
Hj .kx yk/.y/ dV .y/ C
J 1 Z X
Kl .kx yk/.y/ dV .y/;
lDj B
B
(18) j; J 2 N; J > j: This is equivalent to the (more general) limit relation known from the theory of singular integrals in Euclidean space R3 Z HJ .kx yk/ .y/ dV .y/ D .x/;
lim
J !1
x 2 R3 :
(19)
B
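The Haar kernels (15) and (16) and the telescoping structure behind (18) can be sketched numerically. The following is a minimal illustration, not the authors' implementation: the dyadic choice $\tau_j = 2^{-j}$ and the names `tau`, `haar_scaling`, and `haar_wavelet` are assumptions made for this example.

```python
import numpy as np

def tau(j):
    # Assumed dyadic scale sequence tau_j = 2**(-j); the text only requires a
    # positive, monotonically decreasing sequence tending to zero.
    return 2.0 ** (-j)

def haar_scaling(r, t):
    # Haar scaling function (15): 3 / (4 pi t^3) inside the ball of radius t,
    # zero outside.
    r = np.asarray(r, dtype=float)
    return np.where(r < t, 3.0 / (4.0 * np.pi * t**3), 0.0)

def haar_wavelet(r, j):
    # Haar wavelet (16): difference of two consecutive scaling functions.
    return haar_scaling(r, tau(j + 1)) - haar_scaling(r, tau(j))

r = np.linspace(0.0, 1.5, 301)
j0, J = 1, 5

# Telescoping sum behind (18): H_{tau_j0} + sum_{l=j0}^{J-1} K_{tau_l} = H_{tau_J}.
reconstructed = haar_scaling(r, tau(j0)) + sum(haar_wavelet(r, l) for l in range(j0, J))
telescoping_ok = np.allclose(reconstructed, haar_scaling(r, tau(J)))

# Each H_tau has unit mass: kernel value times the volume of the ball of radius tau.
unit_mass = float(haar_scaling(0.0, tau(3))) * (4.0 / 3.0) * np.pi * tau(3) ** 3
```

The unit mass together with the shrinking support is what makes the kernels $H_{\tau_J}$ act as an approximate identity in (19).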
Some properties of the multiscale reconstruction of the density via (18) are demonstrated here with regard to the BP density model (see Fig. 5). This model was constructed from the 2004 BP density benchmark during a research workshop co-sponsored by the EAGE and the SEG, with the aim of properly validating and testing not only a velocity model but also a corresponding density model. As a three-dimensional density model was not available, we extended the two-dimensional BP model by arranging several copies of the data set consecutively along the third orthogonal axis.

Fig. 5 Density contours of the BP model in $\mathrm{g/cm^3}$ (due to Billette and Brandsberg-Dahl 2005)

Figure 6 shows the decomposition of the BP density model based on property (17) of the low-pass filters by displaying the scales $j = 1$ to $j = 7$, where the scale parameter $\tau_j$ is adapted to the extent of the model. The detail information provided by band-pass filtering at the scales $j = 3, 4$ sharply shows all essential density boundaries.

Concerning the second variant, i.e., efficient multiscale processing of density information from gravitational data input (cf. Freeden and Gerhards 2013), we come back to (5) and the limit relation (11), from which it is clear that the Newton potential can be approximated by
\[
P(x) \simeq P_{\tau_J}(x) = \int_B \Phi_{\tau_J}(\|x-y\|)\,\rho(y)\,dV(y), \qquad x \in B,\ J \in \mathbb{N}, \tag{20}
\]
for sufficiently large $J$. An approximate integration formula over $B$ then leads to
\[
P(x) \simeq \sum_{i=1}^{N_J} \Phi_{\tau_J}(\|x - y_i^{N_J}\|)\, w_i^{N_J} \rho(y_i^{N_J}), \tag{21}
\]
where $w_i^{N_J}$, $y_i^{N_J}$, $i = 1, \dots, N_J$, are prescribed weights and nodes, respectively. Within the inversion process of density modeling, the unknown coefficients
\[
a_i^{N_J} = w_i^{N_J} \rho(y_i^{N_J}), \qquad i = 1, \dots, N_J, \tag{22}
\]
must be determined by solving a linear system of the type
\[
P(x_k^{M_J}) = \sum_{i=1}^{N_J} \Phi_{\tau_J}(\|x_k^{M_J} - y_i^{N_J}\|)\, a_i^{N_J}, \qquad k = 1, \dots, M_J, \tag{23}
\]
from known gravitational values $P(x_k^{M_J})$ at the nodes $x_k^{M_J} \in B$, $k = 1, \dots, M_J$. Since the integration weights $w_i^{N_J}$ are known, the density values $\rho(y_i^{N_J})$ are immediately obtainable via (22) such that the density can be determined by (cf. (19))
\[
\rho(x) \simeq \rho_{\tau_J}(x) = \sum_{i=1}^{N_J} H_{\tau_J}(\|x - y_i^{N_J}\|)\, w_i^{N_J} \rho(y_i^{N_J}), \qquad x \in B. \tag{24}
\]
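The inversion steps (21) to (23) can be sketched as a small collocation problem. This is only an illustration under stated assumptions: the regularized kernel `phi_reg` uses a quadratic replacement of $1/(4\pi r)$ inside the ball of radius $\tau$ as a hypothetical stand-in for the regularization referred to in (4), the data points coincide with the integration nodes ($M_J = N_J$), and the weights are those of a simple midpoint rule.

```python
import numpy as np

def phi_reg(r, t):
    # Assumed regularization of 1/(4 pi r): quadratic replacement inside the
    # ball of radius t, unchanged outside (hypothetical stand-in for (4)).
    r = np.asarray(r, dtype=float)
    inner = (3.0 - (r / t) ** 2) / (8.0 * np.pi * t)
    outer = 1.0 / (4.0 * np.pi * np.maximum(r, t))
    return np.where(r < t, inner, outer)

# Nodes y_i on a small regular grid in the unit cube, equal weights w_i = h^3.
n = 3
h = 1.0 / n
axis = (np.arange(n) + 0.5) * h
nodes = np.array([[a, b, c] for a in axis for b in axis for c in axis])
weights = np.full(len(nodes), h**3)

# Synthetic density, used only to generate consistent test data.
rho_true = 2.0 + 0.5 * nodes[:, 2]

t_J = 0.5 * h  # scale parameter tau_J (assumed value)
dist = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=2)
A = phi_reg(dist, t_J)  # matrix of (23) with x_k = y_k

P_obs = A @ (weights * rho_true)  # potential values according to (21)
a, *_ = np.linalg.lstsq(A, P_obs, rcond=None)  # coefficients a_i of (22), (23)
rho_nodes = a / weights  # density values at the nodes via (22)
residual = np.max(np.abs(A @ a - P_obs))
```

How well `rho_nodes` reproduces `rho_true` depends on the conditioning of the matrix `A`; with noisy data, a regularized solver would be the more natural choice than plain least squares.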
Fig. 6 Multiscale decomposition of the (3D-extended) BP density model from scale $j = 1$ to $j = 7$ (low-pass filtered density, left; band-pass filtered density, right) in $\mathrm{g/cm^3}$
2.2
Spline Method for Data Supplementation of Gravitational Measurement Information
In geothermal practice, we are often confronted with the problem of completing gravitational information from discrete spaceborne, airborne, and/or terrestrial observations as well as internal borehole data. Our purpose is to explain a spline method within the framework of Newton's volume potential. The point of departure is the well-known Poisson formula (see, e.g., Freeden and Gerhards 2013)
\[
P(x) = -\nabla_x^2 \int_B \Phi(\|x-y\|)\, P(y)\,dV(y)
     = -\nabla_x^2 \int_B \underbrace{\int_B \Phi(\|x-y\|)\,\Phi(\|y-z\|)\,dV(y)}_{K_H(x,z)} \,\rho(z)\,dV(z), \tag{25}
\]
where $K_H(\cdot,\cdot)$ is the uniquely determined reproducing kernel of a Hilbert space $H$ (of Sobolev type). The inner product $(\cdot,\cdot)_H$ in $H$ is given by
\[
\begin{aligned}
(P_1, P_2)_H
&= \left( \int_B \Phi(\|\cdot - y\|)\,\rho_1(y)\,dV(y),\ \int_B \Phi(\|\cdot - y\|)\,\rho_2(y)\,dV(y) \right)_H \\
&= \int_B \left[ \left( -\nabla_x^2 \int_B \Phi(\|x-y\|)\,\rho_1(y)\,dV(y) \right) \left( -\nabla_x^2 \int_B \Phi(\|x-y\|)\,\rho_2(y)\,dV(y) \right) \right] dV(x) \\
&= \int_B \rho_1(x)\,\rho_2(x)\,dV(x) \\
&= (\rho_1, \rho_2)_{L^2(B)}.
\end{aligned} \tag{26}
\]
In other words, the inner product $(\cdot,\cdot)_H$ may be interpreted as a transition from the $L^2$-product $(\cdot,\cdot)_{L^2(B)}$ for density functions $\rho_1, \rho_2 \in L^2(B)$ to the level of Newton volume potentials $P_1, P_2 \in H$. For each $x \in \mathbb{R}^3$ and every Newton volume potential $P \in H$, the reproducing property
\[
(K_H(x,\cdot), P)_H = P(x) \tag{27}
\]
holds true. This property implies that there exists a (spline) function $S \in H$ satisfying the minimum norm interpolation relation
\[
\|S\|_H = \min_{P \in \mathcal{I}} \|P\|_H, \tag{28}
\]
where $\mathcal{I}$ is the class of Newton potentials in $H$ consistent with the interpolation conditions
\[
L_i S = L_i P = \alpha_i, \qquad i = 1, \dots, N. \tag{29}
\]
Here, $L_i$, $i = 1, 2, \dots, N$, denote bounded linear functionals on $H$ characterizing measurable gravitational observables (e.g., potential values, gravity anomalies), whereas $\alpha_i$, $i = 1, 2, \dots, N$, are the known measured quantities. The interpolation spline $S \in H$ then takes the form
\[
S(x) = \sum_{i=1}^{N} a_i L_i K_H(x,\cdot), \qquad x \in \mathbb{R}^3. \tag{30}
\]
The coefficients $a_i$, $i = 1, 2, \dots, N$, are obtainable via the linear system
\[
\sum_{j=1}^{N} a_j L_i L_j K_H(\cdot,\cdot) = \alpha_i, \qquad i = 1, \dots, N. \tag{31}
\]
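The structure of (30) and (31) can be illustrated with a short sketch. It rests on two assumptions that are not taken from the text: the functionals $L_i$ are point evaluations, $L_i P = P(x_i)$, and a Gaussian serves as a hypothetical stand-in for the reproducing kernel $K_H$, chosen only because it is symmetric and positive definite. The actual kernel is the one defined in (25).

```python
import numpy as np

def kernel(x, z, width=0.5):
    # Hypothetical stand-in for the reproducing kernel K_H(x, z).
    d2 = np.sum((np.asarray(x) - np.asarray(z)) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * width**2))

# Observation points x_i and measured values alpha_i (synthetic data).
points = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [1.0, 1.0, 1.0]])
alpha = np.array([1.0, 0.5, -0.2, 0.3, 0.8])

# Matrix (L_i L_j K_H) of (31); for point evaluations this is K_H(x_i, x_j).
gram = kernel(points[:, None, :], points[None, :, :])
coeff = np.linalg.solve(gram, alpha)  # coefficients a_i

def spline(x):
    # Interpolating spline (30): S(x) = sum_i a_i K_H(x, x_i).
    return float(np.dot(coeff, kernel(np.asarray(x), points)))

fitted = np.array([spline(p) for p in points])
```

By construction, `fitted` reproduces `alpha` at the observation points. For other functionals (e.g., derivatives), only the entries of `gram` and the basis functions in `spline` change.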
In case of erroneous data, we can proceed from spline interpolation to spline smoothing (cf. Freeden 1981, 1999). As indicated by (4), the function $\Phi(\|x-y\|)$ may be regularized. Consequently, the reproducing kernel
\[
K_H(x,z) = \int_B \Phi(\|x-y\|)\,\Phi(\|y-z\|)\,dV(y) \tag{32}
\]
may be replaced by the regularized kernel
\[
K_H^{\tau_J}(x,z) = \int_B \Phi_{\tau_J}(\|x-y\|)\,\Phi_{\tau_J}(\|y-z\|)\,dV(y) \tag{33}
\]
for a sufficiently large integer $J$ such that $\tau_J$ is a sufficiently small element of a positive, monotonically decreasing sequence $\{\tau_j\}_{j \in \mathbb{N}}$ as introduced above. The important difference is that the right-hand side of (32) is an improper integral in $\mathbb{R}^3$ for all $x, z \in B$, whereas the right-hand side of (33) is a regular integral (thus allowing the application of unbounded functionals $L_i$, e.g., oblique derivatives). Therefore, if the values $L_i P = \alpha_i$, $i = 1, 2, \dots, N$, are known, we are able to derive an expression in terms of the regularized kernels
\[
S(x) \simeq S^{\tau_J}(x) = \sum_{i=1}^{N} a_i L_i K_H^{\tau_J}(x,\cdot)
= \sum_{i=1}^{N} a_i L_i \int_B \Phi_{\tau_J}(\|x-y\|)\,\Phi_{\tau_J}(\|y-\cdot\|)\,dV(y), \qquad x \in B, \tag{34}
\]
as well as, by applying the operator $-\nabla_x^2$ to both sides of (34), the corresponding density function (cf. Eq. (14))
\[
\rho(x) \simeq \rho^{\tau_J}(x) = \sum_{i=1}^{N} a_i \int_B H_{\tau_J}(\|x-y\|)\, L_i \Phi_{\tau_J}(\|y-\cdot\|)\,dV(y), \qquad x \in B, \tag{35}
\]
for sufficiently large $J$. The representations (34) and (35) hold independently of the location of the measurement points (i.e., whether we consider inner, terrestrial, or outer data). Furthermore, (34) and (35) allow a decorrelation of the signal in a similar manner as achieved by the multiscale representation in Sect. 2.1. Some numerical results for the spline method are shown in Fig. 8 for a section of the BP density model as shown in Fig. 7. For more details, the reader is referred to the PhD thesis Möhringer (2014).

Fig. 7 Section of the BP density model chosen for the implementation of the spline interpolation in $\mathrm{g/cm^3}$
Fig. 8 Spline interpolation of the density in $\mathrm{g/cm^3}$ in a local region for different scales $j$: (a) $j = 6$, (b) $j = 7$, (c) $j = 8$, (d) $j = 9$
2.3
Gravitational Signatures of Hotspots/Mantle Plumes
Because their temperature differs from that of their vicinity, the locations of hotspots/plumes play a particular role in geothermal applications, e.g., for locating high-temperature water at moderate depths. Here we essentially follow the theory developed in the monograph Freeden and Gerhards (2013) on potential theory to model gravitational anomalies caused by hotspots/plumes with respect to their horizontal/vertical spatial extensions inside the Earth.

Nowadays, the concept of mantle plumes is widely accepted in the geoscientific community. Mantle plumes are understood to be approximately cylindrical, concentrated upflows of abnormally hot rock within the Earth's mantle, with a typical diameter of about 100–200 km. As the heads of mantle plumes can partly melt when they reach shallow depths, they are thought to be the cause of volcanic centers known as hotspots. Hotspots were first explained by Wilson (1963) as long-term sources of volcanism that are fixed, with a tectonic plate overriding them. Following Morgan (1971), characteristic surface signatures of hotspots are due to the rise and melting of hot plumes from deep areas in the mantle. Special cases occur as chains of volcanic edifices whose age progresses with increasing distance to the plume, like the islands of Hawaii. They result from pressure-release melting close to the bottom of the lithosphere, which produces magma rising to the surface, and from plate motion relative to the plume.

The term "hotspot" is used rather loosely. It is often applied to any long-lived volcanic center that is not part of the global network of mid-ocean ridges and island arcs; Hawaii serves as a classical example. Anomalous regions of thick crust on ocean ridges, like Iceland, are also considered to be hotspots.

The multiscale reconstruction/decomposition proposed by Freeden (1999), Freeden and Schreiner (2006), Freeden and Wolf (2009), Freeden et al. (2009), and Freeden and Gerhards (2013) for modeling gravity anomalies and/or vertical deflections caused by hotspots/plumes essentially consists of two ingredients: First, terrestrial gravity anomalies and/or deflections of the vertical are taken as gravitational input data. Second, the essential tools for signal recovery are locally supported Haar wavelets and their smoothed versions (see, e.g., Freeden et al. 1998). Clearly, the size of the local support depends on the scale of the wavelet, i.e., with increasing scale its diameter decreases. This is the reason why the wavelet concept allows a "zooming-in" process to local (high-frequency) phenomena. It turns out that the application of (smoothed) Haar wavelets provides a powerful approximation technique for the investigation of, e.g., local fine-structured hotspots/plume features.
The included illustrations (taken from the PhD theses Fehlinger (2009) and Wolf (2009); analogous computations in geomagnetics can be found in Gerhards (2011, 2012)) show that the presented multiscale procedure allows a scale- and space-dependent characterization of geophysically relevant phenomena. The wavelet coefficients can be interpreted as spatial measures of certain frequency bands contained in the signal signatures. Thereby, the wavelet theory offers a physically relevant approach for detecting and decorrelating hotspots/plume features. A critical point for numerical computation involving Haar wavelets is that today only terrestrial gravitational data sets of limited spatial extent are available. This causes numerical instabilities (oscillations) at the boundaries of the test areas under consideration. Consequently, future work faces a twofold challenge: to combine globally given satellite data with locally available terrestrial data in order to achieve a higher accuracy within the modeling process, and to avoid artificial effects, such as Gibbs phenomena, by embedding the local test area in a larger satellite-generated regional framework. The hotspots/plume test areas selected here for multiscale demonstration are the Hawaiian Islands and Iceland.

Hawaii: Ritter and Christensen (2007) believe that a stationary mantle plume located beneath the Hawaiian Islands created the Hawaii-Emperor seamount chain while the oceanic lithosphere continuously passed over it (Fig. 9). The Hawaii-Emperor chain consists of about 100 volcanic islands, atolls, and seamounts that spread nearly 6000 km from the active volcanic island of Hawaii to the 75–80 Ma old Emperor seamounts near the Aleutian trench. Moving further southeast along the island chain, the geological age decreases. The interesting area is the relatively young southeastern part of the chain, situated on the Hawaiian swell, a 1200 km broad anomalously shallow region of the ocean floor, extending from the island of Hawaii to the Midway atoll. Here, a distinct gravity disturbance and a geoid anomaly occur which have their maxima around the youngest island. Both coincide with the maximum topography and both decrease in northwestern direction. The progressive decrease in terms of the geological age is believed to result from the continuous motion of the underlying plate (cf. Wilson 1963; Morgan 1971).

Fig. 9 Interpretation of seismic tomography results by Ritter and Christensen (2007) (modified version by Fehlinger 2009)

Using seismic tomography, several features of the Hawaiian mantle plume have been identified (cf. Ritter and Christensen 2007, and the references specified therein). They reveal a low-velocity zone (LVZ) beneath the lithosphere, starting at a depth of about 130–140 km, below the central part of the island of Hawaii. So far, plumes have just been identified as low seismic velocity anomalies in the upper mantle and the transition zone by the use of seismic wave tomography, which is a fairly new achievement. Because plumes are relatively thin in lateral direction, they are hard to detect in global tomography models. Hence, despite novel advances, there is still no general agreement on the fundamental questions concerning mantle plumes, like their depth of origin, their morphology, and their longevity, and even their existence is still discussed controversially. This is due to the fact that many geophysical as well as geochemical observations can be explained by different plume models and even by models that do not include plumes at all (e.g., Foulger et al. 2005).

With our space-localized multiscale method of deriving gravitational signatures (see Freeden et al. 2009), more precisely the disturbing potential, from the deflections of the vertical, we add a new component in specifying essential features of plumes. From the band-pass filtered detail approximation (Fig. 10), we are able to conclude that the Hawaii plume has an oblique layer structure. As can be seen at the lower scales, which reflect the greater depths, the strongest signal is located in the ocean to the west of Hawaii. With increasing scale, i.e., closer to the surface, it moves more and more toward the Big Island of Hawaii, i.e., in eastward direction.

Fig. 10 Band-pass filtered details of the disturbing potential in $\mathrm{m^2/s^2}$ from gravity anomalies in the region of Hawaii using discrete (smoothed) Haar wavelets for $j = 9, 11, 13$ (left) and $j = 10, 12, 14$ (right)

Iceland: The plume beneath Iceland (cf. Freeden et al. 2009) is a typical example of a ridge-centered mantle plume. An interaction between the North Atlantic ridge and the mantle plume is believed to be the reason for the existence of Iceland, resulting in melt production and crust generation since the continental break-up in the late Paleocene and early Eocene. Nevertheless, there is still no agreement on the location of the plume before rifting started in the East. Controversial discussions as to whether it was located under central or eastern Greenland about 62–64 Ma ago are still in progress (cf. Schubert et al. 2001, and the references therein). Iceland itself represents the top of a nearly circular rise of topography, with a maximum of about 2.8 km above the surrounding seafloor south of the glacier "Vatnajökull." Beneath this glacier, several active volcanoes are located, which are supposed to result from the mantle plume. The surrounding oceanic crust consists of three different types involving a crust thickness that is more than three times that of average oceanic crust.

Fig. 11 Band-pass filtered details of the disturbing potential in $\mathrm{m^2/s^2}$ from gravity anomalies in the region of Iceland using Haar wavelets for $j = 10, 12, 14$ (left) and $j = 11, 13, 15$ (right)

Seismic tomography provides evidence of the existence of a mantle plume beneath Iceland, resulting in low-velocity zones in the upper mantle and the transition zone, but hints of anomalies in the deeper mantle also seem to exist. The low-velocity anomalies have been detected at depths ranging from at least 400 km up to about 150 km. Above 150 km, ambiguous seismic velocity structures were obtained involving regions of low velocities covered by regions of high seismic velocities. For a deeper insight into the theory of the Iceland plume, the interested reader is referred to Ritter and Christensen (2007) and the references therein.

From our multiscale reconstruction, it can be derived that the deeper parts of the mantle plume are located in the northern part of Iceland (compare the lower scales in Fig. 11), while shallower parts are located further south (compare the higher scales in Fig. 11). It is remarkable that from scale 13 on, the plume seems to divide into two sectors. As the North American plate moves westward and the Eurasian plate eastward, new crust is generated on both sides of the Mid-Atlantic Ridge. In the case of Iceland, which lies on the Mid-Atlantic Ridge, the neovolcanic zones are readily seen in Fig. 12.

Fig. 12 Band-pass filtered details of the disturbing potential in $\mathrm{m^2/s^2}$ from gravity anomalies in the region of Iceland using Haar wavelets for $j = 14$ (left) and $j = 15$ (right) including the Mid-Atlantic Ridge (gray)

In Iceland, electricity production from geothermal power plants has developed rapidly (see Georgsson and Friedleifsson 2009; Saemundsson 2009). Reflecting the geological situation (see Fig. 13), Iceland is a unique country with regard to utilization of geothermal energy, with more than 50 % of its primary energy consumption coming from geothermal power plants.
Fig. 13 Geothermal power plants in Iceland (following the International Energy Agency 2010)

All in all, by the space-based multiscale techniques initiated by Freeden and Schreiner (2006) in gravitation and by Freeden and Gerhards (2010) in geomagnetics, we are able to come to interpretable results involving hotspots/mantle plumes. In particular, for Iceland, we are led to the conclusion (cf. Fig. 12) that Iceland is split into three areas with characteristic ages of the basaltic rock. Tertiary flood basalt fills up most of the northwestern area. This formation of lava must be of considerable thickness, probably more than 3 km. Quaternary flood basalt is revealed in the southwest and southeast. These rocks are cut by the neovolcanic areas of active rifting. Indeed, this area covers almost one-third of Iceland.

However, it should be mentioned that our multiscale method offers better results for specifying hotspots/mantle plumes with respect to their horizontal size than with respect to their vertical size. A more detailed study of their depth remains a challenge for future work. One possible remedy is the artificial positioning of buried mass points in order to study the behavior of the multiscale representation for known depths of the anomaly. First attempts in this direction have been made in the PhD thesis Fehlinger (2009) and in Gerhards (2014). Indeed, this technique allows us to estimate the locations of upwelling gravity anomalies indicating a higher temperature in these regions.
2.4
Gravito-Magneto Combined Inversion
Just as the density $\rho$ of a section of the underground is coupled to its gravitational effect, its magnetization $m$ results in a magnetic field that can be related to a magnetic potential $M$ (see Fig. 14). However, while the density $\rho$ is scalar, the magnetization $m$ is a vector-valued quantity.
Fig. 14 Principle of gravimetry (left) and its geomagnetic counterpart (right) (modified illustration based on Jacobs and Meyer 1992, with permission by B. G. Teubner)
Similarly to the inversion problem in the gravitational case via the Newton integral (1), major difficulties are faced in determining the magnetization $m$ from the magnetic potential $M$ given by the integral expression
\[
M(x) = \frac{1}{4\pi} \int_B m(y) \cdot \nabla_y \frac{1}{\|x-y\|}\,dV(y). \tag{36}
\]
Most of the known inversion techniques (see, e.g., Blakely 1996; Freeden and Gerhards 2013; Menke 1984, and the references therein) for this problem replace the integral by a suitable (finite) sum and subsequently compute a suitable solution of the resulting linear system. Usually, gravity and magnetic inversion are handled separately (see, e.g., Ernstson and Alt 2013; Turcotte and Schubert 2001) in order to obtain density and magnetization independently from one another. At the same time, Poisson's relation (see Blakely 1996) shows that for each subset $K$ of a body $B \subset \mathbb{R}^3$ with uniform magnetization (i.e., $m(y) = m$, $y \in K$) as well as uniform density (i.e., $\rho(y) = \rho$, $y \in K$), we have
\[
\begin{aligned}
M(x) &= \frac{1}{4\pi}\, m \cdot \int_K \nabla_y \frac{1}{\|x-y\|}\,dV(y)
      = -\frac{1}{4\pi}\, m \cdot \int_K \nabla_x \frac{1}{\|x-y\|}\,dV(y) \\
     &= -\frac{1}{4\pi}\, \frac{m}{\rho} \cdot \int_K \rho(y)\, \nabla_x \frac{1}{\|x-y\|}\,dV(y)
      = -\frac{m}{\rho} \cdot \nabla_x P(x)
\end{aligned} \tag{37}
\]
for all $x \in \mathbb{R}^3 \setminus K$ and the gravitational potential $P$ from (1). In other words, if a body has a uniform magnetization and density, then the magnetic potential is proportional to the gravitational field component in the direction of the magnetization. These facts allow for a gravito-magneto combined inversion, where, e.g., the Euler summation formula for the Laplace operator $\nabla^2$ and the boundary condition of periodicity (see Freeden 2011) turns out to be an advantageous tool (see Augustin et al. 2012, for more details). Multiscale methods similar to those described in the previous sections can be used to quantify deviations from uniformity and, thus, indicate gravitational/magnetic anomalies.
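Poisson's relation (37) can be checked numerically for a simple body. The sketch below is an illustration only and makes several assumptions: the body $K$ is a unit cube discretized by a midpoint rule, the potential is normalized as $P(x) = \frac{1}{4\pi}\int_K \rho(y)/\|x-y\|\,dV(y)$ (the normalization of (1) is assumed here), and the gradient is approximated by central finite differences. The values of `rho` and `m` are arbitrary.

```python
import numpy as np

# Midpoint-rule discretization of the cube K = [-0.5, 0.5]^3 (assumed test body).
n = 6
h = 1.0 / n
axis = -0.5 + (np.arange(n) + 0.5) * h
cells = np.array([[a, b, c] for a in axis for b in axis for c in axis])
vol = h**3

rho = 2.7                       # uniform density (arbitrary value)
m = np.array([0.3, -0.1, 0.9])  # uniform magnetization (arbitrary value)

def grav_potential(x):
    # P(x) = 1/(4 pi) * int_K rho / ||x - y|| dV(y), midpoint rule.
    r = np.linalg.norm(x - cells, axis=1)
    return rho * vol * np.sum(1.0 / r) / (4.0 * np.pi)

def mag_potential(x):
    # M(x) from (36), using grad_y (1 / ||x - y||) = (x - y) / ||x - y||^3.
    d = x - cells
    r = np.linalg.norm(d, axis=1)
    return vol * np.sum((d @ m) / r**3) / (4.0 * np.pi)

def grad(f, x, step=1e-5):
    # Central finite differences for the gradient with respect to x.
    g = np.zeros(3)
    for i in range(3):
        e = np.zeros(3)
        e[i] = step
        g[i] = (f(x + e) - f(x - e)) / (2.0 * step)
    return g

x_obs = np.array([2.0, 1.0, 1.5])  # observation point outside K
lhs = mag_potential(x_obs)
rhs = -np.dot(m / rho, grad(grav_potential, x_obs))
```

Both sides use the same quadrature, so `lhs` and `rhs` agree up to the finite-difference error. For a body with non-uniform density or magnetization, the difference between the two quantities would no longer vanish.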
3
Seismic Processing
Following the Kaiserslautern model (cf. Fig. 4), we next deal with the explanation of seismic post-processing procedures based on multiscale techniques obtained by regularization of singular integrals occurring in seismic reflection tomography.
3.1
Seismic Recording and Data Retrieval
For seismic recording, an energy source (vibroseis, air gun, etc.) is placed on the surface. While the energy source generates a wave impulse, a set of receivers (geophones, hydrophones) placed along one or many parallel lines record this impulse after it is transmitted through the Earth’s interior, reflected at places of impedance contrasts (rapid changes of density/velocity) and transmitted back to the surface. Then, the configuration is moved into the direction of seismic acquisition and the procedure is repeated (see Fig. 15), so that each underground point is represented from all incidence angles needed for further data analysis. Other strategies to retrieve seismic data can be found, e.g., in Yilmaz (1987), Biondi (2006), and Claerbout (2009).
Fig. 15 Seismic acquisition (modified illustration taken from Jacobs and Meyer 1992, with permission by B. G. Teubner)
In the context of seismic imaging, it is usually assumed that shear stresses generated by the wave impulse and other kinds of damping can be neglected. As a consequence, wave propagation is treated as an acoustic phenomenon. A derivation of the acoustic wave equation in seismic imaging can be found, e.g., in Freeden (2015). It is given by
\[
\frac{\partial^2}{\partial t^2}\, p(x,t) = \frac{K(x)}{\rho(x)}\, \nabla_x^2\, p(x,t) + \frac{\partial^2}{\partial t^2}\, S(x,t). \tag{38}
\]
Here, $p(x,t)$ is the pressure, $K(x)$ is the bulk modulus, $\rho(x)$ is the density, and $S(x,t)$ contains pressure sources and sinks. Equation (38) implies that the propagation speed of a wave, i.e., the Euclidean norm of the velocity vector, is given by
\[
c(x) = \sqrt{\frac{K(x)}{\rho(x)}} \tag{39}
\]
and the wave equation can be written as
\[
\left( \nabla_x^2 - \frac{1}{c^2(x)} \frac{\partial^2}{\partial t^2} \right) p(x,t) = -\frac{1}{c^2(x)} \frac{\partial^2}{\partial t^2}\, S(x,t). \tag{40}
\]
A solution scheme for (40) can be found by applying the Fourier transform with respect to time. With the assumption that $\frac{\partial^2}{\partial t^2} S(x,t) = 0$ for all $x$ and the definition
\[
U(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} p(x,t)\, e^{i\omega t}\,dt \tag{41}
\]
we get
\[
\left( \nabla_x^2 + \frac{\omega^2}{c^2(x)} \right) U(x) = 0. \tag{42}
\]
This leads to the definition of the wave number $k(x)$ and the refraction index $N(x)$ by
\[
N(x) = \frac{c_0}{c(x)}, \tag{43}
\]
\[
k(x) = \frac{\omega}{c(x)} = \frac{\omega}{c_0}\, \frac{c_0}{c(x)} = k_0 N(x), \tag{44}
\]
with $c_0$ being a suitable constant reference velocity (see Biondi 2006; Engl et al. 1996; Snieder 2002, and the references therein). Accordingly, the wave equation (42) can be written as
\[
\left( \nabla_x^2 + k_0^2 N^2(x) \right) U(x) = 0. \tag{45}
\]
The region where $N(x) \neq 1$ represents the scattering object such that $N(x) - 1$ may be supposed to have compact support. Another standard assumption is that the difference between $c(x)$ and $c_0$ is small. As a consequence, $N^2(x)$ may be expanded into a Taylor series up to order one around a center $x_0$ such that $c(x_0) = c_0$. This yields
\[
N^2(x) = 1 + \varepsilon \nu(x) \tag{46}
\]
with the small perturbation parameter $\varepsilon$ and consequently
\[
k^2(x) = k_0^2 N^2(x) = k_0^2 \left( 1 + \varepsilon \nu(x) \right). \tag{47}
\]
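The quantities introduced in (39), (43), (44), and (46) are related by elementary arithmetic, which the following sketch makes explicit. The material values (roughly those of water and a slightly denser inclusion) and the frequency of 30 Hz are assumptions chosen for illustration; the function names are hypothetical.

```python
import math

def wave_speed(bulk_modulus, density):
    # Propagation speed (39): c = sqrt(K / rho).
    return math.sqrt(bulk_modulus / density)

def refraction_index(c, c0):
    # Refraction index (43): N = c0 / c.
    return c0 / c

def wave_number(omega, c):
    # Wave number (44): k = omega / c.
    return omega / c

# Illustrative values (assumed): K in Pa, rho in kg/m^3.
c0 = wave_speed(2.2e9, 1000.0)  # reference velocity in m/s
c = wave_speed(2.2e9, 1020.0)   # local velocity in m/s
omega = 2.0 * math.pi * 30.0    # angular frequency for 30 Hz

N = refraction_index(c, c0)
k0 = wave_number(omega, c0)
k = wave_number(omega, c)
perturbation = N**2 - 1.0       # the small term N^2(x) - 1 appearing in (46)
```

With these values, a density increase of 2 % at fixed bulk modulus gives $N^2 - 1 = 0.02$, which is the order of magnitude for which the first-order treatment in (46) and (47) is intended.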
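With the same argument as before, the unknown function $\nu(x)$ may be supposed to have compact support. The wave operator may be split into
\[
\begin{aligned}
A_x &= \nabla_x^2 + \frac{\omega^2}{c^2(x)} = \nabla_x^2 + k_0^2 N^2(x) = \nabla_x^2 + k_0^2 \left( 1 + \varepsilon \nu(x) \right) \\
    &= \nabla_x^2 + k_0^2 + \varepsilon k_0^2 \nu(x) = A_x^{(0)} + \varepsilon A_x^{(1)}
\end{aligned} \tag{48}
\]
with $A_x^{(0)} = \nabla_x^2 + k_0^2$ and $A_x^{(1)} = k_0^2 \nu(x)$. Moreover, the wave field $U$ may be split into an incident wave field $U_I$, corresponding to the wave propagating in the absence of the scatterer, and the scattered wave field $U_S$ such that
\[
U = U_I + U_S. \tag{49}
\]
This leads to
\[
A^{(0)} U_I = \left( \nabla_x^2 + k_0^2 \right) U_I = 0, \tag{50}
\]
\[
A^{(0)} U_S = \left( \nabla_x^2 + k_0^2 \right) U_S = -\varepsilon k_0^2 \nu \left( U_I + U_S \right) = -\varepsilon k_0^2 \nu\, U = -\varepsilon A^{(1)} U. \tag{51}
\]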
An image of the subsurface structure corresponding to some given parameter like the wave propagation velocity or the underground density is produced by corresponding methods of seismic migration. By migration of the seismic data, the seismogram (amplitudes) recorded in time is shifted to its "true" depth position (see Figs. 15 and 16), so that the shape, depth, and reflection coefficients of different structures can be reconstructed (for more details, see, e.g., Yilmaz 1987, Claerbout 2009, and the references therein). All migration methods use an approximate velocity model obtained by means of a "migration velocity analysis" (e.g., tomography, full wave inversion, etc.) in the computation process (for more details, the reader is referred, e.g., to Biondi 2006, and the references therein). In addition, migration methods can be recursively applied in order to refine the given velocity model. Here, the migration is repeated with a velocity differing from the initial model by a small perturbation in a local area. In the end, the velocity model with the "best" reflector image is chosen as the final model.

Fig. 16 Coherence between seismic experiments and migration (from Ilyasov 2011)

The migration methods known today are all based on some approximation of the wave equation or, more generally, of the elastodynamic equation and can roughly be divided into the following groups:
• Ray-based methods, which usually model the high-frequency asymptotic solution (see Bleistein et al. 2000) in terms of Gaussian beams (e.g., Popov 1982; Semtchenok et al. 2009), or Kirchhoff migration based on the solution of the eikonal equation (e.g., Buske 1994; Podvin and Lecomte 1991; Vidale 1988)
• Depth continuation methods, which are usually based on the one-way wave equation and compute wave fields from one depth level to the next (e.g., Claerbout 2009; Deng and McMechan 2007; Xie and Wu 2006)
• Reverse time migration, which is based on the full wave equation and follows the recorded seismogram backward in time until the starting time is reached (e.g., Baysal et al. 1983; Bording and Liner 1994; Haney et al. 2005; Symes 2007)
The numerical realizations of all aforementioned methods can be classified according to Yilmaz (1987) into three broad categories:
• Algorithms based on integral solutions to the acoustic and/or elastic wave equation (e.g., Bleistein 1987; Nolet 2008; Semtchenok et al. 2009; Symes 2003; Xie and Wu 2006)
• Algorithms based on finite-difference solutions (e.g., Baysal et al. 1984; Du and Bancroft 2004; Jia and Hu 2006; Renaut and Fröhlich 1996)
• Algorithms based on frequency/wave number implementations (e.g., Bollhöfer et al. 2008; Bonomi and Pieroni 1998; Takenaka et al. 1999)

In order to achieve better accuracy and resolution in the resulting image, modern migration methods can combine any number of strategies. This results in algorithms which, for example, compute an initial approximation with the finite-difference approximation of the full wave equation (Wu et al. 2006) and additionally apply the depth continuation method based on the space-frequency implementation.
3.2
Seismic Post-processing in a Multiscale Framework by Means of Surface-Layer Potentials
Our aim now is the decomposition of a signal $F$ (for example, a seismogram or a migration result of a seismogram) into multiscale components using space-localized Helmholtz wavelets (as proposed by Freeden et al. 2003) associated with a given wave number $k_0$. As essential tools, limit and jump relations are used pointwise in the way known from mathematical physics for Helmholtz surface-layer potentials. Our approach to post-processing is then based on a multiscale technique developed by Freeden and Mayer (2003). The starting points for the following considerations are Eqs. (50) and (51). Let $S$ be the surface of $B$. As the fundamental solution to the Helmholtz operator $\nabla^2 + k_0^2$ is known to be (Müller 1998)
\[
\Phi(\nabla^2 + k_0^2; \|x-y\|) = \frac{1}{4\pi}\, \frac{e^{i k_0 \|x-y\|}}{\|x-y\|}, \qquad x \neq y, \tag{52}
\]
$U_I$ and $U_S$ can be represented by the potentials
\[
U_I(x) = \int_S F(y)\, \nabla_y \Phi(\nabla^2 + k_0^2; \|x-y\|) \cdot n(y)\,dS(y), \tag{53}
\]
\[
U_S(x) = \int_K \varepsilon k_0^2\, \nu(y)\, \Phi(\nabla^2 + k_0^2; \|x-y\|)\, U(y)\,dV(y), \tag{54}
\]
Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives
1577
with the surface element $dS$, the unit normal vector $n(y)$, a suitable, sufficiently differentiable function $F$, and $K = \operatorname{supp}(\gamma)$ being the support of $\gamma$. Here, Eq. (53) gives $U_I$ as a double-layer potential, whereas (54) gives $U_S$ as a volume potential. In the following, we will only consider (53). The double-layer potential can be used to represent solutions to the Helmholtz equation on some bounded domain $B$ or the exterior domain $\mathbb{R}^3 \setminus B$, respectively (cf. Freeden et al. 2003, and the references therein). Starting from this, we assume that $S$ is piecewise locally twice continuously differentiable and $F$ is continuous. As the double-layer potential is singular, the first step consists in regularizing it. For this purpose, we introduce a regularization of the fundamental solution $\Phi(\nabla^2 + k_0^2; \|x-y\|)$ as (see Freeden et al. 2003, and the references therein)
\[ \Phi(\nabla^2 + k_0^2; \|(x+\tau n(x)) - (y+\sigma n(y))\|) = \frac{1}{4\pi}\, \frac{e^{i k_0 \|(x+\tau n(x)) - (y+\sigma n(y))\|}}{\|(x+\tau n(x)) - (y+\sigma n(y))\|} \tag{55} \]
with $\tau \neq \sigma$, $|\tau|$, $|\sigma|$ small. A regularization of the kernel $\nabla_y \Phi(\nabla^2 + k_0^2; \|x-y\|) \cdot n(y)$ is then given by
\[ \nabla_y \Phi(\nabla^2 + k_0^2; \|(x+\tau n(x)) - (y+\sigma n(y))\|) \cdot n(y) \Big|_{\sigma=0} = \frac{1}{4\pi}\, \frac{n(y) \cdot (x+\tau n(x)-y)\, e^{i k_0 \|x+\tau n(x)-y\|}}{\|x+\tau n(x)-y\|^2} \left( -i k_0 + \frac{1}{\|x+\tau n(x)-y\|} \right). \tag{56} \]
The operator of the double-layer potential on $S$ for values on the shifted surface $S(\tau)$ is given as
\[ P_{|2}(\tau, 0; k_0)[F](x) = \int_S F(y)\, \nabla_y \Phi(\nabla^2 + k_0^2; \|(x+\tau n(x)) - (y+\sigma n(y))\|) \cdot n(y) \Big|_{\sigma=0}\, dS(y). \tag{57} \]
In a more general approach on regularized surface-layer potentials for the Helmholtz equation, Freeden et al. (2003) prove limit and jump relations. For our purposes here, we need the following limit relation (given by Freeden et al. (2003) for a slightly different $\Phi(\nabla^2 + k_0^2; \|(x+\tau n(x)) - (y+\sigma n(y))\|)$):
\[ \lim_{\substack{\tau \to 0 \\ \tau > 0}} \Big( P_{|2}(\tau, 0; k_0)[F](x) - P_{|2}(0, 0; k_0)[F](x) \Big) = \frac{1}{2} F(x). \tag{58} \]
This leads to the definition of the continuous Helmholtz scaling function
1578
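M. Augustin et al.
\[ \Phi_\tau(x,y) = \frac{1}{2\pi} \left[ \frac{n(y) \cdot (x+\tau n(x)-y)\, e^{i k_0 \|x+\tau n(x)-y\|}}{\|x+\tau n(x)-y\|^2} \left( -i k_0 + \frac{1}{\|x+\tau n(x)-y\|} \right) - \frac{n(y) \cdot (x-y)\, e^{i k_0 \|x-y\|}}{\|x-y\|^2} \left( -i k_0 + \frac{1}{\|x-y\|} \right) \right] \tag{59} \]
such that
\[ \lim_{\substack{\tau \to 0 \\ \tau > 0}} \int_S F(y)\, \Phi_\tau(x,y)\, dS(y) = F(x), \tag{60} \]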
as shown in Freeden et al. (2003). We omitted the reference to the Helmholtz operator here and will do so further on. Once again, in numerical applications, instead of the continuously varying parameter $\tau$, a positive, monotonically decreasing sequence $\{\tau_j\}_{j \in \mathbb{N}}$ with $\lim_{j \to \infty} \tau_j = 0$ is chosen (of which, of course, only finitely many elements are used). Thus, we obtain the associated sequence $\{\Phi_{\tau_j}\}_{j \in \mathbb{N}}$ of scaling functions $\Phi_{\tau_j}$ of scale $j$. Moreover, we introduce a sequence $\{\Psi_{\tau_j}\}_{j \in \mathbb{N}}$ of (difference) wavelet functions $\Psi_{\tau_j}$ of scale $j$ given by (cf. (9))
\[ \Psi_{\tau_j} = \Phi_{\tau_{j+1}} - \Phi_{\tau_j}. \tag{61} \]
An approximation of $U_I$ at scale $J_0 \in \mathbb{N}$ is given by the low-pass filter
\[ L_{J_0}[F](x) = \int_S \Phi_{\tau_{J_0}}(x,y)\, F(y)\, dS(y), \tag{62} \]
and two low-pass filters to the scales $J$ and $J_0$ with $J > J_0$ are related by
\[ L_J[F](x) = L_{J_0}[F](x) + \sum_{j=J_0}^{J-1} \int_S \Psi_{\tau_j}(x,y)\, F(y)\, dS(y). \tag{63} \]
This leads to the introduction of the band-pass filter to scale $j$ as
\[ B_j[F](x) = \int_S \Psi_{\tau_j}(x,y)\, F(y)\, dS(y). \tag{64} \]
It is evident from Eq. (63) that the band-pass filtered function of the scale j contains all detail information that is included in the low-pass filtered function of scale j C 1 but not in the low-pass filtered function of scale j . This allows a “zooming-in process” on details of different scales which are decorrelated.
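The telescoping relation (63) can be checked numerically. The following Python sketch is only an illustration under simplifying assumptions, not the implementation of Freeden et al. (2003) or Ilyasov (2011): the patch is assumed to be flat with constant normal, so that $n(y) \cdot (x-y) = 0$ and only the shifted term of (59) survives; a midpoint rule on one fixed grid is used for all scales; and the dyadic choice $\tau_j = 2^{-j}$ as well as the names `scaling_kernel`, `low_pass`, `band_pass`, and `midpoint_grid` are hypothetical.

```python
import numpy as np

def scaling_kernel(x, nodes, tau, k0):
    """Scaling function (59) for a flat patch in the plane x3 = 0 (assumed geometry)."""
    # On a plane, n(y).(x - y) = 0 and n(y).(x + tau*n(x) - y) = tau.
    r = np.sqrt(np.sum((nodes - x) ** 2, axis=1) + tau ** 2)
    return tau * np.exp(1j * k0 * r) / (2.0 * np.pi * r ** 2) * (1.0 / r - 1j * k0)

def low_pass(x, nodes, weights, f_vals, tau, k0):
    """Quadrature approximation of the low-pass filter (62)."""
    return np.sum(weights * f_vals * scaling_kernel(x, nodes, tau, k0))

def band_pass(x, nodes, weights, f_vals, tau_j, tau_next, k0):
    """Quadrature approximation of the band-pass filter (64) with the wavelet (61)."""
    psi = scaling_kernel(x, nodes, tau_next, k0) - scaling_kernel(x, nodes, tau_j, k0)
    return np.sum(weights * f_vals * psi)

def midpoint_grid(half_width, n):
    """Midpoint-rule nodes and weights on the square [-half_width, half_width]^2."""
    h = 2.0 * half_width / n
    c = -half_width + h * (np.arange(n) + 0.5)
    xx, yy = np.meshgrid(c, c)
    nodes = np.column_stack([xx.ravel(), yy.ravel()])
    return nodes, np.full(nodes.shape[0], h * h)

nodes, weights = midpoint_grid(5.0, 200)
f_vals = np.cos(nodes[:, 0]) * np.exp(-0.1 * nodes[:, 1] ** 2)  # synthetic signal F
x = np.array([0.3, -0.2])                # evaluation point on the patch
k0 = 2.0                                 # illustrative wave number
taus = [2.0 ** (-j) for j in range(5)]   # tau_0 > tau_1 > ... > tau_4

L_coarse = low_pass(x, nodes, weights, f_vals, taus[0], k0)
L_fine = low_pass(x, nodes, weights, f_vals, taus[-1], k0)
details = [band_pass(x, nodes, weights, f_vals, taus[j], taus[j + 1], k0)
           for j in range(4)]

# Difference between the fine low-pass value and coarse value plus all details
print(abs(L_fine - (L_coarse + sum(details))))
```

Because the filters are linear in the kernel, the printed difference is zero up to rounding, which is exactly the statement of (63). The detail values in `details` are the quantities that would be inspected scale by scale in the "zooming-in process".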
In the context of seismic imaging, we suppose that $S$ consists of finitely many patches $P$ such that a velocity model is available on every patch $P$. The values of the signal $F$ shall be discretely available on every $P$ in sufficient data density. Instead of integrating over $S$ as a whole, every patch $P$ is treated on its own. Of course, a numerical solution scheme uses some kind of summation formula to approximate the surface integral, such that
\[ L_j[F](x) = \int_P \Phi_{\tau_j}(x,y)\, F(y)\, dS(y) \simeq \sum_{i=1}^{N_j} a_i^{N_j}\, \Phi_{\tau_j}\big(x, y_i^{N_j}\big) \tag{65} \]
with the coefficients
\[ a^{N_j} \in \mathbb{R}^{N_j}, \quad a^{N_j} = \big(a_1^{N_j}, \ldots, a_{N_j}^{N_j}\big)^T, \quad j = J_0, \ldots, J, \tag{66} \]
given by
\[ a_i^{N_j} = w_i^{N_j}\, F\big(y_i^{N_j}\big), \quad i = 1, \ldots, N_j. \tag{67} \]
The band-pass filter is then given by
\[ B_j[F](x) = \int_P \Psi_{\tau_j}(x,y)\, F(y)\, dS(y) \simeq \sum_{i=1}^{N_j} a_i^{N_j}\, \Psi_{\tau_j}\big(x, y_i^{N_j}\big). \tag{68} \]
In Freeden et al. (2003), a relation between the coefficient vectors $a^{N_j}$ for different scales $j = J_0, \ldots, J$ is shown. This allows the use of a tree algorithm or pyramid scheme in the following way:
1. For a sufficiently large integer $J$, calculate the coefficient vector to the scale $J$ via
\[ a_i^{N_J} = w_i^{N_J}\, F\big(y_i^{N_J}\big), \quad i = 1, \ldots, N_J, \tag{69} \]
with known weights $w_i^{N_J}$ and known values $F\big(y_i^{N_J}\big)$.
2. For $j = J-1, \ldots, J_0$, calculate the coefficients $a_i^{N_j}$ recursively from the coefficients of the previous, finer scale.
The procedure outlined above was used in the PhD thesis Ilyasov (2011) as a method for seismic post-processing for the “Marmousi” model due to Martin et al. (2002) (see Fig. 17), where P is a rectangle. It should be noted that Ilyasov (2011) used surface decorrelation of seismic data involving jump and limit relations in Helmholtz potential theory, while Freeden and Blick (2013) applied volume decorrelation techniques based on Newton-like integrals.
Figure 18 shows an example of a migration result for the “Marmousi” model. Figure 19 illustrates a corresponding smoothed velocity field for this rectangle, which gives location-dependent information about the wave number $k$, with $f = \omega/(2\pi)$ denoting the frequency. Since the applied Helmholtz wavelets show a strong spatial localization for high scales, they reflect the position-dependent wave number $k$ in close approximation. Figure 20 visualizes the decomposition of the migration result $F$ (from Fig. 18) regarding the velocity model for the “Marmousi” model by means of Helmholtz scaling and wavelet functions using the tree algorithm. The numerical calculations are carried out first with a certain wave number associated to a velocity of 2 km/s and a frequency of $f = 20$ Hz. High-frequency information – for example, “high-amplitude reflectors” – can be seen in the band-pass filtered details $B_9[F]$, $B_8[F]$, and $B_7[F]$. Of special interest is the fact that the shadow in the upper left corner of the low-pass filtered data (left side of Fig. 20) is not present in the band-pass filtered data (right side of Fig. 20). Hence, structures that are hard to recognize in the former become obvious in the latter. In view of the lithological situation (provided by Fig. 17), it is not difficult to recognize faults and salt domes in their spatial location. Figure 21 provides a wavelet decomposition for a frequency of $f = 50$ Hz. The details here show a finer structure. Comparison with Fig. 20 shows that the change of frequency yields different structures in the band-pass filtered data. The salt domes that were prominent for $f = 20$ Hz have little influence for $f = 50$ Hz. Instead, the highest amplitudes appear in the shale structures in the lower left corner above the salt dome. Additional numerical calculations and illustrations can be found in the PhD thesis Ilyasov (2011). The particular objectives of the interpretation of seismic signals are, besides the fault pattern, the detection of special facies and karst systems with the intention to specify certain water levels in backfills and disturbances. Seismic post-processing via frequency-dependent Helmholtz wavelets provides useful additional information to manage this task, as particular structures in the deeper underground become obvious by the decorrelation capacity of this method.

Fig. 17 Lithology for the “Marmousi” model (following Martin et al. 2002)

Fig. 18 Migration result $F$ of the “Marmousi” model on the surface patch $P$ (axes in m)

Fig. 19 Smoothed velocity model of the “Marmousi” model (axes in m)
4 Fluid and Heat Flow Models
The problem of fluid flow in a geothermal reservoir is very complex for several reasons. First of all, the data regarding geometry and composition of the domain are usually lacking or incomplete. Moreover, the direct feedback from the physical situation needed to compare theoretical results with practical measurements is difficult to obtain. Furthermore, the large number of parameters and coupled processes that have to be taken into account make the development of a comprehensive model a very hard task.
Fig. 20 Multiscale decomposition of the migration result $F$ in Fig. 18 realized by the described tree algorithm for the frequency $f = 20$ Hz (left: low-pass filtered data $L_{\tau_8}[F]$, $L_{\tau_9}[F]$, $L_{\tau_{10}}[F]$; right: band-pass filtered data $B_{\tau_7}[F]$, $B_{\tau_8}[F]$, $B_{\tau_9}[F]$; axes in m)

Fig. 21 Multiscale decomposition of the migration result $F$ in Fig. 18 realized by the described tree algorithm for the frequency $f = 50$ Hz (left: low-pass filtered data $L_{\tau_8}[F]$, $L_{\tau_9}[F]$, $L_{\tau_{10}}[F]$; right: band-pass filtered data $B_{\tau_7}[F]$, $B_{\tau_8}[F]$, $B_{\tau_9}[F]$; axes in m)
According to the Kaiserslautern model (Fig. 4), the three main phenomena that have to be modeled are fluid flow, thermal flow, and chemical flow. Chemicals can dissolve and precipitate during fluid circulation eventually reducing the permeability of the medium and influencing the flow (see Cheng and Yeh 1998; Durst and Vuataz 2000; Jing et al. 2002; Kühn 2009; Kühn and Stöfen 2005, and the references therein), but the contribution of this process to thermal flow is limited; therefore, most of the coupled models just neglect it. Depending on the physical conditions of the reservoir, the problems arising during modeling fluid and heat flow differ.
4.1 Basic Physical Background
This section attempts a general and uniform theoretical description of the underlying equations for the modeling of physical processes within a geothermal reservoir, derived from microscopic considerations which assume a multiphase decomposition of a porous medium and the contained fluids. The theoretical basis for this formulation was developed in a number of publications in the 1970s and 1980s which are summarized, e.g., in Diersch (1985). The overview which we give here is based on a simplified version of this approach due to Diersch (2000). We consider a three-dimensional, bounded, regular region $B$ and a (scalar) quantity $\psi$ which is transported through this medium. It is convenient to define some kind of density $\omega$ such that $\psi = \rho\omega$ with the mass density $\rho$. The sum of the temporal change of $\psi$ inside the domain $B$ and the inflow and outflow of $\psi$ caused by the flow (vector) $j$ through the boundary $\partial B$ is equal to the amount of this quantity generated by sources and sinks $Q$ within $B$. This means
\[ \frac{d}{dt} \int_B \rho(x,t)\, \omega(x,t)\, dV(x) + \int_{\partial B} j(x,t) \cdot n(x)\, dS(x) = \int_B Q(x,t)\, dV(x), \tag{70} \]
where we remind the reader that $\frac{d}{dt}$ denotes the (total) derivative w.r.t. time $t$, $n$ the outer (unit) normal w.r.t. $\partial B$, $dV$ the volume element in $\mathbb{R}^3$, and $dS$ the surface element in $\mathbb{R}^3$. If we use the (vectorial) velocity $v(x,t) = \frac{\partial x}{\partial t}$, Eq. (70) can be rewritten in the following way:
\[ \int_B \left( \frac{\partial(\rho\omega)}{\partial t}(x,t) + \nabla_x \cdot (\rho\omega v)(x,t) \right) dV(x) + \int_B \nabla_x \cdot j(x,t)\, dV(x) = \int_B Q(x,t)\, dV(x). \tag{71} \]
From the integral representation (71), we can conclude the differential equation
\[ \frac{\partial(\rho\omega)}{\partial t}(x,t) + \nabla_x \cdot (\rho\omega v)(x,t) + \nabla_x \cdot j(x,t) = Q(x,t). \tag{72} \]
We will omit dependencies on $x$ and $t$ further on for the sake of readability. In its above form, the differential equation is only valid with regard to microscopic quantities which can usually not be measured. To derive equations for macroscopic quantities, spatial averaging over a representative elementary volume and further carefully applied simplifications are necessary. For a system consisting of fluid and solid phases, such as a liquid in a porous medium, this procedure was carried out, e.g., by Diersch (1985) and yields in a simplified form (cf. Diersch 2000)
\[ \frac{\partial(\varepsilon\rho\omega)}{\partial t} + \nabla \cdot (\varepsilon\rho\omega v) + \nabla \cdot (\varepsilon j) = (\varepsilon Q) + j^{\mathrm{inter}}, \tag{73} \]
with the volume fraction $\varepsilon$ of the fluid. We will assume here that the porous medium is completely saturated with the fluid, such that $\varepsilon$ is equal to the porosity $\phi$. In case of a free one-component fluid, we have $\varepsilon = 1$. Moreover, Eq. (73) contains the additional interaction term
\[ j^{\mathrm{inter}} = \frac{1}{\delta S} \int_{S^{\mathrm{inter}}} \big( j + \rho\omega\, (v - w) \big) \cdot n^{\mathrm{inter}}\, dS \tag{74} \]
at phase interfaces. Here, $\delta S$ is the surface area of the interface, $v$ the velocity with which $\psi$ is transported, and $w$ the interface velocity. $j^{\mathrm{inter}}$ is a scalar and vanishes for a free one-component fluid. From Eq. (73), we can derive balance equations for fluid mass, species mass of chemical materials, fluid momentum, and energy. It is also possible to derive equations for the mechanical stress field of the porous medium by considering balance of mass and balance of momentum for the whole system of solid and fluid components, but such a derivation is out of the scope of this article. The interested reader is referred to, e.g., Diersch (1985). All of these equations are strongly related to each other and decouple only if further assumptions on the negligibility of coupling terms are valid. Considering $\psi = \rho_f$, the density of the fluid, Eq. (73) leads to conservation of mass which can be written as
\[ \frac{\partial(\phi\rho_f)}{\partial t} + \nabla \cdot (\phi\rho_f v_f) = \phi\rho_f Q_{p,f}, \tag{75} \]
with $v_f$ being the fluid velocity. All mass flows over interfaces, including interfaces between phases, and all other mass sink and source terms are summarized in the right-hand side term $\phi\rho_f Q_{p,f}$. To derive the species mass balance equation, we have to replace the mass density $\rho_f$ by the concentration of the species $C$. In this case, the mass flux is governed by Fick’s law
\[ j_C = -\mathbf{D} \cdot (\nabla C), \tag{76} \]
with the tensor of hydrodynamic dispersion $\mathbf{D}$. This tensor can be written as
\[ \mathbf{D} = D_d \mathbf{I} + \mathbf{D}_m, \tag{77} \]
with a diffusion part $D_d \mathbf{I}$ and a part $\mathbf{D}_m$, which is needed to include mechanical dispersion. Considering porous media, the latter one can be expressed by the Scheidegger-Bear dispersion law as
\[ (\mathbf{D}_m)_{ij} = \beta_T \|v_f\|\, \delta_{ij} + (\beta_L - \beta_T)\, \frac{v_{f,i}\, v_{f,j}}{\|v_f\|} \tag{78} \]
with $\beta_T$, $\beta_L$ being the transversal and longitudinal dispersivity, respectively. Additionally, the sink-source term on the right-hand side has to include a term $\vartheta C$ to incorporate chemical reactions with the decay rate $\vartheta$, and as we are dealing with porous media, sorption effects have to be considered. By combining all these effects, we obtain
\[ \frac{\partial(\phi R_d C)}{\partial t} + \nabla \cdot (C v_f) - \nabla \cdot \big( \mathbf{D} \cdot (\nabla C) \big) + \phi R \vartheta C = Q_c \tag{79} \]
as the balance equation for the concentration of chemical components with the so-called retardation relations
\[ R = 1 + \frac{1-\phi}{\phi}\, \varsigma(C), \tag{80} \]
\[ R_d = 1 + \frac{1-\phi}{\phi}\, \frac{d}{dC} \big( \varsigma(C)\, C \big). \tag{81} \]
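The structure of the dispersion tensor in (77) and (78) can be made concrete with a short Python sketch. It is an illustration only; the function names `mechanical_dispersion` and `hydrodynamic_dispersion` and all numerical values are assumptions, and the zero-velocity branch is a convention chosen here, not something stated in the text.

```python
import numpy as np

def mechanical_dispersion(v_f, beta_l, beta_t):
    """Scheidegger-Bear tensor (78) for a fluid velocity vector v_f."""
    v_f = np.asarray(v_f, dtype=float)
    speed = np.linalg.norm(v_f)
    if speed == 0.0:
        # Convention assumed here: no flow, no mechanical dispersion.
        return np.zeros((v_f.size, v_f.size))
    return (beta_t * speed * np.eye(v_f.size)
            + (beta_l - beta_t) * np.outer(v_f, v_f) / speed)

def hydrodynamic_dispersion(v_f, d_d, beta_l, beta_t):
    """Total tensor (77): isotropic diffusion part plus mechanical dispersion."""
    v_f = np.asarray(v_f, dtype=float)
    return d_d * np.eye(v_f.size) + mechanical_dispersion(v_f, beta_l, beta_t)

v = np.array([3.0e-6, 4.0e-6, 0.0])   # illustrative velocity in m/s
w = np.array([4.0e-6, -3.0e-6, 0.0])  # a direction perpendicular to v
D_m = mechanical_dispersion(v, beta_l=5.0, beta_t=0.5)

# Along the flow the tensor acts with beta_L * ||v||, across it with beta_T * ||v||.
print(D_m @ v, D_m @ w)
```

The two printed vectors show the anisotropy encoded in (78): the flow direction is an eigenvector with eigenvalue $\beta_L \|v_f\|$, and every direction perpendicular to the flow is an eigenvector with eigenvalue $\beta_T \|v_f\|$.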
In the retardation relations (80) and (81), $\varsigma(C)$ is a sorption function that can be expressed by empirical material relations. For the balance of momentum of the fluid, we have to bear in mind that linear momentum is a vector-valued quantity, whereas the considerations above use vocabulary for a scalar-valued quantity $\psi$. As a consequence, the balance of momentum incorporates the stress tensor $\boldsymbol{\sigma}$ instead of the flow vector. The stress tensor usually splits into an isotropic part containing pressure and a deviatoric part $\boldsymbol{\sigma}'$ such that
\[ \boldsymbol{\sigma} = p\mathbf{I} - \boldsymbol{\sigma}'. \tag{82} \]
In order to give the balance of momentum, it is convenient to use Einstein’s summation convention and consider the components of the momentum separately. This leads to
\[ \frac{\partial(\rho_f v_{f,k})}{\partial t} + \frac{\partial}{\partial x_i} \big( \rho_f v_{f,k} v_{f,i} \big) + \frac{\partial p}{\partial x_k} - \frac{\partial \sigma'_{ki}}{\partial x_i} = \rho_f g_k + \big( j_p^{\mathrm{inter}} \big)_k \tag{83} \]
with the fluid force density $\rho_f g$ and a still not specified interaction term $j_p^{\mathrm{inter}}$, which is now a vector-valued quantity. For a Newtonian fluid, $\boldsymbol{\sigma}'$ is given by
\[ \sigma'_{ik} = \mu_f \left( \frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i} - \frac{2}{3}\, \delta_{ik}\, \frac{\partial v_j}{\partial x_j} \right) \tag{84} \]
with the fluid viscosity $\mu_f$. With the further assumptions that the fluid is incompressible, which means that the mass density changes neither in space nor in time and that the divergence of the velocity field vanishes, we obtain the Navier-Stokes equation
\[ \rho_f \frac{\partial v_f}{\partial t} + \rho_f (v_f \cdot \nabla) v_f - \mu_f \nabla^2 v_f = -\nabla p + \rho_f g + j_p^{\mathrm{inter}}. \tag{85} \]
Usually, the fluid velocity in a porous medium is rather small, which results in small Reynolds numbers. As a consequence, the time derivative and the convective transport term can be neglected, i.e.,
\[ \frac{\partial v_f}{\partial t} \approx 0, \qquad (v_f \cdot \nabla) v_f \approx 0. \tag{86} \]
Thus, the balance of momentum is reduced to
\[ \nabla p - \rho_f g = \mu_f \nabla^2 v_f + \frac{1}{\phi}\, j_p^{\mathrm{inter}}. \tag{87} \]
Moreover, inner friction as a result of viscosity is much smaller than the friction between solid and fluid such that $\mu_f \nabla^2 v_f \approx 0$. The interaction term between solid and fluid can be modeled by
\[ \frac{1}{\phi}\, j_p^{\mathrm{inter}} = -\mu_f\, \boldsymbol{\kappa}^{-1} v_f \tag{88} \]
with the hydraulic permeability tensor $\boldsymbol{\kappa}$. This yields Darcy’s law
\[ v_f = -\frac{1}{\mu_f}\, \boldsymbol{\kappa} \big( \nabla p - \rho_f g \big), \tag{89} \]
which was first introduced empirically by Darcy in the nineteenth century (Darcy 1856). More sophisticated justifications can be provided, e.g., by the homogenization method described by Ene and Poliševski (1987) or the volume averaging method described by Sahimi (1995). For further details on transport in porous media, see, e.g., Bear (1972) and de Boer (2000) and the chapters on flow in porous media in this handbook. From the mathematical point of view, the problem has been addressed by Ene and Poliševski (1987), who proved the existence and uniqueness of a solution for the incompressible fluid case, i.e., $\nabla \cdot v_f = 0$, in both bounded and unbounded domains.
Until now, we only considered the fluid phase, i.e., we neglected mass exchange between fluid and solid and momentum transfer from the fluid to the solid. This corresponds to the assumption of a rigid non-deformable solid. But even in this case, energy transfer between the solid and the fluid happens. Thus, we have to consider both phases to derive the energy balance. We will use the term “porous medium” for the whole system of fluid and solid now. The indices “p”, “f”, and “s” characterize the associated quantities of the “porous medium”, “fluid phase”, and “solid phase”, respectively. As a consequence, we have to consider Eq. (72) before averaging it into Eq. (73), but may consider Eq. (72) for both phases separately. With heat capacity $c$ and temperature $T$, the internal (thermal) energy is given by
\[ dE = c\, dT. \tag{90} \]
The heat flow is governed by Fourier’s law
\[ j_T = -\mathbf{k} \cdot (\nabla T), \tag{91} \]
with the heat conductivity tensor $\mathbf{k}$ given by
\[ \mathbf{k}_s = k_s \mathbf{I}, \tag{92} \]
\[ \mathbf{k}_f = k_f \mathbf{I} + \rho_f c_f \mathbf{D}_m \tag{93} \]
for the solid and fluid phase, respectively. To combine both phases, we make the following assumptions:
(1) The temperature of the solid phase equals (locally) the temperature of the fluid phase, $T\ (= T_s = T_f)$ ((local) thermodynamic equilibrium).
(2) There are only heat sources and sinks in the fluid phase and none in the solid phase. For this, a stiff and rigid stone matrix without heat sources and sinks is assumed. This also excludes heat generation through mechanical deformation of the solid. Thus, we have $Q_{T,s} = 0$. The fluid heat source and sink term $Q_{T,f}$ represents the injection and extraction of heat into and from the hydrothermal reservoir, respectively. Furthermore, there is neither a change in the rock morphology nor seismic activity during the production process, and thus, $v_s = 0$.
(3) Weighted summation of the solid and fluid phase with porosity $\phi$:
\[ \begin{aligned} (\rho c)_p &= \phi\, c_f \rho_f + (1-\phi)\, c_s \rho_s && \text{volumetric heat capacity}, \\ \mathbf{k}_p &= \phi\, \mathbf{k}_f + (1-\phi)\, \mathbf{k}_s && \text{heat conductivity}, \\ Q_T &= Q_{T,p} = \phi\, Q_{T,f} && \text{heat sources and sinks}, \\ j_{T,p} &= \phi\, j_{T,f} + (1-\phi)\, j_{T,s} && \text{thermal flow}. \end{aligned} \]
Combining those assumptions, we obtain
\[ \frac{\partial \big( (\rho c)_p T \big)}{\partial t} + \nabla \cdot \big( \rho_f c_f v_f T \big) - \nabla \cdot \big( \mathbf{k}_p \cdot (\nabla T) \big) = Q_T. \tag{94} \]
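The porosity-weighted summation in assumption (3) is easy to evaluate. The following Python sketch is a minimal illustration that assumes isotropic (scalar) conductivities, i.e., it ignores the mechanical dispersion part in (93); the function name `effective_thermal_parameters` and the material values are assumptions chosen for illustration and are not taken from the text.

```python
def effective_thermal_parameters(phi, rho_f, c_f, k_f, rho_s, c_s, k_s):
    """Porosity-weighted heat capacity and conductivity of the porous medium."""
    if not 0.0 <= phi <= 1.0:
        raise ValueError("porosity must lie in [0, 1]")
    # Volumetric heat capacity (rho c)_p in J/(m^3 K)
    rho_c_p = phi * c_f * rho_f + (1.0 - phi) * c_s * rho_s
    # Scalar heat conductivity k_p in W/(m K), isotropic case only
    k_p = phi * k_f + (1.0 - phi) * k_s
    return rho_c_p, k_p

# Illustrative values for water in a granite-like rock (assumptions)
rho_c_p, k_p = effective_thermal_parameters(
    phi=0.1, rho_f=1000.0, c_f=4200.0, k_f=0.6,
    rho_s=2650.0, c_s=800.0, k_s=2.5,
)
print(rho_c_p, k_p)
```

For small porosities, both effective parameters are dominated by the solid phase, which is why the rock properties largely control the diffusive part of (94).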
In order to solve the fluid and thermal flow problem for a porous medium representing a geothermal reservoir, these coupled equations have to be solved. Commonly, numerical solution methods are based on finite-element, finite-volume, or finite-difference methods (e.g., Chen et al. 2006; O’Sullivan et al. 2001; Zhao et al. 1999; Zyvoloski 1983). A solution technique for advective-diffusive heat transport under the assumption of a known fluid velocity will also be presented in Sect. 4.3.
4.2 Fluid Flow in Hydrothermal Reservoirs
In a hydrothermal system, wells are drilled into a so-called aquifer, i.e., a water-bearing layer which is usually modeled as a porous medium. The basic equations to model fluid flow and heat transport in such a reservoir are given in Sect. 4.1. Usually, the vertical dimension of an aquifer is much smaller than its horizontal extent. It is therefore convenient to apply some kind of vertical averaging, which also reduces the dimension of the problem. Following Diersch (2000), we consider a Cartesian grid $x = (x_1, x_2, x_3)^T$ such that $x_3$ is the vertical component. The upper and lower bounds of the aquifer are given by $b^{\mathrm{up}}(x_1, x_2, t)$ and $b^{\mathrm{down}}(x_1, x_2)$, respectively, i.e., we assume that the lower bound does not change with respect to time. The thickness of the aquifer is given by
\[ B(x_1, x_2, t) = b^{\mathrm{up}}(x_1, x_2, t) - b^{\mathrm{down}}(x_1, x_2). \tag{95} \]
Another useful quantity is the hydraulic head $h$. We assume that the only force which affects the fluid is gravity, such that for the fluid force density $\rho_f g$, $g$ is the gravitational acceleration and is assumed to be given by $g = -\|g\|\, e^{(3)}$ with $e^{(3)}$ being the unit vector in $x_3$-direction. Choosing a reference fluid density $\rho_{f,0}$, the pressure head $h_p$ is defined via
\[ h_p = \frac{p}{\rho_{f,0} \|g\|} \tag{96} \]
and the hydraulic head via
\[ h = h_p + x_3. \tag{97} \]
If we assume that the upper bound of the aquifer is given by the pressure head, we obtain
\[ B = h_p - b^{\mathrm{down}}. \tag{98} \]
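Unlike in Sect. 4.1, we now take into account that the porous medium may not be completely saturated, such that $\varepsilon = \phi s$ with saturation $s$. Thus, integrating Eq. (75) over the depth $B$ yields
\[ \frac{\partial (B \phi s \rho_f)}{\partial t} + \nabla \cdot (B \phi s \rho_f v_f) = B \phi s \rho_f Q_{p,f}. \tag{99} \]
To incorporate the hydraulic head, we take a closer look at the time derivative term and get
\[ \frac{\partial (B \phi s \rho_f)}{\partial t} = \phi s \rho_f \frac{\partial B}{\partial t} + B \phi \rho_f \frac{\partial s}{\partial t} + B s \frac{\partial (\phi \rho_f)}{\partial t}. \tag{100} \]
The first summand is directly related to the pressure head via Eq. (98), which implies
\[ \frac{\partial B}{\partial t} = \frac{\partial h_p}{\partial t} = \frac{\partial h}{\partial t}. \tag{101} \]
For the second summand, we get
\[ \frac{\partial s}{\partial t} = \frac{\partial s}{\partial h_p} \frac{\partial h_p}{\partial t} = \frac{\partial s}{\partial h_p} \underbrace{\frac{\partial h_p}{\partial h}}_{=1} \frac{\partial h}{\partial t} = C(h_p)\, \frac{\partial h}{\partial t} \tag{102} \]
with the moisture capacity $C(h_p)$. For the third summand, we remark that density changes can be related to pressure changes via a suitable compressibility. As pressure and hydraulic head are equivalent, it is also possible to define a compressibility
\[ S_0 = \phi \gamma_f + (1-\phi) \gamma_s \tag{103} \]
with the fluid and solid compressibilities $\gamma_f$ and $\gamma_s$ such that
\[ B s \frac{\partial (\phi \rho_f)}{\partial t} = \rho_f B\, s(h_p)\, S_0\, \frac{\partial h}{\partial t}. \tag{104} \]
Neglecting fluid density effects in the divergence term as part of the Boussinesq approximation and with the definition
\[ S = \phi s + B \big( s S_0 + \phi\, C(h_p) \big), \tag{105} \]
the balance of mass may be written as
\[ S \frac{\partial h}{\partial t} + \nabla \cdot (B \phi s v_f) = B \phi s Q_{p,f}. \tag{106} \]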
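Darcy’s law (89) can also be rewritten with respect to the hydraulic head instead of the pressure:
\[ \begin{aligned} v_f &= -\frac{1}{\mu_f}\, \boldsymbol{\kappa} \big( \nabla p - \rho_f g \big) = -\frac{1}{\mu_f}\, \boldsymbol{\kappa} \Big( \nabla \big( \rho_{f,0} \|g\| (h - x_3) \big) + \rho_f \|g\|\, e^{(3)} \Big) \\ &= -\frac{1}{\mu_f}\, \boldsymbol{\kappa} \Big( \rho_{f,0} \|g\| \nabla h - \rho_{f,0} \|g\|\, e^{(3)} + \rho_f \|g\|\, e^{(3)} \Big) \\ &= -\frac{\rho_{f,0} \|g\|}{\mu_f}\, \boldsymbol{\kappa} \left( \nabla h + \frac{\rho_f - \rho_{f,0}}{\rho_{f,0}}\, e^{(3)} \right). \end{aligned} \tag{107} \]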
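The equivalence of the pressure form (89) and the hydraulic-head form of Darcy’s law can be verified numerically. The following Python sketch uses $p = \rho_{f,0} \|g\| (h - x_3)$ from (96) and (97) and $g = -\|g\|\, e^{(3)}$; the function names `darcy_from_pressure` and `darcy_from_head` and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

G_NORM = 9.81  # magnitude of the gravitational acceleration in m/s^2

def darcy_from_pressure(kappa, mu_f, grad_p, rho_f):
    """Darcy velocity (89): v_f = -(1/mu_f) kappa (grad p - rho_f g)."""
    g = np.array([0.0, 0.0, -G_NORM])  # g = -|g| e3
    return -(kappa @ (grad_p - rho_f * g)) / mu_f

def darcy_from_head(kappa, mu_f, grad_h, rho_f, rho_f0):
    """Hydraulic-head form of Darcy's law with a buoyancy term."""
    e3 = np.array([0.0, 0.0, 1.0])
    buoyancy = (rho_f - rho_f0) / rho_f0 * e3
    return -(rho_f0 * G_NORM / mu_f) * (kappa @ (grad_h + buoyancy))

# Illustrative values (assumptions, not taken from the text)
kappa = np.diag([1.0e-13, 1.0e-13, 1.0e-14])   # permeability tensor in m^2
mu_f, rho_f0, rho_f = 3.0e-4, 1000.0, 950.0    # Pa s, kg/m^3, kg/m^3
grad_h = np.array([1.0e-3, -2.0e-3, 5.0e-4])   # dimensionless head gradient

# p = rho_f0 |g| (h - x3)  implies  grad p = rho_f0 |g| (grad h - e3)
grad_p = rho_f0 * G_NORM * (grad_h - np.array([0.0, 0.0, 1.0]))

v_p = darcy_from_pressure(kappa, mu_f, grad_p, rho_f)
v_h = darcy_from_head(kappa, mu_f, grad_h, rho_f, rho_f0)
print(v_p, v_h)
```

Both calls return the same velocity up to rounding. Setting `grad_h` to zero and `rho_f` equal to `rho_f0` yields a vanishing velocity, which corresponds to the hydrostatic state.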
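Combining Eqs. (106) and (107) gives
\[ S \frac{\partial h}{\partial t} - \nabla \cdot \left( \frac{B \phi s\, \rho_{f,0} \|g\|}{\mu_f}\, \boldsymbol{\kappa} \left( \nabla h + \frac{\rho_f - \rho_{f,0}}{\rho_{f,0}}\, e^{(3)} \right) \right) = B \phi s Q_{p,f} \tag{108} \]
as a kind of diffusion equation for the hydraulic head which is highly nonlinear due to the dependencies of the coefficients on various quantities related to the hydraulic head.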
4.3 Heat Transport in a Porous Medium
The aim of this section is to simulate the heat transport in a two-phase porous medium, which consists of a solid and a fluid phase, based on Eq. (94). We make the following additional assumptions: (1) The fluid is incompressible and the divergence of the velocity field vanishes. (2) The heat produced from mechanical dispersion is negligible, such that $\mathbf{k}_f = k_f \mathbf{I}$. (3) All heat capacities and the porosity are constant with respect to time and space. With assumptions (1)–(3), we obtain the transient advection-diffusion equation (TADE) for the two-phase porous medium ($0 < t_{\mathrm{end}} < \infty$):
\[ (\rho c)_p \frac{\partial T}{\partial t} = \nabla \cdot (\mathbf{k}_p \nabla T) - c_f \rho_f v_f \cdot \nabla T + Q_T \quad \text{in } B \times (0, t_{\mathrm{end}}), \tag{109} \]
\[ T(\cdot, 0) = T_0 \quad \text{in } B, \tag{110} \]
\[ \frac{\partial T}{\partial n} = F \quad \text{on } \partial B \times [0, t_{\mathrm{end}}]. \tag{111} \]
Statements concerning existence, uniqueness, and continuity of the weak solution of problem (109)–(111) are proved in Ostermann (2011a). Because the weak formulation of problem (109)–(111) is infinite dimensional, we need a finite-dimensional problem for the numerical realization whose solution is a good approximation of the original solution. In order to reach this aim, a linear Galerkin scheme using scalar kernels and a grid of $J$ pairwise distinct points in the domain $B$ is used. The corresponding approximated Galerkin solution is called $T_J$. The convergence towards the actual solution $T$ is shown, e.g., in Ostermann (2011a,b). Suitable examples for the Galerkin grid and Galerkin kernel are tetragonal grids and the biharmonic kernel (which is a radial basis function; see, e.g., Buhmann 2003; Wendland 2005). They satisfy the required conditions for the convergence of the Galerkin scheme (see Ostermann 2011a,b, for details). The results of a numerical test for the cube $(-1, 1)^3$ for a diffusion-dominated problem (with Péclet number $\mathrm{Pe} = c_f \rho_f \|v_f\| x_o / k_p = 0.01$ with characteristic length scale $x_o$), a constant initial condition $T_0 = 393.15$ K, and a (single-point) injection at the origin with an injection temperature of $343.15$ K that is constant in time are shown in Figs. 22 and 23 (cf. Ostermann 2011a,b). In the first two columns of Fig. 22, the temporal evolution of the temperature difference $T_J(\cdot, t) - T_J(\cdot, 0)$ for chosen times between 3 and 36 months is presented. The global expansion shows a significant cooling of the injection region, as expected because the injection temperature is small compared to the initial temperature. The color scale is always scaled to the interval $[-50, 0]$. In the third column of this figure, a detailed view after 6, 12, and 36 months can be seen. The quantity $\mathrm{detail}_J(t) = \log_{10}\big( -(T_J(\cdot, t) - T_J(\cdot, 0)) \big)$ is illustrated for the specific temperature intervals.
For clarification of the arising structures, the color scale is condensed at the upper end. Nevertheless, the influence of the advection term, which is comparatively low in this case, can be clearly detected. The cooling front, which propagates “spherically” from the injection point, is slowed down in positive $x_2$-direction, whereas it is accelerated in negative $x_2$-direction. This behavior can also be observed in Fig. 23 (see also Ostermann 2011a,b), which shows the temperature $T_J(\cdot, t)$ after 36 months in both the $x_1 x_2$- and the $x_2 x_3$-plane for an adapted color map. Further investigations concerning the presented method – especially in case of advection-dominated problems – are needed because the applied Galerkin method does not contain any stabilization terms. Such terms should prevent strong oscillations for large Péclet numbers (see, e.g., John and Schmeyer 2008; John et al. 2006, for similar considerations). The numerics of a similar convection-diffusion-reaction problem has been discussed in the PhD thesis by Eberle and in Eberle et al. 2015 in this issue.
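The distinction between diffusion-dominated and advection-dominated regimes rests on the Péclet number. The following Python sketch is a minimal illustration that assumes the Péclet number is formed as $\mathrm{Pe} = c_f \rho_f \|v_f\| x_o / k_p$ with a scalar conductivity $k_p$; the function names `peclet_number` and `speed_for_peclet` and the parameter values are assumptions and are not taken from Ostermann (2011a,b).

```python
def peclet_number(c_f, rho_f, speed, x_o, k_p):
    """Thermal Peclet number Pe = c_f * rho_f * ||v_f|| * x_o / k_p."""
    if k_p <= 0.0:
        raise ValueError("heat conductivity must be positive")
    return c_f * rho_f * speed * x_o / k_p

def speed_for_peclet(pe, c_f, rho_f, x_o, k_p):
    """Fluid speed that yields a prescribed Peclet number."""
    return pe * k_p / (c_f * rho_f * x_o)

# Illustrative parameters (assumptions): water, k_p = 2.3 W/(m K), x_o = 1 m
v = speed_for_peclet(0.01, c_f=4200.0, rho_f=1000.0, x_o=1.0, k_p=2.3)
print(v, peclet_number(4200.0, 1000.0, v, 1.0, 2.3))
```

With these illustrative values, a Péclet number of 0.01 corresponds to a very small fluid speed, which is consistent with the weak but visible influence of the advection term described above.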
Fig. 22 Temperature difference $T_J(\cdot, t) - T_J(\cdot, 0)$ in [K] for the Galerkin expansion (extended by a constant) on the $x_1 x_2$-plane after 3, 6, 9, 12, 24, and 36 months (first two columns from left to right and top to bottom) and detail view of $\mathrm{detail}_J(t)$ in [K] after 6, 12, and 36 months (right column from top to bottom) for a tetragonal grid with grid size 19 and $\mathrm{Pe} = 0.01$
4.4 Flow Models for Petrothermal Reservoirs
In petrothermal systems, the possible efficiency of the geothermal reservoirs is often too low for industrial purposes. Hence, the usual method to improve the productivity is to use fluid pressure to increase the aperture of the fractures as well as their conductivity and, thus, the permeability of the reservoir, creating what is called an Enhanced Geothermal System (EGS). Applying hydraulic stimulation (e.g., Ghassemi 2003; Zubkov et al. 2007), the fractures propagate along the maximum principal stress direction. The fracture path is governed by the actual stress field in the reservoir acting at the fracture tips (e.g., Moeck et al. 2009). Often the main difficulty of the mechanical models that predict the fracture growth during the stimulation process is to couple the applied pressure of the fluid with the fracture aperture and the rock stress since the parameters that govern this coupling (such as the rock stiffness and the shear stiffness) are very difficult to detect (see Hillis 2003). For all models which predict fracture growth, knowledge of the stress field prior to stimulation is vital. There are two basic categories of models for petrothermal systems consisting of a fractured medium: continuum methods and discrete methods. The continuum model approach can be subdivided into three methods, namely, the Effective Continuum Method (ECM), the Dual-Continuum (DC) or the generalized Multiple Continuum Model (MINC, Multiple INteracting Continua), and the Stochastic-Continuum (SC) model. The continuum approach is a simple method since it ignores the complex geometry of fractured systems and employs effective parameters to describe their behavior. Nevertheless, it is not useful to apply vertical averaging as for hydrothermal systems. In contrast to this, the discrete approach is based on the explicit determination of the fractures. Thus, the need for a detailed knowledge of the geometry of the fractures arises. There are two different tasks which discrete models can fulfill, namely, modeling the flow when the continuum methods cannot be applied and specifying the effective parameters needed in the continuum approach if they are applicable. Similar to the continuum models, the discrete models can be divided into three different methods: Single-Fracture (SF) models, Discrete Fracture Network (DFN) models, and Fracture Matrix (FM) models. A schematic overview of the presented methods used to model geothermal reservoirs is given in Fig. 24. This article can only give an overview of the different methods used in reservoir and flow modeling. For further information on (numerical) methods concerned with flow and transport through fractured media, the reader is referred to the recent assessments by Adler and Thovert (1999), Sanyal et al. (2000), Dietrich et al. (2005), and Neuman (2005).

Fig. 23 Temperature $T_J(\cdot, t)$ in [K] for the Galerkin expansion (extended by a constant) on the $x_1 x_2$- and $x_2 x_3$-plane after 36 months for a tetragonal grid with grid size 19 and $\mathrm{Pe} = 0.01$

Fig. 24 Scheme of geothermal reservoir models characterizing today’s situation (hydrothermal reservoirs: porous medium, continuum models with continuum parameters; petrothermal reservoirs: fractured medium, continuum models ECM, DC/MINC, SC and discrete models SF, DFN, FM)
Continuum Models
Effective Continuum Method
We first present two basic models for determining effective coefficients in fractured rocks introduced by Pruess et al. (1986) and Wu (2000) based on a deterministic approach. Pruess and his co-workers developed a semi-empirical model to represent an unsaturated fracture/matrix system as a single equivalent continuum, therein using the simple arithmetic sum of the conductivity of the rock matrix and the fracture system as approximation of the unsaturated conductivity of the fractured rock:
\[ \kappa(p) = \kappa_{\mathrm{matrix}}(p) + \kappa_{\mathrm{fracture}}(p). \tag{112} \]
M. Augustin et al.
A similar approximation of the isotropic permeability of a fracture/matrix system was used by Peters and Klavetter (1988). Approximation (112) is based on the assumption of local thermodynamic equilibrium between the fracture system and the rock matrix blocks, an assumption that breaks down in case of fast transients. An example including solute transport in models for fractured media is given by Wu (2000), who presents a generalized ECM formulation to model multiphase, nonisothermal flow and solute transport. The governing equations for compositional transport and energy conservation contain the key effective correlations such as capillary pressures, relative permeability, dispersion tensor, and thermal conductivity. Introducing a fracture/matrix combined capillary pressure curve as a function of an effective liquid saturation, the corresponding fracture and matrix saturations can be determined by inversion of the capillary pressure functions for fracture and matrix, respectively. Consequently, the effective relative permeabilities of the corresponding fluid phase can be determined as the weighted sum (via the absolute permeabilities of the fracture system and the matrix) of the relative effective permeabilities of the fracture system and the matrix evaluated at the respective saturations. Similarly, the thermal conductivity can be specified. For the determination of the dispersion tensor, the Darcy velocities in the fracture system and the matrix are necessary in addition to the corresponding saturations. Note that approximation (112) is also incorporated in this model for the effective continuum permeability. An attempt to categorize the different mathematical concepts behind the problem of finding good estimates for the matrix-dependent effective parameters can be found in Sahimi (1995). The applicability of this method is restricted because there is generally no guarantee that a Representative Elementary Volume (REV) exists for a given site (cf. Long et al. 1982) and because the assumption of local thermodynamic equilibrium between fracture and matrix may be violated by rapid flow and transport processes (see Pruess et al. 1986). Alternatives are therefore needed.
Dual-Continuum and MINC Methods
The dual-continuum method or the more general Multiple INteracting Continua (MINC) method (see Barenblatt et al. 1960; Kazemi 1969; Pruess and Narasimhan 1985; Warren and Root 1963; Wu and Qin 2009, and the references therein) tries to overcome the disadvantages of the ECM. The dual-porosity method as one possible form of dual-continuum methods was first introduced by Barenblatt et al. (1960), applying the idea of two separate but overlapping continua, one modeling the fracture system and the other one modeling the porous matrix blocks. The method takes into account not only the fluid within the fractures, as the general Darcy model does, but also the fluid mass transfer between the fractures and the matrix blocks. Typically, the flow between the fracture system and the matrix is modeled as pseudo-steady, assuming pressure equilibration between the fracture system and the matrix blocks within the duration of each time step. Two conservation laws are required, one for the fracture and one for the rock matrix blocks (see Barenblatt et al. 1960), given by
$$0 = \frac{\partial(\phi_1 \rho_f)}{\partial t} + \nabla\cdot(\phi_1 \rho_f v_1) - Q_{\mathrm{inter}}, \tag{113}$$
$$0 = \frac{\partial(\phi_2 \rho_f)}{\partial t} + \nabla\cdot(\phi_2 \rho_f v_2) + Q_{\mathrm{inter}}. \tag{114}$$
Here, $Q_{\mathrm{inter}}$ is the mass of the liquid which flows from the rock matrix into the fractures per unit time and unit rock volume. The subscripts 1 and 2 represent the fracture phase and the rock matrix phase, respectively. Note that Eqs. (113) and (114) are in the same form as (75) for a single porous medium. Combining Eq. (113) with Darcy's law and Eq. (114) with the assumption that the porosity in the rock matrix only depends on the corresponding pressure in the rock matrix, we obtain
$$\frac{\partial p_1}{\partial t} - \eta\,\frac{\partial \nabla^2 p_1}{\partial t} = \vartheta\,\nabla^2 p_1, \tag{115}$$
under the assumptions that the medium is homogeneous, the fluid is slightly compressible, and higher-order terms may be neglected. Here, $p_1$ is the pressure in the fracture phase, and $\eta$ and $\vartheta$ are constants depending on the system. The first numerical implementation of a finite-difference, multiphase, dual-porosity scheme including gravity and imbibition was presented by Kazemi et al. (1976). It can be shown (see Arbogast 1989) that models of this kind are well-posed, given appropriate boundary and initial conditions. For the derivation of the double-porosity model by homogenization, see Arbogast et al. (1990). The assumptions of the dual-porosity method fail in the presence of large matrix blocks, low permeability, or silicification of fracture surfaces. An extension of the dual-porosity model, the so-called Multiple INteracting Continua (MINC) method, has been proposed by Pruess and Narasimhan (1985). The idea is to divide the domain into regions in which thermodynamic equilibrium can be assumed and to treat every block of the rock matrix not as a single porous medium, as in the dual-porosity method, but as a set of different porous media with different constitutive properties, all following Darcy's law. This way, the transient interaction between matrix and fractures can be represented more realistically. A coherent representation of the thermal flow has also been incorporated into the model using the integral finite-difference method. Examples of how this method can be adapted to different situations can be found in Kimura et al. (1992) and Wu and Pruess (2005). Further information on multiphysics and multiscale approaches can be found in this handbook in the corresponding chapter by Helmig et al. (2014), and the references therein. For other models concerning fracture-matrix interaction, the reader is referred to Berkowitz (2002).
Stochastic-Continuum (SC) Method
The stochastic-continuum approach represents the fractured medium as a single continuum, but, as opposed to the ECM, it uses geostatistical parameters. The
concept was first introduced for fractured media by Neuman and Depner (1988) and Tsang et al. (1996). The site’s specific hydraulic parameters, such as the hydraulic conductivity, are modeled as random variables via stochastic methods, e.g., Monte Carlo simulation. However, measuring the hydraulic conductivity in this manner is problematic since it is a scale-dependent parameter which usually has a higher variance at smaller scales, possesses a varying support volume, and is derived from well tests whose scales have to match the required support scale of the model (see, e.g., Oden and Niemi 2006). The pioneering work by Neuman and Depner (1988) verifies the dependence of the effective principal hydraulic conductivity solely on mean, variance, and integral scales of local log hydraulic conductivities for ellipsoidal covariance functions. The proposed method requires that the medium is locally isotropic and that the variance is small. The stochastic-continuum model by Tsang et al. (1996) is based on a nonparametric approach using the sequential indicator simulation method. Hydraulic conductivity data from point injection tests serve the purpose of deriving the needed input for this simulation, namely, the thresholds dividing the possible range of values of hydraulic conductivity in the stochastic continuum into classes and the corresponding fractions within each of these classes. In order to reflect the fractures and the rock matrix, they introduced a long-range correlation for the high hydraulic conductivity as part of the distribution in the preferred planes of fractures. Thus, both the fractures and the rock matrix contribute to the hydraulic conductivity even though they are not treated as two different continua as in the dual-continuum model described above. 
Due to the inherent restrictions of the model and the reduction of the uncertainty, they propose to employ spatially integrated quantities to model flow and transport processes in a strongly heterogeneous reservoir reflecting the continuum quality (spatial invariability) of their model. Based on the concept of a stochastic REV using multiple realizations of stochastic Discrete Fracture Network (DFN) models (see below) simulated via the Monte Carlo method, Min et al. (2004) determine the equivalent permeability tensor with the help of the two-dimensional UDEC code by the Itasca Consulting Group Inc (2000). The central relation that is used in this code is a generalized Darcy's law for anisotropic and homogeneous porous media (see Bear 1972)
$$v_i = -F \sum_{j=1}^{2} \frac{\kappa_{ij}}{\mu_f}\,\frac{\partial p}{\partial x_j}, \tag{116}$$
where F is the cross-section area of the DFN model. One of the most important steps in the analysis is to prove the existence of the resulting permeability as a tensor, which was done by comparing the derived results with the ellipse equation of the directional permeability. The second important step is to determine whether a REV can be established for a specific site or not. Min and his coworkers presented two criteria, “coefficient of variation” and “prediction error”, to show the existence of a REV and to specify its size for Sellafield, UK.
Discrete Models
Single-Fracture (SF) Models
In single-fracture models, only one fracture is considered. The analysis of the accurate behavior of a single fracture is crucial in the understanding of situations in which most of the flow occurs in a few dominant paths. There are different realizations for this concept, both deterministic and stochastic. An early statistical approach dealing with preferential flow paths based on the variation of fracture aperture can be found in Tsang et al. (1996). The classical idea for modeling a rock fracture is to consider it as a pair of smooth parallel planes (see Lomize 1951; Tang et al. 1981). These kinds of models are interesting from a mathematical point of view since they often offer an analytical or semi-analytical solution for the flow (see, e.g., Wu et al. 2005), and they are widely used in reservoir modeling as they can be useful for a quantitative analysis. But they are far from being realistic. In reality, the surface of the fracture may be rough to the point that the flow can fail to satisfy the cubic law (see Bear et al. 1993)
$$v = \frac{g}{12\,\nu_f}\,a_p^3\,\frac{h}{l}, \tag{117}$$
where $v$ is the flow rate, $g$ is the gravity acceleration, $a_p$ is the aperture, $h$ is the head loss, and $l$ is the fracture length. Several authors have tried to characterize this deviation from the cubic law with fractal (e.g., Brown 1987; Fomin et al. 2003) and statistical (e.g., Tsang and Tsang 1989) modeling of the fracture roughness. A derivation in a simplified setting, possible generalizations, and further discussion of fluid flow and its interaction with the stresses of the surrounding medium can also be found in the chapter by Renner and Steeb (2015) in this book. In general, it is acknowledged that the fractures themselves are two-dimensional networks of variable aperture. Most of the time, the aim is not to resolve every single asperity of the surface, even though a number of technical methods are used to compute a precise profilometric analysis, but rather to identify the scale of the roughness which has a dominant influence on the fluid flow (see Berkowitz 2002); this is accommodated by introducing an effective aperture. Recently, another method realizing the irregularity of the fracture surface was introduced to model (Navier-Stokes) flow through a single fracture, namely, the Lattice Boltzmann method (see Eker and Akin 2006; Kim et al. 2003, and the references therein). In contrast to the traditional "top-down" methods employing partial differential equations, the idea of this "bottom-up" method is to use simple rules to represent fluid flow based on the Boltzmann equation. In addition to problems caused by roughness, the situation is complicated by the influence of deformation processes due to flow and pressure gradients which should be considered at this scale (see, e.g., Auradou 2009). For heat extraction from Hot Dry Rock (HDR) systems, which belong to the petrothermal systems, Heuer et al. (1991) and Ghassemi et al. (2003) developed
mathematical models for a single fracture. The essential feature of the approach by Heuer et al. (1991) is that the presented (one-dimensional) model can be solved by analytical methods. Furthermore, a generalization to an infinite number of parallel fractures is also given. A three-dimensional model of heat transport in a planar fracture in an infinite reservoir is given by Ghassemi et al. (2003), who derive an integral equation formulation with a Green function, effectively sidestepping the need to discretize the geothermal reservoir.
Discrete Fracture Network (DFN)
Among the methods for modeling a geothermal reservoir, the DFN approach is one of the most accurate but also one of the most difficult to implement. This model restricts fluid flow to the fractures and regards the surrounding rock as impermeable (e.g., Dershowitz et al. 2004). Thus, as described for the single-fracture models, most of the time fluid flow through fractures is compared to flow between parallel plates with smooth walls (Lee et al. 1999) or to flow through pipes (channeling-flow concept; see Tsang and Tsang 1987). In contrast to the dual-porosity model, fluid flow in the DFN model is governed by the cubic law (117). Based on this momentum equation and the continuity relation, unknown heads at intersections of the fracture network can be determined in the DFN model. A comparison of the dual-porosity method and the DFN method can be found in Lee et al. (1999). They identified the fracture volume fraction and the aperture to be the most significant parameters in the dual-porosity and the DFN model, respectively. Furthermore, they derived a one-dimensional analytical solution of the dual-porosity model for a confined fractured aquifer problem with the help of Fourier and Laplace transforms based on an earlier solution by Ödner (1998).
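The determination of heads at fracture intersections from the cubic law and the continuity relation can be illustrated by a minimal sketch. It assumes steady flow, a parallel-plate conductance per unit fracture width of the form $g\,a_p^3/(12\,\nu\,l)$, and a kinematic viscosity of water of about $10^{-6}$ m²/s; the function names, the network layout, and all numerical values are hypothetical and serve illustration only.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2


def cubic_law_conductance(aperture, length, nu=1.0e-6):
    """Conductance g*a^3/(12*nu*l) of one fracture segment (per unit width).

    nu is an assumed kinematic viscosity (roughly water at 20 degrees C).
    """
    return G * aperture**3 / (12.0 * nu * length)


def solve_heads(n_nodes, segments, fixed_heads):
    """Solve the mass balance at every fracture intersection.

    segments    : list of (node_i, node_j, aperture, length)
    fixed_heads : dict {node: prescribed head}, e.g. at wells or boundaries
    Returns the array of heads at all nodes.
    """
    # Assemble the weighted graph Laplacian: flow i->j = c_ij * (h_i - h_j).
    A = np.zeros((n_nodes, n_nodes))
    for i, j, aperture, length in segments:
        c = cubic_law_conductance(aperture, length)
        A[i, i] += c
        A[j, j] += c
        A[i, j] -= c
        A[j, i] -= c

    heads = np.zeros(n_nodes)
    fixed = sorted(fixed_heads)
    free = [k for k in range(n_nodes) if k not in fixed_heads]
    heads[fixed] = [fixed_heads[k] for k in fixed]

    if free:
        # Continuity at free nodes: A_ff h_f = -A_fb h_b.
        rhs = -A[np.ix_(free, fixed)] @ heads[fixed]
        heads[free] = np.linalg.solve(A[np.ix_(free, free)], rhs)
    return heads


# Two fractures in series between an injection node (0) and a production node (2).
heads = solve_heads(
    3,
    [(0, 1, 1.0e-4, 10.0), (1, 2, 2.0e-4, 10.0)],
    {0: 10.0, 2: 0.0},
)
```

Because the conductance scales with the cube of the aperture, doubling the aperture of the second segment makes it eight times more conductive, so most of the head drop occurs across the narrower first segment.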
Different approaches to characterize a reservoir fracture network and the flow in such a network were introduced, e.g., based on stochastic models, fractal models, fuzzy logic and neural networks (incorporating field data), a combination of these models, or on percolation theory (see Jing et al. 2000; Maryška et al. 2004; Mo et al. 1998; Ouenes 2000; Tran and Rahman 2006; Watanabe and Takahashi 1995, and the references therein). Note that percolation theory can also be used to determine the connectivity of a DFN and its effective permeability (see Berkowitz 1995; Masahi et al. 2007; Mo et al. 1998). Analytical models for the determination of the permeability of an anisotropic DFN are presented by Chen et al. (1999). The idea behind the stochastic models is that the entire fracture network cannot be located via seismic and geological means. Thus, fractures that are too small to be detected, but through which an important amount of flow can occur, are generated stochastically. Information on the morphology of the system has to be gathered first (via Monte Carlo sampling or geological analysis) in order to assign the correct probability distributions. Then, a realistic fracture network can be generated, providing a fairly good approximation of the real underground situation. The statistical distribution of fracture orientation is often described as a Fisher distribution (see Fisher et al. 1993), whereas fracture aperture and size can be sampled from log-normal or Gaussian distributions. For more details, see Assteerawatt (2008).
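The stochastic generation step described above can be sketched as follows. This is only an illustration under stated assumptions: fracture poles are drawn from a Fisher distribution on the sphere by inverting the cumulative distribution of the polar angle, and apertures are drawn from a log-normal distribution. The function names, the concentration parameter `kappa`, and the aperture statistics are hypothetical placeholders, not values taken from any site.

```python
import numpy as np


def sample_fisher(mean_pole, kappa, n, rng):
    """Draw n unit vectors from a Fisher distribution around mean_pole.

    kappa > 0 is the concentration parameter (larger = tighter clustering).
    """
    mu = np.array(mean_pole, dtype=float)
    mu = mu / np.linalg.norm(mu)

    # Cosine of the angle to the mean pole, by inverse-CDF sampling.
    u = rng.random(n)
    w = 1.0 + np.log(u + (1.0 - u) * np.exp(-2.0 * kappa)) / kappa
    # The azimuth around the mean pole is uniform.
    phi = rng.uniform(0.0, 2.0 * np.pi, n)
    s = np.sqrt(np.clip(1.0 - w**2, 0.0, None))

    # Orthonormal basis (e1, e2, mu).
    helper = np.array([1.0, 0.0, 0.0]) if abs(mu[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(mu, helper)
    e1 = e1 / np.linalg.norm(e1)
    e2 = np.cross(mu, e1)

    return (s * np.cos(phi))[:, None] * e1 + (s * np.sin(phi))[:, None] * e2 + w[:, None] * mu


def sample_fracture_set(mean_pole, kappa, n, rng,
                        log_aperture_mean=np.log(1.0e-4), log_aperture_sigma=0.5):
    """Return (normals, apertures) for one hypothetical fracture set."""
    normals = sample_fisher(mean_pole, kappa, n, rng)
    # Log-normally distributed apertures in meters (illustrative parameters).
    apertures = rng.lognormal(log_aperture_mean, log_aperture_sigma, n)
    return normals, apertures


rng = np.random.default_rng(1)
normals, apertures = sample_fracture_set([0.0, 0.0, 1.0], kappa=20.0, n=5000, rng=rng)
```

In practice, the distribution parameters would be fitted to borehole or outcrop data before sampling; fracture size and position would be sampled in the same manner.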
Fracture-Matrix (FM) Models
The last approach in the discrete category is the class of fracture-matrix models, also called explicit discrete fracture and matrix methods (see Reichenberger et al. 2006; Snow 1965; Stothoff and Or 2000; Sudicky and McLaren 1992, and the references therein). These models extend the above-described DFN models by treating the rock matrix as a porous medium. Thus, the influence of the interaction between fractures and surrounding rock matrix on the physical processes during fluid and thermal flow through the reservoir can be captured and analyzed on a more realistic level. For this reason, these models can also be used to determine the parameters needed in the continuum methods (see, e.g., Lang 1995; Lang and Helmig 1995). Although the fracture-matrix models allow fluid potential gradients and fluxes between the fracture system and the rock matrix to be represented on a physical level, this advantage is often reduced by the fact that the models are restricted to a vertical or horizontal orientation of the fractures (e.g., Travis 1984) or that the influence of the real geometry of the fractures, such as their tortuosity, is neglected. Since detailed knowledge of the matrix geometric properties is rarely available at a given site and the application of this method is computationally intensive, the less demanding dual-continuum or the more general MINC method are widely used. Nevertheless, technological progress has recently enabled more extensive studies which combine an accurate description of the fracture system with a permeable matrix (e.g., by R. Helmig and his group, University of Stuttgart, Germany). It is now possible to model a very complex set of fractures and nonetheless take into account the fact that every fracture can exchange fluid with the surrounding rock matrix. A successful study of this kind has been carried out by Reichenberger et al. (2006). The main idea is that the capillary pressure and the flow must be continuous across the fracture boundary; therefore, a proper interface condition should be given. This method is still at an early stage and, at the moment, can rarely be used due to its strong dependency on a precise and complete knowledge of the real on-site fracture-matrix configuration.
4.5 Heat Transport in a Fissured Medium
In contrast to an isotropic, porous medium which has been discussed in Sect. 4.3, fissured geothermal systems are characterized by strong heterogeneities. They are subject to internal and external forces with different scales in time and space. Furthermore, the heat transfer medium flows through a partially poorly connected fracture network along obstacles with different structures and proportions. This may cause an anomalous behavior of the heat transport (as well as other transport processes), because the movement of the diffusing quantity, heat in our case, slows down when it remains at a certain position for a longer timespan. By virtue of its wide-ranging applicability, the concept of anomalous transport processes attracts more and more interest in natural sciences (see Luchko 2015; Luchko and Punzi 2011).
Fig. 25 Evolution of the variance in time of a one-dimensional CTRW simulation starting with a Dirac distribution for various $\alpha$ (Luchko and Punzi 2011)
To model the anomalous behavior of heat diffusion, the Continuous Time Random Walk (CTRW) can be applied. The idea is to interpret the heat as an amount of particles which moves erratically through the porous medium. On the basis of an estimation of the probability density function of the jumps of the moving particles, a continuous heat transport model is obtained. The anomalous behavior is estimated using the temporal development of the expectation value $E((\delta x)^2)$ of the mean square deviation, i.e., the variance, of a moving particle in the form
$$E((\delta x)^2) \sim t^{\alpha}. \tag{118}$$
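The scaling (118) can be reproduced with a small one-dimensional CTRW experiment. The following sketch is an illustration under stated assumptions and not the simulation behind Fig. 25: waiting times are drawn from a Pareto distribution with tail index $\alpha$ (so that their mean diverges for $0 < \alpha < 1$), jumps are standard normal, and all names and parameter values are hypothetical.

```python
import numpy as np


def ctrw_msd(alpha, obs_times, n_walkers=2000, seed=0):
    """Mean square displacement of a 1D CTRW at the given observation times.

    Each walker waits a heavy-tailed random time, then performs a Gaussian jump.
    """
    rng = np.random.default_rng(seed)
    obs_times = np.asarray(obs_times, dtype=float)

    # Waiting times are >= 1, so n_steps >= max(obs_times) jumps per walker
    # are enough to cover the whole observation window.
    n_steps = int(np.ceil(obs_times.max()))
    waits = rng.pareto(alpha, size=(n_walkers, n_steps)) + 1.0
    arrival = np.cumsum(waits, axis=1)  # time of the k-th jump

    jumps = rng.standard_normal((n_walkers, n_steps))
    # path[:, k] is the position after k jumps (all walkers start at 0).
    path = np.concatenate(
        [np.zeros((n_walkers, 1)), np.cumsum(jumps, axis=1)], axis=1
    )

    rows = np.arange(n_walkers)
    msd = np.empty(obs_times.size)
    for i, t in enumerate(obs_times):
        n_jumps = (arrival <= t).sum(axis=1)  # jumps completed by time t
        msd[i] = np.mean(path[rows, n_jumps] ** 2)
    return msd


times = np.logspace(1, 3, 9)
msd = ctrw_msd(alpha=0.6, obs_times=times)
# Slope of log(MSD) versus log(t) estimates the exponent in (118).
slope = np.polyfit(np.log(times), np.log(msd), 1)[0]
```

For a finite observation window the fitted slope only approximates $\alpha$, since the power law holds asymptotically for large times.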
Depending on the parameter $\alpha$, four cases can be distinguished: subdiffusion ($0 < \alpha < 1$) and standard diffusion ($\alpha = 1$) as shown in Fig. 25, superdiffusion ($1 < \alpha < 2$), and ballistic diffusion ($\alpha = 2$). In the preceding section on heat transport in a porous medium, we assumed standard diffusion. In the following, we consider the case of subdiffusion, i.e., $0 < \alpha < 1$, according to the considerations in Luchko and Punzi (2011). Using a probability density function with first moment $\tau$ and Laplace transform of the form
$$\tilde{\psi}(s) = 1 - (\tau s)^{\alpha}, \quad \text{for } s \to \infty, \tag{119}$$
the fractional Fokker-Planck equation
$$(\rho c)_p \frac{\partial T}{\partial t} = \frac{\partial^{1-\alpha}}{\partial t^{1-\alpha}} \left( \nabla\cdot(\mathbf{k}_p \nabla T) - c_f \rho_f v_f \cdot \nabla T \right), \qquad T(\cdot, 0) = T_0, \tag{120}$$
results from the CTRW-ansatz, where $\frac{\partial^{1-\alpha}}{\partial t^{1-\alpha}}$ is the fractional time derivative of order $1-\alpha$ and $k_p$ is a scalar such that the heat conductivity is given by $\mathbf{k}_p = k_p I$. The fractional time derivative turns out to be problematic both from a theoretical and a numerical point of view. Various definitions of the fractional calculus are known. For the simulation of real processes, as in case of modeling anomalous heat transport, the Caputo derivative is usually employed. It is defined by (see Gorenflo and Mainardi 1997)
$$\frac{\partial^{\beta}}{\partial t^{\beta}} f(t) = \frac{1}{\Gamma(1-\beta)} \int_0^t \frac{1}{(t-\tau)^{\beta}}\,\frac{\partial}{\partial \tau} f(\tau)\, d\tau, \quad \text{for } 0 < \beta < 1, \tag{121}$$
with the Gamma function $\Gamma(\cdot)$. The derivative described in (121) corresponds to an integrodifferential operator of convolution type. The associated process is not Markovian because evaluating the integral involves all preceding points in time. This causes problems for the numerical implementation, especially with respect to memory capacity. In case of boundary value problems for fractional partial differential equations, there are still some difficulties and unsolved problems, especially as far as existence and uniqueness theorems are concerned. The methods known for classical elliptic and parabolic partial differential equations can often not be (easily) extended. An example for an extension of the established concepts is presented in Luchko (2009, 2010, 2015) and Luchko and Punzi (2011). In the following, we neglect the convection term $v_f \cdot \nabla T$ but account for a reaction term $\gamma T$ with the reaction rate $\gamma$. The (Dirichlet) initial boundary value problem with $0 < \alpha < 1$ for the three-dimensional case is given by
$$(\rho c)_p \frac{\partial^{\alpha} T}{\partial t^{\alpha}} = \nabla\cdot(k_p \nabla T) - \gamma T + Q_T \quad \text{in } B \times (0, t_{\mathrm{end}}), \tag{122}$$
$$T(\cdot, 0) = T_0 \quad \text{in } B, \tag{123}$$
$$T = F \quad \text{on } \partial B \times [0, t_{\mathrm{end}}], \tag{124}$$
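with the notation used in Sect. 4.3. Furthermore, for the time- and space-dependent source term $Q_T$ as well as the space-dependent parameters $k$ and $\gamma$, we have
$$Q_T(x,t) \ge 0 \quad \text{for } x \in B,\ t \in (0, t_{\mathrm{end}}), \tag{125}$$
$$k \in C^{(1)}(B), \quad k(x) > 0 \quad \text{for } x \in B, \tag{126}$$
$$\gamma \in C(B), \quad \gamma(x) \ge 0 \quad \text{for } x \in B. \tag{127}$$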
By use of a maximum principle in Luchko (2009), it is shown that the initial boundary value problem (122)–(124) has at most one solution. If this solution exists, it depends continuously on the given data. In order to prove the existence of the
solution, a generalized solution of the initial boundary value problem in the sense of Vladimirov is introduced in Luchko (2010). Further details can be found in Luchko (2009, 2010, 2015) and Luchko and Punzi (2011). As already mentioned above, the numerical approximation of the fractional time derivative is problematic because information of the integrand at all preceding points in time is needed. The "short memory principle" and the "logarithmic memory principle" provide possibilities for the reduction of memory capacity requirements (see Ford and Simpson 2001). In the range of finite-difference schemes, the standard approximation is the Grünwald-Letnikov definition of the fractional derivative. This method converges linearly in time (e.g., Blank 1996). In Luchko and Punzi (2011), some techniques with better convergence rates are given. In order to clarify the influence of the fractional diffusion in comparison to standard diffusion, the following initial boundary value problem is considered:
$$\frac{\partial^{\alpha} T}{\partial t^{\alpha}} = \nabla^2 T \quad \text{in } B \times (0, t_{\mathrm{end}}), \tag{128}$$
$$T(\cdot, 0) = T_0 \quad \text{in } B, \tag{129}$$
$$\frac{\partial T}{\partial n} = 0 \quad \text{on } \partial B \times [0, t_{\mathrm{end}}], \tag{130}$$
where $B = (0,1)^2$ is the unit square in $\mathbb{R}^2$ and $t_{\mathrm{end}} = 1$ s. Furthermore, $T_0$ is the characteristic function of the set
$$\{(x_1, x_2) \in \mathbb{R}^2 \mid (x_1 - 0.75)^2 + (x_2 - 0.25)^2 < 0.002\}. \tag{131}$$
Fig. 26 Numerical solution of problem (128)–(130) using a two-dimensional DG-FEM for $\alpha = 0.8$ (upper row) and $\alpha = 1$ (bottom row) at $t = 0.33$ s, $t = 0.66$ s, and $t = 1.00$ s from left to right; both axes range from 0 to 1 (see also Luchko and Punzi 2011)
The numerical solution calculated by use of the approximation method described in McLean and Mustapha (2009) for $\alpha = 0.8$ (subdiffusion) and $\alpha = 1$ (standard diffusion) is illustrated in Fig. 26. The method uses a combination of linear time iteration and a discontinuous Galerkin scheme on the basis of a finite-element triangulation. The illustration shows that the heat diffuses more slowly in case of subdiffusion (upper row) than in case of standard diffusion (bottom row). Deviations from a symmetrical propagation behavior are due to boundary effects.
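The Grünwald-Letnikov approximation mentioned above, and its linear convergence, can be illustrated by a short sketch. It is not the scheme of McLean and Mustapha (2009) used for Fig. 26. The sketch assumes a uniform time grid and a function with $f(0) = 0$, for which the Grünwald-Letnikov sum also approximates the Caputo derivative (121); the function names are hypothetical.

```python
import numpy as np
from math import gamma


def gl_weights(beta, n):
    """Grünwald-Letnikov weights w_k = (-1)^k * binom(beta, k), k = 0..n."""
    w = np.empty(n + 1)
    w[0] = 1.0
    for k in range(1, n + 1):
        w[k] = w[k - 1] * (1.0 - (beta + 1.0) / k)
    return w


def gl_derivative(f_values, beta, h):
    """Approximate the derivative of order beta at the last grid point.

    f_values holds f(0), f(h), ..., f(n*h); the whole history enters the sum,
    which is the memory requirement discussed in the text.
    """
    f_values = np.asarray(f_values, dtype=float)
    n = f_values.size - 1
    w = gl_weights(beta, n)
    return h ** (-beta) * np.dot(w, f_values[::-1])


# Test function f(t) = t with known derivative t^(1-beta) / Gamma(2-beta).
beta = 0.5
exact = 1.0 / gamma(2.0 - beta)  # value at t = 1


def error(n):
    t = np.linspace(0.0, 1.0, n + 1)
    return abs(gl_derivative(t, beta, 1.0 / n) - exact)


# Halving the step size roughly halves the error (first-order convergence).
errors = [error(n) for n in (100, 200, 400)]
```

A short-memory variant would simply truncate the sum to the most recent grid points, trading accuracy for storage.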
5 Poroelastic Stress Field Modeling
It is evident that pumping a pressurized fluid into a rock formation changes the stresses within the rock as the fluid flows through pores and fractures. Leak-off causes the surrounding rock to expand (e.g., Biot 1935, 1941, 1955; Rice and Cleary 1976). This relation between pore pressure and stress (see Addis 1997; Engelder and Fischer 1994; Hillis 2000, 2001, 2003; Zhou and Ghassemi 2009, and the references therein) is known as poroelasticity. It may lead to changes in permeability and porosity – especially when fractures are stimulated within a petrothermal reservoir. Such fractures grow along the axis of maximum principal stress, whereas its direction results from the stresses at the fracture front (cf. Moeck et al. 2009). Furthermore, poroelastic effects also yield changes in the velocity regime of seismic waves, microseismicity, reactivation of slips and faults, disturbance of borehole stability, and changes in the flow paths of the fluid through the reservoir (see, e.g., Altmann et al. 2008). Moreover, the occurrence of mild seismic shocks in the surroundings of geothermal facilities, e.g., in Basel, Switzerland (December 2006 and January 2007, see Baisch et al. 2009) or Landau, Germany (August 2009, see Expertengruppe “Seismisches Risiko bei hydrothermaler Geothermie” 2010), led to a discussion about the safety of geothermal power plants. Consequently, modeling of stress field effects forms the fourth column of the Kaiserslautern model (Fig. 4). Along with the poroelastic effects on the rock matrix, the other major sources of possible rock displacement, even after terminating the injection of fluid, are thermoelastic processes due to the temperature difference between the injected fluid and the medium (see Brouwer et al. 2005; Ghassemi 2003; Ghassemi and Tarasovs 2004; Ghassemi and Zhang 2004; Hicks et al. 1996; Yin 2008). 
As a matter of fact, the hot rock is cooled down by the injection of cold fluid and, as a consequence, the rock shrinks, counteracting the poroelastic dilation (e.g., Nakao and Ishido 1998). Normally, on a short time scale, the mechanical effects are dominant and the thermoelastic effects can be neglected, but in a long-term simulation, they have to be taken into account. For comprehensive reviews of this problem, the reader is referred to Evans et al. (1999), Jing and Hudson (2002), Ghassemi (2003), Rutqvist and Stephansson (2003), and the references therein. We do not treat thermoelastic effects here, as a consistent treatment of those requires the consideration of flow, heat transport, and deformation processes as
a coupled system of equations (thermo-hydro-mechanical modeling) which goes beyond the scope of this chapter. For simplicity, within this section, we focus on poroelastic effects based on the considerations in Augustin (2012). In order to simulate the stress field in a homogeneous, isotropic medium $B \subset \mathbb{R}^3$, the so-called quasistatic equations of poroelasticity may be used. They were first described by Biot (1935, 1941). We briefly state the basic relations which are used to derive these equations. Further details can be found in Auriault (1973), Landau et al. (1986), Showalter (2000), Phillips (2005), Jaeger et al. (2007), Phillips and Wheeler (2007), and Lai et al. (2010). Considering a domain $B \subset \mathbb{R}^3$, with $x$ being a point in $B$ and $x'$ its image after a small deformation, the displacement vector $u$ is defined as
$$u(x,t) = x'(t) - x(t) \tag{132}$$
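for some $t \in [0, t_{\mathrm{end}}]$. It describes deformations of the solid porous medium. With the standard assumptions of linear elasticity, the strain tensor $\varepsilon$ is given by
$$\varepsilon_{ij}(u) = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j}(x,t) + \frac{\partial u_j}{\partial x_i}(x,t)\right) \tag{133}$$
and in case of an isotropic, homogeneous medium, this yields the stress tensor $\sigma$ given by
$$\sigma_{ij}(x,t) = C_{ijkl}(x,t)\,\varepsilon_{kl}(u) = \lambda\,\varepsilon_{kk}(u)\,\delta_{ij} + 2\mu\,\varepsilon_{ij}(u). \tag{134}$$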
Here, $C$ is the Cauchy elasticity tensor of rank four, and $\lambda$ and $\mu$ are the Lamé parameters of the system. Moreover, we used Einstein's summation convention. Since $u$ describes the solid medium, we need another quantity to describe the behavior of the fluid. As mentioned before in Sect. 4.1, the fluid flow in a porous medium can be modeled by Darcy's law (89). As we assume an isotropic, homogeneous medium and fluid, the permeability tensor $\boldsymbol{\kappa}$ can be replaced by a scalar $\kappa$. Moreover, we set $k = \kappa/\mu_f$. Next, we have to describe the interactions between fluid and solid. According to Biot (1935, 1941), there are two quantities which we have to consider. The first one is the poroelastic stress tensor $\sigma^{\mathrm{pe}}$ given by
$$\sigma^{\mathrm{pe}}(x,t) = \sigma(x,t) - \alpha I\,p(x,t). \tag{135}$$
Here, $I$ is the identity tensor of rank two, $\alpha$ is the so-called Biot-Willis constant, and $p$ is the pore pressure of the fluid. The second quantity is the so-called volumetric fluid content
$$\zeta(x,t) = c_0\,p(x,t) + \alpha\,\nabla\cdot u(x,t). \tag{136}$$
It describes those parts of volume changes of the fluid which are due to changes in fluid mass (see Jaeger et al. 2007, for a detailed derivation). $c_0$ is the specific storage coefficient. Equations (133)–(136) form a complete list of constitutive relations to describe the interacting fluid-solid system. The behavior of the system is governed by the usual conservation laws of physics. In our case, balance of linear momentum and balance of mass read (Augustin 2012)
$$\rho_s(x,t)\,\frac{\partial^2 u}{\partial t^2}(x,t) - \nabla\cdot\sigma^{\mathrm{pe}}(x,t) = f(x,t) \quad \text{(balance of linear momentum)}, \tag{137}$$
$$\frac{\partial \zeta}{\partial t}(x,t) + \nabla\cdot v_f(x,t) = h(x,t) \quad \text{(balance of mass)}, \tag{138}$$
with the mass density $\rho_s$ of the solid. We omit a detailed description of how these differential equations can be derived from the integral equations
$$\frac{d}{dt}\int_{B_t} \rho_s(x,t)\,v_s(x,t)\,dV(x) = \int_{B_t} f(x,t)\,dV(x) + \int_{\partial B_t} \sigma^{\mathrm{pe}}(x,t)\,n(x,t)\,dS(x), \tag{139}$$
$$\frac{d}{dt}\int_{B_t} \zeta(x,t)\,dV(x) = \int_{B_t} h(x,t)\,dV(x). \tag{140}$$
The interested reader can find those, e.g., in Jaeger et al. (2007), Lai et al. (2010), and the references therein. The occurring quantities here, as far as not yet introduced, are
$B_t \subset B$: arbitrary material volume;
$v_s(x,t)$: deformation velocity of the solid;
$f(x,t)$: body force density;
$h(x,t)$: fluid source density.
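The constitutive relations (133)–(136) can be evaluated pointwise for a given displacement gradient and pore pressure. The following sketch does this for a homogeneous, isotropic medium. The function names are hypothetical, and the parameter values in the example are illustrative assumptions rather than values taken from this chapter.

```python
import numpy as np


def strain(grad_u):
    """Linearized strain tensor (133): symmetric part of the displacement gradient."""
    return 0.5 * (grad_u + grad_u.T)


def stress(eps, lam, mu):
    """Isotropic linear-elastic stress (134) with Lame parameters lam and mu."""
    return lam * np.trace(eps) * np.eye(3) + 2.0 * mu * eps


def poroelastic_stress(sigma, p, alpha):
    """Poroelastic stress (135): elastic stress reduced by alpha times pore pressure."""
    return sigma - alpha * p * np.eye(3)


def fluid_content(eps, p, c0, alpha):
    """Volumetric fluid content (136); div(u) equals the trace of the strain."""
    return c0 * p + alpha * np.trace(eps)


# Illustrative (assumed) values: small uniaxial stretch combined with simple shear.
grad_u = np.array([[1.0e-4, 2.0e-4, 0.0],
                   [0.0,    0.0,    0.0],
                   [0.0,    0.0,    0.0]])
lam, mu = 4.0e9, 6.0e9             # Lame parameters in Pa (assumed)
alpha, c0, p = 0.867, 1.0e-10, 1.0e6  # Biot-Willis constant, storage coefficient in 1/Pa, pressure in Pa

eps = strain(grad_u)
sigma = stress(eps, lam, mu)
sigma_pe = poroelastic_stress(sigma, p, alpha)
zeta = fluid_content(eps, p, c0, alpha)
```

The pore pressure only changes the normal components of the poroelastic stress; the shear components coincide with those of the purely elastic stress.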
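By combining Eqs. (89) and (133)–(138), we get (Augustin et al. 2012)
$$\rho_s(x,t)\,\frac{\partial^2 u}{\partial t^2}(x,t) - (\lambda+\mu)\,\nabla(\nabla\cdot u(x,t)) - \mu\,\nabla^2 u(x,t) + \alpha\,\nabla p(x,t) = f(x,t), \tag{141}$$
$$\frac{\partial}{\partial t}\big(c_0\,p(x,t) + \alpha\,\nabla\cdot u(x,t)\big) - \nabla\cdot\big(k\,(\nabla p(x,t) - \rho_f\,g(x,t))\big) = h(x,t) \tag{142}$$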
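as the governing equations of poroelasticity in a homogeneous, isotropic medium. This can be seen as the poroelastic generalization of the acoustic wave equation (38). Since we are not interested in wave phenomena here, but in consolidation processes, it appears plausible to neglect the second-order time derivative term in (141). This can be justified by a nondimensionalization. Let $x_0$, $t_0$ be a characteristic length scale and a characteristic time scale, respectively. Neglecting fluid forces and setting
$$\tilde{x} = \frac{x}{x_0}, \quad \tilde{t} = \frac{t}{t_0}, \quad \tilde{\nabla} = \nabla_{\tilde{x}} = x_0\,\nabla_x, \quad \tilde{u} = \frac{u}{x_0}, \quad \tilde{p} = \frac{p}{\mu}, \quad \tilde{f} = \frac{x_0}{\mu}\,f, \quad \tilde{h} = t_0\,h$$
yields
$$\left[\frac{\rho_s x_0^2}{\mu\,t_0^2}\right]\frac{\partial^2 \tilde{u}}{\partial \tilde{t}^2} - \left[\frac{\lambda+\mu}{\mu}\right]\tilde{\nabla}(\tilde{\nabla}\cdot\tilde{u}) - \tilde{\nabla}^2\tilde{u} + \alpha\,\tilde{\nabla}\tilde{p} = \tilde{f}, \tag{143}$$
$$\frac{\partial}{\partial \tilde{t}}\big(c_0\mu\,\tilde{p} + \alpha\,\tilde{\nabla}\cdot\tilde{u}\big) - \left[\frac{t_0\,k\,\mu}{x_0^2}\right]\tilde{\nabla}\cdot\tilde{\nabla}\tilde{p} = \tilde{h}. \tag{144}$$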
The nondimensional coefficient in front of the diffusion term in (144) suggests t0 D
x02 k
leading to h
s k 2 x02
i
@2 uQ @Qt 2
h
i
rQ rQ uQ rQ 2 uQ C ˛ rQ pQ D fQ; @ Q Q p Q C ˛ r u Q rQ 2 pQ D h: c 0 @Qt
C
(145) (146)
For our purposes, the characteristic length scale $x_0$ can be assumed to be of the order of several hundred meters or several kilometers. For example, with $x_0 = 100$ m and for Berea sandstone (Schanz 2001), we have

$$\frac{\rho_s\,\mu\,k^2}{x_0^2} \approx 5.3\cdot 10^{-11},$$

which is rather small compared to

$$\frac{\lambda+\mu}{\mu} = \frac{5}{3}, \quad \alpha = 0.867, \quad \text{and} \quad c_0\mu \approx 0.461.$$

Therefore, the term $\rho_s(x,t)\,\partial_t^2 u(x,t)$ in (141) is negligible. Hence, the quasistatic equations of poroelasticity can be written as

$$-(\lambda+\mu)\,\nabla(\nabla\cdot u) - \mu\,\nabla^2 u + \alpha\,\nabla p = f \quad \text{in } B\times(0,t_{\mathrm{end}}), \tag{147}$$

$$\frac{\partial}{\partial t}\bigl(c_0\,p + \alpha\,\nabla\cdot u\bigr) - \nabla\cdot\bigl(k(\nabla p - \rho_f\,g)\bigr) = h \quad \text{in } B\times(0,t_{\mathrm{end}}). \tag{148}$$
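The order-of-magnitude argument above can be checked with a few lines of Python. This is only a sketch: the values of $\rho_s$, $\mu$, and $k$ below are assumptions of the kind tabulated for Berea sandstone in the poroelasticity literature and are not stated in this chapter, and the function names are chosen for illustration.

```python
# Assumed material data (not given in the text); k denotes permeability
# divided by fluid viscosity, as in the diffusion term of (142).
rho_s = 2458.0   # mass density in kg/m^3 (assumed)
mu = 6.0e9       # shear modulus in N/m^2 (assumed)
k = 1.9e-10      # in m^4/(N s) (assumed)
x0 = 100.0       # characteristic length in m (value used in the text)


def characteristic_time(x0, mu, k):
    """Time scale t0 = x0^2 / (mu k) that normalizes the diffusion term in (144)."""
    return x0**2 / (mu * k)


def inertia_coefficient(rho_s, mu, k, x0):
    """Nondimensional factor rho_s mu k^2 / x0^2 in front of the second time derivative in (145)."""
    return rho_s * mu * k**2 / x0**2


t0 = characteristic_time(x0, mu, k)
eps = inertia_coefficient(rho_s, mu, k, x0)
print(f"t0  = {t0:.3e} s")
print(f"eps = {eps:.3e}")
```

With these assumed inputs the inertia coefficient is many orders of magnitude smaller than the remaining coefficients $(\lambda+\mu)/\mu$, $\alpha$, and $c_0\mu$, which is the reason for dropping the second-order time derivative.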
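To guarantee uniqueness of a solution, these equations have to be equipped with initial and boundary conditions. Boundary conditions might be

$$u = u_D \ \text{ on } \Gamma_d\times[0,t_{\mathrm{end}}], \qquad (\sigma - \alpha I p)\,n = t_N \ \text{ on } \Gamma_t\times[0,t_{\mathrm{end}}], \tag{149}$$

$$p = p_D \ \text{ on } \Gamma_p\times[0,t_{\mathrm{end}}], \qquad k\,(\nabla p - \rho_f\,g)\cdot n = q \ \text{ on } \Gamma_f\times[0,t_{\mathrm{end}}]. \tag{150}$$

Here, the different parts of the boundary have to satisfy the conditions $\Gamma_d\cap\Gamma_t = \emptyset = \Gamma_p\cap\Gamma_f$ and $\Gamma_d\cup\Gamma_t = \partial B = \Gamma_p\cup\Gamma_f$. The indices "d", "t", "p", and "f" stand for "displacement", "tension", "pressure", and "flow". It turns out that a suitable initial condition is given by prescribing the fluid content $\zeta = c_0 p + \alpha\nabla\cdot u$ at $t = 0$, such that (Augustin 2012)

$$c_0\,p(x,0) + \alpha\,\nabla\cdot u(x,0) = \zeta(x,0) = \zeta^{(0)}(x) \quad \text{for all } x\in B. \tag{151}$$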
Probably, the most realistic way to do this would be to prescribe an initial pressure $p(x,0) = p^{(0)}(x)$, an initial volume force density $f(x,0) = f^{(0)}(x)$, and boundary conditions on $u$ at $t = 0$. With this, $u(x,0) = u^{(0)}(x)$ can be determined from Eq. (147), as this equation contains no time derivatives and, hence, also holds for $t = 0$. As a consequence, $\zeta(x,0) = \zeta^{(0)}(x)$ can be calculated. Existence and uniqueness results for a solution of the initial boundary value problem (147)–(151) can be found in Augustin (2012).

In order to put Eqs. (147)–(148) in context, we mention some relations, which are also visualized in Fig. 27. Obviously, we have a relation to the Cauchy-Navier equation

$$-(\lambda+\mu)\,\nabla(\nabla\cdot u) - \mu\,\nabla^2 u = f \tag{152}$$

of elastostatics. Moreover, a relation to the heat or diffusion equation

$$\frac{\partial}{\partial t}p - \nabla^2 p = h \tag{153}$$

is clearly recognizable, as this is a prototype of a time-dependent parabolic differential equation. Furthermore, there is a relation to the Stokes equations

$$-\mu\,\nabla^2 u + \nabla p = 0, \qquad \nabla\cdot u = 0, \tag{154}$$

which is probably the simplest system of equations for a vector-valued quantity $u$ and a scalar-valued quantity $p$.

Fig. 27 Relations of the equations of quasistatic poroelasticity to well-known (systems of) differential equations. The diagram links the quasistatic poroelastic system $-\frac{\lambda+\mu}{\mu}\nabla(\nabla\cdot u) - \nabla^2 u + \alpha\nabla p = 0$, $\partial_t(c_0\mu\,p + \alpha\nabla\cdot u) - \nabla^2 p = 0$ to the Stokes equations $-\mu\nabla^2 u + \nabla p = 0$, $\nabla\cdot u = 0$, the heat equation $\partial_t p - \nabla^2 p = 0$, and the Cauchy-Navier equation $-\frac{\lambda+\mu}{\mu}\nabla(\nabla\cdot u) - \nabla^2 u = 0$

The above given relations can also be established by regarding fundamental solutions to the quasistatic equations of poroelasticity. For the sake of convenience, we consider the dimensionless equations (cf. Eqs. (145) and (146))

$$-\frac{\lambda+\mu}{\mu}\,\nabla(\nabla\cdot u) - \nabla^2 u + \alpha\,\nabla p = 0, \tag{155}$$

$$\frac{\partial}{\partial t}\bigl(c_0\mu\,p + \alpha\,(\nabla\cdot u)\bigr) - \nabla^2 p = 0. \tag{156}$$
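For the sake of readability, dimensionless quantities are denoted without the tilde further on. With the abbreviations

$$C_1 = \frac{\alpha}{c_0(\lambda+2\mu)+\alpha^2}, \quad C_2 = \frac{\lambda+2\mu}{c_0\mu(\lambda+2\mu)+\alpha^2\mu}, \quad C_3 = \frac{c_0(\lambda+3\mu)+\alpha^2}{2\bigl(c_0(\lambda+2\mu)+\alpha^2\bigr)}, \quad C_4 = \frac{c_0(\lambda+\mu)+\alpha^2}{c_0(\lambda+3\mu)+\alpha^2}, \tag{157}$$

we obtain the fundamental solutions (Augustin 2012)

$$p^{\mathrm{Si}}(x,t) = C_2\,G^{\mathrm{Heat}}(x,t), \tag{158}$$

$$u^{\mathrm{Si}}(x,t) = C_1\left(\nabla_x\int_0^t \bigl(C_2\,G^{\mathrm{Heat}}(x,\tau) + G^{\mathrm{Harm}}(x)\,\delta(\tau)\bigr)\,d\tau\right), \tag{159}$$

$$p^{\mathrm{Fi}}(x,t) = C_1\bigl(\nabla_x C_2\,G^{\mathrm{Heat}}(x,t) + \nabla_x G^{\mathrm{Harm}}(x)\,\delta(t)\bigr), \tag{160}$$

$$u^{\mathrm{Fi}}(x,t) = C_1^2\,\nabla_x\left(\nabla_x\int_0^t \bigl(C_2\,G^{\mathrm{Heat}}(x,\tau) + G^{\mathrm{Harm}}(x)\,\delta(\tau)\bigr)\,d\tau\right) + u^{\mathrm{CN}}(x)\,\delta(t), \tag{161}$$

or, more explicitly,

$$p^{\mathrm{Si}}(x,t) = C_2\,\frac{1}{\sqrt{\pi}^{\,3}}\,\frac{1}{\sqrt{4C_2t}^{\,3}}\,\exp\left(-\frac{\|x\|^2}{4C_2t}\right), \tag{162}$$

$$u^{\mathrm{Si}}(x,t) = C_1\,\frac{x}{4\pi\|x\|^3}\left(\operatorname{erf}\left(\frac{\|x\|}{\sqrt{4C_2t}}\right) - \frac{2}{\sqrt{\pi}}\,\frac{\|x\|}{\sqrt{4C_2t}}\,\exp\left(-\frac{\|x\|^2}{4C_2t}\right)\right), \tag{163}$$

$$p^{\mathrm{Fi}}(x,t) = C_1\,\frac{x}{4\pi\|x\|^3}\,\delta(t) - C_1C_2\,\frac{2}{\sqrt{\pi}^{\,3}}\,\frac{x}{\sqrt{4C_2t}^{\,5}}\,\exp\left(-\frac{\|x\|^2}{4C_2t}\right), \tag{164}$$

$$\begin{aligned} u^{\mathrm{Fi}}_{ki}(x,t) ={}& C_3\,\frac{1}{4\pi\|x\|}\left(\delta_{ki} + C_4\,\frac{x_ix_k}{\|x\|^2}\right)\delta(t) \\ &+ C_1^2\,\frac{1}{4\pi\|x\|^3}\left[\left(\delta_{ik} - \frac{3x_ix_k}{\|x\|^2}\right)\left(\operatorname{erf}\left(\frac{\|x\|}{\sqrt{4C_2t}}\right) - \frac{2}{\sqrt{\pi}}\,\frac{\|x\|}{\sqrt{4C_2t}}\,\exp\left(-\frac{\|x\|^2}{4C_2t}\right)\right)\right. \\ &\left.\qquad + \frac{4}{\sqrt{\pi}}\left(\frac{\|x\|}{\sqrt{4C_2t}}\right)^{3}\frac{x_ix_k}{\|x\|^2}\,\exp\left(-\frac{\|x\|^2}{4C_2t}\right)\right] \end{aligned} \tag{165}$$

with Dirac's delta distribution $\delta(\cdot)$, $G^{\mathrm{Heat}}(\cdot,\cdot)$, $G^{\mathrm{Harm}}(\cdot)$, and $u^{\mathrm{CN}}(\cdot)$ being the fundamental solutions to the heat, Laplace, and Cauchy-Navier equation, respectively, as well as $\operatorname{erf}(\cdot)$ being the Gaussian error function defined as

$$\operatorname{erf}(\xi) = \frac{2}{\sqrt{\pi}}\int_0^{\xi}\exp(-\tau^2)\,d\tau. \tag{166}$$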
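The explicit formulas (162) and (163) are easy to evaluate numerically. The following sketch, which is not taken from the cited references, implements them with the Python standard library and checks by central differences that $\nabla\cdot u^{\mathrm{Si}} = (C_1/C_2)\,p^{\mathrm{Si}}$, a relation that follows from (158) and (159). The function names and the sample data are assumptions for illustration; the constants are evaluated in dimensionless form, i.e., with $\mu = 1$, $\lambda = 2/3$, and $c_0$ standing for $c_0\mu$.

```python
import math


def constants(lam, mu, alpha, c0):
    """Abbreviations C1 and C2 from (157) for given (assumed) material data."""
    denom = c0 * (lam + 2.0 * mu) + alpha**2
    return alpha / denom, (lam + 2.0 * mu) / (mu * denom)


def g_heat(x, t, c2):
    """Fundamental solution of d/dt - C2 * Laplacian in three dimensions."""
    r2 = sum(xi * xi for xi in x)
    return math.exp(-r2 / (4.0 * c2 * t)) / (4.0 * math.pi * c2 * t) ** 1.5


def p_si(x, t, c2):
    """Pressure part of the fluid-source solution, Eq. (162)."""
    return c2 * g_heat(x, t, c2)


def u_si(x, t, c1, c2):
    """Displacement part of the fluid-source solution, Eq. (163)."""
    r = math.sqrt(sum(xi * xi for xi in x))
    s = r / math.sqrt(4.0 * c2 * t)
    radial = math.erf(s) - 2.0 / math.sqrt(math.pi) * s * math.exp(-s * s)
    return [c1 * radial * xi / (4.0 * math.pi * r**3) for xi in x]


def divergence(field, x, h=1e-4):
    """Central-difference approximation of the divergence of a vector field at x."""
    total = 0.0
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (field(xp)[i] - field(xm)[i]) / (2.0 * h)
    return total


# Dimensionless sample data (assumed): mu = 1, lam = 2/3, c0 means c0 * mu.
c1, c2 = constants(lam=2.0 / 3.0, mu=1.0, alpha=0.867, c0=0.461)
x, t = [0.3, -0.2, 0.5], 0.7
div_u = divergence(lambda y: u_si(y, t, c1, c2), x)
print(div_u, c1 / c2 * p_si(x, t, c2))
```

The two printed numbers should agree up to the discretization error of the difference quotient, which gives a quick plausibility check for an implementation of the fundamental solutions.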
Note that the fundamental solution for the pressure in the Stokes equations (154) is $\nabla_x G^{\mathrm{Harm}}(x)$, which is part of $p^{\mathrm{Fi}}$.

Although the existence of a solution is guaranteed under certain conditions, it is in most cases not possible to compute such a solution analytically. Instead, a numerical solution scheme is needed to find a good approximate solution. Here, we introduce a solution scheme based on the method of fundamental solutions. The idea of the method of fundamental solutions is to choose a system of fundamental solutions to a given differential equation such that their singularities are not contained in the domain under consideration. Any finite linear combination of these fundamental solutions satisfies the corresponding differential equations with vanishing right-hand side and, thus, is a suitable ansatz for the solution of a corresponding initial boundary value problem. The coefficients of this ansatz can be determined by demanding that prescribed initial and boundary values have to be approximated, e.g., in a least squares or collocation sense. The basics of this idea can be dated back to Runge (1885), Trefftz (1926), and Walsh (1929). In the context of electrostatics, the method is known as the method of image charges (Jackson 1998). Results in the context of potential theory, which also give a connection to a single-layer approach, can be found in Freeden (1983), Freeden and Michel (2004), and the references therein.

The main advantages of the method are that it is mesh-free, integration-free, and easy to implement (Smyrlis 2009a). On the other hand, there is no general advice on how to choose the location of singularities or collocation points (Barnett and Betcke 2008; Katsurada and Okamoto 1996; Smyrlis and Karageorghis 2009). Choosing suitable sets of points is crucial for the quality of the approximation. Often, the occurring linear equation systems are ill-conditioned and have to be stabilized (Smyrlis 2009a).

It is possible to show that certain systems of fundamental solutions are dense subsets of the solution space to a given partial differential equation. We refer to Freeden (1980, 1983), Freeden and Kersten (1981), and Freeden and Gerhards (2013) for results in potential theory, Müller and Kersten (1980) for results on the Helmholtz equation, Browder (1962) and Smyrlis (2009a) for results on elliptic operators, Freeden and Reuter (1990), Freeden and Michel (2004), Smyrlis (2009b) for results on the Cauchy-Navier equation, Mayer (2007) and Mayer and Freeden (2015) for results on the Stokes equations, and Kupradze (1964), Johansson and Lesnic (2008), and Johansson et al. (2011) for results on the heat equation. Quantitative convergence results are harder to get. They are known for the Laplace equation (Katsurada 1989; Li 2008a) and a (modified) Helmholtz equation (Barnett and Betcke 2008; Li 2008b).

For the case of quasistatic poroelasticity, an ansatz based on the method of fundamental solutions can be written in the form
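$$u_i(x,t) = \sum_{n=1}^{N}\left(\sum_{m=1}^{M}\left(\sum_{k=1}^{3} a_{nm}^{(k)}\,u^{\mathrm{fi}}_{ki}(x-y_n,t-\tau_m) + b_{nm}\,u^{\mathrm{Si}}_i(x-y_n,t-\tau_m)\right) + \sum_{k=1}^{3} a_n^{\mathrm{CN},(k)}(t)\,u^{\mathrm{CN}}_{ki}(x-y_n)\right) + \sum_{l=1}^{L} b_l^{(0)}\,u^{\mathrm{Si}}_i(x-y_l,t-\tau_0), \tag{167}$$

$$p(x,t) = \sum_{n=1}^{N}\left(\sum_{m=1}^{M}\left(\sum_{k=1}^{3} a_{nm}^{(k)}\,p^{\mathrm{fi}}_{k}(x-y_n,t-\tau_m) + b_{nm}\,p^{\mathrm{Si}}(x-y_n,t-\tau_m)\right) + \sum_{k=1}^{3} a_n^{\mathrm{CN},(k)}(t)\,p^{\mathrm{St}}_{k}(x-y_n)\right) + \sum_{l=1}^{L} b_l^{(0)}\,p^{\mathrm{Si}}(x-y_l,t-\tau_0), \tag{168}$$

with $N, M, L \in \mathbb{N}$, and

$$u^{\mathrm{fi}}(x-y,t-\tau) = u^{\mathrm{Fi}}(x-y,t-\tau) - u^{\mathrm{CN}}(x-y)\,\delta(t-\tau), \tag{169}$$

$$p^{\mathrm{fi}}(x-y,t-\tau) = p^{\mathrm{Fi}}(x-y,t-\tau) - p^{\mathrm{St}}(x-y)\,\delta(t-\tau), \tag{170}$$

$$p^{\mathrm{St}}(x-y) = \nabla_x G^{\mathrm{Harm}}(x-y). \tag{171}$$

Fig. 28 Time development of p (left), evaluated at the origin x = 0, and approximation error (right). [Left panel: p over t ∈ [0, 1], values between 2.16e−03 and 2.36e−03. Right panel: relative error E_rel(x = 0, t) over t ∈ [0, 1] on a logarithmic scale between 1e−09 and 1e−07.]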
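The collocation idea behind an ansatz such as (167)–(168) can be illustrated on a much simpler model problem. The following sketch is not the poroelastic scheme of the text: it assumes the two-dimensional Laplace equation on the unit disk with Dirichlet data taken from the harmonic function $x_1^2 - x_2^2$, places 16 singularities on a circle of radius 2 outside the domain, and determines the coefficients by collocation at 16 boundary points. All names and parameter choices are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            fac = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= fac * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def g_harm(x, y):
    """Fundamental solution of the Laplace equation in two dimensions."""
    return math.log(math.hypot(x[0] - y[0], x[1] - y[1])) / (2.0 * math.pi)


def exact(x):
    """Harmonic test function that supplies the Dirichlet boundary values."""
    return x[0] ** 2 - x[1] ** 2


n = 16
angles = [2.0 * math.pi * j / n for j in range(n)]
boundary = [(math.cos(a), math.sin(a)) for a in angles]             # collocation points
sources = [(2.0 * math.cos(a), 2.0 * math.sin(a)) for a in angles]  # singularities outside the disk

# One equation per collocation point, one unknown per singularity.
matrix = [[g_harm(xk, yj) for yj in sources] for xk in boundary]
coeffs = solve(matrix, [exact(xk) for xk in boundary])


def approx(x):
    """Linear combination of fundamental solutions; harmonic inside the disk."""
    return sum(c * g_harm(x, y) for c, y in zip(coeffs, sources))


point = (0.3, 0.4)
error = abs(approx(point) - exact(point))
print(error)
```

Because every basis function is harmonic in the disk, only the boundary values have to be fitted. In the poroelastic setting, the same construction additionally involves singularities in time and the initial condition, and the resulting systems are typically solved in a stabilized least squares sense rather than by plain elimination.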
It can be shown that this ansatz can be reduced by neglecting the fi-parts. A detailed discussion on the method of fundamental solutions, including density results in appropriate solution spaces, can be found in the PhD thesis Augustin (2014). The considerations here are restricted to the behavior of the pressure $p$ in an exemplary problem on the cube $(-1,1)^3$. For this purpose, we assume that the fluid content $\zeta(x,t)$ associated with the functions $u^{\mathrm{Si}}(x_1-2,x_2-2,x_3-2,t+1)$, $p^{\mathrm{Si}}(x_1-2,x_2-2,x_3-2,t+1)$ restricted to $(-1,1)^3\times(0,1)$ is known at time $t = 0$ in the whole cube. As boundary conditions, let the associated normal tensions and normal fluid fluxes be prescribed on the surfaces of the cube at $x_3 = \pm 1$ and the displacement and pressure themselves at all other surfaces. We want to approximate the values of $u^{\mathrm{Si}}(x_1-2,x_2-2,x_3-2,t+1)$, $p^{\mathrm{Si}}(x_1-2,x_2-2,x_3-2,t+1)$ in $(-1,1)^3\times(0,1)$.

Figure 28 depicts the development of the pressure $p$ in time at the origin, i.e., the center of the cube. The behavior of $p$ is non-monotonic as a result of poroelastic effects. The approximation error is shown on the right-hand side of Fig. 28, using a logarithmic scale. As can be seen, the approximation is very good with an error between $2\cdot 10^{-7}$ and $10^{-9}$. Figures 29 and 30 show on the left-hand side the pressure in the intersections of $(-1,1)^3$ with the planes $x_3 = 0$ and $x_1 = 0$ at times $t = 0.2$ and $t = 1$, respectively. On the right-hand side, the associated relative errors are shown in a logarithmic pseudocolor plot using logarithmic contour distances. As can be seen, the largest errors in the intersection with the plane $x_3 = 0$ are found inside the cube, and the largest errors in the intersection with the plane $x_1 = 0$ are found at the boundaries at $x_3 = \pm 1$. This leads to the conclusion that the approximation of normal tensions and normal fluid fluxes is harder than the approximation of prescribed values of displacement or pressure at the boundary. For further examples including a parameter study on the method of fundamental solutions, the reader is referred to the PhD thesis Augustin (2014).

Fig. 29 Left: pseudocolor plot of p in the intersection with the plane x3 = 0 (upper row) and the intersection with the plane x1 = 0 (lower row) at time t = 0.2. Right: logarithmic pseudocolor plots using logarithmic contour distances for the associated relative errors in approximating p

Fig. 30 Left: pseudocolor plot of p in the intersection with the plane x3 = 0 (upper row) and the intersection with the plane x1 = 0 (lower row) at time t = 1. Right: logarithmic pseudocolor plots using logarithmic contour distances for the associated relative errors in approximating p

In order to use the method of fundamental solutions for problems with nonvanishing right-hand side, it has to be combined with another method. A possible extension is the dual reciprocity method (Golberg and Chen 1998). The starting point is the fact that a solution of a linear differential equation can be represented as the sum of a solution to the homogeneous equation and a particular solution to the equation with nonvanishing right-hand side. Let us assume that we have a differential equation for the scalar-valued function $p$ with a right-hand side $h$. The idea of the dual reciprocity method is to find systems of radial basis functions $\{p^{(i)}\}_{i=0}^{N}$ and $\{h^{(i)}\}_{i=0}^{N}$ such that for every $i$, the corresponding $p^{(i)}$ is a solution to the differential equation under consideration with right-hand side $h^{(i)}$. If such systems exist and build up a basis of the respective spaces to which $h$ and $p$ belong, a solution with arbitrary right-hand side $h$ can be found by expanding $h$ with respect to $\{h^{(i)}\}_{i=0}^{N}$. The solution $p_{\mathrm{part}}$ is then given by plugging the coefficients of this expansion into an ansatz given by a similar expansion based on $\{p^{(i)}\}_{i=0}^{N}$. In order to satisfy initial and boundary conditions, the given initial and boundary values are modified by subtracting the values of $p_{\mathrm{part}}$ for $t = 0$ and at the boundary, respectively, and the method of fundamental solutions is applied to the new problem with $h = 0$ and the modified initial and boundary conditions to obtain $p_{\mathrm{hom}}$. The wanted solution is then given by $p = p_{\mathrm{part}} + p_{\mathrm{hom}}$.

In case of quasistatic poroelasticity, the above-explained procedure has to be generalized. It is convenient to regard (147) and (148) as a system of four equations. The right-hand side of this system is a vector with four components, $f_1$, $f_2$, $f_3$, and $h$. Thus, we need a system $\{f^{(i)}\}_{i=0}^{N}$ of radial basis functions which takes values in the space of real-valued $4\times 4$ tensors. This becomes clear if we consider that in order to express any arbitrary vector in $\mathbb{R}^4$, a basis of four vectors is needed. Consequently, the corresponding radial basis functions $\{u^{(i)}\}_{i=0}^{N} \subset \mathbb{R}^{4\times 4}$ of solutions are also tensor valued. The procedure of how to find a solution stays just the same as in the scalar-valued case mentioned above.
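The construction of a particular solution in the scalar case can be sketched as follows. As a simplifying assumption, the sketch treats the stationary model equation $\nabla^2 p = h$ in three dimensions instead of a time-dependent problem, and it uses the pair $h^{(i)}(x) = 1 + \|x - x_i\|$, $p^{(i)}(x) = \|x - x_i\|^2/6 + \|x - x_i\|^3/12$, for which $\nabla^2 p^{(i)} = h^{(i)}$ holds. The right-hand side, the centers, and all names are hypothetical choices for illustration.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            fac = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= fac * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def phi(r):
    """Radial basis function used to expand the right-hand side."""
    return 1.0 + r


def psi(r):
    """Radial particular solution: the 3D Laplacian of psi(|x|) equals phi(|x|)."""
    return r**2 / 6.0 + r**3 / 12.0


def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def h(x):
    """Example right-hand side (hypothetical)."""
    return math.exp(-(x[0] ** 2 + x[1] ** 2 + x[2] ** 2))


# Centers: the corners of the cube (-1, 1)^3 and its midpoint (assumed choice).
centers = [(i, j, k) for i in (-1.0, 1.0) for j in (-1.0, 1.0) for k in (-1.0, 1.0)]
centers.append((0.0, 0.0, 0.0))

# Step 1: expand h with respect to the functions phi(|x - x_i|) by interpolation.
matrix = [[phi(dist(x, y)) for y in centers] for x in centers]
coeffs = solve(matrix, [h(x) for x in centers])


def h_interp(x):
    return sum(c * phi(dist(x, y)) for c, y in zip(coeffs, centers))


# Step 2: reuse the same coefficients with the functions psi(|x - x_i|).
def p_part(x):
    return sum(c * psi(dist(x, y)) for c, y in zip(coeffs, centers))


def laplacian(f, x, step=1e-3):
    """Second-order central-difference approximation of the Laplacian at x."""
    total = 0.0
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += step
        xm[i] -= step
        total += (f(xp) - 2.0 * f(x) + f(xm)) / step**2
    return total


point = [0.2, -0.4, 0.1]
print(laplacian(p_part, point), h_interp(point))
```

The particular solution reproduces the interpolated right-hand side, not $h$ itself, so its quality depends on how well the radial basis functions resolve $h$. The remaining homogeneous problem with modified boundary values would then be treated by the method of fundamental solutions.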
6 Opportunities, Challenges, and Perspectives
Geothermal energy is a key renewable resource with many valuable opportunities (see, e.g., Mongillo 2011) such as
• Extensive global distribution: There is a high potential for geothermal energy production in many areas. Nevertheless, a joint worldwide research activity in the geothermal sector is not feasible, as the geological and geothermal conditions vary substantially between countries. Geothermal energy therefore requires decentralized research facilities, complemented by close networking in training and education activities worldwide.
• Independence of season: Geothermal energy production is not dependent on any weather phenomenon (such as solar and wind influences) and is inexhaustibly available anywhere and anytime in the Earth's interior.
• Environmentally friendly character: Deep geothermal energy projects achieve an excellent carbon balance. After the installation of a geothermal power plant, the net power production is totally CO2-free. Further valuable arguments are the creation of local value, the prevention of foreign policy influence, the independence from fossil energy forms, and the small space requirement.
• Contribution to the development of diversified power: Geothermal resources provide a reliable local energy source that can, at least to some extent, be used to replace the energy production based on fossil fuels.

In consequence, the International Energy Agency report (see, e.g., International Energy Agency 2010) points out the perspective that geothermal resources have the potential to make a considerable contribution towards meeting the world's energy needs well into the future while contributing to reduced emissions and to the mitigation of climate change. The global geothermal potential is enormous, and geomathematics as an interdisciplinary science can play a decisive role in the scientific consortium concerned with deep geothermal research.
The GLITNIR geothermal research report (see Welding 2007) lists some challenges which are still valid today:
• With the rapid development of the industry, there is an immediate need for additional expertise (from resource development to business management), support services (drilling capacity, data/information generation and availability, technology, power generation equipment, etc.), focused capital, transparency of the assurance market, and public acceptance.
• There is still some fragmentation in the sector to appropriately fund and develop the resource.
• Risk minimization including geomathematical as well as geophysical research is absolutely indispensable. Further academic education is necessary to increase the industry talent pool. Overall project risks are higher with inexperienced scientists.
• Pressure will remain to ramp up projects at a rapid pace, requiring commitments from multiple players and institutions including the public sector.
• Geothermal energy is the only real renewable baseload electricity option, yet does not get enough political support. Other incentives for geothermal exploration are needed which could help to cover at least parts of the drilling risk for geothermal reservoirs.
• Future expansion of geothermal developments depends on exploring new fields and overcoming technical calamities in known fields that are not yet exploited. A particular issue that should currently be addressed by the geothermal community is the development of reliable EGS procedures that can ensure sustainable flow rates.
• The potential risk of induced seismicity prevents the sustainable success of the geothermal technology and raises the problem of social acceptance among the population (see Bauer et al. 2014a; Freeden 2013, and the references therein for more detailed explanations). There is a discrepancy between the willingness of the population to utilize "green energies" more intensively and the acceptance of neighborhoods close to geothermal facilities.

Altogether, we are led to the following key elements in geothermal development (see, e.g., Gehringer and Loksha 2012; MIT (Massachusetts Institute of Technology) 2006):
• Geothermal energy is well positioned within the renewable energies.
• Advanced exploration methods are crucial to reduce risks of failure in geothermal drilling and relevant costs during the exploratory phase.
• A large and targeted geomathematical as well as geophysical research effort is needed to improve the results obtained by geothermal exploration methods.
• Substantial geothermal growth could be provided by EGS technology; however, sociopolitical acceptance among the population is increasingly becoming a success factor for the realization of geothermal projects. The indicators "risk perception" and "societal and individual benefits" are decisive features (see, e.g., Aitken 2010; Bauer et al. 2014a,b; Freeden 2013).

All in all, geothermal energy is becoming more and more important in view of the increasing demand for renewable energy production. The key elements of a successful geothermal energy development are illustrated in Fig. 31.
All developments and success factors are liable to a locally dependent understanding of reliable research study and external as well as internal stakeholder management. Geomathematics together with the scientific consortium concerned with the geothermal business will help to overcome the critical issues, i.e., minimizing risks, maximizing economic efficiency, and, finally, strong decorrelation of geothermal energy retrieval and local seismicity, as well as sociopolitical acceptance. It is remarkable that, in the interplay of risk and bankability (see Figs. 32 and 33), geomathematical components within pre-survey and exploration turn out to be both of tremendous significance and low cost. Unfortunately, the total success of exploration phases in the past is typically seen as critical as illustrated in Fig. 32 (see Gehringer and Loksha 2012). Undoubtedly, the special feature is that potential and exploration techniques have to deal with regions of the Earth which are not at all accessible for direct measurement and observation. Characterizing the interrelation between project risk and investment costs invariably remains beyond a certain success level such that there is a canonical limitation in all geoscientific work.
Fig. 31 Key elements of successful geothermal energy development (due to Gehringer and Loksha 2012)
Fig. 32 Geothermal project risk, cumulative investment costs, and bankability (under past geomathematical capabilities, cf. Gehringer and Loksha 2012)
Fig. 33 Geothermal project risk, cumulative investment costs, and bankability (under today’s geomathematical capabilities)
This is also the reason why geothermal projects inherently carry an exploration risk (i.e., resource risk), which is often considered the greatest challenge from an investor's point of view. Meanwhile, however, computers and observational technology have resulted in an explosive propagation of mathematics in almost every area of science. Indeed, mathematics acts as an interdisciplinary accelerator even in exploration. The "mathematization of sciences" allows for the handling of complicated models and structures even for large data sets. Nowadays, modeling, computation, and visualization yield reliable simulation of states and processes. In summary, the authors estimate the project risk in a more realistic way to be of the type illustrated in Fig. 33, at least when modern geomathematical tools are appropriately taken into account.

Acknowledgements
The work of the Geomathematics Group Kaiserslautern and G.E.O.S Ingenieurgesellschaft mbH, Freiberg, is supported by the "Verbundprojekt GEOFÜND: Charakterisierung und Weiterentwicklung integrativer Untersuchungsmethoden zur Quantifizierung des Fündigkeitsrisikos" (PI: W. Freeden) of the Federal Ministry for Economic Affairs and Energy (BMWi), Germany. M. Augustin has been supported by a fellowship of the German National Academic Foundation (Studienstiftung des deutschen Volkes). C. Gerhards has been supported by a fellowship within the Postdoc-program of the German Academic Exchange Service (DAAD). S. Eberle is thankful for the support by the Rhineland-Palatinate Center of Excellence for Climate Change Impacts. M. Ilyasov, S. Möhringer, H. Nutz, I. Ostermann, and A. Punzi are thankful for the support by the Rhineland-Palatinate excellence research center "Center for Mathematical and Computational Modeling (CM)²" and the University of Kaiserslautern within the scope of the project "EGMS" (PI: W. Freeden).
References Addis MA (1997) The stress-depletion response of reservoirs. In: SPE annual technical conference and exhibition, San Antonio, 5–8 Oct 1997 Adler PM, Thovert JF (1999) Theory and applications in porous media. Fractures and fracture networks, vol 15. Kluwer Academic, Dordrecht Aitken M (2010) Why we still don’t understand the social aspects of wind power: a critique of key assumptions with the literature. Energy Policy 38:1834–1841 Altmann J, Dorner A, Schoenball M, Müller BIR, Müller T (2008) Modellierung von porendruckinduzierten Änderungen des Spannungsfeldes in Reservoiren. In: Kongressband, Geothermiekongress 2008, Karlsruhe Arbogast T (1989) Analysis of the simulation of single phase flow through a naturally fractured reservoir. SIAM J Numer Anal 26:12–29 Arbogast T, Douglas J, Hornung U (1990) Derivation of the double porosity model of single phase flow via homogenization theory. SIAM J Math Anal 21:823–836 Assteerawatt A (2008) Flow and transport modelling of fractured aquifers based on a geostatistical approach. PhD thesis, Institute of Hydraulic Engineering, University of Stuttgart Augustin M (2012) On the role of poroelasticity for modeling of stress fields in geothermal reservoirs. Int J Geomath 3:67–93 Augustin M (2014) A method of fundamental solutions in poroelasticity to model the stress field in geothermal reservoirs. PhD thesis, Geomathematics Group, University of Kaiserslautern Augustin M, Freeden W, Gerhards C, Möhringer S, Ostermann I (2012) Mathematische Methoden in der Geothermie. Math Semesterber 59:1–28 Auradou H (2009) Influence of wall roughness on the geometrical, mechanical and transport properties of single fractures. J Phys D Appl Phys 42:214015 Auriault J-L (1973) Contribution à l’étude de la consolidation des sols. PhD thesis, L’Université scientifique et médicale de Grenoble Axelsson G, Gunnlaugsson E (2000) Long-term monitoring of high- and low-enthalpy fields under exploitation. 
In: World geothermal congress 2000, pre-congress course, Kokonoe Baisch S, Carbon D, Dannwolf U, Delacou B, Delvaux M, Dunand F, Jung R, Koller M, Martin C, Sartori M, Secanell R, Vorös R (2009) Deep heat mining Basel – seismic risk analysis. SERIANEX. Technical report, study prepared for the Departement für Wirtschaft, Soziales und Umwelt des Kantons Basel-Stadt, Amt für Umwelt und Energie Barenblatt GI, Zheltov IP, Kochina IN (1960) Basic concepts in the theory of seepage of homogeneous liquids in fissured rocks. PMM Sov Appl Math Mech 24:852–864 Barnett AH, Betcke T (2008) Stability and convergence of the method of fundamental solutions for Helmholtz problems on analytic domains. J Comput Phys 227:7003–7026 Bauer M, Freeden W, Jacobi H, Neu T (2014a) Energiewirtschaft 2014. Springer Spektrum, Wiesbaden Bauer M, Freeden W, Jacobi H, Neu T (2014b) Handbuch Tiefe Geothermie. Springer Spektrum, Berlin/Heidelberg Baysal E, Kosloff DD, Sherwood JWC (1983) Reverse time migration. Geophysics 48: 1514–1524 Baysal E, Kosloff DD, Sherwood JWC (1984) A two-way nonreflecting wave equation. Geophysics 49:132–141 Bear J (1972) Dynamics of fluids in porous media. Elsevier, New York Bear J, Tsang CF, de Marsily G (1993) Flow and contaminant transport in fractured rock. Academic, San Diego
Berkowitz B (1995) Analysis of fracture network connectivity using percolation theory. Math Geol 27:467–483 Berkowitz B (2002) Characterizing flow and transport in fractured geological media: a review. Adv Water Resour 25:852–864 Billette F, Brandsberg-Dahl S (2005) The 2004 BP velocity benchmark. In: 67th annual international meeting EAGE, Madrid. Expanded abstracts. EAGE Biondi BL (2006) Three-dimensional seismic imaging. Society of Exploration Geophysicists, Tulsa Biot MA (1935) Le problème de la consolidation des matières argileuses sous une charge. Ann Soc Sci Brux B55:110–113 Biot MA (1941) General theory of three-dimensional consolidation. J Appl Phys 12:151–164 Biot MA (1955) Theory of elasticity and consolidation for a porous anisotropic solid. J Appl Phys 26:182–185 Blakely RJ (1996) Potential theory in gravity & magnetic application. Cambridge University Press, Cambridge Blank L (1996) Numerical treatment of differential equations of fractional order. Technical report, numercial analysis report, University of Manchester Bleistein N (1987) On the imaging of reflectors in the Earth. Geophysics 49:931–942 Bleistein N, Cohen JK, Stockwell JW (2000) Mathematics of multidimensional seismic imaging, migration, and inversion. Springer, New York Bödvarsson G (1964) Physical characteristics of natural heat sources in Iceland. In: Proceedings of the United Nations conference on new sources of energy, vol 2. United Nations Bollhöfer M, Grote MJ, Schenk O (2008) Algebraic multilevel preconditioner for the Helmholtz equation in heterogeneous media. SIAM J Sci Comput 31:3781–3805 Bonomi E, Pieroni E (1998) Energy-tuned absorbing boundary conditions. In: 4th SIAM international conference on mathematical and numerical aspects of wave propagation, Colorado School of Mines Bording RP, Liner CL (1994) Theory of 2.5-D reverse time migration. 
In: Proceedings, 64th annual international meeting: society of exploration geophysicists, Los Angeles Brouwer GK, Lokhorst A, Orlic B (2005) Geothermal heat and abandoned gas reservoirs in the Netherlands. In: Proceedings world geothermal congress 2005, Antalya Browder FE (1962) Approximation by solutions of partial differential equations. Am J Math 84:134–160 Brown SR (1987) Fluid flow through rock joints: the effect of surface roughness. J Geophys Res 92:1337–1347 Buhmann MD (2003) Radial basis functions: theory and implementations. Cambridge monographs on applied and computational mathematics, vol 12. Cambridge University Press, Cambridge Buske S (1994) Kirchhoff-Migration von Einzelschußdaten. Master thesis, Institut für Meteorologie und Geophysik der Johann Wolfgang Goethe Universität Frankfurt am Main Chen M, Bai M, Roegiers JC (1999) Permeability tensors of anisotropic fracture networks. Math Geol 31:355–373 Chen Z, Huan G, Ma Y (2006) Computational methods for multiphase flows in porous media. Computational science & engineering, vol 2. SIAM, Philadelphia Cheng H-P, Yeh G-T (1998) Development and demonstrative application of a 3-D numerical model of subsurface flow, heat transfer, and reactive chemical transport: 3DHYDROGEOCHEM. J Contam Hydrol 34:47–83 Claerbout J (2009) Basic Earth imaging. Stanford University, Stanford Darcy HPG (1856) Les Fontaines Publiques de la Ville de Dijon. Victor Dalmont, Paris de Boer R (2000) Theory of porous media – highlights in historical development and current state. Springer, Berlin Deng F, McMechan GA (2007) 3-D true amplitude prestack depth migration. In: Proceedings, SEG annual meeting, San Antonio Dershowitz WS, La Pointe PR, Doe TW (2004) Advances in discrete fracture network modeling. In: Proceedings, US EPA/NGWA fractured rock conference, Portland, pp 882–894
Diersch H-J (1985) Modellierung und numerische Simulation geohydrodynamischer Transportprozesse. PhD thesis, Akademie der Wissenschaften der DDR Diersch H-J (2000) Numerische Modellierung ober- und unterirdischer Strömungs- und Transportprozesse. In: Martin H, Pohl M (eds) Technische Hydromechanik 4 – Hydraulische und numerische Modelle. Verlag Bauwesen, Berlin Dietrich P, Helmig R, Sauter M, Hötzl H, Köngeter J, Teutsch G (2005) Flow and transport in fractured porous media. Springer, Berlin Du X, Bancroft JC (2004) 2-D wave equation modeling and migration by a new finite difference scheme based on the Galerkin method. Technical report, CREWES Durst P, Vuataz FD (2000) Fluid-rock interactions in hot dry rock reservoirs: a review of the HDR sites and detailed investigations of the Soultz-sous-Forets system. In: Proceedings of the world geothermal congess 2000, Kyushu-Tohoku Eberle S (2014) Forest fire determination: theory and numerical aspects. PhD thesis, Geomathematics Group, University of Kaiserslautern Eberle S, Freeden W, Matthes U (2015) Forest fire spreading. In Freeden W, Nashed B, Sonar T (Eds) Handbook of Geomathematics, 2nd Edn. Springer Eker E, Akin S (2006) Lattice Boltzmann simulation of fluid flow in synthetic fractures. Transp Porous Media 65:363–384 Ene HI, Poliševski D (1987) Thermal flow in porous media. D. Reidel, Dordrecht Engelder T, Fischer MP (1994) Influence of poroelastic behaviour on the magnitude of minimum horizontal stress, Sh , in overpressured parts of sedimentary basins. Geology 22:949–952 Engl W, Hanke M, Neubauer A (1996) Regularization of inverse problems. Kluwer Academic, Dordrecht Ernstson K, Alt W (2013) Gravity and geomagnetic methods in geothermal exploration: understanding and misunderstanding. 
World Min 65:115–122 Evans KF, Cornet FH, Hashida T, Hayashi K, Ito T, Matsuki K, Wallroth T (1999) Stress and rock mechanics issues of relevance to HDR/HWR engineered geothermal systems: review of developments during the past 15 years. Geothermics 28:455–474 Expertengruppe “Seismisches Risiko bei hydrothermaler Geothermie” (2010) Das seismische Ereignis bei Landau vom 15. August 2009, Abschlussbericht. Technical report, on behalf of the Ministerium für Umwelt, Landwirtschaft, Ernährung, Weinbau und Forsten des Landes Rheinland-Pfalz Fehlinger T (2009) Multiscale formulations for the disturbing potential and the deflections of the vertical in locally reflected physical geodesy. PhD thesis, Geomathematics Group, University of Kaiserslautern Fisher N, Lewis T, Embleton B (1993) Statistical analysis of spherical data. Cambridge University Press, Cambridge Fomin S, Hashida T, Shimizu A, Matsuki K, Sakaguchi K (2003) Fractal concept in numerical simulation of hydraulic fracturing of the hot dry rock geothermal reservoir. Hydrol Process 17:2975–2989 Ford NJ, Simpson A (2001) The numerical solution of fractional differential equations: speed versus accuracy. Numer Algorithms 26:333–346 Foulger G, Natland J, Presnall D, Anderson D (2005) Plates, plumes, and paradigms. Geological Society of America, Boulder Freeden C (2013) The role and the potential of communication by analysing the social acceptance of the German deep geothermal energy market. Master thesis, University of Plymouth Freeden W (1980) On the approximation of external gravitational potential with closed systems of (trial) functions. Bull Geod 54:1–20 Freeden W (1981) On approximation by harmonic splines. Manuscr Geod 6:193–244 Freeden W (1983) Least squares approximation by linear combination of (multi-)poles. Report 344, Departement of Geodetic Science and Surveying, The Ohio State University, Columbus Freeden W (1999) Multiscale modelling of spaceborne geodata. 
Teubner, Stuttgart Freeden W (2011) Metaharmonic lattice point theory. CRC/Taylor & Francis, Boca Raton
Modeling Deep Geothermal Reservoirs: Recent Advances and Future Perspectives
Freeden W (2015) Geomathematics: Its Role, Its Aim, and Its Potential. In: Freeden W, Nashed Z, Sonar T (Eds) Handbook of Geomathematics, 2nd Edn. Springer Freeden W, Blick C (2013) Signal decorrelation by means of multiscale methods. World Min 65:304–317 Freeden W, Gerhards C (2010) Poloidal and toroidal field modeling in terms of locally supported vector wavelets. Math Geosci 42:817–838 Freeden W, Gerhards C (2013) Geomathematically oriented potential theory. Chapman & Hall/CRC, Boca Raton Freeden W, Gutting M (2013) Special functions of mathematical (geo-)physics. Birkhäuser, Basel Freeden W, Kersten H (1981) A constructive approximation theorem for the oblique derivative problem in potential theory. Math Methods Appl Sci 3:104–114 Freeden W, Mayer C (2003) Wavelets generated by layer potentials. Appl Comput Harm Anal 14:195–237 Freeden W, Michel V (2004) Multiscale potential theory with applications to geoscience. Birkhäuser, Boston Freeden W, Nutz H (2011) Satellite gravity gradiometry as tensorial inverse problem. Int J Geomath 2:123–146 Freeden W, Nutz H (2014) Mathematische Methoden. In: Bauer M, Freeden W, Jacobi H, Neu T (eds) Handbuch Tiefe Geothermie. Springer, Heidelberg, pp 125–222 Freeden W, Reuter R (1990) A constructive method for solving the displacement boundary-value problem of elastostatics by use of global basis systems. Math Methods Appl Sci 12:105–128 Freeden W, Schreiner M (2006) Local multiscale modelling of geoid undulations from deflections of the vertical. J Geodesy 79:641–651 Freeden W, Schreiner M (2009) Spherical functions of mathematical geosciences: a scalar, vectorial, and tensorial setup. Springer, Berlin Freeden W, Wolf K (2009) Klassische Erdschwerefeldbestimmung aus der Sicht moderner Geomathematik. Math Semesterber 56:53–77 Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere (with applications to geomathematics). 
Oxford Science Publications/Clarendon, Oxford Freeden W, Mayer C, Schreiner M (2003) Tree algorithms in wavelet approximations by Helmholtz potential operators. Numer Funct Anal Optim 24:747–782 Freeden W, Fehlinger T, Klug M, Mathar D, Wolf K (2009) Classical globally reflected gravity field determination in modern locally oriented multiscale framework. J Geodesy 83:1171–1191 Gehringer M, Loksha V (2012) Handbook on planning and financing geothermal power generation. ESMAP (Energy Sector Management Assistence Programm), main findings and recommendations, The International Bank for Reconstruction and Development, Washington Georgsson LS, Friedleifsson IB (2009) Geothermal energy in the world from energy perspective. In: Short course IV on exploration for geothermal resources, Lake Naivasha, pp 1–22 Geothermal Energy Association (2011) Annual US geothermal power production and development report. Technical report Gerhards C (2011) Spherical multiscale methods in terms of locally supported wavelets: theory and application to geomagnetic modeling. PhD thesis, Geomathematics Group, University of Kaiserslautern Gerhards C (2012) Locally supported wavelets for the separation of spherical vector fields with respect to their sources. Int J Wavel Multires Inf Proc 10:1250034 Gerhards C (2014) A multiscale power spectrum for the analysis of the lithospheric magnetic field. Int J Geomath. 5:63–79 Ghassemi A (2003) A thermoelastic hydraulic fracture design tool for geothermal reservoir development. Technical report, Department of Geology & Geological Engineering, University of North Dakota Ghassemi A, Tarasovs S (2004) Three-dimensional modeling of injection induced thermal stresses with an example from Coso. In: Proceedings, 29th workshop on geothermal reservoir engineering, Stanford University, Stanford
M. Augustin et al.
Ghassemi A, Zhang Q (2004) Poro-thermoelastic mechanisms in wellbore stability and reservoir stimulation. In: Proceedings, 29th workshop on geothermal reservoir engineering, Stanford University, Stanford Ghassemi A, Tarasovs S, Cheng AHD (2003) An integral equation solution for three-dimensional heat extraction from planar fracture in hot dry rock. Int J Numer Anal Methods Geomech 27:989–1004 Golberg MA, Chen CS (1998) The method of fundamental solutions for potential, Helmholtz and diffusion problems. In: Golberg MA (ed) Boundary integral methods – numerical and mathematical aspects. Computational mechanics publications. WIT, Southhampton, pp 103– 176 Gorenflo R, Mainardi F (1997) Fractional calculus: integral and differential equations of fractional order. In: Carpinteri A, Mainardi F (eds) Fractals and fractional calculus in continuum mechanics. Springer, Wien, pp 223–276 Hammons TJ (2011) Geothermal power generation: global perspectives, technology, direct uses, plants, drilling and sustainability worldwide. In: Electricity infrastructures in the global marketplace. InTech, pp 195–234 Haney MM, Bartel LC, Aldridge DF, Symons NP (2005) Insight into the output of reverse-time migration: what do the amplitudes mean? In: Proceedings, SEG annual meeting, Houston Helmig R, Niessner J, Flemisch B, Wolff M, Fritz J (2014) Efficient modeling of flow and transport in porous media using multi-physics and multi-scale approaches. In: Freeden W, Nashed Z, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, New York Heuer N, Küpper T, Windelberg D (1991) Mathematical model of a hot dry rock system. Geophys J Int 105:659–664 Hicks TW, Pine RJ, Willis-Richards J, Xu S, Jupe AJ, Rodrigues NEV (1996) A hydro-thermomechanical numerical model for HDR geothermal reservoir evaluation. Int J Rock Mech Min Sci 33:499–511 Hillis RR (2000) Pore pressure/stress coupling and its implications for seismicity. 
Explor Geophys 31:448–454 Hillis RR (2001) Coupled changes in pore pressure and stress in oil fields and sedimentary basins. Pet Geosci 7:419–425 Hillis RR (2003) Pore pressure/stress coupling and its implications for rock failure. In: Vanrensbergen P, Hillis RR, Maltman AJ, Morley CK (eds) Subsurface sediment mobilization. Geological Society of London, London, pp 359–368 Ilyasov M (2011) A tree algorithm for Helmholtz potential wavelets on non-smooth surfaces: theoretical background and application to seismic data postprocessing. PhD thesis, Geomathematics Group, University of Kaiserslautern International Energy Agency (2010) Annual report. Technical report Itasca Consulting Group Inc (2000) UDEC user’s guide. Minnesota Jackson JD (1998) Classical electrodynamics. Wiley, New York Jacobs F, Meyer H (1992) Geophysik – Signale aus der Erde. Teubner, Stuttgart Jaeger JC, Cook NGW, Zimmerman RW (2007) Fundamentals of rock mechanics. Blackwell, Malden Jia X, Hu T (2006) Element-free precise integration method and its application in seismic modelling and imaging. Geophys J Int 166:349–372 Jing L, Hudson JA (2002) Numerical methods in rock mechanics. Int J Rock Mech Min Sci 39:409– 427 Jing Z, Willis-Richards J, Watanabe K, Hashida T (2000) A three-dimensional stochastic rock mechanics model of engineered geothermal systems in fractured crystalline rock. J Geophys Res 105:23663–23679 Jing Z, Watanabe K, Willis-Richards J, Hashida T (2002) A 3-D water/rock chemical interaction model for prediction of HDR/HWR geothermal reservoir performance. Geothermics 31:1–28 Johansson BT, Lesnic D (2008) A method of fundamental solutions for transient heat conduction. Eng Anal Bound Elem 32:697–703
Johansson BT, Lesnic D, Reeve T (2011) A method of fundamental solutions for two-dimensional heat conduction. Int J Comput Math 88:1697–1713 John V, Schmeyer E (2008) Finite element methods for time-dependent convection-diffusionreaction equations with small diffusion. Comput Methods Appl Mech Eng 198:475–494 John V, Kaya S, Layton W (2006) A two-level variational multiscale method for convectiondominated convection-diffusion equations. Comput Methods Appl Mech Eng 195:4594–4603 Jung R (2007) Stand und Aussichten der Tiefengeothermie in Deutschland. Erdöl, Erdgas, Kohle 123:1–7 Katsurada M (1989) A mathematical study of the charge simulation method II. J Fac Sci Univ Tokyo Sect IA Math 36:135–162 Katsurada M, Okamoto H (1996) The collocation points of the fundamental solution method for the potential problem. Comput Math Appl 31:123–137 Kazemi H (1969) Pressure transient analysis of naturally fractured reservoirs with uniform fracture distribution. Soc Petrol Eng J 246:451–461 Kazemi H, Merril LS, Porterfield KL, Zeman PR (1976) Numerical simulation of water-oil flow in naturally fractured reservoirs. In: Proceedings, SPE-AIME 4th symposium on numerical simulation of reservoir performance, Los Angeles Kim I, Lindquist WB, Durham WB (2003) Fracture flow simulation using a finite-difference lattice Boltzmann method. Phys Rev E 67:046708 Kimura S, Masuda Y, Hayashi K (1992) Efficient numerical method based on double porosity model to analyze heat and fluid flows in fractured rock formations. JSME Int J Ser 2 35:395– 399 Kühn M (2009) Modelling feed-back of chemical reactions on flow fields in hydrothermal systems. Surv Geophys 30:233–251 Kühn M, Stöfen H (2005) A reactive flow model of the geothermal reservoir Waiwera, New Zealand. Hydrogeol J 13:606–626 Kupradze VD (1964) A method for the approximate solution of limiting problems in mathematical physics. USSR Comput Math Math Phys 4:199–205 Lai M, Krempl E, Ruben D (2010) Introduction to continuum mechanics. 
Butterworth-Heinemann, Burlington Landau LD, Pitaevskii LP, Lifshitz EM, Kosevich AM (1986) Theory of elasticity. Theoretical physics, vol 7, 3rd edn. Butterworth-Heinemann, Oxford Lang U (1995) Simulation regionaler Strömungs- und Transportvorgänge in Karstaquifern mit Hilfe des Doppelkontinuum-Ansatzes: Methodenentwicklung und Parameteridentifikation. PhD thesis, University of Stuttgart Lang U, Helmig R (1995) Numerical modeling in fractured media – identification of measured field data. In: Herbert M, Kovar K (eds) Groundwater quality: remediation and protection. IAHS and University Karlova, Prague, pp 203–212 Lee J, Choi SU, Cho W (1999) A comparative study of dual-porosity model and discrete fracture network model. KSCE J Civ Eng 3:171–180 Li X (2008a) Convergence of the method of fundamental solutions for Poisson’s equation on the unit sphere. Adv Comput Math 28:269–282 Li X (2008b) Rate of convergence of the method of fundamental solutions and hyperinterpolation for modified Helmholtz equations on the unit ball. Adv Comput Math 29:393–413 Lomize GM (1951) Seepage in fissured rocks. State Press, Moscow Long J, Remer J, Wilson C, Witherspoon P (1982) Porous media equivalents for networks of discontinuous fractures. Water Resour Res 18:645–658 Luchko Y (2009) Maximum principle for the generalized time-fractional diffusion equation. J Math Anal Appl 351:218–223 Luchko Y (2010) Some uniqueness and existence results for the initial-boundary-value problems for the generalized time-fractional diffusion equation. Comput Math Appl 59:1766–1772 Luchko Y (2015) Fractional diffusion and wave propagation. In: Freeden W, Nashed M, Sonar T (Eds) Handbook of Geomathematics, 2nd Edn. Springer
Luchko Y, Punzi A (2011) Modeling anomalous heat transport in geothermal reservoirs via fractional diffusion equations. Int J Geomath 1:257–276 Martin GS, Marfurt KJ, Larsen S (2002) Marmousi-2: an updated model for the investigation of AVO in structurally complex areas. In: Proceedings, SEG annual meeting, Salt Lake City Maryška J, Severýn O, Vohralík M (2004) Numerical simulation of fracture flow in mixed-hybrid FEM stochastic discrete fracture network model. Comput Geosci 8:217–234 Masahi M, King P, Nurafza P (2007) Fast estimation of connectivity in fractured reservoirs using percolation theory. SPE J 12:167–178 Mayer C (2007) A wavelet approach to the Stokes problem. Habilitation thesis, Geomathematics Group, University of Kaiserslautern Mayer C, Freeden W (2015) Stokes problem, layer potentials and regularizations, multiscale applications. In: Freeden W, Nashed Z, Sonar T (Eds) Handbook of Geomathematics, 2nd Edn. Springer McLean W, Mustapha K (2009) Convergence analysis of a discontinuous Galerkin method for a sub-diffusion equation. Numer Algorithms 52:69–88 Menke W (1984) Geophysical data analysis: discrete inverse theory. Academic, Orlando Michel V (2002) A multiscale approximation for operator equations in separable Hilbert spaces – case study: reconstruction and description of the Earth’s interior. Habilitation thesis, Geomathematics Group, University of Kaiserslautern Michel V, Fokas AS (2008) A unified approach to various techniques for the non-uniqueness of the inverse gravimetric problem and wavelet based methods. Inverse Probl 24:045019 Min KB, Jing L, Stephansson O (2004) Determining the equivalent permeability tensor for fractured rock masses using a stochastic REV approach: method and application to the field data from Sellafield, UK. Hydrogeol J 12:497–510 MIT (Massachusetts Institute of Technology) (2006) The future of geothermal energy. http://mitei. 
mit.edu/publications/reports-studies/future-geothermal-energy Mo H, Bai M, Lin D, Roegiers JC (1998) Study of flow and transport in fracture network using percolation theory. Appl Math Model 22:277–291 Moeck I, Kwiatek G, Zimmermann G (2009) The in-situ stress field as a key issue for geothermal field development – a case study from the NE German Basin. In: Proceedings, 71st EAGE conference & exhibition, Amsterdam Möhringer S (2014) Decorrelation of gravimetric data. PhD thesis, Geomathematics Group, University of Kaiserslautern Mongillo M (2011) International efforts to promote global sustainable geothermal development. In: GIA annual report executive summary, Singapore, pp 1–19 Morgan WJ (1971) Convective plumes in the lower mantle. Nature 230:42–43 Müller C (1998) Analysis of spherical symmetries in euclidean spaces. Applied mathematical sciences, vol 129. Springer, Berlin Müller C, Kersten H (1980) Zwei Klassen vollständiger Funktionensysteme zur Behandlung der Randwertaufgaben der Schwingungsgleichung 4U C k 2 U D 0. Math Method Appl Sci 2:48– 67 Nakao S, Ishido T (1998) Pressure-transient behavior during cold water injection into geothermal wells. Geothermics 27:401–413 Neuman S (2005) Trends, prospects and challenges in quantifying flow and transport through fractured rocks. Hydrogeol J 13:124–147 Neuman S, Depner J (1988) Use of variable-scale pressure test data to estimate the log hydraulic conductivity covariance and dispersivity of fractured granites near Oracle, Arizona. J Hydrol 102:475–501 Nolet G (2008) Seismic tomography: imaging the interior of the Earth and Sun. Cambridge University Press, Cambridge Oden M, Niemi A (2006) From well-test data to input to stochastic continuum models: effect of the variable support scale of the hydraulic data. Hydrogeol J 14:1409–1422 Ödner H (1998) One-dimensional transient flow in a finite fractured aquifer system. Hydrol Sci J 43:243–265
Ostermann I (2011a) Modeling heat transport in deep geothermal systems by radial basis functions. PhD thesis, Geomathematics Group, University of Kaiserslautern Ostermann I (2011b) Three-dimensional modeling of heat transport in deep hydrothermal reservoirs. Int J Geomath 2:37–68 O’Sullivan MJ, Pruess K, Lippmann MJ (2001) State of the art of geothermal reservoir simulation. Geothermics 30:395–429 Ouenes A (2000) Practical application of fuzzy logic and neural networks to fractured reservoir characterization. Comput Geosci 26:953–962 Peters RR, Klavetter EA (1988) A continuum model for water movement in an unsaturated fractured rock mass. Water Resour Res 24:416–430 Phillips PJ (2005) Finite element method in linear poroelasticity: theoretical and computational results. PhD thesis, University of Texas, Austin Phillips PJ, Wheeler MF (2007) A coupling of mixed and continuous Galerkin finite element methods for poroelasticity I: the continuous in time case. Comput Geosci 11:131–144 Phillips WS, Rutledge JT, House LS, Fehler MC (2002) Induced microearthquake patterns in hydrocarbon and geothermal reservoirs: six case studies. Pure Appl Geophys 159:345–369 Podvin P, Lecomte I (1991) Finite difference computation of traveltimes in very contrasted velocity models: a massively parallel approach and its associated tools. Geophys J Int 105:271–284 Popov M (1982) A new method of computation of wave fields using Gaussian beams. Wave Motion 4:85–97 Pruess K (1990) Modelling of geothermal reservoirs: fundamental processes, computer simulation and field applications. Geothermics 19:3–15 Pruess K, Narasimhan TN (1985) A practical method for modeling fluid and heat flow in fractured porous media. Soc Pet Eng J 25:14–26 Pruess K, Wang JSY, Tsang YW (1986) Effective continuum approximation for modeling fluid and heat flow in fractured porous tuff. 
Technical report, Sandia National Laboratories Report SAND86-7000, Albuquerque Reichenberger V, Jakobs H, Bastian P, Helmig R (2006) A mixed-dimensional finite volume method for two-phase flow in fractured porous media. Adv Water Resour 29:1020–1036 Renaut R, Fröhlich J (1996) A pseudospectral Chebychev method for 2D wave equation with domain stretching and absorbing boundary conditions. J Comput Phys 124:324–336 Renner J, Steeb H (2015) Modeling of fluid transport in geothermal research. In: Freeden W, Nashed Z, Sonar T (Eds) Handbook of Geomathematics, 2nd Edn. Springer Rice JR, Cleary MP (1976) Some basic stress diffusion solutions for fluid-saturated elastic porous media with compressible constituents. Rev Geophys Space Phys 14:227–241 Ritter JRR, Christensen UR (2007) Mantle plumes: a multidisciplinary approach. Springer, Berlin Runge C (1885) Zur Theorie der eindeutigen analytischen Funktionen. Acta Math 6:229–234 Rutqvist J, Stephansson O (2003) The role of hydromechanical coupling in fractured rock engineering. Hydrogeol J 11:7–40 Saemundsson K (2009) Geothermal systems in global perspective. In: Short course IV on exploration for geothermal resources, Lake Naivasha Sahimi M (1995) Flow and transport in porous media and fractured rock: from classical methods to modern approaches. VCH, Weinheim Sanyal SK (2005) Classification of geothermal systems – a possible scheme. In: Proceedings, 30th workshop on geothermal reservoir engineering, Stanford University, Stanford, SGP-TR-176, pp 85–92 Sanyal SK, Butler SJ, Swenson D, Hardeman B (2000) Review of the state-of-the-art of numerical simulation of enhanced geothermal systems. In: Proceedings, world geothermal congress, Kyushu-Tohoku Schanz M (2001) Application of 3D time domain boundary element formulation to wave propagation in poroelastic solids. Eng Anal Bound Elem 25:363–376 Schubert G, Turcotte DL, Olson P (2001) Mantle convection in the Earth and Planets. Cambridge University Press, Cambridge
Schulz R (2009) Aufbau eines geothermischen Informationssystems für Deutschland. Technical report, Leibniz-Institut für Angewandte Geophysik, Hannover Semtchenok NM, Popov MM, Verdel AR (2009) Gaussian beam tomography. In: Extended abstracts, 71st EAGE conference & exhibition, Amsterdam Showalter RE (2000) Diffusion in poro-elastic media. J Math Anal Appl 251:310–340 Smyrlis Y-S (2009a) Applicability and applications of the method of fundamental solutions. Math Comput 78:1399–1434 Smyrlis Y-S (2009b) Mathematical foundation of the MFS for certain elliptic systems in linear elasticity. Numer Math 112:319–340 Smyrlis Y-S, Karageorghis A (2009) Efficient implementation of the MFS: the three scenarios. J Comput Appl Math 227:83–92 Snieder R (2002) The perturbation method in elastic wave scattering and inverse scattering in pure and applied science. In: General theory of elastic wave. Academic, San Diego, pp 528–542 Snow DT (1965) A parallel plate model of fractured permeable media. PhD thesis, University of California, Berkeley Stothoff S, Or D (2000) A discrete-fracture boundary integral approach to simulating coupled energy and moisture transport in a fractured porous medium. In: Faybishenko B, Witherspoon PA, Benson SM (eds) Dynamics of fluids in fractured rocks, concepts and recent advances. AGU geophysical monograph, vol 122. American Geophysical Union, Washington, DC, pp 269–279 Sudicky EA, McLaren RG (1992) The Laplace transform Galerkin technique for large-scale simulation of mass transport in discretely fractured porous formations. Water Resour Res 28:499–514 Symes WW (2003) Kinematics of reverse time S-G migration. Technical report, Rice University Symes WW (2007) Reverse time migration with optimal checkpointing. Geophysics 72:SM213– SM221 Takenaka H, Wang Y, Furumura T (1999) An efficient approach of the pseudospectral method for modelling of geometrical symmetric seismic wavefields. 
Earth Planets Space 51:73–79 Tang DH, Frind EO, Sudicky EA (1981) Contaminant transport in fractured porous media: analytical solution for a single fracture. Water Resour Res 17:555–564 Tran NH, Rahman SS (2006) Modelling discrete fracture networks using neuro-fractal-stochastic simulation. J Eng Appl Sci 1:154–160 Travis BJ (1984) TRACR3D: a model of flow and transport in porous/fractured media. Technical report, Los Alamos National Laboratory LA-9667-MS, Los Alamos Trefftz E (1926) Ein Gegenstück zum Ritzschen Verfahren. In: Proceedings of the 2nd international congress for applied mechanics, Zürich Tsang Y, Tsang C (1987) Chanel flow model through fractured media. Water Resour Res 23:467– 479 Tsang Y, Tsang C (1989) Flow chaneling in a single fracture as a two-dimensional strongly heterogeneous permeable medium. Water Resour Res 25:2076–2080 Tsang Y, Tsang C, Hale F, Dverstorp B (1996) Tracer transport in a stochastic continuum model of fractured media. Water Resour Res 32:3077–3092 Turcotte DL, Schubert G (2001) Geodynamics. Cambridge University Press, Cambridge Vidale J (1988) Finite-difference calculation of travel times. Bull Seismol Soc Am 78:2062–2076 Walsh J (1929) The approximation of harmonic functions by harmonic polynomials and by harmonic rational functions. Bull Am Math Soc 35:499–544 Warren JE, Root PJ (1963) The behaviour of naturally fractured reservoirs. Soc Pet Eng J 228:245– 255 Watanabe K, Takahashi T (1995) Fractal geometry characterization of geothermal reservoir fracture networks. J Geophys Res 100:521–528 Welding L (2007) GLITNIR geothermal research. In: United States geothermal energy market report, pp 1–37 Wendland H (2005) Scattered data approximation. Cambridge monographs on applied and computational mathematics, vol 17. Cambridge University Press, Cambridge Wilson JT (1963) A possible origin of the Hawaiian island. Can J Phys 41:863–868
Wolf K (2009) Multiscale modeling of classical boundary value problems in physical geodesy by locally supported wavelets. PhD thesis, Geomathematics Group, University of Kaiserslautern Wu YS (2000) On the effective continuum method for modeling multiphase flow, multicomponent transport and heat transfer in fractured rock. In: Faybishenko B, Witherspoon PA, Benson SM (eds) Dynamics of fluids in fractured rocks, concepts and recent advances. American Geophysical Union, Washington, DC, pp 299–312 Wu YS, Pruess K (2005) A physically based numerical approach for modeling fracture-matrix interaction in fractured reservoirs. In: Proceedings, world geothermal congress 2005, Antalya Wu YS, Qin G (2009) A generalized numerical approach for modeling multiphase flow and transport in fractured porous media. Commun Comput Phys 6:85–108 Wu X, Pope GA, Shook GM, Srinivasan S (2005) A semi-analytical model to calculate energy production in single fracture geothermal reservoirs. Geotherm Resour Counc Trans 29:665–669 Wu RS, Xie XB, Wu XY (2006) One-way and one-return approximations (de Wolf approximation) for fast elastic wave modeling in complex media. Adv Geophys 48:265–322 Xie XB, Wu RS (2006) A depth migration method based on the full-wave reverse time calculation and local one-way propagation. In: Proceedings, SEG annual meeting, New Orleans Yilmaz O (1987) Seismic data analysis: processing, inversion, and interpretation of seismic data. Society of Exploration Geophysicists, Tulsa Yin S (2008) Geomechanics-reservoir modeling by displacement discontinuity-finite element method. PhD thesis, University of Waterloo, Ontario Zhao C, Hobbs BE, Baxter K, Mühlhaus HB, Ord A (1999) A numerical study of pore-fluid, thermal and mass flow in fluid-saturated porous rock basins. Eng Comput 16:202–214 Zhou XX, Ghassemi A (2009) Three-dimensional poroelastic simulation of hydraulic and natural fractures using the displacement discontinuity method. 
In: Proceedings of the 34th workshop on geothermal reservoir engineering, Stanford Zubkov VV, Koshelev VF, Lin’kov AM (2007) Numerical modeling of hydraulic fracture initiation and development. J Min Sci 43:40–56 Zyvoloski G (1983) Finite element methods for geothermal reservoir simulation. Int J Numer Anal Methods Geomech 7:75–86
Part IV Analytic, Algebraic, and Operator Theoretical Methods
Noise Models for Ill-Posed Problems

Paul N. Eggermont, Vincent LaRiccia, and M. Zuhair Nashed
Contents
1 Introduction
2 Noise in Well-Posed and Ill-Posed Problems
2.1 Well-Posed and Ill-Posed Problems
2.2 Fredholm Integral Equations with Strong Noise
2.3 Super-Strong and Weak Noise Models
3 Weakly Bounded Noise
4 Tikhonov Regularization
4.1 Strongly Bounded Noise
4.2 Weakly Bounded Noise
5 Regularization Parameter Selection
5.1 Discrepancy Principles
5.2 Lepskiĭ’s Principle
6 An Example: The Second Derivative of a Univariate Function
7 Conclusions and Future Directions
8 Synopsis
References
Abstract
The standard view of noise in ill-posed problems is that it is either deterministic and small (strongly bounded noise) or random and large (not necessarily small). Following Eggermont, LaRiccia and Nashed (2009), a new noise model is investigated, wherein the noise is weakly bounded. Roughly speaking, this means that local averages of the noise are small. A precise definition is given in a Hilbert space setting, and Tikhonov regularization of ill-posed problems with weakly bounded noise is analysed. The analysis unifies the treatment of “classical” ill-posed problems with strongly bounded noise with that of ill-posed problems with weakly bounded noise. Regularization parameter selection is discussed, and an example on numerical differentiation is presented.

P.N. Eggermont • V. LaRiccia, Food and Resource Economics, University of Delaware, Newark, DE, USA, e-mail: [email protected]
M.Z. Nashed, Department of Mathematics, University of Central Florida, Orlando, FL, USA, e-mail: [email protected]; [email protected]
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_24
1 Introduction
The key feature of ill-posed problems is the lack of robustness of solutions with respect to noise in the data. Whereas for well-posed problems it may be acceptable to ignore the effects of the noise in the data, to do so for ill-posed problems would be disastrous. Consequently, one must devise approximate solution schemes to implement the trade-off between fidelity to the data and noise amplification. How this is done depends on how one models the noise. There are two predominant views in the literature.

In the classical treatment of ill-posed problems, dating back to Tikhonov (1943, 1963) and Phillips (1962), one assumes that the noise is small, say with a signal-to-noise ratio of 95% or more. In this view, the noise is deterministic, not random. One then devises approximate solution procedures and studies their behavior when the noise tends to 0, that is, when the signal-to-noise ratio tends to 1. For summaries of these theories, see Tikhonov and Arsenin (1977), Groetsch (1984), Morozov (1984), Tikhonov et al. (1995), Engl et al. (1996), and Kirsch (1996). Of course, there are many settings where the noise is not small in the usual sense. The premier examples are those involving high frequency noise, which abound in the geosciences (Tronicke 2007; Duan et al. 2008, and references therein).

On the other hand, it was realized early on in the development of ill-posed problems (Sudakov and Khalfin 1964; Franklin 1970) that noise is often random, and that probabilistic methods must be used for its analysis. Typically, one shows by way of exponential inequalities that the probabilities that the error in the approximate solution exceeds certain small levels are themselves small. See, e.g., Mair and Ruymgaart (1996), Cavalier et al. (2002), Cavalier and Golubev (2006), and Bissantz et al. (2007), to name a few. Another approach, based on the Bayesian point of view (e.g., Kaipio and Somersalo 2005), is not considered further here.
It is then not surprising that the classical and probabilistic approaches are at odds, not only in assumptions and conclusions, but also in technique. As a way to narrow this gap, a weak model is proposed for the noise, in which local averages of the noise are small in the classical sense. This covers small low frequency noise as well as large high frequency noise, and almost covers white noise and discrete random noise. However, the notion of local averages must be matched fairly closely to the particular ill-posed problem at hand. The approach is worked out for linear ill-posed problems and is illustrated on numerical differentiation.
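The idea that large high frequency noise can still have small local averages can be illustrated numerically. The sketch below is only an informal illustration and not the Hilbert space definition developed later in the chapter: the moving-average window, the grid size, and the two test signals are assumptions chosen for this example.

```python
import numpy as np

def local_average(values, window):
    """Moving average of `values` over `window` consecutive samples."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

def rms(values):
    """Root-mean-square size, a discrete stand-in for an L2 norm."""
    return float(np.sqrt(np.mean(values ** 2)))

n = 2000
t = np.linspace(0.0, 1.0, n, endpoint=False)

# Small low frequency noise: small in the usual (RMS) sense.
low_freq = 0.01 * np.sin(2 * np.pi * t)

# Large high frequency noise: amplitude 1, 200 oscillations on [0, 1).
high_freq = np.sin(2 * np.pi * 200 * t)

# Hypothetical averaging window (95 samples, not a multiple of the period).
window = 95

rms_low = rms(low_freq)
rms_high = rms(high_freq)
rms_high_avg = rms(local_average(high_freq, window))

print(f"RMS of low frequency noise:            {rms_low:.4f}")
print(f"RMS of high frequency noise:           {rms_high:.4f}")
print(f"RMS of its local averages (window {window}): {rms_high_avg:.4f}")
```

In this toy setting the high frequency signal is large in the root-mean-square sense, yet its local averages are more than an order of magnitude smaller, which is the behavior the weak noise model is meant to capture.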
2 Noise in Well-Posed and Ill-Posed Problems
Noise is pervasive in scientific data analysis and processing. In scientific computing, an important notion is that of the propagation of errors, whether internal (roundoff errors due to finite precision arithmetic) or external (errors, that is, noise, in the initial data, assuming infinite precision arithmetic). The scientific computation problem under consideration determines the minimal level of noise amplification; optimal algorithms are the ones that achieve the minimum amplification of errors. However, all of this depends very much on the precise properties of the noise. In fact, these properties determine whether the (inverse) problem at hand is well-posed or ill-posed. The notions of well-posed and ill-posed problems are discussed and elaborated in the context of integral equations of the first and of the second kind.
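The contrast between integral equations of the first and of the second kind can be made tangible with a small numerical experiment. The following is a minimal sketch under stated assumptions: the kernel k(s, t) = min(s, t)(1 - max(s, t)), the grid, the test solution, and the relative noise level are all hypothetical choices made for this illustration and are not taken from the chapter.

```python
import numpy as np

n = 100
h = 1.0 / (n + 1)
s = h * np.arange(1, n + 1)  # interior grid points of (0, 1)

# Discretized integral operator with the (assumed) kernel
# k(s, t) = min(s, t) * (1 - max(s, t)).
S, T = np.meshgrid(s, s, indexing="ij")
A = h * np.minimum(S, T) * (1.0 - np.maximum(S, T))
I = np.eye(n)

x_true = np.sin(np.pi * s)
rng = np.random.default_rng(0)

def relative_error_after_noise(M, rel_noise=1e-3):
    """Solve M x = y from data perturbed by noise of relative size rel_noise."""
    y_clean = M @ x_true
    eta = rng.standard_normal(n)
    eta *= rel_noise * np.linalg.norm(y_clean) / np.linalg.norm(eta)
    x_rec = np.linalg.solve(M, y_clean + eta)
    return float(np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true))

# First kind: A x = y.  Second kind: (I + A) x = y.
cond_first = float(np.linalg.cond(A))
cond_second = float(np.linalg.cond(I + A))
err_first = relative_error_after_noise(A)
err_second = relative_error_after_noise(I + A)

print(f"first kind:  cond = {cond_first:.1e}, relative error = {err_first:.2e}")
print(f"second kind: cond = {cond_second:.2f}, relative error = {err_second:.2e}")
```

With the same relative noise in the data, the naive solve of the discretized first-kind equation amplifies the noise by several orders of magnitude, whereas the second-kind equation leaves it essentially unamplified. Refining the grid makes the first-kind case worse, which is the discrete signature of ill-posedness.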
2.1 Well-Posed and Ill-Posed Problems
In this chapter, and several other chapters in this volume, the interest is in solving “problems” with noisy data, abstractly formulated as follows. Let $X$ and $Y$ be Hilbert spaces with inner products and norms denoted by $\langle\cdot,\cdot\rangle_X$, $\langle\cdot,\cdot\rangle_Y$, and $\|\cdot\|_X$, $\|\cdot\|_Y$. Let $F : D(F) \subset X \to Y$ be a mapping of the domain $D(F)$ in $X$ to $Y$, and suppose “data” $y \in Y$ is given such that
$$ y = F(x_o) + \eta \tag{1} $$
for some unknown $x_o \in D(F)$, where $\eta \in Y$ is noise. The objective is to recover (estimate) $x_o$ given the data $y$. The natural, possibly naive, recovery scheme is to find $x \in D(F)$ such that
$$ F(x) = y. \tag{2} $$
The question is then whether this is a “nice” problem. The obvious desirable requirements for a problem to be nice are that the problem should have one and only one solution. If there is no solution, then one is in trouble, but one is likewise in trouble if there are many (more than one) solutions, because how is one supposed to choose one? There is one more requirement for a problem to be “nice” and it has to do with the presence of the noise $\eta$. The requirement is that if the noise $\eta$ is “small,” then the solution of the problem should be close to the “true” $x_o$. That is, the solution should be robust with respect to small perturbations in the data. The first author to recognize the significance of all this was Hadamard (1902) in the context of initial/boundary-value problems for partial differential equations. Adapted to (2), his definition of a “nice” problem may be stated as follows.

Definition 1. The problem (2) is well-posed if the solution $x$ exists and is unique and depends continuously on $y$, that is, for all $x_o \in X$
P.N. Eggermont et al.
$$ \forall \varepsilon > 0 \;\; \exists\, \delta > 0 : \quad \|y - F(x_o)\|_Y < \delta \;\Longrightarrow\; \|x - x_o\|_X < \varepsilon. \tag{3} $$
If any of these three conditions is not satisfied, then the problem (2) is ill-posed. What is one to do if any of the three conditions for well-posedness is violated? It turns out that violations of the third condition are the most problematic.

1. If the problem (2) does not have a solution, it is accepted practice to change the notion of a solution. The standard one is to consider least-squares solutions in the sense of
$$ \text{minimize } \|F(x) - y\|_Y^2 \quad \text{subject to } x \in D(F). \tag{4} $$
Borrowing from partial differential equations, the notion of a weak solution suggests finding $x \in D(F)$ such that
$$ \langle z, F(x) - y \rangle_Y = 0 \tag{5} $$
for all $z$ in some subset of $Y$.

2. If the problem (2) has more than one solution, it may be acceptable (or necessary) to impose extra conditions on the solution. A standard approach is to find the minimum-norm least-squares solution by solving
$$ \text{minimize } \|x\|_X \quad \text{subject to } x \text{ minimizes } \|F(x) - y\|_Y. \tag{6} $$
3. If the solution exists and is unique, then the question of continuous dependence makes sense. If this requirement is violated, there are three options: change the way differences in the data are measured, change the way differences in the solutions are measured, or both. The cheating way would be to declare the data vectors $y$ and $z$ to be “close” if the solutions of $F(x) = y$ and $F(x) = z$ are close. However, the way differences in the data are measured is determined by the properties of the (idealized) measuring device as well as the properties of the noise (perturbations in the data external or internal to the measuring device). Changing this so that the problem becomes well-posed is probably not acceptable. Apparently the only way to restore the continuous dependence of the solution on the data is by restricting the set of “admissible” solutions (Tikhonov 1943, 1963). Here too there are two approaches. In the first scenario, one has some a priori information on the solution, such as it belonging to some (compact) set of smooth elements of $X$, say $\|x\|_Z \le C$ for some constant $C$, where $\|\cdot\|_Z$ is some stronger norm on a dense subset of $X$. In the random setting, this is the Bayesian approach (Kaipio and Somersalo 2005). Alternatively, one may just have to be satisfied with constructing a smooth approximation to the true solution. It should be mentioned that an alternative to smooth approximations is finite-dimensional approximation (Grenander 1981; Natterer 1977).
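For a linear map between finite-dimensional spaces, the remedies (4) and (6) can be tried out directly. The following is a minimal sketch, assuming a hypothetical rank-deficient matrix `F` and data `y` chosen only for illustration; it uses the pseudoinverse to obtain the minimum-norm least-squares solution and compares it with another least-squares solution.

```python
import numpy as np

# Hypothetical rank-one matrix: F x = y has no exact solution for this y,
# and the least-squares problem (4) has infinitely many minimizers.
F = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])
y = np.array([1.0, 3.0, 1.0])

# Minimum-norm least-squares solution in the sense of (6).
x_mn = np.linalg.pinv(F) @ y

# Adding a null-space vector of F gives another least-squares solution.
null_vec = np.array([1.0, -1.0, 0.0])
x_other = x_mn + 0.5 * null_vec

residual_mn = np.linalg.norm(F @ x_mn - y)
residual_other = np.linalg.norm(F @ x_other - y)
norm_mn = np.linalg.norm(x_mn)
norm_other = np.linalg.norm(x_other)
```

Both vectors have the same residual, but `x_mn` has the smaller norm, which is the selection rule expressed by (6).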
2.2 Fredholm Integral Equations with Strong Noise
The notions of well- and ill-posedness are explored in the concrete setting of some specific Fredholm integral equations of the first and second kind. To avoid confusing the issues, a completely understood one-dimensional example set in the Hilbert space $L^2(0,1)$ of square integrable functions on the interval $[0,1]$ is considered. The inner product and norm of $L^2(0,1)$ are denoted by $\langle\cdot,\cdot\rangle_{L^2(0,1)}$ and $\|\cdot\|_{L^2(0,1)}$. Let $k$ be defined on $[0,1] \times [0,1]$ by
$$ k(s,t) = s \wedge t - st, \quad s, t \in [0,1], \tag{7} $$
where $s \wedge t = \min(s,t)$, and define the operator $K : L^2(0,1) \to L^2(0,1)$ by
$$ [Kx](s) = \int_0^1 k(s,t)\, x(t)\, dt, \quad s \in [0,1]. \tag{8} $$
In fact, $K$ is the Green’s function operator for the two-point boundary-value problem
$$ -u'' = x \ \text{ in } (0,1), \quad u(0) = u(1) = 0. \tag{9} $$
So, the solution of (9) is given by $u = Kx$. The operator $K$ is compact and the Fredholm equation of the first kind
$$ Kx = y \tag{10} $$
is an ill-posed problem. Consider the data $y \in L^2(0,1)$ following the model
$$ y = Kx_o + \eta, \tag{11} $$
for some unknown $x_o \in L^2(0,1)$ and noise $\eta$ with
$$ \|\eta\|_{L^2(0,1)} = \delta, \tag{12} $$
with $\delta$ “small”. Taking for example $x_o = 0$ and
$$ \eta(t) = (k\pi)^{-2} \sin(k\pi t), \quad 0 \le t \le 1, \tag{13} $$
where $k \in \mathbb{N}$ is “large,” one has $\|\eta\|_{L^2(0,1)} = 1/\bigl(\sqrt{2}\,(k\pi)^2\bigr)$, and the solution of (10) and (11) is $x = (k\pi)^2 \eta$. This shows that any inequality of the form
$$ \|x - x_o\|_{L^2(0,1)} \le c\, \|\eta\|_{L^2(0,1)}, $$
with $c$ not depending on $\eta$, must fail. Consequently, $x$ does not depend continuously on the data, and the problem is ill-posed. For more on the ill-posed aspects of Fredholm integral equations of the first kind as well as equations of the second kind, see, e.g., Kress (1999).

Now consider integral equations of the second kind. Consider the data $y$ according to the model
$$ y = x_o + Kx_o + \eta, \tag{14} $$
with the operator $K$ as before, with some unknown $x_o \in L^2(0,1)$, and with $\eta$ the noise. If it is assumed that $\eta$ is small as in (12), then the problem
$$ \text{find } x \text{ such that } x + Kx = y \tag{15} $$
is well-posed: The solution $x$ exists and is unique, and it satisfies
$$ \|x - x_o\|_{L^2(0,1)} \le \|\eta\|_{L^2(0,1)}. \tag{16} $$
This may be seen as follows. First, it is readily seen that $K$ is positive-definite. Since $u = Kx$ solves (9), then by integration by parts
$$ \langle x, Kx \rangle_{L^2(0,1)} = \langle -u'', u \rangle_{L^2(0,1)} = \|u'\|^2_{L^2(0,1)} > 0, \tag{17} $$
unless $u$ is a constant function, which must then be identically zero, since it must vanish at $t = 0$ and $t = 1$. Thus, $K$ is positive-definite. Then on the one hand
$$ \langle x - x_o,\; x - x_o + K(x - x_o) \rangle_{L^2(0,1)} \ge \|x - x_o\|^2_{L^2(0,1)}, \tag{18} $$
since $K$ is positive-definite, and on the other hand
$$ \langle x - x_o,\; x - x_o + K(x - x_o) \rangle_{L^2(0,1)} = \langle x - x_o, \eta \rangle_{L^2(0,1)} \le \|x - x_o\|_{L^2(0,1)}\, \|\eta\|_{L^2(0,1)}, \tag{19} $$
so that
$$ \|x - x_o\|^2_{L^2(0,1)} \le \|x - x_o\|_{L^2(0,1)}\, \|\eta\|_{L^2(0,1)}, \tag{20} $$
and the conclusion follows. Consequently, the problem (15) is well-posed. Thus, all is well: Fredholm integral equations of the first kind are ill-posed, and Fredholm integral equations of the second kind are well-posed.
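The contrast between (10) and (15) under strong noise can be observed numerically. The sketch below is an illustration only: it assumes a midpoint-rule discretization of the kernel (7) on a uniform grid (the helper names `green_matrix` and `l2_norm` are hypothetical), takes $x_o = 0$ and the noise (13), and compares how much each equation amplifies the noise.

```python
import numpy as np

def green_matrix(n):
    """Midpoint-rule discretization of the operator K in (8) on n nodes."""
    t = (np.arange(n) + 0.5) / n
    s_grid, t_grid = np.meshgrid(t, t, indexing="ij")
    return (np.minimum(s_grid, t_grid) - s_grid * t_grid) / n, t

def l2_norm(v):
    """Discrete analogue of the L2(0,1) norm on a uniform grid."""
    return np.sqrt(np.mean(v**2))

n, k = 200, 10
K, t = green_matrix(n)
eta = np.sin(k * np.pi * t) / (k * np.pi) ** 2   # the noise (13)

# With x_o = 0 the data equal the noise, so each computed solution is the error.
x_first = np.linalg.solve(K, eta)                 # first kind, K x = y
x_second = np.linalg.solve(np.eye(n) + K, eta)    # second kind, x + K x = y

amp_first = l2_norm(x_first) / l2_norm(eta)
amp_second = l2_norm(x_second) / l2_norm(eta)
```

In this setup `amp_first` is close to $(k\pi)^2$ and grows without bound as `k` increases, whereas `amp_second` never exceeds 1, in line with (16).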
2.3 Super-Strong and Weak Noise Models
The noise model in the previous section is the classical or strong noise model for noise in $L^2(0,1)$, viz.
$$ \|\eta\|_{L^2(0,1)} = \delta, \tag{21} $$
with $\delta$ “small”. Here the discussion centers on how changing this assumption affects the well/ill-posedness of the problems under consideration. Consider again the Fredholm integral equation of the first kind, $Kx = y$, with $K$ as the Green’s function operator for the boundary-value problem (9). As said, $K : L^2(0,1) \to L^2(0,1)$ is compact, but much more is true. Let $H_0^2(0,1)$ be the Sobolev space of square integrable functions on the interval $(0,1)$ which vanish at the endpoints, and with square integrable (weak) first and second derivatives. The norm on $H_0^2(0,1)$ is
$$ |u|_{H^2(0,1)} = \|u''\|_{L^2(0,1)}. \tag{22} $$
Then, since $|Kx|_{H^2(0,1)} = \|x\|_{L^2(0,1)}$, the mapping $K : L^2(0,1) \to H_0^2(0,1)$ is a homeomorphism, and the inverse mapping $K^{-1} : H_0^2(0,1) \to L^2(0,1)$ exists and is a bounded linear transformation. Now consider the data $y$ according to the model
$$ y = Kx_o + \eta \tag{23} $$
for some unknown $x_o \in L^2(0,1)$ and noise $\eta$, but now with
$$ |\eta|_{H^2(0,1)} = \delta, \tag{24} $$
with $\delta$ “small.” This is referred to as a super-strong noise model. Then the solution of the problem
$$ \text{find } x \text{ such that } Kx = y \tag{25} $$
satisfies
$$ \|x - x_o\|_{L^2(0,1)} = \|K^{-1}\eta\|_{L^2(0,1)} = |\eta|_{H^2(0,1)}, \tag{26} $$
thus showing that (25) is well-posed.

Now consider integral equations of the second kind
$$ x + Kx = y, \tag{27} $$
with the data $y$ following the model
$$ y = x_o + Kx_o + \eta, \tag{28} $$
with the operator $K$ as before, with some unknown $x_o \in L^2(0,1)$, and with $\eta$ the noise. Note that so far, the noise is assumed to be small. As such, it does not represent noise as one usually understands it: high frequency or random (without necessarily implying a firm probabilistic foundation). In this interpretation, the size of the noise is then better measured by the size of averages of the noise over small intervals (but not too small) or by the size of smoothed versions of the noise. So, the size of the noise must then be measured by something like
$$ \delta = \|S\eta\|_{L^2(0,1)} \tag{29} $$
for a suitable compact smoothing operator $S : L^2(0,1) \to L^2(0,1)$. Such models are referred to as weak noise models. (Note that if $\eta$ converges weakly to 0, then $S\eta$ converges strongly to 0 since $S$ is compact.) Now comes the crux of the matter regarding the problem (27). Since the solution is given by $x = (I + K)^{-1} y$, then $x - x_o = (I + K)^{-1}\eta$, but any inequality of the form
$$ \|(I + K)^{-1}\eta\|_{L^2(0,1)} \le c\, \|S\eta\|_{L^2(0,1)}, \tag{30} $$
with the constant $c$ not depending on $\eta$, must necessarily fail. Consequently, there is no constant $c$ such that
$$ \|x - x_o\|_{L^2(0,1)} \le c\, \|S\eta\|_{L^2(0,1)}. \tag{31} $$
It follows that $x$ does not depend continuously on the data. In other words, in the weak noise setting the problem (27) is ill-posed. In summary, everything is cocked up: The Fredholm integral equation of the first kind turned out to be well-posed, and the Fredholm integral equation of the second kind is ill-posed. (Of course, this actually shows that the designations of Fredholm integral equations as being of the first or second kind are not the most informative with regard to their well- or ill-posedness.) In the remainder of this chapter, a precise version of the weak noise model is studied.
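The failure of (30) can be made visible with high-frequency noise. The following sketch rests on assumptions made only for illustration: the same midpoint-rule discretization of (7), the choice $S = K$ for the smoothing operator in (29), and the noise family $\eta_k(t) = \sin(k\pi t)$, which keeps a fixed $L^2$ size while its smoothed size shrinks.

```python
import numpy as np

n = 400
t = (np.arange(n) + 0.5) / n
s_grid, t_grid = np.meshgrid(t, t, indexing="ij")
K = (np.minimum(s_grid, t_grid) - s_grid * t_grid) / n   # discretized (8)

def l2_norm(v):
    """Discrete analogue of the L2(0,1) norm on a uniform grid."""
    return np.sqrt(np.mean(v**2))

freqs = [1, 4, 16, 64]
strong, weak, errors = [], [], []
for k in freqs:
    eta = np.sin(k * np.pi * t)                    # noise of fixed strong size
    strong.append(l2_norm(eta))
    weak.append(l2_norm(K @ eta))                  # weak size (29) with S = K (assumed)
    # error of the second-kind solution, x - x_o = (I + K)^{-1} eta
    errors.append(l2_norm(np.linalg.solve(np.eye(n) + K, eta)))

ratios = [e / w for e, w in zip(errors, weak)]     # would be bounded if (30) held
```

The weak noise level tends to 0 as the frequency grows, but the error of the second-kind solution does not, so the ratios increase without bound.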
3 Weakly Bounded Noise
In this section, the weak noise model, the main focus of this work, is precisely described, pointing out the strong and weak points along the way. Let $K : X \to Y$ be a linear compact operator between the Hilbert spaces $X$ and $Y$. The inner products and norms of $X$ and $Y$ are denoted by $\langle\cdot,\cdot\rangle_X$, $\langle\cdot,\cdot\rangle_Y$ and $\|\cdot\|_X$, $\|\cdot\|_Y$. Consider the data $y \in Y$ according to
$$ y = Kx_o + \eta, \tag{32} $$
where $\eta \in Y$ is the unknown noise and $x_o \in X$ is an unknown element one wishes to recover from the data $y$. The following model is imposed on the noise. Let $T : Y \to Y$ be linear, compact, Hermitian, and positive-definite (i.e., $\langle y, Ty \rangle_Y > 0$ for all $y \in Y$, $y \ne 0$), and let
$$ \delta^2 \overset{\text{def}}{=} \langle \eta, T\eta \rangle_Y. \tag{33} $$
It is assumed that $\delta$ is “small” and the investigation concerns what happens when $\delta \to 0$. In the above, the operator $T$ is not arbitrary: It must be connected with $K$ in the sense that for some $m \ge 1$ (not necessarily integer) the range of $K$ is continuously embedded into the range of $T^m$. That is,
$$ T^{-m} K : X \to Y \ \text{ is continuous.} \tag{34} $$
If $\eta$ satisfies (33) and (34), it is referred to as weakly bounded noise. Some comments are in order. In a deterministic setting, a reasonable model for the noise is that it is “high-frequency,” and one would like to investigate what happens when the frequency tends to $\infty$, but without the noise tending to 0 strongly, that is, without assuming that $\|\eta\|_Y \to 0$. Thus, $\eta \to 0$ weakly begins capturing the essence of “noise.” Then, for any linear compact operator $S : Y \to Y$, one would have $\|S\eta\|_Y \to 0$. So, in this sense, there is nothing unusual about (33). Moreover, one would like (33) to capture the whole truth, that is, that the statements
$$ \langle \eta, T^p \eta \rangle_Y = o(\delta^2) \quad \text{and} \quad \langle \eta, T^q \eta \rangle_Y = O(\delta^2) \tag{35} $$
fail for $p > 1$ and $q < 1$ as $\delta \to 0$. This may be a tall order, although examples of operators $T$ and noises $\eta$ satisfying (33)–(35) are easily constructed (Eggermont et al. 2009b). At the same time $T$ is supposed to capture the smoothing effect of $K$ in the sense of (34). Ideally, one would like $T^{-m}K$ to be continuous with a continuous inverse. The natural choice $T = (KK^*)^{1/2m}$ would achieve this, but would have to be reconciled with (33) and possibly (35). The condition (34) is not unreasonable. There are many cases where the operator $K$ is smoothing of order $m$ and then $T^{-1}$ could be a first-order differentiation operator.

This section is concluded by showing how the weak noise model leads to simple bounds on expressions like $\langle \eta, y \rangle_Y$ for $y \in T^m(Y)$, the range of $T^m$. For $\beta > 0$, introduce the inner product on $T^m(Y)$,
$$ \langle y, z \rangle_{m,\beta} = \langle y, z \rangle_Y + \beta^{2m} \langle T^{-m} y, T^{-m} z \rangle_Y, \tag{36} $$
and denote the associated norm by $\|\cdot\|_{m,\beta}$. The following lemma is of interest in itself, but later on plays a crucial role in the analysis of Tikhonov regularization with weakly bounded noise.
Lemma 1 (Weakly Bounded Noise). Let $m \ge 1$. Under the assumptions (33) and (34) on the weakly bounded noise, for all $y \in T^m(Y)$ and all $\beta > 0$
$$ |\langle \eta, y \rangle_Y| \le \beta^{-1/2}\, \delta\, \|y\|_{m,\beta}. \tag{37} $$
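Note that the factor $\beta^{-1/2}$ stays the same, regardless of $m$.

Proof of Lemma 1. Let $\beta > 0$. Consider the smoothing operator $S_\beta = (T^2 + \beta^2 I)^{-1} T^2$ and let $T_\beta$ denote its Hermitian square root,
$$ T_\beta \overset{\text{def}}{=} (T^2 + \beta^2 I)^{-1/2}\, T. \tag{38} $$
Note that the inverse of $S_\beta$ is well-defined on $T^2(Y)$, and that there one has $S_\beta^{-1} = I + \beta^2 T^{-2}$, and likewise that $T_\beta^{-1}$ is well defined on $T(Y)$, with
$$ T_\beta^{-1} = (I + \beta^2 T^{-2})^{1/2}. $$
Now, for all $y \in T(Y)$ one has that $y = T_\beta^{-1} T_\beta\, y$ and so
$$ |\langle \eta, y \rangle_Y|^2 = |\langle T_\beta \eta, T_\beta^{-1} y \rangle_Y|^2 \le \|T_\beta \eta\|_Y^2\, \|T_\beta^{-1} y\|_Y^2 \tag{39} $$
by the Cauchy-Schwarz inequality. The last factor on the right equals
$$ \langle y, y \rangle_Y + \beta^2 \langle T^{-1} y, T^{-1} y \rangle_Y = \|y\|_{1,\beta}^2, $$
and it is an easy exercise using the spectral decomposition of $T$ to show that for $1 \le m < n$ (not necessarily integers) and all $\beta > 0$,
$$ \|y\|_{m,\beta}^2 \le 2\, \|y\|_{n,\beta}^2. \tag{40} $$
See Eggermont et al. (2009b). The first factor on the right in (39) equals $\langle \eta, (I + \beta^2 T^{-2})^{-1} \eta \rangle_Y$ and must be bounded further in terms of $\langle \eta, T\eta \rangle_Y$. One has
$$ \langle \eta, (I + \beta^2 T^{-2})^{-1} \eta \rangle_Y = \beta^{-1} \langle \eta, (\beta^{-1} T + \beta T^{-1})^{-1} T \eta \rangle_Y \le r\, \beta^{-1} \langle \eta, T\eta \rangle_Y, $$
where $r$ is the spectral radius of $(\beta^{-1} T + \beta T^{-1})^{-1}$. Since
$$ r \le \sup_{t > 0}\, (\beta^{-1} t + \beta t^{-1})^{-1} = \sup_{\tau > 0}\, (\tau + \tau^{-1})^{-1} = \tfrac{1}{2}, $$
then
$$ \langle \eta, (I + \beta^2 T^{-2})^{-1} \eta \rangle_Y \le (2\beta)^{-1} \langle \eta, T\eta \rangle_Y. \tag{41} $$
To summarize, (39)–(41), with (40) applied to the exponents 1 and $m$, show that
$$ |\langle \eta, y \rangle_Y|^2 \le \beta^{-1} \langle \eta, T\eta \rangle_Y\, \|y\|_{m,\beta}^2. \tag{42} $$
This estimate together with the assumption (33) imply the lemma.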
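The bound (37) can be checked numerically in a finite-dimensional setting. The sketch below assumes a hypothetical diagonal operator $T$ with eigenvalues $1/j$ (so that every quantity is computed in the eigenbasis of $T$), random noise and test vectors, and evaluates both sides of (37) for several values of $m$ and $\beta$. It is a sanity check of the inequality, not part of the proof.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50
t = 1.0 / np.arange(1, N + 1)            # eigenvalues of a hypothetical diagonal T
eta = rng.standard_normal(N)             # noise coefficients in the eigenbasis of T
y = rng.standard_normal(N) * t**3        # test vector with decaying coefficients

delta = np.sqrt(np.sum(t * eta**2))      # weak noise level, as in (33)

def norm_m_beta(y, t, m, beta):
    """Norm induced by the inner product (36) when T is diagonal."""
    return np.sqrt(np.sum(y**2 * (1.0 + beta ** (2 * m) * t ** (-2.0 * m))))

checks = []
for m in (1.0, 1.5, 3.0):
    for beta in (1e-3, 1e-1, 1.0, 10.0):
        lhs = abs(np.dot(eta, y))                                  # |<eta, y>|
        rhs = beta ** (-0.5) * delta * norm_m_beta(y, t, m, beta)  # right side of (37)
        checks.append(lhs <= rhs)
```

Every entry of `checks` is `True` for this data, as the lemma predicts for all $m \ge 1$ and $\beta > 0$.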
4 Tikhonov Regularization

4.1 Strongly Bounded Noise
Tikhonov regularization is now discussed as the scheme to recover $x_o$ from the data $y$ in the strong noise model
$$ y = Kx_o + \eta \quad \text{with} \quad \|\eta\|_Y \le \delta. \tag{43} $$
The interest is in what happens when $\delta \to 0$. In the Tikhonov regularization scheme, the unknown $x_o$ is estimated by the solution $x = x^{\alpha,\delta}$ of
$$ \text{minimize } L(x \mid \alpha, y) \quad \text{over } x \in X, \tag{44} $$
for some regularization parameter $\alpha$, $\alpha > 0$, yet to be specified. Here,
$$ L(x \mid \alpha, y) \overset{\text{def}}{=} \|Kx - y\|_Y^2 + \alpha \|x\|_X^2. \tag{45} $$
This dates back to Phillips (1962) and Tikhonov (1963). Note that since $L(x \mid \alpha, y)$ is strongly convex in $x$, its minimizer exists and is unique. It is well-known (Groetsch 1984) that to get convergence rates on the error $\|x^{\alpha,\delta} - x_o\|_X$, one must assume a source condition. For simplicity, it is assumed here that there exists a $z_o \in X$ such that
$$ x_o = (K^*K)^{\nu/2} z_o \quad \text{for some } 0 < \nu \le 2. \tag{46} $$
Precise necessary and sufficient conditions are given in Neubauer (1997). In the study of convergence rates under the source condition (46), it is assumed here that $\nu$ is known and that $\alpha$ is chosen accordingly. As said, one wants to obtain bounds on the error $\|x^{\alpha,\delta} - x_o\|_X$. As usual, this is broken up into two parts
$$ \|x^{\alpha,\delta} - x_o\|_X \le \|x^{\alpha,\delta} - x^{\alpha,o}\|_X + \|x^{\alpha,o} - x_o\|_X, \tag{47} $$
where $x^{\alpha,o}$ is the “noiseless” estimator, i.e., the minimizer of $L(x \mid \alpha, Kx_o)$. Thus, $x^{\alpha,\delta} - x^{\alpha,o}$ is the noise part of the error and $x^{\alpha,o} - x_o$ is the error introduced by the regularization. One has the following classical results. See, e.g., Groetsch (1984) and Engl et al. (1996).

Theorem 1. There exists a constant $c$ such that for all $\alpha$, $0 < \alpha \le 1$,
$$ \|x^{\alpha,\delta} - x^{\alpha,o}\|_X \le c\, \alpha^{-1/2}\, \|\eta\|_Y. \tag{48} $$

Theorem 2. Under the source condition (46), there exists a constant $c$ such that for all $\alpha$, $0 < \alpha \le 1$,
$$ \|x^{\alpha,o} - x_o\|_X \le c\, \alpha^{\nu/2}. \tag{49} $$

The two theorems above then give the following convergence rates.

Theorem 3. Assuming the source condition (46) and the condition (43) on the noise, for $\alpha \to 0$,
$$ \|x^{\alpha,\delta} - x_o\|_X = O\bigl(\alpha^{-1/2}\delta + \alpha^{\nu/2}\bigr). \tag{50} $$
Moreover, if $\alpha \asymp \delta^{2/(\nu+1)}$ then $\|x^{\alpha,\delta} - x_o\|_X = O\bigl(\delta^{\nu/(\nu+1)}\bigr)$.

In the remainder of this subsection, Theorems 1 and 2 are proved. The first observation is that the normal equations for the problem (44) are $K^*(Kx - y) + \alpha x = 0$, so that the solution of (44) is
$$ x^{\alpha,\delta} = (K^*K + \alpha I)^{-1} K^* y. \tag{51} $$
Proof of Theorem 1. From (51), it follows that $x^{\alpha,\delta} - x^{\alpha,o} = (K^*K + \alpha I)^{-1} K^* \eta$, so that
$$ \|x^{\alpha,\delta} - x^{\alpha,o}\|_X^2 \le r\, \|\eta\|_Y^2, $$
where $r$ is the spectral radius of $(K^*K + \alpha I)^{-1} K^*K (K^*K + \alpha I)^{-1}$. One then gets that
$$ r \le \sup_{t > 0} \frac{t}{(t + \alpha)^2} = \alpha^{-1} \sup_{\tau > 0} \frac{\tau}{(\tau + 1)^2}, $$
where the substitution $t = \alpha\tau$ is applied. Since the supremum is finite, the theorem follows.

Proof of Theorem 2. It follows from (51) that
$$ x^{\alpha,o} - x_o = (K^*K + \alpha I)^{-1} K^*K x_o - x_o = -\alpha (K^*K + \alpha I)^{-1} x_o. $$
Now, with the source condition (46), one then obtains
$$ \|x^{\alpha,o} - x_o\|_X = \alpha\, \|(K^*K + \alpha I)^{-1} (K^*K)^{\nu/2} z_o\|_X \le \alpha\, \varrho\, \|z_o\|_X, $$
where $\varrho$ is the spectral radius of the operator $(K^*K + \alpha I)^{-1} (K^*K)^{\nu/2}$. Since $K^*K$ is Hermitian, one has
$$ \varrho \le \sup_{t > 0} \frac{t^{\nu/2}}{t + \alpha} \le \alpha^{(\nu - 2)/2} \sup_{\tau > 0} \frac{\tau^{\nu/2}}{\tau + 1}. $$
Since for $0 < \nu \le 2$ the supremum is finite, this gives that $\varrho = O\bigl(\alpha^{(\nu-2)/2}\bigr)$, and the theorem follows.
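The estimator (51) and the bound (48) are easy to try out for matrices. The following is a minimal sketch, assuming a hypothetical random matrix `K`, true solution `x_o`, and noise `eta`. Since $\sup_{\tau>0} \tau/(\tau+1)^2 = 1/4$, the proof above gives (48) with $c = 1/2$, and that is the constant tested here.

```python
import numpy as np

def tikhonov(K, y, alpha):
    """Solve the normal equations (K^T K + alpha I) x = K^T y, cf. (51)."""
    n = K.shape[1]
    return np.linalg.solve(K.T @ K + alpha * np.eye(n), K.T @ y)

rng = np.random.default_rng(1)
K = rng.standard_normal((20, 15))        # hypothetical forward operator
x_o = rng.standard_normal(15)            # hypothetical true solution
eta = 0.1 * rng.standard_normal(20)      # hypothetical noise
y = K @ x_o + eta

alphas = [1e-4, 1e-2, 1.0]
# noise part of the error: x^{alpha,delta} - x^{alpha,o}
noise_parts = [np.linalg.norm(tikhonov(K, y, a) - tikhonov(K, K @ x_o, a))
               for a in alphas]
# right side of (48) with c = 1/2
bounds = [0.5 * a ** (-0.5) * np.linalg.norm(eta) for a in alphas]
```

Solving the normal equations directly keeps the sketch close to (51); for badly conditioned problems a factorization-based solver (for example, one built on the singular value decomposition) would be the more careful choice.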
4.2 Weakly Bounded Noise
Tikhonov regularization may now be considered as the scheme to recover $x_o$ from the data $y$ in the weak noise model
$$ y = Kx_o + \eta. \tag{52} $$
Thus, it is assumed that there is a smoothing operator $T$ such that the noise $\eta$ and $T$ satisfy (33) and (34). In particular,
$$ \langle \eta, T\eta \rangle_Y = \delta^2, \tag{53} $$
and the discussion focuses on what happens when $\delta \to 0$. Formally, Tikhonov regularization does not depend on the noise being strongly or weakly bounded. Thus, $x_o$ is estimated by the solution $x = x^{\alpha,\delta}$ of
$$ \text{minimize } L(x \mid \alpha, y) \quad \text{over } x \in X, \tag{54} $$
with $L(x \mid \alpha, y)$ as in (45), for some positive regularization parameter $\alpha$ yet to be specified. As said, one wants to obtain bounds on the error $\|x^{\alpha,\delta} - x_o\|_X$, and it is broken up as
$$ \|x^{\alpha,\delta} - x_o\|_X \le \|x^{\alpha,\delta} - x^{\alpha,o}\|_X + \|x^{\alpha,o} - x_o\|_X, \tag{55} $$
where $x^{\alpha,o}$ is the “noiseless” estimator, i.e., the minimizer of $L(x \mid \alpha, Kx_o)$. Thus, $x^{\alpha,\delta} - x^{\alpha,o}$ is the noise part of the error and $x^{\alpha,o} - x_o$ is the error introduced by the regularization. It is useful to introduce a new norm on $X$ by way of
$$ \|x\|_{\alpha,X}^2 = \|Kx\|_Y^2 + \alpha \|x\|_X^2. \tag{56} $$
Assuming again the source condition (46), the noiseless part $x^{\alpha,o} - x_o$ is covered by Theorem 2, but the treatment of the noise part is markedly different.

Theorem 4. Under the conditions (33) and (34) on the noise $\eta$, there exists a constant $C$ depending on $T$ only such that for $\alpha \to 0$
$$ \|x^{\alpha,\delta} - x^{\alpha,o}\|_{\alpha,X} \le C\, \alpha^{-1/4m}\, \delta. \tag{57} $$

Theorems 2 and 4 above then give the following convergence rates. In Eggermont et al. (2009b) it is shown that they are optimal, following Natterer (1984), assuming in addition that $T^{-m}K$ has a continuous inverse.

Theorem 5. Assuming the source condition (46) and the conditions (33) and (34) on the noise, for $\alpha \to 0$,
$$ \|x^{\alpha,\delta} - x_o\|_X = O\bigl(\alpha^{-1/2 - 1/4m}\, \delta + \alpha^{\nu/2}\bigr). \tag{58} $$
Moreover, if $\alpha \asymp \delta^{4m/(2m\nu + 2m + 1)}$ then $\|x^{\alpha,\delta} - x_o\|_X = O\bigl(\delta^{2m\nu/(2m\nu + 2m + 1)}\bigr)$.

In the remainder of this subsection, Theorem 4 is proved. The direct computational approach of the previous section does not seem very convenient here, so one proceeds by way of the following equality due to Ribière (1967).

Lemma 2. Let $x^{\alpha,\delta}$ be the minimizer of $L(x \mid \alpha, y)$. Then, for all $x \in X$
$$ \|x - x^{\alpha,\delta}\|_{\alpha,X}^2 = \langle Kx - y, K(x - x^{\alpha,\delta}) \rangle_Y + \alpha \langle x, x - x^{\alpha,\delta} \rangle_X. \tag{59} $$
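Proof. Look at $L(x \mid \alpha, y) - L(x^{\alpha,\delta} \mid \alpha, y)$ in two ways. First, calculate the quadratic Taylor expansion of $L(x \mid \alpha, y)$ around the point $x^{\alpha,\delta}$. So
$$ L(x \mid \alpha, y) = L(x^{\alpha,\delta} \mid \alpha, y) + \langle L'(x^{\alpha,\delta} \mid \alpha, y), x - x^{\alpha,\delta} \rangle_X + \|x - x^{\alpha,\delta}\|_{\alpha,X}^2, $$
where $L'(x \mid \alpha, y)$ is the Gateaux derivative of $L(x \mid \alpha, y)$,
$$ L'(x \mid \alpha, y) = 2K^*(Kx - y) + 2\alpha x. \tag{60} $$
Since $x^{\alpha,\delta}$ is the minimizer of $L(x \mid \alpha, y)$, the Gateaux derivative $L'(x^{\alpha,\delta} \mid \alpha, y)$ vanishes. This shows that for all $x \in X$
$$ \|x - x^{\alpha,\delta}\|_{\alpha,X}^2 = L(x \mid \alpha, y) - L(x^{\alpha,\delta} \mid \alpha, y). \tag{61} $$
Second, expand $L(x^{\alpha,\delta} \mid \alpha, y)$ around $x$, so
$$ L(x^{\alpha,\delta} \mid \alpha, y) = L(x \mid \alpha, y) + \langle L'(x \mid \alpha, y), x^{\alpha,\delta} - x \rangle_X + \|x^{\alpha,\delta} - x\|_{\alpha,X}^2. $$
With (60) then
$$ L(x \mid \alpha, y) - L(x^{\alpha,\delta} \mid \alpha, y) = -2\langle Kx - y, K(x^{\alpha,\delta} - x) \rangle_Y - 2\alpha \langle x, x^{\alpha,\delta} - x \rangle_X - \|x^{\alpha,\delta} - x\|_{\alpha,X}^2. $$
Substituting this into (61) gives the required result.

Corollary 1. Under the conditions of Lemma 2
$$ \|x^{\alpha,o} - x^{\alpha,\delta}\|_{\alpha,X}^2 = \langle \eta, K(x^{\alpha,\delta} - x^{\alpha,o}) \rangle_Y. $$

Proof. Let $\varepsilon^{\alpha,\delta} \equiv x^{\alpha,o} - x^{\alpha,\delta}$. In Lemma 2, take $x = x^{\alpha,o}$. Then
$$ \|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2 = \langle Kx^{\alpha,o} - y, K\varepsilon^{\alpha,\delta} \rangle_Y + \alpha \langle x^{\alpha,o}, \varepsilon^{\alpha,\delta} \rangle_X. \tag{62} $$
Next, since $x = x^{\alpha,o}$ is the minimizer of $L(x \mid \alpha, Kx_o)$, Lemma 2 gives for all $x \in X$,
$$ \|x - x^{\alpha,o}\|_{\alpha,X}^2 = \langle Kx - Kx_o, K(x - x^{\alpha,o}) \rangle_Y + \alpha \langle x, x - x^{\alpha,o} \rangle_X. \tag{63} $$
Now take $x = x^{\alpha,\delta}$. Then
$$ \|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2 = -\langle Kx^{\alpha,\delta} - Kx_o, K\varepsilon^{\alpha,\delta} \rangle_Y - \alpha \langle x^{\alpha,\delta}, \varepsilon^{\alpha,\delta} \rangle_X, $$
and add this to (62). The result is that
$$ 2\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2 = \langle Kx^{\alpha,o} - y - Kx^{\alpha,\delta} + Kx_o, K\varepsilon^{\alpha,\delta} \rangle_Y + \alpha \langle x^{\alpha,o} - x^{\alpha,\delta}, \varepsilon^{\alpha,\delta} \rangle_X, $$
or
$$ 2\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2 = \|K\varepsilon^{\alpha,\delta}\|_Y^2 - \langle y - Kx_o, K\varepsilon^{\alpha,\delta} \rangle_Y + \alpha \|\varepsilon^{\alpha,\delta}\|_X^2, $$
and the corollary follows, since $y - Kx_o = \eta$ and the first and last terms on the right add up to $\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2$.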
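Both identities are exact, so they can be verified to rounding error for matrices. The sketch below assumes hypothetical random data (`K`, `x_o`, `eta`) and an arbitrary test vector `x`; it evaluates both sides of (59) and of Corollary 1.

```python
import numpy as np

rng = np.random.default_rng(2)
K = rng.standard_normal((12, 8))         # hypothetical forward operator
x_o = rng.standard_normal(8)             # hypothetical true solution
eta = 0.1 * rng.standard_normal(12)      # hypothetical noise
y = K @ x_o + eta
alpha = 0.05

def tikhonov(K, y, alpha):
    """Minimizer of L(x | alpha, y) via the normal equations, cf. (51)."""
    return np.linalg.solve(K.T @ K + alpha * np.eye(K.shape[1]), K.T @ y)

def sq_norm_alpha(K, x, alpha):
    """Squared norm (56): ||Kx||^2 + alpha ||x||^2."""
    return np.sum((K @ x) ** 2) + alpha * np.sum(x ** 2)

x_ad = tikhonov(K, y, alpha)             # x^{alpha,delta}
x_ao = tikhonov(K, K @ x_o, alpha)       # noiseless estimator x^{alpha,o}

# Lemma 2, equation (59), for an arbitrary x
x = rng.standard_normal(8)
lemma2_lhs = sq_norm_alpha(K, x - x_ad, alpha)
lemma2_rhs = np.dot(K @ x - y, K @ (x - x_ad)) + alpha * np.dot(x, x - x_ad)

# Corollary 1
cor1_lhs = sq_norm_alpha(K, x_ao - x_ad, alpha)
cor1_rhs = np.dot(eta, K @ (x_ad - x_ao))
```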
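Proof of Theorem 4. By Corollary 1, one needs to properly bound $\langle \eta, K(x^{\alpha,o} - x^{\alpha,\delta}) \rangle_Y$. By Lemma 1 one has for all $\beta > 0$ and all $x \in X$
$$ |\langle \eta, Kx \rangle_Y| \le \beta^{-1/2}\, \delta\, \|Kx\|_{m,\beta}. \tag{64} $$
Now,
$$ \|Kx\|_{m,\beta}^2 = \|Kx\|_Y^2 + \beta^{2m} \|T^{-m}Kx\|_Y^2 \le \|Kx\|_Y^2 + c\, \beta^{2m} \|x\|_X^2, $$
the last inequality by assumption (34) for a suitable constant $c$. Consequently,
$$ \|Kx\|_{m,\beta} \le c\, \|x\|_{\alpha,X} \quad \text{for} \quad \beta = \alpha^{1/2m}. \tag{65} $$
Now apply this to (64) and that to $\langle \eta, K(x^{\alpha,o} - x^{\alpha,\delta}) \rangle_Y$. With $\varepsilon^{\alpha,\delta} = x^{\alpha,o} - x^{\alpha,\delta}$, Corollary 1 then gives $\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2 \le c\, \alpha^{-1/4m}\, \delta\, \|\varepsilon^{\alpha,\delta}\|_{\alpha,X}$, which is (57).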
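The rate in Theorem 5 follows from Theorems 2 and 4 by elementary arithmetic. The short derivation below is a sketch of that bookkeeping, using only (56), (57), and (49): the first line passes from the norm $\|\cdot\|_{\alpha,X}$ to $\|\cdot\|_X$, and the remaining lines balance the two terms of (58).

```latex
\begin{align*}
\alpha \|x\|_X^2 \le \|x\|_{\alpha,X}^2
  &\;\Longrightarrow\;
  \|x^{\alpha,\delta} - x^{\alpha,o}\|_X
  \le \alpha^{-1/2}\, \|x^{\alpha,\delta} - x^{\alpha,o}\|_{\alpha,X}
  \le C\, \alpha^{-1/2 - 1/(4m)}\, \delta, \\
\alpha^{-1/2 - 1/(4m)}\, \delta = \alpha^{\nu/2}
  &\;\Longleftrightarrow\;
  \delta = \alpha^{(2m\nu + 2m + 1)/(4m)}
  \;\Longleftrightarrow\;
  \alpha = \delta^{4m/(2m\nu + 2m + 1)}, \\
\alpha^{\nu/2}
  &= \delta^{2m\nu/(2m\nu + 2m + 1)} .
\end{align*}
```

For comparison, the same balancing in the strong noise case, $\alpha^{-1/2}\delta = \alpha^{\nu/2}$, gives $\alpha = \delta^{2/(\nu+1)}$ and the rate $\delta^{\nu/(\nu+1)}$ of Theorem 3.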
5 Regularization Parameter Selection
The rates of convergence for Tikhonov regularization for weakly bounded noise established in Sect. 4 are nice, but practically speaking they only show what is possible in an asymptotic sense under perfect information on the noise and the source condition. In this section, some data-driven methods are explored for selecting the regularization parameter: Morozov’s discrepancy principle (apparently applicable only for strongly bounded noise) and Lepskiĭ’s method in the interpretation of Mathé (2006).
5.1 Discrepancy Principles
First, the data $y \in Y$ is considered in the classical model $y = Kx_o + \eta$ for some unknown $x_o$ and noise $\eta$ with $\|\eta\|_Y = \delta$ for “small” $\delta$. The estimator of $x_o$ is given by Tikhonov regularization of the equation $Kx = y$, that is,
$$ x^{\alpha,\delta} = (K^*K + \alpha I)^{-1} K^* y, \tag{66} $$
except of course that this requires a specific choice for $\alpha$. The oldest a posteriori method for choosing the regularization parameter is Morozov’s discrepancy principle (Morozov 1966, 1984) based on the behavior of the residual
$$ r(\alpha) = \|Kx^{\alpha,\delta} - y\|_Y, \quad \alpha > 0. \tag{67} $$
Since for the “exact” solution, the residual would equal $\|\eta\|_Y$, in Morozov’s discrepancy principle one chooses $\alpha = \alpha_M$ such that
$$ r(\alpha_M) = \|\eta\|_Y. \tag{68} $$
Writing $Kx^{\alpha,\delta} - y = -\alpha (KK^* + \alpha I)^{-1} y$, one easily shows that $r(\alpha)$ is a strictly increasing function of $\alpha > 0$. Moreover, it can be shown that
$$ \lim_{\alpha \to \infty} r(\alpha) = \|y\|_Y \quad \text{and} \quad \lim_{\alpha \to 0} r(\alpha) = 0, \tag{69} $$
so that (68) has a unique solution. (In the second part of (69) it is assumed that the range of $K$ is dense in $Y$. If this is not the case, one must consider $r(\alpha) = \|Kx^{\alpha,\delta} - Qy\|_Y$, where $Q$ is the orthogonal projection operator onto the closure of the range of $K$ in $Y$.)

Before discussing the (im)possibility of adapting the discrepancy principle to weakly bounded noise, it is illustrative to explore why Morozov’s discrepancy principle works in the classical case. Indeed, as discovered by Groetsch (1983), in Morozov’s method, one is minimizing an upper bound for the error $\|x^{\alpha,\delta} - x_o\|_X$. To see this, write
$$ \|x^{\alpha,\delta} - x_o\|_X^2 = \|x^{\alpha,\delta}\|_X^2 + \|x_o\|_X^2 - 2\langle x^{\alpha,\delta}, x_o \rangle_X, $$
and observe that $x^{\alpha,\delta} = (K^*K + \alpha I)^{-1} K^* y = K^* (KK^* + \alpha I)^{-1} y$, so that
$$ \langle x^{\alpha,\delta}, x_o \rangle_X = \langle (KK^* + \alpha I)^{-1} y, Kx_o \rangle_Y = \langle (KK^* + \alpha I)^{-1} y, y \rangle_Y - \langle (KK^* + \alpha I)^{-1} y, \eta \rangle_Y. $$
Now, the last term may be bounded by $\|(KK^* + \alpha I)^{-1} y\|_Y\, \|\eta\|_Y$. It follows that
$$ \|x^{\alpha,\delta} - x_o\|_X^2 \le U(\alpha) \le \|x^{\alpha,\delta} - x_o\|_X^2 + 4\, \|(KK^* + \alpha I)^{-1} y\|_Y\, \|\eta\|_Y, \tag{70} $$
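where
$$ U(\alpha) = \|x^{\alpha,\delta}\|_X^2 + \|x_o\|_X^2 - 2\langle (KK^* + \alpha I)^{-1} y, y \rangle_Y + 2\delta\, \|(KK^* + \alpha I)^{-1} y\|_Y. \tag{71} $$
Now, writing $\|x^{\alpha,\delta}\|_X^2 = \langle y, (KK^* + \alpha I)^{-2} KK^* y \rangle_Y$, it is a somewhat laborious but straightforward exercise to show that
$$ \frac{dU}{d\alpha} = 2\alpha \langle y, (KK^* + \alpha I)^{-3} y \rangle_Y - \frac{2\delta\, \langle y, (KK^* + \alpha I)^{-3} y \rangle_Y}{\|(KK^* + \alpha I)^{-1} y\|_Y}, $$
so that setting the derivative equal to 0 gives
$$ \alpha\, \|(KK^* + \alpha I)^{-1} y\|_Y = \|\eta\|_Y. \tag{72} $$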
Since
$$ \alpha (KK^* + \alpha I)^{-1} y = y - Kx^{\alpha,\delta}, \tag{73} $$
this is Morozov’s discrepancy principle. One then derives convergence rates for $\|x^{\alpha,\delta} - x_o\|_X$ with $\alpha = \alpha_M$ by theoretically minimizing over $\alpha$ the upper bound of (70) for $U(\alpha)$, which by (73) may be written as
$$ U(\alpha) \le \|x^{\alpha,\delta} - x_o\|_X^2 + 4\alpha^{-1}\, \|Kx^{\alpha,\delta} - y\|_Y\, \|\eta\|_Y. \tag{74} $$
Using the estimates from Sect. 4 will do the trick. See Groetsch (1984) or also Gfrerer (1987).

Can Morozov’s discrepancy principle be applied to weakly bounded noise? The real question is whether one can derive an upper bound for the error $\|x^{\alpha,\delta} - x_o\|_X$ based on weakly bounded noise, but this seems to be an open question. However, one can do some plausible reasoning. In the optimal case
$$ \|Kx^{\alpha,\delta} - y\|_Y^2 \approx \|\eta\|_Y^2, $$
so this cannot account for weakly bounded noise. Then it seems to make sense to smooth it out as
$$ \|y - Kx^{\alpha,\delta}\|_{S_\alpha}^2 \overset{\text{def}}{=} \langle y - Kx^{\alpha,\delta}, S_\alpha (y - Kx^{\alpha,\delta}) \rangle_Y, $$
where $S_\alpha = K(K^*K + \alpha I)^{-1} K^*$ is a smoothing operator that makes sense in this context. A weak discrepancy principle would then read
$$ \|y - Kx^{\alpha,\delta}\|_{S_\alpha}^2 = C\, \alpha^{-1/2m}\, \delta^2 \tag{75} $$
for a “suitable” constant $C$. However, since $\|y - Kx^{\alpha,\delta}\|_{S_\alpha}$ is certainly not strictly increasing (it tends to 0 for $\alpha \to 0$ and $\alpha \to \infty$), this raises more questions than it answers.
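For the classical (strong noise) case, the monotonicity of $r(\alpha)$ makes the discrepancy equation (68) easy to solve numerically. The sketch below is one possible implementation under stated assumptions: a midpoint-rule discretization of the Green’s operator (8), a hypothetical true solution and noise, Euclidean norms throughout, and a bisection on a logarithmic scale. It evaluates $r(\alpha)$ through the singular value decomposition of the square, invertible matrix `K`, using (73).

```python
import numpy as np

n = 50
t = (np.arange(n) + 0.5) / n
s_grid, t_grid = np.meshgrid(t, t, indexing="ij")
K = (np.minimum(s_grid, t_grid) - s_grid * t_grid) / n   # discretized (8)

x_o = np.ones(n)                                # hypothetical true solution
eta = 1e-3 * np.sin(20 * np.pi * t)             # hypothetical noise
y = K @ x_o + eta
delta = np.linalg.norm(eta)                     # noise level, assumed known

U, s, Vt = np.linalg.svd(K)
c = U.T @ y                                     # data in the left singular basis

def residual(alpha):
    """r(alpha) = ||alpha (K K^T + alpha I)^{-1} y||, cf. (67) and (73)."""
    return np.linalg.norm(alpha * c / (s**2 + alpha))

# Bisection on a logarithmic scale; r is strictly increasing in alpha.
lo, hi = 1e-14, 1e4
for _ in range(200):
    mid = np.sqrt(lo * hi)
    if residual(mid) < delta:
        lo = mid
    else:
        hi = mid
alpha_M = np.sqrt(lo * hi)

# Tikhonov solution (66) at alpha_M, written with the same decomposition.
x_M = Vt.T @ (s / (s**2 + alpha_M) * c)
```

The bracket `[lo, hi]` is an assumption that must contain the root; by (69) this holds whenever $\delta < \|y\|_Y$ and `lo` is small enough relative to the smallest singular value.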
5.2 Lepskiĭ’s Principle
In this section, the adaptation to weakly bounded noise of the method of Lepskiĭ (1990) for choosing the regularization parameter $\alpha$ in Tikhonov regularization is discussed. This method originated in nonparametric regression with random noise (Lepskiĭ 1990), but the method has found a ready interpretation in the classical setting of ill-posed problems, see Bissantz et al. (2007), Mathé and Pereverzev (2006a,b), and Pereverzev and Schock (2005) and the concise interpretation of Mathé (2006).
The Lepskiĭ principle in the interpretation of Mathé (2006) may be adapted to the weakly bounded noise setting as follows. Let
$$ \Psi(\alpha) = 2\delta\, \alpha^{-(2m+1)/4m}, \quad \alpha > 0. \tag{76} $$
This is an overestimate of the contribution of the noise to the solution. One observes that $\Psi(\alpha)$ is a decreasing function of $\alpha$. In Sect. 4.2, it was shown that
$$ \|x^{\alpha,\delta} - x^{\alpha,o}\|_X \le \tfrac{1}{2} \Psi(\alpha). \tag{77} $$
It then follows that
$$ \|x^{\alpha,\delta} - x_o\|_X \le \tfrac{1}{2} \Psi(\alpha) + \|x^{\alpha,o} - x_o\|_X, \tag{78} $$
and this is $\le \Psi(\alpha)$ for all suitably small $\alpha$ since $\|x^{\alpha,o} - x_o\|_X \to 0$ and $\Psi(\alpha) \to \infty$ for $\alpha \to 0$. Define $\alpha_{\mathrm{star}} = \sup\{\alpha > 0 : \forall\, 0$ …
as well as the approximations to
$$ \mathbb{E}\bigl[ n^{-1/2}\, \|x(\alpha_{\mathrm{star}}, \delta) - x_o\|_{\mathbb{R}^n} \bigr] \quad \text{and} \quad \mathbb{E}\bigl[ n^{-1/2}\, \|x(\alpha_D, \delta) - x_o\|_{\mathbb{R}^n} \bigr] $$
in connection with Lepskiĭ’s principle (see below for $\alpha_D$). The $\Psi$-function used is
$$ \Psi(\alpha) = 2\alpha^{-9/16}\, n^{-1/2}\, \delta, \tag{90} $$
which corresponds to $m = 4$. So, here the operator $T$ of (33) and (34) is taken to be such that $T^{-1}$ is differentiation of order 1/2. This almost works theoretically, since for random noise one has
Table 1 The mean errors $(1/\sqrt{n})\, \|x(\alpha, \delta) - x_o\|_{\mathbb{R}^n}$ for various $\alpha$

n      Optimal      Star         Dangerous     Lepskiĭ
100    5.9047e-2    6.9908e-2    9.6766e-02    1.7657e-1
200    5.0559e-2    5.9081e-2    7.7766e-02    1.4667e-1
400    4.1698e-2    4.9118e-2    6.5244e-02    1.1948e-1
Fig. 1 Graphs of the true error (solid), the $\Psi(\beta)$ function (dotted), $2\Psi(\beta)$ (dash-dotted), and $(1/\sqrt{n})\, \|x(\beta, \delta) - x(\alpha_L, \delta)\|_{\mathbb{R}^n}$ (the Lepskiĭ curve) versus $\log_{10}(\beta)$ in a typical case
$$ \langle \eta, T\eta \rangle_{L^2(0,1)} < \infty \tag{91} $$
if $T$ is a smoothing operator of order $> 1/2$. See Eggermont and LaRiccia (2009a), Chap. 13. Also, in (90) the weak noise level is not $\delta$ but $\delta/\sqrt{n}$. The results for a “dangerous” choice, $\alpha = \alpha_D$, are also shown. This choice comes about as follows. In Fig. 1 a typical graph is shown of
$$ \log_{10}\bigl( n^{-1/2}\, \|x(\beta, \delta) - x(\alpha_L, \delta)\|_{\mathbb{R}^n} \bigr), \quad 0 < \beta < \alpha_L, $$
which lies below the graph of $2\Psi(\beta)$, but touches it at one point, the abscissa of which is denoted by $\alpha_D$. Increasing $\alpha_L$ would make part of the graph lie above that of $2\Psi(\beta)$. Observe that the logarithm tends to $-\infty$ for $\beta \to \alpha_L$, as it should. In Fig. 1, the graphs of the true error and of $\Psi(\beta)$ and $2\Psi(\beta)$ are also shown. It seems clear from the graph that the choice $\alpha = \alpha_D$ is reasonable, but since there is no theoretical backing, it is called the “dangerous” one. It is clear from inspecting Table 1 that the Lepskiĭ principle works, but that the dangerous choice works better (for this example).
7 Conclusions and Future Directions
It has been demonstrated that weak noise requires an essentially different treatment compared to strongly bounded or “classical” noise. However, it is clear that only the surface of weak noise models in ill-posed problems has been scratched, most notably with regard to nonlinear problems and moment discretization. This will be discussed in later works. In the classical approach of moment discretization, this leads naturally to reproducing kernel Hilbert spaces (Nashed and Wahba 1974a,b; Nashed 1976, 1981). However, reproducing kernel Hilbert space ideas are hidden in this chapter as well, most notably in the Weak Noise Lemma 1, which should be compared with the Random Sum Lemma 13.2.20 of Eggermont and LaRiccia (2009a). There are two more pressing areas that need study, to wit more flexible recovery schemes and procedures for the selection of the regularization parameter that automatically adapt to generally unknown source conditions. The two are sometimes not easily disentangled. The hypothesis is offered that the “usual” recovery schemes for strongly bounded noise will also work for weakly bounded noise. Thus, total-variation regularization (Rudin et al. 1992; Dobson and Scherzer 1996; Vogel and Oman 1996), inverse scale space methods (Groetsch and Scherzer 1984; Burger et al. 2007), and iterative methods (Eicke et al. 1990; Frankenberger and Hanke 2000), to name a few, should prove promising avenues of investigation for weak noise models. Regarding smoothing parameter selection, things are much less clear. Some selection criteria obviously do not apply, such as Morozov’s discrepancy principle and variants for classical noise (Mathé and Pereverzev 2006a), but others do. Lepskiĭ’s method (Mathé 2006; Mathé and Pereverzev 2006b) may be adapted (but the actual implementation is storage intensive).
The L-method works for weak noise models in the sense that the graph of $\|Kx^{\alpha,\delta} - y\|_Y$ versus $\|x^{\alpha,\delta}\|_X$ exhibits the usual L-shape (but identifying the corner is problematic). It would be interesting to see under what conditions methods designed explicitly for random noise apply to weakly bounded noise as well, see Desbat and Girard (1995).
8 Synopsis
1. Let X and Y be Hilbert spaces with inner products h,iX ,h,iY and norms jj jjX , jj jjY . Let KWX !Y be linear and compact. Consider the data y 2 Y , y D Kxo C
with unknown xo 2 X and noise 2 Y .
1656
P.N. Eggermont et al.
2. Goal: Recover xo ! Problem is ill-posed because K 1 is not bounded. 3. Classical view: jj jjY is “small.” Analyze what happens when jj jjY ! 0. 4. Cases not covered: 1. High-frequency noise (not “small”) 2. White noise . … Y Š/ 3. Random discrete noise (not “small”; more data becomes available) 4. Signal-to-noise-ratio < 10 %! 5. New noise model: y D Kxo C with
! 0 weakly in Y: Need rates on the weak convergence. 6. Precise conditions for weakly bounded noise (Sect. 3): T :Y ! Y linear, compact, Hermitian, positive definite ; ı 2 D h ; T iY ! 0; K.X / T m .Y / for some m > 1. (m need not be an integer.) Note: this does not imply jj jjY ! 0! 7. T compact and ! 0 weakly implies h ; T iY ! 0 but T must be related to K, e.g., T D .K K/1=2m , but then . . . . 8. Weak noise Lemma 1: h ; KxiY 6 .2ˇ/1=2 ıfjjKxjj2Y C ˇ 2m jjT m Kxjj2Y g1=2 : So, with ˇ D ˛ 1=2m and T m K bounded h ; KxiY 6 c˛ 1=4m ıjjxjj˛;X ; with jjxjj2˛;X D jjKxjj2Y C ˛jjxjj2X : 9. Ideally suited for Tikhonov regularization: x ˛;ı D arg min jjKx yjj2Y C ˛jjxjj2X Š 10. Ribière’s (in)equality with "˛;ı D x ˛;ı xo (Lemma 2) jj"˛;ı jj2˛;X D h ; K"˛;ı iY C ˛hxo ; "˛;ı iX : Two terms ! 11. h ; K"˛;ı iY : see (h).
Noise Models for Ill-Posed Problems
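12. $\alpha\langle x_o, \varepsilon^{\alpha,\delta}\rangle_X \leq c\,\alpha^{(\nu+1)/2}\,\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}$ if one has a source condition $x_o = (K^*K)^{\nu/2}z_o$ with $0 < \nu \leq 2$, for some $z_o \in X$ (from “classic” Tikhonov regularization).
13. Leads to $\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}^2 \leq c\,\{\alpha^{-1/(4m)}\,\delta + \alpha^{(\nu+1)/2}\}\,\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}$, and rates follow!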
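The final step of the synopsis can be made explicit. The following is only a sketch of the usual balancing argument, assuming the weak noise bound of item 8 in the form $c\,\alpha^{-1/(4m)}\,\delta\,\|x\|_{\alpha,X}$ and the source condition of item 12; the constant $c$ is generic, the symbols $\eta$ and $\nu$ are used for the noise and the smoothness index, and the parameter choice shown is the one suggested by balancing the two terms rather than a statement quoted from the chapter.

```latex
\begin{align*}
\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}
  &\le c\,\bigl\{\alpha^{-1/(4m)}\,\delta + \alpha^{(\nu+1)/2}\bigr\}
  && \text{(divide the bound of item 13 by } \|\varepsilon^{\alpha,\delta}\|_{\alpha,X}\text{)} \\
\alpha^{-1/(4m)}\,\delta = \alpha^{(\nu+1)/2}
  &\iff \alpha = \delta^{\,4m/(2m\nu+2m+1)}
  && \text{(balance the two terms)} \\
\|\varepsilon^{\alpha,\delta}\|_{X}
  \le \alpha^{-1/2}\,\|\varepsilon^{\alpha,\delta}\|_{\alpha,X}
  &\le 2c\,\alpha^{\nu/2}
   = 2c\,\delta^{\,2m\nu/(2m\nu+2m+1)}
  && \text{(since } \alpha\|x\|_X^2 \le \|x\|_{\alpha,X}^2\text{)}
\end{align*}
```

As $m \to \infty$ the exponent tends to $\nu/(\nu+1)$, which is the familiar rate for Tikhonov regularization under strongly bounded noise.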
References Bissantz N, Hohage T, Munk A, Ruymgaart F (2007) Convergence rates of general regularization methods for statistical inverse problems and applications. SIAM J Numer Anal 45:2610–2626 Burger M, Resmerita E, He L (2007) Error estimation for Bregman iterations and inverse scale space methods in image restoration. Computing 81:109–135 Cavalier L, Golubev GK (2006) Risk hull method and regularization by projections of ill-posed inverse problems. Ann Stat 34:1653–1677 Cavalier L, Golubev GK, Picard D, Tsybakov AB (2002) Oracle inequalities for inverse problems. Ann Stat 30:843–874 Desbat L, Girard D (1995) The “minimum reconstruction error” choice of regularization parameters: some more efficient methods and their application to deconvolution problems. SIAM J Sci Comput 16:1387–1403 Dobson DC, Scherzer O (1996) Analysis of regularized total variation penalty methods for denoising. Inverse Probl 12:601–617 Duan H, Ng BP, See CMS, Fang J (2008) Broadband interference suppression performance of minimum redundancy arrays. IEEE Trans Signal Process 56:2406–2416 Eicke B (1992) Iteration methods for convexly constrained ill-posed problems in Hilbert space. Numer Funct Anal Optim 13:413–429 Eggermont PPB, LaRiccia VN (2009a) Maximum penalized likelihood estimation. Volume II: Regression. Springer, New York Eggermont PPB, LaRiccia VN, Nashed MZ (2009) On weakly bounded noise in ill-posed problems. Inverse Probl 25:115018 (14pp) Engl HW, Kunisch K, Neubauer A (1989) Convergence rates for Tikhonov regularisation of nonlinear ill-posed problems. Inverse Probl 5:523–540 Engl HW, Hanke M, Neubauer A (1996) Regularization of inverse problems. Kluwer, Dordrecht Frankenberger H, Hanke M (2000) Kernel polynomials for the solution of indefinite and ill-posed problems. Numer Algorithms 25:197–212 Franklin JN (1970) Well-posed stochastic extensions of ill-posed linear problems. 
J Math Anal Appl 31:682–716 Gfrerer H (1987) An a posteriori parameter choice for ordinary and iterated Tikhonov regularization of ill-posed problems leading to optimal convergence rates. Math Comput 49:523–542 Grenander U (1981) Abstract inference. Wiley, New York Groetsch CW (1983) Comments on Morozov’s discrepancy principle. In: Hämmerlin G, Hoffman KN (eds) Improperly posed problems and their numerical treatment. Birkhäuser, Basel Groetsch CW (1984) The theory of Tikhonov regularization for Fredholm equations of the first kind. Pitman, Boston Groetsch CW, Scherzer O (2002) Iterative stabilization and edge detection. Contemp Math 313:129–141. American Mathematical Society, Providence
Hadamard J (1902) Sur les problèmes aux derivées partièlles et leur signification physique. Princet Univ Bull 13:49–52 Hanke M, Scherzer O (2001) Inverse problems light: numerical differentiation. Am Math Mon 108:512–521 Kaipio J, Somersalo E (2005) Statistical and computational inverse problems. Springer, Berlin Kirsch A (1996) An introduction to the mathematical theory of inverse problems. Springer, Berlin Kress R (1999) Linear integral equations. Springer, Berlin Lepski˘ı OV (1990) On a problem of adaptive estimation in Gaussian white noise. Theory Probl Appl 35:454–466 Mair BA, Ruymgaart FH (1996) Statistical inverse estimation in Hilbert scales. SIAM J Appl Math 56:1424–1444 Mathé P (2006) The Lepski˘ı principle revisited. Inverse Probl 22:L11–L15 Mathé P, Pereverzev SV (2006a) The discretized discrepancy principle under general source conditions. J Complex 22:371–381 Mathé P, Pereverzev SV (2006b) Regularization of some linear ill-posed problems with discretized random noisy data. Math Comput 75:1913–1929 Morozov VA (1966) On the solution of functional equations by the method of regularization. Soviet Math Dokl 7:414–417 Morozov VA (1984) Methods for solving incorrectly posed problems. Springer, New York Nashed MZ (1976) On moment-discretization and least-squares solutions of linear integral equations of the first kind. J Math Anal Appl 53:359–366 Nashed MZ (1981) Operator-theoretic and computational approaches to ill-posed problems with applications to antenna theory. IEEE Trans Antennas Propag 29:220–231 Nashed MZ, Wahba G (1974a) Regularization and approximation of linear operator equations in reproducing kernel spaces. Bull Am Math Soc 80:1213–1218 Nashed MZ, Wahba G (1974b) Convergence rates of approximate least squares solutions of linear integral and operator equations of the first kind. Math Comput 28:69–80 Natterer F (1977) Regularisierung schlecht gestellter Probleme durch Projektionsverfahren. 
Numer Math 28:329–341 Natterer F (1984) Error bounds for Tikhonov regularization in Hilbert scales. Appl Anal 18:29–37 Neubauer A (1997) On converse and saturation results for Tikhonov regularization of linear illposed problems. SIAM J Numer Anal 34:517–527 Pereverzev SV, Schock E (2005) On the adaptive selection of the parameter in regularization of ill-posed problems. SIAM J Numer Anal 43:2060–2076 Phillips BL (1962) A technique for the numerical solution of certain integral equations of the first kind. J Assoc Comput Mach 9:84–97 Ribière G (1967) Régularisation d’opérateurs. Rev Informat Recherche Opérationnelle 1:57–79 Rudin LI, Osher SJ, Fatemi E (1992) Nonlinear total variation based noise removal algorithms. Physica D 60:259–268 Sudakov VN, Khalfin LA (1964) A statistical approach to the correctness of the problems of mathematical physics. Dokl Akad Nauk SSSR 157:1058–1060 Tikhonov AN (1943) On the stability of inverse problems. Dokl Akad Nauk SSSR 39:195–198 (in Russian) Tikhonov AN (1963) On the solution of incorrectly formulated problems and the regularization method. Dokl Akad Nauk SSSR 151:501–504 Tikhonov AN, Arsenin VYa (1977) Solutions of ill-posed problems. Wiley, New York Tikhonov AN, Goncharsky AV, Stepanov VV, Yagola AG (1995) Numerical methods for the solution of ill-posed problems. Kluwer, Dordrecht Tronicke J (2007) The influence of high frequency uncorrelated noise on first-break arrival times and crosshole traveltime tomography. J Environ Eng Geophys 12: 172–184 Vogel CR, Oman ME (1996) Iterative methods for total variation denoising. SIAM J Sci Comput 17:227–238 Wahba G (1973) Convergence rates of certain approximate solutions to Fredholm integral equations of the first kind. J Approx Theory 7:167–185
Sparsity in Inverse Geophysical Problems Markus Grasmair, Markus Haltmeier, and Otmar Scherzer
Contents
1 Introduction
1.1 Case Example: Ground Penetrating Radar
2 Variational Regularization Methods
3 Sparse Regularization
3.1 Convex Regularization
3.2 Nonconvex Regularization
4 Numerical Minimization
4.1 Iterative Thresholding Algorithms
4.2 Second Order Cone Programs
5 Application: Synthetic Focusing in Ground Penetrating Radar
5.1 Mathematical Model
5.2 Migration Versus Nonlinear Focusing
5.3 Numerical Examples
5.4 Application to Real Data
Appendix: Numerical Methods
Explicit Approximation Methods
Iteratively Reweighted Least Squares Method (IRLS)
Accelerated Thresholding Methods
Augmented Lagrangian Methods
References
M. Grasmair, Department of Mathematics, Norwegian University of Science and Technology, Trondheim, Norway
M. Haltmeier, Institute of Mathematics, University of Innsbruck, Innsbruck, Austria
O. Scherzer, Computational Science Center, University of Vienna, Vienna, Austria
© Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_25
Abstract
Many geophysical imaging problems are ill-posed in the sense that the solution does not depend continuously on the measured data. Therefore, their solutions cannot be computed directly but instead require the application of regularization. Standard regularization methods find approximate solutions with small $L^2$ norm. In contrast, sparsity regularization yields approximate solutions that have only a small number of nonvanishing coefficients with respect to a prescribed set of basis elements. Recent results demonstrate that these sparse solutions often represent real objects much better than solutions with small $L^2$ norm. In this survey, recent mathematical results for sparsity regularization are reviewed. As an application of the theoretical results, synthetic focusing in Ground Penetrating Radar is considered, which is a paradigmatic inverse geophysical problem.
1
Introduction
In a plethora of industrial problems, one aims at estimating the properties of a physical object from observed data. Often the relation between the physical object and the data can be modeled sufficiently well by a linear equation

$$Au = v, \tag{1}$$

where $u$ is a representation of the object in some Hilbert space $U$, and $v$ a representation of the measurement data, again in a Hilbert space $V$. Because the operator $A: U \to V$ in general is continuous, the relationship (1) allows one to easily compute data $v$ from the properties of the object $u$, provided they are known. This is the so-called forward problem. In many practical applications, however, one is interested in the inverse problem of estimating the quantity $u$ from measured data $v$. A typical feature of inverse problems is that the solution of (1) is very sensitive to perturbations in $v$. Because in practical applications only an approximation $v^\delta$ of the true data $v$ is given, the direct solution of Eq. 1 by applying the inverse operator is therefore not advisable (see Engl et al. 1996; Scherzer et al. 2009). By incorporating a priori information about the exact solution, regularization methods allow one to calculate a reliable approximation of $u$ from the observed data $v^\delta$. In this chapter, the main interest lies in sparsity regularization, where the a priori information is that the true solution $u$ is sparse in the sense that only few coefficients $\langle u, \phi_\lambda\rangle$ with respect to some prescribed basis $(\phi_\lambda)_{\lambda\in\Lambda}$ are nonvanishing. In the discrete setting of compressed sensing, it has recently been shown that sparse solutions can be found by minimizing the $\ell^1$-norm of the coefficients $\langle u, \phi_\lambda\rangle$, see Donoho and Elad (2003) and Candès et al. (2006). Minimization of the $\ell^1$-norm for finding sparse solutions has, however, been proposed and studied much earlier for certain geophysical inverse problems (see Claerbout and Muir 1973; Levy and Fullagar 1981; Oldenburg et al. 1983; Santosa and Symes 1986).
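The sensitivity of the inverse problem to data perturbations can be illustrated numerically. The following is a minimal sketch, assuming a discretized Gaussian blur as a stand-in for a smoothing operator $A$; the function name `smoothing_operator`, the grid size, the kernel width, and the noise level are all illustrative choices and are not taken from the chapter.

```python
import numpy as np


def smoothing_operator(n: int, width: float = 0.05) -> np.ndarray:
    """Discretized Gaussian blur on [0, 1]; a hypothetical stand-in for A."""
    x = np.linspace(0.0, 1.0, n)
    kernel = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2.0 * width**2))
    return kernel / n


n = 50
A = smoothing_operator(n)

# A sparse "object" u and its exact data v = A u (the forward problem is stable).
u_true = np.zeros(n)
u_true[[10, 30]] = [1.0, -0.5]
v = A @ u_true

# Perturb the data by a vector of norm 1e-6.
rng = np.random.default_rng(0)
noise = rng.standard_normal(n)
v_delta = v + 1e-6 * noise / np.linalg.norm(noise)

# Naive inversion: apply the inverse operator directly to the noisy data.
u_naive = np.linalg.solve(A, v_delta)

data_error = np.linalg.norm(v_delta - v)
solution_error = np.linalg.norm(u_naive - u_true)
amplification = solution_error / data_error
condition_number = np.linalg.cond(A)
```

In this toy setting the tiny data error is amplified by many orders of magnitude, which is the discrete counterpart of the unbounded inverse discussed above.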
Fig. 1 Collecting GPR data from a flying helicopter. At each position on the flight path Γ, the antenna emits a short radar pulse. The radar waves get reflected, and the scattered signals are collected in radargrams. (Labels in the figure: antenna, flight path Γ, coordinate axes x1, x2, x3, snow, target, subsurface.)
1.1
Case Example: Ground Penetrating Radar
The case example of a geophysical inverse problem studied in this chapter is Ground Penetrating Radar (GPR), which aims at finding buried objects by measuring reflected radar signals (Daniels 2004). The reflected signals are detected in zero offset mode (emitting and detecting antenna are at the same position) and used to estimate the reflecting objects. The authors’ interest in GPR has been raised by the possibility of locating avalanche victims by means of a GPR system mounted on a flying helicopter (Haltmeier et al. 2005; Frühauf et al. 2009). The basic principle of collecting GPR data from a helicopter is shown in Fig. 1. In Sect. 5.1, it is shown that the imaging problem in GPR reduces to solving Eq. 1, with $A$ being the circular Radon transform. The inversion of the circular Radon transform also arises in several other current imaging modalities, such as SONAR, seismic imaging, ultrasound tomography, and photo-/thermo-acoustic tomography (see, e.g., Norton and Linzer 1981; Andersson 1988; Bleistein et al. 2001; Finch and Rakesh 2007; Patch and Scherzer 2007; Kuchment and Kunyansky 2008; Scherzer et al. 2009; Symes 2009 and the references therein).
2
Variational Regularization Methods
Let $U$ and $V$ be Hilbert spaces and let $A: U \to V$ be a bounded linear operator with unbounded inverse. Then, the problem of solving the operator equation $Au = v$ is ill-posed. In order to (approximately) solve this equation in a stable way, it is therefore necessary to introduce some a priori knowledge about the solution $u$, which can be expressed via smallness of some regularization functional $\mathcal{R}: U \to [0, +\infty]$. In classical regularization theory, one assumes that the possible solutions have a small energy in some Hilbert space norm; typically, an $L^2$- or $H^1$-norm is used, and one defines $\mathcal{R}$ as the square of this norm. In contrast, in this chapter the situation of sparsity constraints is considered, where one assumes that the possible solutions have a sparse expansion with respect to a given basis. In the following, $u^\dagger$ denotes any $\mathcal{R}$-minimizing solution of the equation $Au = v$, provided that it exists, that is,

$$u^\dagger \in \arg\min\,\{\mathcal{R}(u) : Au = v\}.$$

In applications, it is to be expected that the measurements of $v$ one can obtain are disturbed by noise. That is, one is not able to measure the true data $v$ but only has some noisy measurements $v^\delta$ available. In this case, solving the constrained minimization problem $\mathcal{R}(u) \to \min$ subject to $Au = v^\delta$ is not suitable, because the ill-posedness of the equation will lead to unreliable results. Even more, in the worst case it can happen that $v^\delta$ is not contained in the range of $A$, and thus the equation $Au = v^\delta$ has no solution at all. Thus, it is necessary to restrict oneself to solving the given equation only approximately. Three methods are considered for the approximate solution, all of which require knowledge about, or at least some estimate of, the noise level $\delta := \|v - v^\delta\|$.
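Residual method: Fix $\tau \geq 1$ and solve the constrained minimization problem

$$\mathcal{R}(u) \to \min \quad \text{subject to} \quad \|Au - v^\delta\| \leq \tau\delta. \tag{2}$$

Tikhonov regularization with discrepancy principle: Fix $\tau \geq 1$ and minimize the Tikhonov functional

$$\mathcal{T}_{\alpha,v^\delta}(u) := \|Au - v^\delta\|^2 + \alpha\,\mathcal{R}(u), \tag{3}$$

where $\alpha > 0$ is chosen in such a way that Morozov’s discrepancy principle is satisfied, that is, $\|Au_\alpha^\delta - v^\delta\| = \tau\delta$ with $u_\alpha^\delta \in \arg\min_u \mathcal{T}_{\alpha,v^\delta}(u)$.

Tikhonov regularization with a priori parameter choice: Fix $C > 0$ and minimize the Tikhonov functional (3) with a parameter choice

$$\alpha = C\delta. \tag{4}$$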
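The discrepancy principle can be sketched in a few lines for the simplest case. The following example assumes the quadratic penalty $\mathcal{R}(u) = \|u\|^2$ (not the sparsity penalty studied later), for which the Tikhonov minimizer has a closed form and the residual increases monotonically with $\alpha$, so that a bisection in $\log\alpha$ locates the parameter. The function names, the bracketing interval, the diagonal test operator, and the values of $\delta$ and $\tau$ are illustrative assumptions.

```python
import numpy as np


def tikhonov(A, v_delta, alpha):
    """Minimizer of ||A u - v_delta||^2 + alpha * ||u||^2 (quadratic penalty assumed)."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ v_delta)


def residual(A, v_delta, alpha):
    """Norm of the defect A u_alpha - v_delta for the Tikhonov minimizer u_alpha."""
    return np.linalg.norm(A @ tikhonov(A, v_delta, alpha) - v_delta)


def discrepancy_alpha(A, v_delta, delta, tau=1.0, lo=1e-14, hi=1e4, iters=200):
    """Bisection in log(alpha) for ||A u_alpha - v_delta|| = tau * delta.

    Assumes residual(lo) <= tau * delta < residual(hi).
    """
    for _ in range(iters):
        mid = np.sqrt(lo * hi)  # geometric midpoint
        if residual(A, v_delta, mid) > tau * delta:
            hi = mid
        else:
            lo = mid
    return np.sqrt(lo * hi)


# Hypothetical test problem: diagonal operator with decaying singular values.
A = np.diag(1.0 / np.arange(1, 21) ** 2)
u_true = np.ones(20)
v = A @ u_true

rng = np.random.default_rng(0)
noise = rng.standard_normal(20)
delta = 1e-3
v_delta = v + delta * noise / np.linalg.norm(noise)  # ||v - v_delta|| = delta

tau = 1.5
alpha_dp = discrepancy_alpha(A, v_delta, delta, tau)
res_dp = residual(A, v_delta, alpha_dp)

# A priori choice (4) for comparison, with an arbitrary constant C = 1.
alpha_apriori = 1.0 * delta
```

The a posteriori choice needs repeated solves but adapts to the data, whereas the a priori choice (4) requires only the noise level and a constant fixed in advance.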
The residual method aims for the minimization of the penalty term $\mathcal{R}$ over all elements $u$ that generate approximations of the given noisy data $v^\delta$; the size of the permitted defect is dictated by the assumed noise level $\delta$. In particular, the true solution $u^\dagger$ is guaranteed to be among the feasible elements in the minimization problem (2). The additional parameter $\tau \geq 1$ allows for some uncertainty concerning the precise noise level; if $\tau$ is strictly greater than 1, an underestimation of the noise would still yield a reasonable result.

If the regularization functional $\mathcal{R}$ is convex, the residual method can be shown to be equivalent to Tikhonov regularization with a parameter choice according to Morozov’s discrepancy principle, provided the size of the signal is larger than the noise level, that is, the signal-to-noise ratio is larger than $\tau$. In this case, the regularization parameter $\alpha$ in (3) plays the role of a Lagrange parameter for the solution of the constrained minimization problem (2). This equivalence result is summarized in the following theorem (see Ivanov et al. 2002, Theorems 3.5.2, 3.5.5):

Theorem 1. Assume that the operator $A: U \to V$ is linear and has a dense range and that the regularization term $\mathcal{R}$ is convex. In addition, assume that $\mathcal{R}(u) = 0$ if and only if $u = 0$. Then the residual method and Tikhonov regularization with an a posteriori parameter choice by means of the discrepancy principle are equivalent in the following sense: Let $v^\delta \in V$ and $\delta > 0$ satisfy $\|v^\delta\| > \tau\delta$. Then $u^\delta$ solves the constrained problem (2) if and only if $\|Au^\delta - v^\delta\| = \tau\delta$ and there exists some $\alpha > 0$ such that $u^\delta$ minimizes the Tikhonov functional (3).

In order to show that the methods introduced above are indeed regularizing, three properties necessarily have to be satisfied, namely, existence, stability, and convergence. In addition, convergence rates can be used to quantify the quality of the method:

• Existence: For each regularization parameter $\alpha > 0$ and every $v^\delta \in V$ the regularization functional $\mathcal{T}_{\alpha,v^\delta}$ attains its minimum. Similarly, the minimization problem (2) has a solution.
• Stability is required to ensure that, for fixed noise level $\delta$, the regularized solutions depend continuously on the data $v^\delta$.
• Convergence ensures that the regularized solutions converge to $u^\dagger$ as the noise level decreases to zero.
• Convergence rates provide an estimate of the difference between the minimizers of the regularization functional and $u^\dagger$.

Typically, convergence rates are formulated in terms of the Bregman distance (see Burger and Osher 2004; Resmerita 2005; Hofmann et al. 2007; Scherzer et al. 2009) which, for a convex and differentiable regularization term $\mathcal{R}$ with subdifferential $\partial\mathcal{R}$ and $\xi \in \partial\mathcal{R}(u^\dagger)$, is defined as

$$D_\xi(u, u^\dagger) = \mathcal{R}(u) - \mathcal{R}(u^\dagger) - \langle\xi, u - u^\dagger\rangle.$$
That is, $D_\xi(u, u^\dagger)$ measures the distance between the tangent and the convex function $\mathcal{R}$. In general, convergence with respect to the Bregman distance does not imply convergence with respect to the norm, strongly reducing the significance of the derived rates. In the setting of sparse regularization to be introduced below, however, it is possible to derive convergence rates with respect to the norm on $U$.
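A small finite-dimensional computation shows both faces of the Bregman distance. The sketch below uses the hypothetical helper `bregman_distance` and writes $\xi$ for the subgradient; the vectors are arbitrary illustrative choices. For the quadratic penalty $\|u\|^2$ the Bregman distance coincides with the squared norm distance, whereas for the $\ell^1$ penalty it can vanish although $u \neq u^\dagger$.

```python
import numpy as np


def bregman_distance(R, xi, u, u_dagger):
    """D_xi(u, u_dagger) = R(u) - R(u_dagger) - <xi, u - u_dagger>."""
    return R(u) - R(u_dagger) - np.dot(xi, u - u_dagger)


u_dagger = np.array([1.0, 0.0, -2.0])
u = np.array([3.0, 0.5, -2.0])

# Quadratic case: R(u) = ||u||^2 has the gradient xi = 2 * u_dagger at u_dagger.
d_quad = bregman_distance(lambda z: np.dot(z, z), 2.0 * u_dagger, u, u_dagger)

# l1 case: a subgradient equals sign(u_dagger) on the support and may take any
# value in [-1, 1] off the support; here the value 1.0 is chosen at index 1.
xi_l1 = np.array([1.0, 1.0, -1.0])
d_l1 = bregman_distance(lambda z: np.abs(z).sum(), xi_l1, u, u_dagger)

norm_distance = np.linalg.norm(u - u_dagger)
```

Here `d_quad` equals `norm_distance**2`, while `d_l1` is zero even though the two vectors differ, which is why Bregman rates alone say little about norm convergence.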
3
Sparse Regularization
In the following, the focus lies on sparsity promoting regularization methods. To that end, it is assumed that $(\phi_\lambda)_{\lambda\in\Lambda}$ is an orthonormal basis of the Hilbert space $U$, for instance a wavelet or Fourier basis. For $u \in U$, the support of $u$ with respect to the basis $(\phi_\lambda)_{\lambda\in\Lambda}$ is denoted by

$$\operatorname{supp}(u) := \{\lambda \in \Lambda : \langle\phi_\lambda, u\rangle \neq 0\}.$$

If $|\operatorname{supp}(u)| \leq s$ for some $s \in \mathbb{N}$, then the element $u$ is called $s$-sparse. It is called sparse if it is $s$-sparse for some $s \in \mathbb{N}$, that is, $|\operatorname{supp}(u)| < \infty$. Given weights $w_\lambda$, $\lambda \in \Lambda$, bounded below by some constant $w_{\min} > 0$, one defines for $0 < q \leq 2$ the $\ell^q$-regularization functional $\mathcal{R}_q: U \to \mathbb{R} \cup \{\infty\}$,

$$\mathcal{R}_q(u) := \sum_{\lambda\in\Lambda} w_\lambda\,|\langle\phi_\lambda, u\rangle|^q.$$

If $q = 2$, then the regularization functional is simply the weighted squared Hilbert space norm on $U$. If $q$ is smaller than 2, small coefficients $\langle\phi_\lambda, u\rangle$ are penalized comparatively more strongly, while the penalization of large coefficients becomes less pronounced. As a consequence, the reconstructions resulting from applying any of the above introduced regularization methods will exhibit a small number of significant coefficients, while most of the coefficients will be close to zero. These sparsity enhancing properties of $\ell^q$-regularization become more pronounced as the parameter $q$ decreases. If one chooses $q$ at most 1, then the reconstructions are necessarily sparse in the above, strict sense, that is, the number of nonzero coefficients is finite (see Grasmair 2010):

Proposition 1. Let $q \leq 1$, $\alpha > 0$, $v^\delta \in V$. Then every minimizer of the Tikhonov functional $\mathcal{T}_{\alpha,v^\delta}$ with regularization term $\mathcal{R}_q$ is sparse.

There are compelling reasons for using an exponent $q \geq 1$ in applications, as this choice entails the convexity of the ensuing regularization functionals. In contrast, a choice $q < 1$ leads to nonconvex minimization problems and, as a consequence, to numerical difficulties in their minimization. In the convex case $q \geq 1$, there are several possible strategies for computing the minimizers of the regularization functional $\mathcal{T}_{\alpha,v^\delta}$. Below, in Sect. 4, two different iterative methods are considered: an Iterative Thresholding Algorithm for regularization with a priori parameter choice and $1 \leq q \leq 2$ (Daubechies et al. 2004), and a log-barrier method for Tikhonov regularization with an a posteriori parameter choice by the discrepancy principle in the case $q = 1$ (Candès and Romberg 2005). Iterative thresholding algorithms have also been studied for nonconvex situations, but there the convergence to global minima has not yet been proven (Bredies and Lorenz 2014).
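The objects introduced in this section translate directly into a finite-dimensional sketch. Below, a vector of coefficients stands for the expansion of $u$ in the orthonormal basis, so the basis itself never appears; `support`, `R_q`, `soft_threshold`, and `ista` are hypothetical helper names. The iteration is a plain proximal-gradient (iterative soft-thresholding) scheme for the case $q = 1$ with a fixed step size, written in the spirit of iterative thresholding but not a transcription of the algorithm in Sect. 4.

```python
import numpy as np


def support(coeffs, tol=0.0):
    """Indices of the nonvanishing coefficients."""
    return np.flatnonzero(np.abs(coeffs) > tol)


def R_q(coeffs, weights, q):
    """Weighted l^q functional: sum of w_i * |c_i|^q."""
    return float(np.sum(weights * np.abs(coeffs) ** q))


def soft_threshold(x, t):
    """Componentwise shrinkage towards zero by the (possibly vector-valued) amount t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def ista(A, v_delta, alpha, weights, n_iter=500):
    """Proximal gradient iteration for ||A u - v_delta||^2 + alpha * sum_i w_i |u_i|."""
    # Step size 1/L, where L = 2 ||A||^2 is the Lipschitz constant of the
    # gradient of the data term.
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    u = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ u - v_delta)
        u = soft_threshold(u - step * grad, step * alpha * weights)
    return u


# Toy check with A = identity, where the minimizer is known in closed form:
# each coefficient of the data is shrunk by alpha * w_i / 2.
data = np.array([3.0, 0.2, -1.0])
weights = np.ones(3)
u_hat = ista(np.eye(3), data, alpha=1.0, weights=weights)
```

In the toy check the small coefficient 0.2 is set exactly to zero, so the minimizer is 2-sparse; this exact vanishing of small coefficients is the mechanism behind the sparsity statement for $q = 1$.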
3.1
Convex Regularization
Now the theoretical properties of $\ell^q$ type regularization methods with $q \geq 1$ are studied, in particular the questions of existence, stability, convergence, and convergence rates. In order to be able to take advantage of the equivalence result Theorem 1, it is assumed in the following that the operator $A: U \to V$ has dense range. The question of existence is easily answered (Grasmair et al. 2008, 2011b):

Proposition 2 (Existence). For every $\alpha > 0$ and $v^\delta \in V$ the functional $\mathcal{T}_{\alpha,v^\delta}$ has a minimizer in $U$. Similarly, the problem of minimizing $\mathcal{R}_q(u)$ subject to the constraint $\|Au - v^\delta\| \leq \tau\delta$ admits a solution in $U$.

Though the previous proposition states the existence of minimizers for all $q \geq 1$, there is a difference between the cases $q = 1$ and $q > 1$. In the latter case, the regularization functional $\mathcal{T}_{\alpha,v^\delta}$ is strictly convex, which implies that the minimizer must be unique. For $q = 1$, the regularization functional is still convex, but strict convexity holds only if the operator $A$ is injective. Thus, it can happen that one does not obtain a single approximate solution, but a whole (convex and closed) set of minimizers. Because of this possible nonuniqueness, the stability and convergence results have to be formulated in terms of subsequential convergence. Also, one has to differentiate between a priori and a posteriori parameter selection methods. In the latter case, the stability and convergence results can be formulated solely in terms of the noise level $\delta$. In the case of an a priori parameter choice, it is in addition necessary to take into account the actual choice of $\alpha$ in dependence of $\delta$. For the following results see Lorenz (2008) and Grasmair et al. (2011b).

Proposition 3 (Stability). Let $\delta > 0$ be fixed and let $v_k \to v^\delta$. Consider one of the following settings:

Residual method: Let $u_k \in U$ be solutions of the residual method with data $v_k$ and noise level $\delta$.
Discrepancy principle: Let $u_k \in U$ be solutions of Tikhonov regularization with data $v_k$ and an a posteriori parameter choice according to the discrepancy principle for noise level $\delta$.
A priori parameter choice: Let $\alpha > 0$ be fixed, and let $u_k \in U$ be solutions of Tikhonov regularization with data $v_k$ and regularization parameter $\alpha$.

Then the sequence $(u_k)_{k\in\mathbb{N}}$ has a subsequence converging to a regularized solution $u^\delta$ obtained with data $v^\delta$ and the same regularization method. If $u^\delta$ is unique, then the whole sequence $(u_k)_{k\in\mathbb{N}}$ converges to $u^\delta$.

Proposition 4 (Convergence). Let $\delta_k \to 0$ and let $v_k \in V$ satisfy $\|v_k - v\| \leq \delta_k$. Assume that there exists $u \in U$ with $Au = v$ and $\mathcal{R}_q(u) < +\infty$. Consider one of the following settings:

Residual method: Let $u_k \in U$ be solutions of the residual method with data $v_k$ and noise level $\delta_k$.
Discrepancy principle: Let $u_k \in U$ be solutions of Tikhonov regularization with data $v_k$ and an a posteriori parameter choice according to the discrepancy principle with noise level $\delta_k$.
A priori parameter choice: Let $\alpha_k > 0$ satisfy $\alpha_k \to 0$ and $\delta_k^2/\alpha_k \to 0$, and let $u_k \in U$ be solutions of Tikhonov regularization with data $v_k$ and regularization parameter $\alpha_k$.

Then the sequence $(u_k)_{k\in\mathbb{N}}$ has a subsequence converging to an $\mathcal{R}_q$-minimizing solution $u^\dagger$ of the equation $Au = v$. If $u^\dagger$ is unique, then the whole sequence $(u_k)_{k\in\mathbb{N}}$ converges to $u^\dagger$.

Note that the previous result in particular implies that an $\mathcal{R}_q$-minimizing solution $u^\dagger$ of $Au = v$ indeed exists. Also, the uniqueness of $u^\dagger$ is trivial in the case $q > 1$, as then the functional $\mathcal{R}_q$ is strictly convex. Thus, one obtains in this situation indeed convergence of the whole sequence $(u_k)_{k\in\mathbb{N}}$. Though it is known now that approximate solutions converge to true solutions of the considered equation as the noise level decreases to zero, no estimate for the speed of the convergence is obtained. Indeed, in general situations the convergence can be arbitrarily slow. If, however, the $\mathcal{R}_q$-minimizing solution $u^\dagger$ satisfies a so-called source condition, then one can obtain sufficiently good convergence rates in the strictly convex case $q > 1$.
If, in addition, the solution $u^\dagger$ is sparse and the operator $A$ is invertible on the support of $u^\dagger$, then the convergence rates improve further. Before stating the convergence rates results, the authors recall the definition of the source condition and its relation to the well-known Karush-Kuhn-Tucker condition used in convex optimization.

Definition 1. The $\mathcal{R}_q$-minimizing solution $u^\dagger$ of the equation $Au = v$ satisfies the source condition if there exists $\omega \in V$ such that $A^*\omega \in \partial\mathcal{R}_q(u^\dagger)$. Here $\partial\mathcal{R}_q(u^\dagger)$ denotes the subdifferential of the function $\mathcal{R}_q$ at $u^\dagger$ and $A^*: V \to U$ is the adjoint of $A$.
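In other words, if $q > 1$ one has

$$\langle\omega, A\phi_\lambda\rangle = q\,w_\lambda\,\operatorname{sign}\bigl(\langle u^\dagger, \phi_\lambda\rangle\bigr)\,\bigl|\langle u^\dagger, \phi_\lambda\rangle\bigr|^{q-1}, \qquad \lambda \in \Lambda,$$

and if $q = 1$ one has

$$\langle\omega, A\phi_\lambda\rangle = w_\lambda\,\operatorname{sign}\bigl(\langle u^\dagger, \phi_\lambda\rangle\bigr) \quad \text{if } \lambda \in \operatorname{supp}(u^\dagger), \qquad \langle\omega, A\phi_\lambda\rangle \in [-w_\lambda, +w_\lambda] \quad \text{if } \lambda \notin \operatorname{supp}(u^\dagger).$$

The conditions $A^*\omega \in \partial\mathcal{R}_q(u^\dagger)$ for some $\omega \in V$ and $Au^\dagger = v$ are nothing more than the Karush-Kuhn-Tucker conditions for the constrained minimization problem

$$\mathcal{R}_q(u) \to \min \quad \text{subject to} \quad Au = v.$$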
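For $q = 1$ the source condition is a finite list of equalities and inequalities that can be checked mechanically once a candidate source element is given. The following sketch does this in coefficient space, writing `omega` for the source element and including the weights $w_\lambda$ from the definition of the $\ell^q$ functional; the function name, the tolerance, and the $2 \times 3$ example matrix are illustrative assumptions.

```python
import numpy as np


def satisfies_source_condition_l1(A, omega, u_dagger, weights, tol=1e-10):
    """Check A^T omega in the subdifferential of R_1 at u_dagger (case q = 1)."""
    g = A.T @ omega                      # entries <omega, A phi_lambda>
    on = u_dagger != 0                   # support of u_dagger
    # On the support: equality with w * sign(coefficient).
    ok_on = np.allclose(g[on], weights[on] * np.sign(u_dagger[on]), atol=tol)
    # Off the support: the entry must lie in [-w, +w].
    ok_off = np.all(np.abs(g[~on]) <= weights[~on] + tol)
    return bool(ok_on and ok_off)


# Underdetermined toy operator (2 equations, 3 unknowns).
A = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5]])
weights = np.ones(3)
u_dagger = np.array([0.0, 0.0, 1.0])

# omega = (1, 1) gives A^T omega = (1, 1, 1): condition holds.
holds = satisfies_source_condition_l1(A, np.array([1.0, 1.0]), u_dagger, weights)

# omega = (1, -1) gives A^T omega = (1, -1, 0): equality on the support fails.
fails = satisfies_source_condition_l1(A, np.array([1.0, -1.0]), u_dagger, weights)
```

Finding a suitable `omega` is itself a feasibility problem; the check above only verifies a given candidate.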
In particular, it follows that $\tilde{u} \in U$ is an $\mathcal{R}_q$-minimizing solution of the equation $Au = v$ whenever $\tilde{u}$ satisfies the equation $A\tilde{u} = v$ and one has $\operatorname{ran} A^* \cap \partial\mathcal{R}_q(\tilde{u}) \neq \emptyset$ (Ekeland and Temam 1974, Proposition 4.1). The following convergence rates result can be found in Lorenz (2008) and Grasmair et al. (2011b). It is based on results concerning convergence rates with respect to the Bregman distance (see Burger and Osher 2004) and the fact that, for $\ell^q$-regularization, the norm can be bounded from above, locally, by the Bregman distance.

Proposition 5. Let $1 < q < 2$ and assume that $u^\dagger$ satisfies the source condition. Denote, for $v^\delta \in V$ satisfying $\|v - v^\delta\| \leq \delta$, by $u^\delta := u(v^\delta)$ the solution with data $v^\delta$ of either the residual method, or Tikhonov regularization with Morozov’s discrepancy principle, or Tikhonov regularization with an a priori parameter choice $\alpha = C\delta$ for some fixed $C > 0$. Then

$$\|u^\delta - u^\dagger\| = O\bigl(\sqrt{\delta}\bigr).$$

In the case of an a priori parameter choice, one additionally has that $\|Au^\delta - v\| = O(\delta)$.

The convergence rates provide (asymptotic) estimates of the accuracy of the approximate solution in dependence of the noise level $\delta$. Therefore, the optimization of the order of convergence is an important question in the field of inverse problems. In the case of Tikhonov regularization with a priori parameter choice, the rates can indeed be improved if the stronger source condition $A^*A\eta \in \partial\mathcal{R}_q(u^\dagger)$ for some $\eta \in U$ holds. Then, one obtains with a parameter choice $\alpha = C\delta^{2/3}$ a rate of order $O(\delta^{2/3})$ (see Groetsch 1984; Resmerita 2005). For quadratic Tikhonov regularization, it has been shown that this rate is the best possible one. That is, except in the trivial case $u^\dagger = 0$, there exists no parameter selection method, neither a priori nor a posteriori, that can yield a better rate than $O(\delta^{2/3})$ (see Neubauer 1997). This saturation result poses a restriction on the quality of reconstructions obtainable with quadratic regularization.

In the nonquadratic case $q < 2$, the situation looks different. If the solution $u^\dagger$ is sparse, then the convergence rates results can be improved beyond the quadratic bound of $O(\delta^{2/3})$. Moreover, they also can be extended to the case $q = 1$. For the improvement of the convergence rates, an additional injectivity condition is needed, which requires the operator $A$ to be injective on the (finite dimensional) subspace of $U$ spanned by the basis elements $\phi_\lambda$, $\lambda \in \operatorname{supp}(u^\dagger)$. This last condition is trivially satisfied if the operator $A$ itself is injective. There exist, however, also interesting situations where the linear equation $Au = v$ is vastly underdetermined, but the restriction of $A$ to all sufficiently low-dimensional subspaces spanned by the basis elements $\phi_\lambda$ is injective. These cases have recently been well studied in the context of compressed sensing (Donoho and Elad 2003; Candès et al. 2006). The first improved convergence rates have been derived in Grasmair et al. (2008, 2011b).

Proposition 6. Let $1 \leq q \leq 2$ and assume that $u^\dagger$ satisfies the source condition. In addition, assume that $u^\dagger$ is sparse and that the restriction of the operator $A$ to $\operatorname{span}\{\phi_\lambda : \lambda \in \operatorname{supp}(u^\dagger)\}$ is injective. Then, with the notation of Proposition 5, one has

$$\|u^\delta - u^\dagger\| = O\bigl(\delta^{1/q}\bigr).$$

The most interesting situation is the case $q = 1$. Here, one obtains a linear convergence of the regularized solutions to $u^\dagger$. That is, the approximate inversion of $A$ is not only continuous but in fact Lipschitz continuous; the error in the reconstruction is of the same order as the data error.
In addition, the source condition $A^*\omega \in \partial\mathcal{R}_q(u^\dagger)$ in some sense becomes weakest for $q = 1$, because then the subdifferential is set-valued and therefore larger than in the strictly convex case. Moreover, the source condition for $q > 1$ requires that the support of $A^*\omega$ equals the support of $u^\dagger$, which strongly limits the applicability of the convergence rates result. While Proposition 6 concerning convergence rates in the presence of a sparsity assumption and restricted injectivity holds for all $1 \leq q \leq 2$, the rates result without these assumptions, Proposition 5, requires that the parameter $q$ is strictly greater than 1. The following converse result shows that, at least for Tikhonov regularization with an a priori parameter choice, a similar relaxation of the assumptions by dropping the requirement of restricted injectivity is not possible for $q = 1$; the assumptions of sparsity and injectivity of $A$ on $\operatorname{supp}(u^\dagger)$ are not only sufficient but also necessary for obtaining any sensible convergence rates (see Grasmair et al. 2011a).
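The linear rate for $q = 1$ can be observed in a deliberately simple experiment. The sketch below assumes $A$ to be the identity (so injectivity and the source condition hold trivially and the problem is not actually ill-posed), unit weights, and the a priori choice $\alpha = C\delta$ with the arbitrary constant $C = 4$; in this case the Tikhonov minimizer is given in closed form by soft thresholding at level $\alpha/2$. It illustrates the statement of the rate only and is not a substitute for the general result.

```python
import numpy as np


def soft_threshold(x, t):
    """Componentwise shrinkage towards zero by the amount t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


# Sparse exact solution with three nonvanishing coefficients.
u_dagger = np.zeros(100)
u_dagger[[3, 40, 77]] = [2.0, -1.0, 0.5]

# Fixed perturbation direction of unit norm.
rng = np.random.default_rng(1)
direction = rng.standard_normal(100)
direction /= np.linalg.norm(direction)

C = 4.0
ratios = []
supports_match = []
for delta in [1e-1, 1e-2, 1e-3, 1e-4]:
    v_delta = u_dagger + delta * direction          # A = identity, ||v - v_delta|| = delta
    alpha = C * delta                               # a priori parameter choice
    u_delta = soft_threshold(v_delta, alpha / 2.0)  # minimizer of ||u - v_delta||^2 + alpha ||u||_1
    ratios.append(np.linalg.norm(u_delta - u_dagger) / delta)
    supports_match.append(np.array_equal(u_delta != 0, u_dagger != 0))
```

The ratio of reconstruction error to noise level stays bounded as $\delta$ decreases, and the support of the exact solution is recovered for every noise level in the loop.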
Proposition 7. Let $q = 1$ and assume that $u^\dagger$ is the unique $\mathcal{R}_1$-minimizing solution of the equation $Au = v$. Denote, for $v^\delta \in V$ satisfying $\|v^\delta - v\| \leq \delta$, by $u^\delta := u(v^\delta)$ the solution with data $v^\delta$ of Tikhonov regularization with an a priori parameter choice $\alpha = C\delta$ for some fixed $C > 0$. If the obtained data error satisfies

$$\|Au^\delta - v\| = O(\delta),$$

then $u^\dagger$ is sparse and the source condition holds. In particular, also

$$\|u^\delta - u^\dagger\| = O(\delta).$$
3.2 Nonconvex Regularization
In the following, the properties of $\ell^q$ regularization with a sublinear regularization term, that is, $0 < q < 1$, are studied. In this situation, the regularization functional is nonconvex, leading to both theoretical and numerical challenges. Still, nonconvex regularization terms have been considered for applications, because they yield solutions with even more pronounced sparsity patterns than $\ell^1$ regularization. From the theoretical point of view, the lack of convexity prohibits the application of Theorem 1, which states that the residual method is equivalent to Tikhonov regularization with Morozov's discrepancy principle. Indeed, it seems that an extension of this result to nonconvex regularization functionals has not been treated in the literature so far. Moreover, although corresponding results have recently been formulated for the residual method, the question of whether the discrepancy principle yields stable reconstructions has not yet been answered. For these reasons, the discussion of nonconvex regularization methods is limited to the two cases of the residual method and Tikhonov regularization with an a priori parameter choice. Both methods allow the derivation of the same, or at least similar, results as for convex regularization, the main difference being the possible nonuniqueness of the $R_q$-minimizing solutions of the equation $Au = v$ (see Grasmair 2009; Grasmair et al. 2011b; Zarzer 2009).
Proposition 8. Consider either the residual method or Tikhonov regularization with an a priori parameter choice. Then Propositions 2–4 concerning existence, stability, and convergence remain true for $0 < q < 1$.
The convergence rates result in the presence of sparsity, Proposition 6, can also be generalized to nonconvex regularization. The interesting point is that the source condition needed in the convex case apparently is no longer required.
Instead, the other conditions of Proposition 6, namely uniqueness and sparsity of $u^\dagger$ and restricted injectivity of $A$, are already sufficient for obtaining linear convergence (see Grasmair 2010; Grasmair et al. 2011b).
M. Grasmair et al.
Proposition 9. Let $0 < q < 1$ and assume that $u^\dagger$ is the unique $R_q$-minimizing solution of the equation $Au = v$. Assume moreover that $u^\dagger$ is sparse and that the restriction of the operator $A$ to $\operatorname{span}\{\phi_\lambda : \lambda \in \operatorname{supp}(u^\dagger)\}$ is injective. Denote, for $v^\delta \in V$ satisfying $\|v^\delta - v\| \le \delta$, by $u^\delta := u(v^\delta)$ the solution with data $v^\delta$ of either the residual method or Tikhonov regularization with an a priori parameter choice $\alpha = C\delta$ for some fixed $C > 0$. Then
$$\|u^\delta - u^\dagger\| = O(\delta).$$
In the case of Tikhonov regularization with an a priori parameter choice, one additionally has that
$$\|Au^\delta - v\| = O(\delta).$$
4 Numerical Minimization
4.1 Iterative Thresholding Algorithms
In Daubechies et al. (2004), an iterative algorithm has been analyzed that can be used for minimizing the Tikhonov functional $T_{\alpha,v^\delta}$ for fixed $\alpha > 0$, that is, for an a priori parameter choice. To that end, the authors define for $b > 0$ and $1 \le q \le 2$ the function $F_{b,q} : \mathbb{R} \to \mathbb{R}$,
$$F_{b,q}(t) := t + \frac{bq}{2}\,\operatorname{sign}(t)\,|t|^{q-1}.$$
If $q > 1$, the function $F_{b,q}$ is a one-to-one mapping from $\mathbb{R}$ to $\mathbb{R}$. Thus, it has an inverse $S_{b,q} := (F_{b,q})^{-1} : \mathbb{R} \to \mathbb{R}$. In the case $q = 1$ one defines
$$S_{b,1}(t) := \begin{cases} t - b/2 & \text{if } t \ge b/2, \\ 0 & \text{if } |t| < b/2, \\ t + b/2 & \text{if } t \le -b/2. \end{cases} \tag{5}$$
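Using the functions $S_{b,q}$, for $b = (b_\lambda)_{\lambda\in\Lambda} \in \mathbb{R}^\Lambda_{>0}$ and $1 \le q \le 2$, the shrinkage operator $S_{b,q} : U \to U$ is defined as
$$S_{b,q}(u) := \sum_{\lambda\in\Lambda} S_{b_\lambda,q}\bigl(\langle u, \phi_\lambda\rangle\bigr)\,\phi_\lambda. \tag{6}$$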
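As an illustration of (5) and (6), the following minimal Python sketch evaluates the scalar shrinkage functions. It assumes that the coefficients $\langle u, \phi_\lambda\rangle$ are already available as a finite list (an orthonormal basis and a finite index set). The names `shrink_scalar`, `F`, `shrink_q`, and `shrink` are hypothetical, and computing the inverse for $q > 1$ by bisection is only one possible choice.

```python
def shrink_scalar(t, b):
    """Soft thresholding S_{b,1}(t) from (5); the threshold is b/2."""
    if t >= b / 2:
        return t - b / 2
    if t <= -b / 2:
        return t + b / 2
    return 0.0


def F(t, b, q):
    """F_{b,q}(t) = t + (b*q/2) * sign(t) * |t|**(q-1)."""
    sign = (t > 0) - (t < 0)
    return t + 0.5 * b * q * sign * abs(t) ** (q - 1)


def shrink_q(x, b, q, iters=80):
    """S_{b,q}(x): inverse of F_{b,q} for 1 < q <= 2, found by bisection."""
    if q == 1:
        return shrink_scalar(x, b)
    lo, hi = 0.0, abs(x)          # F is odd and F(t) >= t for t >= 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if F(mid, b, q) < abs(x):
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return t if x >= 0 else -t


def shrink(coeffs, b, q=1):
    """Componentwise shrinkage (6), applied to a list of coefficients."""
    return [shrink_q(c, bl, q) for c, bl in zip(coeffs, b)]


print(shrink([1.0, 0.3, -2.0], [1.0, 1.0, 1.0]))   # → [0.5, 0.0, -1.5]
```

For $q = 2$ the function $F_{b,2}(t) = (1 + b)t$ is linear, so `shrink_q(3.0, 0.5, 2.0)` should return approximately `2.0`; this gives a quick check of the bisection.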
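Proposition 10. Let $v^\delta \in V$, $\alpha > 0$, and $1 \le q \le 2$, and denote $w := (w_\lambda)_{\lambda\in\Lambda}$. Let $\mu > 0$ be such that $\mu\|A^*A\| < 1$. Choose any $u_0 \in U$ and define inductively
$$u_{n+1} := S_{\mu\alpha w,q}\bigl(u_n + \mu A^*(v^\delta - Au_n)\bigr). \tag{7}$$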
Then the iterates $u_n$, defined by the thresholding iteration (7), converge to a minimizer of the functional $T_{\alpha,v^\delta}$ as $n \to \infty$. The method defined by the iteration (7) can be seen as a forward-backward splitting algorithm for the minimization of $T_{\alpha,v^\delta}$, the inner update $u \mapsto u + \mu A^*(v^\delta - Au)$ being a gradient descent step for the functional $\|Au - v^\delta\|^2$ and the shrinkage operator an implicit gradient descent step for $\alpha R_q$. More details on the application of forward-backward splitting methods to similar problems can, for instance, be found in Combettes and Wajs (2005).
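The thresholding iteration (7) can be sketched in a few lines for a small finite-dimensional problem. The sketch below is not the authors' implementation: it assumes $q = 1$, uniform weights, the canonical basis of $\mathbb{R}^n$ as $(\phi_\lambda)$, and a dense matrix stored as a list of rows. The name `mu` for the step parameter and all function names are hypothetical.

```python
def matvec(A, x):
    """Return A x for a dense matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Return A^T y (the adjoint applied to y)."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def soft(t, thr):
    """Soft thresholding with threshold thr, i.e. S_{b,1} with b = 2*thr."""
    if t > thr:
        return t - thr
    if t < -thr:
        return t + thr
    return 0.0


def ista(A, v, alpha, mu, n_iter=200):
    """Thresholding iteration (7) for ||Au - v||^2 + alpha * sum_i |u_i|.

    Requires mu * ||A^T A|| < 1 (not checked here).
    """
    u = [0.0] * len(A[0])
    for _ in range(n_iter):
        residual = [vi - ri for vi, ri in zip(v, matvec(A, u))]
        step = rmatvec(A, residual)
        # S_{mu*alpha,1} thresholds at mu*alpha/2
        u = [soft(ui + mu * gi, 0.5 * mu * alpha) for ui, gi in zip(u, step)]
    return u


A = [[1.0, 0.0], [0.0, 1.0]]
print(ista(A, [2.0, 0.1], alpha=1.0, mu=0.5))
```

For the identity matrix the minimizer is known in closed form (soft thresholding of the data at $\alpha/2$), so the iterates should approach `[1.5, 0.0]`: the large coefficient is shrunk by $0.5$ and the small one is set to zero.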
4.2 Second Order Cone Programs
In the case of an a posteriori parameter choice (or the equivalent residual method), the iterative thresholding algorithm (7) cannot be applied directly, as the regularization parameter $\alpha > 0$ is not known in advance. One can show, however, that the required parameter $\alpha$ depends continuously on $\delta$ (see Bonesky 2009). Thus, it is possible to find the correct parameter iteratively, starting with some initial guess $\alpha > 0$ and computing some $\hat u \in \arg\min_u T_{\alpha,v^\delta}(u)$. Depending on the size of the residual $\|A\hat u - v^\delta\|$, one subsequently either increases or decreases $\alpha$ and computes the minimizer of $T_{\alpha,v^\delta}$ using the new regularization parameter. This procedure of updating $\alpha$ and minimizing $T_{\alpha,v^\delta}$ is stopped as soon as the residual satisfies $\|A\hat u - v^\delta\| \approx \delta$.
In the important case $q = 1$, a different solution algorithm has been established, which takes advantage of the fact that the constrained minimization problem
$$R_1(u) \to \min \quad \text{subject to} \quad \|Au - v^\delta\|^2 \le \delta^2$$
can be rewritten as a second-order cone program (SOCP) (Candès and Romberg 2005). To that end the authors introduce an additional variable $a = (a_\lambda)_{\lambda\in\Lambda} \in \ell^2(\Lambda)$ and minimize $\sum_{\lambda\in\Lambda} w_\lambda a_\lambda$ subject to the constraints $a_\lambda \ge |\langle u, \phi_\lambda\rangle|$ for all $\lambda \in \Lambda$ and $\|Au - v^\delta\|^2 \le \delta^2$. Since the former bound consists of the two linear constraints $a_\lambda \ge \pm\langle u, \phi_\lambda\rangle$, the authors arrive at the SOCP
$$S(u,a) := \sum_{\lambda\in\Lambda} w_\lambda a_\lambda \to \min \quad \text{subject to} \quad \begin{cases} a_\lambda + \langle u,\phi_\lambda\rangle \ge 0, \\ a_\lambda - \langle u,\phi_\lambda\rangle \ge 0, \\ \delta^2 - \|Au - v^\delta\|^2 \ge 0. \end{cases} \tag{8}$$
If the pair $(u, a)$ solves (8), then $u$ is a solution of the residual method.
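The iterative parameter search described at the beginning of this section (increase or decrease the regularization parameter until the residual is close to the noise level) can be sketched as a bisection on a logarithmic scale. This is only a sketch under stated assumptions: the residual is assumed to be a continuous, nondecreasing function of the parameter, the target value is assumed to be bracketed by the initial bounds, and the inner minimizer is passed in as a hypothetical callable `solve`. The toy problem uses the identity operator, for which the Tikhonov minimizer is explicit.

```python
import math


def soft(t, thr):
    """Soft thresholding with threshold thr."""
    if t > thr:
        return t - thr
    if t < -thr:
        return t + thr
    return 0.0


def discrepancy_search(solve, residual_norm, delta,
                       alpha_lo=1e-8, alpha_hi=1e4, rel_tol=1e-6, max_iter=200):
    """Bisection on a logarithmic scale for alpha with residual close to delta.

    Assumes that alpha -> residual_norm(solve(alpha)) is continuous and
    nondecreasing and that the target is bracketed by alpha_lo and alpha_hi.
    """
    alpha = math.sqrt(alpha_lo * alpha_hi)
    for _ in range(max_iter):
        alpha = math.sqrt(alpha_lo * alpha_hi)
        r = residual_norm(solve(alpha))
        if abs(r - delta) <= rel_tol * delta:
            break
        if r > delta:
            alpha_hi = alpha   # too much regularization
        else:
            alpha_lo = alpha   # too little regularization
    return alpha


# Toy problem with A = identity, where the minimizer is soft thresholding.
v_delta = [3.0, 0.2]
solve = lambda alpha: [soft(x, alpha / 2) for x in v_delta]
residual_norm = lambda u: math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v_delta)))

alpha = discrepancy_search(solve, residual_norm, delta=0.5)
print(alpha, residual_norm(solve(alpha)))
```

In this toy case the residual equals $\sqrt{\alpha^2/4 + 0.04}$ once the small coefficient has been thresholded, so the search should stop near $\alpha = 2\sqrt{0.21} \approx 0.917$.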
The solutions of the program (8) can be computed using a log-barrier method, defining for $\eta > 0$ the functional
$$S_\eta(u,a) := \eta\sum_{\lambda\in\Lambda} w_\lambda a_\lambda - \sum_{\lambda\in\Lambda}\log\bigl(a_\lambda + \langle u,\phi_\lambda\rangle\bigr) - \sum_{\lambda\in\Lambda}\log\bigl(a_\lambda - \langle u,\phi_\lambda\rangle\bigr) - \log\bigl(\delta^2 - \|Au - v^\delta\|^2\bigr).$$
As $\eta \to \infty$, the minimizers of $S_\eta(u,a)$ converge to a solution of (8). Moreover, one can show that the solution $(u^\delta, a^\delta)$ of (8) and the minimizer $(u^\delta_\eta, a^\delta_\eta)$ of $S_\eta$ satisfy the relation
$$S(u^\delta_\eta, a^\delta_\eta) < S(u^\delta, a^\delta) + (|\Lambda| + 1)/\eta, \tag{9}$$
that is, the value of the minimizer of the relaxed problem $S_\eta$ lies within $(|\Lambda| + 1)/\eta$ of the optimal value of the original minimization problem (Renegar 2001). In order to solve (8), one alternatingly minimizes $S_\eta$ and increases the parameter $\eta$. That is, one chooses some parameter $\nu > 1$ defining the increase of $\eta$ and starts with $k = 1$ and some initial parameter $\eta^{(1)} > 0$. Then one iteratively computes $(u_k, a_k) \in \arg\min S_{\eta^{(k)}}$, sets $\eta^{(k+1)} := \nu\eta^{(k)}$, and increases $k$ until the value $(|\Lambda| + 1)/\eta^{(k)}$ is smaller than some predefined tolerance. According to (9), this implies that the value $S(u_k, a_k)$ is also within the same tolerance of the actual minimum. For the minimization of $S_{\eta^{(k)}}$, which has to take place in each iteration step, one can use a Newton method combined with a line search that ensures that one does not leave the domain of $S_{\eta^{(k)}}$ and that the value of $S_{\eta^{(k)}}$ actually decreases. More details on the minimization algorithm can be found in Candès and Romberg (2005).
5 Application: Synthetic Focusing in Ground Penetrating Radar
In this section, sparsity regularization is applied to data obtained with Ground Penetrating Radar (GPR) mounted on a flying helicopter (see Fig. 1). As stated in Sect. 1, the imaging problem will be written as the inversion of the circular Radon transform.
5.1 Mathematical Model
For simplicity of presentation, polarization effects of the electromagnetic field are ignored and a small isotropic antenna is assumed. In this case, each component of the electromagnetic field $E(x^{\mathrm{ant}}; x, t)$ induced by an antenna that is located at $x^{\mathrm{ant}} \in \mathbb{R}^3$ is described by the scalar wave equation
$$\Bigl(\frac{1}{c(x)^2}\,\partial_t^2 - \Delta\Bigr) E(x^{\mathrm{ant}}; x, t) = \delta_{3D}(x - x^{\mathrm{ant}})\,w_b(t), \qquad (x,t) \in \mathbb{R}^3 \times \mathbb{R}. \tag{10}$$
Here $\delta_{3D}$ denotes the three-dimensional delta distribution, $w_b$ represents the temporal shape of the emitted radar signal (impulse response function of the antenna) with bandwidth $b$, and $c(x)$ denotes the wave speed. GPR systems are designed to generate ultrawideband radar signals, where the bandwidth $b$ is approximately equal to the central frequency, and the pulse duration is given by $\tau = 1/b$. Usually, $w_b$ is well approximated by the second derivative of a small Gaussian (Ricker wavelet), see Daniels (2004). Figure 2 shows a typical radar signal emitted by a radar antenna at 500 MHz and its Fourier transform.
Fig. 2 Ricker wavelet (second derivative of a small Gaussian) with a central frequency of $b = 500$ MHz in the time domain (left: $w_b(t)$, time axis in ns) and in the frequency domain (right: $\hat w_b(2\pi\nu)$, frequency axis in MHz)
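A signal of the kind shown in Fig. 2 can be generated with the commonly used closed form of the Ricker wavelet, $(1 - 2\pi^2 f_c^2 t^2)\exp(-\pi^2 f_c^2 t^2)$, which is the second derivative of a Gaussian up to sign and normalization. The following sketch assumes this normalization (peak value 1 at $t = 0$) and identifies the central frequency with the peak frequency $f_c$ of the spectrum; neither convention is specified in the text above, and the function name is hypothetical.

```python
import math


def ricker(t, f_c):
    """Ricker wavelet with peak frequency f_c (Hz) at time t (s), peak value 1."""
    a = (math.pi * f_c * t) ** 2
    return (1.0 - 2.0 * a) * math.exp(-a)


f_c = 500e6                                       # central frequency of Fig. 2
pulse_duration = 1.0 / f_c                        # tau = 1/b with b taken equal to f_c
t_zero = 1.0 / (math.sqrt(2.0) * math.pi * f_c)   # first zero crossing

samples = [ricker(k * 0.25e-9, f_c) for k in range(-8, 9)]   # -2 ns ... 2 ns
print(pulse_duration, t_zero, max(samples))
```

With $f_c = 500$ MHz the pulse duration $1/b$ is 2 ns, which matches the time window of roughly $\pm 2$ ns in Fig. 2.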
Born Approximation
Scattering of the radar signals occurs at discontinuities of the function $c$. In the sequel, it is assumed that
$$\frac{1}{c(x)^2} = \frac{1}{c_0^2}\bigl(1 + u^{3D}(x)\bigr),$$
where $c_0$ is assumed to be constant (the speed of light) and $u^{3D}$ is a possibly nonsmooth function. Moreover, the following decomposition is made:
$$E(x^{\mathrm{ant}}; x, t) = E_0(x^{\mathrm{ant}}; x, t) + E_{\mathrm{scat}}(x^{\mathrm{ant}}; x, t), \qquad (x,t) \in \mathbb{R}^3 \times \mathbb{R},$$
where $E_0$ denotes the incident field (the solution of the wave equation (10) with $c$ replaced by $c_0$), and $E_{\mathrm{scat}}$ is the scattered field.
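From (10) it follows that the scattered field satisfies
$$\Bigl(\frac{1}{c_0^2}\,\partial_t^2 - \Delta\Bigr) E_{\mathrm{scat}}(x^{\mathrm{ant}}; x, t) = -\frac{u^{3D}(x)}{c_0^2}\,\frac{\partial^2 E(x^{\mathrm{ant}}; x, t)}{\partial t^2}.$$
The Born approximation consists in replacing the total field $E$ in the above equation by the incident field $E_0$. This results in the approximation $E_{\mathrm{scat}} \simeq E_{\mathrm{Born}}$, where $E_{\mathrm{Born}}$ solves the equation
$$\Bigl(\frac{1}{c_0^2}\,\partial_t^2 - \Delta\Bigr) E_{\mathrm{Born}}(x^{\mathrm{ant}}; x, t) = -\frac{u^{3D}(x)}{c_0^2}\,\frac{\partial^2 E_0(x^{\mathrm{ant}}; x, t)}{\partial t^2}, \qquad (x,t) \in \mathbb{R}^3 \times \mathbb{R}. \tag{11}$$
Together with the initial condition $E_{\mathrm{scat}}(x^{\mathrm{ant}}; x, t) = 0$ for $t < t_0$, Eq. 11 can be solved explicitly via Kirchhoff's formula, see Courant and Hilbert (1962, p. 692),
$$E_{\mathrm{Born}}(x^{\mathrm{ant}}; x, t) = -\frac{1}{4\pi c_0^2}\int_{\mathbb{R}^3} \frac{\partial_t^2 E_0\bigl(x^{\mathrm{ant}}; y, t - |x - y|/c_0\bigr)}{|x - y|}\,u^{3D}(y)\,dy.$$
The identity
$$E_0(x^{\mathrm{ant}}; y, t) = \frac{w_b\bigl(t - |y - x^{\mathrm{ant}}|/c_0\bigr)}{4\pi|y - x^{\mathrm{ant}}|} = \frac{(w_b *_t \delta_{1D})\bigl(t - |y - x^{\mathrm{ant}}|/c_0\bigr)}{4\pi|y - x^{\mathrm{ant}}|},$$
with $\delta_{1D}$ denoting the one-dimensional delta distribution, leads to
$$E_{\mathrm{Born}}(x^{\mathrm{ant}}; x, t) = -\frac{w_b''(t)}{16\pi^2 c_0^2} *_t \int_{\mathbb{R}^3} u^{3D}(y)\,\frac{\delta_{1D}\bigl(t - |y - x^{\mathrm{ant}}|/c_0 - |x - y|/c_0\bigr)}{|x - y|\,|y - x^{\mathrm{ant}}|}\,dy. \tag{12}$$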
In GPR, the data are measured in zero offset mode, which means that the scattered field is only recorded at location $x = x^{\mathrm{ant}}$. In this situation, Eq. 12 simplifies to
$$E_{\mathrm{Born}}(x^{\mathrm{ant}}; x^{\mathrm{ant}}, t) = -\frac{w_b''(t)}{32\pi^2 c_0} *_t \int_{\mathbb{R}^3} u^{3D}(y)\,\frac{\delta_{1D}\bigl(c_0 t/2 - |y - x^{\mathrm{ant}}|\bigr)}{|y - x^{\mathrm{ant}}|^2}\,dy,$$
where the formula $\int \varphi(x)\,\delta_{1D}(ax)\,dx = \varphi(0)/|a|$ has been used. By partitioning the above integral over $y \in \mathbb{R}^3$ into integrals over spheres centered at $x^{\mathrm{ant}}$, and using the definition of the one-dimensional delta distribution, one obtains that
$$E_{\mathrm{Born}}(x^{\mathrm{ant}}; x^{\mathrm{ant}}, t) = -w_b''(t) *_t \frac{(R_{3D}u^{3D})(x^{\mathrm{ant}}, c_0 t/2)}{32\pi^2 c_0^3\,(t/2)^2} \tag{13}$$
with
$$(R_{3D}u^{3D})(x^{\mathrm{ant}}, r) := \int_{|x^{\mathrm{ant}} - y| = r} u^{3D}(y)\,dS(y) \tag{14}$$
denoting the (three-dimensional) spherical Radon transform. This is the basic equation of GPR, which relates the unknown function $u^{3D}$ to the scattered data measured in zero offset mode.
The Radiating Reflectors Model
In the presented application (see Fig. 1), the distances between the antenna position $x^{\mathrm{ant}}$ and the positions $y$ of the reflectors are relatively large. In this case, multiplication by $t$ and convolution with $w_b''$ in (13) can be (approximately) interchanged, that is,
$$-8\pi c_0^2\,t\,E_{\mathrm{Born}}(x^{\mathrm{ant}}; x^{\mathrm{ant}}, 2t) \simeq \Phi(x^{\mathrm{ant}}, t) := w_b''(t) *_t \frac{(R_{3D}u^{3D})(x^{\mathrm{ant}}, c_0 t)}{4\pi c_0 t}. \tag{15}$$
One notes that $\Phi$ is the solution at position $x^{\mathrm{ant}}$ of the wave equation
$$\Bigl(\frac{1}{c_0^2}\,\partial_t^2 - \Delta\Bigr)\Phi(x, t) = w_b''(t)\,u^{3D}(x), \qquad (x,t) \in \mathbb{R}^3 \times \mathbb{R}. \tag{16}$$
Equation (16) is named the radiating (or exploding) reflectors model, as the inhomogeneity $u^{3D}$ now appears as an active source in the wave equation.
Formulation of the Inverse Problem
Equation (15) relates the unknown function $u^{3D}(x)$ with the data $\Phi(x^{\mathrm{ant}}, t)$. Due to the convolution with the function $w_b''$, which does not contain high-frequency components (see Fig. 2), the exact reconstruction of $u^{3D}$ is hardly possible. It is therefore common to apply migration, which is designed to invert the spherical Radon transform. When applying migration to the data defined in (15), one reconstructs a band-limited approximation of $u^{3D}$. Indeed, from Haltmeier et al. (2009, Proposition 2.2), it follows that (see also Haltmeier and Zangerl 2010)
$$\Phi(x^{\mathrm{ant}}, t) = \frac{(R_{3D}u_b^{3D})(x^{\mathrm{ant}}, c_0 t)}{t}, \tag{17}$$
where u3D b .x/ WD
8c0
Z R3
w000 b .jyj/ 3D u .x y/ d y; x 2 R3 : jyj
(18)
Therefore, the data $t\Phi(x^{\mathrm{ant}}, t)$ can be viewed as the spherical Radon transform of the band-limited reflectivity function $u_b^{3D}(x)$, and the application of migration to the data $t\Phi(x^{\mathrm{ant}}, t)$ will reconstruct the function $u_b^{3D}(x)$. A characteristic of the presented application (see Fig. 1) is that the radar antenna is moved along a one-dimensional path, that is, only the two-dimensional data set
$$v(x^{\mathrm{ant}}, t) := t\,\Phi\bigl((x^{\mathrm{ant}}, 0, 0), t\bigr) \quad \text{with } (x^{\mathrm{ant}}, t) \in \mathbb{R} \times (0, \infty)$$
is available, from which one can recover at most a function with two degrees of freedom. Therefore, it is assumed that the support of the function $u_b^{3D}$ is approximately located in the plane $\{(x_1, x_2, x_3) : x_3 = 0\}$, that is,
$$u_b^{3D}(x_1, x_2, x_3) = u_b^{2D}(x_1, x_2)\,\delta_{1D}(x_3) \quad \text{with } x = (x_1, x_2, x_3) \in \mathbb{R}^2 \times \mathbb{R}.$$
Together with (17) this leads to the equation
$$v(x^{\mathrm{ant}}, t) = (R_{2D}u_b^{2D})(x^{\mathrm{ant}}, c_0 t), \qquad (x^{\mathrm{ant}}, t) \in \mathbb{R} \times (0, \infty), \tag{19}$$
where
$$(R_{2D}u)(x^{\mathrm{ant}}, r) := \int_{|(x^{\mathrm{ant}}, 0) - y| = r} u(y)\,ds(y), \qquad (x^{\mathrm{ant}}, r) \in \mathbb{R} \times (0, \infty), \tag{20}$$
denotes the circular Radon transform (the spherical Radon transform in two dimensions). Equation (19) is the final equation that will be used to reconstruct the band-limited reflectivity function $u_b^{2D}(x_1, x_2)$ from the data $v(x^{\mathrm{ant}}, t)$.
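To see what the circular Radon transform (20) does to a very small scatterer, one can compute the radius at which the circle around the antenna passes through the scatterer. For a point-like scatterer the data are concentrated on this curve, which is a hyperbola in the plane of antenna position and radius. The sketch below only evaluates this geometry; the scatterer position and the list of antenna positions are hypothetical values chosen for illustration.

```python
import math


def arrival_radius(x_ant, scatterer):
    """Radius r at which a point scatterer at (y1, y2) appears in the data
    for an antenna at (x_ant, 0): the circle of radius r around the antenna
    passes through the scatterer."""
    y1, y2 = scatterer
    return math.hypot(x_ant - y1, y2)


c0 = 299_792_458.0            # speed of light in m/s
scatterer = (0.0, 7.0)        # hypothetical point scatterer at depth 7 m
for x_ant in (-2.0, -1.0, 0.0, 1.0, 2.0):
    r = arrival_radius(x_ant, scatterer)
    print(x_ant, r, r / c0)   # antenna position, radius in m, time t = r/c0 in s
```

The radius is smallest directly above the scatterer and grows symmetrically to both sides, which is the diffraction hyperbola that migration tries to collapse back to a point.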
5.2 Migration Versus Nonlinear Focusing
If the values $(R_{2D}u_b^{2D})(x^{\mathrm{ant}}, r)$ in (19) were known for all $x^{\mathrm{ant}} \in \mathbb{R}$ and all $r > 0$, then $u_b^{2D}$ could be reconstructed by means of explicit reconstruction formulas. At least two types of theoretically exact formulas for recovering $u_b^{2D}$ have been derived: temporal back-projection and Fourier domain formulas (Stolt 1978; Norton and Linzer 1981; Fawcett 1985; Andersson 1988). These formulas and their variations are known as migration, backprojection, or synthetic focusing techniques.
The Limited Data Problem
In practice, it is not appropriate to assume that $(R_{2D}u_b^{2D})(x^{\mathrm{ant}}, t)$ is known for all $x^{\mathrm{ant}} \in \mathbb{R}$, and the antenna positions and acquisition times have to be restricted to domains $(-X, X)$ and $(0, R/c_0)$, respectively. We model the available partial data by
$$v_{\mathrm{cut}}(x^{\mathrm{ant}}, r) := w_{\mathrm{cut}}(x^{\mathrm{ant}}, r)\,(R_{2D}u_b^{2D})(x^{\mathrm{ant}}, r), \qquad (x^{\mathrm{ant}}, r) \in (-X, X) \times (0, R), \tag{21}$$
where $w_{\mathrm{cut}}$ is a smooth cutoff function that vanishes outside the domain $(-X, X) \times (0, R)$. Without a priori knowledge, the reflectivity function $u_b^{2D}$ cannot be exactly reconstructed from the incomplete data (21) in a stable way (see Louis and Quinto 2000). It is therefore common to apply migration techniques just to the partial data and to consider the resulting image as an approximate reconstruction. Applying Kirchhoff migration to the partial data (21) leads to
$$u_{\mathrm{km}}(x_1, x_2) := (R_{2D}^* v_{\mathrm{cut}})(x_1, x_2) := \int_{-X}^{X} v_{\mathrm{cut}}\Bigl(x^{\mathrm{ant}}, \sqrt{(x^{\mathrm{ant}} - x_1)^2 + x_2^2}\Bigr)\,dx^{\mathrm{ant}}.$$
With Kirchhoff migration, the horizontal resolution at location $(0, d)$ is given by $c_0 d/(2Xb)$ (see Borcea et al. 2005, Appendix A.1 for a derivation). Incorporating a priori knowledge via nonlinear inversion, however, may be able to increase the resolution. Below it is demonstrated that this is indeed the case for sparsity regularization using a Haar wavelet basis. A heuristic reason is that sparse objects (reconstructed with sparse regularization) tend to be less blurred than images reconstructed by linear methods.
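The Kirchhoff migration integral can be approximated by a simple quadrature over the antenna positions. The sketch below uses the midpoint rule and a hypothetical synthetic data function (a Gaussian ridge along the diffraction hyperbola of a scatterer at depth 7 m); both are assumptions made for illustration and are not the data or discretization used in the chapter.

```python
import math


def kirchhoff_migration(v_cut, x1, x2, X, n=400):
    """Midpoint-rule approximation of the integral over x_ant in (-X, X)."""
    h = 2.0 * X / n
    total = 0.0
    for k in range(n):
        x_ant = -X + (k + 0.5) * h
        total += v_cut(x_ant, math.hypot(x_ant - x1, x2))
    return total * h


def synthetic_data(x_ant, r, d=7.0, width=0.1):
    """Hypothetical data: a Gaussian ridge along r = sqrt(x_ant^2 + d^2)."""
    return math.exp(-((r - math.hypot(x_ant, d)) ** 2) / (2.0 * width ** 2))


X = 2.0
print(kirchhoff_migration(synthetic_data, 0.0, 7.0, X))   # at the scatterer
print(kirchhoff_migration(synthetic_data, 1.0, 7.0, X))   # 1 m to the side
print(kirchhoff_migration(synthetic_data, 0.0, 8.0, X))   # 1 m deeper
```

In this toy example the migrated value 1 m to the side of the scatterer remains much larger than the value 1 m below it, which illustrates why the lateral resolution of the linear method is the critical one for a small aperture.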
Application of Sparsity Regularization
For the sake of simplicity, only Tikhonov regularization with $R_1$ penalty term and uniform weights is considered, leading to the regularization functional
$$T_{\alpha,v^\delta}(u) := \|R_{2D}u - v^\delta\|^2 + \alpha\sum_{\lambda\in\Lambda} |\langle\phi_\lambda, u\rangle|, \tag{22}$$
where $(\phi_\lambda)_{\lambda\in\Lambda}$ is a Haar wavelet basis and $\alpha$ is the regularization parameter. Here $u$ and $v^\delta$ are elements of the Hilbert spaces
$$U := \bigl\{u \in L^2(\mathbb{R}^2) : \operatorname{supp}(u) \subset (-X, X) \times (0, R)\bigr\}, \qquad V := L^2\bigl((-X, X) \times (0, R)\bigr).$$
The circular Radon transform $R_{2D}$, considered as an operator between $U$ and $V$, is easily shown to be bounded and linear (see, e.g., Scherzer et al. 2009, Lemma 3.79). For the minimization of (22), we apply the iterative thresholding algorithm (7), which in this context reads as
$$u_{n+1} := S_{\mu\alpha,1}\bigl(u_n + \mu R_{2D}^*(v^\delta - R_{2D}u_n)\bigr). \tag{23}$$
Here $S_{\mu\alpha,1}$ is the shrinkage operator defined by (6) and (5), and $\mu$ is a positive parameter such that $\mu\|R_{2D}^* R_{2D}\| < 1$.
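The shrinkage operator acts on Haar wavelet coefficients, and piecewise constant objects have only a few nonzero coefficients in this basis. The following sketch shows this for a one-dimensional orthonormal Haar transform, which is a simplification of the two-dimensional basis used in the application; the function names and the test signal are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(x):
    """Orthonormal 1D Haar transform of a list whose length is a power of two."""
    approx, details = list(x), []
    while len(approx) > 1:
        pairs = [(approx[2 * i], approx[2 * i + 1]) for i in range(len(approx) // 2)]
        details = [(a - b) / SQRT2 for a, b in pairs] + details
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx + details


def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    approx, pos = [coeffs[0]], 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(approx)]
        pos += len(approx)
        finer = []
        for a, d in zip(approx, details):
            finer += [(a + d) / SQRT2, (a - d) / SQRT2]
        approx = finer
    return approx


signal = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
coeffs = haar_forward(signal)
print(sum(1 for c in coeffs if c != 0.0))   # → 2
```

The step function with eight samples is represented by two coefficients only, so thresholding in the Haar domain removes small coefficients caused by noise without blurring the jump.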
5.3 Numerical Examples
In the numerical examples, $X = 2$ m and $R = 12$ m. The scatterer $u$ is the characteristic function of a small disk located at position $(0, d)$ with $d = 7$ m, see Fig. 3. It is assumed that the emitted radar signal is a Ricker wavelet $w_b$ with a central frequency of 250 MHz (compare with Fig. 2). The data $v(x^{\mathrm{ant}}, r)$ are generated by numerically convolving $R_{2D}u$ with the second derivative of the Ricker wavelet. The reconstructions obtained with Kirchhoff migration and with sparsity regularization are depicted in Fig. 4. Both methods show good resolution in the vertical direction (often called axial or range resolution). The horizontal resolution (lateral or cross-range resolution) of the scatterer, however, is significantly improved by sparsity regularization. This shows that sparsity regularization is indeed able to surpass the resolution limit $c_0 d/(2Xb)$ of linear reconstruction techniques.
Fig. 3 Geometry in the numerical experiment. Data $v(x^{\mathrm{ant}}, r)$, caused by a small scatterer positioned at location (0, 7 m), are simulated for $(x^{\mathrm{ant}}, r) \in (-X, X) \times (0, R)$ with $X = 2$ m and $R = 12$ m. The antenna positions lie on the $x_1$-axis between $-X$ and $X$; the scatterer lies on the $x_2$-axis at $x_2 = 7$
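For the parameters of this experiment, the resolution limit $c_0 d/(2Xb)$ of linear reconstruction can be evaluated directly. The short computation below assumes that the bandwidth $b$ equals the central frequency of 250 MHz, in line with the remark on ultrawideband signals in Sect. 5.1, and uses the vacuum speed of light for $c_0$.

```python
c0 = 299_792_458.0   # speed of light in m/s
d = 7.0              # depth of the scatterer in m
X = 2.0              # half aperture in m
b = 250e6            # bandwidth in Hz, taken equal to the central frequency

resolution = c0 * d / (2.0 * X * b)
print(round(resolution, 2))   # → 2.1 (metres)
```

Under these assumptions the lateral resolution limit is about 2.1 m, which is as large as the half aperture itself.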
Fig. 4 Exact data experiment. Top left: Data. Top middle: Reconstruction by Kirchhoff migration. Top right: Reconstruction with sparsity regularization. Bottom: Vertical and horizontal profiles of the reconstructions (curves: Original, Kirchhoff, Sparsity)
In order to demonstrate the stability with respect to data perturbations, we also perform reconstructions after adding Gaussian noise and clutter. Clutter arises from multiple reflections on fixed structures and from reflections resulting from the inhomogeneous background (Daniels 2004). A characteristic property of clutter is that it has similar spectral characteristics as the emitted radar signal. The reconstruction results from data with clutter and noise added are depicted in Fig. 5. Again, sparsity regularization shows better horizontal resolution than Kirchhoff migration. Moreover, the image reconstructed with sparsity regularization is less noisy.
Fig. 5 Noisy data experiment. Top left: Data. Top middle: Reconstruction by Kirchhoff migration. Top right: Reconstruction with sparsity regularization. Bottom: Vertical and horizontal profiles of the reconstructions (curves: Original, Kirchhoff, Sparsity)
5.4 Application to Real Data
Radar measurements were performed with a 400 MHz antenna (RIS One GPR instrument). The investigated area was a complex avalanche deposit near Salzburg, Austria. The recorded data are shown in Fig. 6. In the numerical reconstruction, an aperture of $X = 3.3$ m and a time window of $R/c_0 = 50$ ns are chosen. The extracted data are depicted in the left image in Fig. 7. One clearly sees a diffraction hyperbola stemming from a scatterer in the subsurface. Moreover, the data agree very well with the simulated data depicted in the left image in Fig. 5.
Fig. 6 Measured radar data (horizontal axis: flight path from $-X$ to $X$; vertical axis: time up to $R/c_0$). For the numerical reconstruction only the partial data $\Phi((x^{\mathrm{ant}}, 0, 0), t)$, with $(x^{\mathrm{ant}}, t) \in (-X, X) \times (0, R/c_0)$ where $X = 3.3$ m and $R/c_0 = 50$ ns, have been used
The reconstruction results with Kirchhoff migration and with sparsity regularization are depicted in Fig. 7. The regularization parameter $\alpha$ is chosen as 0.02, and the scaling parameter $\mu$ is chosen in such a way that $\mu\|R_{2D}^* R_{2D}\|$ is only slightly smaller than 1.
Appendix: Numerical Methods
In the main article, we have presented two methods for the minimization of sparsity functionals of the form
$$T_{\alpha,v^\delta}(u) = \|Au - v^\delta\|^2 + \alpha R(u) \tag{24}$$
with
$$R(u) = \sum_{\lambda\in\Lambda} w_\lambda\,|\langle\phi_\lambda, u\rangle|,$$
namely, iterative thresholding and second-order cone programs. In this appendix, we will discuss some additional methods. Numerical methods for sparsity minimization can be divided into two categories: first, methods that attempt to minimize the (nonsmooth) functional (24) directly and, second, methods that approximate the functional (24) with a differentiable one and then try to find a minimum of the approximation. In contrast to the nonsmooth original problem, the optimality condition of the approximation is a single-valued equation with a unique solution. Both methods introduced in the main article minimize the functional (24) directly. We now first discuss two approximation approaches and then two direct methods.
Fig. 7 Reconstruction from real data. Top left: Data. Top middle: Reconstruction by Kirchhoff migration. Top right: Reconstruction with sparsity regularization. Bottom: Vertical and horizontal profiles of the reconstructions (curves: Kirchhoff, Sparsity)
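Explicit Approximation Methods
Usually, the approximating functionals can be written in the form
$$T_{\alpha,\varepsilon,v^\delta}(u) = \|Au - v^\delta\|^2 + \alpha R_\varepsilon(u) \tag{25}$$
with
$$R_\varepsilon(u) = \sum_{\lambda\in\Lambda} w_\lambda\,\psi\bigl(\langle\phi_\lambda, u\rangle, \varepsilon\bigr).$$
Here we assume that the function $\psi : \mathbb{R} \times \mathbb{R}_{>0} \to \mathbb{R}$ has the following properties:
• The function $\psi$ satisfies $\psi(s, \varepsilon) \to |s|$ as $\varepsilon \to 0$ for every $s \in \mathbb{R}$.
• For every $\varepsilon > 0$ the function $\psi(\cdot, \varepsilon)$ is convex and differentiable.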
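Typical examples of functions $\psi$ are
1. $\psi(s, \varepsilon) = \sqrt{s^2 + \varepsilon^2}$,
2. $\psi(s, \varepsilon) = s^2/(2\varepsilon)$ for $|s| \le \varepsilon$ and $\psi(s, \varepsilon) = |s| - \varepsilon/2$ for $|s| \ge \varepsilon$,
3. $\psi(s, \varepsilon) = |s|^{1+\varepsilon}$,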
to name but a few. The advantage of these approaches is that the subdifferential of $T_{\alpha,\varepsilon,v^\delta}$ is at most single-valued, and therefore the minimization can be performed with gradient-based algorithms. Consider now $\varepsilon > 0$ fixed. Minimization of the functional $T_{\alpha,\varepsilon,v^\delta}$ can be performed with gradient-type methods, such as Landweber iteration, or (quasi-)Newton methods. In the case of $\psi(s, \varepsilon) = \sqrt{s^2 + \varepsilon^2}$ the gradient-based method looks as follows:
$$u^{(n+1)} = u^{(n)} - \gamma_n\Bigl[2A^*\bigl(Au^{(n)} - v^\delta\bigr) + \alpha\sum_{\lambda\in\Lambda} w_\lambda\,\frac{\langle\phi_\lambda, u^{(n)}\rangle}{\sqrt{\langle\phi_\lambda, u^{(n)}\rangle^2 + \varepsilon^2}}\,\phi_\lambda\Bigr]. \tag{26}$$
Here $\gamma_n$ is some positive step size that can, for instance, be defined by a line search.
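A gradient iteration of this type can be sketched for a small dense problem as follows. The sketch assumes the smoothing function $\sqrt{s^2 + \varepsilon^2}$, uniform weights, the canonical basis, and a fixed step size instead of a line search; the step size must be small compared with the inverse Lipschitz constant of the gradient, which is at most $2\|A^*A\| + \alpha/\varepsilon$ here. All names are hypothetical.

```python
import math


def matvec(A, x):
    """Return A x for a dense matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Return A^T y."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def gradient(A, v, alpha, eps, u):
    """Gradient of ||Au - v||^2 + alpha * sum_i sqrt(u_i^2 + eps^2)."""
    residual = [ri - vi for ri, vi in zip(matvec(A, u), v)]   # Au - v
    data_grad = rmatvec(A, residual)
    return [2.0 * gi + alpha * ui / math.sqrt(ui * ui + eps * eps)
            for ui, gi in zip(u, data_grad)]


def smoothed_gradient_descent(A, v, alpha, eps, step, n_iter=2000):
    """Fixed-step gradient descent for the smoothed functional."""
    u = [0.0] * len(A[0])
    for _ in range(n_iter):
        g = gradient(A, v, alpha, eps, u)
        u = [ui - step * gi for ui, gi in zip(u, g)]
    return u


A = [[1.0, 0.0], [0.0, 1.0]]
v = [2.0, 0.1]
u = smoothed_gradient_descent(A, v, alpha=1.0, eps=0.1, step=0.05)
print(u)
```

In this toy case the small coefficient is shrunk toward zero but is not set exactly to zero, which is the typical price of smoothing the penalty.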
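Iteratively Reweighted Least Squares Method (IRLS)
This approach is based on the identity
$$R(u) = \sum_{\lambda\in\Lambda} w_\lambda\,\frac{\langle\phi_\lambda, u\rangle^2}{|\langle\phi_\lambda, u\rangle|}.$$
Thus, $\ell^1$-regularization can be considered as quadratic regularization with weights $w_\lambda/|\langle\phi_\lambda, u\rangle|$ depending on the minimizer $u$. We now define
$$S(u, \tilde u, \varepsilon) := \sum_{\lambda\in\Lambda} w_\lambda\,\theta_\varepsilon\bigl(|\langle\phi_\lambda, \tilde u\rangle|\bigr)\,\langle\phi_\lambda, u\rangle^2,$$
where the functions $\theta_\varepsilon$ satisfy $\theta_\varepsilon(t) \to 1/t$ for $\varepsilon \to 0$. A typical choice is
$$\theta_\varepsilon(t) = \frac{1}{\sqrt{t^2 + \varepsilon^2}}.$$
Starting with some initial guess $u^{(0)}$ of the minimizer of $T_{\alpha,v^\delta}$, one then chooses some sequence of positive numbers $\varepsilon_n \to 0$ and defines inductively
$$u^{(n+1)} := \arg\min_u \bigl[\|Au - v^\delta\|^2 + \alpha S(u, u^{(n)}, \varepsilon_n)\bigr].$$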
Note that in each iteration one has to minimize a quadratic functional, which amounts to solving a linear equation. This and related methods have been considered by several authors. A theoretical analysis can, for instance, be found in Daubechies et al. (2010).
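The reweighting loop can be sketched as follows for a small dense problem with uniform weights and the canonical basis. Two choices in this sketch are assumptions and not taken from the text: the smoothing parameters decay geometrically, and the reweighting function is scaled by the factor `weight_scale = 0.5`, which corresponds to the standard quadratic majorizer $|s| \le s^2/(2|\tilde s|) + |\tilde s|/2$ so that fixed points satisfy the optimality condition of (24). The linear systems are solved with a hypothetical helper based on Gaussian elimination.

```python
import math


def solve_linear(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def irls(A, v, alpha, n_iter=100, eps0=1.0, decay=0.7, weight_scale=0.5):
    """Iteratively reweighted least squares for ||Au - v||^2 + alpha * sum_i |u_i|."""
    m, n = len(A), len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)] for i in range(n)]
    Atv = [sum(A[k][i] * v[k] for k in range(m)) for i in range(n)]
    u, eps = Atv[:], eps0
    for _ in range(n_iter):
        # Normal equations of ||Au - v||^2 + alpha * sum_i theta_i * u_i^2
        M = [row[:] for row in AtA]
        for i in range(n):
            M[i][i] += alpha * weight_scale / math.sqrt(u[i] ** 2 + eps ** 2)
        u = solve_linear(M, Atv)
        eps *= decay
    return u


print(irls([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.1], alpha=1.0))
```

For the identity operator the iterates should approach the soft-thresholded data, that is, approximately `[1.5, 0.0]`, in agreement with the result of the thresholding iteration for the same toy problem.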
Accelerated Thresholding Methods
We recall first the iterative thresholding algorithm defined by
$$u_{n+1} := S_{\mu\alpha w,1}\bigl(u_n + \mu A^*(v^\delta - Au_n)\bigr),$$
where the thresholding operator $S$ is as in Sect. 4.1. In Beck and Teboulle (2009), an improvement of the iterative thresholding algorithm has been proposed, where the iterates are defined as linear combinations of subsequent iterates. More precisely, given some starting point $u_0 \in U$, the iteration is defined by
$$y_n = S_{\mu\alpha w,1}\bigl(u_n + \mu A^*(v^\delta - Au_n)\bigr), \qquad t_{n+1} = \frac{1 + \sqrt{1 + 4t_n^2}}{2}, \qquad u_{n+1} = y_n + \frac{t_n - 1}{t_{n+1}}\,(y_n - y_{n-1}),$$
with the initialization $t_0 = 1$. We note that the first step of this algorithm is an ordinary thresholding step, and thus an initialization of $y_{-1}$ is not necessary. It has been shown in Beck and Teboulle (2009) that this algorithm converges provided that the shrinkage parameter $\mu$ is chosen in such a way that $\mu\|A^*A\| < 1$. This is precisely the same constraint as in the classical iterative thresholding algorithm. In addition, the estimate
$$T_{\alpha,v^\delta}(u_k) - T_{\alpha,v^\delta}(u^\delta_\alpha) \le \frac{2\|u_0 - u^\delta_\alpha\|^2}{(k + 1)^2}$$
holds, which means that the accuracy of the iterate $u_k$, measured in terms of the energy $T_{\alpha,v^\delta}$, is of order $O(1/k^2)$. In contrast, the classical iterative thresholding algorithm only yields an accuracy of order $O(1/k)$.
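The accelerated iteration differs from plain thresholding only by the extrapolation step, as the following sketch for a small dense problem shows. It assumes uniform weights and the canonical basis, uses `mu` as a hypothetical name for the shrinkage parameter, and returns the last thresholded iterate.

```python
import math


def matvec(A, x):
    """Return A x for a dense matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Return A^T y."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def soft(t, thr):
    """Soft thresholding with threshold thr."""
    if t > thr:
        return t - thr
    if t < -thr:
        return t + thr
    return 0.0


def accelerated_thresholding(A, v, alpha, mu, n_iter=200):
    """Thresholding with extrapolation; requires mu * ||A^T A|| < 1."""
    u = [0.0] * len(A[0])
    y_prev = u[:]          # not used in the first step because t_0 - 1 = 0
    t = 1.0
    for _ in range(n_iter):
        residual = [vi - ri for vi, ri in zip(v, matvec(A, u))]
        step = rmatvec(A, residual)
        y = [soft(ui + mu * gi, 0.5 * mu * alpha) for ui, gi in zip(u, step)]
        t_next = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * t * t))
        u = [yi + (t - 1.0) / t_next * (yi - ypi) for yi, ypi in zip(y, y_prev)]
        y_prev, t = y, t_next
    return y_prev


A = [[1.0, 0.0], [0.0, 1.0]]
print(accelerated_thresholding(A, [2.0, 0.1], alpha=1.0, mu=0.5))
```

For this toy problem the result should again be close to `[1.5, 0.0]`. The benefit of the extrapolation is mainly visible for ill-conditioned operators, where plain thresholding needs many more iterations.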
Augmented Lagrangian Methods
In order to define augmented Lagrangian methods for the minimization of $T_{\alpha,v^\delta}$, we consider the equivalent formulation as the constrained optimization problem
$$\|r\|^2 + \alpha R(u) \to \min \quad \text{subject to} \quad Au + r = v^\delta.$$
The augmented Lagrangian of this optimization problem is the mapping Lˇ W U V V ! .1; C1 , Lˇ .u; r; / D krk2 C ˛R.u/ h ; Au C r v ı i C ˇkAu C r v ı k2 ; depending on the parameter ˇ > 0. In order to find a saddle-point of Lˇ , we perform the iteration (see Yang and Zhang 2011) unC1 D S˛w=2ˇ;1 un C A .v ı C rn Aun n =.2ˇ// ; rnC1 D
1 n =2 ˇ.AunC1 v ı / ; 1Cˇ
nC1 D n ˇ AunC1 C rnC1 v ı : We do note that the update step for r is simply an explicit computation of the minimum of Lˇ .unC1 ; ; n /. Similarly, the update step for u can be interpreted as an approximate minimization of Lˇ .; rn ; n /. Therefore, the iteration can be regarded as an inexact alternating directions method. It has been shown in Yang and Zhang (2011) that this algorithm converges, if the parameters , > 0 satisfy the relation kA Ak C 2 < 2. In addition to this algorithm, augmented Lagrangian methods based on a dual problem have also been proposed in Yang and Zhang (2011). We refer to that paper for further information. Acknowledgements This work has been supported by the Austrian Science Fund (FWF) within the national research networks Industrial Geometry, project 9203-N12, Variational Imaging on Manifolds, project 11704, and Photoacoustic Imaging in Biology and Medicine, project S10505N20. The authors thank Sylvia Leimgruber (alpS – Center for Natural Hazard Management in Innsbruck) and Harald Grossauer (University Innsbruck) for providing real life data sets.
References Andersson LE (1988) On the determination of a function from spherical averages. SIAM J Math Anal 19(1):214–232 Beck A, Teboulle M (2009) Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans Image Process 18(11):2419–2434 Bleistein N, Cohen JK, Stockwell JW Jr (2001) Mathematics of multidimensional seismic imaging, migration, and inversion. Interdisciplinary applied mathematics: Geophysics and planetary sciences, vol 13. Springer, New York Bonesky T (2009) Morozov’s discrepancy principle and Tikhonov-type functionals. Inverse Probl 25(1):015015 Borcea L, Papanicolaou G, Tsogka C (2005) Interferometric array imaging in clutter. Inverse Probl 21(4):1419–1460 Bredies K, Lorenz DA (2014) Minimization of non-smooth, non-convex functionals by iterative thresholding. J Optim Theory Appl doi:10.1007/s10957-014-0614-7
Burger M, Osher S (2004) Convergence rates of convex variational regularization. Inverse Probl 20(5):1411–1421 Candès EJ, Romberg J (2005) `1 -MAGIC: recovery of sparse signals via convex programming. Technical report, 2005. Available at http://www.acm.caltech.edu/l1magic Candès EJ, Romberg J, Tao T (2006) Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans Inf Theory 52(2): 489–509 Claerbout J, Muir F (1973) Robust modeling of erratic data. Geophysics 38:826–844 Combettes PL, Wajs VR (2005) Signal recovery by proximal forward-backward splitting. Multiscale Model Simul 4(4):1168–1200 Courant R, Hilbert D (1962) Methods of mathematical Physics, vol 2. Wiley-Interscience, New York Daniels D (2004) Ground penetrating radar. The Institution of Electrical Engineers, London Daubechies I, Defrise M, De Mol C (2004) An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun Pure Appl Math 57(11):1413–1457 Daubechies I, DeVore R, Fornasier M, Güntürk CS (2010) Iteratively reweighted least squares minimization for sparse recovery. Commun Pure Appl Anal 63(1):1–38 Donoho DL, Elad M (2003) Optimally sparse representation in general (nonorthogonal) dictionaries via I 1 minimization. Proc Natl Acad Sci USA 100(5):2197–2202 Ekeland I, Temam R (1974) Analyse convexe et problèmes variationnels. Collection Études Mathématiques. Dunod, Paris Engl HW, Hanke M, Neubauer A (1996) Regularization of inverse problems. Mathematics and its applications. Kluwer Academic, Dordrecht Fawcett JA (1985) Inversion of n-dimensional spherical averages. SIAM J Appl Math 45(2): 336–341 Finch D, Rakesh (2007) The spherical mean value operator with centers on a sphere. Inverse Probl 23(6):37–49 Frühauf F, Heilig A, Schneebeli M, Fellin W, Scherzer O (2009) Experiments and algorithms to detect snow avalanche victims using airborne ground-penetrating radar. 
IEEE Trans Geosci Remote Sens 47(7):2240–2251 Grasmair M (2009) Well-posedness and convergence rates for sparse regularization with sublinear l q penalty term. Inverse Probl Imaging 3(3):383–387 Grasmair M (2010) Non-convex sparse regularisation. J Math Anal Appl 365:19–28 Grasmair M, Haltmeier M, Scherzer O (2008) Sparse regularization with l q penalty term. Inverse Probl 24(5):055020 Grasmair M, Haltmeier M, Scherzer O (2011a) Necessary and sufficient conditions for linear convergence of `1 -regularization. Commun Pure Appl Math 64(2):161–182 Grasmair M, Haltmeier M, Scherzer O (2011b) The residual method for regularizing ill-posed problems. Appl Math Comput 218(6):2693–2710 Groetsch CW (1984) The theory of Tikhonov regularization for Fredholm equations of the first kind. Pitman, Boston Haltmeier M, Zangerl G (2010) Spatial resolution in photoacoustic tomography: effects of detector size and detector bandwidth. Inverse Probl 26(12):125002 Haltmeier M, Kowar R, Scherzer O (2005) Computer aided location of avalanche victims with ground penetrating radar mounted on a helicopter. In: Lenzen F, Scherzer O, Vincze M (eds) Digital imaging and pattern recognition. Proceedings of the 30th workshop of the Austrian Association for Pattern Recognition, Obergugl, pp 1736–1744 Haltmeier M, Scherzer O, Zangerl G (2009) Influence of detector bandwidth and detector size to the resolution of photoacoustic tomagraphy. In: Breitenecker F, Troch I (eds) Argesim report no. 35: Proceedings Mathmod’09, Vienna, pp 1736–1744 Hofmann B, Kaltenbacher B, Pöschl C, Scherzer O (2007) A convergence rates result in Banach spaces with non-smooth operators. Inverse Probl 23(3):987–1010 Ivanov VK, Vasin VV, Tanana VP (2002) Theory of linear ill-posed problems and its applications. Inverse and ill-posed problems series, 2nd edn. (Translated and revised from the 1978 Russian original). VSP, Utrecht
Kuchment P, Kunyansky LA (2008) Mathematics of thermoacoustic and photoacoustic tomography. Eur J Appl Math 19:191–224 Levy S, Fullagar T (1981) Reconstruction of a sparse spike train from a portion of its spectrum and application to high-resolution deconvolution. Geophysics 46:1235–1243 Lorenz D (2008) Convergence rates and source conditions for Tikhonov regularization with sparsity constraints. J Inverse Ill-Posed Probl 16(5):463–478 Louis AK, Quinto ET (2000) Local tomographic methods in sonar. In: Colton D, Engl HW, Louis AK, McLaughlin JR, Rundell W (eds) Surveys on solution methods for inverse problems. Springer, Vienna, pp 147–154 Neubauer A (1997) On converse and saturation results for Tikhonov regularization of linear ill-posed problems. SIAM J Numer Anal 34:517–527 Norton SJ, Linzer M (1981) Ultrasonic reflectivity imaging in three dimensions: exact inverse scattering solutions for plane, cylindrical and spherical apertures. IEEE Trans Biomed Eng 28(2):202–220 Oldenburg D, Scheuer T, Levy S (1983) Recovery of the acoustic impedance from reflection seismograms. Geophysics 48:1318–1337 Patch SK, Scherzer O (2007) Special section on photo- and thermoacoustic imaging. Inverse Probl 23:S1–S122 Renegar J (2001) A mathematical view of interior-point methods in convex optimization. MPS/SIAM series on optimization. SIAM, Philadelphia Resmerita E (2005) Regularization of ill-posed problems in Banach spaces: convergence rates. Inverse Probl 21(4):1303–1314 Santosa F, Symes WW (1986) Linear inversion of band-limited reflection seismograms. SIAM J Sci Comput 7(4):1307–1330 Scherzer O, Grasmair M, Grossauer H, Haltmeier M, Lenzen F (2009) Variational methods in imaging. Applied mathematical sciences, vol 167. Springer, New York Stolt RH (1978) Migration by Fourier transform. Geophysics 43:23–48 Symes WW (2009) The seismic reflection inverse problem. 
Inverse Probl 15(12):123008 Yang J, Zhang Y (2011) Alternating direction algorithms for `1 -problems in compressive sensing. SIAM J Sci Comput 33(1):250–278 Zarzer CA (2009) On Tikhonov regularization with non-convex sparsity constraints. Inverse Probl 25:025006
Multiparameter Regularization in Downward Continuation of Satellite Data Shuai Lu and Sergei V. Pereverzev
Contents
1 Introduction
2 A Functional Analysis Point of View on Satellite Geodetic Problems
3 An Appearance of a Multiparameter Regularization in the Geodetic Context: Theoretical Aspects
4 Computational Aspects of Some Multiparameter Regularization Schemes
5 Numerical Illustrations
6 Conclusion
References
Abstract
This chapter discusses the downward continuation of spaceborne gravity data. We analyze the ill-posed nature of this problem and describe some approaches to its treatment. The chapter focuses on the multiparameter regularization approach and shows how it can naturally appear in the geodetic context, for example in the form of regularized total least squares or dual regularized total least squares. Numerical illustrations with synthetic data demonstrate that multiparameter regularization can indeed produce an approximation of good accuracy.
S. Lu () • S.V. Pereverzev Institute for Computational and Applied Mathematics, Austrian Academy of Sciences, Linz, Austria e-mail: [email protected]; [email protected] © Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_27
1 Introduction
A principal aim of satellite gravity field determination is the derivation of the disturbing gravity potential or geoidal undulations at the Earth's surface. However, determining a satellite-only gravitational model from satellite tracking measurements is ill-posed due to the nature of downward continuation. To better understand the ill-posed nature of this problem, assume that data $G(t)$, derived from satellite gravimetry, are given at a spherical surface of the satellite orbit $\Omega_r = \{t \in \mathbb{R}^3 : |t| = r\}$. Then the problem basically is to determine the disturbing gravity potential $U(t)$, harmonic outside the geocentric reference sphere $\Omega_R$ of radius $R < r$, which is a spherical approximation of the geoid. On the other hand, the upward continuation of $U(t)$ from the reference sphere $\Omega_R$ into the external space $\Omega_R^{\mathrm{ext}} = \{t \in \mathbb{R}^3 : |t| > R\}$ can be found by solving the spherical Dirichlet problem, which reads

$$\Delta U(t) = 0 \ \text{ for } |t| > R; \qquad U(t) = F(t) \ \text{ for } |t| = R, \tag{1}$$

where $F(t)$ is the disturbing gravity potential at the reference sphere, in which we are interested, and we assume $U(t)$ to be regular at infinity, i.e., $U(t) = O(1/|t|)$ and $|\nabla_t U(t)| = O(1/|t|^2)$ for $|t| \to \infty$. The solution to the problem (1) is represented by the Abel-Poisson integral (Kellogg 1967)

$$U(t) = \frac{1}{4\pi R} \int_{\Omega_R} \frac{|t|^2 - R^2}{|t - \tau|^3}\, F(\tau)\, d\Omega_R(\tau). \tag{2}$$

Then the disturbing gravity potential $F(t)$ at the reference sphere can be found by inverting the Abel-Poisson integral (2) for known values of $U(t) = G(t)$ at the satellite orbit $\Omega_r$, resulting in a Fredholm integral equation of the first kind

$$A_{DWC} F(t) := \frac{1}{4\pi R} \int_{\Omega_R} \frac{r^2 - R^2}{(r^2 + R^2 - 2\, t \cdot \tau)^{3/2}}\, F(\tau)\, d\Omega_R(\tau) = G(t), \qquad t \in \Omega_r. \tag{3}$$

Now one can easily recognize that for $r > R$ the kernel of the downward continuation integral operator $A_{DWC}$ is a bounded continuous function, and therefore $A_{DWC}$ is a compact operator between the Hilbert spaces $L^2(\Omega_R)$ and $L^2(\Omega_r)$. Hence, its inverse cannot be a bounded operator from $L^2(\Omega_r)$ to $L^2(\Omega_R)$ (see, e.g., Engl et al. 1996). Recalling Hadamard's definition of a well-posed problem (existence, uniqueness, and continuity of the inverse), we consequently see that the problem of downward continuation (3) is ill-posed, as it violates the third condition. Moreover, a straightforward calculation (see, e.g., Freeden et al. 1997) shows that the operator $A = A_{DWC}$ admits the singular value decomposition (SVD)
$$A = \sum_{j=1}^{\infty} a_j\, u_j\, \langle v_j, \cdot \rangle_{L^2(\Omega_R)}, \tag{4}$$

where

$$a_j = \left(\frac{R}{r}\right)^{k}, \quad u_j = u_j(t) = \frac{1}{r} Y_{k,i}\!\left(\frac{t}{r}\right), \quad v_j = v_j(\tau) = \frac{1}{R} Y_{k,i}\!\left(\frac{\tau}{R}\right), \quad j = i + k^2, \quad i = 1, 2, \ldots, 2k+1, \quad k = 0, 1, \ldots, \tag{5}$$
and $\{Y_{k,i}\}$ is a system of spherical harmonics $L^2$-orthonormalized with respect to the unit sphere $\Omega_1 \subset \mathbb{R}^3$. Therefore, the problem of downward continuation (3) can be formally classified as exponentially or severely ill-posed (see, e.g., Louis 1989), since the singular values $a_j$ of the operator $A = A_{DWC}$ converge to zero exponentially fast for $j \to \infty$. The ill-posed nature of downward continuation can be seen as the background to the severe ill-posedness of other satellite geodetic problems such as satellite gravity gradiometry, where the satellite data provide information about second-order partial derivatives of the gravity potential $U(t)$ at satellite altitude (for more details on satellite gravity gradiometry the reader is referred to Rummel et al. (1993) and Freeden (1999) as well as chapter Geodetic Boundary Value Problem of this work). Using second-order radial derivatives on the orbital sphere $\Omega_r$, the spherical framework of satellite gravity gradiometry (SGG) can also be mathematically formulated as a Fredholm integral equation of the first kind with the operator $A = A_{SGG} : L^2(\Omega_R) \to L^2(\Omega_r)$ defined by

$$A_{SGG} F(t) := \frac{1}{4\pi R} \int_{\Omega_R} \frac{\partial^2}{\partial r^2} \left[ \frac{r^2 - R^2}{(r^2 + R^2 - 2\, t \cdot \tau)^{3/2}} \right] F(\tau)\, d\Omega_R(\tau), \qquad t \in \Omega_r. \tag{6}$$
Although the kernel of this integral operator is less smooth than the downward continuation kernel, its SVD has the form (4) with exponentially decreasing singular values

$$a_j = \left(\frac{R}{r}\right)^{k} \frac{(k+1)(k+2)}{r^2}, \qquad j = i + k^2, \quad i = 1, 2, \ldots, 2k+1, \quad k = 0, 1, \ldots \tag{7}$$
(see, e.g., Freeden 1999; Pereverzev and Schock 1999). Therefore, the SGG problem can be formally classified as severely ill-posed as well. Thus, due to the ill-posed nature of downward continuation, in satellite geodesy we have to deal with exponentially ill-posed integral equations. In any discretized version of such an equation, this ill-posedness is reflected in the ill-conditioning of the corresponding matrix equation

$$Ax = y, \tag{8}$$

and the crux of the difficulty is that in practice we are usually confronted with satellite data $y^\varepsilon$ blurred by an observation error $\varepsilon = y^\varepsilon - y$. Note that polar gaps could be an extra error source (see, e.g., Boumann 2000). Ill-conditioned matrix equations are not new in geodesy, especially when satellite observations are used for gravity field estimation (see, e.g., Kusche and Klees 2002). A number of methods have been proposed to reconstruct the solution $x^\dagger$ of problem (8) from noisy observations $y^\varepsilon$. One of the most widely used methods was and still is Tikhonov regularization, which estimates $x^\dagger$ as the minimizer $x_\alpha^\varepsilon$ of the functional

$$\Phi(\alpha; x) = \|Ax - y^\varepsilon\|^2 + \alpha \|Bx\|^2, \tag{9}$$
where $\|\cdot\|$ is the norm induced by an appropriate scalar product $\langle \cdot, \cdot \rangle$, $B$ is a symmetric positive definite (or positive semidefinite) matrix, and $\alpha > 0$ is the regularization parameter to be chosen properly. In satellite geodesy, $B$ is almost always chosen by using the formulation of Bayesian statistics, where a problem (8) with noisy data is written in the form of a standard Gauss-Markov model

$$y^\varepsilon = A x^\dagger + \varepsilon, \tag{10}$$
and the observation error $\varepsilon$ is assumed to be a random vector with zero expectation $E\varepsilon = 0$ and covariance matrix $\operatorname{cov} \varepsilon = \delta^2 P$; here $\delta$ is a small positive number measuring the noise intensity. In the next section, we discuss how Bayesian statistics converts a priori information about the noise covariance structure, given in the form of a matrix $P$, into the choice of a matrix $B$ in (9). At the same time, several authors (see, e.g., Klees et al. 2003) note that the Bayesian approach cannot be used at all if the covariance matrix $\operatorname{cov} \varepsilon$ is not known exactly. On the other hand, for new satellite missions one cannot expect to have a good description of the noise covariance. Indeed, as indicated by Kusche and Klees (2002), first, the measurement equipment on board a satellite will never be validated in orbit, and second, the measurements will also be contaminated with an aliasing signal caused by the unmodeled high frequencies of the gravitational field. It is also worth noting that numerical experiments reported in Bauer and Pereverzev (2006) show that the use of a rough approximation of the covariance operator in Tikhonov regularization leads to very poor performance. Therefore, one needs algorithms that are capable of dealing with different noise models. One such algorithm has recently been discussed in Bauer et al. (2007), but it has been designed only for estimating bounded linear functionals of the solution $x^\dagger$. The goal of the present chapter is to discuss another way of regularization, which is based on the minimization of a Tikhonov-type functional with several regularization parameters:

$$\Phi(\alpha_1, \alpha_2, \ldots, \alpha_l; x) = \|Ax - y^\varepsilon\|^2 + \sum_{i=1}^{l} \alpha_i \|B_i x\|^2. \tag{11}$$
The application of such a multiple parameter regularization in satellite geodesy has been advocated in Xu (1992) and Xu and Rummel (1994). In these papers, the matrices $B_i$ have been chosen to be equal to

$$B_i = \begin{pmatrix} 0 & & & & \\ & \ddots & & & \\ & & I_i & & \\ & & & \ddots & \\ & & & & 0 \end{pmatrix}, \tag{12}$$

where $I_i$ is an identity matrix whose dimension is equal to the number of harmonic coefficients of the corresponding degree. On the other hand, in Xu et al. (2006) it has been observed that the multiple parameter regularization (11) and (12) only marginally improved on its single parameter version (9) with $B$ equal to the identity matrix $I$. At this point, it is worth noting that in regularization theory (see, e.g., Engl et al. 1996) one chooses the matrix $B$ in (9) on the basis of a priori knowledge about the smoothness of the solution. For example, if it is known a priori that $x^\dagger$ admits a representation $x^\dagger = B^{-p} v$ for $p > 0$ and $\|v\| \le \rho$, then the theory suggests using such a $B$ in (9). It is interesting to note that in view of the paper Svensson (1983) (see also Freeden and Pereverzev 2001), it is reasonable to believe that in an appropriate scale of Sobolev smoothness classes the Earth's gravitational potential has a smoothness index greater than 3/2. In Sect. 2, we show how this information can be transformed into an appropriate choice of the regularizing matrix $B$. This choice together with the above-mentioned conclusion of Xu et al. (2006) suggests the following form of the Tikhonov multiple regularization functional

$$\Phi(\alpha, \beta; x) = \|Ax - y^\varepsilon\|^2 + \alpha \|Bx\|^2 + \beta \|x\|^2, \tag{13}$$
where $B$ is used in the description of the solution smoothness. In Sect. 3, we discuss a choice of the regularization parameters $\alpha$, $\beta$ that allows an optimal order of the reconstruction accuracy under standard smoothness assumptions. Numerical aspects of the implementation of this parameter choice are discussed in Sect. 4. In Sect. 5, we present some numerical experiments illustrating the theoretical results.
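As a small illustration of the functional (13), the following sketch computes its minimizer for fixed $\alpha$, $\beta$ by solving the associated normal (Euler) equation $(A^T A + \alpha B^T B + \beta I)x = A^T y^\varepsilon$. The toy operator, the choice of $B$, the noise level, and all function names are assumptions made for the example and are not taken from the chapter.

```python
import numpy as np

def solve_two_parameter_tikhonov(A, B, y, alpha, beta):
    """Minimizer of ||A x - y||^2 + alpha ||B x||^2 + beta ||x||^2,
    obtained from the normal equation (A^T A + alpha B^T B + beta I) x = A^T y."""
    n = A.shape[1]
    lhs = A.T @ A + alpha * (B.T @ B) + beta * np.eye(n)
    return np.linalg.solve(lhs, A.T @ y)

def functional(A, B, y, alpha, beta, x):
    """Value of the two-parameter Tikhonov functional at x."""
    return (np.linalg.norm(A @ x - y) ** 2
            + alpha * np.linalg.norm(B @ x) ** 2
            + beta * np.linalg.norm(x) ** 2)

# Hypothetical toy problem: diagonal operator with exponentially decaying
# singular values (R/r)^k, in the spirit of (5); one entry per degree k.
rng = np.random.default_rng(0)
k = np.arange(60)
A = np.diag((6378.0 / 6628.0) ** k)
B = np.diag((k + 0.5) ** 1.5)          # assumed smoothness-type weight
x_true = (k + 0.5) ** -3.0             # assumed smooth coefficient decay
y_noisy = A @ x_true + 1e-4 * rng.standard_normal(k.size)

alpha, beta = 1e-6, 1e-6               # illustrative values only
x_hat = solve_two_parameter_tikhonov(A, B, y_noisy, alpha, beta)

# Gradient of the functional at x_hat (should vanish up to rounding).
grad = 2 * (A.T @ (A @ x_hat - y_noisy) + alpha * (B.T @ B) @ x_hat + beta * x_hat)
```

The gradient check at the end only confirms that the computed vector is a stationary point of the quadratic functional; it says nothing about how $\alpha$ and $\beta$ should be chosen, which is the subject of the following sections.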
2 A Functional Analysis Point of View on Satellite Geodetic Problems
Due to the huge number of observations and unknowns, it is reasonable to analyze (8) and (10) as operator equations in Hilbert spaces with the design operator $A$ acting compactly from the solution space $X$ into the observation space $Y$. In this context, the covariance $P$ can be seen as a bounded self-adjoint nonnegative operator from $Y$ to $Y$ such that for any $f, g \in Y$ there holds $E[\langle f, \varepsilon \rangle \langle g, \varepsilon \rangle] = \delta^2 \langle Pf, g \rangle$, where $\langle \cdot, \cdot \rangle$ denotes the scalar product of the corresponding Hilbert space. If

$$A = \sum_{i=1}^{n_A} a_i\, u_i\, \langle v_i, \cdot \rangle, \qquad n_A = \operatorname{rank}(A) = \dim \operatorname{Range}(A), \tag{14}$$
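is the SVD of the design operator, then it is natural to assume that for random noise $\varepsilon$ the Fourier coefficients $\langle u_i, \varepsilon \rangle$ are uncorrelated random variables. This assumption allows us to treat the covariance $P$ as a diagonal operator with respect to the system $\{u_i\}$, since the observation error $\varepsilon$ is assumed to be zero-mean such that $E\langle u_i, \varepsilon \rangle = 0$, $i = 1, 2, \ldots$, and for $i \ne j$

$$\langle P u_i, u_j \rangle = \delta^{-2} E[\langle u_i, \varepsilon \rangle \langle u_j, \varepsilon \rangle] = \delta^{-2} E\langle u_i, \varepsilon \rangle\, E\langle u_j, \varepsilon \rangle = 0.$$

Thus,

$$P = \sum_{i=1}^{n_P} p_i\, u_i\, \langle u_i, \cdot \rangle, \qquad n_P = \operatorname{rank}(P), \tag{15}$$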
where $p_i = \delta^{-2} E[\langle u_i, \varepsilon \rangle^2]$. In agreement with the Bayesian approach, not only the covariance $P$ is introduced as a priori information, but also the expectation $x_0 = E x^\dagger$, which gives one more observation equation

$$x^\dagger = x_0 + \xi, \qquad E\xi = 0, \qquad \operatorname{cov} \xi = \sigma^2 Q,$$

where the covariance $Q$ is also assumed to be a bounded self-adjoint nonnegative operator, i.e., $Q = Q^T \ge 0$, and $\sigma > 0$. Keeping in mind that $x^\dagger \in \operatorname{Range}(A^T)$, it is natural to assume that $\xi = \sum_i \xi_i v_i$ with uncorrelated random Fourier coefficients $\xi_i = \langle v_i, \xi \rangle$. Therefore, as in the case of $\operatorname{cov} \varepsilon$,

$$Q = \sum_{i=1}^{n_Q} q_i\, v_i\, \langle v_i, \cdot \rangle, \qquad q_i = \sigma^{-2} E[\langle v_i, \xi \rangle^2]. \tag{16}$$
Then within the framework of the Bayesian approach, the estimate $\hat{x}$ of the unknown element $x^\dagger$ follows from the normal equation (see, e.g., Kusche and Klees 2002)

$$\left( \delta^{-2} A^T P^{-1} A + \sigma^{-2} Q^{-1} \right) \hat{x} = \delta^{-2} A^T P^{-1} y^\varepsilon + \sigma^{-2} Q^{-1} x_0. \tag{17}$$
At this point it is worth mentioning that, as noted in Xu et al. (2006), to obtain a stabilized solution of the geopotential from space geodetic observations, one of the most widely used methods was and still is to employ Kaula's rule of thumb on the spherical harmonic coefficients, which can be written as follows:

$$\hat{x} = \left( A^T P^{-1} A + K \right)^{-1} A^T P^{-1} y^\varepsilon, \tag{18}$$
where $K$ is diagonal with its elements being inversely proportional to Kaula's rule of degree variances. The solution of type (18) has often been interpreted in terms of Bayesian inference, since it solves a particular case of (17), where $x_0 = 0$ and

$$Q = \frac{\delta^2}{\sigma^2} K^{-1}.$$

At the same time, Xu (1992) argued that because Kaula's rule of thumb reflects only possible magnitudes of harmonic coefficients, and because the expected values of these coefficients are indeed not equal to zero ($x_0 \ne 0$), it is questionable to interpret (18) from the Bayesian point of view. Instead, Xu (1992) and Xu et al. (2006) interpreted (18) as a regularization of (10). Note that in view of (14)–(16), by introducing

$$\alpha = \frac{\delta^2}{\sigma^2}, \qquad \bar{P} = \sum_{i=1}^{n_P} p_i\, v_i\, \langle v_i, \cdot \rangle, \qquad B = \left( Q^{-1} \bar{P} \right)^{1/2}, \tag{19}$$

we can reduce (17) with $x_0 = 0$ to

$$\left( A^T A + \alpha B^2 \right) \hat{x} = A^T y^\varepsilon, \tag{20}$$
which is nothing but the Euler equation for the minimization of the Tikhonov functional (9). Thus, the solution $\hat{x} = x_\alpha^\varepsilon$ of (20), or (17) (with $x_0 = 0$, as in Kaula's rule), is the minimizer of (9) with $B$ given by (19). This allows an interpretation of the regularization parameter $\alpha$ as the ratio of the observation noise level $\delta^2$ to the unknown variance $\sigma^2$. Moreover, in view of the relations $Q = \bar{P} B^{-2}$, $\sigma^2 K = \delta^2 B^2 \bar{P}^{-1}$, the choice of the prior covariance or of Kaula's operator $K$ means the choice of a regularizing operator $B$ in (9). On the other hand, regularization theory provides a theoretical justification for the Tikhonov regularization (9) under the assumption that the smoothness of the unknown solution $x^\dagger$ as well as the smoothing properties of the design operator $A$ can be measured in terms of a regularizing operator $B$. For example, from Engl et al. (1996) it is known that if for any $x \in X$

$$m \|B^{-a} x\| \le \|Ax\| \le M \|B^{-a} x\|, \tag{21}$$

and

$$\|B^{p} x^\dagger\| \le \rho, \tag{22}$$

with some positive constants $a$, $p$, $m$, $M$, and $\rho$, then the Tikhonov regularization approximation $x_\alpha^\varepsilon$ given as the minimizer of (9) with $\alpha = O\!\left( \|\varepsilon\|^{\frac{2(a+1)}{a+p}} \right)$ provides the error bound

$$\|x^\dagger - x_\alpha^\varepsilon\| = O\!\left( \|\varepsilon\|^{\frac{p}{a+p}} \right) \tag{23}$$
for $p \le a + 2$. Observe that in terms of the covariance $P$ (15) the expected value of the norm $\|\varepsilon\|^2$ can be written as follows:

$$E\|\varepsilon\|^2 = \delta^2 \sum_{i=1}^{n_P} p_i.$$

Then it is natural to assume that the norm of the actual realization of the random variable $\varepsilon$ can be estimated as

$$\|\varepsilon\| = \|A x^\dagger - y^\varepsilon\| \le c\,\delta, \tag{24}$$
where $c$ is some fixed constant. It is interesting to note that within the problem setting (21), (22), and (24) the order of accuracy

$$\|x^\dagger - x_\alpha^\varepsilon\| = O\!\left( \delta^{\frac{p}{a+p}} \right), \tag{25}$$
given by (23) and (24), cannot be improved by any other regularization method (see, e.g., Nair et al. 2005). Our goal now is to clarify the assumptions (21) and (22) in the space geodetic context. Of particular interest for our consideration are the spherical Sobolev spaces (Freeden 1999). Starting with an unbounded self-adjoint strictly positive definite operator in $L^2(\Omega_R)$,

$$D f(t) := \sum_{k=0}^{\infty} \sum_{i=1}^{2k+1} \left( k + \frac{1}{2} \right) Y_{k,i}^{R}(t)\, \left\langle Y_{k,i}^{R}, f \right\rangle_{L^2(\Omega_R)},$$

where $Y_{k,i}^{R}(t) = \frac{1}{R} Y_{k,i}\!\left( \frac{t}{R} \right)$ are $L^2(\Omega_R)$-orthonormal spherical harmonics, we introduce the space

$$h_s(\Omega_R) = \left\{ f : \|D^s f\|^2 = \sum_{k=0}^{\infty} \sum_{i=1}^{2k+1} \left( k + \frac{1}{2} \right)^{2s} \left\langle Y_{k,i}^{R}, f \right\rangle_{L^2(\Omega_R)}^2 < \infty \right\}$$

with the scalar product $\langle f, g \rangle_s := \langle D^s f, D^s g \rangle_{L^2(\Omega_R)}$ and the associated norm $\|f\|_s = \langle f, f \rangle_s^{1/2}$. The spherical Sobolev space $H_s(\Omega_R)$ is the completion of $h_s(\Omega_R)$ under this norm. In particular, $H_0(\Omega_R) = L^2(\Omega_R)$. The family $\{H_s(\Omega_R)\}$, $s \in (-\infty, \infty)$, of spherical Sobolev spaces is a particular example of a so-called Hilbert scale. Any such scale allows the following interpolation inequality relating the norms of the same element in different spaces (see, e.g., Engl et al. 1996):

$$\|f\|_{\nu} \le \|f\|_{\mu}^{\frac{s - \nu}{s - \mu}}\, \|f\|_{s}^{\frac{\nu - \mu}{s - \mu}}, \tag{26}$$
where $-\infty < \mu < \nu < s < \infty$. The spherical Sobolev spaces are of particular interest in physical geodesy, since a mathematical study (Svensson 1983) shows that a square-integrable density inside the Earth's sphere $\Omega_R$ implies a potential $x^\dagger = F(t)$ of class $H_s(\Omega_R)$ with $s = \frac{3}{2}$. In addition, the known (unitless) leading coefficients $\langle x^\dagger, Y_{k,i}^{R} \rangle_{L^2(\Omega_R)}$ of the Earth's anomalous potential allow the estimates (see, e.g., Freeden and Pereverzev 2001)

$$\|x^\dagger\|_{3/2} := \left( \sum_{k=0}^{\infty} \sum_{i=1}^{2k+1} \left( k + \frac{1}{2} \right)^{3} \left\langle x^\dagger, Y_{k,i}^{R} \right\rangle^2 \right)^{1/2} \le 1.934 \times 10^{-3},$$

$$\|x^\dagger - P_{31} x^\dagger\|_{3/2} := \left( \sum_{k=32}^{\infty} \sum_{i=1}^{2k+1} \left( k + \frac{1}{2} \right)^{3} \left\langle x^\dagger, Y_{k,i}^{R} \right\rangle^2 \right)^{1/2} \le 4.3 \times 10^{-8}, \tag{27}$$

where $P_n$ is the orthoprojector onto the linear span of $\{Y_{k,i}^{R},\ i = 1, 2, \ldots, 2k+1;\ k = 0, 1, \ldots, n\}$. Thus, in the space geodetic context the assumption (22) is satisfied with any $B = D^q$, $p = \frac{3}{2q}$, $q > 0$, and $\rho \ge 1.934 \times 10^{-3}$. Now consider the assumption (21) on the smoothing properties of the design operator $A$. We analyze it in the case of the SGG problem, when the design operator $A = A_{SGG}$ is given by (6) and (7). A similar analysis can be made for $A = A_{DWC}$. Note that in regularization theory the number $a$ in (21) is usually interpreted as the degree of ill-posedness of a problem: the larger $a$, the smoother the image $Ax$, and the more ill-posed the problem (10). As one can see from (7), the singular values $a_j$ of the design operator $A_{SGG}$ decay exponentially fast as $j \to \infty$. On the other hand, for any operator $B = D^q$ the singular values $b_j$ of $B^{-a}$ have the form $b_j = (k + 1/2)^{-aq}$, $j = k^2 + i$, $i = 1, 2, \ldots, 2k+1$, $k = 0, 1, \ldots$, and decay at a power rate only. For this reason the SGG problem can be formally classified as exponentially or severely ill-posed, since in general no inequality of the form (21) is possible for $B = D^q$, $A = A_{SGG}$ and a finite $a > 0$. Nevertheless, we argue that for some satellite missions such as the Gravity field and steady-state Ocean Circulation Explorer (GOCE) (see, e.g., Rebhan et al. 2000), the satellite gravity gradiometry problem can be treated as mildly ill-posed.
Recall that the aim of the GOCE mission (see chapters Gauss' and Weber's "Atlas of Geomagnetism" (1840) Was not the First: The History of the Geomagnetic Atlases, Earth Observation Satellite Missions and Data Access, and Geodetic Boundary Value Problem of this book) is to provide a high-accuracy model of the Earth's gravitational field based on potential coefficients $\langle x^\dagger, Y_{k,i}^{R} \rangle_{L^2(\Omega_R)}$ up to degree $k = 300$. It is interesting to observe that up to this degree the singular values $a_j = a_{k,i} = (R/r)^k (k+1)(k+2)/r^2$, $j = k^2 + i$, behave like $(k + 1/2)^{-s}$ with $s = 5.5$. A straightforward calculation shows that, assuming a mean Earth radius $R = 6{,}378 \times 10^3$ [m] and an altitude of the GOCE satellite of about $r - R = 250 \times 10^3$ [m], we obtain, in particular,

$$0.2\, (k + 1/2)^{-\frac{11}{2}} \le a_{k,i} \le 3\, (k + 1/2)^{-\frac{11}{2}}, \qquad k = 100, 101, \ldots, 300. \tag{28}$$
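The two-sided bound (28) can be checked numerically in a few lines. The sketch below evaluates the singular values from (7) with the radius and altitude quoted in the text and forms the ratio $a_{k,i} / (k + 1/2)^{-11/2}$ for $k = 100, \ldots, 300$; the variable names are chosen for this illustration only.

```python
import numpy as np

R = 6378e3           # mean Earth radius in metres, as quoted in the text
r = R + 250e3        # orbital radius for an altitude of about 250 km

k = np.arange(100, 301)
a_k = (R / r) ** k * (k + 1) * (k + 2) / r ** 2   # singular values from (7)
ratio = a_k * (k + 0.5) ** 5.5                    # a_k divided by (k + 1/2)^(-11/2)

# Smallest and largest ratio over the degree range of (28).
print(ratio.min(), ratio.max())
```

Both printed values lie between the constants 0.2 and 3 of (28), which is what makes the power-type model with exponent 11/2 a reasonable stand-in for the exponential decay over this limited range of degrees.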
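This observation gives a hint that within the GOCE data processing it is reasonable to approximate $A_{SGG}$ by the design operator

$$\tilde{A}_{SGG} := \sum_{k=0}^{300} \left( \frac{R}{r} \right)^{k} \frac{(k+1)(k+2)}{r^2} \sum_{i=1}^{2k+1} Y_{k,i}^{r} \left\langle Y_{k,i}^{R}, \cdot \right\rangle_{L^2(\Omega_R)} + \sum_{k=301}^{\infty} \left( k + \frac{1}{2} \right)^{-\frac{11}{2}} \sum_{i=1}^{2k+1} Y_{k,i}^{r} \left\langle Y_{k,i}^{R}, \cdot \right\rangle_{L^2(\Omega_R)},$$

where $Y_{k,i}^{r}(t) = \frac{1}{r} Y_{k,i}\!\left( \frac{t}{r} \right)$. Using (28) and the definition of the operator $D$ generating the scale of spherical Sobolev spaces $\{H_s(\Omega_R)\}$, one can easily find constants $m$, $M$ such that for any $x \in L^2(\Omega_R)$

$$m \left\| D^{-\frac{11}{2}} x \right\|_{L^2(\Omega_R)} \le \left\| \tilde{A}_{SGG}\, x \right\|_{L^2(\Omega_r)} \le M \left\| D^{-\frac{11}{2}} x \right\|_{L^2(\Omega_R)}. \tag{29}$$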
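This means that the approximate design operator $\tilde{A}_{SGG}$ satisfies the assumption (21) for any $B = D^q$ and $a = \frac{11}{2q}$, $q > 0$. Consider now the solution $\tilde{x}^\dagger$ of the approximate SGG problem

$$\tilde{A}_{SGG}\, x = y_0, \tag{30}$$

where the data $y_0 = A_{SGG}\, x^\dagger$ correspond to the exact potential $x^\dagger$. Then (29) together with the interpolation inequality (26), where $\mu = -\frac{11}{2}$, $\nu = 0$, $s = \frac{3}{2}$, yields

$$\left\| x^\dagger - \tilde{x}^\dagger \right\|_{L^2(\Omega_R)} \le \left\| x^\dagger - \tilde{x}^\dagger \right\|_{-\frac{11}{2}}^{\frac{3}{14}} \left\| x^\dagger - \tilde{x}^\dagger \right\|_{\frac{3}{2}}^{\frac{11}{14}} \le \left( \frac{\left\| \tilde{A}_{SGG} (x^\dagger - \tilde{x}^\dagger) \right\|_{L^2(\Omega_r)}}{m} \right)^{\frac{3}{14}} \left\| x^\dagger - \tilde{x}^\dagger \right\|_{\frac{3}{2}}^{\frac{11}{14}}.$$

From the definition of $A_{SGG}$ and $\tilde{A}_{SGG}$, it follows that

$$\left\langle Y_{k,i}^{R}, \tilde{x}^\dagger \right\rangle_{L^2(\Omega_R)} = \left\langle Y_{k,i}^{R}, x^\dagger \right\rangle_{L^2(\Omega_R)} = a_{k,i}^{-1} \left\langle Y_{k,i}^{r}, y_0 \right\rangle_{L^2(\Omega_r)}$$

up to the degree $k = 300$, while for $k = 301, 302, \ldots$ we have $a_{k,i} < \left( k + \frac{1}{2} \right)^{-\frac{11}{2}}$ and

$$\left| \left\langle Y_{k,i}^{R}, \tilde{x}^\dagger \right\rangle_{L^2(\Omega_R)} \right| = \left( k + \tfrac{1}{2} \right)^{\frac{11}{2}} \left| \left\langle Y_{k,i}^{r}, y_0 \right\rangle_{L^2(\Omega_r)} \right| < a_{k,i}^{-1} \left| \left\langle Y_{k,i}^{r}, y_0 \right\rangle_{L^2(\Omega_r)} \right| = \left| \left\langle Y_{k,i}^{R}, x^\dagger \right\rangle_{L^2(\Omega_R)} \right|.$$

This means that

$$\left\| x^\dagger - \tilde{x}^\dagger \right\|_{\frac{3}{2}} \le \left\| x^\dagger - P_{300}\, x^\dagger \right\|_{\frac{3}{2}} \le \left\| x^\dagger - P_{31}\, x^\dagger \right\|_{\frac{3}{2}}.$$

On the other hand, it is easy to see that

$$\begin{aligned} \left\| \tilde{A}_{SGG} (x^\dagger - \tilde{x}^\dagger) \right\|_{L^2(\Omega_r)} &= \left\| (A_{SGG} - \tilde{A}_{SGG})\, x^\dagger \right\|_{L^2(\Omega_r)} = \left\| (A_{SGG} - \tilde{A}_{SGG})(I - P_{300})\, x^\dagger \right\|_{L^2(\Omega_r)} \\ &\le \left( 300 + \tfrac{1}{2} \right)^{-\frac{11}{2}} \left\| (I - P_{300})\, x^\dagger \right\|_{L^2(\Omega_R)} \le \left( 300 + \tfrac{1}{2} \right)^{-7} \left\| (I - P_{300})\, x^\dagger \right\|_{\frac{3}{2}} \\ &\le 4.5 \times 10^{-18} \left\| x^\dagger - P_{31}\, x^\dagger \right\|_{\frac{3}{2}}. \end{aligned}$$

Summing up and using (27) we obtain

$$\left\| x^\dagger - \tilde{x}^\dagger \right\|_{L^2(\Omega_R)} \le 2 \cdot 10^{-4} \left\| x^\dagger - P_{31}\, x^\dagger \right\|_{\frac{3}{2}} \le 10^{-11}.$$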
Thus, approximating the solution $x^\dagger$ of a severely ill-posed problem $A_{SGG}\, x = y_0$ by the solution $\tilde{x}^\dagger$ of a moderately ill-posed problem $\tilde{A}_{SGG}\, x = y_0$, one can recover the gravitational potential $x^\dagger$ with an accuracy of order $10^{-11}$. At the same time, as indicated in Freeden and Pereverzev (2001, Table 3), the expected accuracy for a mission with the same altitude as GOCE is about $4 \times 10^{-11}$. This means that within the desired accuracy of GOCE data processing, it is possible to treat the satellite gravity gradiometry problem as a moderately ill-posed equation $\tilde{A}_{SGG}\, x = y^\varepsilon$ with a degree of ill-posedness $a = \frac{11}{2}$, say. Moreover, from (25) it follows that in terms of an observation error norm $\|\varepsilon\| = \delta$ the accuracy provided for such an equation by the Tikhonov regularization (9) with $B = D^q$, $q > 0$, has at least the order $O\!\left( \delta^{\frac{3}{14}} \right)$.
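The exponent $\frac{3}{14}$ can be traced explicitly from the two assumptions. With $B = D^q$, assumption (22) holds for $p = \frac{3}{2q}$ and assumption (21) holds for $\tilde{A}_{SGG}$ with $a = \frac{11}{2q}$, so the rate in (25) does not depend on the particular $q > 0$:

$$\frac{p}{a + p} = \frac{\frac{3}{2q}}{\frac{11}{2q} + \frac{3}{2q}} = \frac{3}{14}.$$

The restriction $p \le a + 2$ accompanying (23) is satisfied automatically here, since $\frac{3}{2q} < \frac{11}{2q} + 2$ for every $q > 0$.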
3 An Appearance of a Multiparameter Regularization in the Geodetic Context: Theoretical Aspects
When determining the gravity field of the Earth from satellite data, we have to keep in mind that (8) and (10) also contain a modeling error, sometimes called aliasing. Such an error may be caused, e.g., by a deviation of a satellite from a circular orbit, or by the fact that when locally determining the geopotential, the effects of outer zones are neglected. Due to the modeling error, the design operator/matrix $A$ is specified inexactly, and we represent it as

$$A = A_0 + hE, \tag{31}$$
where $A_0$ is the exact design operator/matrix and $hE$ is the modeling error. Recall that for well-posed problems of the form (8) and (10) the total least squares method (TLS) takes care of additional perturbations in the design operator $A_0$ and is a well-accepted generalization of classical least squares (see, e.g., Van Huffel and Vanderwalle 1991). In the TLS method some estimate $(\hat{x}, \hat{y}, \hat{A})$ for $(x^\dagger, y_0, A_0)$ from given data $(y^\varepsilon, A)$ is determined by solving the constrained minimization problem

$$\|\hat{A} - A\|^2 + \|\hat{y} - y^\varepsilon\|^2 \to \min \quad \text{subject to } \hat{A} \hat{x} = \hat{y}. \tag{32}$$
O are the unknowns, and in the finite dimensional case In the problem, .x; O y; O A/ the norms in (32) are the Frobenius matrix norm and the Euclidean vector norm respectively. For ill-posed problems (8) and (10) it may happen that there does not exist any solution xO 2 X of the problem (32). Furthermore, if there exists a TLS-solution x, O it may be far away from the desired solution x - . Therefore, it is quite natural to restrict the set of admissible solutions by searching for approximations xO that belong to some prescribed set S , which is the philosophy of regularized total least squares (RTLS) (see Golub et al. 1999). The simplest case occurs when the set S is a ball S D fx 2 X W jjBxjj g, where B is some densely defined self-adjoint strictly positive operator, and is a prescribed radius. This leads to the RTLS-method, in which some estimate O for .x - , y0 , A0 / is determined by solving the constrained minimization .x; O y; O A/ problem O jjB xjj O : jjAO Ajj2 C jjyO y jj2 ! min subject to AOxO D y;
(33)
Note that several authors (see, e.g., Lu et al. 2009) reported an undesirable sensitivity of RTLS solutions to a misspecification of the bound $\rho$, which ideally should be chosen as $\rho = \|B x^\dagger\|$. But in satellite geodesy one can easily overcome this drawback by using known bounds of the form (27). For example, it is known (see, e.g., Freeden and Pereverzev 2001) that

$$\left\| x^\dagger - P_{1} x^\dagger \right\|_{\frac{3}{2}} := \left( \sum_{k=2}^{\infty} \sum_{i=1}^{2k+1} \left( k + \frac{1}{2} \right)^{3} \left\langle x^\dagger, Y_{k,i}^{R} \right\rangle^2 \right)^{1/2} \le 3.033 \times 10^{-7}. \tag{34}$$
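Then within the framework of the RTLS method it is reasonable to take

$$B = B_\epsilon = \epsilon \sum_{k=0}^{1} \sum_{i=1}^{2k+1} Y_{k,i}^{R} \left\langle Y_{k,i}^{R}, \cdot \right\rangle + \sum_{k=2}^{\infty} \sum_{i=1}^{2k+1} \left( k + \frac{1}{2} \right)^{\frac{3}{2}} Y_{k,i}^{R} \left\langle Y_{k,i}^{R}, \cdot \right\rangle, \qquad \rho = 3.033 \times 10^{-7}, \tag{35}$$

where $\epsilon$ is some small positive number introduced with the aim of keeping $B = B_\epsilon$ strictly positive. Note that the reason to use $\rho = \|x^\dagger - P_1 x^\dagger\|_{3/2}$ instead of $\rho = \|x^\dagger\|_{3/2}$ is that in satellite geodesy a regularization is often not applied to the first few degrees of the model (see, e.g., Xu et al. 2006). Let us now summarize some characterizations of RTLS solutions that serve as a starting point for developing algorithms to construct such solutions. From Golub et al. (1999) and Beck et al. (2006) we have

Theorem 1. If in the problem (33) the constraint $\|B \hat{x}\| \le \rho$ is active, then the RTLS solution $\hat{x} = x_{\alpha,\beta}^{\varepsilon}$ satisfies the equations

$$\left( A^T A + \alpha B^2 + \beta I \right) x_{\alpha,\beta}^{\varepsilon} = A^T y^\varepsilon \tag{36}$$

and

$$\left\| B x_{\alpha,\beta}^{\varepsilon} \right\| = \rho, \tag{37}$$

where the parameters $\alpha$, $\beta$ satisfy

$$\alpha = \frac{1}{\rho^2} \left( \beta + \|y^\varepsilon\|^2 - \left\langle y^\varepsilon, A x_{\alpha,\beta}^{\varepsilon} \right\rangle \right), \qquad \beta = - \frac{\left\| A x_{\alpha,\beta}^{\varepsilon} - y^\varepsilon \right\|^2}{1 + \left\| x_{\alpha,\beta}^{\varepsilon} \right\|^2}. \tag{38}$$

Moreover, the RTLS solution $\hat{x} = x_{\alpha,\beta}^{\varepsilon}$ is also the solution of the constrained minimization problem

$$\frac{\|Ax - y^\varepsilon\|^2}{1 + \|x\|^2} \to \min \quad \text{subject to } \|Bx\| \le \rho. \tag{39}$$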
From Theorem 1 it follows that in the case of an active inequality constraint in (33), the RTLS solution $\hat{x} = x_{\alpha,\beta}^{\varepsilon}$ minimizes the Tikhonov multiple regularization functional $\Phi(\alpha, \beta; x)$, since (36) is just the Euler equation for the minimization of (13). In this way multiparameter regularization naturally appears in the RTLS method, but from (38) it can be seen that in the RTLS method one of the regularization parameters is negative, which is unusual for Tikhonov regularization schemes, where regularization parameters are assumed to be positive. Nevertheless, the system (38) can be considered as an a posteriori rule for choosing regularization parameters. This rule does not require knowledge of the modeling error $h$ in (31) or of the observation error $\delta$ in (24), but in general the system (38) is hardly tractable.
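The first relation in (38) can be recovered from (36), (37), and the second relation in (38) by a short calculation. The following sketch assumes that $B$ is self-adjoint and writes $x$ for $x_{\alpha,\beta}^{\varepsilon}$ and $y$ for $y^\varepsilon$. Taking the scalar product of (36) with $x$ gives

$$\|Ax\|^2 + \alpha \|Bx\|^2 + \beta \|x\|^2 = \langle y, Ax \rangle.$$

The second relation in (38) can be rewritten as $\beta \|x\|^2 = -\|Ax - y\|^2 - \beta$. Substituting this together with $\|Bx\| = \rho$ yields

$$\alpha \rho^2 = \langle y, Ax \rangle - \|Ax\|^2 + \|Ax - y\|^2 + \beta = \beta + \|y\|^2 - \langle y, Ax \rangle,$$

where the last step expands $\|Ax - y\|^2 = \|Ax\|^2 - 2\langle y, Ax \rangle + \|y\|^2$. Dividing by $\rho^2$ gives the stated expression for $\alpha$.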
An alternative to the RTLS method is the method of dual regularized total least squares (DRTLS), which has been recently proposed in Lu et al. (2009). In this method, the levels of the modeling error $h$ and of the observation error $\delta$ are assumed to be known. Then in the DRTLS method, some estimate $(\hat{x}, \hat{y}, \hat{A})$ for $(x^\dagger, y_0, A_0)$ is determined by solving the constrained minimization problem

$$\|B \hat{x}\| \to \min \quad \text{subject to } \hat{A} \hat{x} = \hat{y}, \quad \|\hat{A} - A\| \le h, \quad \|\hat{y} - y^\varepsilon\| \le \delta, \tag{40}$$
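where $B$ is the same as in (33). From Lu et al. (2009) we have the following theorem, which provides us with a characterization of DRTLS solutions.

Theorem 2. If in the problem (40) the two inequality constraints are active, then the DRTLS solution $\hat{x} = x_{\alpha,\beta}^{\varepsilon}$ satisfies (36), where the regularization parameters $\alpha$, $\beta$ solve the following system of nonlinear equations:

$$\left\| A x_{\alpha,\beta}^{\varepsilon} - y^\varepsilon \right\| = \delta + h \left\| x_{\alpha,\beta}^{\varepsilon} \right\|, \qquad \beta = - \frac{h \left( \delta + h \left\| x_{\alpha,\beta}^{\varepsilon} \right\| \right)}{\left\| x_{\alpha,\beta}^{\varepsilon} \right\|}. \tag{41}$$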
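For $h = 0$ the system (41) gives $\beta = 0$ and the single condition $\|A x_\alpha^\varepsilon - y^\varepsilon\| = \delta$, i.e., the discrepancy principle for the one-parameter functional (9). The sketch below solves this scalar equation by bisection in $\log \alpha$, using the fact that the Tikhonov residual is nondecreasing in $\alpha$. The toy operator, the bracket for $\alpha$, and the function names are assumptions made for this illustration; this is not the model function method of Sect. 4.

```python
import numpy as np

def tikhonov(A, B, y, alpha):
    """Minimizer of ||A x - y||^2 + alpha ||B x||^2 via its normal equation."""
    return np.linalg.solve(A.T @ A + alpha * (B.T @ B), A.T @ y)

def residual(A, B, y, alpha):
    return np.linalg.norm(A @ tikhonov(A, B, y, alpha) - y)

def discrepancy_alpha(A, B, y, delta, lo=1e-16, hi=1e4, iters=200):
    """Bisection in log(alpha) for residual(alpha) = delta.
    Assumes residual(lo) < delta < residual(hi)."""
    for _ in range(iters):
        mid = np.sqrt(lo * hi)          # midpoint on a logarithmic scale
        if residual(A, B, y, mid) < delta:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)

# Hypothetical toy problem with decaying singular values and B = I.
rng = np.random.default_rng(2)
n = 30
A = np.diag(0.8 ** np.arange(n))
B = np.eye(n)
x_true = 1.0 / (np.arange(n) + 1.0) ** 2
delta = 1e-2
noise = rng.standard_normal(n)
noise *= delta / np.linalg.norm(noise)   # noise with norm exactly delta
y_noisy = A @ x_true + noise

alpha_star = discrepancy_alpha(A, B, y_noisy, delta)
res_star = residual(A, B, y_noisy, alpha_star)
```

For $h \ne 0$ the two equations in (41) are coupled through $\|x_{\alpha,\beta}^{\varepsilon}\|$, so a one-dimensional bisection of this kind no longer suffices.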
It is worth noting that in the special case of no modeling error, $h = 0$, we have $\beta = 0$, and the DRTLS solution $\hat{x} = x_{\alpha,0}^{\varepsilon}$ reduces to the Tikhonov regularized approximation $x_\alpha^\varepsilon$ with $\alpha$ chosen by the discrepancy principle such that $\|A x_\alpha^\varepsilon - y^\varepsilon\| = \delta$. In the case $h \ne 0$, we again meet a multiparameter regularization (36), where one of the regularization parameters is negative. The system (41) gives us a rule for choosing the parameters. In contrast to the RTLS method, this rule does not require a reliable bound for the solution norm $\|B x^\dagger\|$ and uses only the bounds for the error levels $\delta$, $h$. Thus, in some sense the methods (36)–(38) and (36), (41) complement each other. Moreover, in terms of the error levels $\delta$, $h$ both methods guarantee an accuracy of the same order, provided that the standard smoothness conditions (21) and (22) are satisfied. This can be seen from the following theorem (Lu et al. 2009).

Theorem 3. Assume the conditions (21) and (22) hold with $1 \le p \le a + 2$. Let $\hat{x} = x_{\alpha,\beta}^{\varepsilon}$ be the DRTLS solution given by (36) and (41). If in (40) the inequality constraints are active, then

$$\|x^\dagger - \hat{x}\| \le 2 \left\| B^p x^\dagger \right\|^{\frac{a}{a+p}} \left( \frac{\delta + h \|x^\dagger\|}{m} \right)^{\frac{p}{p+a}} = O\!\left( (\delta + h)^{\frac{p}{p+a}} \right).$$

In addition, let $\hat{x} = x_{\alpha,\beta}^{\varepsilon}$ be the RTLS solution given by (36)–(38). If the exact solution $x^\dagger$ satisfies the side condition $\|B x^\dagger\| = \rho$, then
-
p -
jjx xjj O .2jjB x jj/
a aCp
p ! pCa p p maxf1; g. 2 C 1/ D O .ı C h/ pCa : m
From the discussion in Lu et al. (2009), it follows that under the conditions of Theorem 3 the order of accuracy $O\!\left( (\delta + h)^{\frac{p}{p+a}} \right)$ cannot be improved by any other regularization method.
4 Computational Aspects of Some Multiparameter Regularization Schemes
In this section, we discuss computational aspects of the multiparameter regularization considered above. We start with the scheme (11) and (12). As was observed in Xu et al. (2006), in the determination of the geopotential from precise satellite orbits this scheme performs similarly to the iterative version of the single parameter regularization (20) with

$$B = I_0 = \begin{pmatrix} 0_2 & & & \\ & \ddots & & \\ & & 0_k & \\ & & & I \end{pmatrix}, \tag{42}$$

where $0_2, \ldots, 0_k$ are all zero matrices, corresponding to the harmonic coefficients of degrees 2 to $k$, which are supposed to be sufficiently precise and require no regularization. In Xu et al. (2006) the iterative version of (20) and (42) has been implemented by repeatedly replacing in (10) the true model parameter vector $x^\dagger$ with its estimate from the previous iteration. Such a scheme is sometimes called iterative Tikhonov regularization (see, e.g., Engl et al. 1996). The simulations of Xu et al. (2006) have shown that the iterative version of (20) and (42) is sufficient to derive a geopotential model of almost the best quality. Therefore, it can serve as a benchmark for the performance evaluation of other schemes. In our numerical illustrations presented in the next section, we use the iterative version of (20) and (42) with $k = 2$, since in the multiparameter regularizations discussed above the choice of the operator $B = B_\epsilon$ (35) has been made under the assumption that the harmonic coefficients $\langle x^\dagger, Y_{k,i}^{R} \rangle$ of degree $k = 2$ are sufficiently precise. Note that in view of (27) an extension of all constructions to degrees up to $k = 32$ is also possible. Our illustrations are performed with synthetic data simulated from a known exact solution $x^\dagger$. Therefore, in (20) and (42) we are able to choose the best $\alpha$ among a given set of regularization parameters. In this way, we create an ideal benchmark of the performance. We consider three iterations of (20) and (42). Increasing the number of iterations does not essentially change the results.
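A minimal sketch of an iterated Tikhonov scheme with a partially vanishing $B$ of the type (42) is given below. It uses the standard textbook form of the iteration, $x_n = x_{n-1} + (A^T A + \alpha B^T B)^{-1} A^T (y^\varepsilon - A x_{n-1})$ with $x_0 = 0$, which is an assumption here and need not coincide in detail with the implementation of Xu et al. (2006). The toy operator, the number of unregularized leading coefficients, and the function name are likewise hypothetical.

```python
import numpy as np

def iterated_tikhonov(A, B, y, alpha, n_iter=3):
    """Standard iterated Tikhonov: each step applies one Tikhonov solve
    to the current residual and adds the result as a correction."""
    lhs = A.T @ A + alpha * (B.T @ B)
    x = np.zeros(A.shape[1])
    residual_norms = []
    for _ in range(n_iter):
        x = x + np.linalg.solve(lhs, A.T @ (y - A @ x))
        residual_norms.append(np.linalg.norm(y - A @ x))
    return x, residual_norms

# Hypothetical toy problem: the first few coefficients are left unregularized,
# mimicking the zero blocks of (42); A has full column rank, so the system
# matrix A^T A + alpha B^T B stays invertible although B is singular.
rng = np.random.default_rng(3)
n, n_free = 30, 4
A = np.diag(0.8 ** np.arange(n))
B = np.diag(np.r_[np.zeros(n_free), np.ones(n - n_free)])
x_true = 1.0 / (np.arange(n) + 1.0) ** 2
y_noisy = A @ x_true + 1e-3 * rng.standard_normal(n)

x_it, res_norms = iterated_tikhonov(A, B, y_noisy, alpha=1e-3, n_iter=3)
```

In this form the residual norm cannot increase from one iteration to the next, because each step multiplies the residual by $I - A(A^T A + \alpha B^T B)^{-1} A^T$, a symmetric matrix with eigenvalues in $[0, 1]$.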
Let us briefly discuss computations with the RTLS-method. It is usually assumed that the radius $R$ in (33) is less than $\|Bx_{TLS}\|$, where $x_{TLS}$ denotes the solution of the TLS problem (32); otherwise no regularization would be necessary. Then at the minimizer of (33) the inequality constraint holds with equality, and from Theorem 1 it follows that the RTLS-solution belongs to the two-parameter family of elements $x_\varepsilon^{\alpha,\beta}$ which solve (36) and satisfy (37). In view of (39), a straightforward numerical approach to the RTLS-solution is to construct a representative sample of $x_\varepsilon^{\alpha,\beta}$ by solving (36) with

\[
\alpha = \alpha_i \in \{\alpha_1, \alpha_2, \ldots, \alpha_{\max}\}, \qquad \beta = \beta_j \in \{\beta_1, \beta_2, \ldots, \beta_{\max}\},
\]

and then to choose a sample element $x_\varepsilon^{\alpha_i,\beta_j}$ that meets (37) and gives the minimal value of the functional $\|Ax_\varepsilon^{\alpha,\beta} - y_\varepsilon\|^2 / (1 + \|x_\varepsilon^{\alpha,\beta}\|^2)$ among the samples. Of course, in general such a straightforward approach is rather expensive. In the literature, the condition (38) has been used to find the RTLS-solution from (36) via solutions of linear or quadratic eigenvalue problems. In both cases one has to solve a sequence of eigenvalue problems, and to make this approach tractable one should reuse as much information as possible from previous problems of the sequence. More details can be found, e.g., in Lampe and Voss (2009). For our numerical illustrations, we implement the RTLS-method in the straightforward way described above, since we are mainly interested in a comparison of results produced by different multiparameter regularization schemes, and the RTLS-method is only one of them.

At the end of the section, we briefly describe a strategy for finding two regularization parameters in the DRTLS-method (36) and (41). This strategy is based on an extension of the idea of model function approximation originally proposed in Kunisch and Zou (1998) for a realization of the discrepancy principle in the standard single-parameter Tikhonov regularization (9) with $B = I$. For the DRTLS-method, we need to derive a model function of two variables. To this end we consider the function $F(\alpha, \beta) = \Phi(\alpha, \beta; x_\varepsilon^{\alpha,\beta})$, where $\Phi$ is defined by (13), and $x_\varepsilon^{\alpha,\beta}$ is the solution of (36). Using the properties of the norm induced by a scalar product, one can easily check that

\[
F(\alpha, \beta) = \|y_\varepsilon\|^2 - \|Ax_\varepsilon^{\alpha,\beta}\|^2 - \alpha\|Bx_\varepsilon^{\alpha,\beta}\|^2 - \beta\|x_\varepsilon^{\alpha,\beta}\|^2. \tag{43}
\]
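The identity (43) can be checked in a few lines. The derivation below is a sketch under two assumptions, since (13) and (36) are not restated in this section: $\Phi(\alpha,\beta;x) = \|Ax - y_\varepsilon\|^2 + \alpha\|Bx\|^2 + \beta\|x\|^2$, and (36) is the corresponding normal equation $(A^*A + \alpha B^*B + \beta I)x = A^*y_\varepsilon$.

```latex
% x abbreviates x_\varepsilon^{\alpha,\beta}, the solution of the assumed normal equation
\begin{align*}
\langle Ax, y_\varepsilon\rangle
  &= \langle x, A^*y_\varepsilon\rangle
   = \langle x, (A^*A + \alpha B^*B + \beta I)x\rangle
   = \|Ax\|^2 + \alpha\|Bx\|^2 + \beta\|x\|^2, \\
F(\alpha,\beta)
  &= \|Ax\|^2 - 2\langle Ax, y_\varepsilon\rangle + \|y_\varepsilon\|^2
     + \alpha\|Bx\|^2 + \beta\|x\|^2 \\
  &= \|y_\varepsilon\|^2 - \|Ax\|^2 - \alpha\|Bx\|^2 - \beta\|x\|^2 .
\end{align*}
```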
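Moreover, similarly to Kunisch and Zou (1998), it can also be shown that

\[
\frac{\partial F}{\partial \alpha} = \|Bx_\varepsilon^{\alpha,\beta}\|^2, \qquad \frac{\partial F}{\partial \beta} = \|x_\varepsilon^{\alpha,\beta}\|^2. \tag{44}
\]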
Now the idea is to approximate the term $\|Ax_\varepsilon^{\alpha,\beta}\|^2$ in (43) locally at a given point $(\alpha, \beta)$ by the value $T\|x_\varepsilon^{\alpha,\beta}\|^2$, where $T$ is a positive constant to be determined. This approximation together with (43) and (44) gives us the approximate formula

\[
F \approx \|y_\varepsilon\|^2 - \alpha\frac{\partial F}{\partial \alpha} - (\beta + T)\frac{\partial F}{\partial \beta}.
\]

By a model function approximation we mean a parameterized family of functions $m(\alpha, \beta)$ for which this formula is exact, i.e., each model function should solve the partial differential equation

\[
m + \alpha\frac{\partial m}{\partial \alpha} + (\beta + T)\frac{\partial m}{\partial \beta} = \|y_\varepsilon\|^2.
\]

It is easy to check that a simple parametric family of solutions of this equation is given by

\[
m(\alpha, \beta) = \|y_\varepsilon\|^2 + \frac{C_1}{\alpha} + \frac{C_2}{T + \beta}, \tag{45}
\]
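where $C_1$, $C_2$, $T$ are parameters to be determined. To use $m(\alpha, \beta)$ for approximating $F(\alpha, \beta)$ locally in a neighborhood of a point $\alpha = a$, $\beta = b$, one can determine these parameters by the interpolation conditions

\[
\begin{cases}
m(a, b) = F(a, b), \\[2pt]
\dfrac{\partial m}{\partial \alpha}(a, b) = \dfrac{\partial F}{\partial \alpha}(a, b) = \|Bx_\varepsilon^{a,b}\|^2, \\[6pt]
\dfrac{\partial m}{\partial \beta}(a, b) = \dfrac{\partial F}{\partial \beta}(a, b) = \|x_\varepsilon^{a,b}\|^2.
\end{cases} \tag{46}
\]

Then the parameters can be derived explicitly as

\[
\begin{cases}
C_1 = C_1(a, b) = -\|Bx_\varepsilon^{a,b}\|^2 a^2, \\[4pt]
C_2 = C_2(a, b) = -\dfrac{\bigl(\|y_\varepsilon\|^2 - F(a, b) - a\|Bx_\varepsilon^{a,b}\|^2\bigr)^2}{\|x_\varepsilon^{a,b}\|^2}, \\[8pt]
T = T(a, b) = -b + \dfrac{\|y_\varepsilon\|^2 - F(a, b) - a\|Bx_\varepsilon^{a,b}\|^2}{\|x_\varepsilon^{a,b}\|^2}.
\end{cases} \tag{47}
\]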
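For readers who want to retrace the step from (46) to (47), the following short derivation may help. It only uses the form (45) of the model function; $x$ abbreviates $x_\varepsilon^{a,b}$ and $y$ abbreviates $y_\varepsilon$.

```latex
\begin{align*}
\frac{\partial m}{\partial\alpha}(a,b) = -\frac{C_1}{a^2} = \|Bx\|^2
  &\;\Longrightarrow\; C_1 = -a^2\|Bx\|^2, \\
\frac{\partial m}{\partial\beta}(a,b) = -\frac{C_2}{(T+b)^2} = \|x\|^2
  &\;\Longrightarrow\; C_2 = -(T+b)^2\|x\|^2, \\
m(a,b) = \|y\|^2 - a\|Bx\|^2 - (T+b)\|x\|^2 = F(a,b)
  &\;\Longrightarrow\; T + b = \frac{\|y\|^2 - F(a,b) - a\|Bx\|^2}{\|x\|^2}.
\end{align*}
```

Substituting the last relation into the second one gives the expression for $C_2$. Combined with (43), the last relation also yields $T = \|Ax\|^2/\|x\|^2$, in agreement with the local approximation of $\|Ax\|^2$ by $T\|x\|^2$.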
Using the model function approximation (45)–(47), we can easily derive an iterative algorithm for constructing DRTLS-solutions $x_\varepsilon^{\alpha,\beta}$ from (36) and (41). Let $x_\varepsilon^{\alpha_k,\beta_k}$ be an approximation to a DRTLS-solution constructed at the $k$-th iteration step. Then, using the second equation of (41), we can explicitly update $\beta = \beta_{k+1}$ by a fixed point iteration as

\[
\beta_{k+1} = -h\,\frac{\delta + h\|x_\varepsilon^{\alpha_k,\beta_k}\|}{\|x_\varepsilon^{\alpha_k,\beta_k}\|}. \tag{48}
\]
For updating $\alpha$ we use (44) and the representation

\[
\|Ax_\varepsilon^{\alpha,\beta} - y_\varepsilon\|^2 = F(\alpha, \beta) - \alpha\frac{\partial F}{\partial \alpha}(\alpha, \beta) - \beta\frac{\partial F}{\partial \beta}(\alpha, \beta)
\]

to rewrite the first equation of (41) as

\[
F(\alpha, \beta) - \alpha\frac{\partial F}{\partial \alpha}(\alpha, \beta) - \beta\frac{\partial F}{\partial \beta}(\alpha, \beta) = \left(\delta + h\sqrt{\frac{\partial F}{\partial \beta}(\alpha, \beta)}\right)^{2}. \tag{49}
\]

Now the idea is to approximate $F(\alpha, \beta)$ in the neighborhood of $(\alpha_k, \beta_k)$ by a model function (45) with the parameters $C_1$, $C_2$, $T$ determined by (47) for $a = \alpha_k$, $b = \beta_k$, such that the updated regularization parameter $\alpha = \alpha_{k+1}$ can be easily found as the solution of the corresponding approximate version of (49),

\[
m(\alpha, \beta_{k+1}) - \alpha\frac{\partial m}{\partial \alpha}(\alpha, \beta_{k+1}) - \beta_{k+1}\frac{\partial m}{\partial \beta}(\alpha, \beta_{k+1}) = \left(\delta + h\sqrt{\frac{\partial m}{\partial \beta}(\alpha, \beta_{k+1})}\right)^{2}, \tag{50}
\]

which, after multiplication by $\alpha$, is in fact a linear equation in $\alpha$. The algorithm is iterated until the relative or absolute change in the iterates is sufficiently small. In Lu et al. (2008), it has been demonstrated by numerical experiments that the algorithm of the model function approximation (45), (47), (48), and (50) converges rapidly, thereby making the problem of computing DRTLS-solutions computationally tractable.

To sum up the discussion, each of the considered multiparameter regularization methods can in principle be equipped with a numerical realization scheme, and it seems that for the DRTLS-method such a scheme is algorithmically the simplest one. In the next section, we present numerical simulations to compare the performance of the discussed methods.
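One sweep of the model function algorithm (45), (47), (48), and (50) can be sketched as follows. This is only an illustration under assumptions that go beyond this section: (36) is taken to be the normal equation $(A^TA + \alpha B^TB + \beta I)x = A^Ty_\varepsilon$, and $A$ and $B$ are taken to be diagonal so that the solve is componentwise. All function names are hypothetical.

```python
import math


def solve_diagonal(a_diag, b_diag, y, alpha, beta):
    """Solve (A^T A + alpha B^T B + beta I) x = A^T y for diagonal A and B."""
    return [a * yi / (a * a + alpha * b * b + beta)
            for a, b, yi in zip(a_diag, b_diag, y)]


def sq_norm(v):
    """Squared Euclidean norm."""
    return sum(t * t for t in v)


def model_parameters(a, b, F_ab, Bx_sq, x_sq, y_sq):
    """Parameters C1, C2, T of (47) at the point (alpha, beta) = (a, b)."""
    s = y_sq - F_ab - a * Bx_sq  # equals (T + b) * x_sq
    return -Bx_sq * a * a, -s * s / x_sq, s / x_sq - b


def drtls_step(a_diag, b_diag, y, alpha_k, beta_k, delta, h):
    """One sweep: beta from (48), then alpha from the model equation (50)."""
    x = solve_diagonal(a_diag, b_diag, y, alpha_k, beta_k)
    x_sq, y_sq = sq_norm(x), sq_norm(y)
    Bx_sq = sq_norm([b * xi for b, xi in zip(b_diag, x)])
    res_sq = sq_norm([a * xi - yi for a, xi, yi in zip(a_diag, x, y)])
    F_ab = res_sq + alpha_k * Bx_sq + beta_k * x_sq  # F(alpha_k, beta_k)
    C1, C2, T = model_parameters(alpha_k, beta_k, F_ab, Bx_sq, x_sq, y_sq)

    x_norm = math.sqrt(x_sq)
    beta_next = -h * (delta + h * x_norm) / x_norm  # update (48)

    dm_dbeta = -C2 / (T + beta_next) ** 2  # independent of alpha
    rhs = (delta + h * math.sqrt(dm_dbeta)) ** 2
    # (50) reads: y_sq + 2*C1/alpha + C2/(T + beta) - beta*dm_dbeta = rhs
    denom = rhs - y_sq - C2 / (T + beta_next) + beta_next * dm_dbeta
    alpha_next = 2.0 * C1 / denom
    return alpha_next, beta_next, (C1, C2, T)
```

In a full implementation the sweep would be repeated, with a dense linear solver in place of `solve_diagonal`, until the change in $(\alpha_k, \beta_k)$ falls below a tolerance.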
5 Numerical Illustrations
Similarly to Xu et al. (2006), in our numerical experiments we do not work with real data but with artificially generated data, in order to compare statistically the performance of the considered methods. As in Bauer et al. (2007), we assume that gravity data are given at an orbit height of about 400 km and that a simulated gravitational field has to be reconstructed at the Earth's surface. This situation can be modeled by means of the downward continuation operator (4), where

\[
a_j = (1.06)^{-k}, \quad j = i + k^2, \quad i = 1, 2, \ldots, 2k + 1, \quad k = 0, 1, \ldots.
\]
For the sake of simplicity, in (4) we keep only spherical harmonics up to degree $k = 10$, so that the exact design operator $A_0$ is given in the form of the $121 \times 121$ matrix

\[
A_0 = \operatorname{diag}\bigl(1, (1.06)^{-1}, (1.06)^{-1}, (1.06)^{-1}, \ldots, (1.06)^{-10}\bigr).
\]

Then we simulate a modeling error as in (31), where $E$ is given as $E = \|U\|_F^{-1}U$, $\|\cdot\|_F$ is the Frobenius norm, and $U$ is a $121 \times 121$ matrix with random elements which are uniformly distributed on $[0, 1]$. The exact solutions $x^\dagger$ are also simulated randomly as 121-dimensional vectors

\[
x^\dagger = (x_{0,1}, x_{1,1}, x_{1,2}, x_{1,3}, x_{2,1}, \ldots, x_{k,i}, \ldots), \quad i = 1, 2, \ldots, 2k + 1, \quad k = 2, 3, \ldots, 10,
\]

where $x^\dagger_{k,i}$ are uniformly distributed on $[0, 1]$, and

\[
\ldots = 3.033 \cdot 10^{-7} \left(\sum_{k=2}^{10}\sum_{i=1}^{2k+1}\left(k + \frac{1}{2}\right)^{2}\bigl(x^\dagger_{k,i}\bigr)^{2}\right)^{1/2},
\]
which is to say that the components of $x^\dagger$ can be seen as spherical harmonic coefficients of a function satisfying (34). Then noisy observations $y_\varepsilon$ are simulated as $y_\varepsilon = A_0x^\dagger + \delta\|e\|^{-1}e$, where $e$ is a 121-dimensional random vector with components uniformly distributed on $[0, 1]$, and $\delta$ is chosen as $\delta = 0.01\|A_0x^\dagger\|$, which corresponds to a noise level of 1 %. For each of the above-mentioned problem instances, 50 independent runs of the random number generator were performed, and the methods discussed in Sect. 4 were applied in parallel to the 50 resulting noisy matrix equations. The results are displayed in Figs. 1 and 2, where each circle shows the relative error in solving one of the 50 simulated problems. The circles on the horizontal lines labeled DRTLS, RTLS, and IT with $B = I_0$ correspond to errors of the DRTLS- and RTLS-methods, and to errors of the iterative version of (20) and (42). Recall that the latter method is used here as a surrogate for the multiple parameter regularization (11) and (12). Moreover, the results obtained by this method correspond to the ideal choice of the regularization parameter based on complete knowledge of the exact solution $x^\dagger$. Therefore, these results can be considered as a benchmark to assess the performance of the other methods. Figure 1 corresponds to the case of no modeling error, $h = 0$, while Fig. 2 displays the results for a modeling error simulated with $h = \delta$.
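The construction of one simulated problem instance can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that (31) has the form $A_h = A_0 + hE$, it leaves out the scaling of $x^\dagger$ by the constant $3.033 \cdot 10^{-7}$, and the function names are hypothetical.

```python
import math
import random


def design_diagonal(max_degree=10, base=1.06):
    """Diagonal of A0: a_j = base**(-k) for j = i + k^2, i = 1, ..., 2k + 1."""
    diag = []
    for k in range(max_degree + 1):
        diag.extend([base ** (-k)] * (2 * k + 1))  # degree k occurs 2k + 1 times
    return diag


def norm(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(t * t for t in v))


def simulate_instance(h, rng, noise_rel=0.01, max_degree=10):
    """Return (A_h, y_noisy, x_true, delta) for one random problem instance."""
    a0 = design_diagonal(max_degree)
    n = len(a0)
    x_true = [rng.random() for _ in range(n)]        # exact solution (unscaled)
    y_exact = [a * x for a, x in zip(a0, x_true)]
    delta = noise_rel * norm(y_exact)                # 1 % noise level by default
    e = [rng.random() for _ in range(n)]
    e_norm = norm(e)
    y_noisy = [yi + delta * ei / e_norm for yi, ei in zip(y_exact, e)]
    U = [[rng.random() for _ in range(n)] for _ in range(n)]
    fro = math.sqrt(sum(u * u for row in U for u in row))
    # Assumed form of (31): A_h = A0 + h * U / ||U||_F.
    A_h = [[(a0[r] if r == c else 0.0) + h * U[r][c] / fro for c in range(n)]
           for r in range(n)]
    return A_h, y_noisy, x_true, delta
```

Calling `simulate_instance(h, random.Random(seed))` for 50 different seeds would give one possible realisation of the 50 independent runs described above.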
[Figure 1: relative errors of RTLS, DRTLS, and IT with B = I_0 for δ = 1 %, h = 0; horizontal axis from 0 to 8 × 10⁻³]
Fig. 1 Comparison of the performances of considered multiparameter regularizations in the case of no modeling error

[Figure 2: relative errors of RTLS, DRTLS, and IT with B = I_0 for δ = 1 %, h = δ; horizontal axis from 0 to 6 × 10⁻³]
Fig. 2 Comparison of the performances of considered multiparameter regularizations in the presence of modeling error, h = δ
In both cases, the RTLS-method was implemented in the straightforward way described in Sect. 4, where the parameters $\alpha_i$, $\beta_j$ are equidistant such that $\alpha_1 = 0.1$, $\alpha_{\max} = \alpha_{20} = 10^{4}$, $\beta_1 = -1$, $\beta_{\max} = \beta_{20} = -10^{-3}$. To implement the DRTLS-method we used the algorithm of the model function approximation (45), (47), (48), and (50), and its convergence was observed after 4–5 iteration steps. In the DRTLS-method, as well as in the RTLS-method, we used $B = B_\varepsilon$ defined by (35), and it turns out that the performance of the considered methods is not very sensitive to the choice of $\varepsilon$. In our experiments we tried $\varepsilon = 10^{-1}$ and $\varepsilon = 10^{-10}$, which did not essentially change the results.

For comparison we also present Tables 1 and 2, where mean values, median values, and standard deviations of the relative error are given for all discussed methods. Moreover, in these tables one can find statistical performance measures for the most widely used Tikhonov-Phillips regularization scheme corresponding to (20) with $B = I$. The poor performance of this scheme can be seen as a sign that the downward continuation problem should be treated with care. At the same time, multiparameter regularization methods such as the RTLS- and DRTLS-methods perform at the benchmark level, which suggests employing multiparameter regularization schemes in satellite data processing.

Table 1 Statistical performance measures for different methods in the case of 1 % of data noise

| Method | Mean relative error | Median relative error | Standard deviation of relative error | Mean regularization parameter(s) |
|---|---|---|---|---|
| DP with B = I | 0.0174 | 0.0174 | 0.0006 | 0.0093 |
| DRTLS | 0.0027 | 0.0026 | 0.0015 | 18.1807 |
| RTLS | 0.0022 | 0.0022 | 0.0006 | 1, 0.001 |
| Iterated Tikhonov with B = I_0 | 0.002 | 0.002 | 0.0007 | 190.62 |

Table 2 Statistical performance measures for different methods in the presence of modeling error h = δ

| Method | Mean relative error | Median relative error | Standard deviation of relative error | Mean regularization parameter(s) |
|---|---|---|---|---|
| DP with B = I | 0.0176 | 0.0175 | 0.0006 | 0.0092 |
| DRTLS | 0.0019 | 0.0019 | 0.0007 | 0.0942, 0.002 |
| RTLS | 0.0017 | 0.0016 | 0.0006 | 0.1, 0.001 |
| Iterated Tikhonov with B = I_0 | 0.0026 | 0.0026 | 0.0005 | 10.2 |
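The straightforward RTLS sampling described in Sect. 4 and used in these experiments can be sketched as follows. The sketch is hedged in several ways: it assumes diagonal $A$ and $B$, it takes (36) to be the normal equation $(A^TA + \alpha B^TB + \beta I)x = A^Ty_\varepsilon$, and it replaces condition (37), which is not restated here, by the tolerance test $|\,\|Bx\| - R\,| \le \text{tol}$. The function names and arguments are hypothetical.

```python
import math


def equidistant(first, last, count):
    """Equidistant grid with `count` points from `first` to `last`."""
    step = (last - first) / (count - 1)
    return [first + i * step for i in range(count)]


def rtls_grid_search(a_diag, b_diag, y, alphas, betas, radius, tol):
    """Brute-force RTLS sampling for diagonal A and B (illustrative only).

    Returns (value, alpha, beta, x) with the smallest value of
    ||Ax - y||^2 / (1 + ||x||^2) among all grid points whose solution
    satisfies | ||Bx|| - radius | <= tol, or None if no point qualifies.
    """
    best = None
    for alpha in alphas:
        for beta in betas:
            diag = [a * a + alpha * b * b + beta for a, b in zip(a_diag, b_diag)]
            if min(diag) <= 0.0:  # system not positive definite: skip
                continue
            x = [a * yi / d for a, yi, d in zip(a_diag, y, diag)]
            Bx_norm = math.sqrt(sum((b * xi) ** 2 for b, xi in zip(b_diag, x)))
            if abs(Bx_norm - radius) > tol:  # stand-in for condition (37)
                continue
            res_sq = sum((a * xi - yi) ** 2 for a, xi, yi in zip(a_diag, x, y))
            value = res_sq / (1.0 + sum(xi * xi for xi in x))
            if best is None or value < best[0]:
                best = (value, alpha, beta, x)
    return best
```

The cost is one linear solve per grid point, which is why the text calls this approach rather expensive in general.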
6 Conclusion
The downward continuation of spaceborne gravity data is a challenging task due to the ill-posedness of the problem. This ill-posedness is inherited by other satellite geodetic problems such as satellite gravity gradiometry and satellite-to-satellite tracking. All these problems can be formally classified as exponentially or severely ill-posed ones. At the same time, under some circumstances and within a desired accuracy range, one can effectively approximate the problems of satellite geodesy by moderately or polynomially ill-posed problems, as has been shown in Sect. 2. Of course, in this way an additional modeling error is introduced into the data processing, and it should be taken into account together with the unavoidable observation error. In mathematical formulation this leads to operator equations with noisy problem instances. The RTLS method is the well-accepted remedy for dealing with such equations. Recently the DRTLS method has also been proposed for solving noisy operator equations. Both these methods can be viewed as multiparameter regularization methods, where one of the regularization parameters is negative.
Note that in the geodetic context the concept of multiparameter regularization was introduced for the first time in Xu and Rummel (1992), but usually all regularization parameters are supposed to be positive. In this chapter, we have analyzed the above-mentioned multiparameter schemes. Our analysis and numerical illustrations show that multiparameter regularization can indeed produce good results from simulated data. We have restricted our discussion to the case of two regularization parameters, since the RTLS- and DRTLS-methods are, in fact, two-parameter regularization schemes. At the same time, a regularization with more than two parameters seems to be an effective tool. A complete analysis of such regularization is an important direction for future research, since the downward continuation of spaceborne data requires the use of different computational methods as well as different techniques that take into account the nature of this problem.

Acknowledgements The authors are supported by the Austrian Fonds Zur Förderung der Wissenschaftlichen Forschung (FWF), Grant P20235-N18.
References
Bauer F, Pereverzev SV (2006) An utilization of a rough approximation of a noise covariance within the framework of multi-parameter regularization. Int J Tomogr Stat 4:1–12
Bauer F, Mathé P, Pereverzev SV (2007) Local solutions to inverse problems in geodesy: the impact of the noise covariance structure upon the accuracy of estimation. J Geod 81:39–51
Beck A, Ben-Tal A, Teboulle M (2006) Finding a global optimal solution for a quadratically constrained fractional quadratic problem with applications to the regularized total least squares. SIAM J Matrix Anal Appl 28:425–445
Bouman J (2000) Quality assessment of satellite-based global gravity field models. PhD dissertation, Delft University of Technology
Engl HW, Hanke M, Neubauer A (1996) Regularization of inverse problems. Kluwer, Dordrecht
Freeden W (1999) Multiscale modeling of spaceborne geodata. B.G. Teubner, Leipzig
Freeden W, Pereverzev SV (2001) Spherical Tikhonov regularization wavelets in satellite gravity gradiometry with random noise. J Geod 74:730–736
Freeden W, Schneider F, Schreiner M (1997) Gradiometry – an inverse problem in modern satellite geodesy. In: Engl HW, Louis AK, Rundell W (eds) GAMM-SIAM symposium on inverse problems in geophysical applications. Fish Lake, Yosemite, pp 179–239
Golub GH, Hansen PC, O'Leary DP (1999) Tikhonov regularization and total least squares. SIAM J Matrix Anal Appl 21:185–194
Kellogg OD (1967) Foundations of potential theory. Springer, Berlin
Klees R, Ditmar P, Broersen P (2003) How to handle colored observation noise in large least-squares problems. J Geod 76:629–640
Kunisch K, Zou J (1998) Iterative choices of regularization parameters in linear inverse problems. Inverse Probl 14:1247–1264
Kusche J, Klees R (2002) Regularization of gravity field estimation from satellite gravity gradients. J Geod 76:359–368
Lampe J, Voss H (2009) Efficient determination of the hyperparameter in regularized total least squares problems. Available online https://www.mat.tu-harburg.de/ins/forschung/rep/rep133.pdf
Louis AK (1989) Inverse und schlecht gestellte Probleme. Teubner, Stuttgart
Lu S, Pereverzev SV, Tautenhahn U (2008) Dual regularized total least squares and multi-parameter regularization. Comput Methods Appl Math 8:253–262
Lu S, Pereverzev SV, Tautenhahn U (2009) Regularized total least squares: computational aspects and error bounds. SIAM J Matrix Anal Appl 31:918–941
Nair MT, Pereverzev SV, Tautenhahn U (2005) Regularization in Hilbert scales under general smoothing conditions. Inverse Probl 21:1851–1869
Pereverzev SV, Schock E (1999) Error estimates for band-limited spherical regularization wavelets in an inverse problem of satellite geodesy. Inverse Probl 15:881–890
Rebhan H, Aguirre M, Johannessen J (2000) The gravity field and steady-state ocean circulation explorer mission-GOCE. ESA Earth Obs Q 66:6–11
Rummel R, van Gelderen, Koop R, Schrama E, Sanso F, Brovelli M, Migliaccio F, Sacerdote F (1993) Spherical harmonic analysis of satellite gradiometry. Publ Geodesy, New Series, 39. Netherlands Geodetic Commission, Delft
Svensson SL (1983) Pseudodifferential operators – a new approach to the boundary value problems of physical geodesy. Manuscr Geod 8:1–40
Van Huffel S, Vandewalle J (1991) The total least squares problem: computational aspects and analysis. SIAM, Philadelphia
Xu PL (1992) Determination of surface gravity anomalies using gradiometric observables. Geophys J Int 110:321–332
Xu PL, Rummel R (1992) A generalized regularization method with application in determination of potential fields. In: Holota P, Vermeer M (eds) Proceedings of 1st continental workshop on the geoid in Europe, Prague, pp 444–457
Xu PL, Rummel R (1994) A generalized ridge regression method with application in determination of potential fields. Manuscr Geod 20:8–20
Xu PL, Fukuda Y, Liu YM (2006) Multiple parameter regularization: numerical solutions and applications to the determination of geopotential from precise satellite orbits. J Geod 80:17–27
Evaluation of Parameter Choice Methods for Regularization of Ill-Posed Problems in Geomathematics
Frank Bauer, Martin Gutting, and Mark A. Lukas
Contents
1 Introduction and Inverse Problems Considered
1.1 Introduction
1.2 Inverse Problems Considered
2 Preliminaries
2.1 Regularization Methods
2.2 Deterministic and Stochastic Noise
2.3 Assumptions on the Solution x (Source Condition)
2.4 Parameter Choice Method
2.5 Optimal Regularization Parameter
3 Evaluation Process and Experiments
3.1 Construction of Our Numerical Experiments
3.2 Error Comparison in Case of Optimal Solutions
4 Description and Evaluation of Methods
4.1 Discrepancy Principle
4.2 Transformed Discrepancy Principle
4.3 Modified Discrepancy Principle (MD Rule)
4.4 Monotone Error (ME) Rule
4.5 Varying Discrepancy Principle
4.6 Balancing Principle
4.7 Hardened Balancing Principle
4.8 Quasi-optimality Criterion
4.9 L-Curve Method
4.10 Extrapolated Error Method
4.11 Modified Discrepancy Partner (MDP) Rule
4.12 Normalized Cumulative Periodogram Method
4.13 Residual Method
4.14 Generalized Maximum Likelihood
4.15 Generalized Cross-Validation
4.16 Robust Generalized Cross-Validation
4.17 Strong Robust GCV
4.18 Modified Generalized Cross-Validation
4.19 Other Methods
5 Conclusion
5.1 Requirements and Properties
5.2 Average Performance
References

F. Bauer, DZ Bank AG, Kapitalmärkte Handel, Quantitative Modelle, F/KHSQ, Frankfurt, Germany
M. Gutting, Geomathematics Group, University of Siegen, Siegen, Germany
M.A. Lukas, Mathematics and Statistics, School of Engineering and Information Technology, Murdoch University, Murdoch, WA, Australia
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_99
Abstract
Many different parameter choice methods have been proposed for regularization in both deterministic and stochastic settings. The performance of a particular method in a specific setting and its comparison to other methods is sometimes hard to predict. This chapter reviews many of the existing parameter choice methods and evaluates and compares them in a large simulation study for spectral cutoff and Tikhonov regularization. The numerical tests deal with downward continuation, i.e., an exponentially ill-posed problem, which is found in many geoscientific applications, in particular those involving satellite measurements. A wide range of scenarios for these linear inverse problems are covered with regard to both white and colored stochastic noise. The results show some marked differences between the methods, in particular, in their stability with respect to the noise and its type. We conclude with a table of properties of the methods and a summary of the simulation results, from which we identify the best methods.
1 Introduction and Inverse Problems Considered

1.1 Introduction
Inverse problems are an important part of geomathematics, especially now in the age of satellite measurement technologies. Most applications want to use measurements made at satellite altitude (e.g., by satellite-to-satellite tracking (SST), satellite gravity gradiometry (SGG), magnetic field measurements) to determine a geoscientific quantity such as the gravitational potential or the magnetic field on the Earth’s surface. This process involves some variant of downward continuation, which is known to be an exponentially ill-posed problem requiring regularization.
For further details, we refer to Freeden (1999), Freeden and Michel (2004), Freeden and Gerhards (2013), Lu and Pereverzev (2014), and the references therein. These ill-posed problems can be expressed mathematically as a linear inverse problem

\[
Ax = y, \tag{1}
\]

where $A$ is a linear compact operator mapping between two separable Hilbert spaces $X$ and $Y$. In practical situations, only a noisy version $y^\delta$ of $y$ is available as data. Because of the compactness of $A$, solving (1) for $x$ is unstable, and one needs to regularize the problem to obtain a reasonable approximate solution (see, e.g., Engl et al. 1996).

The two most popular regularization methods are spectral cutoff regularization (also called truncated singular value decomposition) and Tikhonov regularization (also called ridge regression or Wiener filtering in certain contexts). Spectral cutoff uses projection onto a subspace spanned by singular vectors corresponding to the larger singular values of $A$, while Tikhonov regularization is a penalized variational approach which shrinks the contribution of the smaller singular values of $A$. For both regularization methods, the choice of the regularization parameter is crucial. Many different methods for choosing this parameter have been proposed, often with analytical justification in mind. However, this usually requires a particular setting, and it may not be known how the method performs in all practical situations and under general conditions.

The paper of Bauer and Lukas (2011) provides a comprehensive review of most of the existing parameter choice methods and compares them through a large simulation study. Other more limited comparative studies can be found in Wahba (1985), Kohn et al. (1991), Thompson et al. (1991), Hanke and Hansen (1993), Hansen (1994, 1998, Chap. 7), Lukas (1998b), Farquharson and Oldenburg (2004), Abascal et al. (2008), Åkesson and Daun (2008), Rust and O'Leary (2008), Correia et al. (2009), Hämarik and Raus (2009), Hämarik et al. (2009, 2011, 2012), Palm (2010), and Reichel and Rodriguez (2013). Bauer and Lukas (2011) consider synthetic inverse problems with singular values having power-type decay.
This paper is based on the survey of Bauer and Lukas (2011), but specifically treats exponentially ill-posed problems closely related to real applications in geomathematics. For these inverse problems, a thorough and up-to-date comparative study of parameter choice methods for spectral cutoff and Tikhonov regularization is provided, taking into account deterministic and stochastic settings. In particular, we investigate the variability of each method both for white noise and for colored noise. For ease of reference, we list below the subsections for the parameter choice methods that are considered. Further methods are briefly discussed in Sect. 4.19, where we also explain why they are left out in this paper.

4.1 Discrepancy principle
4.2 Transformed discrepancy principle
4.3 Modified discrepancy principle (MD rule)
4.4 Monotone error (ME) rule
4.5 Varying discrepancy principle
4.6 Balancing principle, balancing principle (white), fast balancing principle
4.7 Hardened balancing principle, hardened balancing principle (white)
4.8 Quasi-optimality criterion
4.9 L-curve method
4.10 Extrapolated error method
4.11 Modified discrepancy partner (MDP) rule
4.12 Normalized cumulative periodogram method
4.13 Residual method
4.14 Generalized maximum likelihood
4.15 Generalized cross-validation
4.16 Robust generalized cross-validation
4.17 Strong robust GCV
4.18 Modified generalized cross-validation
We introduce the downward continuation problem in Sect. 1.2. Section 2 summarizes regularization of linear inverse problems. In Sect. 3, the design of our numerical experiments and the main algorithms used for their generation are presented in brief; we refer to Bauer and Lukas (2011) for a more detailed explanation of those parts that are similar. All the parameter choice methods are briefly described in Sect. 4 and their corresponding results are displayed and commented on there. A summary and conclusions can be found in Sect. 5.
1.2 Inverse Problems Considered
Due to its compactness, the operator $A$ of (1) admits a singular value decomposition (SVD) $\{\sigma_k; u_k, v_k\}_{k \in \mathbb{N}}$, where $\{u_k\}_{k \in \mathbb{N}}$ and $\{v_k\}_{k \in \mathbb{N}}$ are orthonormal in $X$ and $Y$, respectively, $Au_k = \sigma_k v_k$, $A^* v_k = \sigma_k u_k$, and $\sigma_k > 0$ are in decreasing order. The decay rate of the singular values $\sigma_k$ determines the degree of ill-posedness of the inverse problem.

For the inverse problem of downward continuation, we have $X = L^2(\Omega_R)$, i.e., the space of square-integrable functions on the sphere $\Omega_R$ of radius $R$ (radius of a spherical Earth), and $Y = L^2(\Omega_{R+h})$, i.e., the space of square-integrable functions on the sphere of radius $R + h$, where $h$ is the satellite altitude (assuming a simplified spherical orbit). Then $x$ stands for, e.g., the gravitational potential on the Earth's surface and $y$ denotes the same potential at the satellite's orbit. The orthonormal systems are given in terms of spherical harmonics $Y_{m,j}$ of degree $m \in \mathbb{N}_0$ and order $j = -m, \ldots, m$ which are adjusted to the corresponding radii. Thus, $u_k = u_{k(m,j)} = \frac{1}{R} Y_{m,j}$, $v_k = v_{k(m,j)} = \frac{1}{R+h} Y_{m,j}$, and the singular values of the operator $A : X \to Y$ of (1) are given by

\[
\sigma_k = \sigma_{k(m,j)} = \left(\frac{R}{R+h}\right)^{m}, \qquad m \in \mathbb{N}_0, \quad j = -m, \ldots, m. \tag{2}
\]
Note that this operator is the so-called upward continuation; its (generalized) inverse is the downward continuation. Obviously, the singular values in (2) are exponentially decreasing, i.e., downward continuation is an exponentially ill-posed problem. Since there are $2m + 1$ spherical harmonics of degree $m$ and the singular values do not depend on the order $j$, the singular values occur with this multiplicity. Note that we use the notation $k(m, j)$ in (2) to indicate the renumbering, which is given by $k(m, j) = m^2 + m + 1 + j$. It is also well known that there are $(M + 1)^2$ spherical harmonics of degree $M$ or less (see, e.g., Freeden and Gutting (2013) for a detailed introduction of spherical harmonics and their properties). It should also be noted that instead of $L^2$-spaces the theory of spherical Sobolev spaces can be applied.

The SST and SGG problems that are mentioned in the introduction can be investigated in simplified form such that $y$ is the first or second radial derivative of the gravitational potential instead of the potential itself. This provides the following singular values for $m \in \mathbb{N}$, $j = -m, \ldots, m$:

\[
\sigma^{SST}_{k(m,j)} = \frac{m+1}{R+h}\left(\frac{R}{R+h}\right)^{m}, \qquad \sigma^{SGG}_{k(m,j)} = \frac{(m+1)(m+2)}{(R+h)^2}\left(\frac{R}{R+h}\right)^{m}.
\]

In both cases, the singular values are still exponentially decreasing as $k$ or $m$ tend to infinity. Further details can be found in, e.g., Freeden (1999), Freeden and Michel (2004), Freeden and Gerhards (2013), Freeden and Schreiner (2015), Lu and Pereverzev (2014), and the references therein. Another ill-posed problem closely related to downward continuation is the inverse gravimetric problem (see Freeden and Michel (2004), Michel (2014), and the references therein).

While the problems discussed above are exponentially ill-posed in a formal sense, in practice, where there is a bound on the degree of the spherical harmonics, the problems may exhibit the behavior of more moderately ill-posed problems. In particular, it is argued in Lu and Pereverzev (2014) that the SGG problem for the GOCE mission, where the degree is bounded by 300, can be considered as mildly (polynomially) ill-posed, with $\sigma^{SGG}_{k(m,j)} \sim (m + 1/2)^{-5.5}$ for $100 \le m \le 300$. This observation also applies to the finite dimensional problems to be considered in our numerical experiments. For example, with $R = 6{,}371$ km and $h = 300$ km, the singular values $\sigma_{k(m,j)}$ in (2) satisfy $0.5R^4(m + 1/2)^{-8.5} \le \sigma_{k(m,j)} \le 3R^4(m + 1/2)^{-8.5}$ for $100 \le m \le 300$ (with the numerical value of $R$ taken in km).
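The renumbering $k(m, j)$, the multiplicity of the singular values, and the polynomial envelope quoted above can be explored with a small sketch. The function names are hypothetical, and the radius is taken in km as in the example.

```python
R_EARTH_KM = 6371.0  # spherical Earth radius used in the example above


def index_k(m, j):
    """Position k(m, j) = m^2 + m + 1 + j of degree m and order j."""
    if abs(j) > m:
        raise ValueError("order j must satisfy -m <= j <= m")
    return m * m + m + 1 + j


def sigma_upward(m, h_km, R=R_EARTH_KM):
    """Singular value (2) of upward continuation for degree m."""
    return (R / (R + h_km)) ** m


def sigma_sst(m, h_km, R=R_EARTH_KM):
    """Singular value of the simplified SST problem for degree m."""
    return (m + 1) / (R + h_km) * sigma_upward(m, h_km, R)


def sigma_sgg(m, h_km, R=R_EARTH_KM):
    """Singular value of the simplified SGG problem for degree m."""
    return (m + 1) * (m + 2) / (R + h_km) ** 2 * sigma_upward(m, h_km, R)


def singular_values(max_degree, h_km, sigma=sigma_upward):
    """List sigma_1, sigma_2, ... with degree m repeated 2m + 1 times."""
    values = []
    for m in range(max_degree + 1):
        values.extend([sigma(m, h_km)] * (2 * m + 1))
    return values


def polynomial_envelope(m, R=R_EARTH_KM):
    """R^4 (m + 1/2)^(-8.5), the comparison rate quoted for h = 300 km."""
    return R ** 4 * (m + 0.5) ** (-8.5)
```

For instance, `singular_values(10, 300.0)` has $(10 + 1)^2 = 121$ entries, and comparing `sigma_upward(m, 300.0)` with `polynomial_envelope(m)` for $100 \le m \le 300$ reproduces the two-sided bound stated in the text.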
2 Preliminaries
In practice, an inverse problem is often discretized (either as the model or for computation) and/or only a finite set of discrete data is available. Starting with (1), we distinguish the following three cases, which are considered in one common framework:

Case C1. Infinite dimensional situation, where $A$ is a compact linear operator mapping between two separable Hilbert spaces $X$ and $Y$.

Case C2. Finite dimensional situation, where $A$ is a matrix with large condition number mapping between $X = \mathbb{R}^p$ and $Y = \mathbb{R}^q$, $p \le q$. Assume that $\operatorname{rank}(A) = p$. This case is often called a discrete ill-posed problem.

Case C3. Discrete data situation, where the underlying problem is still infinite dimensional, but we only have measurements $y_i = y(t_i)$ at the $q$ points $\{t_i\}_{i=1,\ldots,q}$. Then $A$ is a finite rank operator between $X$ and $Y = \mathbb{R}^q$ with the correspondence $(Ax)_i = (A_\infty x)(t_i)$, $i = 1, \ldots, q$. Assume that $\operatorname{rank}(A) = q$. This case is known as a semi-discrete model.

In all cases, the element $y \in Y$ is perturbed by noise, giving the data $y^\delta$.
2.1 Regularization Methods
In this paper, we will concentrate on the two main regularization methods that are used for solving linear inverse problems: spectral cutoff and Tikhonov regularization. Detailed accounts of these methods can be found, e.g., in Groetsch (1984), Hofmann (1986), and Engl et al. (1996). By the use of the singular value decomposition, we obtain
\[ Ax = \sum_{k=1}^{R} \sigma_k \langle x, u_k \rangle v_k, \]
where $R$ is $\infty$, $p$, and $q$, respectively, in cases C1, C2, and C3 above. Spectral cutoff regularization is defined by
\[ x_n^\delta = \sum_{k=1}^{l(n)} \sigma_k^{-1} \langle y^\delta, v_k \rangle u_k, \tag{3} \]
where $l(\cdot)$ is an ascending integer-valued function. The traditional choice is $l(n) = n$, but a general $l(\cdot)$ allows us to restrict the regularized solutions to an appropriate subset, thereby reducing the computation time significantly without affecting the results.
Tikhonov regularization has a continuous regularization parameter $\alpha$, but in practice, one often searches over a discrete set. Here we use a geometric sequence of parameter values $\alpha_n = \alpha_0 q_\alpha^n$, where $0 < q_\alpha < 1$ and $n \in \mathbb{N}$. Tikhonov regularization is defined by the variational formulation
\[ x_n^\delta = \operatorname*{argmin}_{x \in X} \left\{ \|Ax - y^\delta\|^2 + \alpha_n \|x\|^2 \right\} \]
or, equivalently, by
\[ x_n^\delta = (A^* A + \alpha_n I)^{-1} A^* y^\delta, \tag{4} \]
where $A^*$ is the adjoint of $A$. Here and throughout, the norm $\|\cdot\|$ refers to the Hilbert space in use and will be clear from the context. Note that, for both spectral cutoff and Tikhonov regularization, a larger value of the index $n$ corresponds to less smoothing. For both methods, let $x_n^0$ be the regularized solution in the case of noise-free data and let $A_n^{-1}$ be the linear regularization operator that maps $y^\delta$ to $x_n^\delta$, i.e., it holds that
\[ x_n^0 = A_n^{-1} y \quad \text{and} \quad x_n^\delta = A_n^{-1} y^\delta. \]
2.2 Deterministic and Stochastic Noise
Several kinds of additive noise models are used in the study of inverse problems (cf. Eggermont et al. 2014 and the references therein). To describe the major ones, we will denote $y^\delta = y + \delta \xi$, where $\xi$ is a normalized noise element and $\delta > 0$ is the noise level.
The most common noise model in the classical inverse problems literature is deterministic noise (cf. Engl et al. 1996), where $\xi \in Y$ with $\|\xi\| \le 1$, so $\|y^\delta - y\| \le \delta$. This noise model is quite suitable to represent discretization errors, but it is rather poor for describing random measurement errors arising in practice.
A practical noise model for a discrete data vector $y^\delta \in \mathbb{R}^q$ (for cases C2 and C3, see, e.g., Wahba 1990) is $y^\delta = y + \delta \xi$, where the components $\xi_i$ are i.i.d. random variables with mean $E\xi_i = 0$ and variance $E\xi_i^2 = 1$. Then $\delta$ is the standard deviation of each error component $\delta \xi_i$ and $E\|y^\delta - y\|^2 = \delta^2 E\|\xi\|^2 = q\delta^2$. This model can be extended to one involving correlated errors, where $\varepsilon_i = \delta \xi_i$ has covariance matrix $C = [E(\varepsilon_i \varepsilon_j)]$.
A stochastic noise model can also be defined in an infinite dimensional setting (case C1) by using the singular value decomposition of $A$. Suppose that the Fourier coefficients $\langle y, v_k \rangle$ are known only as the sequence data
\[ y_k^\delta = \langle y, v_k \rangle + \delta \xi_k = \sigma_k \langle x, u_k \rangle + \delta \xi_k, \quad k \in \mathbb{N}, \tag{5} \]
where $\xi_k = \langle \xi, v_k \rangle$ are independent normal $N(0,1)$ random variables and $\xi$ is a zero-mean weak Gaussian random element. This is called a continuous Gaussian white noise model (cf. Lepskij 1990). Note there is no bound for the error in $Y$ here, since $E \sum_k (y_k^\delta - \langle y, v_k \rangle)^2 = \sum_k \delta^2$ is infinite. Colored noise can be defined by introducing a covariance matrix $K$ for the random variables $\xi_k$ so that $E(\xi_k \xi_l) = K_{kl}$, in which case we have $E(\langle \xi, f \rangle \langle \xi, g \rangle) = \sum_{k,l} f_k K_{kl} g_l$ for any pair $f, g \in Y$ with $f_k = \langle f, v_k \rangle$ and $g_k = \langle g, v_k \rangle$. A simple choice is to assume $K$ to be diagonal. Then, if the entries $K_{kk}$ are increasing, it is called blue noise, and, if they are decreasing, it is called red noise.
For the finite dimensional case C2, if $y^\delta = y + \delta \xi$ with $\xi \sim N(0, I)$, then, clearly, using the orthonormal singular vectors $v_k \in \mathbb{R}^q$ of $A$, the model can be written equivalently as a Gaussian white noise model with finite sequence data.
2.3 Assumptions on the Solution $x$ (Source Condition)
In most of the literature on regularization, it is assumed that $x$ is a fixed (nonrandom) element of $X$. It is known (see, e.g., Cox 1988; Lukas 1988; Engl et al. 1996) that the error $\|x - x_n^\delta\|$ or the expected squared error $E\|x - x_n^\delta\|^2$ (with respect to the noise distribution) in the regularized solution (3) or (4) depends on the abstract smoothness of the unknown solution $x$. The smoothness assumption made on $x$ is called a source condition and is usually of the form $x \in \mathcal{R}((A^*A)^s)$ for some $s > 0$ (called Hölder type). However, this form is not suitable for a severely ill-posed problem, where the singular values of the operator have exponential decay, while the unknown solution has relatively low smoothness, in particular for downward continuation. For such problems, it is natural to use the logarithmic source condition $x \in \{u : u = \ln^{-p}((A^*A)^{-1})\, v,\ \|v\| \le K\}$ (cf. Mair 1994; Hohage 2000; Pereverzev and Schock 2000). General source conditions are also used in Mathé and Pereverzev (2003) and Nair et al. (2003).
In the Bayesian approach to inverse problems (cf. Tarantola 1987; Evans and Stark 2002; Kaipio and Somersalo 2005; Hofinger and Pikkarainen 2007), it is assumed that $x$ is a random element of $X$ with some prior distribution. These models can be formulated in any of the cases C1, C2, or C3 above. It is known (cf. Larkin 1972; Hofmann 1986; Wahba 1990; Fitzpatrick 1991) that, with a Gaussian prior and independent Gaussian error distribution, the posterior mean given the data is the solution of a certain Tikhonov regularization problem and the appropriate choice for the regularization parameter is determined by the variance of $x$ and the noise level $\delta$. However, in practice both are usually unknown.
2.4 Parameter Choice Method
A parameter choice method is a rule that assigns a value for the regularization parameter. In the case of a discrete set of parameters, the method selects a value for the index, which will be denoted by $n_*$. Parameter choice methods can be classified according to the input used to make the choice. There are three basic types (see, e.g., Engl et al. 1996; Bauer and Kindermann 2009):
• A priori method, i.e., $n_*$ is a function of $\delta$ and information about the smoothness of $x$. Since information about $x$ is required, but in practice not known, such methods are not discussed here.
• A posteriori method, i.e., $n_* = n_*(\delta, y^\delta)$. The noise level $\delta$ is required and has to be estimated if it is not known.
• Data-driven method, i.e., $n_* = n_*(y^\delta)$. The method only requires the data $y^\delta$ as input. In the literature on deterministic noise, these methods are sometimes called “heuristic methods”, but this has a negative connotation that is not generally deserved.
We will consider several methods of each of the second and third types. For these methods, if $y^\delta$ contains stochastic noise, then $n_*$ is a random variable. Each method computes an associated function $F(n)$, and $n_*$ is defined as either the point at which $F$ falls below a threshold (type 1) or the minimizer of $F$ (type 2). Most methods of type 1 originate from a deterministic setting and use a (sensitive) tuning parameter. The methods of type 2 mostly come from a stochastic framework or from heuristic ideas and work without (sensitive) tuning parameters.
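The two ways of turning a function $F(n)$ into a parameter choice can be sketched in a few lines of Python. The function names and the sample values of $F$ below are made up for illustration; "falls below a threshold" is read here as $F(n) \le$ threshold, which is an assumption of this sketch.

```python
def first_below_threshold(F, threshold):
    """Type 1 rule: smallest index n (1-based) with F(n) <= threshold, else None."""
    for n, value in enumerate(F, start=1):
        if value <= threshold:
            return n
    return None

def minimizer(F):
    """Type 2 rule: index n (1-based) at which F is smallest."""
    return min(range(1, len(F) + 1), key=lambda n: F[n - 1])

F = [9.0, 4.0, 2.5, 1.0, 1.5, 3.0]     # made-up values of F(1), ..., F(6)
print(first_below_threshold(F, 3.0), minimizer(F))
```

A type 1 rule can stop as soon as the threshold is reached, whereas a type 2 rule has to evaluate $F$ on the whole index range before it can return a value.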
2.5 Optimal Regularization Parameter
For the problem $Ax = y$ with data $y^\delta$, we define the optimal regularization parameter (index) by
\[ n_{\mathrm{opt}} = \operatorname*{argmin}_{n} \|x - x_n^\delta\|. \]
If $y^\delta$ contains stochastic noise, then $n_{\mathrm{opt}}$ is a random variable. In our numerical experiments of each parameter choice method, we will assess the accuracy of the choice $n_*$ by computing the inefficiency defined by
\[ \|x - x_{n_*}^\delta\| \,/\, \|x - x_{n_{\mathrm{opt}}}^\delta\|. \tag{6} \]
The closer this is to 1, the better is the parameter choice. Using stochastic noise with a large number of replicates of the problem, we can estimate the distribution of the inefficiencies and hence determine the performance of the method.
It is clear that, since $x$ is unknown, a practical parameter choice method must use some other known or easily computed/estimated quantities. Many methods use the norm of the residual defined as $\|y^\delta - Ax_n^\delta\|$. If the data are finite and the norm is the Euclidean norm, this is the square root of the usual residual sum of squares, and so it is easily computed.
Splitting the error $\|x - x_n^\delta\|$ such that
\[ \|x - x_n^\delta\| \le \|x - x_n^0\| + \|x_n^0 - x_n^\delta\|, \tag{7} \]
the first term (regularization error) is bounded by a decreasing function $\varphi(n)$ which depends on the source condition and the regularization method. A higher smoothness $s$ for $x \in \mathcal{R}((A^*A)^s)$ improves the rate of decrease. If this is only possible up to a maximal value $s_0$, this value is the so-called qualification and the method exhibits saturation. Tikhonov regularization has qualification $s_0 = 1$, while spectral cutoff has infinite qualification (see, e.g., Engl et al. 1996). Under a logarithmic source condition, slower (logarithmic) rates of decrease are obtained for standard regularizations, in particular for Tikhonov and spectral cutoff regularization (Mair 1994; Hohage 2000; Pereverzev and Schock 2000). It follows that Tikhonov regularization does not exhibit saturation here.
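As an illustration of the optimal index and of the inefficiency (6), the following Python sketch applies Tikhonov regularization (4) to a small diagonal problem. The spectrum, the solution, the noise level, the parameter grid, and the seed are toy values chosen for this sketch and are not the test cases of the chapter; list indices are 0-based, unlike the index $n$ in the text.

```python
import math
import random

def tikhonov_solution(sigma, y_delta, alpha):
    """Tikhonov solution (4) for a diagonal operator with entries sigma_k."""
    return [s * yk / (s * s + alpha) for s, yk in zip(sigma, y_delta)]

def error_norm(x, x_reg):
    """Euclidean norm of x - x_reg."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_reg)))

def inefficiency(errors, n_star):
    """Inefficiency (6): error at the chosen index divided by the optimal error."""
    return errors[n_star] / min(errors)

rng = random.Random(1)                              # arbitrary seed
sigma = [0.9 ** k for k in range(1, 61)]            # toy exponentially decaying spectrum
x = [1.0 / k ** 2 for k in range(1, 61)]            # toy solution
delta = 1e-4                                        # toy noise level
y_delta = [s * xk + delta * rng.gauss(0.0, 1.0) for s, xk in zip(sigma, x)]

alphas = [0.5 ** n for n in range(1, 41)]           # geometric grid of alpha values
errors = [error_norm(x, tikhonov_solution(sigma, y_delta, a)) for a in alphas]
n_opt = min(range(len(errors)), key=errors.__getitem__)
print(n_opt, inefficiency(errors, n_opt), inefficiency(errors, len(errors) - 1))
```

By construction the inefficiency equals 1 at the optimal index and is at least 1 everywhere else; computing it requires the true $x$, so it is only available in simulations.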
The second term (propagated noise error) on the right-hand side of (7) can often be bounded for regularization methods as
\[ \|x_n^0 - x_n^\delta\| \le \delta \varrho(n), \tag{8} \]
where $\varrho$ is a known increasing function of $n$, indicating that, with less smoothing, there is more influence of the data noise. For spectral cutoff regularization, (8) holds with $\varrho(n) = \sigma_{l(n)}^{-1}$ and, for Tikhonov regularization, (8) holds with $\varrho(n) = \alpha_n^{-1/2}$ (cf. Engl et al. 1996).
In the case of stochastic noise, the risk, i.e., the expected squared error $E\|x - x_n^\delta\|^2$, is considered. For noise with zero mean, instead of (7), the risk can be decomposed exactly into a sum of squared bias $\|x - x_n^0\|^2$ and variance terms $E\|x_n^0 - x_n^\delta\|^2$, i.e.,
\[ E\|x - x_n^\delta\|^2 = \|x - x_n^0\|^2 + E\|x_n^0 - x_n^\delta\|^2. \tag{9} \]
The squared bias can be bounded as before and, under suitable assumptions, the variance can be expressed as $\delta^2 \varrho^2(n)$ for some increasing function $\varrho(n)$. For white noise, the spectral cutoff solution (3) has variance
\[ \delta^2 \varrho^2(n) = \delta^2 E\|A_n^{-1} \xi\|^2 = \delta^2 \sum_{k=1}^{l(n)} \sigma_k^{-2} \tag{10} \]
and the Tikhonov regularized solution (4) has variance
\[ \delta^2 \varrho^2(n) = \delta^2 E\|A_n^{-1} \xi\|^2 = \delta^2 \sum_{k} \left[ \sigma_k / (\sigma_k^2 + \alpha_n) \right]^2. \tag{11} \]
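The two white-noise variance factors in (10) and (11) are simple sums over the singular values. The Python sketch below evaluates them for a toy spectrum (the values of `sigma`, the choice $l(n) = n$, and the grid $\alpha_n = 0.5^n$ are assumptions of this illustration) and shows that both grow as the index $n$ increases, i.e., with less smoothing.

```python
def rho2_cutoff(sigma, l_n):
    """rho^2(n) from (10): sum of sigma_k^(-2) over the first l(n) terms."""
    return sum(1.0 / (s * s) for s in sigma[:l_n])

def rho2_tikhonov(sigma, alpha_n):
    """rho^2(n) from (11): sum of (sigma_k / (sigma_k^2 + alpha_n))^2."""
    return sum((s / (s * s + alpha_n)) ** 2 for s in sigma)

sigma = [0.8 ** k for k in range(1, 21)]                          # toy singular values
cutoff_curve = [rho2_cutoff(sigma, l) for l in range(1, 21)]      # l(n) = n
tikhonov_curve = [rho2_tikhonov(sigma, 0.5 ** n) for n in range(1, 21)]
print(cutoff_curve[:3], tikhonov_curve[:3])
```

In the limit $\alpha_n \to 0$ the Tikhonov factor approaches the spectral cutoff factor with all terms included, which is the supremum of the variance in the finite dimensional case.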
A much more detailed discussion of the above errors (including minimax results) in various situations can be found in Cox (1988), Lukas (1988), Engl et al. (1996), Mair and Ruymgaart (1996), Mathé and Pereverzev (2003), Cavalier et al. (2004), Bauer (2007), and Hofmann and Mathé (2007).
It should be pointed out that Bakushinskii (1984) states that, for an ill-posed problem, a parameter choice rule that does not explicitly use the noise level (e.g., data-driven rules) cannot yield a regularization method such that the worst-case error converges to 0 as $\delta \to 0$. This Bakushinskii veto is important for deterministic noise, but it is not really appropriate for stochastic noise (cf. Bauer and Kindermann 2009; Becker 2011). In this situation, as we shall see, there are data-driven rules yielding regularization methods that converge with respect to the risk and perform very well in practice (see also Bauer and Lukas 2011).
For some methods, there are stronger results involving oracle inequalities (see Cavalier et al. 2002; Candès 2006; Bauer and Reiß 2008; Cavalier 2008), which provide, for any noise level, a bound on the risk $E\|x - x_{n_*}^\delta\|^2$ relative to the smallest possible value of the risk and allow the classification of methods as asymptotically optimal.
Similar results exist for some methods with a discrete noisy data vector $y^\delta \in \mathbb{R}^q$ in case C3, where the asymptotic analysis is as $q \to \infty$ with fixed variance $\delta^2$. There are connections between results for the continuous white noise model and for discrete sampled data. In particular, for function estimation ($A = I$) and other problems, it is known that, under certain conditions, asymptotic results for the continuous white noise model as $\delta \to 0$ can be translated into asymptotic results for discrete data as $q \to \infty$ (cf. Brown and Low 1996; Pensky and Sapatinas 2010).
3 Evaluation Process and Experiments
This section presents the design of the evaluation process and numerical experiments that will be used for each parameter choice method described in Sect. 4.
3.1 Construction of Our Numerical Experiments
Each parameter choice method will be assessed using the same large set of test problems corresponding to downward continuation. The results will be shown for all situations, independent of any prior knowledge about situations where the method does not work. For each parameter choice method, the experiments used the same random seed, so every method had exactly the same set of operators, solutions, and noisy data to deal with. The experiments were implemented in MATLAB®, whose notation we also use to describe the algorithms in this section.
The test problems are finite dimensional problems (case C2), where $A : X \to Y$ and $X = Y = \mathbb{R}^q$ with Euclidean norms. The problems are characterized by the following parameters:
• $M_{\max}$: maximal spherical harmonic degree, i.e., the number of singular values of $A$ is $q = (M_{\max} + 1)^2$;
• $h$: varying heights in (2) (with $R = 6{,}371$ km) determining the speed of the exponential decay behavior;
• $\nu$: (polynomial) decay behavior (smoothness) of $x$;
• $\log(N2S)$: $\log_{10}$ of the noise-to-signal ratio $N2S = (E\|y^\delta - y\|^2)^{1/2} / \|y\|$;
• $\omega$: noise behavior (corresponds to the color).
Generation of the Operators
The operator $A$ will be taken to be a random diagonal $q \times q$ matrix, with diagonal elements (i.e., eigenvalues or singular values) decaying like
\[ a_{kk} \sim \sigma_k = \sigma_{k(m,j)} = \left( \frac{R}{R+h} \right)^{m}, \quad m \in \mathbb{N}_0,\ j = -m, \dots, m, \]
where $k(m,j) = m^2 + m + 1 + j$. A larger value of $h$ corresponds to a more ill-posed problem, all of which are exponentially ill-posed. The diagonal vector is generated by the following procedure:
    OrderVector = zeros((Mmax + 11)^2, 1);      % column vector: degree m of each eigenvalue
    for deg = 0:Mmax+10
        OrderVector(deg^2+1:(deg+1)^2) = deg;   % 2*deg+1 entries of degree deg
    end
    NumberEigenvalues = (Mmax + 1)^2;
    HelpVar = (6371/(6371 + h)).^OrderVector;   % unperturbed decay (R/(R+h))^m
    Perturb = exp(0.5*randn(length(HelpVar), 1) - 0.5^2/2);   % log-normal factor with mean 1
    HelpOp = sort(HelpVar .* Perturb, 'descend');
    A = HelpOp(1:NumberEigenvalues);
For each parameter choice method, more than 160,000 trials are done using about 40,000 up to 120,000 eigenvalues each time, so speed and memory usage are very important issues. Note that for the maximal degree $M_{\max}$ we have to deal with $(M_{\max} + 1)^2$ eigenfunctions and $M_{\max}$ can be up to 350 (see Tables 1 and 2) in our tests. For this reason, we use the simplest possible form of an inverse problem, i.e., diagonal matrices. Because of the singular value decomposition, these diagonal problems are no less or more ill-posed than other discrete inverse problems. Furthermore, this approach enables us to see the effects of ill-posedness with almost no side effects originating from numerical errors due to machine precision and other machine-dependent errors. In addition, for Gaussian white noise, there is no difference at all in using diagonal matrices to investigate stochastic behavior. This follows from the invariance of white noise under the orthogonal transformation involved in the SVD.
Table 1 Test cases for Tikhonov regularization
$h$ (km)   $M_{\max}$   $\nu$   $\log(N2S)$
300        350          2       -4.0
300        350          2       -6.0
300        350          3       -6.0
300        350          3       -8.0
300        350          4       -8.0
300        350          4       -10.0
300        350          5       -8.0
300        350          5       -10.0
500        250          2       -4.0
500        250          2       -6.0
500        250          3       -6.0
500        250          3       -8.0
500        250          4       -8.0
500        250          5       -8.0
500        250          5       -10.0
700        200          2       -6.0
700        200          3       -6.0
700        200          3       -8.0
700        200          4       -8.0
700        200          4       -10.0
Table 2 Test cases for ExpCutOff regularization
$h$ (km)   $M_{\max}$   $\nu$   $\log(N2S)$
300        300          2       -4.0
300        300          2       -6.0
300        300          3       -6.0
300        300          3       -8.0
300        300          4       -8.0
300        300          4       -10.0
300        200          5       -8.0
300        200          5       -10.0
500        250          2       -4.0
500        250          2       -6.0
500        250          3       -6.0
500        250          3       -8.0
500        250          4       -8.0
500        200          5       -8.0
500        200          5       -10.0
700        200          2       -6.0
700        200          3       -6.0
700        200          3       -8.0
700        200          4       -8.0
700        200          4       -10.0
We also wanted to ensure that we do not use, even accidentally, specific features of the operator that would help the inversion, but cannot be found in practice (called “inverse crimes”; see, e.g., Colton and Kress 1998). Therefore, we used a slight random perturbation of the sequence $\sigma_k$. The procedure balances retaining the overall exponential decay behavior of the operator, including the multiplicity of its singular values, while providing some randomness in this component.
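For readers without MATLAB, the operator-generation procedure can be sketched in Python using only the standard library. This is a transcription under stated assumptions, not the authors' code: the function names `order_vector` and `generate_operator`, the small degree `m_max=20`, and the seed are illustrative, and Python's generator produces different random numbers than MATLAB's `randn`.

```python
import math
import random

def order_vector(max_degree):
    """Degree m repeated 2m+1 times, so entry k(m, j) - 1 holds m."""
    degrees = []
    for m in range(max_degree + 1):
        degrees.extend([m] * (2 * m + 1))
    return degrees

def generate_operator(m_max, h_km, rng, r_km=6371.0, spread=0.5):
    """Diagonal of A: perturbed (R/(R+h))^m, sorted, truncated to (m_max+1)^2."""
    degrees = order_vector(m_max + 10)            # 10 extra degrees, as in the procedure
    ratio = r_km / (r_km + h_km)
    # log-normal factor with mean 1: exp(spread * N(0,1) - spread^2 / 2)
    perturbed = [ratio ** m * math.exp(spread * rng.gauss(0.0, 1.0) - spread ** 2 / 2)
                 for m in degrees]
    perturbed.sort(reverse=True)                  # descending order
    return perturbed[:(m_max + 1) ** 2]

diag = generate_operator(m_max=20, h_km=300.0, rng=random.Random(0))
print(len(diag), diag[0], diag[-1])
```

Generating ten additional degrees before sorting and truncating means that the smallest retained eigenvalues are also drawn from a perturbed neighborhood rather than being cut off at a sharp degree boundary.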
Solution Generation
Each time a solution $x$ is generated, we use the following procedure (the variables OrderVector and NumberEigenvalues are the same as in Sect. 3.1):
    Degrees = OrderVector(1:NumberEigenvalues);
    HelpVar = (Degrees(:) + 1).^(-nu);                    % decay (m+1)^(-nu), nu = smoothness parameter
    Sign = 2*ceil(2.*rand(NumberEigenvalues, 1)) - 3;     % random sign, +1 or -1
    Perturb = 1 + 0.1*randn(NumberEigenvalues, 1);
    x = Sign .* Perturb .* HelpVar;
This can be interpreted as generating random Fourier coefficients $x_k$ with decay behavior $|x_k| = |x_{k(m,j)}| \sim m^{-\nu}$ and random sign of equal probability. A larger value of $\nu$ gives a smoother solution $x$.
For such polynomial decay rates, the Sobolev lemma allows an interpretation in terms of differentiability, i.e., smoothness. In our setting with $2m+1$ eigenfunctions (spherical harmonics) of polynomial degree $m$, this means that we require a decay of the order of $m^{-1-\varepsilon}$, for some $\varepsilon > 0$ (i.e., $\nu > 1$), to guarantee $x$ is square integrable. Moreover, it is known (see, e.g., Freeden et al. 1998; Freeden 1999; Freeden and Michel 2004) that for $\nu > r + 2$ the function defined by the Fourier coefficients $x_k$, $|x_k| = |x_{k(m,j)}| \sim m^{-\nu}$, corresponds to an $x \in C^{(r)}$, $r \in \mathbb{N}_0$. Therefore, we only consider $\nu \ge 2$ in our tests.
An alternative way to generate solutions of different smoothness is through a source condition. However, using the standard source condition to define $x = (A^*A)^s x_0$ for some $x_0 \in X$ is not suitable for the downward continuation problem as it would imply $x \in C^{(\infty)}$, which is of less interest in practice. A more appropriate logarithmic source condition could be used to generate the solutions, but it is easier to use the direct approach above. Furthermore, our approach also allows encapsulation in the software design, i.e., the solution should not “see” the operator and vice versa. Note that, because the theory of saturation is based on source conditions, it does not apply directly for our assumed form of $x$. Nevertheless, we see similar results and will call these saturation effects.
In our evaluation of the methods, we want to identify the effect of the smoothness of the solution, determined by the decay behavior of $x$ and characterized by the parameter $\nu$, and also the effect of the noise-to-signal ratio $N2S$. To avoid too much variability in these features of $x$ for different replicates of $x$, we chose not to use a colored Gaussian variable for $x$ as in Kaipio and Somersalo (2005) and Bauer (2007). For a Gaussian random variable, both the norm of $x$ and the decay behavior of $x$ vary over a large scale, which also affects the noise-to-signal ratio.
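The solution-generation step can be sketched in Python in the same spirit. The function name `generate_solution`, the maximal degree 30, the value $\nu = 2$, and the seed are assumptions of this illustration; the random sign is drawn with `random.Random.choice` instead of the `ceil`/`rand` construction of the MATLAB procedure, which has the same distribution.

```python
import random

def generate_solution(degrees, nu, rng):
    """Coefficients x_k = sign * (1 + 0.1 N(0,1)) * (m + 1)^(-nu) for degree m."""
    x = []
    for m in degrees:
        sign = rng.choice((-1.0, 1.0))            # random sign, equal probability
        perturb = 1.0 + 0.1 * rng.gauss(0.0, 1.0)
        x.append(sign * perturb * (m + 1.0) ** (-nu))
    return x

# degrees[k - 1] = m for k = k(m, j); same layout as OrderVector in the text
degrees = [m for m in range(31) for _ in range(2 * m + 1)]
x = generate_solution(degrees, nu=2.0, rng=random.Random(0))
print(len(x), sum(v * v for v in x))
```

Because the perturbation factor stays close to 1, the norm of $x$ varies little between replicates, which is the property the text uses to keep the noise-to-signal ratio comparable across trials.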
Noise Generation
We use a finite stochastic noise model. First, for each replicate $y = Ax$, the noise level $\delta$ is defined from the input noise-to-signal ratio $N2S$ as $\delta = N2S\, \|y\| / \sqrt{q}$. Then, each time a noise vector is generated, we employ the same procedure as in Bauer and Lukas (2011): we transform $y$ via the discrete cosine transform to the time domain, add (possibly serially correlated) noise with standard deviation $\delta$, and transform back. The degree of correlation and color is determined by $\omega$. By adding the noise in a different space from the one used to generate $y = Ax$, we avoided potential inverse crimes.
For $\omega = 0$, the noise is simply Gaussian white noise. For $\omega \ne 0$, the noise is defined by a moving average process in which both the weights and order depend on $\omega$. The noise has higher correlation for larger $|\omega|$, and it has color red for $\omega > 0$ and color blue for $\omega < 0$ (the algorithm can be found in Bauer and Lukas 2011). This type of correlation is observed in many practical applications, for example, in geodesy (cf. Ditmar et al. 2003, 2007).
For our experiments, we considered two scenarios for the noise:
• White noise, i.e., $\omega = 0$
• Colored noise of unknown random color, i.e., $\omega$ is chosen as a pseudorandom variate, uniformly distributed in the interval $[-0.5, 0.5]$.
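The relation between the noise level $\delta$ and the prescribed noise-to-signal ratio can be illustrated for the white noise scenario. The Python sketch below is restricted to $\omega = 0$ and adds the noise directly instead of passing through the discrete cosine transform, which is an assumption of this sketch (for an orthonormal transform, Gaussian white noise has the same distribution in both domains). The colored noise algorithm of Bauer and Lukas (2011) is not reproduced; the data vector, seed, and value of $N2S$ are toy choices.

```python
import math
import random

def noise_level(y, n2s):
    """delta = N2S * ||y|| / sqrt(q), so that sqrt(E||y_delta - y||^2) = N2S * ||y||."""
    q = len(y)
    return n2s * math.sqrt(sum(v * v for v in y)) / math.sqrt(q)

def add_white_noise(y, delta, rng):
    """Return y + delta * xi with i.i.d. standard normal components xi_i."""
    return [v + delta * rng.gauss(0.0, 1.0) for v in y]

rng = random.Random(2)                       # arbitrary seed
y = [0.95 ** k for k in range(5000)]         # toy exact data y = Ax
n2s = 10.0 ** (-4)                           # log10(N2S) = -4 (illustrative)
delta = noise_level(y, n2s)
y_delta = add_white_noise(y, delta, rng)

norm_y = math.sqrt(sum(v * v for v in y))
realized = math.sqrt(sum((a - b) ** 2 for a, b in zip(y_delta, y))) / norm_y
print(delta, realized / n2s)   # the second value is close to 1 for large q
```

The realized ratio fluctuates around the prescribed one because $N2S$ is defined through the expected squared error, not through the error of a single noise realization.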
Regularization Operators
For the Tikhonov regularization parameter sequence $\alpha_n = \alpha_0 q_\alpha^n$, $n \in \mathbb{N}$, in (4), we used 100 values from 1 to $10^{-14}$ with logarithmic equal spacing, i.e., $\alpha_0 = 1.3849$ and $q_\alpha = 0.7221$. The set of test cases for Tikhonov regularization is given in Table 1. The parameter values for these cases were chosen to achieve a balance between speed and the representation of a reasonably wide set of problems. In addition, the parameter values are constrained so that the optimal regularization parameter lies clearly between 1 and $10^{-14}$, i.e., it is not near an endpoint.
For the sequence $l(n)$ of cutoff points in (3), i.e., for spectral cutoff regularization, we used the same procedure as in Bauer and Lukas (2011). The cutoff points are chosen such that there is a minimum spacing of 3 and, more importantly, the cutoff point $n + 1$ is the first to possess an eigenvalue $\sigma_{k(n+1)} < \sigma_{k(n)} / 1.07$, where $\sigma_{k(n)}$ is the eigenvalue corresponding to the cutoff point $n$. This avoids, in most cases, having two cutoff points that correspond to the same spherical harmonic degree $m$. This method is referred to as exponential spectral cutoff (ExpCutOff), and it also achieves the minimax rate for optimal parameter choice (cf. Bauer and Pereverzev 2005). The set of test cases we used for ExpCutOff is given in Table 2. Again, the parameter values for these cases are constrained so that the optimal parameter $n_{\mathrm{opt}}$ has $l(n_{\mathrm{opt}})$ clearly not near an endpoint.
Maximal Regularization Parameter
For most parameter choice methods, the use of a discrete set of regularization parameters with a fine enough resolution, as chosen above, does not alter the behavior of the method. Clearly, for the efficient implementation of these methods, it is useful to have a bound on the value of $n_{\mathrm{opt}}$ (i.e., a maximal regularization parameter), especially if the method minimizes some function. For a few parameter choice methods, e.g., the balancing principle (Sect. 4.6), a maximal index $N$ is an essential input in the algorithm itself. The actual value of $N$ is not crucial so long as $n_{\mathrm{opt}} < N$ and, for the sake of computational efficiency, $N$ is not too large.
For both spectral cutoff and Tikhonov regularization in a stochastic setting, it is reasonable to define the maximal index as
\[ N = \max\{ n \mid \varrho(n) < 0.5\, \varrho(\infty) \}, \tag{12} \]
where $E\|x_n^0 - x_n^\delta\|^2 = \delta^2 \varrho^2(n)$ and $\delta^2 \varrho^2(\infty)$ is the supremum of the variance. We expect that $n_{\mathrm{opt}} < N$ because if $n > N$, then it follows from (9) that $E\|x - x_n^\delta\|^2 > 0.5\, E\|x - x_\infty^\delta\|^2$, but we would expect $E\|x - x_{n_{\mathrm{opt}}}^\delta\|^2$ to be much smaller than the right-hand side.
To obtain $N$ in practice, one either has to have an analytic expression for $\delta^2 \varrho^2(n)$, as in (10) and (11) for white noise, or a good estimate of it. For any noise color, if several independent data sets are available, then a good estimate is
\[ \delta^2 \varrho^2(n) \approx 2^{-1}\, \mathrm{Mean}\{ \|x_{n,i}^\delta - x_{n,j}^\delta\|^2,\ i \ne j \}. \tag{13} \]
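The Tikhonov parameter grid and the maximal index (12) can be reproduced with a short Python sketch. The grid follows the description in the text (100 logarithmically spaced values from 1 to $10^{-14}$); the toy spectrum `sigma` and the function names are assumptions of this illustration, and $\varrho(n)$ is evaluated with the white-noise formula (11), with $\varrho(\infty)$ taken as its limit for $\alpha \to 0$.

```python
def tikhonov_grid(alpha_first=1.0, alpha_last=1e-14, count=100):
    """Log-equally spaced alpha_1..alpha_count, returned with alpha_0 and q_alpha."""
    q_alpha = (alpha_last / alpha_first) ** (1.0 / (count - 1))
    alpha_0 = alpha_first / q_alpha
    return alpha_0, q_alpha, [alpha_0 * q_alpha ** n for n in range(1, count + 1)]

def rho2_tikhonov(sigma, alpha):
    """White-noise variance factor rho^2 from (11)."""
    return sum((s / (s * s + alpha)) ** 2 for s in sigma)

def maximal_index(rho, rho_sup):
    """N = max{n : rho(n) < 0.5 * rho_sup} as in (12); rho[0] belongs to n = 1."""
    admissible = [n for n, r in enumerate(rho, start=1) if r < 0.5 * rho_sup]
    return max(admissible) if admissible else None

alpha_0, q_alpha, alphas = tikhonov_grid()
sigma = [0.7 ** k for k in range(1, 31)]                 # toy spectrum, not from the text
rho = [rho2_tikhonov(sigma, a) ** 0.5 for a in alphas]
rho_sup = sum(1.0 / (s * s) for s in sigma) ** 0.5       # limit alpha -> 0
print(round(alpha_0, 4), round(q_alpha, 4), maximal_index(rho, rho_sup))
```

The first two printed numbers can be compared with the values $\alpha_0 = 1.3849$ and $q_\alpha = 0.7221$ quoted in the text; the third depends on the toy spectrum and only shows how (12) truncates the search range.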
In the experiments, we use four data sets to obtain $N$ for the methods that require a maximal parameter. Often, two sets of data are sufficient (see Bauer 2007 for further details). Furthermore, so that all the parameter choice methods can be compared on an equal basis, we use the same maximal index $N$ for all the methods. For many methods, the usage of $N$ has almost no effect on the results, but for some methods, it has the beneficial effect of reducing the number of severely under-smoothed solutions (see also Bauer and Lukas 2011).
If, in practice, only a single data set is available, then it may not be possible to estimate $\delta^2 \varrho^2(n)$ if the noise is correlated with unknown covariance. Then one can define a maximal index $N_\infty$ by $l(N_\infty) = q$ for spectral cutoff and by $\alpha_{N_\infty} \approx \sigma_q^2$ for Tikhonov regularization. For the methods that perform much worse without the use of the maximal index $N$, the results for $N$ and $N_\infty$ may be quite different. However, for the methods that perform essentially the same with or without the use of $N$, the results for $N$ and $N_\infty$ will be very similar (see Bauer and Lukas 2011).
Runs and Organization of the Results
For each parameter choice method, we performed exactly the same experiments, constructed as follows:
• For each of the cases in Tables 1 and 2, and for both white noise and random colored noise, we generated 4 operators $A$ as described in Sect. 3.1.
• For each operator $A$, we generated 8 solutions $x$ as described in Sect. 3.1.
• For each pair $(A, x)$, we generated 8 times 8 different noisy data vectors $y^\delta$ as described in Sect. 3.1. In the colored noise scenario, each group of 8 data vectors has a different, randomly chosen color, and within the group, the color is the same.
This means that, for each test case and noise scenario, there are $4 \cdot 8 \cdot 64 = 2{,}048$ inverse problems that need to be solved. The hierarchical structure was chosen in order to considerably reduce the computational cost. In total, this chapter is based on the solution of more than 3.4 million inverse problems, each with an operator of at least 40,000 singular values, in most cases more than twice that number.
For each parameter choice method, we will display the simulation results, i.e., the inefficiencies as defined by (6), in one figure, with four panels corresponding to Tikhonov and ExpCutOff regularization under both the white noise and colored noise scenarios. Each panel has the following features:
• For each test case (denoted by $(h, \nu, \log(N2S))$, with $M_{\max}$ determined from Tables 1 and 2), a box plot (marking the lower and upper quartiles) shows the distribution of computed inefficiencies for the method, with blue whiskers showing the range. The whiskers have maximum length of 4 times the interquartile range, and outliers beyond this are marked with a red + symbol.
• For each test case, the red middle band in the box shows the median of the computed inefficiencies, and an open green dot shows the sample mean.
3.2 Error Comparison in Case of Optimal Solutions
To conclude this section, Fig. 1 shows the distribution of the optimal errors $\|x - x_{n_{\mathrm{opt}}}^\delta\|$ for Tikhonov and ExpCutOff regularization under both the white and colored noise scenarios. The test cases and replicates are the same as those used for the simulations in the next section (see Tables 1 and 2). The box plots are constructed in the same way as described in Sect. 3.1.
The optimal errors differ by several orders of magnitude across the test cases, mostly because of the different noise levels. In addition, there can be significant variability within one test case, especially for the colored noise scenario. Depending on whether the colored noise is at the blue end or red end, both regularization methods will find it easier or harder, respectively, to extract the solution $x$, compared to the white noise situation.
Furthermore, we can see some saturation effects for Tikhonov regularization. The errors for the parameter sets $(300, 4, -8)$ and $(300, 5, -8)$ or $(500, 4, -8)$ and $(500, 5, -8)$ (to a lesser degree also $(300, 4, -10)$ and $(300, 5, -10)$) indicate there is no improvement as the smoothness increases beyond a certain value. Therefore, for our finite dimensional exponentially ill-posed problems, Tikhonov regularization behaves in a similar way as for a polynomially ill-posed problem, in line with the discussion at the end of Sect. 1.2.
Fig. 1 Comparison of optimal errors
4 Description and Evaluation of Methods
In this section, we will review most of the available parameter choice methods and evaluate them according to the process outlined in Sect. 3. For each method, we will start by describing the origin and rationale of the method. Then we will state the input requirements of the method and the algorithm that we use. This will be followed by a brief discussion of known theoretical and practical issues about the method, including whether the method works for other regularization methods. Finally, we will present the results of the numerical experiments for the method.
The methods are ordered as follows:
• The first group requires knowledge of the noise level $\delta$. We provide the correct $\delta$ in all our tests even though this is usually not available in practice. An estimation of the noise level might induce further errors.
• The second group requires at least two independent sets of data as input.
• The third group requires no knowledge about the noise.
Many of the methods use a tuning parameter or some other parameter that must be chosen. For each of these methods, we have used the tuning parameters of Bauer and Lukas (2011) to allow an easier comparison of the results. However, the setting might not be optimal for the problems here, and the optimal settings for different problems might vary significantly. Section 4.19 contains a list of methods that were not included in this study.
4.1 Discrepancy Principle
The discrepancy principle, which was originally proposed by Phillips (1962) and then developed and analyzed by Morozov (1966, 1984), is one of the oldest and most widely used parameter choice procedures (cf. Engl et al. 1996 and references therein). The rationale is simply that for a good regularized solution, the norm of the residual should match the noise level $\delta$ of the data. Although the method was originally developed in a deterministic setting, it has also been studied in a discrete, stochastic setting (see, e.g., Davies and Anderssen 1986; Lukas 1995; Vogel 2002) and a continuous stochastic setting (see Blanchard and Mathé 2012).
Method
The method needs the following input:
• Norms of residuals $\{\|Ax_n^\delta - y^\delta\|\}_{n \le N}$ until a certain bound is satisfied
• Noise level $\delta$
• Tuning parameter $\tau$
In a deterministic setting with $\|y^\delta - y\| \le \delta$, the parameter choice $n_*$ is the first $n$ such that $\|Ax_n^\delta - y^\delta\| \le \tau\delta$, where $\tau \ge 1$ is a tuning parameter. In our stochastic setting, with the error in each element of $y^\delta \in \mathbb{R}^q$ having standard deviation $\delta$, the choice $n_*$ is the first $n$ such that
\[ \|Ax_n^\delta - y^\delta\| \le \tau \delta \sqrt{q}, \tag{14} \]
where we use $\tau = 1.5$.
Known Issues
There has been a lot of work done on the convergence properties of this method (see, e.g., Groetsch 1984; Morozov 1984; Engl et al. 1996). The convergence rate in the deterministic setting for Tikhonov regularization is at the optimal order $O(\delta^{2s/(2s+1)})$ if $s \in (0, 1/2]$, but at the suboptimal order $O(\delta^{1/2})$ if $s > 1/2$. For spectral cutoff, the discrepancy principle gives the optimal order for any value of $s$. Under a logarithmic source condition, the discrepancy principle attains the optimal order for both Tikhonov and spectral cutoff regularization (see Hohage 2000; Pereverzev and Schock 2000; Nair et al. 2003; Mathé 2006).
In the stochastic setting with a data vector $y^\delta \in \mathbb{R}^q$ containing uncorrelated errors of variance $\delta^2$, for Tikhonov regularization, it is known (cf. Davies and Anderssen 1986; Lukas 1995; Vogel 2002) that as the sample size $q \to \infty$, the “expected” discrepancy principle estimate has the optimal rate for the prediction risk $E\|Ax_n^\delta - y\|^2$ (though, if $\tau = 1$, the constant makes it over-smoothing). It is also order optimal for the $X$-norm risk $E\|x_n^\delta - x\|^2$ if $x$ is not too smooth relative to the operator $A$, but otherwise it is order suboptimal (under-smoothing). Lukas (1998b) shows that for $\tau = 1$, the actual estimate is asymptotically unstable in a relative sense. For spectral cutoff, the “expected” estimate is order optimal for both the prediction risk and the $X$-norm risk (cf. Vogel 2002). For a preconditioned Gaussian white noise model and general linear regularization, Blanchard and Mathé (2012) show that the discrepancy principle yields suboptimal rates as $\delta \to 0$.
The discrepancy principle is one of the fastest methods available, since one only needs to compute the residuals until the bound (14) is satisfied. However, it has the serious drawback that it needs an accurate estimate of the noise level; even small misestimations can lead to very poor solutions (see Hansen 1998, Chap. 7).
The discrepancy principle has also been applied to and analyzed for various iterative regularization methods for linear and nonlinear problems in the deterministic setting (see Hanke and Hansen 1993; Engl et al. 1996; Engl and Scherzer 2000; Bakushinsky and Smirnova 2005; Kaltenbacher et al. 2008; Jin and Tautenhahn 2009; and the references therein). It has also been analyzed for conjugate gradient iteration in a continuous stochastic setting by Blanchard and Mathé (2012). Results From Fig. 2, for white noise, the discrepancy principle performs reasonably well for spectral cutoff regularization, as well as for Tikhonov regularization with less smooth solutions. However, significant saturation effects can be seen for 4. The results for colored noise in Fig. 2 are mediocre, with quite a bit of variation. However, the performance for Tikhonov regularization is not much worse than it is for white noise.
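The stopping rule (14) can be sketched for Tikhonov regularization as follows. This is a minimal illustration, not the implementation used in the study: it assumes the operator has already been diagonalized (so $A$ is represented by its singular values), and the test problem, the helper names `tikhonov` and `discrepancy_principle`, and the grid $\alpha_n = 0.5^n$ are hypothetical choices made only for the example.

```python
import numpy as np

def tikhonov(sigma, y_delta, alpha):
    """Tikhonov solution for a diagonal operator A = diag(sigma)."""
    return sigma * y_delta / (sigma**2 + alpha)

def discrepancy_principle(sigma, y_delta, delta, alphas, tau=1.5):
    """First n with ||A x_n - y_delta|| <= tau * delta * sqrt(q), as in (14)."""
    bound = tau * delta * np.sqrt(y_delta.size)
    for n, alpha in enumerate(alphas):
        residual = sigma * tikhonov(sigma, y_delta, alpha) - y_delta
        if np.linalg.norm(residual) <= bound:
            return n
    return len(alphas) - 1  # bound never reached: fall back to the last index

# Hypothetical test problem (not one of the chapter's test cases).
rng = np.random.default_rng(0)
q = 200
k = np.arange(1, q + 1, dtype=float)
sigma = 1.0 / k                   # singular values of A
x_true = k**-1.5                  # coefficients of the exact solution
delta = 1e-3                      # standard deviation of each data error
y_delta = sigma * x_true + delta * rng.standard_normal(q)
alphas = 0.5 ** np.arange(60)     # alpha_n = alpha_0 * q_alpha**n with alpha_0 = 1

n_star = discrepancy_principle(sigma, y_delta, delta, alphas)
print("n_star =", n_star, "alpha =", alphas[n_star])
```

Because the loop stops at the first index that satisfies the bound, only the residuals up to $n_*$ are ever computed, which is the source of the method's low cost.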
Fig. 2 Inefficiencies of the discrepancy principle
4.2
Transformed Discrepancy Principle
Motivated by the instability of the discrepancy principle to an incorrect noise level, Raus (1990, 1992) and Hämarik and Raus (2006) developed a parameter choice method in a deterministic setting where the noise level in the data $y^\delta$ is known only approximately as $\hat\delta$.

Method The transformed discrepancy principle needs the following input:
• Norms of transformed residuals $\{C(Ax_n^\delta - y^\delta)\}_{n \le N}$, where the operator $C$ depends on $A$, the regularization method, and its qualification
• Rough estimate $\hat\delta$ of the noise level $\delta$
• Tuning parameter $b$

For the stochastic case with $y^\delta \in \mathbb{R}^q$, it is assumed that a rough estimate $\hat\delta$ of the error standard deviation $\delta$ is known. For Tikhonov regularization, one computes $n_*$ as the least integer $n$ for which
$$\|A_n^{-1}(Ax_n^\delta - y^\delta)\| \le b\,\hat\delta\sqrt{q}/\sqrt{\alpha_n}, \qquad (15)$$
Fig. 3 Inefficiencies of the transformed discrepancy principle
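where $\alpha_n = \alpha_0 q_\alpha^n$ and $b$ is some constant satisfying $b > \gamma = ((1/4)^{1/4}(3/4)^{3/4})^2 \approx 0.3248$. For spectral cutoff, one computes $n_*$ as the least integer $n$ for which
$$\|A^*(Ax_n^\delta - y^\delta)\| \le b\,\hat\delta\sqrt{q}\,\sigma_{l(n)}, \qquad (16)$$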
where $b$ is some constant satisfying $b > \gamma = 1/2$. We choose $b = 1.5$ for both regularization methods.

Known Issues Note that the right-hand side of (15) is an approximate scaled bound of the propagated noise error $\|x_n^0 - x_n^\delta\| \le \delta/\sqrt{\alpha_n}$. On the left-hand side of (15) is the norm of the residual transformed to the domain space under the approximate inverse $A_n^{-1}$ of $A$. For this reason, we refer to this parameter choice method as the transformed discrepancy principle. It was shown in Raus (1992) that, for deterministic noise, the method leads to optimal convergence rates when the noise level is known exactly, and it also converges under the assumption that $\|y - y^\delta\| = O(\hat\delta)$ as $\hat\delta \to 0$. Consequently, it is more stable than the discrepancy principle in this case. No knowledge of the solution smoothness is required. A stronger result involving a deterministic oracle inequality was shown in Raus and Hämarik (2007) for the usual noise assumption. The method was also defined and shown to be convergent for problems where the
operator is only known approximately, up to a bounded perturbation in the operator norm (Raus 1992). Like the discrepancy principle, the method can be applied easily to iterative regularization methods.

Results Figure 3 shows that the transformed discrepancy principle has good to acceptable performance for all cases with white noise. Consistent with the theory, there is no saturation effect for Tikhonov regularization, so in many cases the results are significantly better than those for the discrepancy principle. However, there is substantial variation for some cases in the colored noise scenario. The results for spectral cutoff are slightly better than those of the discrepancy principle. In all situations, the results are very similar to those of the ME rule (Sect. 4.4).
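The Tikhonov form (15) of the transformed discrepancy principle can be sketched in the same diagonalized setting. This is only an illustration under stated assumptions: $A$ is taken as diagonal with singular values `sigma`, the approximate inverse is taken to be $A_n^{-1} = (A^*A + \alpha_n I)^{-1}A^*$, and the test problem, the function name, and the deliberately wrong estimate $\hat\delta = 2\delta$ are hypothetical.

```python
import numpy as np

def transformed_discrepancy(sigma, y_delta, delta_hat, alphas, b=1.5):
    """First n with ||A_n^{-1}(A x_n - y)|| <= b * delta_hat * sqrt(q) / sqrt(alpha_n)."""
    q = y_delta.size
    for n, alpha in enumerate(alphas):
        x_n = sigma * y_delta / (sigma**2 + alpha)            # Tikhonov solution
        residual = sigma * x_n - y_delta
        transformed = sigma * residual / (sigma**2 + alpha)   # A_n^{-1} applied to the residual
        if np.linalg.norm(transformed) <= b * delta_hat * np.sqrt(q) / np.sqrt(alpha):
            return n
    return len(alphas) - 1  # bound never reached: fall back to the last index

# Hypothetical test problem (not one of the chapter's test cases).
rng = np.random.default_rng(0)
q = 200
k = np.arange(1, q + 1, dtype=float)
sigma = 1.0 / k
delta = 1e-3
y_delta = sigma * k**-1.5 + delta * rng.standard_normal(q)
alphas = 0.5 ** np.arange(60)

delta_hat = 2.0 * delta   # rough estimate of the noise level, off by a factor of 2
n_star = transformed_discrepancy(sigma, y_delta, delta_hat, alphas)
print("n_star =", n_star, "alpha =", alphas[n_star])
```

Note that the right-hand side grows as $\alpha_n$ decreases, in contrast to the fixed bound in the ordinary discrepancy principle.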
4.3
Modified Discrepancy Principle (MD Rule)
The modified discrepancy principle (MD rule) was developed by Gfrerer (1987) and Engl and Gfrerer (1988) and independently by Raus (1984, 1985) for Tikhonov regularization and general regularization methods in a continuous, deterministic setting (see also Engl 1993 and Sects. 4.4 and 5.1 in Engl et al. 1996). The basic idea of the method is to minimize a bound on the squared error of the regularized solution derived from (7) to yield a practical a posteriori method with optimal convergence rates. It is also known as the Raus–Gfrerer rule or the minimum bound method. In Lukas (1998a), the method was adapted to the discrete, stochastic setting for Tikhonov regularization.

The method was developed for regularization methods defined using the spectrum of $A^*A$ by $x_\alpha^\delta = g_\alpha(A^*A)A^*y^\delta$, where $\lim_{\alpha \to 0} g_\alpha(\lambda) = 1/\lambda$. This includes Tikhonov regularization, for which $g_\alpha(\lambda) = 1/(\lambda + \alpha)$. For such methods, one can derive from (7) a bound on the squared error of the form
$$\|x - x_\alpha^\delta\|^2 \le 2\left(\varphi^2(\alpha, y) + \delta^2\varrho^2(\alpha)\right). \qquad (17)$$
The minimizer of (17) is defined by
$$f(\alpha, y) = -(\varphi^2)'(\alpha, y)/(\varrho^2)'(\alpha) = \delta^2. \qquad (18)$$
By using $y^\delta$ in place of $y$, the parameter choice is defined by $f(\alpha, y^\delta) = \delta^2$ or $f(\alpha, y^\delta) = \tau^2\delta^2$ for a tuning parameter $\tau$.

Method To use the MD rule, we need to be able to compute $(\varphi^2)'(\alpha, y)$ and $(\varrho^2)'(\alpha)$, which can be done effectively for Tikhonov and other regularization methods (see Engl and Gfrerer 1988 and Engl et al. 1996, Sect. 5.1). The method can also be applied to regularization methods with a discrete parameter, including spectral cutoff, with the derivatives above replaced by differences. The required input consists of the following:
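• All regularized solutions $\{x_n^\delta\}_{n \le N}$ until a certain bound is satisfied
• Noise level $\delta$
• Tuning parameter $\tau$

For Tikhonov regularization, the function $\varrho^2(\alpha)$ in the bound (17) is $\varrho^2(\alpha) = \alpha^{-1}$, and the parameter choice is defined by
$$\alpha\left|\left\langle Ax_\alpha^\delta - y^\delta,\ (A^*)^{-1}\frac{dx_\alpha^\delta}{d\alpha}\right\rangle\right|^{1/2} = \alpha^{3/2}\|(AA^* + \alpha I)^{-3/2}y^\delta\| = \tau\delta.$$
Using the discrete set $\{\alpha_n = \alpha_0 q_\alpha^n\}$, we can approximate the derivative $dx_\alpha^\delta/d\alpha$ on the left-hand side with $(x_n^\delta - x_{n+1}^\delta)(-\alpha_n \log q_\alpha)^{-1}$. For spectral cutoff regularization, the term $\delta^2\varrho^2(\alpha)$ in (17) is $\delta^2\sigma_{l(n)}^{-2}$, and the method can be adapted by using differences. Thus, the parameter choice $n_*$ is defined as the first $n$ such that
$$\beta_n\left|\left\langle Ax_n^\delta - y^\delta,\ (A^*)^{-1}(x_n^\delta - x_{n+1}^\delta)\right\rangle\right|^{1/2} \le \tau\delta, \qquad (19)$$
where
$$\beta_n = \begin{cases} \alpha_n^{1/2}(-\log q_\alpha)^{-1/2} & \text{for Tikhonov with } \alpha_n = \alpha_0 q_\alpha^n, \\ \left(\sigma_{l(n+1)}^{-2} - \sigma_{l(n)}^{-2}\right)^{-1/2} & \text{for spectral cutoff.} \end{cases}$$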
For the tuning parameter, we use $\tau = 1.5$ for Tikhonov regularization and $\tau = 0.5$ for spectral cutoff regularization. Note that, for spectral cutoff, this rule is different from the MD rule defined in Raus and Hämarik (2007) since the latter is just the usual discrepancy principle for this regularization method.

Known Issues The MD rule was a significant advance on the discrepancy principle because it achieves the optimal rate of convergence as $\delta \to 0$ for deterministic noise, and it does so without any knowledge of the smoothness of the solution $x$ (cf. Gfrerer 1987; Engl and Gfrerer 1988). The method also yields optimal rates for finite dimensional implementations (cf. Groetsch and Neubauer 1989) and when the operator is only known approximately (cf. Neubauer 1988). A deterministic oracle inequality is known for the MD rule with Tikhonov regularization (Raus and Hämarik 2007; Hämarik et al. 2012).

The discrete, stochastic version of the method, with a particular tuning constant, is asymptotically (as $q \to \infty$) equivalent to an unbiased risk method (see Lukas 1998a; Cavalier et al. 2002), and, in expectation, it yields the optimal convergence rate. The unbiased risk method chooses the parameter by minimizing an unbiased (for white noise) estimate of the risk, i.e., the expected squared error. However, in Lukas (1998b), it is shown asymptotically and by simulations that both of these methods are unstable and have high variability (see also Cavalier and Golubev (2006) for the unbiased risk method). By changing the tuning parameter, it is possible to improve the stability of the method.
Fig. 4 Inefficiencies of the MD rule
The modified discrepancy principle can also be applied to iterative regularization methods for linear problems, and it achieves optimal convergence rates in a deterministic setting (cf. Engl and Gfrerer 1988). For nonlinear problems, the method has been extended for Tikhonov regularization and shown to yield optimal rates (see Scherzer et al. 1993; Jin and Hou 1999). In addition, it has been proposed in Jin (2000) as the stopping rule for the iteratively regularized Gauss–Newton method yielding again optimal rates. Results As seen in Fig. 4, the MD rule performs very well in the white noise situation, especially for spectral cutoff regularization. However, it does not perform quite so well for colored noise; in particular, test cases with smoother solutions show a lot of variability. Consistent with the theory, the median results are much better than those of the discrepancy principle in the cases with smoother solutions.
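The discrete MD rule (19) for Tikhonov regularization can be sketched as follows. The sketch assumes a diagonalized operator, so that $(A^*)^{-1}$ acts as division by the singular values, and it uses the right-hand side $\tau\delta$ exactly as written in (19); the test problem and the name `md_rule` are hypothetical.

```python
import numpy as np

def md_rule(sigma, y_delta, delta, alpha0, q_alpha, n_max, tau=1.5):
    """First n with beta_n * |<A x_n - y, (A^*)^{-1}(x_n - x_{n+1})>|^{1/2} <= tau * delta."""
    alphas = alpha0 * q_alpha ** np.arange(n_max + 1)
    X = sigma * y_delta / (sigma**2 + alphas[:, None])   # row n holds the Tikhonov solution x_n
    for n in range(n_max):
        residual = sigma * X[n] - y_delta
        w = (X[n] - X[n + 1]) / sigma                    # (A^*)^{-1}(x_n - x_{n+1}) for diagonal A
        beta_n = np.sqrt(alphas[n] / -np.log(q_alpha))   # alpha_n^{1/2} (-log q_alpha)^{-1/2}
        if beta_n * np.sqrt(abs(residual @ w)) <= tau * delta:
            return n
    return n_max  # bound never reached: fall back to the last index

# Hypothetical test problem (not one of the chapter's test cases).
rng = np.random.default_rng(0)
q = 200
k = np.arange(1, q + 1, dtype=float)
sigma = 1.0 / k
delta = 1e-3
y_delta = sigma * k**-1.5 + delta * rng.standard_normal(q)

alpha0, q_alpha, n_max = 1.0, 0.5, 59
n_star = md_rule(sigma, y_delta, delta, alpha0, q_alpha, n_max)
print("n_star =", n_star, "alpha =", alpha0 * q_alpha**n_star)
```

Unlike the discrepancy principle, each step needs two neighboring regularized solutions, so the solutions are computed up front here for simplicity.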
4.4
Monotone Error (ME) Rule
The monotone error (ME) rule was proposed in Alifanov and Rumyantsev (1979), Hämarik and Raus (1999), and Tautenhahn and Hämarik (1999) for various regularization methods in a deterministic setting, and it was extensively discussed
along with other similar parameter choice rules by Hämarik and Tautenhahn (2001, 2003). The rule is based on the observation that if $n$ is too small (i.e., too much smoothing), then the error $\|x - x_n^\delta\|$ (like the regularization error $\|x - x_n^0\|$) decreases monotonically as $n$ increases.

Method As input one needs the following:
• All regularized solutions $\{x_n^\delta\}_{n \le N}$ until a certain bound is satisfied
• Noise level $\delta$
• Tuning parameter $\tau$

In contrast to similar methods, no tuning parameter is required. However, in order to gain some security in the case that $\delta$ is only known roughly, a tuning parameter $\tau \ge 1$ is advisable. For continuous regularization methods, the algorithm is formulated by differentiating with respect to the regularization parameter $\alpha$. The parameter choice $\alpha_*$ is the largest $\alpha$ such that
$$\frac{\left|\left\langle Ax_\alpha^\delta - y^\delta,\ \frac{d}{d\alpha}(A^*)^{-1}A_\alpha^{-1}y^\delta\right\rangle\right|}{\left\|\frac{d}{d\alpha}(A^*)^{-1}A_\alpha^{-1}y^\delta\right\|} \le \tau\delta \quad \text{with } \tau \ge 1.$$
In order to use it in our framework, we have generated a simple discretized version by replacing the differentials with adjacent differences. Then, in the stochastic setting with $y^\delta \in \mathbb{R}^q$ containing errors of standard deviation $\delta$, the parameter choice $n_*$ is the first $n$ such that
$$\frac{\left|\left\langle Ax_n^\delta - y^\delta,\ (A^*)^{-1}(x_n^\delta - x_{n+1}^\delta)\right\rangle\right|}{\left\|(A^*)^{-1}(x_n^\delta - x_{n+1}^\delta)\right\|} \le \tau\delta\sqrt{q}. \qquad (20)$$
We take $\tau = 1.5$ for Tikhonov regularization and $\tau = 0.75$ for spectral cutoff regularization.

Known Issues For Tikhonov regularization (and iterated Tikhonov regularization) in a deterministic setting, the ME rule has some favorable properties (cf. Tautenhahn and Hämarik 1999). If $\alpha > \alpha_*$, then the error $\|x - x_\alpha^\delta\|$ decreases monotonically as $\alpha$ is decreased and so $\|x - x_{\alpha_*}^\delta\| < \|x - x_\alpha^\delta\|$, which provides a useful bound for parameter selection. Unlike the discrepancy principle (Sect. 4.1), the ME rule is order optimal for the maximal range of the smoothness index (up to the qualification). In addition, for any noise level $\delta$, it leads to smaller errors than the modified discrepancy rule (Sect. 4.3) for the same tuning parameter. With precisely known $\delta$, the optimal tuning parameter is $\tau = 1$. In the case of spectral cutoff regularization, no optimality results are known.
In Hämarik and Raus (2009) an alternative discretized version to (20) defines $n_*$ as the first $n$ such that
$$\frac{\left|\left\langle (Ax_n^\delta + Ax_{n+1}^\delta)/2 - y^\delta,\ (A^*)^{-1}(x_n^\delta - x_{n+1}^\delta)\right\rangle\right|}{\left\|(A^*)^{-1}(x_n^\delta - x_{n+1}^\delta)\right\|} \le \tau\delta\sqrt{q}.$$
This version has the advantage that, like the continuous version, the error in $x_n^\delta$ decreases monotonically as $n$ is increased for $n < n_*$. The ME rule can also be applied to iterative regularization methods, in particular Landweber iteration, for which it is order optimal (see Hämarik and Tautenhahn 2001, 2003).

Results Figure 5 shows that the ME rule has mostly good to acceptable performance for all cases with white noise. There is substantial variation for some cases in the colored noise scenario. The performance for Tikhonov regularization is slightly better than that of the modified discrepancy rule in Fig. 4, which is consistent with the theory. For both Tikhonov and spectral cutoff regularization, the performance is almost identical to that of the transformed discrepancy principle in Sect. 4.2.
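The discretized ME rule (20) can be sketched for Tikhonov regularization as below, again assuming a diagonalized operator and a hypothetical test problem. By the Cauchy–Schwarz inequality the left-hand side of (20) never exceeds the residual norm, so with the same $\tau$ the ME rule stops no later than the discrepancy principle (14); the example computes both indices to show this.

```python
import numpy as np

def me_rule(sigma, y_delta, delta, alphas, tau=1.5):
    """First n satisfying the discretized monotone error rule (20)."""
    bound = tau * delta * np.sqrt(y_delta.size)
    X = sigma * y_delta / (sigma**2 + alphas[:, None])   # row n holds the Tikhonov solution x_n
    for n in range(len(alphas) - 1):
        residual = sigma * X[n] - y_delta
        w = (X[n] - X[n + 1]) / sigma                    # (A^*)^{-1}(x_n - x_{n+1}) for diagonal A
        w_norm = np.linalg.norm(w)
        if w_norm == 0.0:
            continue                                     # quotient undefined, skip this index
        if abs(residual @ w) / w_norm <= bound:
            return n
    return len(alphas) - 1  # bound never reached: fall back to the last index

# Hypothetical test problem (not one of the chapter's test cases).
rng = np.random.default_rng(0)
q = 200
k = np.arange(1, q + 1, dtype=float)
sigma = 1.0 / k
delta = 1e-3
y_delta = sigma * k**-1.5 + delta * rng.standard_normal(q)
alphas = 0.5 ** np.arange(60)

n_me = me_rule(sigma, y_delta, delta, alphas)

# Discrepancy principle (14) with the same tau, for comparison.
filt = sigma**2 / (sigma**2 + alphas[:, None])
res_norms = np.linalg.norm((filt - 1.0) * y_delta, axis=1)
n_dp = int(np.argmax(res_norms <= 1.5 * delta * np.sqrt(q)))

print("ME rule index:", n_me, "discrepancy principle index:", n_dp)
```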
Fig. 5 Inefficiencies of the ME rule
4.5
Varying Discrepancy Principle
This method, due to Lu and Mathé (2014) and based on Blanchard and Mathé (2012), was developed by modifying the usual discrepancy principle to take into account stochastic noise and to achieve optimal convergence rates in a continuous stochastic setting. The modification involves a certain weighted form of the discrepancy with a varying weight.

Method The varying discrepancy principle needs the following input:
• Norms of weighted residuals $\{(\lambda_n I + A^*A)^{-1/2}A^*(Ax_n^\delta - y^\delta)\}_{n \le N}$
• The effective dimension, i.e., the trace of $(\lambda_n I + A^*A)^{-1}A^*A$
• The noise level $\delta$, where the noise is assumed to be Gaussian white
• Tuning parameter $\tau$

The parameter choice $n_*$ is the first $n$ such that
$$\|(\lambda_n I + A^*A)^{-1/2}A^*(Ax_n^\delta - y^\delta)\| \le \tau\delta\sqrt{\operatorname{tr}\left((\lambda_n I + A^*A)^{-1}A^*A\right)},$$
where $\lambda_n = \alpha_n$ for Tikhonov regularization and $\lambda_n = \sigma_{l(n)}^2$ for spectral cutoff. For the tuning parameter $\tau > 1$, we use $\tau = 4$ for Tikhonov regularization and $\tau = 1.1$ for spectral cutoff.
Known Issues The rule was developed for general linear regularization methods as well as conjugate gradient iteration. In Blanchard and Mathé (2012) the discrepancy is weighted using a fixed weight parameter $\lambda$, and it is shown that this weighted discrepancy principle is order optimal as $\delta \to 0$ for an appropriate choice of $\lambda$ depending on the solution smoothness. Based on this optimality result, Lu and Mathé (2014) proposed varying $\lambda$, yielding the a posteriori method above. It is shown that the method is order optimal provided that the solution has a certain self-similarity property. In Blanchard and Mathé (2012) and Lu and Mathé (2014), the regular methods are combined with an "emergency stop" condition, which defines a maximal regularization parameter. We have not implemented this condition here, but we have used the maximal regularization parameter that is common to all the methods evaluated.

Results As seen in Fig. 6, there are significant saturation effects for 4 in the Tikhonov case. Apart from that, the performance with white noise is good, almost as good as for the MD rule in Sect. 4.3. For colored noise, the varying discrepancy principle does not perform quite so well; in particular, test cases with smoother solutions show a lot of variability.
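For Tikhonov regularization, where the weight is $\lambda_n = \alpha_n$, the varying discrepancy principle can be sketched as follows. As before, this assumes a diagonalized operator and a hypothetical test problem; the "emergency stop" condition is not included, and the function name is an illustrative choice.

```python
import numpy as np

def varying_discrepancy(sigma, y_delta, delta, alphas, tau=4.0):
    """First n with weighted discrepancy <= tau * delta * sqrt(effective dimension)."""
    for n, lam in enumerate(alphas):                           # lambda_n = alpha_n for Tikhonov
        x_n = sigma * y_delta / (sigma**2 + lam)
        residual = sigma * x_n - y_delta
        weighted = sigma * residual / np.sqrt(lam + sigma**2)  # (lam I + A^*A)^{-1/2} A^* residual
        eff_dim = np.sum(sigma**2 / (lam + sigma**2))          # tr((lam I + A^*A)^{-1} A^*A)
        if np.linalg.norm(weighted) <= tau * delta * np.sqrt(eff_dim):
            return n
    return len(alphas) - 1  # bound never reached: fall back to the last index

# Hypothetical test problem (not one of the chapter's test cases).
rng = np.random.default_rng(0)
q = 200
k = np.arange(1, q + 1, dtype=float)
sigma = 1.0 / k
delta = 1e-3
y_delta = sigma * k**-1.5 + delta * rng.standard_normal(q)
alphas = 0.5 ** np.arange(60)

n_star = varying_discrepancy(sigma, y_delta, delta, alphas)
print("n_star =", n_star, "alpha =", alphas[n_star])
```

The effective dimension replaces the factor $q$ of the ordinary discrepancy principle, so the bound adapts to how many components the current weight actually lets through.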
Fig. 6 Inefficiencies of the varying discrepancy principle
4.6
Balancing Principle
The balancing principle, due to Lepskij (1990), was originally derived for statistical estimation from direct observations in a white noise model. Since then, it has been developed further for regularization of linear inverse problems (see, e.g., Goldenshluger and Pereverzev 2000; Tsybakov 2000; Mathé and Pereverzev 2003; Bauer and Pereverzev 2005; Mathé and Pereverzev 2006) and nonlinear inverse problems (cf. Bauer and Hohage 2005; Bauer et al. 2009) in deterministic and stochastic settings. The principle aims to balance the known propagated noise error bound $\delta\varrho(n)$ in (8) with the unknown regularization error (7) by an adaptive procedure that employs a collection of differences of regularized solutions.

Method As input one needs the following:
• Maximal index $N$.
• All regularized solutions $\{x_n^\delta\}_{n \le N}$ up to $N$.
• An upper bound $\delta\varrho(n)$ for the propagated noise error $\|x_n^0 - x_n^\delta\|$; bounds are known for spectral cutoff and Tikhonov regularization (see after (8)). In the stochastic setting, a bound or estimate $\delta^2\varrho^2(n)$ of the variance $E\|x_n^0 - x_n^\delta\|^2$.
• Noise level $\delta$ (and the covariance in the stochastic setting if known). Then one can use known expressions for $\delta\varrho(n)$. Alternatively, if one has two or more independent sets of data $y_i^\delta$, then $E\|x_n^0 - x_n^\delta\|^2$ can be estimated by (13).
• Tuning constant.

Define the balancing functional by $b(n) = \max$
The noise level does not need to be known. The modified GCV estimate is defined by
$$n_* = \operatorname*{argmin}_{n \le N} \frac{\|Ax_n^\delta - y^\delta\|^2}{\left(q^{-1}\operatorname{tr}(I - cAA_n^{-1})\right)^2}, \qquad (33)$$
where $c > 1$ is the stabilization parameter (here: $c = 3$). When $c = 1$, the method reduces to GCV.

Known Issues The effect of the factor $c$ can be explained in terms of the degrees of freedom for the regularized solution, defined as $df = \operatorname{tr}(AA_n^{-1})$, where $AA_n^{-1}$ is the influence matrix (cf. Cummins et al. 2001). Clearly the factor introduces a pole at $q/c$ in the objective function as a function of $df$, which constrains the value of $n_*$ so that $df < q/c$ and modifies the function's shape to prevent under-smoothing. The modified GCV method for Tikhonov regularization is closely related to the RGCV method of Sect. 4.16 in the sense that, under appropriate conditions, they are asymptotically equivalent as the sample size $q \to \infty$ (see Lukas 2008). Clearly, the modified GCV estimate can be computed in the same way as the GCV estimate of Sect. 4.15. It can be expected that the modified GCV method can be extended in the same way as GCV to iterative regularization methods.

Results As seen in Fig. 21, the modified GCV method has reasonably good performance across all the test cases for spectral cutoff in both the white and colored noise situations. For Tikhonov regularization, it performs well in the cases with less smooth solutions, but, like GCV, it suffers from the saturation effect. With $c = 3$ here, the results for the colored noise situation are similar (slightly better) to those of R1GCV in Fig. 20.
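The estimate (33) and the role of the pole at $df = q/c$ can be illustrated with a short sketch for Tikhonov regularization. It assumes a diagonalized operator, so the influence matrix $AA_n^{-1}$ is diagonal with entries $\sigma_k^2/(\sigma_k^2 + \alpha_n)$. Restricting the search to indices with $df < q/c$ is how this sketch enforces the constraint described above; that restriction, the test problem, and the function name are assumptions of the example.

```python
import numpy as np

def modified_gcv(sigma, y_delta, alphas, c=3.0):
    """Minimize ||A x_n - y||^2 / (q^{-1} tr(I - c A A_n^{-1}))^2 over n with df < q/c."""
    q = y_delta.size
    filt = sigma**2 / (sigma**2 + alphas[:, None])        # diagonal of the influence matrix
    df = filt.sum(axis=1)                                 # degrees of freedom tr(A A_n^{-1})
    rss = np.sum(((filt - 1.0) * y_delta) ** 2, axis=1)   # ||A x_n - y_delta||^2
    denom = 1.0 - c * df / q                              # q^{-1} tr(I - c A A_n^{-1})
    score = np.full(len(alphas), np.inf)
    ok = denom > 0.0                                      # stay on the df < q/c side of the pole
    score[ok] = rss[ok] / denom[ok] ** 2
    return int(np.argmin(score)), df

# Hypothetical test problem (not one of the chapter's test cases).
rng = np.random.default_rng(0)
q = 200
k = np.arange(1, q + 1, dtype=float)
sigma = 1.0 / k
delta = 1e-3
y_delta = sigma * k**-1.5 + delta * rng.standard_normal(q)
alphas = 0.5 ** np.arange(40)

n_mod, df = modified_gcv(sigma, y_delta, alphas, c=3.0)
n_gcv, _ = modified_gcv(sigma, y_delta, alphas, c=1.0)    # c = 1 gives ordinary GCV
print("modified GCV index:", n_mod, "df:", df[n_mod], "GCV index:", n_gcv)
```

Since $\alpha_n$ decreases with $n$, a smaller index means more smoothing; on this grid the modified estimate cannot select a larger index than GCV does.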
Fig. 21 Inefficiencies of modified generalized cross-validation
4.19
Other Methods
There are a few parameter choice methods that, for certain reasons, we did not consider in our study. These methods and the reasons are listed below. It should be noted that a method's omission does not mean that it performs poorly:
• The method is known to behave very much like another method in the study:
  – Akaike information criterion (AIC) of Akaike (1973) behaves like GCV (cf. Eubank 1988).
  – The unbiased prediction risk method (also known as Mallows' $C_p$ or $C_L$) behaves like GCV (see Li 1987; Wahba 1990; Efron 2001).
  – The unbiased risk method of Lukas (1998a) and Cavalier and Golubev (2006) behaves like the modified discrepancy principle with a particular tuning constant.
  – Rule R1 of Hämarik and Raus (2009) behaves like the fast balancing principle with $\ell(n) = n + 1$ in (22).
• The method does not generalize in an obvious way to both Tikhonov regularization and spectral cutoff regularization:
  – Arcangeli's principle (see Arcangeli 1966; Groetsch and Schock 1984; Nair and Rajan 2002), which is an early method developed for Tikhonov regularization with deterministic noise
  – Rule R2 of Raus and Hämarik (2009) and Hämarik et al. (2011, 2012), which was developed for (iterated) Tikhonov regularization with deterministic noise
  – Risk hull method of Cavalier and Golubev (2006), which was developed for spectral cutoff regularization with Gaussian white noise
• The method is difficult to automate:
  – Some versions of the L-curve method
• The method requires a heavy load of precomputations, which makes it difficult to test in our large-scale simulation experiments. Such methods usually make use of a precise specification of the stochastic noise model, which in practice (at least at the necessary precision) is not known:
  – Risk hull method of Cavalier and Golubev (2006),
  – Modified balancing principle of Spokoiny and Vial (2009).
5
Conclusion
In this section, we will summarize the requirements and properties of the parameter choice methods described in Sect. 4 and then compare them with respect to their average performance in our numerical experiments. For standard uses, this (in combination with Bauer and Lukas 2011) should serve as a practical guide through the jungle of different methods. However, for special problems, the performance of the methods might be quite different. In addition, for each method, there are implementation issues which might make one or other method more practical in certain situations. Most of the methods are based on principles or rationales that carry over to other regularizations, and so, if a regularization behaves like Tikhonov or spectral cutoff regularization, then similar results can be expected. Thus, our results for Tikhonov and spectral cutoff regularization are seen as model results for regularization methods of low and high qualification, respectively.
5.1
Requirements and Properties
The requirements and general properties of each method are summarized in Table 3. The methods are split into the three groups according to the input information required as listed in the beginning of Sect. 4. We provide the following columns in Table 3:
"Spaces": Function space(s) in which norms/inner products are computed.
"Setting": Data setting, i.e., infinite data possible (∞) or finite data required (q).
"Tuning": Use of tuning parameters (yes/no).
"Complexity": Computational complexity, i.e., all regularized solutions needed (N) or bisection possible (log).
"Cal.": Requirements for expensive calculations such as traces (tr) or determinants (det) in each step.
"Proofs": Availability of proofs in a deterministic (D) and/or stochastic (S) setting.
"Gen.": Generalizations to other regularization methods (mostly Landweber) for linear (l) or nonlinear (nl) problems.

Table 3 Requirements and general properties of the methods. (D(±) for the L-curve method means that there are both positive and negative results)

Method            Spaces  Tuning  Setting  Complexity  Cal.  Proofs  Gen.
Disc. princ.      Y       y       ∞        log         –     D+S     l+nl
Trans. disc.      X       y       ∞        log         –     D       –
MD rule           Y       y       ∞        log         –     D+S     l+nl
ME rule           Y       y       ∞        log         –     D       –
Vary. disc.       X       y       ∞        log         tr    S       l
Bal. pr. (wh.)    X       y       ∞        N           tr    D+S     l+nl
Balancing pr.     X       y       ∞        N           –     D+S     l+nl
Fast balancing    X       y       ∞        log         –     D+S     l
Hardened bal.     X       n       ∞        N           –     S       –
Hard. bal. (wh.)  X       n       ∞        N           tr    S       –
Quasi-opt.        X       n       ∞        log         –     D+S     l
L-curve           X;Y     n       ∞        log         –     D(±)    –
MDP rule          Y       n       ∞        log         –     D       l
Extrap. err.      X;Y     n       q        log         –     –       –
NCP meth.         Y       n       q        log         –     –       –
Residual meth.    Y       n       ∞        log         tr    S       –
GML               Y       n       q        log         det   S       –
GCV               Y       n       q        log         tr    S       l
Robust GCV        Y       y       q        log         tr    S       –
Str. rob. GCV     Y       y       q        log         tr    S       –
Modified GCV      Y       y       q        log         tr    S       –
For further details (input, origin), see the corresponding sections. Note that the optimal use of a tuning parameter usually improves the results for a certain class of problems, but setting it incorrectly can give very poor results. Setting the parameter is especially difficult if there is no theory about its effect on the regularized solution or if one cannot necessarily check the validity of a certain setting. In practice, methods without (or with very robust) tuning parameters are usually preferable. All statements given are to the best knowledge of the authors; some might change in the course of further research. For further details, we refer to Bauer and Lukas (2011).
5.2
Average Performance
For each method evaluated in Sect. 4, there are four displayed figure panels (Tikhonov/ExpCutOff with white/colored noise) with a box plot for every test case, showing the sample mean($I$) (green marker) and sample median($I$) (red marker) of the computed inefficiencies $I$. The average performance of the method across all the test cases is measured using both mean(mean($I$)) and median(median($I$)), and these results are displayed in Table 4, which again has three groups according to the input information required. The best methods, i.e., those giving the smallest mean and those giving the smallest median, in the three groups and four situations are marked using boldface.

Table 4 Mean and median inefficiencies of the methods over all the test cases

                  ExpCutOff, White  ExpCutOff, Color  Tikhonov, White   Tikhonov, Color
Method            Mean     Median   Mean     Median   Mean     Median   Mean     Median
Disc. princ.      1.59     1.51     3.14     1.52     5.91     1.53     4.83     1.54
Trans. disc.      1.47     1.43     2.66     1.44     1.32     1.31     4.16     1.30
MD rule           1.12     1.11     1.88     1.13     1.49     1.42     5.98     1.42
ME rule           1.48     1.44     2.63     1.48     1.36     1.38     3.80     1.37
Vary. disc.       1.27     1.26     2.20     1.26     4.60     1.35     3.92     1.42
Bal. pr. (wh.)    1.41     1.36     2.51     1.38     1.91     1.60     7.73     1.59
Balancing pr.     1.41     1.36     1.38     1.35     1.91     1.60     1.83     1.58
Fast balancing    1.48     1.43     1.43     1.42     2.37     1.90     2.24     1.88
Hardened bal.     1.01     1.01     1.01     1.01     1.03     1.02     1.03     1.03
Hard. bal. (wh.)  1.01     1.01     7.50     1.04     1.02     1.02     4.92     1.04
Quasi-opt.        1.05     1.04     1.06     1.05     1.06     1.07     1.06     1.08
L-curve           27,349   1,548    30,140   1,602    2,676    1,024    5,211    1,031
MDP rule          1.05     1.04     1.06     1.06     1.71     1.61     3.28     1.63
Extrap. err.      42.45    1.85     60.85    1.92     7.11     1.57     7.31     1.57
NCP meth.         1.16     1.14     25.08    2.08     3.03     1.13     19.65    1.84
Residual meth.    1.28     1.25     1.87     1.33     9.87     1.95     18.29    2.03
GML               77,361   3,257    73,677   3,140    8.25     2.02     8.31     1.97
GCV               1.00     1.00     1,487    1.66     17.42    3.87     316.4    3.95
Robust GCV        1.09     1.08     717.8    1.39     13.72    3.08     178.8    2.85
Str. rob. GCV     1.60     1.58     2.15     1.58     6.49     1.60     6.61     1.54
Modified GCV      1.19     1.15     1.63     1.16     4.34     1.56     3.87     1.60

In most cases, the same method is best for both the mean and median, with both values close to the ideal value of 1. Several other methods, especially in the colored noise situation, have a high mean and low median, which means that, while the method mostly performs well, it also generates a significant number of very poor outliers. In each of the three groups, the following methods seem to be the best performers for white and colored noise:
• Noise level $\delta$ is known accurately: modified discrepancy principle (ExpCutOff), transformed discrepancy principle (Tikhonov), monotone error rule (Tikhonov), and varying discrepancy principle.
• Several independent data sets are available: hardened balancing principle.
• No extra information: hardened balancing principle (white), quasi-optimality criterion, modified discrepancy partner rule (ExpCutOff), strong robust GCV (ExpCutOff), and modified GCV (ExpCutOff).

It should be noted that, in most situations, the best methods that do not require the noise level performed better than the methods that use the noise level (i.e., those in the first group). This indicates that one should not use the "known $\delta$" methods for the sake of performance, but there may be another reason, e.g., computational efficiency, for doing so.

The conclusions here are consistent with those in the recent numerical studies of Palm (2010), Bauer and Lukas (2011), Hämarik et al. (2011), and Reichel and Rodriguez (2013). The three of these other than Bauer and Lukas (2011) use the set of test problems of Hansen (1994) with uniformly distributed white noise and colored noise. Comparing the results here with Bauer and Lukas (2011), the multiplicity of the eigenvalues in our geomathematically motivated tests seems to have a stabilizing influence on many methods, as they clearly show less variance and fewer outliers here.

From Table 4, some methods performed much worse for the colored noise situation than for white noise. In practical applications, where one normally has only a vague idea of the underlying noise structure, it is usually advisable to opt for those methods which have good performance for colored noise. This is especially true as the methods that performed well for colored noise also performed quite well for the white noise situation.

It should be remarked that often the application of the regularization method requires much more computational effort than the evaluation of the parameter choice. Therefore, it can be advisable to use several parameter choice methods to help find the best choice of the parameter.
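The difference between the two summary measures used in Table 4 can be illustrated with a small synthetic example. The numbers below are made up for the illustration and are not taken from the study: a method that is mostly close to the ideal value but produces a few very poor outliers in one test case gets a high mean(mean($I$)) while its median(median($I$)) stays low.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical inefficiencies: 6 test cases (rows) with 50 replicates each,
# all between 1.0 and 1.3, plus three very poor outliers in one test case.
ineff = 1.0 + 0.3 * rng.random((6, 50))
ineff[2, :3] = [500.0, 800.0, 1200.0]

mean_of_means = np.mean(ineff.mean(axis=1))               # mean(mean(I))
median_of_medians = np.median(np.median(ineff, axis=1))   # median(median(I))

print(f"mean(mean(I)) = {mean_of_means:.2f}")
print(f"median(median(I)) = {median_of_medians:.2f}")
```

Reading both columns of Table 4 together therefore separates methods that are uniformly mediocre from methods that are usually good but occasionally fail badly.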
References
Abascal J-F, Arridge SR, Bayford RH, Holder DS (2008) Comparison of methods for optimal choice of the regularization parameter for linear electrical impedance tomography of brain function. Physiol Meas 29:1319–1334
Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Second international symposium on information theory (Tsahkadsor, 1971). Akadémiai Kiadó, Budapest, pp 267–281
Åkesson EO, Daun KJ (2008) Parameter selection methods for axisymmetric flame tomography through Tikhonov regularization. Appl Opt 47:407–416
Alifanov O, Rumyantsev S (1979) On the stability of iterative methods for the solution of linear ill-posed problems. Sov Math Dokl 20:1133–1136
Anderssen RS, Bloomfield P (1974) Numerical differentiation procedures for non-exact data. Numer Math 22:157–182
Arcangeli R (1966) Pseudo-solution de l'équation Ax = y. C R Acad Sci Paris Sér A 263(8):282–285
Bakushinskii AB (1984) Remarks on choosing a regularization parameter using the quasi-optimality and ratio criterion. USSR Comput Math Math Phys 24:181–182
Bakushinsky A, Smirnova A (2005) On application of generalized discrepancy principle to iterative methods for nonlinear ill-posed problems. Numer Funct Anal Optim 26(1):35–48
Bauer F (2007) Some considerations concerning regularization and parameter choice algorithms. Inverse Probl 23(2):837–858
Bauer F, Hohage T (2005) A Lepskij-type stopping rule for regularized Newton methods. Inverse Probl 21:1975–1991
Bauer F, Kindermann S (2008) The quasi-optimality criterion for classical inverse problems. Inverse Probl 24:035002, 20
Bauer F, Kindermann S (2009) Recent results on the quasi-optimality principle. J Inverse Ill-Posed Probl 17(1):5–18
Bauer F, Lukas MA (2011) Comparing parameter choice methods for regularization of ill-posed problems. Math Comput Simul 81(9):1795–1841
Bauer F, Mathé P (2011) Parameter choice methods using minimization schemes. J Complex 27:68–85
Bauer F, Munk A (2007) Optimal regularization for ill-posed problems in metric spaces. J Inverse Ill-Posed Probl 15(2):137–148
Bauer F, Pereverzev S (2005) Regularization without preliminary knowledge of smoothness and error behavior. Eur J Appl Math 16(3):303–317
Bauer F, Reiß M (2008) Regularization independent of the noise level: an analysis of quasioptimality. Inverse Probl 24:055009, 16
Bauer F, Hohage T, Munk A (2009) Iteratively regularized Gauss–Newton method for nonlinear inverse problems with random noise. SIAM J Numer Anal 47(3):1827–1846
Becker SMA (2011) Regularization of statistical inverse problems and the Bakushinskii veto. Inverse Probl 27:115010, 22
Blanchard G, Mathé P (2012) Discrepancy principle for statistical inverse problems with application to conjugate gradient iteration. Inverse Probl 28:115011, 23
Brezinski C, Rodriguez G, Seatzu S (2008) Error estimates for linear systems with applications to regularization. Numer Algorithms 49(1–4):85–104
Brezinski C, Rodriguez G, Seatzu S (2009) Error estimates for the regularization of least squares problems. Numer Algorithms 51(1):61–76
Brown LD, Low MG (1996) Asymptotic equivalence of nonparametric regression and white noise. Ann Stat 24(6):2384–2398
Calvetti D, Hansen PC, Reichel L (2002) L-curve curvature bounds via Lanczos bidiagonalization. Electron Trans Numer Anal 14:20–35
Candès EJ (2006) Modern statistical estimation via oracle inequalities. Acta Numer 15:257–325
Cavalier L (2008) Nonparametric statistical inverse problems. Inverse Probl 24(3):034004, 19
Cavalier L, Golubev Y (2006) Risk hull method and regularization by projections of ill-posed inverse problems. Ann Stat 34(4):1653–1677
Cavalier L, Golubev GK, Picard D, Tsybakov AB (2002) Oracle inequalities for inverse problems. Ann Stat 30(3):843–874
Cavalier L, Golubev Y, Lepski O, Tsybakov A (2004) Block thresholding and sharp adaptive estimation in severely ill-posed inverse problems. Theory Probab Appl 48(3):426–446
Colton D, Kress R (1998) Inverse acoustic and electromagnetic scattering theory, 2nd edn. Springer, Berlin
Correia T, Gibson A, Schweiger M, Hebden J (2009) Selection of regularization parameter for optical topography. J Biomed Opt 14(3):034044, 11
Cox DD (1988) Approximation of method of regularization estimators. Ann Stat 16(2):694–712
Craven P, Wahba G (1979) Smoothing noisy data with spline functions. Numer Math 31:377–403
Cummins DJ, Filloon TG, Nychka D (2001) Confidence intervals for nonparametric curve estimates: toward more uniform pointwise coverage. J Am Stat Assoc 96(453):233–246
Davies AR (1982) On the maximum likelihood regularization of Fredholm convolution equations of the first kind. In: Baker C, Miller G (eds) Treatment of integral equations by numerical methods. Academic, London, pp 95–105
1770
F. Bauer et al.
Davies AR, Anderssen RS (1986) Improved estimates of statistical regularization parameters in Fourier differentiation and smoothing. Numer Math 48:671–697 Ditmar P, Kusche J, Klees R (2003) Computation of spherical harmonic coefficients from gravity gradiometry data to be acquired by the GOCE satellite: regularization issues. J Geod 77(7–8):465–477 Ditmar P, Klees R, Liu X (2007) Frequency-dependent data weighting in global gravity field modeling from satellite data contaminated by non-stationary noise. J Geod 81(1):81–96 Efron B (2001) Selection criteria for scatterplot smoothers. Ann Stat 29(2):470–504 Eggermont PN, LaRiccia V, Nashed MZ (2014) Noise models for ill-posed problems. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg Eldén L (1984) A note on the computation of the generalized cross-validation function for ill-conditioned least squares problems. BIT 24:467–472 Engl HW (1993) Regularization methods for the stable solution of inverse problems. Surv Math Ind 3(2):71–143 Engl HW, Gfrerer H (1988) A posteriori parameter choice for general regularization methods for solving linear ill-posed problems. Appl Numer Math 4(5):395–417 Engl HW, Scherzer O (2000) Convergence rate results for iterative methods for solving nonlinear ill-posed problems. In: Colton D, Engl H, Louis A, McLaughlin J, Rundell W (eds) Survey on solution methods for inverse problems. Springer, New York, pp 7–34 Engl HW, Hanke H, Neubauer A (1996) Regularization of inverse problems. Kluwer, Dordrecht Eubank RL (1988) Spline smoothing and nonparametric regression. Marcel Dekker, New York Evans SN, Stark PB (2002) Inverse problems as statistics. Inverse Probl 18(4):R55–R97 Farquharson CG, Oldenburg DW (2004) A comparison of automatic techniques for estimating the regularization parameter in non-linear inverse problems. Geophys J Int 156:411–425 Fitzpatrick BG (1991) Bayesian analysis in inverse problems. 
Inverse Probl 7(5):675–702 Freeden W (1999) Multiscale modelling of spaceborne geodata. Teubner, Leipzig Freeden W, Gerhards C (2013) Geomathematically oriented potential theory. Chapman & Hall/CRC, Boca Raton Freeden W, Gutting M (2013) Special functions of mathematical (geo-)physics. Birkhäuser, Basel Freeden W, Michel V (2004) Multiscale potential theory (with applications to geoscience). Birkhäuser, Boston Freeden W, Schreiner M (2015) Satellite gravity gradiometry (SGG): from scalar to tensorial solution. In: Freeden W, Nashed MZ, Sonar T (Eds) Handbook of Geomathematics, 2nd Edn. Springer Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere. Oxford University Press, Oxford Gfrerer H (1987) An a posteriori parameter choice for ordinary and iterated Tikhonov regularization of ill-posed problems leading to optimal convergence rates. Math Comput 49(180):507–522 Girard D (1989) A fast “Monte-Carlo cross-validation” procedure for large least squares problems with noisy data. Numer Math 56(1):1–23 Glasko V, Kriksin Y (1984) On the quasioptimality principle for linear ill-posed problems in Hilbert space. Vychisl Math Math Fiz 24:1603–1613 Goldenshluger A, Pereverzev S (2000) Adaptive estimation of linear functionals in Hilbert scales from indirect white noise observations. Probl Theory Relat Fields 118(2):169–186 Golub GH, von Matt U (1997) Generalized cross-validation for large scale problems. J Comput Graph Stat 6(1):1–34 Golub GH, Heath M, Wahba G (1979) Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics 21:215–223 Grad J, Zakrajšek E (1972) LR algorithm with Laguerre shift for symmetric tridiagonal matrices. Comput J 15(3):268–270 Groetsch CW (1984) The theory of Tikhonov regularization for Fredholm equations of the first kind. Pitman, Boston Groetsch CW, Neubauer A (1989) Regularization of ill-posed problems: optimal parameter choice in finite dimensions. J Approx Theory 58(2):184–200
Evaluation of Parameter Choice Methods for Regularization of Ill-Posed. . .
1771
Groetsch CW, Schock E (1984) Asymptotic convergence rate of Arcangeli’s method for ill-posed problems. Appl Anal 18:175–182 Gu C (2002) Smoothing spline ANOVA models. Springer, New York Gu C, Bates DM, Chen Z, Wahba G (1989) The computation of generalized cross-validation functions through Householder tridiagonalization with applications to the fitting of interaction spline models. SIAM J Matrix Anal Appl 10(4):457–480 Haber E, Oldenburg DW (2000) A GCV based method for nonlinear ill-posed problems. Comput Geosci 4:41–63 Hämarik U, Raus T (1999) On the a posteriori parameter choice in regularization methods. Proc Estonian Acad Sci Phys Math 48(2):133–145 Hämarik U, Raus T (2006) On the choice of the regularization parameter in ill-posed problems with approximately given noise level of data. J Inverse Ill-Posed Probl 14(3):251–266 Hämarik U, Raus T (2009) About the balancing principle for choice of the regularization parameter. Numer Funct Anal Optim 30(9–10):951–970 Hämarik U, Tautenhahn U (2001) On the monotone error rule for parameter choice in iterative and continuous regularization methods. BIT 41(5):1029–1038 Hämarik U, Tautenhahn U (2003) On the monotone error rule for choosing the regularization parameter in ill-posed problems. In: Lavrent’ev MM et al (ed) Ill-posed and non-classical problems of mathematical physics and analysis. VSP, Utrecht, pp 27–55 Hämarik U, Palm R, Raus T (2009) On minimization strategies for choice of the regularization parameter in ill-posed problems. Numer Funct Anal Optim 30(9–10):924–950 Hämarik U, Palm R, Raus T (2011) Comparison of parameter choices in regularization algorithms in case of different information about noise level. Calcolo 48:47–59 Hämarik U, Palm R, Raus T (2012) A family of rules for parameter choices in Tikhonov regularization of ill-posed problems with inexact noise level. J Comput Appl Math 236: 2146–2157 Hanke M (1996) Limitations of the L-curve method in ill-posed problems. 
BIT 36(2):287–301 Hanke M, Hansen PC (1993) Regularization methods for large-scale problems. Surv Math Ind 3(4):253–315 Hanke M, Raus T (1996) A general heuristic for choosing the regularization parameter in ill-posed problems. SIAM J Sci Comput 17(4):956–972 Hansen PC (1992) Analysis of discrete ill-posed problems by means of the L-curve. SIAM Rev 34(4):561–580 Hansen PC (1994) Regularization tools: a Matlab package for analysis and solution of discrete ill-posed problems. Numer Algorithms 6:1–35 Hansen PC (1998) Rank-deficient and discrete ill-posed problems. SIAM, Philadelphia Hansen PC (2001) The L-curve and its use in the numerical treatment of inverse problems. In: Johnstone PR (ed) Computational inverse problems in electrocardiography. WIT, Southampton, pp 119–142 Hansen PC, O’Leary DP (1993) The use of the L-curve in the regularization of discrete ill-posed problems. SIAM J Sci Comput 14(6):1487–1503 Hansen PC, Kilmer ME, Kjeldsen RH (2006) Exploiting residual information in the parameter choice for discrete ill-posed problems. BIT 46(1):41–59 Hansen PC, Jensen TK, Rodriguez G (2007) An adaptive pruning algorithm for the discrete L-curve criterion. J Comput Appl Math 198(2):483–492 Hofinger A, Pikkarainen HK (2007) Convergence rate for the Bayesian approach to linear inverse problems. Inverse Probl 23(6):2469–2484 Hofmann B (1986) Regularization of applied inverse and ill-posed problems. Teubner, Leipzig Hofmann B, Mathé P (2007) Analysis of profile functions for general linear regularization methods. SIAM J Numer Anal 45(3):1122–1141 Hohage T (2000) Regularization of exponentially ill-posed problems. Numer Funct Anal Optim 21:439–464 Hutchinson M (1989) A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines. Commun Stat Simul Comput 18(3):1059–1076
1772
F. Bauer et al.
Hutchinson MF, de Hoog FR (1985) Smoothing noisy data with spline functions. Numer Math 47:99–106 Jansen M, Malfait M, Bultheel A (1997) Generalized cross validation for wavelet thresholding. Signal Process 56(1):33–44 Jin Q-N (2000) On the iteratively regularized Gauss–Newton method for solving nonlinear illposed problems. Math Comput 69(232):1603–1623 Jin Q-N, Hou Z-Y (1999) On an a posteriori parameter choice strategy for Tikhonov regularization of nonlinear ill-posed problems. Numer Math 83(1):139–159 Jin Q, Tautenhahn U (2009) On the discrepancy principle for some Newton type methods for solving nonlinear inverse problems. Numer Math 111(4):509–558 Johnstone PR, Gulrajani RM (2000) Selecting the corner in the L-curve approach to Tikhonov regularization. IEEE Trans Biomed Eng 47(9):1293–1296 Kaipio J, Somersalo E (2005) Statistical and computational inverse problems. Springer, New York Kaltenbacher B, Neubauer A, Scherzer O (2008) Iterative regularization methods for nonlinear ill-posed problems. Walter de Gruyter, Berlin Kilmer ME, O’Leary DP (2001) Choosing regularization parameters in iterative methods for illposed problems. SIAM J Matrix Anal Appl 22(4):1204–1221 Kindermann S (2011) Convergence analysis of minimization-based noise level-free parameter choice rules for linear ill-posed problems. Electron Trans Numer Anal 38:233–257 Kindermann S, Neubauer A (2008) On the convergence of the quasioptimality criterion for (iterated) Tikhonov regularization. Inverse Probl Imaging 2(2):291–299 Kohn R, Ansley CF, Tharm D (1991) The performance of cross-validation and maximum likelihood estimators of spline smoothing parameters. J Am Stat Assoc 86:1042–1050 Kou SC, Efron B (2002) Smoothers and the Cp , generalized maximum likelihood, and extended exponential criteria: a geometric approach. J Am Stat Assoc 97(459):766–782 Larkin FM (1972) Gaussian measure in Hilbert space and applications in numerical analysis. 
Rocky Mt J Math 2:379–421 Lawson CL, Hanson RJ (1974) Solving least squares problems. Prentice-Hall, Englewood Cliffs Leonov A (1979) Justification of the choice of regularization parameter according to quasioptimality and quotient criteria. USSR Comput Math Math Phys 18(6):1–15 Lepskij O (1990) On a problem of adaptive estimation in Gaussian white noise. Theory Probab Appl 35(3):454–466 Li K-C (1986) Asymptotic optimality of CL and generalized cross-validation in ridge regression with application to spline smoothing. Ann Stat 14:1101–1112 Li K-C (1987) Asymptotic optimality for Cp , CL , cross-validation and generalized crossvalidation: discret index set. Ann Stat 15:958–975 Lu S, Mathé P (2013) Heuristic parameter selection based on functional minimization: optimality and model function approach. Math Comput 82(283):1609–1630 Lu S, Mathé P (2014) Discrepancy based model selection in statistical inverse problems. J Complex 30(3):290-308 Lu S, Pereverzev S (2014) Multiparameter regularization in downward continuation of satellite data. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg Lukas MA (1988) Convergence rates for regularized solutions. Math Comput 51(183):107–131 Lukas MA (1993) Asymptotic optimality of generalized cross-validation for choosing the regularization parameter. Numer Math 66(1):41–66 Lukas MA (1995) On the discrepancy principle and generalised maximum likelihood for regularisation. Bull Aust Math Soc 52(3):399–424 Lukas MA (1998a) Asymptotic behaviour of the minimum bound method for choosing the regularization parameter. Inverse Probl 14(1):149–159 Lukas MA (1998b) Comparisons of parameter choice methods for regularization with discrete noisy data. Inverse Probl 14(1):161–184 Lukas MA (2006) Robust generalized cross-validation for choosing the regularization parameter. Inverse Probl 22(5):1883–1902
Evaluation of Parameter Choice Methods for Regularization of Ill-Posed. . .
1773
Lukas MA (2008) Strong robust generalized cross-validation for choosing the regularization parameter. Inverse Probl 24:034006, 16 Lukas MA (2010) Robust GCV choice of the regularization parameter for correlated data. J Integral Equ Appl 22(3):519–547 Mair BA (1994) Tikhonov regularization for finitely and infinitely smoothing operators. SIAM J Math Anal 25:135–147 Mair BA, Ruymgaart FH (1996) Statistical inverse estimation in Hilbert scales. SIAM J Appl Math 56(5):1424–1444 Mathé P (2006) What do we learn from the discrepancy principle? Z Anal Anwend 25(4):411–420 Mathé P, Pereverzev SV (2003) Geometry of linear ill-posed problems in variable Hilbert spaces. Inverse Probl 19(3):789–803 Mathé P, Pereverzev SV (2006) Regularization of some linear ill-posed problems with discretized random noisy data. Math Comput 75(256):1913–1929 Michel V (2014) Tomography: problems and multiscale solutions. In: Freeden W, Nashed MZ, Sonar T (eds) Handbook of geomathematics, 2nd edn. Springer, Heidelberg Morozov VA (1966) On the solution of functional equations by the method of regularization. Soviet Math Dokl 7:414–417 Morozov VA (1984) Methods for solving incorrectly posed problems. Springer, New York Nair MT, Rajan MP (2002) Generalized Arcangeli’s discrepancy principles for a class of regularization methods for solving ill-posed problems. J Inverse Ill-Posed Probl 10(3):281–294 Nair MT, Schock E, Tautenhahn U (2003) Morozov’s discrepancy principle under general source conditions. Z Anal Anwend 22:199–214 Neubauer A (1988) An a posteriori parameter choice for Tikhonov regularization in the presence of modeling error. Appl Numer Math 4(6):507–519 Neubauer A (2008) The convergence of a new heuristic parameter selection criterion for general regularization methods. Inverse Probl 24(5):055005, 10 Opsomer J, Wang Y, Yang Y (2010) Nonparametric regression with correlated errors. 
Stat Sci 16:134–153 Palm R (2010) Numerical comparison of regularization algorithms for solving ill-posed problems. PhD thesis, University of Tartu, Estonia Pensky M, Sapatinas T (2010) On convergence rates equivalency and sampling strategies in functional deconvolution models. Ann Stat 38(3):1793–1844 Pereverzev S, Schock E (2000) Morozov’s discrepancy principle for Tikhonov regularization of severely ill-posed problems in finite dimensional subspaces. Numer Funct Anal Optim 21: 901–916 Phillips D (1962) A technique for the numerical solution of certain integral equations of the first kind. J Assoc Comput Mach 9:84–97 Pohl S, Hofmann B, Neubert R, Otto T, Radehaus C (2001) A regularization approach for the determination of remission curves. Inverse Probl Eng 9(2):157–174 Raus T (1984) On the discrepancy principle for the solution of ill-posed problems. Uch Zap Tartu Gos Univ 672:16–26 Raus T (1985) The principle of the residual in the solution of ill-posed problems with nonselfadjoint operator. Tartu Riikl Ül Toimetised 715:12–20 Raus T (1990) An a posteriori choice of the regularization parameter in case of approximately given error bound of data. In: Pedas A (ed) Collocation and projection methods for integral equations and boundary value problems. Tartu University, Tartu, pp 73–87 Raus T (1992) About regularization parameter choice in case of approximately given error bounds of data. In: Vainikko G (ed) Methods for solution of integral equations and ill-posed problems. Tartu University, Tartu, pp 77–89 Raus T, Hämarik U (2007) On the quasioptimal regularization parameter choices for solving illposed problems. J Inverse Ill-Posed Probl 15(4):419–439 Raus T, Hämarik U (2009) New rule for choice of the regularization parameter in (iterated) Tikhonov method. Math Model Anal 14:187–198
1774
F. Bauer et al.
Reginska T (1996) A regularization parameter in discrete ill-posed problems. SIAM J Sci Comput 17(3):740–749 Reichel L, Rodriguez G (2013) Old and new parameter choice rules for discrete ill-posed problems. Numer Algorithms 63:65–87 Robinson T, Moyeed R (1989) Making robust the cross-validatory choice of smoothing parameter in spline smoothing regression. Commun Stat Theory Methods 18(2):523–539 Rust BW (2000) Parameter selection for constrained solutions to ill-posed problems. Comput Sci Stat 32:333–347 Rust BW, O’Leary DP (2008) Residual periodograms for choosing regularization parameters for ill-posed problems. Inverse Probl 24(3):034005, 30 Santos R, De Pierro A (2003) A cheaper way to compute generalized cross-validation as a stopping rule for linear stationary iterative methods. J Comput Graph Stat 12(2):417–433 Scherzer O, Engl H, Kunisch K (1993) Optimal a posteriori parameter choice for Tikhonov regularization for solving nonlinear ill-posed problems. SIAM J Numer Anal 30(6):1796–1838 Spokoiny V, Vial C (2009) Parameter tuning in pointwise adaptation using a propagation approach. Ann Stat 37:2783–2807 Tarantola A (1987) Inverse problem theory: methods for data fitting and model parameter estimation. Elsevier, Amsterdam Tautenhahn U, Hämarik U (1999) The use of monotonicity for choosing the regularization parameter in ill-posed problems. Inverse Probl 15(6):1487–1505 Thompson AM, Kay JW, Titterington DM (1989) A cautionary note about crossvalidatory choice. J Stat Comput Simul 33:199–216 Thompson AM, Brown JC, Kay JW, Titterington DM (1991) A study of methods for choosing the smoothing parameter in image restoration by regularization. IEEE Trans Pattern Anal Machine Intell 13:3326–3339 Tikhonov A, Arsenin V (1977) Solutions of ill-posed problems. Wiley, New York Tikhonov A, Glasko V (1965) Use of the regularization method in non-linear problems. 
USSR Comput Math Math Phys 5(3):93–107 Tsybakov A (2000) On the best rate of adaptive estimation in some inverse problems. C R Acad Sci Paris Sér I Math 330(9):835–840 Vio R, Ma P, Zhong W, Nagy J, Tenorio L, Wamsteker W (2004) Estimation of regularization parameters in multiple-image deblurring. Astron Astrophys 423:1179–1186 Vogel CR (1986) Optimal choice of a truncation level for the truncated SVD solution of linear first kind integral equations when data are noisy. SIAM J Numer Anal 23(1):109–117 Vogel CR (1996) Non-convergence of the L-curve regularization parameter selection method. Inverse Probl 12(4):535–547 Vogel CR (2002) Computational methods for inverse problems. SIAM, Philadelphia Wahba G (1977) Practical approximate solutions to linear operator equations when the data are noisy. SIAM J Numer Anal 14(4):651–667 Wahba G (1985) A comparison of GCV and GML for choosing the smoothing parameter in the generalized spline smoothing problem. Ann Stat 13:1378–1402 Wahba G (1990) Spline models for observational data. SIAM, Philadelphia Wahba G, Wang YH (1990) When is the optimal regularization parameter insensitive to the choice of the loss function? Commun Stat Theory Methods 19(5):1685–1700 Wang Y (1998) Smoothing spline models with correlated random errors. J Am Stat Assoc 93: 341–348 Wecker WE, Ansley CF (1983) The signal extraction approach to nonlinear regression and spline smoothing. J Am Stat Assoc 78(381):81–89 Yagola AG, Leonov AS, Titarenko VN (2002) Data errors and an error estimation for ill-posed problems. Inverse Probl Eng 10(2):117–129
Quantitative Remote Sensing Inversion in Earth Science: Theory and Numerical Treatment
Yanfei Wang
Contents
1 Introduction
2 Typical Inverse Problems in Earth Science
  2.1 Land Surface Parameter Retrieval Problem
  2.2 Backscatter Cross-Section Inversion with Lidar
  2.3 Aerosol Inverse Problems
3 Regularization
  3.1 What Causes Ill-Posedness
  3.2 Imposing a Priori Constraints on the Solution
  3.3 Tikhonov/Phillips-Twomey’s Regularization
  3.4 Direct Regularization
  3.5 Statistical Regularization
4 Optimization
  4.1 Sparse/Nonsmooth Inversion in l1 Space
  4.2 Optimization Methods for l2 Minimization Model
5 Practical Applications
  5.1 Kernel-Based BRDF Model Inversion
  5.2 Inversion of Airborne Lidar Remote Sensing
  5.3 Particle Size Distribution Function Retrieval
6 Conclusion
References
Y. Wang
Key Laboratory of Petroleum Resources Research, Institute of Geology and Geophysics, Chinese Academy of Sciences, Beijing, People’s Republic of China
e-mail: [email protected]
© Springer-Verlag Berlin Heidelberg 2015
W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_26
Abstract
Quantitative remote sensing is an appropriate way to estimate structural parameters and spectral component signatures of Earth surface cover types. The real physical system that couples the atmosphere, water, and the land surface is very complicated and is a continuous process, so describing it can require a comprehensive set of parameters. Any practical physical model can therefore only be approximated by a mathematical model that includes a limited number of the most important parameters, namely those that capture the major variation of the real system. The pivotal problem for quantitative remote sensing is inversion. Inverse problems are typically ill-posed. The ill-posed nature is characterized by the following: (C1) the solution may not exist, (C2) the dimension of the solution space may be infinite, and (C3) the solution is not continuous with respect to variations of the observed signals. These issues exist for nearly all inverse problems in geoscience and quantitative remote sensing. For example, when the observation system is band-limited or the sampling is poor, i.e., there are too few observations or the directions are poorly located, the inversion process is underdetermined, which leads to a large condition number of the normalized system and significant noise propagation. Hence (C2) and (C3) are the main difficulties for quantitative remote sensing inversion. This chapter addresses the theory and methods from the viewpoint that quantitative remote sensing inverse problems can be represented by kernel-based operator equations and solved by coupling regularization and optimization methods.
1 Introduction
Both modeling and model-based inversion are important for quantitative remote sensing. Here, modeling mainly refers to data modeling, which is a method used to define and analyze data requirements; model-based inversion mainly refers to using physical or empirically physical models to infer unknown parameters of interest. Hundreds of models related to atmosphere, vegetation, and radiation have been established during past decades. Model-based inversion in the geophysical (atmospheric) sciences is well understood. However, model-based inverse problems for the Earth surface have received much attention from scientists only in recent years. Compared to modeling, model-based inversion is still in the stage of exploration (Wang et al. 2009c). This is because intrinsic difficulties exist in the application of a priori information, inverse strategy, and inverse algorithm. The appearance of hyperspectral and multiangular remote sensors enhanced the means of exploration and provided more spectral and spatial information than before. However, how to utilize this information to solve the problems faced in quantitative remote sensing, so that remote sensing truly becomes quantitative, is still an arduous and urgent task for remote sensing scientists. Remote sensing inversion for different scientific problems in different branches has received more and more attention in recent years. In a series of international study
Fig. 1 Remote observing the Earth (a); geometry and parameters for laser scanner (b)
projects, such as the International Geosphere-Biosphere Programme (IGBP), the World Climate Research Programme (WCRP), and NASA’s Earth Observing System (EOS), remote sensing inversion has become a focal point of study. Model-based remote sensing inversions are usually optimization problems with different constraints. Therefore, incorporating methods developed in the operations research field into remote sensing inversion is very much needed. In quantitative remote sensing, the real physical system that couples the atmosphere and the land surface is very complicated (see Fig. 1a) and is a continuous process, so describing it can require a comprehensive set of parameters. Any practical physical model can therefore only be approximated by a model that includes a limited number of the most important parameters, those that capture the major variation of the real system. Generally speaking, a discrete forward model to describe such a system is of the form

$$y = h(x, S), \tag{1}$$
where y is a single measurement; x is a vector of controllable measurement conditions such as wave band, viewing direction, time, Sun position, polarization, and so forth; S is a vector of state parameters of the system approximation; and h is a function that relates x with S, which is generally nonlinear and continuous. With the ability of satellite sensors to acquire multiple bands, multiple viewing directions, and so on, while keeping S essentially the same, we obtain the following nonhomogeneous equations

$$y = h(x, S) + n, \tag{2}$$
where y is a vector in $\mathbb{R}^M$, which is an M-dimensional measurement space with M values corresponding to M different measurement conditions, and $n \in \mathbb{R}^M$ is the vector of random noise with the same length M. Assume that there are m undetermined parameters to be recovered. Clearly, if M = m, (2) is a determined system, so it is not difficult to develop suitable algorithms to solve it. If more observations can be collected than there are parameters in the model (Verstraete et al. 1996), i.e., M > m, the system (2) is overdetermined. In this situation, a solution in the traditional sense does not exist. We must define the solution in some other sense, for example, the least squares error (LSE) solution. However, as Li et al. (1998) pointed out, “for physical models with about ten parameters (single band), it is questionable whether remote sensing inversion can be an over determined one in the foreseeable future.” Therefore, the inversion problems in geosciences seem to be always underdetermined in some sense. Nevertheless, in some cases the underdetermined system can be converted to an overdetermined one by utilizing multiangular remote sensing data or by accumulating a priori knowledge (Li et al. 2001). The methods developed in the literature for quantitative remote sensing inversion are mainly statistical methods with several variations of Bayesian inference. In this chapter, using kernel expressions, we analyze the solution theory and methods for quantitative remote sensing inverse problems from an algebraic point of view. The kernels mentioned in this chapter mainly refer to integral kernel operators (characterized by integral kernel functions) or discrete linear operators (characterized by finite rank matrices). They are closely related to the kernels of linear functional analysis, Hilbert space theory, and spectral theory.
In particular, we present the regularizing retrieval of parameters with an a posteriori choice of regularization parameters, several cases of choosing scale/weighting matrices for the unknowns, numerically truncated singular value decomposition (NTSVD), nonsmooth inversion in $l_p$ space, and advanced optimization techniques. These methods, as far as we know, are novel to the literature in Earth science. The outline of this chapter is as follows. In Sect. 2, we list three typical kernel-based remote sensing inverse problems. The first is the linear kernel-based bidirectional reflectance distribution function (BRDF) model inverse problem, which is of great importance for land surface parameter retrieval; the second is the backscattering problem for Lidar sensing; and the third is the retrieval of aerosol particle size distributions from optical transmission or scattering measurements, which is a long-standing problem and still an important topic today. In Sect. 3, the regularization theory and solution techniques for ill-posed quantitative remote sensing inverse problems are described. Section 3.1 introduces the concepts of well-posed and ill-posed problems; Sect. 3.2 discusses constrained optimization; Sect. 3.3 extends Tikhonov regularization; Sect. 3.4 discusses direct regularization methods for the equality-constrained problem; and Sect. 3.5 introduces the regularization scheme formulated in Bayesian statistical inference. In Sect. 4, the optimization theory and solution methods are discussed for finding an optimized solution of a minimization model. Section 4.1 covers sparse and nonsmooth inversion in $l_1$ space; Sect. 4.2 introduces Newton-type and gradient-type methods. In Sect. 5.1, the detailed regularizing solution methods for retrieval of ill-posed land surface parameters are discussed. In Sect. 5.2, the results for retrieval of backscatter cross-sections by Tikhonov regularization are displayed. In Sect. 5.3, the regularization and optimization methods for recovering aerosol particle size distribution functions are presented. Finally, in Sect. 6, some concluding remarks are given.
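The distinction drawn above between determined, overdetermined, and underdetermined versions of system (2) can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the forward model h is linear, so that (2) reads y = Kx + n with a matrix K; the function name `solve_linear_inverse`, the random matrices, and the noise level are hypothetical and are not taken from the chapter. NumPy's `lstsq` returns the least squares error solution when M > m and the minimum-norm solution when M < m.

```python
import numpy as np

def solve_linear_inverse(K, y):
    """Least-squares solution of K x = y (minimum-norm if underdetermined).

    Also returns the numerical rank and the ratio of the largest to the
    smallest singular value of K as a rough indicator of noise amplification.
    """
    x, _, rank, sing_vals = np.linalg.lstsq(K, y, rcond=None)
    cond = sing_vals[0] / sing_vals[-1] if sing_vals[-1] > 0 else np.inf
    return x, rank, cond

rng = np.random.default_rng(0)
m = 3                                   # number of unknown parameters
x_true = np.array([0.3, 0.1, 0.05])     # hypothetical "true" parameters

# Overdetermined case (M = 8 > m): LSE solution from noisy observations.
K_over = rng.normal(size=(8, m))
y_over = K_over @ x_true + 1e-3 * rng.normal(size=8)
x_over, rank_over, cond_over = solve_linear_inverse(K_over, y_over)

# Underdetermined case (M = 2 < m): infinitely many exact solutions exist;
# lstsq picks the one with the smallest Euclidean norm.
K_under = K_over[:2]
y_under = K_under @ x_true
x_under, rank_under, cond_under = solve_linear_inverse(K_under, y_under)

print("overdetermined estimate:", x_over, "cond:", cond_over)
print("underdetermined (min-norm) estimate:", x_under)
```

In the underdetermined case the minimum-norm solution reproduces the data exactly but generally differs from the true parameters, which is one way to see why additional a priori information is needed.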
2 Typical Inverse Problems in Earth Science
Many inverse problems in geophysics are kernel-based, e.g., problems in seismic exploration and gravimetry. I do not introduce these solid Earth problems in the present chapter; instead, I mainly focus on Earth surface problems. Kernel methods can increase the accuracy of remote-sensing data processing, including specific land-cover identification, biophysical parameter estimation, and feature extraction (Camps-Valls 2008; Wang et al. 2009c). I introduce three typical kernel-based inverse problems in geoscience: one is an atmospheric problem, and the other two are Earth surface problems.
2.1 Land Surface Parameter Retrieval Problem
As is well known, the anisotropy of the land surface can be best described by the BRDF. With the progress of multiangular remote sensing, it seems that BRDF models can be inverted to estimate structural parameters and spectral component signatures of Earth surface cover types (see Roujean et al. 1992; Strahler et al. 1994). The state of the art in BRDF modeling is the use of linear kernel-driven models, mathematically described as a linear combination of the isotropic kernel, the volume scattering kernel, and the geometric optics kernel. Information extraction on the terrestrial biosphere and other problems for retrieval of land surface albedos from satellite remote sensing have been considered by many authors in recent years; see, for instance, the survey papers on kernel-based BRDF models by Pokrovsky and Roujean (2002, 2003), Pokrovsky et al. (2003), and references therein. The computational stability is characterized by the algebraic operator spectrum of the kernel matrix and the observation errors. Therefore, the retrieval of the model coefficients is of great importance for computation of the land surface albedos. The linear kernel-based BRDF model can be described as follows (Roujean et al. 1992):

$$f_{iso} + k_{vol}(t_i, t_v, \phi) f_{vol} + k_{geo}(t_i, t_v, \phi) f_{geo} = r(t_i, t_v, \phi), \tag{3}$$
where $r$ is the bidirectional reflectance; $k_{vol}$ and $k_{geo}$ are the so-called kernels, that is, known functions of the illumination and viewing geometry that describe volume and geometric scattering, respectively; $t_i$ and $t_v$ are the zenith angles of the solar direction and the view direction, respectively; $\phi$ is the relative azimuth of the sun and view directions; and $f_{iso}$, $f_{vol}$, and $f_{geo}$ are three unknown parameters to be adjusted to fit the observations. Theoretically, $f_{iso}$, $f_{vol}$, and $f_{geo}$ are closely related to the biomass, such as leaf area index (LAI), Lambertian reflectance, sunlit crown reflectance, and viewing and solar angles. The vital task then is to retrieve appropriate values of the three parameters. Generally speaking, the BRDF model includes kernels of many types. However, it was demonstrated that the combination of the RossThick ($k_{vol}$) and LiSparse ($k_{geo}$) kernels had the best overall ability to fit BRDF measurements and to extrapolate BRDF and albedo (see, e.g., Wanner et al. 1995; Privette et al. 1997; Li et al. 1999). A suitable expression for the RossThick kernel $k_{vol}$ was derived by Roujean et al. (1992). It is reported that the LiTransit kernel $k_{Transit}$, used instead of the kernel $k_{geo}$, is more robust and stable than the non-reciprocal LiSparse kernel and the reciprocal LiSparse kernel $k_{sparse}$ (LiSparseR), where the LiTransit kernel and the LiSparse kernel are related by
$$ k_{Transit} = \begin{cases} k_{sparse}, & B \le 2, \\ \dfrac{2}{B}\, k_{sparse}, & B > 2, \end{cases} $$
and $B$ is given by $B := B(t_i, t_v, \phi) = -O(t_i, t_v, \phi) + \sec t_i' + \sec t_v'$ in Li et al. (2000). A more detailed explanation of $O$ and $t'$ in the definition of $k_{Transit}$ can be found in Wanner et al. (1995). To use the combined linear kernel model, a key issue is to numerically solve the inverse model in a stable way. However, this is difficult in practical applications due to the ill-posed nature of the inverse problem. So far, statistical methods and algebraic methods have been developed for solving this inverse problem, e.g., Pokrovsky and Roujean (2002, 2003) and Wang et al. (2007a, 2008). We will describe these methods and introduce recent advances in the following paragraphs.
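To make the structure of the linear kernel model (3) concrete, the following minimal sketch stacks the three kernels into a design matrix so that each observation contributes one row of a linear system. The kernel values are hypothetical placeholders rather than evaluated RossThick or LiSparse kernels, and the function name `brdf_design_matrix` is an assumption introduced only for illustration.

```python
import numpy as np

def brdf_design_matrix(k_vol, k_geo):
    """Stack the kernels of Eq. (3) into K so that K @ [f_iso, f_vol, f_geo] = r."""
    k_vol = np.asarray(k_vol, dtype=float)
    k_geo = np.asarray(k_geo, dtype=float)
    # The isotropic kernel is identically one, so its column is a column of ones.
    return np.column_stack([np.ones_like(k_vol), k_vol, k_geo])

# Hypothetical kernel values for four viewing geometries (illustrative only).
k_vol = np.array([-0.02, 0.05, 0.12, 0.20])
k_geo = np.array([-1.10, -0.90, -1.40, -1.20])
x_true = np.array([0.30, 0.10, 0.05])          # [f_iso, f_vol, f_geo]

K = brdf_design_matrix(k_vol, k_geo)
r = K @ x_true                                 # forward model: reflectances
# Noise-free data with enough angular sampling: plain least squares suffices.
x_ls, *_ = np.linalg.lstsq(K, r, rcond=None)
```

With few or nearly collinear observation geometries, the columns of `K` become almost linearly dependent and the plain least squares step stops being reliable, which is the situation the regularization methods of this chapter address.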
2.2
Backscatter Cross-Section Inversion with Lidar
Airborne laser scanning (ALS) is an active remote sensing technique which is also often referred to as Lidar or laser radar. Due to the increasing availability of sensors, ALS has been receiving increasing attention in recent years (e.g., see Wagner et al. 2006). In ALS, a laser emits short infrared pulses toward the Earth surface and a photodiode records the backscattered echo. With each scan, measurements are taken of the round-trip time of the laser pulse, the received echo power, and the beam angle in the locator coordinate system. The round-trip time of the laser pulse allows calculating the range (distance) between the laser scanner and the object that generated the backscattered echo. Thereby, information about the geometric structure of the Earth surface is obtained. The received power provides information about the scattering properties of the targets, which can be exploited for object classification and for modeling of the scattering properties. The latest generation of ALS systems does not only record a discrete number of echoes but also digitizes the whole waveform of the reference pulse and the backscattered echoes. In this way, besides the range, further echo parameters can be determined. The retrieval of the backscatter cross-section is of great interest in full-waveform ALS. Since it is calculated by deconvolution, its determination is an ill-posed problem in a general sense. ALS utilizes a measurement principle that is strongly related to radar remote sensing (see Fig. 1b). The fundamental relation to explain the signal strength in both techniques is the radar equation (Wagner et al. 2006):
$$ P_r(t) = \frac{D_r^2}{4\pi R^4 \beta_t^2}\, P_t\!\left(t - \frac{2R}{v_g}\right) \sigma, \tag{4} $$
where $t$ is the time, $R$ is the range, $D_r$ is the aperture diameter of the receiver optics, $\beta_t$ is the transmitter beam width, $P_t$ is the transmitted power of the laser, and $\sigma$ denotes the scattering cross-section. The time delay is equal to $t' = 2R/v_g$, where $v_g$ is the group velocity of the laser pulse in the atmosphere. Taking the occurrence of multiple scatterers into account and regarding the impulse response $\Gamma(t)$ of the system receiver, we get (Wagner et al. 2006)
$$ P_r(t) = \sum_{i=1}^{N} \frac{D_r^2}{4\pi R^4 \beta_t^2}\, P_t(t) * \sigma_i'(t) * \Gamma(t), \tag{5} $$
where $*$ denotes the convolution operator. Since convolution is commutative, we can set $P_t(t) * \sigma_i'(t) * \Gamma(t) = P_t(t) * \Gamma(t) * \sigma_i'(t) = S(t) * \sigma_i'(t)$, i.e., it is possible to combine both the transmitter and the receiver characteristics into a single term $S(t)$. This term is referred to as the system waveform. Thus, we are able to write our problem in the form (Wang et al. 2009a)
$$ h(t) = \sum_{i=1}^{N} (f * g)(t), \tag{6} $$
where $h$ is the incoming signal recorded by the receiver, $f$ denotes a mapping which specifies the kernel function or point spread function, and $g$ is the unknown cross-section. The problem is how to deconvolve the convolution equation (6) to get an approximation to the actual cross-section.
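A discrete version of the convolution model (6) can be written as a matrix-vector product, which is the form the later regularization methods operate on. The sketch below is an illustration under stated assumptions: the Gaussian-shaped system waveform, the two-spike cross-section, and the helper name `convolution_matrix` are all hypothetical and not taken from the cited papers.

```python
import numpy as np

def convolution_matrix(f, n):
    """Return F such that F @ g equals np.convolve(f, g) for len(g) == n."""
    f = np.asarray(f, dtype=float)
    m = len(f)
    F = np.zeros((m + n - 1, n))
    for j in range(n):
        F[j:j + m, j] = f          # column j holds f shifted down by j samples
    return F

# Hypothetical Gaussian-shaped system waveform and a two-target cross-section.
t = np.arange(-10, 11)
f = np.exp(-0.5 * (t / 3.0) ** 2)
g = np.zeros(40)
g[12], g[20] = 1.0, 0.6

F = convolution_matrix(f, len(g))
h = F @ g                          # recorded waveform, discrete form of Eq. (6)
cond_F = np.linalg.cond(F)         # a large value signals an ill-conditioned deconvolution
```

Writing the convolution as `F @ g` makes the deconvolution a linear system in `g`, so the conditioning of `F` directly indicates how strongly noise in `h` can be amplified.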
2.3
Aerosol Inverse Problems
It is well known that the characteristics of the aerosol particle size, which can be represented mathematically as a size distribution function, say $n(r)$, play an important role in climate modeling due to their uncertainty (Houghton et al. 1966). So, the determination of the particle size distribution function becomes a basic task in aerosol research (see, e.g., Davies 1974; Mccartney 1976; Twomey 1977; Bohren and Huffman 1983; Bockmann 2001; Bockmann and Kirsche 2006). Since the relationship between the size of atmospheric aerosol particles and the wavelength dependence of the extinction coefficient was first suggested by Ångström (1929), the size distribution has been retrieved from extinction measurements. For a sun-photometer, the attenuation of the aerosols can be written as the integral equation of the first kind
$$ \tau_{aero}(\lambda) = \int_0^\infty \pi r^2 Q_{ext}(r, \lambda, \eta)\, n(r)\, dr + \varrho(\lambda), \tag{7} $$
where $r$ is the particle radius; $n(r)$ is the columnar aerosol size distribution (i.e., the number of particles per unit area per unit radius interval in a vertical column through the atmosphere); $\eta$ is the complex refractive index of the aerosol particles; $\lambda$ is the wavelength; $\varrho(\lambda)$ is the error/noise; and $Q_{ext}(r, \lambda, \eta)$ is the extinction efficiency factor from Mie theory. Since the aerosol optical thickness (AOT) can be obtained from measurements of the solar flux density with sun-photometers, one can retrieve the size distribution by inverting AOT measurements through the above equation. This type of method is called extinction spectrometry, which is not only the earliest method applying remote sensing to determine atmospheric aerosol size characteristics but also the most mature method thus far. A common feature of all particle size distribution measurement systems is that the relation between noiseless observations and the size distribution function can be expressed as a first kind Fredholm integral equation; see, e.g., Nguyen and Cox (1989), Voutilainen and Kaipio (2000), Wang et al. (2006a), Wang (2007, 2008), and Wang and Yang (2008). For the aerosol attenuation problem (7), let us rewrite (7) in the form of the abstract operator equation
$$ K: X \to Y, \qquad (Kn)(\lambda) + \varrho(\lambda) = \int_0^\infty k(r, \lambda, \eta)\, n(r)\, dr + \varrho(\lambda) = o(\lambda) + \varrho(\lambda) = d(\lambda), \tag{8} $$
where $k(r, \lambda, \eta) = \pi r^2 Q_{ext}(r, \lambda, \eta)$; $X$ denotes the function space of aerosol size distributions; and $Y$ denotes the observation space. Both $X$ and $Y$ are considered to be separable Hilbert spaces. Note that $\tau_{aero}$ in Eq. 7 is the measured term, so it inevitably contains noise or errors. Hence, $d(\lambda)$ is actually a perturbed right-hand side. In operator notation, Eq. 8 can be written as
$$ Kn + \varrho = o + \varrho = d. \tag{9} $$
3
Regularization
3.1
What Causes Ill-Posedness
From this section until the end of the chapter, unless otherwise specified, we will denote the operator equation as
$$ K(x) = y, \tag{10} $$
which is an appropriate expression for an observing system, with $K$ the response function (linear or nonlinear), $x$ the unknown input, and $y$ the observed data. In particular, if $K$ is a linear mapping, we will denote the response system as
$$ Kx = y, \tag{11} $$
which is clearly a special case of (10). We will use $K$ sometimes as an operator in infinite-dimensional spaces and sometimes as a matrix; the meaning should be clear from the context. The problem (10) is said to be properly posed or well-posed if it has the following three properties:
(C1) There exists a solution of the problem, i.e., existence;
(C2) There is at most one solution of the problem, i.e., uniqueness;
(C3) The solution depends continuously on the variations of the right-hand side (data), i.e., stability.
The condition (C1) can be easily fulfilled if we enlarge the solution space of the problem (10). The condition (C2) is seldom satisfied for many indirect measurement problems. This means that more than one solution may be found for the problem (10) and that information about the model is missing. In this case, a priori knowledge about the solution must be incorporated and built into the model. The requirement of stability is the most important one. If the problem (10) lacks the property of stability, then the computed solution has nothing to do with the true solution, since the practically computed solution is contaminated by unavoidable errors. Therefore, there is no way to overcome this difficulty unless additional information about the solution is available. Again, a priori knowledge about the solution should be involved. If problem (10) is well-posed, then $K$ has a well-defined, continuous inverse operator $K^{-1}$. In particular, $K^{-1}(K(x)) = x$ for any $x \in X$ and $\mathrm{Range}(K) = Y$. In this case, both the algebraic nature of the spaces and the topologies of the spaces are ready to be employed. The particle size distribution model (9) is a linear model in infinite-dimensional spaces. The operator $K$ is compact. The ill-posedness is self-evident because at least one of the three conditions for well-posedness is violated. Note that (3) is a linear model in finite-dimensional spaces; therefore it is easy to rewrite it as a finite rank operator equation
$$ Kx = y, \tag{12} $$
by setting $x = [f_{iso}, f_{vol}, f_{geo}]^T$ and $y = [y_j]$ with the entries $y_j = r_j(t_i, t_v, \phi)$, where $y$ is the measurement data. The inverse problem is how to recover the model parameters $x$ given the limited measurement data $y$. For Lidar backscatter cross-section inversion, one needs to solve a deconvolution problem. The ill-posedness is due to the ill-conditioning of the spectrum of the operator and noisy data. Numerically, the discrete ill-posedness of the above examples arises because their operators may be inaccurate (they can only be calculated approximately), their models are usually underdetermined if there are too few observations or a poor directional range, or the observations are highly linearly dependent and noisy. For example, a single angular observation may lead to an underdetermined system that has infinitely many solutions (the null space of the kernel contains nonzero vectors) or no solution (the rank of the coefficient matrix is not equal to that of the augmented matrix). In practice, random uncertainty in the sampled reflectances translates into uncertainty in the BRDF and albedo. We note that noise inflation depends on the sampling geometry alone. For example, for MODIS and MISR sampling, the noise inflation factors vary with latitude and time of year; but for kernel-based models, they do not depend on wavelength or the type of BRDF viewed. Therefore, the random noise in the observation (BRDF) and the small singular values of $K$ control the error propagation.
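The role of small singular values in error propagation can be illustrated numerically. The sketch below is a toy experiment under stated assumptions: the orthogonal factors, the noise level, and the two sets of singular values are hypothetical choices, not data from the chapter. It builds $K = U\,\mathrm{diag}(s)\,V^T$ and compares the error of the unregularized solution for a well-conditioned and an ill-conditioned spectrum with identical noise.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
U, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal factors
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
x_true = np.ones(n)
noise = 1e-3 * rng.standard_normal(n)

def naive_inverse_error(singular_values):
    """Error of the unregularized solution for K = U diag(s) V^T and noisy data."""
    s = np.asarray(singular_values, dtype=float)
    K = U @ np.diag(s) @ V.T
    y_n = K @ x_true + noise
    x_naive = V @ ((U.T @ y_n) / s)    # each noise component is divided by s_i
    return np.linalg.norm(x_naive - x_true)

err_well = naive_inverse_error(np.ones(n))
err_ill = naive_inverse_error([1.0, 1e-1, 1e-2, 1e-4, 1e-6])
```

For the flat spectrum the error equals the norm of the noise, whereas for the decaying spectrum every noise component is magnified by $1/s_i$, so the same data perturbation produces a far larger solution error.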
3.2
Imposing a Priori Constraints on the Solution
For effective inversion of the ill-posed kernel-driven model, we have to impose an a priori constraint on the parameters of interest. This leads to solving a constrained LSE problem
$$ \min J(x), \quad \text{s.t.} \quad Kx = y, \quad \Delta_1 \le c(x) \le \Delta_2, \tag{13} $$
where $J(x)$ denotes an objective functional, which is a function of $x$; $c(x)$ is the constraint on the solution $x$; and $\Delta_1$ and $\Delta_2$ are two constants which specify the bounds of $c(x)$. Usually, $J(x)$ is chosen as a norm of $x$ with a suitable scale. If the parameter $x$ comes from a smooth function, then $J(x)$ can be chosen as a smooth function; otherwise, $J(x)$ can be nonsmooth. The constraint $c(x)$ can be smooth (e.g., a Sobolev stabilizer) or nonsmooth (e.g., a total variation or $l_q$ norm ($q \ne 2$) based stabilizer). A generically used constraint is smoothness. It assumes that physical properties in a neighborhood of space or in an interval of time present some coherence and generally do not change abruptly. Practically, we can always find regularities of a physical phenomenon with respect to certain properties over a short period of time (Wang et al. 2007a, 2008). The smoothness prior has been one of the most popular a priori assumptions in applications. The general framework is the so-called regularization, which will be explained in the next subsection.
3.3
Tikhonov/Phillips-Twomey’s Regularization
Most inverse problems in real environments are ill-posed. Regularization methods are widely used to solve such problems. The complete theory of regularization was developed by Tikhonov and his colleagues (Tikhonov and Arsenin 1977). For the discrete model (12), we suppose $y$ is the true right-hand side and denote by $y_n$ the noisy measurements, which represent the bidirectional reflectance. The Tikhonov regularization method solves the regularized minimization problem
$$ J^\alpha(x) := \|Kx - y_n\|_2^2 + \alpha \|D^{1/2}x\|_2^2 \to \min \tag{14} $$
instead of solving
$$ J(x) = \|Kx - y_n\|_2^2 \to \min. \tag{15} $$
In (14), $\alpha$ is the regularization parameter and $D$ is a positive (semi-)definite operator. By a variational process, the minimizer of (14) satisfies
$$ K^TKx + \alpha Dx = K^Ty_n. \tag{16} $$
The operator $D$ is a scale matrix which imposes a smoothness constraint on the solution $x$. The scale operator $D$ and the regularization parameter $\alpha$ can be considered as some kind of a priori information, which will be discussed next. Phillips-Twomey's regularization is based on solving the problem (Phillips 1962; Twomey 1975)
$$ \min_x Q(x), \quad \text{s.t.} \quad \|Kx - y_n\| = \Delta, \tag{17} $$
where $Q(x) = (Dx, x)$, $D$ is a preassigned scale matrix, and $\Delta > 0$. It is clear that Phillips-Twomey's regularization is similar to Tikhonov's regularization and can be written in a consistent form.
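The regularized normal equations (16) translate directly into a small linear solve. The following is a minimal sketch, assuming dense NumPy arrays; the function name `tikhonov_solve` and the default choice of the identity for the scale matrix are assumptions made for illustration.

```python
import numpy as np

def tikhonov_solve(K, y_n, alpha, D=None):
    """Solve (K^T K + alpha * D) x = K^T y_n, the normal equations of Eq. (16)."""
    K = np.asarray(K, dtype=float)
    y_n = np.asarray(y_n, dtype=float)
    n = K.shape[1]
    # Identity scale matrix unless the caller supplies another D.
    D = np.eye(n) if D is None else np.asarray(D, dtype=float)
    return np.linalg.solve(K.T @ K + alpha * D, K.T @ y_n)

# Toy check: with K = I and D = I the solution is y_n / (1 + alpha).
x_demo = tikhonov_solve(np.eye(2), [2.0, 4.0], alpha=1.0)
```

Solving the system rather than forming an explicit inverse keeps the sketch numerically reasonable for small problems; for larger ones a factorization that can be reused across right-hand sides is usually preferable.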
Choices of the Scale Operator D
To regularize the ill-posed problem discussed in Sect. 3.1, the choice of the scale operator $D$ has a great impact on the performance of the regularization. Note that the matrix $D$ plays a role in imposing a smoothness constraint on the parameters and in improving the condition of the spectrum of the operator $K^TK$. Therefore, it should be positive definite or at least positive semi-definite. One may readily see that the identity may be a choice. However, this choice does not fully employ the assumption about the continuity of the parameters. In Wang et al. (2007a), we assume that the operator equation (12) is the discretized version of a continuous physical model
$$ K(x(\tau)) = y(\tau) \tag{18} $$
with $K$ the linear/nonlinear operator, $x(\tau)$ the complete parameters describing the land surfaces, and $y$ the observation. Most of the kernel model methods reported in the literature can be formulated in this way. Hence, instead of establishing regularization for the operator equation (12) in Euclidean space, it is more convenient to perform the regularization on the operator equation (18) in an abstract space. So from a priori considerations we suppose that the parameter $x$ is a smooth function, in the sense that $x$ is continuous on $[a, b]$, is differentiable almost everywhere, and its derivative is square-integrable on $[a, b]$. By Sobolev's imbedding theorem (see, e.g., Tikhonov and Arsenin 1977; Xiao et al. 2003), the continuous differentiable function $x$ in the space $W^{1,2}$ imbeds into the integrable continuous function space $L_2$ automatically. The inner product of two functions $x(\tau)$ and $y(\tau)$ in the space $W^{1,2}$ is defined by
$$ (x(\tau), y(\tau))_{W^{1,2}} := \int_\Omega \left( x(\tau)\,y(\tau) + \sum_{i=1}^{n} \frac{\partial x}{\partial \tau_i} \frac{\partial y}{\partial \tau_i} \right) d\tau_1\, d\tau_2 \cdots d\tau_n, \tag{19} $$
where $\Omega$ is the assigned interval of definition. Now we construct a regularizing algorithm that yields an approximate solution $x^\alpha \in W^{1,2}[a, b]$ which converges, as the error level approaches zero, to the actual parameters in the norm of the space $W^{1,2}[a, b]$. Precisely, we construct the functional
$$ J^\alpha(x) = F[Kx, y] + \alpha L(x), \tag{20} $$
where $F[Kx, y] = \frac{1}{2}\|Kx - y\|_{L_2}^2$ and $L(x) = \frac{1}{2}\|x\|_{W^{1,2}}^2$. Assume that the variation of $x(\tau)$ is flat and smooth near the boundary of the integral interval $[a, b]$. In this case, the derivatives of $x$ are zero at the boundary of $[a, b]$. Let $h_r$ be the step size of the grid on $[a, b]$, which could be equidistant or adaptive. Then, after discretization of $L(x)$, $D$ is a tridiagonal matrix of the form
$$ D := D_1 = \begin{bmatrix}
1 + \frac{1}{h_r^2} & -\frac{1}{h_r^2} & 0 & \cdots & 0 \\
-\frac{1}{h_r^2} & 1 + \frac{2}{h_r^2} & -\frac{1}{h_r^2} & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & -\frac{1}{h_r^2} & 1 + \frac{2}{h_r^2} & -\frac{1}{h_r^2} \\
0 & \cdots & 0 & -\frac{1}{h_r^2} & 1 + \frac{1}{h_r^2}
\end{bmatrix}. $$
For the linear model (3), after the kernel normalization, we may consider $[a, b] = [-1, 1]$. Thus, $D$ is in the above form with $h_r = 2/(N - 1)$. There are many kinds of techniques for choosing the scale matrix $D$ appropriately. In Phillips-Twomey's formulation of regularization (see, e.g., Wang et al. 2006a), the matrix $D$ is created by the norm of the second differences, $\sum_{i=2}^{N-1} (x_{i-1} - 2x_i + x_{i+1})^2$, which leads to the following form of the matrix $D$:
$$ D := D_2 = \begin{bmatrix}
1 & -2 & 1 & 0 & \cdots & 0 & 0 & 0 & 0 \\
-2 & 5 & -4 & 1 & \cdots & 0 & 0 & 0 & 0 \\
1 & -4 & 6 & -4 & \cdots & 0 & 0 & 0 & 0 \\
0 & 1 & -4 & 6 & \cdots & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 6 & -4 & 1 & 0 \\
0 & 0 & 0 & 0 & \cdots & -4 & 6 & -4 & 1 \\
0 & 0 & 0 & 0 & \cdots & 1 & -4 & 5 & -2 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & -2 & 1
\end{bmatrix}. $$
However, the matrix $D_2$ is badly conditioned, and thus the solution minimizing the functional $J^\alpha[x]$ with $D_2$ as the smoothness constraint is observed to have some oscillations (Wang et al. 2006). Another option is the negative Laplacian (see, e.g., Wang and Yuan 2003; Wang 2007): $Lx := -\sum_{i=1}^{n} \frac{\partial^2 x}{\partial \tau_i^2}$, for which the scale matrix $D$ for the discrete form of the negative Laplacian $Lx$ is
$$ D := D_3 = \begin{bmatrix}
1 & -1 & 0 & \cdots & 0 & 0 \\
-1 & 2 & -1 & \cdots & 0 & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & -1 & 2 & -1 \\
0 & 0 & \cdots & 0 & -1 & 1
\end{bmatrix}, $$
where we assume that the discretization step length is 1. The scale matrix $D_3$ is positive semi-definite but not positive definite, and hence the minimization problem may not work efficiently for severely ill-posed inverse problems. Another option for the scale matrix $D$ is the identity, i.e., $D := D_4 = \mathrm{diag}(e)$, where $e$ is the vector with all components equal to one; however, this scale matrix is too conservative and may lead to over-regularization.
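The four scale matrices discussed above can be generated from first- and second-difference operators, which also makes their definiteness properties easy to check. The sketch below is one possible construction, assuming an equidistant grid; the function name `scale_matrices` and its argument names are hypothetical.

```python
import numpy as np

def scale_matrices(n, h=1.0):
    """Return (D1, D2, D3, D4) for n grid points and step size h."""
    L1 = np.diff(np.eye(n), axis=0)        # (n-1) x n first-difference matrix
    L2 = np.diff(np.eye(n), n=2, axis=0)   # (n-2) x n second-difference matrix
    D3 = L1.T @ L1                         # discrete negative Laplacian, unit step
    D2 = L2.T @ L2                         # penalty built from squared second differences
    D1 = np.eye(n) + D3 / h**2             # discretized W^{1,2} norm
    D4 = np.eye(n)                         # identity
    return D1, D2, D3, D4

D1, D2, D3, D4 = scale_matrices(5, h=0.5)
```

Building `D2` and `D3` as Gram matrices of difference operators guarantees that they are symmetric positive semi-definite; constant vectors lie in the null space of `D3`, which is why it is not positive definite, while `D1` adds the identity and is therefore positive definite.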
Regularization Parameter Selection Methods
As noted above, the choice of the regularization parameter $\alpha$ is important for tackling the ill-posedness. An a priori choice of the parameter allows any $0 < \alpha < 1$. However, an a priori choice does not reflect the degree of approximation and may therefore lead to either over- or under-regularization. We will use the widely used discrepancy principle (see, e.g., Tikhonov and Arsenin 1977; Tikhonov et al. 1995; Xiao et al. 2003) to find an optimal regularization parameter. In fact, the optimal parameter $\alpha^*$ is a root of the nonlinear function
$$ \Psi(\alpha) = \|Kx_\alpha - y_n\|^2 - \delta^2, \tag{21} $$
where $\delta$ is the error level specifying how closely the observation approximates the true noiseless data, and $x_\alpha$ denotes the solution of the problem in Eq. (16) corresponding to the value $\alpha$ of the regularization parameter. Noting that $\Psi(\alpha)$ is differentiable, fast algorithms for finding the optimal parameter $\alpha^*$ can be implemented. In this chapter we will use the cubic convergent algorithm developed in Wang and Xiao (2001):
$$ \alpha_{k+1} = \alpha_k - \frac{2\Psi(\alpha_k)}{\Psi'(\alpha_k) + \left(\Psi'(\alpha_k)^2 - 2\Psi(\alpha_k)\Psi''(\alpha_k)\right)^{1/2}}. \tag{22} $$
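In the above cubic convergent algorithm, the functions $\Psi'(\alpha)$ and $\Psi''(\alpha)$ have the following explicit expressions:
$$ \Psi'(\alpha) = -\alpha\beta'(\alpha), \qquad \Psi''(\alpha) = -\beta'(\alpha) - 2\alpha\left[ \left\| \frac{dx_\alpha}{d\alpha} \right\|^2 + \left( x_\alpha, \frac{d^2x_\alpha}{d\alpha^2} \right) \right], $$
where $\beta(\alpha) = \|x_\alpha\|^2$, $\beta'(\alpha) = 2\left( \frac{dx_\alpha}{d\alpha}, x_\alpha \right)$, and $x_\alpha$, $dx_\alpha/d\alpha$, and $d^2x_\alpha/d\alpha^2$ can be obtained by solving the following equations:
$$ (K^TK + \alpha D)\,x_\alpha = K^Ty_n, \tag{23} $$
$$ (K^TK + \alpha D)\,\frac{dx_\alpha}{d\alpha} = -Dx_\alpha, \tag{24} $$
$$ (K^TK + \alpha D)\,\frac{d^2x_\alpha}{d\alpha^2} = -2D\,\frac{dx_\alpha}{d\alpha}. \tag{25} $$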
To solve the linear matrix-vector equations (23)–(25), we use the Cholesky (square root) decomposition method. A remarkable characteristic of the solution of (23)–(25) is that the Cholesky decomposition of the coefficient matrix $K^TK + \alpha D$ needs to be computed only once; then the three vectors $x_\alpha$, $dx_\alpha/d\alpha$, and $d^2x_\alpha/d\alpha^2$ can be obtained cheaply. In the case of perturbation of operators, the above method can be applied similarly. Note that in such a case, the discrepancy equation becomes
$$ \tilde\Psi(\alpha) = \|\tilde Kx_\alpha - y_n\|^2 - (\delta + \tilde\delta)^2 - \left(\mu_\eta^\kappa(y_n, \tilde K)\right)^2 = 0, \tag{26} $$
where $\tilde\delta$ is the error level of $\tilde K$ approximating the true operator, $\eta = (\delta, \tilde\delta)$, $\mu_\eta^\kappa(y_n, \tilde K)$ is the incompatibility measure of the equation $Kx = y$, and $\kappa > 0$. Equation 26 is called a generalized discrepancy equation and is a one-dimensional nonlinear equation, which can be solved by Newton's method or the cubic convergent method. For more information about the generalized discrepancy, we refer to Tikhonov et al. (1995) and Wang (2007).
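The parameter search described above can be sketched in a few lines. The code below is an illustrative implementation, not the authors' code: it takes $D = I$ so that the stated expressions for $\Psi'$ and $\Psi''$ apply verbatim, reuses one Cholesky factor for the three solves (23)–(25) at each $\alpha$, and adds a guard against a negative radicand, which is an assumption of this sketch. The function name `discrepancy_alpha` and its defaults are hypothetical, and no safeguard keeps $\alpha$ positive for poor starting values.

```python
import numpy as np

def discrepancy_alpha(K, y_n, delta, alpha0=0.5, tol=1e-12, max_iter=50):
    """Find a root of Psi(alpha) = ||K x_alpha - y_n||^2 - delta^2 via Eq. (22), D = I."""
    K = np.asarray(K, dtype=float)
    y_n = np.asarray(y_n, dtype=float)
    n = K.shape[1]
    KtK, Kty = K.T @ K, K.T @ y_n
    alpha = alpha0
    for _ in range(max_iter):
        L = np.linalg.cholesky(KtK + alpha * np.eye(n))   # one factorization per alpha

        def solve(b):
            # Solve (K^T K + alpha I) z = b using the factor L and its transpose.
            return np.linalg.solve(L.T, np.linalg.solve(L, b))

        x = solve(Kty)             # Eq. (23)
        dx = solve(-x)             # Eq. (24) with D = I
        d2x = solve(-2.0 * dx)     # Eq. (25) with D = I
        psi = np.sum((K @ x - y_n) ** 2) - delta ** 2
        if abs(psi) < tol:
            break
        dbeta = 2.0 * (dx @ x)
        dpsi = -alpha * dbeta
        d2psi = -dbeta - 2.0 * alpha * (dx @ dx + x @ d2x)
        disc = max(dpsi ** 2 - 2.0 * psi * d2psi, 0.0)    # guard: keep the radicand >= 0
        alpha = alpha - 2.0 * psi / (dpsi + np.sqrt(disc))
    return alpha

# Scalar toy problem: K = 1, y_n = 1, delta = 0.5 has the analytic root alpha = 1,
# because the residual is alpha / (1 + alpha).
alpha_star = discrepancy_alpha(np.array([[1.0]]), np.array([1.0]), delta=0.5)
```

The toy problem is chosen because its root can be verified by hand; for real kernels the starting value and the error level $\delta$ strongly influence whether the iteration stays in the region where $\Psi$ is well behaved.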
3.4
Direct Regularization
Instead of Tikhonov regularization, our goal in this section is to solve an equality constrained $l_2$ problem
$$ \|x\|_2 \to \min, \quad \text{s.t.} \quad \tilde Kx + n = y_n, \tag{27} $$
where $\tilde K \in \mathbb{R}^{M \times N}$ is a perturbation of $K$ (i.e., if we regard $K$ as an accurate operator, then $\tilde K$ is an approximation to $K$ which may contain error or noise), $x \in \mathbb{R}^N$, and $n, y_n \in \mathbb{R}^M$. As mentioned already, the ill-posedness is largely due to the small singular values of the linear operator. Let us denote the singular value decomposition of $\tilde K$ as $\tilde K = U_{M \times N} \Sigma_{N \times N} V_{N \times N}^T = \sum_{i=1}^{N} \sigma_i u_i v_i^T$, where both $U = [u_i]$ and $V = [v_i]$ have orthonormal columns, i.e., $U^TU$ and $V^TV$ are both identity matrices; $\Sigma$ is a diagonal matrix whose nonzero entries consist of the singular values of $\tilde K$. The traditional LSE solution $x_{lse}$ of the constrained optimization system (27) can be expressed by the singular values and singular vectors in the form
$$ x_{lse} = \sum_{i=1}^{N} \frac{1}{\sigma_i} (u_i^T y_n)\, v_i. \tag{28} $$
If the rank of $\tilde K$ is $p \le \min\{M, N\}$, then the above solution form inevitably encounters numerical difficulties, since the denominator contains numerically infinitesimal values. Therefore, to solve the problem by the SVD, we must impose a priori information. As we have noted, Tikhonov regularization solves a variational problem by incorporating a priori information into the solution. In this section, we consider another way of incorporating a priori information into the solution. The idea is quite simple: instead of filtering the small singular values by replacing them with small positive numbers, we just truncate the summation, i.e., the terms containing small singular values are replaced by zeros. In this way, we obtain a regularized solution of the least squares problem (27) of minimal norm
$$ x_{lse}^{trunc} = \sum_{i=1}^{p} \frac{1}{\sigma_i} (u_i^T y_n)\, v_i \tag{29} $$
and $\min_x \|\tilde Kx - y_n\|_2^2 = \sum_{i=p+1,\ldots} |u_i^T y_n|^2$. We wish to examine the truncated singular value decomposition further. Note that in practice, $\tilde K$ may not be exactly rank deficient, but instead be numerically rank deficient, i.e., it has one or more small but nonzero singular values such that $p_\delta < \mathrm{rank}(\tilde K)$. Here, $p_\delta$ refers to the numerical $\delta$-rank of a matrix; see, e.g., Wang et al. (2006b) for details. It is clear from Eq. 29 that the small singular values inevitably give rise to difficulties. The regularization technique for the SVD means that some of the small singular values are truncated in the computation, and it is hence called the NTSVD. Now assume that $K$ is corrupted by the error matrix $B_\delta$. Then, we replace $K$ by a matrix $K_{\tilde p}$ that is close to $K$ and mathematically rank deficient. Our choice of $K_{\tilde p}$ is obtained by replacing the small nonzero singular values $\sigma_{\tilde p + 1}, \sigma_{\tilde p + 2}, \ldots$ with exact zeros, i.e.,
$$ K_{\tilde p} = \sum_{i=1}^{\tilde p} \sigma_i u_i v_i^T, \tag{30} $$
where $\tilde p$ is usually chosen as $p_\delta$. We call (30) the NTSVD of $K$. Now, we use (30) as the linear kernel to compute the least squares solutions. Actually, we solve the problem $\min_x \|K_{\tilde p}x - y_n\|_2$ and obtain the approximate solution $x_{lse}^{appr}$ of minimal norm
$$ x_{lse}^{appr} = K_{\tilde p}^{\dagger} y_n = \sum_{i=1}^{\tilde p} \frac{1}{\sigma_i} (u_i^T y_n)\, v_i, \tag{31} $$
where $K_{\tilde p}^{\dagger}$ denotes the Moore-Penrose generalized inverse. Let us explain in more detail the NTSVD for the underdetermined linear system. In this case, the number of independent variables is larger than the number of observations, i.e., $M < N$. Assume that the $\delta$-rank of $\tilde K$ is $\tilde p \le \min\{M, N\}$. It is easy to augment $\tilde K$ to be an $N \times N$ square matrix $\tilde K_{aug}$ by padding zeros underneath its $M$ nonzero rows. Similarly, we can augment the right-hand side vector $y_n$ with zeros. The singular value decomposition of $\tilde K$ can be rewritten as $\tilde K_{aug} = U \Sigma V^T$, where $U = [u_1\, u_2 \ldots u_N]_{N \times N}$, $V = [v_1\, v_2 \ldots v_N]_{N \times N}$, and $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_{\tilde p}, 0, \ldots, 0)$. From this decomposition, we find that there are $N - \tilde p$ theoretically zero singular values in the diagonal matrix $\Sigma$. These $N - \tilde p$ zero singular values will inevitably induce high numerical instability.
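The truncated solution (31) is straightforward to compute once the singular value decomposition is available. The sketch below is a minimal illustration; the function name `ntsvd_solve` and the use of a simple absolute threshold `tol` to decide which singular values count as numerically zero are assumptions, since the precise definition of the $\delta$-rank is given in the cited reference rather than here.

```python
import numpy as np

def ntsvd_solve(K, y_n, tol):
    """Minimal-norm least squares solution using only singular values above tol (Eq. 31)."""
    K = np.asarray(K, dtype=float)
    y_n = np.asarray(y_n, dtype=float)
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    keep = s > tol                           # retained terms of the SVD expansion
    coeffs = (U[:, keep].T @ y_n) / s[keep]  # (u_i^T y_n) / sigma_i for retained i
    return Vt[keep].T @ coeffs, int(keep.sum())

# Toy example: the third singular value is numerically zero, and the data
# carry a small perturbation in the corresponding component.
K_demo = np.diag([2.0, 1.0, 1e-12])
y_demo = np.array([2.0, 3.0, 1e-3])
x_demo, p_demo = ntsvd_solve(K_demo, y_demo, tol=1e-8)
```

In the toy example an unregularized inverse would divide the perturbation $10^{-3}$ by $10^{-12}$, whereas truncation simply sets that component of the solution to zero.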
3.5
Statistical Regularization
Bayesian statistics provides a conceptually simple process for updating uncertainty in the light of evidence. Initial beliefs about some unknown quantity are represented by a prior distribution. Information in the data is expressed by the likelihood function $L(x|y)$. The prior distribution $p(x)$ and the likelihood function are then combined to obtain the posterior distribution for the quantity of interest. The posterior distribution expresses our revised uncertainty in light of the data, in other words, an organized appraisal in consideration of previous experience. The role of Bayesian statistics is very similar to the role of regularization. Now, we establish the relationship between Bayesian estimation and regularization. A continuous random vector $x$ is said to have a Gaussian distribution if its joint probability distribution function has the form
$$ p_x(x; \mu, C) = \frac{1}{\sqrt{(2\pi)^N \det(C)}} \exp\left( -\frac{1}{2} (x - \mu)^T C^{-1} (x - \mu) \right), \tag{32} $$
where $x, \mu \in \mathbb{R}^N$, $C$ is an $N$-by-$N$ symmetric positive definite matrix, and $\det(\cdot)$ denotes the matrix determinant. The mean is given by $E(x) = \mu$ and the covariance matrix is $\mathrm{cov}(x) = C$. Suppose $y = Kx + n$ has a Gaussian distribution with mean $Kx$ and covariance $C_n$, where $C_n$ is the covariance of the observation noise and model inaccuracy. Then by (32) we obtain
$$ p(y|x) = \frac{1}{\sqrt{(2\pi)^M \det(C_n)}} \exp\left( -\frac{1}{2} (y - Kx)^T C_n^{-1} (y - Kx) \right). \tag{33} $$
From (32), the prior probability distribution is given by $p(x) = \exp\left(-\frac{1}{2} x^T C_x^{-1} x\right) / \sqrt{(2\pi)^N \det(C_x)}$. By Bayesian statistical inference and the above two equations, we obtain an a posteriori log likelihood function
$$ L(x|y) = \log p(x|y) = -\frac{1}{2} (y - Kx)^T C_n^{-1} (y - Kx) - \frac{1}{2} x^T C_x^{-1} x + \zeta, \tag{34} $$
where $\zeta$ is constant with respect to $x$. The maximum a posteriori estimate is obtained by maximizing (34) with respect to $x$:
$$ x = (K^T C_n^{-1} K + C_x^{-1})^{-1} K^T C_n^{-1} y. \tag{35} $$
The easiest way of choosing $C_n$ and $C_x$ is to let $C_n = \sigma_n^2 I_M$ and $C_x = \sigma_x^2 I_N$, and then (35) becomes
$$ x = (K^TK + \xi I_N)^{-1} K^T y, \tag{36} $$
where $\xi = \sigma_n^2 / \sigma_x^2$ is the noise-to-signal ratio. It is clear that the solution obtained by maximum a posteriori estimation has the same form as the solution of the Tikhonov regularization.
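The equivalence between the maximum a posteriori estimate and Tikhonov regularization can be checked numerically. The sketch below is an illustration under stated assumptions: the random test matrix, the two standard deviations, and the function name `map_estimate` are hypothetical. It evaluates (35) with isotropic covariances and compares the result with the Tikhonov-type formula using the noise-to-signal ratio as the regularization parameter.

```python
import numpy as np

def map_estimate(K, y, C_n, C_x):
    """Maximum a posteriori estimate of Eq. (35) for a zero-mean Gaussian prior."""
    Cn_inv_K = np.linalg.solve(C_n, K)         # C_n^{-1} K without forming the inverse
    lhs = K.T @ Cn_inv_K + np.linalg.inv(C_x)
    rhs = Cn_inv_K.T @ y                       # K^T C_n^{-1} y, using symmetry of C_n
    return np.linalg.solve(lhs, rhs)

rng = np.random.default_rng(1)
M, N = 6, 3
K = rng.standard_normal((M, N))
y = rng.standard_normal(M)
sigma_n, sigma_x = 0.1, 2.0

x_map = map_estimate(K, y, sigma_n**2 * np.eye(M), sigma_x**2 * np.eye(N))
ratio = sigma_n**2 / sigma_x**2                # noise-to-signal ratio
x_tik = np.linalg.solve(K.T @ K + ratio * np.eye(N), K.T @ y)
```

Both vectors agree to rounding error, which mirrors the algebra: multiplying the system in (35) by $\sigma_n^2$ leaves $K^TK$ plus the ratio times an $N \times N$ identity on the left-hand side.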
4
Optimization
4.1
Sparse/Nonsmooth Inversion in l1 Space
It deserves attention that ill-posedness is an intrinsic feature of inverse problems. Unless some additional information or knowledge such as monotonicity, smoothness, boundedness, or the error bound of the raw data is imposed, the difficulty can hardly be overcome. Generally speaking, the kernel-driven BRDF model is semiempirical, and the retrieved parameters $x$ are mostly considered as a kind of weight function, though $x$ is a function of leaf area index (LAI), Lambertian reflectance, sunlit crown reflectance, and viewing and solar angles. Therefore, $x$ is not necessarily positive. However, since it is a weight function, an appropriate arrangement of the components of $x$ can yield the same results. That is to say, $x$ can be "made" to be nonnegative. The remaining problem is to develop proper methods to solve the "artificial" problem. Our new interpretation of the solution $x^*$ is related to the $l_1$ norm problem
$$ \min_x \|x\|_1, \quad \text{s.t.} \quad Kx = y, \quad x \ge 0, \tag{37} $$
which automatically imposes a priori information by considering the solution in $l_1$ space. Because of the limitations of the observation system, one may readily see that the recovered land surface parameters are discrete and sparse. Therefore, if an inversion algorithm is not robust, outliers far from the true solution may occur. In this situation, the a priori constrained $l_1$ minimization may work better than the conventional regularization techniques. The model (37) can be reduced to a linear programming problem (see Ye 1997; Yuan 2001; Wang et al. 2005); hence linear programming methods can be used for solving the inverse problem. The $l_1$ norm solution method seeks a feasible solution within the feasible set $S = \{x : Kx = y,\ x \ge 0\}$. So it is actually searching for an interior point within the feasible set $S$, and hence is called the interior point method. The dual standard form of (37) is
$$ \max y^T g, \quad \text{s.t.} \quad s = e - K^T g \ge 0, \tag{38} $$
where $e$ is a vector with all components equal to 1. Therefore, the optimality conditions for $(x, g, s)$ to be a primal-dual solution triplet are that
$$ Kx = y, \quad K^T g + s = e, \quad \tilde S \tilde F e = 0, \quad x \ge 0, \quad s \ge 0, \tag{39} $$
where $\tilde S = \mathrm{diag}(s_1, s_2, \ldots, s_N)$, $\tilde F = \mathrm{diag}(x_1, x_2, \ldots, x_N)$, and $s_i$, $x_i$ are the components of the vectors $s$ and $x$, respectively. The notation $\mathrm{diag}(\cdot)$ denotes the diagonal matrix whose only nonzero components lie on the main diagonal. The interior point method generates iterates $\{x_k, g_k, s_k\}$ such that $x_k > 0$ and $s_k > 0$. As the iteration index $k$ approaches infinity, the equality-constraint violations $\|y - Kx_k\|$ and $\|K^T g_k + s_k - e\|$ and the duality gap $x_k^T s_k$ are driven to zero, yielding a limiting point that solves the primal and dual linear problems. For the implementation procedures and examples of using the algorithm, please refer to Wang et al. (2007b, 2009d). A more general regularization model was recently proposed in Wang et al. (2009b), where the authors considered a regularization model in the general form
$$ \min J[f] := \frac{1}{2} \|Kf - h_n\|_{l_p}^p + \frac{\nu}{2} \|L(f - f_0)\|_{l_q}^q, \tag{40} $$
where $p, q > 0$ are specified by the user, $\nu > 0$ is the regularization parameter, $L$ is the scale operator, and $f_0$ is an a priori solution of the original model. This formulation includes most of the developed methods. In particular, for $p = 2$ and $q = 1$ or $q = 0$, the model represents nonsmooth and sparse regularization, which is an important and active topic at present in compressive sensing for decoding in signal processing. A regularizing active set method was proposed for both quadratic programming and non-convex problems; we refer to Wang et al. (2009b) for details.
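The reduction of the $l_1$ problem (37) to a linear program, and the origin of the duality gap $x^Ts$ used by the interior point method, can be written out in a few lines. The derivation below is a sketch that uses only the symbols of (37)–(39); it is standard linear programming duality and is not quoted from the cited references.

```latex
\begin{aligned}
x \ge 0 \;&\Longrightarrow\; \|x\|_1 = \sum_{i=1}^{N} x_i = e^T x, \\
\text{primal:}\quad &\min_x \; e^T x \quad \text{s.t.} \quad Kx = y,\; x \ge 0, \\
\text{dual:}\quad &\max_g \; y^T g \quad \text{s.t.} \quad K^T g + s = e,\; s \ge 0, \\
\text{gap:}\quad & e^T x - y^T g = (K^T g + s)^T x - (Kx)^T g = s^T x \ge 0 .
\end{aligned}
```

The gap vanishes exactly when $x_i s_i = 0$ for every $i$, which is the complementarity condition $\tilde S \tilde F e = 0$ in (39).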
4.2
Optimization Methods for l2 Minimization Model
Newton-Type Methods
The conventional Tikhonov regularization method is equivalent to the constrained $l_2$ minimization problem
$$ \min_x \|x\|_2, \quad \text{s.t.} \quad Kx = y. \tag{41} $$
This reduces to solving an unconstrained optimization problem
$$ x = \operatorname{argmin}_x J^\alpha(x), \qquad J^\alpha(x) = \frac{1}{2} \|Kx - y\|_2^2 + \frac{\alpha}{2} \|x\|_2^2. \tag{42} $$
The gradient and Hessian of $J^\alpha(x)$ are given by $\mathrm{grad}_x[J^\alpha(x)] = (K^T K + \alpha I)x - K^T y$ and $\mathrm{Hess}_x[J^\alpha(x)] = K^T K + \alpha I$, respectively. Hence at the $k$-th iterative step, the gradient and Hessian of $J^\alpha(x_k)$ can be expressed as $\mathrm{grad}_k[J^\alpha]$ and $\mathrm{Hess}_k[J^\alpha]$, which are evaluated by $\mathrm{grad}_{x_k}[J^\alpha(x_k)]$ and $\mathrm{Hess}_{x_k}[J^\alpha(x_k)]$, respectively. Newton-type methods are based on the Gauss-Newton method and its variations. We only supply the algorithm for the Gauss-Newton method in this subsection. The Gauss-Newton method is an extension of the Newton method from one-dimensional space to higher dimensional space. The iteration formula reads

$$x_{k+1} = x_k - \tau_k (\mathrm{Hess}_k[J^\alpha])^{-1}\,\mathrm{grad}_k[J^\alpha], \tag{43}$$
where $\tau_k$, a damping parameter that can be determined by a line search technique, is used to control the direction $(\mathrm{Hess}_k[J^\alpha])^{-1}\mathrm{grad}_k[J^\alpha]$. One may also apply a more popular technique, the trust region technique, to restrict the direction $(\mathrm{Hess}_k[J^\alpha])^{-1}\mathrm{grad}_k[J^\alpha]$ to a reliable generalized ball in every iteration (see Wang and Yuan 2005; Wang 2007). We recall that explicitly inverting $\mathrm{Hess}_k[J^\alpha]$ should be avoided in order to reduce the amount of computation. Instead, linear algebraic decomposition methods can be applied to compute $(\mathrm{Hess}_k[J^\alpha])^{-1}\mathrm{grad}_k[J^\alpha]$. There are different variations of the Gauss-Newton method, which are based on approximations of the explicit Hessian matrix $\mathrm{Hess}_x[J^\alpha]$, e.g., DFP, BFGS, L-BFGS, and trust region methods. For these extensions to well-posed and ill-posed problems, we refer to Nocedal (1980), Dennis and Schnabel (1983), Yuan (1993, 1994), Kelley (1999), Wang and Yuan (2005), and Wang (2007). We mention briefly a globally convergent method, the trust region method. The method solves an unconstrained non-quadratic minimization problem $\min_{x \in \mathbb{R}^n} \Gamma(x)$.
For the problem (42), the trust region method requires solving a trust region subproblem

$$\min_s \Upsilon(s) := (\mathrm{grad}_x[J^\alpha], s) + \frac{1}{2}(\mathrm{Hess}_x[J^\alpha]s, s), \quad \text{s.t. } \|s\| \le \Delta, \tag{44}$$
where $\Delta > 0$ is the trust region radius. In each step, a trial step $s$ is computed and then accepted or rejected. The decision rule is based on the ratio $\rho$ between the actual reduction in the objective functional and the predicted reduction in the approximate model. The trust region iterate remains unchanged if $\rho \le 0$, where $\rho = \mathrm{Ared}(x)/\mathrm{Pred}(x)$, and $\mathrm{Ared}(x)$ and $\mathrm{Pred}(x)$ are defined by $J^\alpha(x) - J^\alpha(x + s)$ and $\Upsilon(0) - \Upsilon(s)$, respectively. For the model in (42), since it is quadratic, the ratio is always equal to 1. This means that the trial step $s$, whether it is good or not, will always be accepted. We note that the approximation accuracy is characterized by the discrepancy between the observation and the true data; therefore variations of the norm of the discrepancy may reflect the degree of approximation. Based on these considerations, we propose to accept or reject the trial step $s_k$ at the $k$-th step by the ratio

$$\rho_k = \frac{J^\alpha(x_k + s_k)}{J^\alpha(x_k)} = \frac{J^\alpha(x_{k+1})}{J^\alpha(x_k)},$$

where $J^\alpha(x_{k+1})$ and $J^\alpha(x_k)$ are the reductions in norm of the discrepancy at the $(k+1)$-th and $k$-th steps, respectively. For the convergence and regularizing properties, we refer to Wang (2007) and Wang and Ma (2009).
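To make the damped Newton iteration (43) concrete, here is a minimal Python sketch for the quadratic functional (42). It solves the Newton system through a Cholesky factorization rather than an explicit inverse, and it also evaluates the ratio $J^\alpha(x_{k+1})/J^\alpha(x_k)$. The function names, the random test matrix, and the choice of a unit damping parameter are assumptions for illustration only; this is not the trust region algorithm of Wang (2007).

```python
import numpy as np

def J_alpha(K, y, x, alpha):
    """Tikhonov functional (42): 1/2 ||Kx - y||^2 + alpha/2 ||x||^2."""
    return 0.5 * np.sum((K @ x - y) ** 2) + 0.5 * alpha * np.sum(x ** 2)

def newton_step(K, y, x_k, alpha, tau_k=1.0):
    """One damped Newton step (43) without forming the inverse Hessian."""
    hess = K.T @ K + alpha * np.eye(K.shape[1])
    grad = hess @ x_k - K.T @ y
    # Cholesky factorization hess = C C^T, followed by two solves.
    C = np.linalg.cholesky(hess)
    direction = np.linalg.solve(C.T, np.linalg.solve(C, grad))
    return x_k - tau_k * direction

# Hypothetical test problem.
rng = np.random.default_rng(0)
K = rng.standard_normal((20, 5))
y = K @ np.ones(5)
alpha = 1e-2

x0 = np.zeros(5)
x1 = newton_step(K, y, x0, alpha)
rho_0 = J_alpha(K, y, x1, alpha) / J_alpha(K, y, x0, alpha)
grad_norm = np.linalg.norm((K.T @ K + alpha * np.eye(5)) @ x1 - K.T @ y)
```

Because (42) is quadratic, a single full step with $\tau_k = 1$ already lands on the minimizer, so `grad_norm` is at rounding level and `rho_0` is below 1.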
Gradient-Type Methods
The gradient method does not need Hessian information. For the linear operator equation $Kx = y$, where $K$, $x$, and $y$ have the same meaning as before, we first recall the well-known fixed-point iteration method found in standard mathematical textbooks. The fixed-point iteration formula for solving the above linear operator equation is

$$x_{k+1} = x_k + \tau(y - Kx_k), \tag{45}$$
where $\tau \in (0, 2/\|K\|)$ and $K$ is linear, bounded, and nonnegative. One may readily see that this method is very similar to the method of successive approximations, which can be introduced in the following simple way. Consider the operator $T(x) = x + \tau(y - Kx)$, where $\tau$ is the so-called relaxation parameter. Solving the linear operator equation is equivalent to finding a fixed point of the operator $T$, i.e., solving $x = T(x)$ for $x$. Assuming that $T$ is a contraction mapping, the method of successive approximations yields the iterative scheme $x_{k+1} = T(x_k)$, i.e., the iterative formula (45). The method converges if and only if $Kx = y$ has a solution. Now we introduce a very simple gradient method, the steepest descent method, whose iteration formula reads

$$x_{k+1} = x_k + \tau_k K^T(y - Kx_k), \tag{46}$$
where $\tau_k$ is obtained by line search, i.e., $\tau_k = \arg\min_{\tau \ge 0} J(x_k - \tau\,\mathrm{grad}_k[J])$. If we restrict the stepsize $\tau_k$ to be fixed in $(0, 2/\|K^T K\|)$, then the steepest descent method reduces to the famous Landweber-Fridman iteration method. Further extensions, including nonmonotone gradient methods and truncated conjugate gradient methods with trust region techniques, together with different applications in applied science, can be found in, e.g., Brakhage (1987), Barzilai and Borwein (1988), Fletcher (2001), Wang and Yuan (2002), Wang (2007), and Wang and Ma (2007). Finally, we mention that for underdetermined ill-posed problems, regularization constraints or a priori constraints should be incorporated into the minimization model before the aforementioned gradient methods are applied. Application examples on aerosol particle size distribution function retrieval problems and the nonmonotone gradient method are included in Wang (2008).
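The two iterations in (46) can be sketched in a few lines of Python. The first function uses a fixed stepsize (Landweber-Fridman); the second uses the exact line search for the quadratic functional $J(x) = \frac{1}{2}\|Kx - y\|^2$, for which the minimizing step is $\|g\|^2/\|Kg\|^2$ with $g$ the gradient. The function names and the small test matrix are hypothetical, and no stopping rule based on the noise level is included.

```python
import numpy as np

def landweber(K, y, tau, n_iter):
    """Fixed-step iteration x_{k+1} = x_k + tau K^T (y - K x_k)."""
    x = np.zeros(K.shape[1])
    for _ in range(n_iter):
        x = x + tau * K.T @ (y - K @ x)
    return x

def steepest_descent(K, y, n_iter):
    """Steepest descent with exact line search for J(x) = 1/2 ||Kx - y||^2."""
    x = np.zeros(K.shape[1])
    for _ in range(n_iter):
        g = K.T @ (K @ x - y)          # gradient of J at x
        Kg = K @ g
        denom = Kg @ Kg
        if denom == 0.0:               # gradient vanished: stationary point
            break
        tau_k = (g @ g) / denom        # exact minimizer along -g
        x = x - tau_k * g
    return x

# Hypothetical, well-conditioned test problem.
K = np.array([[2.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
x_true = np.array([1.0, -2.0])
y = K @ x_true

tau = 1.0 / np.linalg.norm(K.T @ K, 2)  # inside (0, 2 / ||K^T K||)
x_lw = landweber(K, y, tau, 500)
x_sd = steepest_descent(K, y, 200)
```

For ill-posed problems with noisy data, the iteration count itself acts as the regularization parameter, so in practice the loops would be stopped early rather than run to convergence.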
5 Practical Applications

5.1 Kernel-Based BRDF Model Inversion
Inversion by NTSVD
Consider the linear combination of the three kernels, $k_{geo}$, $k_{vol}$, and the isotropic kernel,

$$\hat f_{iso} + \hat f_{geo}k_{geo}(t_i, t_v, \varphi) + \hat f_{vol}k_{vol}(t_i, t_v, \varphi) = \hat r$$

for each observation. Considering the smoothing technique in $l_2$ space, we solve the following constrained optimization problem

$$\min \|[\hat f_{iso}, \hat f_{geo}, \hat f_{vol}]^T\|_2, \quad \text{s.t. } \hat f_{iso} + \hat f_{geo}k_{geo} + \hat f_{vol}k_{vol} = \hat r. \tag{47}$$
Let us consider an extreme example for the kernel-based BRDF model: if only a single observation is available at one time, then it is clear that the above equation has infinitely many solutions. If we denote $K = [1 \;\; k_{geo}(t_i, t_v, \varphi) \;\; k_{vol}(t_i, t_v, \varphi)]_{1 \times 3}$, then the singular value decomposition of the zero-augmented matrix $K_{aug}$ leads to $K_{aug} = U_{3 \times 3}\Sigma_{3 \times 3}V_{3 \times 3}^T$ with $U = [u_1 \; u_2 \; u_3]$, $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)$, and $V = [v_1 \; v_2 \; v_3]$, where $u_i$, $v_i$, $i = 1, 2, 3$, are 3-by-1 columns. Our a priori information is based on searching for a minimal norm solution within the infinite set of solutions, i.e., the solution $f = [\hat f_{iso}, \hat f_{geo}, \hat f_{vol}]^T$ satisfies $\hat f_{iso} + \hat f_{geo}k_{geo}(t_i, t_v, \varphi) + \hat f_{vol}k_{vol}(t_i, t_v, \varphi) = \hat r$ and at the same time $\|f\| \to$ minimum.
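The single-observation case can be sketched numerically as follows. The code zero-augments $K$ to a square matrix, takes its SVD, discards singular values below a relative tolerance, and assembles the minimal norm solution from the remaining singular triplets. The function name `ntsvd_solve`, the tolerance, and the kernel and reflectance values are hypothetical; the truncation rule used in Wang et al. (2007a) may differ.

```python
import numpy as np

def ntsvd_solve(K, r, tol=1e-10):
    """Minimal norm solution of K f = r via a truncated SVD of the
    zero-augmented square matrix (assumes no more rows than columns)."""
    K = np.atleast_2d(np.asarray(K, dtype=float))
    r = np.atleast_1d(np.asarray(r, dtype=float))
    m, n = K.shape
    K_aug = np.zeros((n, n))
    K_aug[:m, :] = K
    r_aug = np.zeros(n)
    r_aug[:m] = r
    U, s, Vt = np.linalg.svd(K_aug)
    keep = s > tol * s[0]                    # drop numerically zero singular values
    coeffs = (U[:, keep].T @ r_aug) / s[keep]
    return Vt[keep].T @ coeffs

# Hypothetical kernel values and reflectance for one observation.
K = np.array([[1.0, -1.2, 0.3]])
r = np.array([0.25])
f = ntsvd_solve(K, r)
```

For a single row, the result coincides with the closed form $f = K^T r / (K K^T)$, which is the shortest vector satisfying the observation equation.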
Tikhonov Regularized Solution
Denote by $M$ the number of measurements in the kernel-based models. Then the operator equation (12) can be rewritten in the following matrix-vector form

$$Kx = y, \tag{48}$$

where

$$K = \begin{bmatrix} 1 & k_{geo}(1) & k_{vol}(1) \\ 1 & k_{geo}(2) & k_{vol}(2) \\ \vdots & \vdots & \vdots \\ 1 & k_{geo}(M) & k_{vol}(M) \end{bmatrix}, \quad x = \begin{bmatrix} f_{iso} \\ f_{geo} \\ f_{vol} \end{bmatrix}, \quad y = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_M \end{bmatrix}.$$

Here $k_{geo}(k)$ and $k_{vol}(k)$ represent the values of the kernel functions $k_{geo}(t_i, t_v, \varphi)$ and $k_{vol}(t_i, t_v, \varphi)$ corresponding to the $k$-th measurement for $k = 1, 2, \ldots, M$, and $r_k$ represents the $k$-th observation for $k = 1, 2, \ldots, M$. By Tikhonov regularization, we solve for a regularized solution $x_\alpha$ by minimizing the functional

$$J(x, \alpha) = \frac{1}{2}\|Kx - y\|^2 + \frac{1}{2}\alpha\|Dx\|^2. \tag{49}$$

Choices of the parameter $\alpha$ and the scale operator $D$ are discussed in Sect. 3.3.
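A minimal Python sketch of the regularized solution of (48) follows. Setting the gradient of (49) to zero gives the normal equations $(K^T K + \alpha D^T D)x = K^T y$, which the code solves directly. The helper names, the default $D = I$, the value of $\alpha$, and the kernel values are assumptions for illustration; the parameter choice rules of Sect. 3.3 are not reproduced.

```python
import numpy as np

def kernel_matrix(k_geo, k_vol):
    """Assemble the M-by-3 matrix K of (48) from per-measurement kernel values."""
    k_geo = np.asarray(k_geo, dtype=float)
    k_vol = np.asarray(k_vol, dtype=float)
    return np.column_stack([np.ones_like(k_geo), k_geo, k_vol])

def tikhonov_solve(K, y, alpha, D=None):
    """Minimize 1/2 ||Kx - y||^2 + alpha/2 ||Dx||^2 via the normal equations."""
    n = K.shape[1]
    D = np.eye(n) if D is None else D        # assumed default scale operator
    A = K.T @ K + alpha * (D.T @ D)
    return np.linalg.solve(A, K.T @ y)

# Hypothetical kernel values for M = 4 measurements.
K = kernel_matrix([-1.0, -0.5, -1.5, -0.8], [0.1, -0.2, 0.3, 0.5])
x_true = np.array([0.3, 0.05, 0.2])
y = K @ x_true

alpha = 1e-3
x_alpha = tikhonov_solve(K, y, alpha)
x_ls = np.linalg.lstsq(K, y, rcond=None)[0]
residual = K.T @ (K @ x_alpha - y) + alpha * x_alpha
```

With $D = I$ the regularized solution is never longer than the least squares solution, which is the smoothing effect the functional is designed to produce.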
Land Surface Parameter Retrieval Results
We use the combination of the RossThick kernel and the LiTransit kernel in the numerical tests. In practice, the coefficient matrix $K$ cannot be determined accurately, and a perturbed version $\tilde K$ is obtained instead. Also, instead of the true measurement $y$, the observed measurement $y_n = y + n$ is the sum of the true measurement $y$ and the noise $n$, which for simplicity is assumed to be additive Gaussian random noise. Therefore it suffices to solve the following perturbed operator equation

$$\tilde K x = y_n,$$

where $\tilde K := K + \delta B$ for some perturbation matrix $B$, and $\delta$ denotes the noise level (upper bound) of $n$ in (0,1). In our numerical simulation, we assume that $B$ is a Gaussian random matrix and also that $\|y_n - y\| \le \delta < \|y_n\|$. This assumption about the noise can be interpreted as requiring the signal-to-noise ratio (SNR) to be greater than 1. We make such an assumption because we believe that observations (BRDF) are not trustworthy otherwise. It is clear that (48) is an underdetermined system if $M \le 2$ and an overdetermined system if $M > 3$. Note that for satellite remote sensing, because of the restrictions in view and illumination geometries, $\tilde K^T \tilde K$ need not have a bounded inverse (see Verstraete et al. 1996; Li et al. 2001; Wang et al. 2007a, 2008). We believe that the proposed regularization method can be employed to find an approximate solution $x_\alpha^\dagger$ that satisfies $\|\tilde K x_\alpha^\dagger - y_n\| \to \min$.

We use the atmospherically corrected moderate resolution imaging spectroradiometer (MODIS) 1B product acquired on a single day as an example of a single-observation BRDF at a certain viewing direction. Each pixel has a different view zenith angle and relative azimuth angle. The data MOD021KM.A2001135-150, with horizontal tile number (26) and vertical tile number (4), cover Shunyi county of Beijing, China. The three parameters are retrieved by using this 1B product. Figure 2a plots the reflectance for band 1 on a certain day, DOY = 137. In the MODIS AMBRALS algorithm, when insufficient reflectances or a poorly representative sampling of high quality reflectances are available for a full inversion, a database of archetypal BRDF parameters is used to supplement the data and a magnitude inversion is performed (see Verstraete et al. 1996; Strahler et al. 1999). We note that the standard MODIS AMBRALS algorithm cannot work for such an extreme case, even with the MODIS magnitude inversion, since it is hard to obtain seasonal data associated with a dynamic land cover at a particular site. But our method still works for such an extreme case because the smoothness constraint is already built into the model. We plot the white-sky albedos (WSAs) retrieved by NTSVD, Tikhonov regularization and sparse inversion for band 1 of one observation (DOY = 137) in Fig. 2b–d, respectively. From Fig. 2b–d, we see that the albedo retrieved from insufficient observations can reproduce the general profile. We observe that most of the details are preserved, though the results are not perfect. The results are similar to those from the NTSVD method developed in Wang et al. (2007a).
Hence, we conclude that these methods are useful for the retrieval of land surface parameters and for computing land surface albedos, and that they can serve as supplementary algorithms for the robust estimation of the land surface BRDF/albedos. We want to emphasize that our method can generate smoothed data to help the retrieval of parameters when sufficient observations are unavailable. As we have pointed out in Wang et al. (2007a, 2008), we do not suggest discarding useful historical information (e.g., data that are not too old) or multiangular data. Instead, we should fully employ such information if it is available. The key reason our algorithm outperforms previous algorithms is that it is adaptive, accurate, and very stable, and it solves kernel-based BRDF models of any order, so it may serve as a supplement to the BRDF/albedo retrieval product. The MODIS sensor generates a product by using 16 days of different observations; this is not a strict restriction for MODIS, since it aims at global exploration. For other sensors, the revisit period for the same area can be 20 days or longer. Therefore, for vegetation in the growing season, the reflectance and albedos will change significantly over that period. Hence robust algorithms to estimate BRDF and albedos in such cases are highly desired. Our algorithm is a proper choice, since it can generate retrieval results that closely approximate the true values for different vegetation types of land surfaces from a single observation.
Fig. 2 (a) Reflectance for band 1 of MOD021KM.A2001137; (b) white-sky albedo retrieved by Tikhonov regularization method; (c) white-sky albedo retrieved by NTSVD method; and (d) WSA retrieved by l1 sparse regularization method
Moreover, for some sensors with high spatial resolution, quasi multiangular data are impossible to obtain. This is why there are no high resolution albedo products. With our algorithm, however, such results can be obtained from a single observation, which is urgently needed in real applications.
5.2 Inversion of Airborne Lidar Remote Sensing
The analytical representations of the transmitted laser pulse and the true cross-sections are given by third-order spline functions. For example, we generate the synthetic laser pulse function $f_{lp}(x)$ within the interval [2, 3] by the formula $f_{lp}(x) = -31.25x^3 + 206.25x^2 - 356.25x + 218.75$. The analytical representation of the cross-section function $g_{cs}(x)$ within the interval [3/8, 1/2] is given by the formula $g_{cs}(x) = 8x^3 - 10x^2 + 3x + \frac{1}{6}$. The recorded waveform function $h_{wf}(x)$ (i.e., the data) is calculated by a convolution of the splines representing the transmitted laser pulse $f_{lp}(x)$ and the cross-section $g_{cs}(x)$, so that

$$h_{wf}(x) = f_{lp}(x) \star g_{cs}(x).$$
Fig. 3 (a) Synthetic emitted laser pulse; (b) comparison of the true and recovered cross-sections in the case of noise of level 1; (c) second emitted laser pulse; and (d) recorded echo waveform of the laser pulse shown in (c) (dotted curve) and its reconstruction using the cross-section shown in Fig. 4a (solid curve)
Note that $h_{wf}$ represents the observation, which means that different kinds of noise may also be recorded besides the true signal. Here we only consider a simple case, i.e., we assume that the noise is mainly additive Gaussian noise in [0, 1], i.e., $h_{wf} = h_{wf}^{true} + \delta \cdot \mathrm{rand}(\mathrm{size}(h_{wf}^{true}))$, where $\delta > 0$ is the noise level and $\mathrm{rand}(\mathrm{size}(h_{wf}^{true}))$ is the Gaussian random noise with the same size as $h_{wf}^{true}$. In our simulation, the Gaussian random noise is generated with mean 0 and standard deviation 2. We apply the Tikhonov regularization algorithm (see Sect. 3.3) to recover the cross-section and make a comparison. The synthetic laser pulse sampled with 1 ns resolution is shown in Fig. 3a. Comparisons of the undistorted cross-sections with the recovered cross-sections are illustrated in Fig. 3b. It is apparent that our algorithm can find stable recoveries of the simulated synthetic cross-sections. We do not plot the comparison results for small noise levels, since the algorithm yields perfect reconstructions. We also tested the applicability of the regularization method to LMS-Q560 data (RIEGL LMS-Q560 (www.riegl.co.at)). The emitted laser scanner sensor pulse is shown in Fig. 3c. The recorded waveform of the first echo of this pulse is shown in Fig. 3d (dotted line). The retrieved backscatter cross-section using the regularization method is shown in Fig. 4a. The solid line in Fig. 3d shows the reconstructed signal derived by the convolution of the emitted laser pulse and this cross-section. One may see from Fig. 4a that there are several small oscillations in the region [3850, 3860] ns. Since the amplitudes of these oscillations are typically small, we consider them to be noise or computational errors induced by noise when performing the numerical inversion. To show the necessity of regularization, we plot the result of least squares fitting without regularization in Fig. 4b. The comparison immediately reveals the importance of regularization. Further numerical results and comparisons can be found in Wang et al. (2009a).

Fig. 4 (a) The retrieved backscatter cross-section using regularization; (b) the retrieved backscatter cross-section using least squares fitting without regularization. Both panels plot amplitude against time stamp (ns) over 3810–3880 ns; the amplitude axis reaches about 0.1 in (a) and about $14 \times 10^7$ in (b)
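The contrast between regularized and unregularized deconvolution can be reproduced with a small synthetic experiment. The Python sketch below writes the discrete convolution as a matrix-vector product, adds Gaussian noise, and compares a Tikhonov solution with a plain least squares fit. The Gaussian-shaped pulse and cross-section, the noise level, the unit-variance noise, and the value of $\alpha$ are all assumptions chosen for illustration; they are not the spline model or the RIEGL data used above.

```python
import numpy as np

def convolution_matrix(pulse, n):
    """Matrix F such that F @ g equals np.convolve(pulse, g) for len(g) == n."""
    m = pulse.size
    F = np.zeros((n + m - 1, n))
    for j in range(n):
        F[j:j + m, j] = pulse
    return F

def tikhonov_deconvolve(F, h, alpha):
    """Minimize ||F g - h||^2 + alpha ||g||^2 via the normal equations."""
    A = F.T @ F + alpha * np.eye(F.shape[1])
    return np.linalg.solve(A, F.T @ h)

rng = np.random.default_rng(1)
t = np.arange(-15, 16)
pulse = np.exp(-0.5 * (t / 3.0) ** 2)            # hypothetical emitted pulse
x = np.arange(100)
g_true = (np.exp(-0.5 * ((x - 40) / 4.0) ** 2)
          + 0.5 * np.exp(-0.5 * ((x - 65) / 3.0) ** 2))  # hypothetical cross-section

F = convolution_matrix(pulse, g_true.size)
h_true = F @ g_true
delta = 0.01                                      # assumed noise level
h_noisy = h_true + delta * rng.standard_normal(h_true.size)

g_reg = tikhonov_deconvolve(F, h_noisy, alpha=1e-2)
g_ls = np.linalg.lstsq(F, h_noisy, rcond=None)[0]
err_reg = np.linalg.norm(g_reg - g_true) / np.linalg.norm(g_true)
err_ls = np.linalg.norm(g_ls - g_true) / np.linalg.norm(g_true)
```

Because the smooth pulse makes `F` severely ill-conditioned, the least squares fit amplifies the noise, while the regularized solution stays close to the assumed cross-section.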
5.3 Particle Size Distribution Function Retrieval
We consider retrieving the aerosol particle size distribution function $n(r)$ from the attenuation equation (7). This is an infinite dimensional problem with only a finite set of observations, so it is impossible to obtain a continuous expression of the size distribution $n(r)$ by computer. Numerically, we solve the discrete problem of the operator equation (9). Using collocation (Wang et al. 2006), the infinite dimensional problem can be written in a finite dimensional form by sampling grid points $\{r_j\}_{j=1}^N$ in the interval of interest $[a, b]$. Denoting by $K = (K_{ij})_{N \times N}$ the discrete kernel matrix and by $n$, $\varepsilon$, and $d$ the corresponding vectors, we have

$$Kn + \varepsilon = d. \tag{50}$$

This discrete form can be used for computer simulations.
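One simple way to arrive at a discrete system of the form (50) is to replace the integral by a quadrature rule on the grid $\{r_j\}$. The Python sketch below does this with trapezoidal weights. The function name, the toy exponential kernel, the wavelength grid, and the use of the trapezoidal rule are assumptions for illustration; they stand in for the Mie kernel and the collocation scheme of the cited work.

```python
import numpy as np

def collocation_matrix(kernel, lam, r):
    """K[i, j] = w_j * kernel(lam_i, r_j) with trapezoidal weights w_j
    on a uniform grid r (assumed quadrature rule)."""
    lam = np.asarray(lam, dtype=float)
    r = np.asarray(r, dtype=float)
    h = r[1] - r[0]
    w = np.full(r.size, h)
    w[0] = w[-1] = 0.5 * h
    return kernel(lam[:, None], r[None, :]) * w[None, :]

def toy_kernel(lam, r):
    """Hypothetical smooth kernel, used only to exercise the discretization."""
    return np.exp(-r / lam)

def unit_kernel(lam, r):
    """Kernel identically equal to 1, so K @ n approximates the integral of n."""
    return np.ones_like(lam * r)

N = 50
r = np.linspace(0.1, 2.0, N)          # radius grid on [a, b] = [0.1, 2]
lam = np.linspace(0.4, 1.0, N)        # hypothetical wavelength grid
K = collocation_matrix(toy_kernel, lam, r)
n_vec = r ** -3.5                     # a Junge-like test distribution
d = K @ n_vec                         # noise-free synthetic data

K_one = collocation_matrix(unit_kernel, lam, r)
integral_of_r = K_one @ r             # every entry approximates the integral of r
```

The unit kernel gives a quick sanity check: the trapezoidal rule integrates the linear function $r$ exactly, so each entry of `integral_of_r` equals $(b^2 - a^2)/2$.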
Phillips-Twomey's regularization is based on solving the problem

$$\min_n Q(n), \quad \text{s.t. } \|Kn - d\| = \Delta, \tag{51}$$
where $Q(n) = (Dn, n)$ and $D$ is a preassigned scale matrix. In Phillips-Twomey's formulation of regularization, the choice of the scale matrix is vital. They chose the form of the matrix $D$ by the norm of the second differences, $\sum_{i=2}^{N-1}(n_{i-1} - 2n_i + n_{i+1})^2$, which corresponds to the matrix $D = D_2$. However, the matrix $D$ is badly conditioned. For example, with $N = 200$, the largest singular value is 15.998012 and the smallest singular value is $6.495571 \times 10^{-17}$. Hence the condition number of $D$, defined as the ratio of the largest singular value to the smallest singular value, equals $2.462911 \times 10^{17}$, which is extremely large. Consequently, the scale matrix $D$ cannot filter the small singular values of the discrete kernel matrix $K$, even with a large Lagrangian multiplier $\mu$. This numerical difficulty encourages us to study a more robust scale matrix $D$, which is formulated as follows. We consider the Tikhonov regularization in the Sobolev $W^{1,2}$ space, as mentioned in Sect. 3.3. By a variational process, we solve the regularized linear system of equations

$$K^T K n + \alpha H n - K^T d = 0, \tag{52}$$
where $H$ is a tridiagonal matrix in the form of $D_1$. For the choice of the regularization parameter, we consider the a posteriori approach mentioned in Sect. 3.3. Suppose we are interested in particle sizes in the interval [0.1, 4] μm, so that the step size is $h_r = \frac{3.9}{N - 1}$. Choosing $N = 200$ discrete nodes, the largest singular value of $H$ is $1.041482176501067 \times 10^4$ in double machine precision, and the smallest singular value of $H$ is 0.99999999999953 in double machine precision. The condition number of $H$ is thus $1.041482176501554 \times 10^4$, so $H$ is better than the scale matrix $D$ of Phillips-Twomey's regularization in filtering small singular values of the discrete kernel $K$. To perform the numerical computations, we apply the technique developed in King et al. (1978), i.e., we assume that the actual aerosol particle size distribution function is the product of two functions $h(r)$ and $f(r)$: $n(r) = h(r)f(r)$, where $h(r)$ is a rapidly varying function of $r$, while $f(r)$ is more slowly varying. In this way we have

$$\tau_{aero}(\lambda) = \int_a^b [k(r, \lambda, \eta)h(r)]f(r)\,dr, \tag{53}$$

where $k(r, \lambda, \eta) = \pi r^2 Q_{ext}(r, \lambda, \eta)$, and we regard $k(r, \lambda, \eta)h(r)$ as the new kernel function, which corresponds to a new operator $\Xi$:

$$(\Xi f)(r) = \tau_{aero}(\lambda). \tag{54}$$
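The conditioning figures quoted above for the two scale matrices can be checked numerically. The Python sketch below builds the second-difference matrix $D = B^T B$ behind Phillips-Twomey's quadratic form and a $W^{1,2}$-type matrix $H$ for $N = 200$. The specific form $H = I + L/h_r^2$, with $L$ the first-difference matrix with free ends, is an assumption made here because the definition of $D_1$ in Sect. 3.3 is not repeated in this subsection; under that assumption the extreme singular values agree closely with the numbers in the text.

```python
import numpy as np

N = 200

# Phillips-Twomey: Q(n) = sum_{i=2}^{N-1} (n_{i-1} - 2 n_i + n_{i+1})^2 = (D n, n)
B = np.zeros((N - 2, N))
for i in range(N - 2):
    B[i, i:i + 3] = [1.0, -2.0, 1.0]
D = B.T @ B
sv_D = np.linalg.svd(D, compute_uv=False)      # sorted in descending order

# Assumed W^{1,2}-type scale matrix: identity plus scaled first-difference term.
h_r = 3.9 / (N - 1)                             # step size on [0.1, 4] micrometers
L = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
L[0, 0] = L[-1, -1] = 1.0                       # free (Neumann-type) ends
H = np.eye(N) + L / h_r ** 2
sv_H = np.linalg.svd(H, compute_uv=False)

cond_D = sv_D[0] / sv_D[-1]                     # of order 1e17 or worse, or infinite
cond_H = sv_H[0] / sv_H[-1]                     # about 1.04e4
```

The smallest singular values of $D$ are zero up to rounding because constant and linear vectors have vanishing second differences, whereas the identity term in $H$ keeps its spectrum bounded below by 1.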
After obtaining the function $f(r)$, the size distribution function $n(r)$ can be obtained by multiplying $f(r)$ by $h(r)$. The extinction efficiency factor (kernel function) $Q_{ext}(r, \lambda, \eta)$ is calculated from Mie theory: by Maxwell's electromagnetic ($E$; $H$) theory, the scattering by a spherical particle satisfies

$$\mathrm{curl}\,H = i\kappa\eta^2 E, \qquad \mathrm{curl}\,E = -i\kappa H, \tag{55}$$

where $\kappa = 2\pi/\lambda$. The Mie solution process is one of finding a set of complex numbers $a_n$ and $b_n$ which give vectors $E$ and $H$ that satisfy the boundary conditions at the surface of the sphere (Bohren and Huffman 1983). Suppose that the boundary conditions of the sphere are homogeneous; then the Mie scattering coefficients $a_n$ and $b_n$ are given by

$$a_n(z, \eta) = \frac{\eta\,\psi_n(\eta z)\,\psi_n'(z) - \psi_n(z)\,\psi_n'(\eta z)}{\eta\,\psi_n(\eta z)\,\xi_n'(z) - \xi_n(z)\,\psi_n'(\eta z)}, \tag{56}$$

$$b_n(z, \eta) = \frac{\psi_n(\eta z)\,\psi_n'(z) - \eta\,\psi_n(z)\,\psi_n'(\eta z)}{\psi_n(\eta z)\,\xi_n'(z) - \eta\,\xi_n(z)\,\psi_n'(\eta z)}, \tag{57}$$

where $\psi_n(z) = \sqrt{\frac{\pi z}{2}}\,J_{n+\frac{1}{2}}(z)$, $\xi_n(z) = \sqrt{\frac{\pi z}{2}}\,J_{n+\frac{1}{2}}(z) - i\sqrt{\frac{\pi z}{2}}\,N_{n+\frac{1}{2}}(z)$, and $J_{n+\frac{1}{2}}(z)$ and $N_{n+\frac{1}{2}}(z)$ are the Bessel function of the first kind and the Bessel function of the second kind (Neumann function) of order $n + \frac{1}{2}$, respectively. These complex-valued coefficients, functions of the refractive index $\eta$, $z = \frac{2\pi r}{\lambda}$, and $\eta z$, provide the full solution to the scattering problem. Thus the extinction efficiency factor (kernel function) can be written as

$$Q_{ext}(r, \lambda, \eta) = \frac{2}{z^2}\sum_{n=1}^{\infty}(2n + 1)\,\mathrm{Real}(a_n + b_n). \tag{58}$$
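For readers who want to evaluate the series (58) numerically, the following Python sketch uses the standard recurrences from Bohren and Huffman (1983): a downward recurrence for the logarithmic derivative of $\psi_n$ and upward recurrences for the Riccati-Bessel functions. The function name `mie_qext`, the truncation rule for the series, and the handling of the sign of the imaginary part are assumptions of this sketch. In particular, Bohren and Huffman write an absorbing index as $n + ik$, whereas this chapter writes $n - ik$; since the real part of $a_n + b_n$ is the same in both conventions, the code simply uses the absolute value of the imaginary part.

```python
import math

def mie_qext(r, wavelength, eta):
    """Extinction efficiency Q_ext = (2 / z^2) * sum (2n + 1) Re(a_n + b_n)."""
    z = 2.0 * math.pi * r / wavelength            # size parameter
    eta = complex(eta)
    m = complex(eta.real, abs(eta.imag))          # Bohren-Huffman sign convention
    mz = m * z
    n_stop = int(z + 4.0 * z ** (1.0 / 3.0) + 2.0)  # usual series truncation
    n_start = max(n_stop, int(abs(mz))) + 16

    # Logarithmic derivative D_n(mz) = psi_n'(mz) / psi_n(mz), downward recurrence.
    D = [0j] * (n_start + 1)
    for n in range(n_start, 0, -1):
        D[n - 1] = n / mz - 1.0 / (D[n] + n / mz)

    psi_prev, psi = math.cos(z), math.sin(z)      # psi_{-1}(z), psi_0(z)
    chi_prev, chi = -math.sin(z), math.cos(z)     # chi_{-1}(z), chi_0(z)
    xi = complex(psi, -chi)                       # xi_0(z)
    total = 0.0
    for n in range(1, n_stop + 1):
        psi_n = (2 * n - 1) / z * psi - psi_prev
        chi_n = (2 * n - 1) / z * chi - chi_prev
        xi_n = complex(psi_n, -chi_n)
        fa = D[n] / m + n / z
        fb = m * D[n] + n / z
        a_n = (fa * psi_n - psi) / (fa * xi_n - xi)
        b_n = (fb * psi_n - psi) / (fb * xi_n - xi)
        total += (2 * n + 1) * (a_n + b_n).real
        psi_prev, psi = psi, psi_n
        chi_prev, chi = chi, chi_n
        xi = xi_n
    return 2.0 / z ** 2 * total

# Small-particle checks with size parameter z = 0.1.
q_real = mie_qext(0.1, 2.0 * math.pi, 1.5)
q_absorbing = mie_qext(0.1, 2.0 * math.pi, 1.45 - 0.03j)
```

For small size parameters and a real refractive index, the result should approach the Rayleigh limit $\frac{8}{3}z^4\left|\frac{\eta^2 - 1}{\eta^2 + 2}\right|^2$, which offers a convenient sanity check; an absorbing index increases $Q_{ext}$ markedly in this regime.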
The size distribution function $n_{true}(r) = 10.5r^{-3.5}\exp(-10^{-12}r^{-2})$ is used to generate synthetic data. The particle size radius interval of interest is [0.1, 2] μm. This aerosol particle size distribution function can be written as $n_{true}(r) = h(r)f(r)$, where $h(r)$ is a rapidly varying function of $r$, while $f(r)$ is more slowly varying. Since most measurements of the continental aerosol particle size distribution reveal that these functions follow a Junge distribution (Junge 1955), $h(r) = r^{-(\nu^* + 1)}$, where $\nu^*$ is a shaping constant with typical values in the range 2.0–4.0, it is reasonable to use $h(r)$ of Junge type as the weighting factor to $f(r)$. In this work, we choose $\nu^* = 3$ and $f(r) = 10.5r^{1/2}\exp(-10^{-12}r^{-2})$. The form of this size distribution function is similar to the one given by Twomey (1975), where a rapidly changing function $h(r) = Cr^{-3}$ can be identified, but it is more similar to a Junge distribution for $r \ge 0.1$ μm. One can also generate other particle number size distributions and compare the reconstruction with the input. In the first place, the complex refractive index $\eta$ is assumed to be $1.45 - 0.00i$ and $1.50 - 0.00i$, respectively. Then we invert the same data, supposing that $\eta$ has an imaginary part. The complex refractive index is assumed to be $1.45 - 0.03i$ and $1.50 - 0.02i$, respectively. The precision of the approximation is characterized by the root mean-square error (rmse)

$$\mathrm{rmse} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\frac{(\tau_{comp}(\lambda_i) - \tau_{meas}(\lambda_i))^2}{(\tau_{comp}(\lambda_i))^2}}, \tag{59}$$

which describes the average relative deviation of the retrieved signals from the true signals. Here $\tau_{comp}$ refers to the retrieved signals and $\tau_{meas}$ refers to the measured signals. Numerical illustrations are plotted in Fig. 5b with noise level $\delta = 0.05$ for different refractive indices. The behavior of the regularization parameter is plotted in Fig. 5a. The rmses for each case are shown in Table 1.

Fig. 5 Iterative computational values of regularization parameters when the error level $\delta = 0.05$ (a); input and retrieved results with our inversion method in the case of error level $\delta = 0.05$ and different complex refractive indices (b)

Table 1 The rmses for different noise levels

Noise level       $\eta = 1.45 - 0.00i$       $\eta = 1.45 - 0.03i$       $\eta = 1.50 - 0.02i$
$\delta = 0.005$  $1.6443 \times 10^{-5}$     $1.2587 \times 10^{-5}$     $2.2773 \times 10^{-5}$
$\delta = 0.01$   $1.6493 \times 10^{-5}$     $1.2720 \times 10^{-5}$     $2.2847 \times 10^{-5}$
$\delta = 0.05$   $1.6996 \times 10^{-5}$     $1.3938 \times 10^{-5}$     $2.3504 \times 10^{-5}$
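The error measure (59) translates directly into code. The short Python sketch below is one possible implementation; the function name `relative_rmse` and the sample numbers are hypothetical and are not related to the values reported in Table 1.

```python
import numpy as np

def relative_rmse(tau_comp, tau_meas):
    """Root mean-square relative error (59) between retrieved and measured signals."""
    tau_comp = np.asarray(tau_comp, dtype=float)
    tau_meas = np.asarray(tau_meas, dtype=float)
    rel_sq = (tau_comp - tau_meas) ** 2 / tau_comp ** 2
    return float(np.sqrt(np.mean(rel_sq)))

# Hypothetical retrieved and measured optical thicknesses.
example_rmse = relative_rmse([1.0, 2.0], [1.1, 1.8])
```

Note that the deviation is normalized by the retrieved signal, as in (59), so the measure is undefined wherever the retrieved signal vanishes.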
6 Conclusion
In this chapter, we study regularization and optimization methods for solving inverse problems in geoscience and quantitative remote sensing. Three typical kernel-based problems are introduced: computation of the aerosol particle number size distribution function, estimation of land surface biomass parameters, and retrieval of the backscatter cross-section. These problems are formulated in functional space by introducing operator equations of the first kind. The mathematical models and solution methods in $l_1$ and $l_2$ spaces are considered. The regularization strategies and optimization solution techniques are described in detail. The equivalence between Tikhonov regularization and Bayesian statistical inference for solving geoscience inverse problems is established. The general regularization model in $l_p$-$l_q$ spaces (for $p, q \ge 0$), which can be convex or non-convex, is introduced. Numerical simulations for these problems are performed and illustrated.

Acknowledgements This research is supported by the National "973" Key Basic Research Developments Program of China under grant number 2007CB714400, the National Natural Science Foundation of China (NSFC) under grant numbers 10871191 and 40974075, and the Knowledge Innovation Programs of the Chinese Academy of Sciences, KZCX2-YW-QN107.
References

Ångström A (1929) On the atmospheric transmission of sun radiation and on dust in the air. Geogr Ann 11:156–166
Barzilai J, Borwein J (1988) Two-point step size gradient methods. IMA J Numer Anal 8:141–148
Bockmann C (2001) Hybrid regularization method for the ill-posed inversion of multiwavelength lidar data in the retrieval of aerosol size distributions. Appl Opt 40:1329–1342
Bockmann C, Kirsche A (2006) Iterative regularization method for lidar remote sensing. Comput Phys Commun 174:607–615
Bohren CF, Huffman DR (1983) Absorption and scattering of light by small particles. Wiley, New York
Brakhage H (1987) On ill-posed problems and the method of conjugate gradients. In: Engl HW, Groetsch CW (eds) Inverse and ill-posed problems. Academic, Boston, pp 165–175
Camps-Valls G (2008) New machine-learning paradigm provides advantages for remote sensing. SPIE Newsroom. doi:10.1117/2.1200806.1100
Davies CN (1974) Size distribution of atmospheric aerosol. J Aerosol Sci 5:293–300
Dennis JE, Schnabel RB (1983) Numerical methods for unconstrained optimization and nonlinear equations. Prentice Hall, Englewood Cliffs
Fletcher R (2001) On the Barzilai-Borwein method. Numerical Analysis report NA/207
Houghton JT, Meira Filho LG, Callander BA, Harris N, Kattenberg A, Maskell K (1996) Climate change 1995. Published for the Intergovernmental Panel on Climate Change, Cambridge University Press
Junge CE (1955) The size distribution and aging of natural aerosols as determined from electrical and optical data on the atmosphere. J Meteorol 12:13–25
Kelley CT (1999) Iterative methods for optimization. SIAM, Philadelphia
King MD, Byrne DM, Herman BM, Reagan JA (1978) Aerosol size distributions obtained by inversion of spectral optical depth measurements. J Atmos Sci 35:2153–2167
Li X, Wang J, Hu B, Strahler AH (1998) On utilization of a priori knowledge in inversion of remote sensing models. Sci China D 41:580–585
Li X, Wang J, Strahler AH (1999) Apparent reciprocal failure in BRDF of structured surfaces. Prog Nat Sci 9:747–750
Li X, Gao F, Liu Q, Wang JD, Strahler AH (2000) Validation of a new GO kernel and inversion of land surface albedo by kernel-driven model (1). J Remote Sens 4:1–7
Li X, Gao F, Wang J, Strahler AH (2001) A priori knowledge accumulation and its application to linear BRDF model inversion. J Geophys Res 106:11925–11935
McCartney EJ (1976) Optics of the atmosphere. Wiley, New York
Nguyen T, Cox K (1989) A method for the determination of aerosol particle distributions from light extinction data. In: Abstracts of the American association for aerosol research annual meeting, American Association of Aerosol Research, Cincinnati, pp 330–330
Nocedal J (1980) Updating quasi-Newton matrices with limited storage. Math Comput 35:773–782
Phillips DL (1962) A technique for the numerical solution of certain integral equations of the first kind. J Assoc Comput Mach 9:84–97
Pokrovsky O, Roujean JL (2002) Land surface albedo retrieval via kernel-based BRDF modeling: I. Statistical inversion method and model comparison. Remote Sens Environ 84:100–119
Pokrovsky OM, Roujean JL (2003) Land surface albedo retrieval via kernel-based BRDF modeling: II. An optimal design scheme for the angular sampling. Remote Sens Environ 84:120–142
Pokrovsky IO, Pokrovsky OM, Roujean JL (2003) Development of an operational procedure to estimate surface albedo from the SEVIRI/MSG observing system by using POLDER BRDF measurements: II. Comparison of several inversion techniques and uncertainty in albedo estimates. Remote Sens Environ 87:215–242
Privette JL, Eck TF, Deering DW (1997) Estimating spectral albedo and nadir reflectance through inversion of simple bidirectional reflectance distribution models with AVHRR/MODIS-like data. J Geophys Res 102:29529–29542
Roujean JL, Leroy M, Deschamps PY (1992) A bidirectional reflectance model of the Earth's surface for the correction of remote sensing data. J Geophys Res 97:20455–20468
Strahler AH, Li XW, Liang S, Muller J-P, Barnsley MJ, Lewis P (1994) MODIS BRDF/albedo product: algorithm technical basis document. NASA EOS-MODIS Doc. 2.1
Strahler AH, Lucht W, Schaaf CB, Tsang T, Gao F, Li X, Muller JP, Lewis P, Barnsley MJ (1999) MODIS BRDF/albedo product: algorithm theoretical basis document. NASA EOS-MODIS Doc. 5.0
Tikhonov AN, Arsenin VY (1977) Solutions of ill-posed problems. Wiley, New York
Tikhonov AN, Goncharsky AV, Stepanov VV, Yagola AG (1995) Numerical methods for the solution of ill-posed problems. Kluwer, Dordrecht
Twomey S (1975) Comparison of constrained linear inversion and an iterative nonlinear algorithm applied to the indirect estimation of particle size distributions. J Comput Phys 18:188–200
Twomey S (1977) Atmospheric aerosols. Elsevier, Amsterdam
Verstraete MM, Pinty B, Myneni RB (1996) Potential and limitations of information extraction on the terrestrial biosphere from satellite remote sensing. Remote Sens Environ 58:201–214
Voutilainen A, Kaipio JP (2000) Statistical inversion of aerosol size distribution data. J Aerosol Sci 31:767–768
Wagner W, Ullrich A, Ducic V, Melzer T, Studnicka N (2006) Gaussian decomposition and calibration of a novel small-footprint full-waveform digitising airborne laser scanner. ISPRS J Photogram Remote Sens 60:100–112
Wang YF (2007) Computational methods for inverse problems and their applications. Higher Education Press, Beijing
Wang YF (2008) An efficient gradient method for maximum entropy regularizing retrieval of atmospheric aerosol particle size distribution function. J Aerosol Sci 39:305–322
Wang YF, Ma SQ (2007) Projected Barzilai-Borwein methods for large scale nonnegative image restorations. Inverse Probl Sci Eng 15:559–583
Wang YF, Ma SQ (2009) A fast subspace method for image deblurring. Appl Math Comput 215:2359–2377
Wang YF, Xiao TY (2001) Fast realization algorithms for determining regularization parameters in linear inverse problems. Inverse Probl 17:281–291
Wang YF, Yang CC (2008) A regularizing active set method for retrieval of atmospheric aerosol particle size distribution function. J Opt Soc Am A 25:348–356
Wang YF, Yuan YX (2002) On the regularity of a trust region-CG algorithm for nonlinear ill-posed inverse problems. In: Sunada T, Sy PW, Yang L (eds) Proceedings of the third Asian mathematical conference, Diliman, Philippines, 23–27, Oct 2000. World Scientific, Singapore, pp 562–580
Wang YF, Yuan YX (2003) A trust region algorithm for solving distributed parameter identification problem. J Comput Math 21:759–772
Wang YF, Yuan YX (2005) Convergence and regularity of trust region methods for nonlinear ill-posed inverse problems. Inverse Probl 21:821–838
Wang YF, Li XW, Ma SQ, Yang H, Nashed Z, Guan YN (2005) BRDF model inversion of multiangular remote sensing: ill-posedness and interior point solution method. In: Proceedings of the 9th international symposium on physical measurements and signature in remote sensing (ISPMSRS), Beijing, 17–19 Oct 2005, vol XXXVI, pp 328–330
Wang YF, Fan SF, Feng X, Yan GJ, Guan YN (2006a) Regularized inversion method for retrieval of aerosol particle size distribution function in $W^{1,2}$ space. Appl Opt 45:7456–7467
Wang YF, Wen Z, Nashed Z, Sun Q (2006b) Direct fast method for time-limited signal reconstruction. Appl Opt 45:3111–3126
Wang YF, Li XW, Nashed Z, Zhao F, Yang H, Guan YN, Zhang H (2007a) Regularized kernel-based BRDF model inversion method for ill-posed land surface parameter retrieval. Remote Sens Environ 111:36–50
Wang YF, Fan SF, Feng X (2007b) Retrieval of the aerosol particle size distribution function by incorporating a priori information. J Aerosol Sci 38:885–901
Wang YF, Yang CC, Li XW (2008) A regularizing kernel-based BRDF model inversion method for ill-posed land surface parameter retrieval using smoothness constraint. J Geophys Res 113:D13101
Wang YF, Zhang JZ, Roncat A, Künzer C, Wagner W (2009a) Regularizing method for the determination of the backscatter cross-section in Lidar data. J Opt Soc Am A 26:1071–1079
Wang YF, Cao JJ, Yuan YX, Yang CC, Xiu NH (2009b) Regularizing active set method for nonnegatively constrained ill-posed multichannel image restoration problem. Appl Opt 48:1389–1401
Wang YF, Yang CC, Li XW (2009c) Kernel-based quantitative remote sensing inversion. In: Camps-Valls G, Bruzzone L (eds) Kernel methods for remote sensing data analysis. Wiley, New York
Wang YF, Ma SQ, Yang H, Wang JD, Li XW (2009d) On the effective inversion by imposing a priori information for retrieval of land surface parameters. Sci China D 39:360–369
Wanner W, Li X, Strahler AH (1995) On the derivation of kernels for kernel-driven models of bidirectional reflectance. J Geophys Res 100:21077–21090
Xiao TY, Yu SG, Wang YF (2003) Numerical methods for the solution of inverse problems. Science Press, Beijing
Ye YY (1997) Interior point algorithms: theory and analysis. Wiley, Chichester
Yuan YX (1993) Numerical methods for nonlinear programming. Shanghai Science and Technology Publication, Shanghai
Yuan YX (1994) Nonlinear programming: trust region algorithms. In: Xiao ST, Wu F (eds) Proceedings of Chinese SIAM annual meeting, Tsinghua University Press, Beijing, pp 83–97
Yuan YX (2001) A scaled central path for linear programming. J Comput Math 19:35–40
Correlation Modeling of the Gravity Field in Classical Geodesy Christopher Jekeli
Contents
1 Introduction
2 Correlation Functions
2.1 Functions on the Sphere
2.2 Functions on the Plane
2.3 From the Sphere to the Plane
2.4 Properties of Correlation Functions and PSDs
3 Stochastic Processes and Covariance Functions
3.1 Earth’s Anomalous Gravitational Field
3.2 The Disturbing Potential as a Stochastic Process
4 Covariance Models
4.1 Analytic Models
4.2 The Reciprocal Distance Model
4.3 Parameter Determination
5 Summary and Future Directions
Appendix A
Appendix B
References
Abstract
The spatial correlation of the Earth’s gravity field is well known and widely utilized in applications of geophysics and physical geodesy. This paper develops the mathematical theory of correlation functions, as well as covariance functions under a statistical interpretation of the field, for functions and processes on the sphere and plane, with formulation of the corresponding power spectral densities in the respective frequency domains and with extensions into the third dimension for harmonic functions. The theory is applied, in particular, to the disturbing gravity potential with consistent relationships of the covariance and power spectral density to any of its spatial derivatives. An analytic model for the covariance function of the disturbing potential is developed for both spherical and planar application, which has analytic forms also for all derivatives in both the spatial and the frequency domains (including the along-track frequency domain). Finally, a method is demonstrated to determine the parameters of this model from empirical regional power spectral densities of the gravity anomaly.

C. Jekeli, Division of Geodetic Science, School of Earth Sciences, Ohio State University, Columbus, OH, USA. e-mail: [email protected]
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_28
1 Introduction
The Earth’s gravitational field plays major roles in geodesy, geophysics, and geodynamics and is also a significant factor in specific applications such as precision navigation and satellite orbit analysis. With the advance of instrumentation technology over the last several decades, we now have gravitational models of high spatial resolution over most of the land areas, thanks to extensive ground and expanding airborne survey campaigns, and over the oceans, owing to satellite radar altimetry, which measures essentially a level surface. Recent satellite gravity missions (e.g., the Gravity Field and Steady-State Ocean Circulation Explorer (GOCE), Rummel et al. 2011) also have vastly improved the longer-wavelength parts of the model with globally distributed in situ measurements. Despite these improvements, there remain deficiencies in resolution, including a lack of uniformity and accuracy in some land areas, such as Antarctica and significant parts of Africa, South America, and Asia (Pavlis et al. 2012a). These gaps will be filled with continued measurement, mostly using airborne systems for efficient accessibility to remote regions.

Determining the required resolution and analyzing the effect or significance of the gravitational field at various scales for particular applications often rely on some a priori knowledge of the field. Also, the interpolation and extrapolation of the field from given discrete data and the prediction or estimation of field quantities other than those directly measured require a weighting function based on the essential spatial correlative characteristics of the gravitational field. For these reasons, the study and development of correlation or covariance functions of the field have occupied geodesists and geophysicists in tandem with the advancements of measurement and instrument technology.
The rather slow attenuation of the field as a function of resolution gives it at regional scales a kind of random character, much like the Earth’s topography. Indeed, the shorter spatial wavelengths of the gravitational field are in many cases highly correlated with the topography; and, profiles of topography, like coastlines, are known to be fractals, which arise from certain random fluctuations, analogous to Brownian motion (Mandelbrot 1983). Thus, we may argue that the Earth’s gravitational field at fine scales also exhibits a stochastic nature (Jekeli 1991). This randomness in the field has been argued and counterargued for decades, but it does form the basis for one of the more successful estimation methods in physical geodesy, called least-squares collocation (Moritz 1980). In addition, the correlative description of the field is advantageous in more general error analyses of the problem of field modeling; and, it is particularly useful in generating synthetic fields for deterministic simulations of the field for Monte Carlo types of analyses.

The stochastic nature of the gravitational field, besides being assumed primarily for the shorter wavelengths, is also limited to the horizontal dimensions. The variation in the vertical (above the Earth’s surface) is constrained deterministically by the attenuation of the gravitational potential with distance from its source, as governed by the solution to Laplace’s differential equation in free space. However, this constraint also extends the stochastic interpretation in estimation theory, since it analytically establishes mutually consistent correlations for vertical derivatives of the potential, or between its horizontal and vertical derivatives, or between the potential (and any of its derivatives) at different vertical levels. Thus, with the help of the corresponding covariance functions, one is able to estimate, for example, the geoid undulation from gravity anomaly data in a purely operational approach using no other physical models, which is the essence of the method of least-squares collocation.

It is necessary to distinguish and relate correlation and covariance functions as used in this text. The covariance function refers to random or stochastic processes and is the statistical expectation of the product of the centralized process at two points of the process (i.e., of two random variables with their means removed). The correlation function has more than one definition. As a natural extension of the Pearson correlation coefficient, it is the covariance function normalized by the square roots of the variances of the process at the two points (Priestley 1981). An alternative definition is the statistical expectation of the non-centralized product of the process at two points (Maybeck 1979).
A third definition characterizes the correlation of deterministic (nonrandom) functions on the basis of averages of products over the domain of the function (de Coulon 1986). Ultimately, the covariance function and the correlation function, in its various incarnations, are related, but there is an advantage in distinguishing between the stochastic and the non-stochastic versions. Minimum error variance estimation requires a stochastic interpretation, and the gravitational field is characterized stochastically in terms of covariance functions. If interpolation or filtering or simulation through arbitrary synthesis is the principal application, then it may be sufficient to dispense with the stochastic interpretation. If the stochastic process is ergodic, then the average-based correlation function of its realization is the same as its covariance function, provided the means are known and removed. Thus, one may start with the formulation of the physical correlation of the gravitational field without the stochastic underpinning and introduce the stochastic interpretation as needed. Since one of the main applications is the popular least-squares collocation in physical geodesy, the terminology of covariance functions dominates the later sections.

Whether from the more general or the stochastic viewpoint, the correlative methods can be extended to other fields on the Earth’s surface and to fields that are harmonic in free space. For example, the anomalous magnetic potential (due to the magnetization of the crust material induced by the main, outer-core-generated field of the Earth) also satisfies Laplace’s differential equation. Thus, it shares basic similarities with the anomalous gravitational field. Under certain, albeit rather restrictive, assumptions, one field may even be represented in terms of the other (Poisson’s relationship; Baranov 1957). Although this relationship has not been studied in detail from the stochastic or more general correlative viewpoint, it does open numerous possibilities in estimation and error analysis.

Finally, it is noted that spatial data analyses in geophysics, specifically the optimal prediction and interpolation of geophysical signals, known as kriging (Olea 1999), rely, as does collocation in geodesy, on a correlative interpretation of the signals. Semi-variograms, instead of correlation functions, are used in kriging, but they are closely related. Therefore, a study of modeling one (correlations or covariances, in the present case) immediately carries over to the other.

The following sections review correlation functions on the sphere and plane, as well as the transforms into their respective spatial frequency domains. For the stochastic understanding of the geopotential field, the covariance function is introduced, under the assumption of ergodicity (hence, stationarity). Again, the frequency domain formulation, that is, the power spectral density of the field, is of particular importance. The method of covariance propagation, which is indispensable in such estimation techniques as least-squares collocation, naturally motivates the analytic modeling of covariance functions. Models have occupied physical geodesists since the utility of least-squares collocation first became evident, and myriad types of models and approaches exist. In this paper, a single yet comprehensive, adaptable, and flexible model is developed that offers consistency among all derivatives of the potential, whether in spherical or planar coordinates, and in the space or frequency domains. Methods to derive appropriate parameters for this model conclude the essential discussions of this paper.
2 Correlation Functions

We start with functions on the sphere and develop the concept of the correlation function without the need for a stochastic foundation. The statistical interpretation may be imposed at a later time when it is convenient or necessary to do so. As it happens, the infinite plane as functional domain offers more than one option for developing correlation functions, depending on the class of functions, and, therefore, will be treated after considering the unit sphere, $\Omega$. Other types of surfaces that approximate the Earth’s surface more accurately (ellipsoid, geoid, topographic surface) could also be contemplated. However, the extension of the correlation function into space according to potential theory and the development of a useful duality in the spatial frequency domain then become more problematic, if not impossible. In essence, we require surfaces on which functions have a spectral decomposition and such that convolutions transform into the frequency domain as products of spectra. The latter requirement is tied to the analogy between convolutions and correlations. Furthermore, the surface should be sufficiently simple as a boundary in the solution to Laplace’s equation for the gravitational potential. To satisfy all these requirements and with a view toward practical applications, the present discussion is restricted to the plane and the sphere.

Although data on the surface are always discrete, we do not consider discrete functions. Rather, it is always assumed that the data are samples of a continuous function. Then, the correlation functions to be defined are also continuous, and correlations among the data are interpreted as samples of the correlation function.
2.1 Functions on the Sphere

Let $g$ and $h$ be continuous, square-integrable functions on the unit sphere, $\Omega$, i.e.,

\[
\frac{1}{4\pi}\iint_\Omega g^2\,d\Omega < \infty, \qquad \frac{1}{4\pi}\iint_\Omega h^2\,d\Omega < \infty, \tag{1}
\]

and suppose they depend on the spherical polar coordinates, $\{(\theta,\lambda) \mid 0 \le \theta \le \pi,\ 0 \le \lambda < 2\pi\}$. Each function may be represented in terms of its Legendre transform as an infinite series of spherical harmonics,

\[
g(\theta,\lambda) = \sum_{n=0}^{\infty}\sum_{m=-n}^{n} G_{n,m}\,\bar{Y}_{n,m}(\theta,\lambda), \tag{2}
\]

where the Legendre transform, or the Legendre spectrum of $g$, is

\[
G_{n,m} = \frac{1}{4\pi}\iint_\Omega g(\theta,\lambda)\,\bar{Y}_{n,m}(\theta,\lambda)\,d\Omega, \tag{3}
\]

and where the functions, $\bar{Y}_{n,m}(\theta,\lambda)$, are surface spherical harmonics defined by

\[
\bar{Y}_{n,m}(\theta,\lambda) = \bar{P}_{n,|m|}(\cos\theta)
\begin{cases}
\cos m\lambda, & m \ge 0 \\
\sin |m|\lambda, & m < 0
\end{cases} \tag{4}
\]

The functions, $\bar{P}_{n,m}$, are associated Legendre functions of the first kind, fully normalized so that

\[
\frac{1}{4\pi}\iint_\Omega \bar{Y}_{n',m'}(\theta,\lambda)\,\bar{Y}_{n,m}(\theta,\lambda)\,d\Omega =
\begin{cases}
1, & n' = n \text{ and } m' = m \\
0, & n' \ne n \text{ or } m' \ne m
\end{cases} \tag{5}
\]

A similar relationship exists between $h$ and its Legendre transform, $H_{n,m}$. The degree and order, $(n, m)$, are wave numbers belonging to the frequency domain. The unit sphere is used here only for convenience, and any sphere (radius, $R$) may be used. The Legendre spectrum then refers to this sphere.
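As a small numerical illustration of Eqs. (3)–(5), the following sketch computes the Legendre spectrum of a test function by quadrature. It is restricted to degrees 0 and 1, for which the fully normalized harmonics have simple closed forms; the midpoint grid, its resolution, and the names `sphere_mean` and `legendre_spectrum` are assumptions made for this example only.

```python
import math

SQRT3 = math.sqrt(3.0)

# Fully normalized surface spherical harmonics of degree 0 and 1, Eq. (4).
Y = {
    (0, 0): lambda th, lam: 1.0,
    (1, -1): lambda th, lam: SQRT3 * math.sin(th) * math.sin(lam),
    (1, 0): lambda th, lam: SQRT3 * math.cos(th),
    (1, 1): lambda th, lam: SQRT3 * math.sin(th) * math.cos(lam),
}

def sphere_mean(func, n_theta=90, n_lambda=180):
    """Approximate (1/4pi) * integral of func over the unit sphere (midpoint rule)."""
    d_th = math.pi / n_theta
    d_lam = 2.0 * math.pi / n_lambda
    total = 0.0
    for i in range(n_theta):
        th = (i + 0.5) * d_th
        w = math.sin(th) * d_th * d_lam  # area element sin(theta) dtheta dlambda
        for j in range(n_lambda):
            lam = (j + 0.5) * d_lam
            total += func(th, lam) * w
    return total / (4.0 * math.pi)

def legendre_spectrum(g):
    """Coefficients G_{n,m} of Eq. (3) for the harmonics stored in Y."""
    return {nm: sphere_mean(lambda th, lam: g(th, lam) * y(th, lam))
            for nm, y in Y.items()}

# Usage: a test function synthesized from known coefficients via Eq. (2).
g_test = lambda th, lam: 2.0 + 0.5 * Y[(1, 0)](th, lam) - 0.25 * Y[(1, 1)](th, lam)
G_test = legendre_spectrum(g_test)  # recovers approximately 2, 0.5, -0.25 and 0
```

The quadrature is only second-order accurate, so the recovered coefficients agree with the synthesized ones to about the grid resolution squared.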
We define the correlation function of $g$ and $h$ as

\[
\phi_{gh}(\psi,\alpha) = \frac{1}{4\pi}\iint_\Omega g(\theta,\lambda)\,h(\theta',\lambda')\,\sin\theta\,d\theta\,d\lambda, \tag{6}
\]

where the points $(\theta,\lambda)$ and $(\theta',\lambda')$ are related by

\[
\cos\psi = \cos\theta\cos\theta' + \sin\theta\sin\theta'\cos(\lambda'-\lambda), \tag{7}
\]

\[
\tan\alpha = \frac{\sin\theta'\sin(\lambda'-\lambda)}{\sin\theta\cos\theta' - \cos\theta\sin\theta'\cos(\lambda'-\lambda)}, \tag{8}
\]

and where the integration is performed over all pairs of points, $(\theta,\lambda)$ and $(\theta',\lambda')$, separated by the fixed spherical distance, $\psi$, and oriented by the fixed azimuth, $\alpha$. If the spherical harmonic series, Eq. (2), for $g$ and $h$ are substituted into Eq. (6), we find that, due to the special geometry of the sphere, no simple analytic expression results unless we further average over all azimuths, $\alpha$, thus imposing isotropy on the correlation function. Therefore, we redefine the correlation function of $g$ and $h$ (on the sphere) as follows:

\[
\phi_{gh}(\psi) = \frac{1}{8\pi^2}\int_0^{2\pi}\iint_\Omega g(\theta,\lambda)\,h(\theta',\lambda')\,\sin\theta\,d\theta\,d\lambda\,d\alpha. \tag{9}
\]
More precisely, this is the cross-correlation function of $g$ and $h$. The autocorrelation function of $g$ is simply $\phi_{gg}(\psi)$. The prefixes, cross- and auto-, are used mostly to emphasize a particular application and may be dropped when no confusion arises. Because of its sole dependence on $\psi$, $\phi_{gh}$ can be expressed as an infinite series of Legendre polynomials:

\[
\phi_{gh}(\psi) = \sum_{n=0}^{\infty}(2n+1)\,(\Phi_{gh})_n\,P_n(\cos\psi), \tag{10}
\]

where the coefficients, $(\Phi_{gh})_n$, constitute the Legendre transform of $\phi_{gh}$:

\[
(\Phi_{gh})_n = \frac{1}{2}\int_0^{\pi}\phi_{gh}(\psi)\,P_n(\cos\psi)\,\sin\psi\,d\psi. \tag{11}
\]

Substituting the decomposition formula for the Legendre polynomial,

\[
P_n(\cos\psi) = \frac{1}{2n+1}\sum_{m=-n}^{n}\bar{Y}_{n,m}(\theta,\lambda)\,\bar{Y}_{n,m}(\theta',\lambda'), \tag{12}
\]

and Eq. (9) into Eq. (11) and then simplifying using the orthogonality, Eq. (5), and the definition of the Legendre spectrum, Eq. (3), we find:

\[
\begin{aligned}
(\Phi_{gh})_n &= \frac{1}{2n+1}\sum_{m=-n}^{n}\frac{1}{4\pi}\iint_\Omega g(\theta,\lambda)\,\bar{Y}_{n,m}(\theta,\lambda)\left(\frac{1}{4\pi}\iint_\Omega h(\theta',\lambda')\,\bar{Y}_{n,m}(\theta',\lambda')\,\sin\psi\,d\psi\,d\alpha\right)\sin\theta\,d\theta\,d\lambda \\
&= \frac{1}{2n+1}\sum_{m=-n}^{n}G_{n,m}H_{n,m},
\end{aligned} \tag{13}
\]

where $(\theta,\lambda)$ is constant in the inner integral. The quantities, $(\Phi_{gh})_n$, constituting the Legendre transform of the correlation function, may be called the (cross-) power spectral density (PSD) of $g$ and $h$. They are determined directly from the Legendre spectra of $g$ and $h$. The (auto-) PSD of $g$ is simply

\[
(\Phi_{gg})_n = \frac{1}{2n+1}\sum_{m=-n}^{n}G_{n,m}^2. \tag{14}
\]

The terminology that refers the correlation function to “power” is appropriate since it is an integral divided by the solid angle of the sphere. For functions on the plane, we make a distinction between energy and power, depending on the class of functions.
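The step from a Legendre spectrum to the PSD, Eqs. (13) and (14), and back to the isotropic correlation function, Eq. (10), can be sketched in a few lines. The storage layout (a list indexed by degree, each entry holding the $2n+1$ coefficients for orders $-n,\dots,n$) and the function names are assumptions of this example, and the series is truncated at the highest degree supplied.

```python
import math

def legendre_polynomials(n_max, x):
    """Return [P_0(x), ..., P_{n_max}(x)] by the three-term recurrence."""
    p = [1.0, x]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return p[: n_max + 1]

def degree_psd(G, H=None):
    """Eqs. (13)/(14): (Phi_gh)_n = sum_m G_{n,m} H_{n,m} / (2n + 1).

    G[n] (and H[n]) is assumed to hold the 2n + 1 coefficients of degree n.
    With H omitted, the auto-PSD of g is returned.
    """
    H = G if H is None else H
    return [sum(a * b for a, b in zip(Gn, Hn)) / (2 * n + 1)
            for n, (Gn, Hn) in enumerate(zip(G, H))]

def correlation(psd, psi):
    """Eq. (10), truncated: sum_n (2n + 1) (Phi_gh)_n P_n(cos psi)."""
    p = legendre_polynomials(len(psd) - 1, math.cos(psi))
    return sum((2 * n + 1) * c * pn for n, (c, pn) in enumerate(zip(psd, p)))

# Usage with made-up coefficients up to degree 2.
G_demo = [[1.0], [0.5, -0.2, 0.3], [0.1, 0.0, -0.4, 0.2, 0.05]]
psd_demo = degree_psd(G_demo)
phi_at_zero = correlation(psd_demo, 0.0)  # equals the sum of squared coefficients
```

Because $P_n(1) = 1$, the value at $\psi = 0$ reduces to the sum of all squared coefficients, which is a convenient check on any implementation.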
2.2 Functions on the Plane

On the infinite plane with Cartesian coordinates, $\{(x_1,x_2) \mid -\infty < x_1 < \infty,\ -\infty < x_2 < \infty\}$, we consider several possibilities for the functions. The situation is straightforward if the functions are periodic and square integrable over the domain of a period or are square integrable over the plane. Anticipating no confusion, these functions again are denoted $g$ and $h$. For the periodic case, with periods, $Q_1$ and $Q_2$, in the respective coordinates,

\[
\frac{1}{Q_1Q_2}\int_0^{Q_1}\!\!\int_0^{Q_2} g^2\,dx_1\,dx_2 < \infty, \qquad \frac{1}{Q_1Q_2}\int_0^{Q_1}\!\!\int_0^{Q_2} h^2\,dx_1\,dx_2 < \infty; \tag{15}
\]

and each function may be represented in terms of its Fourier transform as an infinite series of sines and cosines, conveniently combined using the complex exponential:

\[
g(x_1,x_2) = \frac{1}{Q_1Q_2}\sum_{k_1=-\infty}^{\infty}\sum_{k_2=-\infty}^{\infty} G_{k_1,k_2}\,e^{\,i2\pi\left(\frac{k_1x_1}{Q_1}+\frac{k_2x_2}{Q_2}\right)}, \tag{16}
\]

where the Fourier transform, or the Fourier spectrum of $g$, is given by

\[
G_{k_1,k_2} = \int_0^{Q_1}\!\!\int_0^{Q_2} g(x_1,x_2)\,e^{-i2\pi\left(\frac{k_1x_1}{Q_1}+\frac{k_2x_2}{Q_2}\right)}\,dx_1\,dx_2, \tag{17}
\]
and a similar relationship exists between $h$ and its transform, $H_{k_1,k_2}$. Again, the integers, $k_1$, $k_2$, are the wave numbers in the frequency domain. Assuming both functions have the same periods, the correlation function of $g$ and $h$ is defined by

\[
\phi_{gh}(s_1,s_2) = \frac{1}{Q_1Q_2}\int_{-Q_1/2}^{Q_1/2}\int_{-Q_2/2}^{Q_2/2} g^*(x_1',x_2')\,h(x_1'+s_1,\,x_2'+s_2)\,dx_1'\,dx_2', \tag{18}
\]

where $g^*$ is the complex conjugate of $g$ (we deal only with real functions but need this formal definition). The independent variables are the differences between points of evaluation of $h$ at $(x_1,x_2)$ and $g$ at $(x_1',x_2')$, respectively, as follows:

\[
s_1 = x_1 - x_1', \qquad s_2 = x_2 - x_2'. \tag{19}
\]

The integration is performed with $s_1$ and $s_2$ fixed and requires recognition of the fact that $g$ and $h$ are periodic. The correlation function is periodic with the same periods as for $g$ and $h$, and its Fourier transform, that is, the power spectral density (PSD), is discrete and given by

\[
(\Phi_{gh})_{k_1,k_2} = \int_0^{Q_1}\!\!\int_0^{Q_2} \phi_{gh}(s_1,s_2)\,e^{-i2\pi\left(\frac{k_1s_1}{Q_1}+\frac{k_2s_2}{Q_2}\right)}\,ds_1\,ds_2. \tag{20}
\]

Substituting the correlation function, defined by Eq. (18), into Eq. (20) yields, after some straightforward manipulations (making use of Eq. (17) and the periodicity of its integrand):

\[
(\Phi_{gh})_{k_1,k_2} = \frac{1}{Q_1Q_2}\,G^*_{k_1,k_2}\,H_{k_1,k_2}. \tag{21}
\]

Analogous to the spherical case, Eq. (13), the PSD of periodic functions on the plane can be determined directly from their Fourier series coefficients.
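For sampled periodic data, Eq. (21) can be checked against the definition of the correlation function, Eq. (18). The sketch below is a minimal illustration, assuming regularly spaced samples `g[j1][j2]` over one period and replacing the integrals by Riemann sums; the plain double-sum transform `dft2` is written out only to keep the example free of external libraries, and all function names are hypothetical.

```python
import cmath
import math

def dft2(a, sign=-1):
    """Plain 2-D discrete Fourier sum (O(N^4), adequate for small grids)."""
    n1, n2 = len(a), len(a[0])
    out = []
    for k1 in range(n1):
        row = []
        for k2 in range(n2):
            row.append(sum(
                a[j1][j2] * cmath.exp(sign * 2j * math.pi * (k1 * j1 / n1 + k2 * j2 / n2))
                for j1 in range(n1) for j2 in range(n2)))
        out.append(row)
    return out

def fourier_spectrum(g, Q1, Q2):
    """Riemann-sum approximation of Eq. (17) from one period of samples."""
    n1, n2 = len(g), len(g[0])
    cell = (Q1 / n1) * (Q2 / n2)  # area of one sample cell
    return [[cell * v for v in row] for row in dft2(g)]

def periodic_psd(g, h, Q1, Q2):
    """Eq. (21): conj(G_{k1,k2}) * H_{k1,k2} / (Q1 * Q2)."""
    G = fourier_spectrum(g, Q1, Q2)
    H = fourier_spectrum(h, Q1, Q2)
    return [[Gv.conjugate() * Hv / (Q1 * Q2) for Gv, Hv in zip(Gr, Hr)]
            for Gr, Hr in zip(G, H)]

def correlation_from_psd(psd, Q1, Q2):
    """Fourier series of the form of Eq. (16) applied to the PSD, at the sample lags."""
    return [[(v / (Q1 * Q2)).real for v in row] for row in dft2(psd, sign=+1)]

def correlation_direct(g, h):
    """Discretized Eq. (18): mean of g(x') * h(x' + s) with periodic wrap-around."""
    n1, n2 = len(g), len(g[0])
    return [[sum(g[j1][j2] * h[(j1 + m1) % n1][(j2 + m2) % n2]
                 for j1 in range(n1) for j2 in range(n2)) / (n1 * n2)
             for m2 in range(n2)] for m1 in range(n1)]
```

For real samples, `correlation_from_psd(periodic_psd(g, h, Q1, Q2), Q1, Q2)` and `correlation_direct(g, h)` agree to rounding error, which is the discrete counterpart of the statement that Eq. (21) is the transform of Eq. (18).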
A very similar situation arises for nonperiodic functions that are nevertheless square integrable on the plane:

\[
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g^2\,dx_1\,dx_2 < \infty, \qquad \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h^2\,dx_1\,dx_2 < \infty. \tag{22}
\]

In this case, the Fourier transform relationships for the function are given by

\[
g(x_1,x_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} G(f_1,f_2)\,e^{\,i2\pi(f_1x_1+f_2x_2)}\,df_1\,df_2, \qquad
G(f_1,f_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g(x_1,x_2)\,e^{-i2\pi(f_1x_1+f_2x_2)}\,dx_1\,dx_2, \tag{23}
\]

where $f_1$ and $f_2$ are corresponding spatial (cyclical) frequencies. The correlation function is given by

\[
\phi_{gh}(s_1,s_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g^*(x_1',x_2')\,h(x_1'+s_1,\,x_2'+s_2)\,dx_1'\,dx_2'; \tag{24}
\]

and its Fourier transform is easily shown to be

\[
\Phi_{gh}(f_1,f_2) = G^*(f_1,f_2)\,H(f_1,f_2). \tag{25}
\]
This Fourier transform of the correlation function is called more properly the energy spectral density, since the correlation function is simply the integral of the product of the functions. The square integrability of the functions implies that they have finite energy. Later we consider stochastic processes on the plane that are stationary, which means that they are not square integrable. For this case, one may relax the integrability condition to

\[
\lim_{E_1\to\infty}\lim_{E_2\to\infty}\frac{1}{E_1E_2}\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2}|g|^2\,dx_1\,dx_2 < \infty, \tag{26}
\]

and we say that $g$ has finite power (energy per domain unit). Analogously, the correlation function is given by

\[
\phi_{gh}(s_1,s_2) = \lim_{E_1\to\infty}\lim_{E_2\to\infty}\frac{1}{E_1E_2}\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2} g^*(x_1',x_2')\,h(x_1'+s_1,\,x_2'+s_2)\,dx_1'\,dx_2', \tag{27}
\]

but the Fourier transforms of the functions, $g$ and $h$, do not exist in the usual way (as in Eq. (23)). On the other hand, the correlation function is square integrable and, therefore, possesses a Fourier transform, that is, the PSD of $g$ and $h$:

\[
\Phi_{gh}(f_1,f_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\phi_{gh}(s_1,s_2)\,e^{-i2\pi(f_1s_1+f_2s_2)}\,ds_1\,ds_2. \tag{28}
\]
Consider truncated functions defined on a finite domain:

\[
\bar{g}(x_1,x_2) =
\begin{cases}
g(x_1,x_2), & x_1 \in [-E_1/2,\,E_1/2] \text{ and } x_2 \in [-E_2/2,\,E_2/2] \\
0, & \text{otherwise}
\end{cases} \tag{29}
\]

and similarly for $\bar{h}$. Then $\bar{g}$ and $\bar{h}$ are square integrable on the plane and have Fourier transforms, $\bar{G}$ and $\bar{H}$, respectively; e.g.,

\[
\bar{G}(f_1,f_2) = \int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2} g(x_1,x_2)\,e^{-i2\pi(f_1x_1+f_2x_2)}\,dx_1\,dx_2. \tag{30}
\]

It is now straightforward to show that in this case, the Fourier transform of $\phi_{gh}$ is given by

\[
\Phi_{gh}(f_1,f_2) = \lim_{E_1\to\infty}\lim_{E_2\to\infty}\frac{1}{E_1E_2}\,\bar{G}^*(f_1,f_2)\,\bar{H}(f_1,f_2). \tag{31}
\]

In practice, this power spectral density can only be approximated due to the required limit operators. However, the essential relationship between (truncated) function spectra and the PSD is once more evident.
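The difference between finite energy and finite power, and the role of the limit in Eq. (27), can be seen with a cosine, which is not square integrable on an infinite domain yet has a well-defined correlation function. The sketch below is a one-dimensional analog of Eq. (27) evaluated for a finite window $E$; the reduction to one dimension, the midpoint rule, and the function name are assumptions of this example.

```python
import math

def finite_power_correlation(g, h, s, E, n=20000):
    """Midpoint-rule estimate of (1/E) * int_{-E/2}^{E/2} g(x') h(x' + s) dx'.

    This is a 1-D analog of Eq. (27) for a finite window E; the limit
    E -> infinity can only be approached in a numerical computation.
    """
    dx = E / n
    total = 0.0
    for i in range(n):
        x = -E / 2.0 + (i + 0.5) * dx
        total += g(x) * h(x + s)
    return total * dx / E

# Usage: g(x) = cos(2 pi f0 x) has finite power; its autocorrelation
# tends to 0.5 * cos(2 pi f0 s) as the window grows.
f0 = 0.1
g_cos = lambda x: math.cos(2.0 * math.pi * f0 * x)
estimate = finite_power_correlation(g_cos, g_cos, s=2.0, E=1003.0)
```

For this signal the window-dependent remainder is bounded by $1/(4\pi f_0 E)$, so the estimate converges only as fast as $1/E$, which illustrates why the PSD of Eq. (31) can only be approximated in practice.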
2.3 From the Sphere to the Plane

For each class of functions on the plane, we did not need to impose isotropy on the correlation function. However, isotropy proves useful in comparisons to the spherical correlation function at high spatial frequencies. In the case of the nonperiodic functions on the plane, a simple averaging over azimuth changes the Fourier transform of the correlation function to its Hankel transform:

\[
\Phi_{gh}(f) = 2\pi\int_0^{\infty}\phi_{gh}(s)\,s\,J_0(2\pi f s)\,ds, \qquad
\phi_{gh}(s) = 2\pi\int_0^{\infty}\Phi_{gh}(f)\,f\,J_0(2\pi f s)\,df, \tag{32}
\]

where $s = \sqrt{s_1^2+s_2^2}$ and $f = \sqrt{f_1^2+f_2^2}$, and $J_0$ is the zero-order Bessel function of the first kind. An approximate relation between the transforms of the planar and spherical isotropic correlation functions follows from the asymptotic relationship between Legendre polynomials and Bessel functions:

\[
\lim_{n\to\infty}P_n\!\left(\cos\frac{x}{n}\right) = J_0(x), \qquad \text{for } x > 0. \tag{33}
\]

If we let $x = 2\pi f s$, where $s = R\psi$, and $R$ is the radius of the sphere, then with $2\pi f \approx n/R$, we have $x/n \approx \psi$. Hence, for large $n$ (or small $\psi$),

\[
P_n(\cos\psi) \approx J_0(2\pi f s). \tag{34}
\]

Now, discretizing the second of Eqs. (32) (with $df \approx 1/(2\pi R)$) and substituting Eq. (33) yields (again, with $2\pi f \approx n/R$)

\[
\phi_{gh}(s) \approx \sum_{n=0}^{\infty}\frac{n}{2\pi R^2}\,\Phi_{gh}\!\left(\frac{n}{2\pi R}\right)P_n\!\left(\cos\frac{s}{R}\right). \tag{35}
\]

Comparing this with the spherical correlation function, Eq. (10), we see that

\[
(2n+1)\,(\Phi_{gh})_n \approx \frac{n}{2\pi R^2}\,\Phi_{gh}(f), \qquad \text{where } f \approx \frac{n}{2\pi R}. \tag{36}
\]

This relationship between planar and spherical PSDs holds only for isotropic correlation functions and for large $n$ or $f$.
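The asymptotic relation of Eq. (33) and the conversion of Eq. (36) are easy to probe numerically. The following sketch evaluates $P_n$ by its three-term recurrence and $J_0$ from the integral representation $J_0(x) = \frac{1}{\pi}\int_0^{\pi}\cos(x\sin t)\,dt$; the function names and the quadrature resolution are assumptions of this example rather than part of the chapter.

```python
import math

def legendre_p(n, x):
    """P_n(x) by the three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def bessel_j0(x, n_quad=2000):
    """J_0(x) = (1/pi) * int_0^pi cos(x sin t) dt, by the midpoint rule."""
    dt = math.pi / n_quad
    return sum(math.cos(x * math.sin((i + 0.5) * dt)) for i in range(n_quad)) * dt / math.pi

def planar_psd_from_degree_psd(n, phi_n, R):
    """Eq. (36) solved for the planar PSD; returns (f, Phi_gh(f)) for degree n >= 1."""
    f = n / (2.0 * math.pi * R)
    return f, (2 * n + 1) * phi_n * 2.0 * math.pi * R ** 2 / n

# Usage: the gap between P_n(cos(x/n)) and J_0(x) shrinks as n grows.
x_demo = 3.0
gaps = [abs(legendre_p(n, math.cos(x_demo / n)) - bessel_j0(x_demo))
        for n in (20, 200, 2000)]
```

The list `gaps` decreases roughly in proportion to $1/n$, consistent with the statement that Eq. (36) is reliable only for large $n$ or $f$.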
2.4 Properties of Correlation Functions and PSDs

Correlation functions satisfy certain properties that should then also hold for corresponding models and may aid in their development. The autocorrelation is a positive definite function, since its eigenvalues, defined by its spectrum, the PSD, are nonnegative; e.g., see Eq. (14) or from Eq. (31):

\[
\Phi_{gg}(f_1,f_2) = \lim_{E_1\to\infty}\lim_{E_2\to\infty}\frac{1}{E_1E_2}\left|\bar{G}(f_1,f_2)\right|^2 \ge 0. \tag{37}
\]

The values of the autocorrelation function for nonzero argument are not greater than at the origin:

\[
\phi_{gg}(\psi) \le \phi_{gg}(0), \quad \psi > 0; \qquad
\phi_{gg}(s_1,s_2) \le \phi_{gg}(0,0), \quad \sqrt{s_1^2+s_2^2} > 0; \tag{38}
\]

where equality would imply a perfectly correlated function (a constant). The inequalities (38) are proved using Schwarz’s inequality applied to Eqs. (6) and (24), respectively. Note that cross correlations may be larger in absolute value than their values at the origin (e.g., if they vanish there). Because of the imposed isotropy, spherical correlation functions are not defined for $\psi < 0$. On the other hand, planar correlation functions may be formulated for all quadrants; and, they satisfy:

\[
\phi_{gh}(s_1,s_2) = \phi_{hg}(-s_1,-s_2), \tag{39}
\]
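which follows readily from their definition, given by Eqs. (24) or (27). Clearly, the autocorrelation function of a real function is symmetric with respect to the origin, even if not isotropic. The correlation function of a derivative is the derivative of the correlation. For finite energy functions, we find immediately from Eq. (24) that

\[
\frac{\partial}{\partial s_k}\phi_{gh}(s_1,s_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g^*(x_1',x_2')\,\frac{\partial h}{\partial s_k}(x_1'+s_1,\,x_2'+s_2)\,dx_1'\,dx_2' = \phi_{g,\frac{\partial h}{\partial x_k}}(s_1,s_2), \qquad k = 1,2. \tag{40}
\]

From this and Eq. (39), we also have

\[
\frac{\partial}{\partial s_k}\phi_{gh}(s_1,s_2) = \frac{\partial}{\partial s_k}\phi_{hg}(-s_1,-s_2) = -\phi_{h,\frac{\partial g}{\partial x_k}}(-s_1,-s_2) = -\phi_{\frac{\partial g}{\partial x_k},h}(s_1,s_2), \qquad k = 1,2. \tag{41}
\]

The minus sign may be eliminated with the definition of $s_k$, Eqs. (19). We have $\partial/\partial s_k = -\partial/\partial x_k^{(g)} = \partial/\partial x_k^{(h)}$, where $x_k^{(g)}$ and $x_k^{(h)}$ refer, respectively, to the coordinates of $g$ and $h$. Therefore,

\[
\phi_{\frac{\partial g}{\partial x_k^{(g)}},h}(s_1,s_2) = \frac{\partial}{\partial x_k^{(g)}}\phi_{gh}(s_1,s_2), \qquad
\phi_{g,\frac{\partial h}{\partial x_k^{(h)}}}(s_1,s_2) = \frac{\partial}{\partial x_k^{(h)}}\phi_{gh}(s_1,s_2). \tag{42}
\]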
The same results may be shown for correlation functions of other types of functions on the plane (where the derivations in the case of the limit operators require a bit more care). Higher-order derivatives follow naturally, and indeed, we see that the correlation function of any linear operators on functions, $L^{(g)}g$ and $L^{(h)}h$, is the combination of these linear operators applied to the correlation function:

\[
\phi_{L^{(g)}g,\,L^{(h)}h} = L^{(g)}L^{(h)}\phi_{gh}. \tag{43}
\]

Independent variables are omitted since this property, known as the law of propagation of correlations, holds also for the spherical case. The PSDs of derivatives of functions on the plane follow directly from the inverse transform of the correlation function:

\[
\phi_{gh}(s_1,s_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\Phi_{gh}(f_1,f_2)\,e^{\,i2\pi(f_1s_1+f_2s_2)}\,df_1\,df_2. \tag{44}
\]

With Eqs. (42), we find

\[
\phi_{\frac{\partial g}{\partial x_k^{(g)}},\frac{\partial h}{\partial x_k^{(h)}}}(s_1,s_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\Phi_{gh}(f_1,f_2)\,\frac{\partial^2}{\partial x_k^{(g)}\,\partial x_k^{(h)}}\left(e^{\,i2\pi(f_1s_1+f_2s_2)}\right)df_1\,df_2. \tag{45}
\]

From this (and Eqs. (19)) one may readily infer the following general formula for the PSD of the derivatives of $g$ and $h$ of any order:

\[
\Phi_{g_{p_1p_2},\,h_{q_1q_2}}(f_1,f_2) = (-1)^{p_1+p_2}\,(i2\pi f_1)^{p_1+q_1}\,(i2\pi f_2)^{p_2+q_2}\,\Phi_{gh}(f_1,f_2), \tag{46}
\]

where $g_{p_1p_2} = \partial^{p_1+p_2}g\big/\partial x_1^{p_1}\partial x_2^{p_2}$ and $h_{q_1q_2} = \partial^{q_1+q_2}h\big/\partial x_1^{q_1}\partial x_2^{q_2}$. These expressions could be obtained also through Eqs. (21), (25), or (31), from the spectra of the function derivatives, which have a corresponding relationship to the spectra of the functions.

For functions on the sphere, the situation is hardly as simple. Indeed, this writer is unaware of formulas for the PSDs of horizontal derivatives, with the exception of an approximation for the average horizontal derivative,

\[
d_H g(\theta,\lambda) = \sqrt{\left(\frac{\partial g}{\partial\theta}\right)^2 + \left(\frac{1}{\sin\theta}\frac{\partial g}{\partial\lambda}\right)^2}. \tag{47}
\]
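Making use of an orthogonality proved by Jeffreys (1955):

\[
\frac{1}{4\pi}\iint_\Omega\left(\frac{\partial\bar{Y}_{n,m}}{\partial\theta}\frac{\partial\bar{Y}_{p,q}}{\partial\theta} + \frac{1}{\sin^2\theta}\frac{\partial\bar{Y}_{n,m}}{\partial\lambda}\frac{\partial\bar{Y}_{p,q}}{\partial\lambda}\right)d\Omega =
\begin{cases}
n(n+1), & n = p \text{ and } m = q \\
0, & n \ne p \text{ or } m \ne q
\end{cases} \tag{48}
\]

the autocorrelation function of $d_H g$ at $\psi = 0$ from Eq. (9) becomes

\[
\phi_{d_Hg,\,d_Hg}(0) = \frac{1}{4\pi}\iint_\Omega\left(\left(\frac{\partial g}{\partial\theta}\right)^2 + \left(\frac{1}{\sin\theta}\frac{\partial g}{\partial\lambda}\right)^2\right)\sin\theta\,d\theta\,d\lambda = \sum_{n=0}^{\infty}n(n+1)\sum_{m=-n}^{n}G_{n,m}^2. \tag{49}
\]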
It is tempting to identify the PSD by comparing this result to Eq. (10), but Eq. (49) proves this form of the correlation function only for $\psi = 0$. The error in this approximation of the PSD of $d_H g$ is an open question.

For functions, $\breve{g}(x_1,x_2;z)$, that satisfy Laplace’s equation, $\nabla^2\breve{g} = 0$, in the space exterior to the plane (i.e., they are harmonic for $z > 0$) and satisfy the boundary condition, $\breve{g}(x_1,x_2;0) = g(x_1,x_2)$, the Fourier spectrum on any plane with $z = z_0 > 0$ is related to the spectrum of $g$:

\[
\breve{G}(f_1,f_2;z_0) = G(f_1,f_2)\,e^{-2\pi f z_0}, \tag{50}
\]

where $f^2 = f_1^2+f_2^2$. Similarly, for functions, $\breve{g}(\theta,\lambda;r)$, harmonic outside the sphere ($r > R$) that satisfy $\breve{g}(\theta,\lambda;R) = g(\theta,\lambda)$, the Legendre spectrum on any sphere with $r = r_0 > R$ is related to the spectrum of $g$ according to

\[
\breve{G}_{n,m}(r_0) = \left(\frac{R}{r_0}\right)^{n+1}G_{n,m}. \tag{51}
\]

Therefore, the corresponding spectral densities are analogously related. In general, the cross PSD of $g$ at level, $z = z_g$, and $h$ at level, $z = z_h$, is given (e.g., substituting Eq. (50) for $\breve{g}$ and $\breve{h}$ into Eq. (31)) by

\[
\Phi_{\breve{g}\breve{h}}(f_1,f_2;z_g,z_h) = e^{-2\pi f(z_g+z_h)}\,\Phi_{gh}(f_1,f_2). \tag{52}
\]

Note that the altitudes add in the exponent. Similarly, for cross PSDs of functions on spheres, $r = r_g$ and $r = r_h$, we have

\[
\left(\Phi_{\breve{g}\breve{h}}\right)_n(r_g,r_h) = \left(\frac{R^2}{r_g r_h}\right)^{n+1}(\Phi_{gh})_n. \tag{53}
\]
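The attenuation factors of Eqs. (52) and (53) can be compared directly, using the correspondence $2\pi f \approx n/R$ from Eq. (36). The sketch below is only illustrative; the function names are hypothetical, and the sphere radius of 6371 km, the altitude of 10 km, and the degree 2000 are example values chosen here, not values taken from the chapter.

```python
import math

def planar_attenuation(f, z_g, z_h):
    """Factor of Eq. (52): exp(-2 pi f (z_g + z_h)), with f = sqrt(f1^2 + f2^2)."""
    return math.exp(-2.0 * math.pi * f * (z_g + z_h))

def spherical_attenuation(n, R, r_g, r_h):
    """Factor of Eq. (53): (R^2 / (r_g r_h))^(n + 1)."""
    return (R * R / (r_g * r_h)) ** (n + 1)

# Usage: both functions at the same altitude above the reference surface.
R_km, z_km, degree = 6371.0, 10.0, 2000        # illustrative values (assumed)
f_equiv = degree / (2.0 * math.pi * R_km)      # cycles per km, from 2 pi f = n / R
planar = planar_attenuation(f_equiv, z_km, z_km)
spherical = spherical_attenuation(degree, R_km, R_km + z_km, R_km + z_km)
```

For altitudes small compared with the radius and for large degree, the two factors nearly coincide, which is the vertical counterpart of the planar approximation made in Sect. 2.3.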
Although the altitude variables were treated strictly as parameters in these PSDs, one may consider briefly the corresponding correlation functions as “functions” of $z$ and $r$, respectively, for the sole purpose of deriving the correlation functions of vertical (radial) derivatives. Indeed, it is readily seen from the definitions, Eqs. (9) and (27), for the cross correlation of $\breve{g}(\theta,\lambda;r_g)$ and $\breve{h}(\theta,\lambda;r_h)$ that

\[
\begin{aligned}
\phi_{\frac{\partial\breve{g}}{\partial r_g},\frac{\partial\breve{h}}{\partial r_h}}(\psi;r_g,r_h) &= \frac{\partial^2}{\partial r_g\,\partial r_h}\phi_{\breve{g}\breve{h}}(\psi;r_g,r_h), \\
\phi_{\frac{\partial\breve{g}}{\partial z_g},\frac{\partial\breve{h}}{\partial z_h}}(s_1,s_2;z_g,z_h) &= \frac{\partial^2}{\partial z_g\,\partial z_h}\phi_{\breve{g}\breve{h}}(s_1,s_2;z_g,z_h),
\end{aligned} \tag{54}
\]

and the law of propagation of correlations, Eq. (43), holds also for this linear operator. It should be stressed, however, that the correlation function is essentially a function of variables on the plane or sphere; no integration of products of functions takes place in the third dimension. The cross PSDs of vertical derivatives, therefore, are easily derived by applying Eqs. (54) to the inverse transforms of the correlation functions, Eqs. (44) and (10), with extended expressions for the PSDs, Eqs. (52) and (53). The result is

\[
\Phi_{\breve{g}_{z_g^j}\breve{h}_{z_h^k}}(f_1,f_2;z_g,z_h) = \frac{\partial^{j+k}}{\partial z_g^j\,\partial z_h^k}\left(e^{-2\pi f(z_g+z_h)}\right)\Phi_{gh}(f_1,f_2) = (-2\pi f)^{j+k}\,e^{-2\pi f(z_g+z_h)}\,\Phi_{gh}(f_1,f_2), \tag{55}
\]

\[
\left(\Phi_{\breve{g}_{r_g^j}\breve{h}_{r_h^k}}\right)_n(r_g,r_h) = \frac{\partial^{j+k}}{\partial r_g^j\,\partial r_h^k}\left(\left(\frac{R^2}{r_g r_h}\right)^{n+1}\right)(\Phi_{gh})_n, \tag{56}
\]

where $\breve{g}_{z_g^j} = \partial^j\breve{g}/\partial z_g^j$, $\breve{h}_{z_h^k} = \partial^k\breve{h}/\partial z_h^k$, $\breve{g}_{r_g^j} = \partial^j\breve{g}/\partial r_g^j$, and $\breve{h}_{r_h^k} = \partial^k\breve{h}/\partial r_h^k$. Thus, the PSD for any combination of horizontal and vertical derivatives of $g$ and $h$ on horizontal planes in Cartesian space may be obtained by appending the appropriate factors to $\Phi_{gh}$. The same holds for any combination of vertical derivatives of $g$ and $h$ on concentric spheres.
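The rule of "appending the appropriate factors" for the planar case can be written as one multiplier that combines Eqs. (46), (52), and (55). The sketch below is a minimal illustration; the function name, its argument layout, and the parameter names `v_g` and `v_h` for the vertical derivative orders ($j$ and $k$ in Eq. (55)) are assumptions of this example.

```python
import math

def derivative_psd_factor(f1, f2, p=(0, 0), q=(0, 0), v_g=0, v_h=0, z_g=0.0, z_h=0.0):
    """Multiplier of Phi_gh(f1, f2) combining Eqs. (46), (52) and (55).

    p = (p1, p2): horizontal derivative orders applied to g
    q = (q1, q2): horizontal derivative orders applied to h
    v_g, v_h    : vertical derivative orders applied to g and h
    z_g, z_h    : heights of the two planes above z = 0
    """
    f = math.hypot(f1, f2)
    i2pif1 = complex(0.0, 2.0 * math.pi * f1)
    i2pif2 = complex(0.0, 2.0 * math.pi * f2)
    horizontal = ((-1) ** (p[0] + p[1])
                  * i2pif1 ** (p[0] + q[0])
                  * i2pif2 ** (p[1] + q[1]))
    vertical = (-2.0 * math.pi * f) ** (v_g + v_h) * math.exp(-2.0 * math.pi * f * (z_g + z_h))
    return horizontal * vertical

# Usage: auto-PSD factors of the three first derivatives of a harmonic function at z = 0.
fx = derivative_psd_factor(0.3, -0.4, p=(1, 0), q=(1, 0))
fy = derivative_psd_factor(0.3, -0.4, p=(0, 1), q=(0, 1))
fz = derivative_psd_factor(0.3, -0.4, v_g=1, v_h=1)
```

In this usage the two horizontal factors sum to the vertical one, $(2\pi f_1)^2 + (2\pi f_2)^2 = (2\pi f)^2$, so the vertical derivative of a harmonic function carries as much power at each frequency as both horizontal derivatives together.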
3
Stochastic Processes and Covariance Functions
A stochastic (or random) process is a collection, discrete or continuous, of random variables that are associated with a deterministic variable, in our case, a point on the plane or sphere. At each point, the process is random with an underlying probability distribution. A probability domain or sample space for each random
variable is implied but omitted in the following simplified notation; in fact, the distribution may be unknown. If each random variable takes on a specific value from the corresponding sample space, then the process is said to be realized, and this realization is a function of the point coordinates. Thus, we continue to use the notation, $g$, to represent a continuous stochastic process, with the understanding that for any fixed point, it is a random variable. We assume that the process is wide-sense stationary, meaning that all statistics up to second order are invariant with respect to the origin of the space variable. Then, the expectation of $g$ at all points is the same constant, and the covariance between $g$ at any two points depends only on the displacement (vector) of one point relative to the other. Typically, besides not knowing the probability distribution, we have access only to a single realization of the stochastic process, which makes the estimation of essential statistics such as the mean and covariance problematic, unless we invoke an additional powerful condition characteristic of many processes: ergodicity. For ergodic processes the statistics associated with the underlying probability law, based on the statistical (ensemble) expectation, are equivalent to the statistics derived from space-based averages of a single realization of the process. Stationarity is necessary but not sufficient for ergodicity. Also, we consider only wide-sense ergodicity. It can be shown that a stationary stochastic process whose underlying probability distribution is Gaussian is also ergodic (Moritz 1980; Jekeli 1991). We do not need this result, since our developments do not require the probability distribution; and, indeed, ergodic processes on the sphere cannot be Gaussian (Lauritzen 1973). For stochastic processes on the sphere, we define the space average as

$$M(\cdot) = \frac{1}{4\pi}\iint_{\sigma} (\cdot)\, d\sigma. \tag{57}$$
Let $g$ and $h$ be two such processes that are ergodic (hence, also stationary) and let their means, according to Eq. (57), be denoted $\mu_g$ and $\mu_h$. Then, the covariance function of $g$ and $h$ is given by

$$M\big((g - \mu_g)(h - \mu_h)\big) = \frac{1}{4\pi}\iint_{\sigma} \big(g(\theta,\lambda) - \mu_g\big)\big(h(\theta',\lambda') - \mu_h\big)\sin\theta\, d\theta\, d\lambda, \tag{58}$$

which, by the stationarity, depends only on the relative location of $g$ and $h$, that is, on $(\psi, \alpha)$, as given by Eqs. (7) and (8). We will assume without loss of generality that the means of the processes are zero (if not, redefine the process with its constant mean removed). Then, clearly, the covariance function is like the correlation function, Eq. (6), except the interpretation is for stochastic processes. We continue to use the same notation, however, and further redefine the covariance function to be isotropic by including an average in azimuth, $\alpha$, as in Eq. (9). The Legendre transform of the covariance function is also the (cross) PSD of $g$ and $h$ and is given by Eq. (13). The quantities,
$$\left(c_{gh}\right)_n = (2n + 1)\left(\Phi_{gh}\right)_n, \tag{59}$$
are known as degree variances, or variances per degree, on account of the total variance being, from Eq. (10),

$$\phi_{gh}(0) = \sum_{n=0}^{\infty} \left(c_{gh}\right)_n. \tag{60}$$
Ergodic processes on the plane are not square integrable since they are also stationary, and we define the average operator as

$$M(\cdot) = \lim_{E_1 \to \infty}\,\lim_{E_2 \to \infty} \frac{1}{E_1 E_2}\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2} (\cdot)\, dx_1\, dx_2. \tag{61}$$
The covariance function under the assumption of zero means for $g$ and $h$ is, again, the correlation function given by Eq. (27). However, the PSD requires some additional derivation since the truncated stochastic processes, $\bar g$ and $\bar h$, defined as in Eq. (29), are not stationary and, therefore, not ergodic. Since both $\bar g$ and $\bar h$ are random for each space variable, their Fourier transforms, $\bar G$ and $\bar H$, are also stochastic in nature. Consider first the ensemble expectation of the product of transforms, given by Eq. (30),

$$E\left(\bar G^{*}(f_1, f_2)\,\bar H(f_1, f_2)\right) = \int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2}\int_{-E_1/2}^{E_1/2}\int_{-E_2/2}^{E_2/2} E\left(g(x_1', x_2')\, h(x_1, x_2)\right) e^{\,i 2\pi \left(f_1 (x_1' - x_1) + f_2 (x_2' - x_2)\right)}\, dx_1\, dx_2\, dx_1'\, dx_2'. \tag{62}$$
The expectation inside the integrals is the same as the space average and is the covariance function of $g$ and $h$, as defined above, which because of their stationarity depends only on the coordinate differences, $s_1 = x_1 - x_1'$ and $s_2 = x_2 - x_2'$. It can be shown (Brown 1983, p. 86) that the integrations reduce to

$$E\left(\bar G^{*}(f_1, f_2)\,\bar H(f_1, f_2)\right) = E_1 E_2 \int_{-E_1}^{E_1}\int_{-E_2}^{E_2} \left(1 - \frac{|s_1|}{E_1}\right)\left(1 - \frac{|s_2|}{E_2}\right) \phi_{gh}(s_1, s_2)\, e^{-i 2\pi (f_1 s_1 + f_2 s_2)}\, ds_1\, ds_2. \tag{63}$$

In the limit, the integrals on the right side approach the Fourier transform of the covariance function, that is, the PSD, $\Phi_{gh}(f_1, f_2)$; and, we have

$$\Phi_{gh}(f_1, f_2) = \lim_{E_1 \to \infty}\,\lim_{E_2 \to \infty} E\left(\frac{1}{E_1 E_2}\,\bar G^{*}(f_1, f_2)\,\bar H(f_1, f_2)\right). \tag{64}$$
Again, in practice, this PSD can only be approximated due to the limit and expectation operators. We have shown that under appropriate assumptions (ergodicity), the covariance functions of stochastic processes on the sphere or plane are essentially identical to the corresponding correlation functions that were developed without a stochastic foundation. The only exception occurs in the relationship between Fourier spectra and the (Fourier) PSD (compare Eqs. (31) and (64)). Furthermore, from Eqs. (62) through (64) we have also shown that the covariance function of a stochastic process is the Fourier transform of the PSD, given by Eq. (64). This is a statement of the more general Wiener-Khintchine theorem (Priestley 1981). Although there are opposing schools of thought as to the stochastic nature of a field like Earth’s gravitational potential, we will argue (see below) that the stochastic interpretation is entirely legitimate. Moreover, the stochastic interpretation of the gravitational field is widely, if not uniformly, accepted in geodesy (e.g., Moritz 1978, 1980; Hofmann-Wellenhof and Moritz 2005), as is the covariance nomenclature. Moritz (1980) provided compelling justifications to view the gravitational field as a stochastic process on the plane or sphere. The use of covariance functions also emphasizes that the significance of correlations among functions lies in their variability irrespective of the means (which we will always assume to be zero). For these reasons, we will henceforth in our applications to the Earth’s gravitational field refer only to covariance functions, use the same notation, and use all the properties and relationships derived for correlation functions.
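The practical approximation of Eq. (64), in which the limit and expectation operators are dropped, can be sketched with a discrete Fourier transform. The following is a minimal sketch under stated assumptions: the data are on a regular grid, the Fourier integral is approximated by the FFT scaled by the grid spacings, and the function name, grid size, spacings, and synthetic white-noise data are hypothetical choices made only for this illustration.

```python
import numpy as np

def cross_periodogram(g, h, dx1, dx2):
    """Single-realization estimate conj(G) * H / (E1 * E2), cf. Eq. (64),
    with the limit and expectation operators neglected."""
    n1, n2 = g.shape
    E1, E2 = n1 * dx1, n2 * dx2                 # extents of the data area (m)
    G = np.fft.fft2(g) * dx1 * dx2              # discrete approximation of the Fourier integral
    H = np.fft.fft2(h) * dx1 * dx2
    f1 = np.fft.fftfreq(n1, d=dx1)              # frequencies in cy/m
    f2 = np.fft.fftfreq(n2, d=dx2)
    return f1, f2, np.conj(G) * H / (E1 * E2)

# Synthetic zero-mean data on an assumed 64 x 48 grid
rng = np.random.default_rng(0)
g = rng.standard_normal((64, 48))
g -= g.mean()
dx1, dx2 = 1000.0, 1500.0                       # assumed grid spacings (m)

f1, f2, psd = cross_periodogram(g, g, dx1, dx2)

# Integrating the PSD over frequency recovers the variance (Parseval's relation)
df1, df2 = 1.0 / (g.shape[0] * dx1), 1.0 / (g.shape[1] * dx2)
variance_from_psd = psd.real.sum() * df1 * df2
```

The last two lines check the discrete counterpart of the statement that the covariance at zero lag is the integral of the PSD. For real data one would typically also average over azimuth or over several sub-areas to reduce the scatter of the single-realization estimate.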
3.1 Earth’s Anomalous Gravitational Field
The masses of the Earth, including all material below its surface, as well as the atmosphere, generate the gravitational field, which in vacuum is harmonic and satisfies Laplace’s differential equation. For present purposes we neglect the atmosphere (and usually its effect is removed from data) so that for points, $\mathbf{x}$, above the surface, the gravitational potential, $V$, fulfills Laplace’s equation,

$$\nabla^2 V(\mathbf{x}) = 0. \tag{65}$$
Global solutions to this equation depend on boundary values of $V$ or its derivatives on some mathematically convenient bounding surface. Typically this surface is a sphere with radius, $a$, and the solution is then expressed in spherical polar coordinates, $(r, \theta, \lambda)$, as an infinite series of solid spherical harmonic functions, $\bar Y_{n,m}(\theta,\lambda)/r^{n+1}$, for points outside the sphere:

$$V(r, \theta, \lambda) = \frac{GM}{a}\sum_{n=0}^{\infty}\sum_{m=-n}^{n} \left(\frac{a}{r}\right)^{n+1} C_{n,m}\, \bar Y_{n,m}(\theta, \lambda), \tag{66}$$
where $GM$ is Newton’s gravitational constant times the total mass of the Earth (this scale factor is determined from satellite tracking data); and $C_{n,m}$ is a coefficient of degree, $n$, and order, $m$, determined from $V$ and/or its radial derivatives on the bounding sphere (obtained, e.g., from measurements of gravity). Modern solutions also make use of satellite tracking data and in situ measurements of the field by satellite-borne instruments to determine these coefficients. In a coordinate system fixed to the Earth, we define the gravity potential as the sum of the gravitational potential, $V$, due to mass attraction and the (nongravitational) potential, $\varphi$, whose gradient equals the centrifugal acceleration due to Earth’s rotation:

$$W(\mathbf{x}) = V(\mathbf{x}) + \varphi(\mathbf{x}). \tag{67}$$
If we define a normal (i.e., reference) gravity potential, $U = V^{\mathrm{ellip}} + \varphi$, associated with a corotating material ellipsoid, such that on this ellipsoid, $U|_{\mathbf{x} \in \mathrm{ellip}} = U_0$, then the difference, called the disturbing potential,

$$T(\mathbf{x}) = W(\mathbf{x}) - U(\mathbf{x}), \tag{68}$$

is also a harmonic function in free space and may be represented as a spherical harmonic series:

$$T(r, \theta, \lambda) = \frac{GM}{a}\sum_{n=2}^{\infty}\sum_{m=-n}^{n} \left(\frac{a}{r}\right)^{n+1} \delta C_{n,m}\, \bar Y_{n,m}(\theta, \lambda), \tag{69}$$
where the $\delta C_{n,m}$ are coefficients associated with the difference, $V - V^{\mathrm{ellip}}$. The total ellipsoid mass is set equal to the Earth’s total mass, so that $\delta C_{0,0} = 0$; and, the coordinate origin is placed at the center of mass of the Earth (and ellipsoid), implying that the first moments of the mass distribution all vanish: $\delta C_{1,m} = 0$ for $m = -1, 0, 1$. The set of spherical harmonic coefficients, $t_{n,m} = (GM/a)\,\delta C_{n,m}$, represents the Legendre spectrum of $T$. Practically, it is known only up to some finite degree, $n_{\max}$; for example, the model, EGM2008, has $n_{\max} = 2190$ (Pavlis et al. 2012a,b). The harmonic coefficients of this model refer to a sphere of boundary values whose radius is equated with the semimajor axis of the best-fitting Earth ellipsoid. The uniform convergence of the infinite series, Eq. (69), is guaranteed for $r \ge a$, but effects of divergence are evident in the truncated series, EGM2008, when $r < a$, and due care should be exercised in evaluations on or near the Earth’s surface. The disturbing potential may also be defined with respect to higher-degree reference potentials, although in this case one may need to account for significant
errors in the coefficients, $C^{\mathrm{ref}}_{n,m}$. In particular, the local interpretation of the field as a stationary random process usually requires removal of a higher-degree reference field. In the Cartesian formulation, the disturbing potential in free space ($z \ge 0$) is expressed in terms of its Fourier spectrum, $\tau(f_1, f_2)$, on the plane, $z = 0$, as

$$T(x_1, x_2; z) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \tau(f_1, f_2)\, e^{-2\pi f z}\, e^{\,i 2\pi (f_1 x_1 + f_2 x_2)}\, df_1\, df_2, \tag{70}$$

where $f = \sqrt{f_1^2 + f_2^2}$.

3.2 The Disturbing Potential as a Stochastic Process
In addition to the well-grounded reasoning already cited, an alternative justification of the stochastic nature of $T$ is argued here based on the fractal (self-similar) characteristics of Earth’s topography (see also Turcotte 1987). This will provide also a basis for modeling the covariance function of $T$ and its derivatives. The fractal geometry of the Earth’s topography (among fractals in general) was investigated and popularized by Mandelbrot in a number of papers and reviewed in his book (Mandelbrot 1983) using fundamentally the concept of Brownian motion, which is the process of a random walk. Thus, without going into the details of fractals, we have at least a connection between topography and randomness. Next, we may appeal to the well-known (in physical geodesy and geophysics) high degree of linear correlation between gravity anomalies and topographic height. This correlation stems from the theory of isostasy that explains the existence of topography on the Earth whose state generally tends toward one of hydrostatic equilibrium. Although this correlation is not perfect (and is almost nonexistent in regions of tectonic subsidence and rifting), empirical evidence suggests that in many areas the correlation is quite faithful to this theory, even with a number of seemingly crude approximations. The gravity anomaly, $\Delta g$, and its isostatic reduction are defined in Hofmann-Wellenhof and Moritz (2005). At a point, $P$, the isostatically reduced gravity anomaly is given by

$$\Delta g_I(P) = \Delta g(P) - C(P) + A(P), \tag{71}$$
where C .P / is the gravitational effect of all masses above the geoid and A .P / is the effect of their isostatic compensation. Several models for isostatic compensation have been developed by geophysicists (Watts 2001). Airy’s model treats the compensation locally and assumes that there is no regional flexural rigidity in the lithosphere. With this model, the topography presumably floats in the denser mantle, and equilibrium is established according to the buoyancy principle (Fig. 1):
Fig. 1 Isostatic compensation of topography according to the Airy model (the figure shows the topographic surface, geoid, crust, and mantle, with the condensed density layers $\rho h$ and $-\rho(1 - \rho_w/\rho)\,b$ on the geoid and $-(\rho_m - \rho)\,h'$ and $(\rho_m - \rho)\,b'$ at the depth of compensation, $D$)

$$\rho\, h = (\rho_m - \rho)\, h' = \Delta\rho\, h', \tag{72}$$
where $\Delta\rho = \rho_m - \rho$, $h'$ is the (positive) depth of the “root” with respect to the depth of compensation, $D$ (typically, $D = 30$ km), and the crust density, $\rho$, and the mantle density, $\rho_m$, are assumed constant. Similarly, in ocean areas, the lower density of water relative to the crust allows the mantle to intrude into the crust, where equilibrium is established if $(\rho - \rho_w)\, b = \Delta\rho\, b'$, and $b$ is the (positive) bathymetric distance to the ocean floor, $b'$ is the height of the “anti-root” of mantle material, and $\rho_w$ is the density of seawater. Removing the mass that generates $C(P)$ makes the space above the geoid homogeneous (empty). According to Airy’s model, the attraction, $A(P)$, is due, in effect, to adding that mass to the root so as to make the mantle below $D$ homogeneous. If the isostatic compensation is perfect according to this model, then the isostatic anomaly would vanish because of this created homogeneity; and, indeed, isostatic anomalies tend to be small. Therefore, the free-air gravity anomaly according to Eq. (71) with $\Delta g_I(P) \approx 0$ is generated by the attraction due to the topographic masses above the geoid, with density, $\rho$, and by the attraction due to the lack of mass below the depth of compensation, with density, $-\Delta\rho$:

$$\Delta g(P) \approx C(P) - A(P). \tag{73}$$
Expressions for the terms on the right side can be found using various approximations. One such approximation (Forsberg 1985) “condenses” the topography onto the geoid (Helmert condensation, Helmert 1884; Martinec 1998), and the gravitational effect is then due to a two-dimensional mass layer with density, $\kappa_H = \rho h$. Likewise, the gravitational effect of the ocean bottom topography can be modeled by forming a layer on the geoid that represents the ocean’s deficiency in density relative to the crust. The density of this layer is negative: $\kappa_B = -(\rho - \rho_w)\, b = -\rho\,(1 - \rho_w/\rho)\, b$. The gravitational potential, $v$, at a point, $P$, due to a layer condensed from topography (or bathymetry) is given by

$$v(P) = G \rho R^2 \iint_{\sigma} \frac{\bar h(Q)}{\ell}\, d\sigma_Q, \qquad \bar h(Q) = \begin{cases} h(Q), & Q \in \text{land}, \\ -\left(1 - \dfrac{\rho_w}{\rho}\right) b(Q), & Q \in \text{ocean}, \end{cases} \tag{74}$$
where $\ell$ is the distance between $P$ and the integration point. Similarly, the potential of the mass added below the depth of compensation can be approximated by that of another layer at level $-D$ with density, $\kappa'_H = -\Delta\rho\, h'$, representing a condensation of material that is deficient in density with respect to the mantle and extends a depth, $h'$, below $D$ (see Fig. 1). For ocean areas, the anti-root is condensed onto the depth of compensation with density, $\kappa'_B = \Delta\rho\, b'$. Equation (74) for a fixed height of the point, $P$, is a convolution of $\bar h$ and the inverse distance. Further making the planar approximation (for local, or high-frequency, applications), this distance is $\ell \approx \sqrt{(x_1 - x_1')^2 + (x_2 - x_2')^2 + z^2}$, with $(x_1', x_2')$ being the planar coordinates of point $Q$. Applying the convolution theorem, the Fourier transform of the potential at the level of $z > 0$ is given by

$$V(f_1, f_2; z) = \frac{G\rho}{f}\, \bar H(f_1, f_2)\, e^{-2\pi f z}. \tag{75}$$
Including the layer at the compensation depth, $D$, below the geoid with density, $\kappa'_H = -\rho h$ (in view of Eq. (72); and similarly $\kappa'_B = \rho\,(1 - \rho_w/\rho)\, b$ for ocean areas), the Fourier transform of the total potential due to both the topography and its isostatic compensation is approximately

$$V(f_1, f_2; z) = \frac{G\rho}{f}\, \bar H(f_1, f_2)\left(e^{-2\pi f z} - e^{-2\pi f (D + z)}\right). \tag{76}$$
Since the gravity anomaly is approximately the radial derivative of this potential, multiplying by $2\pi f$ yields its Fourier transform:

$$\Delta G(f_1, f_2; z) = 2\pi G \rho\, \bar H(f_1, f_2)\left(e^{-2\pi f z} - e^{-2\pi f (D + z)}\right). \tag{77}$$
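To make the frequency dependence of Eq. (77) concrete, the sketch below evaluates the ratio $\Delta G / \bar H$ (often called an admittance) for the Airy model with condensed layers. The compensation depth of 30 km is the typical value quoted above; the crust density of 2670 kg/m³, the value of $G$, the function name, and the sample frequencies are assumptions made only for this illustration.

```python
import numpy as np

G_NEWTON = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2 (assumed value)

def isostatic_admittance(f, z=0.0, D=30e3, rho=2670.0):
    """Ratio Delta G / H-bar from Eq. (77), in (m/s^2) per metre of topography.

    f   : radial frequency in cy/m
    z   : height of the evaluation plane above the geoid (m)
    D   : depth of compensation (m)
    rho : crust density (kg/m^3), an assumed standard value
    """
    f = np.asarray(f, dtype=float)
    return 2.0 * np.pi * G_NEWTON * rho * (
        np.exp(-2.0 * np.pi * f * z) - np.exp(-2.0 * np.pi * f * (D + z))
    )

f = np.array([1e-7, 1e-5, 1e-3])                          # cy/m, illustrative
admittance_mgal_per_m = isostatic_admittance(f) * 1e5     # 1 mGal = 1e-5 m/s^2

# Short-wavelength limit on z = 0: the isostatic term vanishes and only
# the factor 2*pi*G*rho remains
bouguer_factor_mgal_per_m = 2.0 * np.pi * G_NEWTON * 2670.0 * 1e5
```

At long wavelengths the two exponentials nearly cancel, so compensated topography produces little gravity anomaly, whereas at short wavelengths the ratio tends to the constant $2\pi G\rho$, which is the linear relationship between heights and gravity anomalies discussed next.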
Neglecting the upward continuation term, as well as the isostatic term (which is justified only for very short wavelengths), confirms the empirical linear relationship between the heights and the gravity anomaly. Figure 2 compares the PSDs of the topography and the gravitational field both globally and locally. The global PSDs were computed from spherical harmonic expansions EGM2008 for the gravitational potential and DTM2006 for the topography (Pavlis et al. 2012a) according to Eq. (14) but converted to spatial frequency using Eq. (36). In addition, both were scaled to the PSD of the geoid undulation, which is related to the disturbing potential as $N = T/\gamma$, where $\gamma = 9.8$ m/s² is an average value of gravity. The topographic height is related to the potential through Eq. (76). Both expansions are complete up to degree, $n_{\max} = 2160$ ($f_{\max} = 5.4 \times 10^{-5}$ cy/m). DTM2006 is an expansion for both the topographic height above mean sea level and the depth of the ocean and, therefore, does not exactly correspond to $\bar h$, as defined in Eq. (74). This contributes to an overestimation of the power at lower frequencies. The obviously lower power of EGM2008 at higher frequencies results from the higher altitude, on average, to which its spectrum refers, that is, the sphere of radius, $a$.

Fig. 2 Comparison of gravitational and topographic PSDs, scaled to the geoid undulation PSD. Global models are EGM2008 and DTM2006, and local PSDs were derived from gravity and topographic data in the indicated areas 1 and 2 (axes: frequency [cy/m]; geoid undulation PSD [m²/(cy/m)²])

The other PSDs in Fig. 2 correspond to the indicated regions and were derived according to Eq. (31) from local terrain elevation and gravity anomaly data provided by the US National Geodetic Survey. The data grids in latitude and longitude have resolution of 30 arcsec for the topography and 1 arcmin for the gravity. With a planar approximation for these areas, the Fourier transforms were calculated using their discrete versions. The PSDs were computed by neglecting the limit (and expectation) operators and were averaged in azimuth. Dividing the gravity PSD by $(2\pi\gamma f)^2$ then yields the geoid undulation PSD; and, as before, Eq. (76) relates the topography PSD to the potential PSD that scales to the geoid undulation PSD by $1/\gamma^2$. In these regions, the gravity and topography PSDs match well at the higher frequencies at least, attesting to their high linear correlation. Moreover, these PSDs follow a power law in accord with the presumed fractal nature of the
topography. These examples then offer a validation of the stochastic interpretation of the gravitational field and also provide a starting point to model its covariance function.
4 Covariance Models
Since the true covariance function of a process, such as the Earth’s gravity field, rarely is known and local functions can vary from region to region (thus we allow global non-stationarity in local applications), it must usually be modeled from data. We consider here primarily the modeling of the autocovariance function, that is, when $g = h$. Models for the cross covariance function could follow similar procedures, but usually $g$ and $h$ are linearly related and the method of propagation of covariances (see Sect. 2.4) should be followed to derive $\phi_{gh}$ from $\phi_{gg}$. Modeling the covariance (or correlation) function of a process on the plane or sphere can proceed with different assumptions and motivations. We distinguish in the first place between empirical and analytic methods and in the second between global and local models. Global models describe the correlation of functions on the sphere; whereas, local models usually are restricted to applications where a planar approximation suffices. Empirical models are derived directly from data distributed on the presumably spherical or planar surface of the Earth. Rarely, if ever, are global empirical covariance models determined for the sphere according to the principal definition, Eq. (9). Instead, such models are given directly by the degree variances, Eq. (59). For local modeling with the planar approximation, the empirical model comes from a discretization of Eq. (27), where we neglect also the limit processes,

$$\hat\phi_{gg}(s_1, s_2) = \frac{1}{M}\sum_{x_1'}\sum_{x_2'} g(x_1', x_2')\, g(x_1' + s_1, x_2' + s_2), \tag{78}$$
and where $M$ is the total number of summed products for each $(s_1, s_2)$. A corresponding approximation for an isotropic model additionally averages the products of $g$ that are separated by a given distance, $s = \sqrt{(x_1 - x_1')^2 + (x_2 - x_2')^2}$, over all directions. Typically, the maximum $s$ considered is much smaller than (perhaps 10–20 % of) the physical dimension of the data area, since the approximation of Eq. (27) by Eq. (78) worsens as the number of possible summands within a finite area decreases. Also, $M$ may be fixed at the largest possible value (the number of products for $s = 0$) in order to avoid a numerical nonpositive definiteness of the covariance function (Marple 1987, p. 148). However, this creates a biased estimate of the covariance, particularly for the larger distances. Another form of empirical covariance model is its Fourier (or Legendre) transform, derived directly from the data, as was illustrated for the gravity anomaly and topography in Fig. 2. The inverse transform then yields immediately the covariance function. The disadvantage of the empirical covariance model, Eq. (78), is the limited ability (or inability) to derive consistent covariances of functionally
related quantities, such as the derivatives of g through the law of propagation of covariances, Eq. (43). This could only be accomplished by working with its transform (see Eqs. (46)), but generally, an analytic model eventually simplifies the computational aspects of determining auto- and cross covariances.
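The empirical estimator of Eq. (78) can be sketched directly for gridded data. This is a minimal illustration rather than a production routine: it handles non-negative lags only, works in units of grid steps, and the function name, the `biased` switch (fixing $M$ at the number of products for zero lag, as discussed above), and the synthetic data are assumptions of this sketch.

```python
import numpy as np

def empirical_covariance(g, max_lag1, max_lag2, biased=False):
    """Empirical autocovariance of gridded, zero-mean data, cf. Eq. (78).

    Lags are in grid steps and non-negative. With biased=True, M is fixed
    at the number of products for zero lag, which biases the estimate
    toward zero at larger lags.
    """
    g = np.asarray(g, dtype=float)
    n1, n2 = g.shape
    cov = np.empty((max_lag1 + 1, max_lag2 + 1))
    for k1 in range(max_lag1 + 1):
        for k2 in range(max_lag2 + 1):
            # products g(x1', x2') * g(x1' + s1, x2' + s2) over all valid points
            products = g[: n1 - k1, : n2 - k2] * g[k1:, k2:]
            M = n1 * n2 if biased else products.size
            cov[k1, k2] = products.sum() / M
    return cov

# Synthetic zero-mean field on an assumed 40 x 30 grid
rng = np.random.default_rng(1)
g = rng.standard_normal((40, 30))
g -= g.mean()

# Maximum lag kept to roughly 10 % of the grid dimension
cov = empirical_covariance(g, max_lag1=4, max_lag2=4)
```

An isotropic estimate would additionally bin the products by the distance $s$ and average within each bin; that step is omitted here to keep the sketch short.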
4.1 Analytic Models
Analytic covariance models are constructed from relatively simple mathematical functions that typically are fit to empirical data (either in the spatial or frequency domains) and have the benefit of easy computation and additional properties useful to a particular application (such as straightforward propagation of covariances). An analytic model should satisfy all the basic properties of the covariance function (Sect. 2.4), although depending on the application some of these may be omitted (such as the harmonic extension into space for the Gaussian model, $\phi(s) = \sigma^2 e^{-\beta s^2}$). An analytic model may be developed for the PSD or the covariance function. Ideally (but not always), one leads to a mathematically rigorous formulation of the other. Perhaps the most famous global analytic model is known as Kaula’s rule, proposed by W. Kaula (1966, p. 98) in order to develop the idea of a stochastic interpretation of the spherical spectrum of the disturbing potential:

$$\left(\Phi_{TT}\right)_n = \left(\frac{GM}{R}\right)^2 \frac{10^{-10}}{n^4}\ \mathrm{m^4/s^4}, \tag{79}$$
where R is the mean Earth radius. It roughly described the attenuation of the harmonic coefficients known at that time from satellite tracking observations, but it is reasonably faithful to the spectral attenuation of the field even at high degrees (see Fig. 4). Note that Kaula’s rule is a power-law model for the PSD of the geopotential, agreeing with our arguments above for such a characteristic based on the fractal nature of the topography. The geodetic literature of the latter part of the last century is replete with different types of global and local covariance and PSD models for the Earth’s residual gravity field (e.g., Jordan 1972; Tscherning and Rapp 1974; Heller and Jordan 1979; Forsberg 1987; Milbert 1991; among others); but, it is not the purpose here to review them. Rather the present aim is to promote a single elemental prototype model that (1) satisfies all the properties of a covariance model for a stochastic process, (2) has harmonic extension into free space, (3) has both spherical and planar analytic expressions for all derivatives of the potential in both the space and frequency domains, and (4) is sufficiently adaptable to any strength and attenuation of the gravitational field. This is the reciprocal distance model introduced by Moritz (1976, 1980), so called because the covariance function resembles an inverse-distance weighting function. It was also independently studied by Jordan et al. (1981).
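Returning briefly to Kaula’s rule, Eq. (79), the sketch below evaluates the PSD per degree, forms the degree variances of Eq. (59), and sums them as in Eq. (60) from degree 2 to 2190. The numerical values of $GM$ and $R$ and the function names are assumptions of this illustration; the text does not specify them.

```python
import numpy as np

GM = 3.986004418e14    # m^3/s^2, assumed value of the geocentric gravitational constant
R = 6371000.0          # m, assumed mean Earth radius

def kaula_psd(n):
    """Kaula's rule, Eq. (79): PSD of the disturbing potential per degree (m^4/s^4)."""
    n = np.asarray(n, dtype=float)
    return (GM / R) ** 2 * 1e-10 / n ** 4

def degree_variances(n):
    """Degree variances, Eq. (59): (2n + 1) times the PSD."""
    n = np.asarray(n, dtype=float)
    return (2.0 * n + 1.0) * kaula_psd(n)

n = np.arange(2, 2191)                      # degrees 2 ... 2190
c_n = degree_variances(n)
total_variance = c_n.sum()                  # Eq. (60), truncated at degree 2190

# Geoid undulation standard deviation implied by the rule, using N = T / gamma
geoid_std = np.sqrt(total_variance) / 9.8
```

Because the degree variances decay like $2/n^3$, the truncated sum is already very close to its limit; nearly all of the variance resides in the lowest degrees, which is the power-law behavior noted above.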
4.2 The Reciprocal Distance Model
Consider the disturbing potential, $T$, as a stochastic process on each of two possibly different horizontal parallel planes or concentric spheres. Given a realization of $T$ on one plane (or sphere), its realization on the other plane (or sphere) is well defined by a solution to Laplace’s equation, provided both surfaces are on or outside the Earth’s surface (approximated as a plane or sphere). The reciprocal distance covariance model between $T$ on one plane and $T$ on the other is given by

$$\phi_{TT}(s; z_1, z_2) = \frac{\sigma^2}{\sqrt{\alpha^2 s^2 + \left(1 + \alpha (z_1 + z_2)\right)^2}}, \tag{80}$$

where with Eq. (19), $s = \sqrt{s_1^2 + s_2^2}$; $z_1$, $z_2$ are heights of the two planes; and $\sigma^2$, $\alpha$ are parameters. The Fourier transform, or the PSD, is given by

$$\Phi_{TT}(f; z_1, z_2) = \frac{\sigma^2}{\alpha f}\, e^{-2\pi f (z_1 + z_2 + 1/\alpha)}, \qquad f \ne 0. \tag{81}$$
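For spheres with radii, $r_1 \ge R$ and $r_2 \ge R$, the spherical covariance model is

$$\phi_{TT}(\psi; r_1, r_2) = \frac{\sigma^2 (1 - \rho_0)\,\rho/\rho_0}{\sqrt{1 + \rho^2 - 2\rho\cos\psi}}, \tag{82}$$

where $\psi$ is given by Eq. (7), $\rho_0 = (R_0/R)^2$ and $\sigma^2$ are parameters, and $\rho = R_0^2/(r_1 r_2)$. The Legendre transform, or PSD, is given by

$$\left(\Phi_{TT}\right)_n = \frac{\sigma^2 (1 - \rho_0)}{(2n + 1)\,\rho_0}\,\rho^{\,n+1}. \tag{83}$$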
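The consistency of Eqs. (82) and (83) can be checked numerically: summing the Legendre series with coefficients $(2n+1)(\Phi_{TT})_n$ should reproduce the closed-form covariance. The sketch below does this with NumPy’s Legendre-series evaluator. The parameter values ($R$, the ratio $R_0/R = 0.98$, unit variance), the truncation at 3000 terms, and the function names are assumptions chosen only so that the truncated series converges quickly.

```python
import numpy as np
from numpy.polynomial import legendre

def rd_covariance_sphere(psi, r1, r2, sigma2, R0, R):
    """Spherical reciprocal distance covariance, Eq. (82)."""
    rho0 = (R0 / R) ** 2
    rho = R0 ** 2 / (r1 * r2)
    return sigma2 * (1.0 - rho0) / rho0 * rho / np.sqrt(1.0 + rho ** 2 - 2.0 * rho * np.cos(psi))

def rd_psd_sphere(n, r1, r2, sigma2, R0, R):
    """Legendre transform (PSD) of the model, Eq. (83)."""
    rho0 = (R0 / R) ** 2
    rho = R0 ** 2 / (r1 * r2)
    return sigma2 * (1.0 - rho0) / ((2.0 * n + 1.0) * rho0) * rho ** (n + 1)

R = 6371e3             # assumed mean Earth radius (m)
R0 = 0.98 * R          # assumed model parameter, chosen for fast series convergence
sigma2 = 1.0           # unit variance for the check

psi = np.radians([0.0, 1.0, 10.0, 90.0])
n = np.arange(0, 3000)

# Degree variances (2n + 1) * Phi_n are the Legendre-series coefficients
coeffs = (2.0 * n + 1.0) * rd_psd_sphere(n, R, R, sigma2, R0, R)
series = legendre.legval(np.cos(psi), coeffs)
closed_form = rd_covariance_sphere(psi, R, R, sigma2, R0, R)
```

On the sphere $r_1 = r_2 = R$ the degree variances also sum to $\sigma^2$, in agreement with Eq. (60).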
In all cases, the heights (or radii) refer to fixed surfaces that define the spatial domain of the corresponding stochastic process. Since we allow $z_1 \ne z_2$ or $r_1 \ne r_2$, the models, Eqs. (80) and (82), technically are cross covariances between two different (but related) processes; and Eqs. (81) and (83) are cross PSDs. The equivalence of the models, Eqs. (80) and (82), as the spherical surface approaches a plane, is established by identifying $z_1 = r_1 - R$, $z_2 = r_2 - R$ and $s^2 \approx 2R^2 (1 - \cos\psi)$, from which it can be shown (Moritz 1980, p. 183) that

$$\frac{1}{\sqrt{1 + \rho^2 - 2\rho\cos\psi}} \approx \frac{R}{\sqrt{s^2 + \left(1/\alpha + z_1 + z_2\right)^2}}, \tag{84}$$

where $1/\alpha = 2(R - R_0)$ and terms of order $(R - R_0)/R$ are neglected. The variance parameter, $\sigma^2$, is the same in both versions of the model. It is noted that
this model, besides having analytic forms in both the space and frequency domains, is isotropic, depending only on the horizontal distance. Moreover, it correctly incorporates the harmonic extension for the potential at different levels. It is also positive definite since the transform is positive for all frequencies. The analytic forms permit exact propagation of covariances as elaborated in Sect. 2.4. Since many applications involving the stochastic interpretation of the field nowadays are more local than global, only the (easier) planar propagation is given here (Appendix A) up to second-order derivatives. The covariance propagation of derivatives for similar spherical models was developed by Tscherning (1976). Note that the covariances of the horizontal derivatives are not isotropic. One further useful feature of the reciprocal distance model is that it possesses analytic forms for hybrid PSD/covariance functions, those that give the PSD in one dimension and the covariance in the other:

$$S_{TT}(f_1; s_2; z_1, z_2) = \int_{-\infty}^{\infty} \Phi_{TT}(f_1, f_2; z_1, z_2)\, e^{\,i 2\pi f_2 s_2}\, df_2 = \int_{-\infty}^{\infty} \phi_{TT}(s_1, s_2; z_1, z_2)\, e^{-i 2\pi f_1 s_1}\, ds_1. \tag{85}$$
The first integral transforms the PSD to the covariance in the second variable, while the second equivalent integral transforms the covariance function to the frequency domain in the first variable. When a process is given only on a single profile (e.g., along a data track), one may wish to model its along-track PSD, which is the hybrid PSD/covariance function with s2 D 0. Appendix B gives the corresponding analytic forms for the (planar) reciprocal distance model.
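Returning to the equivalence of the planar and spherical models, Eqs. (80), (82), and (84), the following sketch compares the two covariance functions numerically for local distances. The parameter values ($R$, $\alpha = 10^{-4}$ 1/m, the heights, and the distances) and the function names are assumptions for illustration; $R_0$ is derived from $1/\alpha = 2(R - R_0)$, and $\psi$ is chosen so that $s^2 = 2R^2(1 - \cos\psi)$ holds exactly.

```python
import numpy as np

def rd_covariance_plane(s, z1, z2, sigma2, alpha):
    """Planar reciprocal distance covariance, Eq. (80)."""
    return sigma2 / np.sqrt(alpha ** 2 * s ** 2 + (1.0 + alpha * (z1 + z2)) ** 2)

def rd_covariance_sphere(psi, r1, r2, sigma2, R0, R):
    """Spherical reciprocal distance covariance, Eq. (82)."""
    rho0 = (R0 / R) ** 2
    rho = R0 ** 2 / (r1 * r2)
    return sigma2 * (1.0 - rho0) / rho0 * rho / np.sqrt(1.0 + rho ** 2 - 2.0 * rho * np.cos(psi))

R = 6371e3                        # assumed mean Earth radius (m)
alpha = 1e-4                      # assumed attenuation parameter (1/m)
R0 = R - 1.0 / (2.0 * alpha)      # from 1/alpha = 2 (R - R0)
sigma2 = 1.0
z1, z2 = 0.0, 500.0               # assumed heights of the two surfaces (m)

s = np.array([0.0, 1e3, 1e4, 5e4])            # horizontal distances (m)
psi = 2.0 * np.arcsin(s / (2.0 * R))          # gives s^2 = 2 R^2 (1 - cos(psi))

planar = rd_covariance_plane(s, z1, z2, sigma2, alpha)
spherical = rd_covariance_sphere(psi, R + z1, R + z2, sigma2, R0, R)
relative_difference = np.abs(planar - spherical) / planar
```

For these values the relative differences are of the order $(R - R_0)/R$ or smaller, consistent with the terms neglected in Eq. (84).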
4.3 Parameter Determination
The reciprocal distance PSD model, Eq. (81), clearly does not have the form of a power law, but it nevertheless serves in modeling the PSD of the gravitational field when a number of these models are combined linearly:

$$\Phi_{TT}(f; z_1, z_2) = \sum_{j=1}^{J} \frac{\sigma_j^2}{\alpha_j f}\, e^{-2\pi f (z_1 + z_2 + 1/\alpha_j)}. \tag{86}$$

The parameters, $\alpha_j$, $\sigma_j^2$, are chosen appropriately to yield a power-law attenuation of the PSD. This selection is based on the empirical PSD of data that in the case of the gravitational field are usually gravity anomalies, $\Delta g \approx -\partial T/\partial r$, on the Earth’s surface ($z_1 = z_2 = 0$).
Fig. 3 Fitting a reciprocal distance model component to a power-law PSD (axes: frequency, $f$; PSD; curves: power-law model and reciprocal distance model)
Multiplying the PSD for the disturbing potential by $(2\pi f)^2$, we consider reciprocal distance components of the PSD of the gravity anomaly (from Eq. (86)) in the form

$$\Phi(f) = A f e^{-Bf}, \tag{87}$$

where $A = (2\pi)^2 \sigma^2/\alpha$ and $B = 2\pi/\alpha$ are constants to be determined such that the model is tangent to the empirical PSD. Here we assume that the latter is a power-law model (see Fig. 3),

$$p(f) = C f^{-\beta}, \tag{88}$$
where the constants, $C$ and $\beta$, are given. In terms of natural logarithms, the reciprocal distance PSD component is $\ln(\Phi(f)) = \ln(A) + \omega - B e^{\omega}$, where $\omega = \ln f$; and its slope is $d(\ln(\Phi(f)))/d\omega = 1 - B e^{\omega}$. The slope should be $-\beta$, which yields $B e^{\omega} = 1 + \beta$. Also, the reciprocal distance and power-law models should intersect, say, at $f = \bar f$, which requires $\ln(C) - \beta\omega = \ln(A) + \omega - B e^{\omega}$. Solving for $A$ and $B$, we find:

$$A = C \left(\frac{e}{\bar f}\right)^{1+\beta}, \qquad B = \frac{1+\beta}{\bar f}. \tag{89}$$
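The tangency conditions leading to Eq. (89) can be sketched in a few lines. The function names and the numerical values of $C$, $\beta$, and $\bar f$ below are assumptions chosen only for illustration; the conversion to $\sigma^2$ and $\alpha$ uses the definitions $A = (2\pi)^2\sigma^2/\alpha$ and $B = 2\pi/\alpha$ given with Eq. (87).

```python
import numpy as np

def fit_rd_component(C, beta, f_bar):
    """Constants A, B of Eq. (89): the component A*f*exp(-B*f) is tangent
    to the power law C*f**(-beta) at f = f_bar."""
    A = C * (np.e / f_bar) ** (1.0 + beta)
    B = (1.0 + beta) / f_bar
    return A, B

def rd_parameters(A, B):
    """Convert (A, B) to the reciprocal distance parameters (sigma^2, alpha)."""
    alpha = 2.0 * np.pi / B
    sigma2 = A * alpha / (2.0 * np.pi) ** 2
    return sigma2, alpha

# Illustrative (assumed) power law and tangent frequency
C, beta, f_bar = 1e-3, 3.0, 1e-4

A, B = fit_rd_component(C, beta, f_bar)
sigma2, alpha = rd_parameters(A, B)

f = np.logspace(-6, -2, 401)
model = A * f * np.exp(-B * f)           # Eq. (87)
power_law = C * f ** (-beta)             # Eq. (88)

model_at_f_bar = A * f_bar * np.exp(-B * f_bar)
log_slope_at_f_bar = 1.0 - B * f_bar     # d ln(Phi) / d ln(f) at f_bar
```

Since $\ln\Phi$ is a concave function of $\omega = \ln f$ while the power law is a straight line in these coordinates, each component touches the power law at $\bar f$ and lies below it elsewhere, which is why several components at different $\bar f_j$ are needed to follow the power law over a wide band.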
With a judicious selection of discrete frequencies, $\bar f_j$, a number of PSD components may be combined to approximate the power law over a specified domain. Due to the overlap of the component summands in Eq. (86), an appropriate scale factor may still be required for a proper fit. This modeling technique was applied to the two regional PSDs shown in Fig. 2. Additional low-frequency components were added to model the field at frequencies, $f < 10^{-5}$ cy/m. Table 1 lists the reciprocal distance parameters for each of the regions in Fig. 2; and Fig. 4 shows various empirical and corresponding modeled PSDs for the gravity anomaly. The parameters may be used to define consistently the cross PSDs and cross covariances of any of the derivatives of the disturbing potential in the respective regions.

Table 1 Reciprocal distance PSD parameters

      Area 1                            Area 2
 j    σ² (m⁴/s⁴)      α (1/m)           σ² (m⁴/s⁴)      α (1/m)
 1    1 × 10^5        3 × 10^-7         1 × 10^5        3 × 10^-7
 2    3,300           9.69 × 10^-7      3,300           9.69 × 10^-7
 3    650             4.76 × 10^-6      640             7.56 × 10^-6
 4    162             8.94 × 10^-6      951             9.73 × 10^-6
 5    10.2            2.00 × 10^-5      79.9            2.18 × 10^-5
 6    0.641           4.48 × 10^-5      6.71            4.88 × 10^-5
 7    4.02 × 10^-2    1.00 × 10^-4      0.564           1.09 × 10^-4
 8    2.53 × 10^-3    2.25 × 10^-4      4.74 × 10^-2    2.44 × 10^-4
 9    1.59 × 10^-4    5.03 × 10^-4      3.98 × 10^-3    5.47 × 10^-4
10    9.97 × 10^-6    1.13 × 10^-3      3.34 × 10^-4    1.23 × 10^-3
11    6.26 × 10^-7    2.52 × 10^-3      2.81 × 10^-5    2.74 × 10^-3
12    3.93 × 10^-8    5.64 × 10^-3      2.36 × 10^-6    6.14 × 10^-3
13    2.47 × 10^-9    1.26 × 10^-2      1.98 × 10^-7    1.37 × 10^-2
14    1.55 × 10^-10   2.83 × 10^-2      1.66 × 10^-8    3.08 × 10^-2
15    9.74 × 10^-12   6.33 × 10^-2      1.40 × 10^-9    6.89 × 10^-2

Fig. 4 Comparison of empirical and reciprocal distance (RD) model PSDs for the gravity anomaly in the two areas shown in Fig. 2 (axes: frequency [cy/m]; gravity anomaly PSD [mGal²/(cy/m)²]; curves: EGM2008, Kaula’s rule, and the empirical PSD and RD model for areas 1 and 2)
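As an illustration of how the Table 1 parameters might be used, the sketch below evaluates the combined gravity anomaly PSD (Eq. (86) multiplied by $(2\pi f)^2$, i.e., a sum of components of the form of Eq. (87)) for Area 1 and integrates it over the frequency plane to obtain a variance. Several points are assumptions of this sketch and not statements from the text: the Area 1 values are read from the table with negative exponents for the small entries, the isotropic integral is taken as $2\pi\int_0^\infty f\,\Phi(f)\,df$, and the closed form $2\sum_j \sigma_j^2\alpha_j^2$ follows from integrating Eq. (87) analytically and is used here only as a cross-check.

```python
import numpy as np

# Area 1 parameters as read from Table 1 (exponent signs are an interpretation)
sigma2 = np.array([1e5, 3300.0, 650.0, 162.0, 10.2, 0.641, 4.02e-2, 2.53e-3,
                   1.59e-4, 9.97e-6, 6.26e-7, 3.93e-8, 2.47e-9, 1.55e-10, 9.74e-12])  # m^4/s^4
alpha = np.array([3e-7, 9.69e-7, 4.76e-6, 8.94e-6, 2.00e-5, 4.48e-5, 1.00e-4, 2.25e-4,
                  5.03e-4, 1.13e-3, 2.52e-3, 5.64e-3, 1.26e-2, 2.83e-2, 6.33e-2])     # 1/m

def gravity_anomaly_psd(f, sigma2, alpha, z_sum=0.0):
    """Sum of components A_j * f * exp(-2*pi*f*(z_sum + 1/alpha_j)),
    i.e., Eq. (86) multiplied by (2*pi*f)**2; z_sum stands for z1 + z2."""
    f = np.asarray(f, dtype=float)[:, None]
    A = (2.0 * np.pi) ** 2 * sigma2 / alpha
    return np.sum(A * f * np.exp(-2.0 * np.pi * f * (z_sum + 1.0 / alpha)), axis=1)

f = np.logspace(-9, 0, 18001)                 # cy/m
psd = gravity_anomaly_psd(f, sigma2, alpha)   # (m/s^2)^2 / (cy/m)^2

# Isotropic integration over the frequency plane with a manual trapezoid rule
integrand = 2.0 * np.pi * f * psd
variance_numeric = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(f))

# Analytic integral of each component, derived for this sketch
variance_closed = np.sum(2.0 * sigma2 * alpha ** 2)

variance_mgal2 = variance_closed * 1e10       # 1 mGal^2 = 1e-10 (m/s^2)^2
```

The agreement between the numerical and closed-form variances only confirms the internal consistency of the sketch; it says nothing about how well the tabulated parameters fit the empirical PSDs, which is what Fig. 4 addresses.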
5 Summary and Future Directions
The preceding sections have developed the theory for correlation functions on the sphere and plane for deterministic functions and stochastic processes using standard spherical harmonic (Legendre) and Fourier basis functions. Assuming an ergodic
C. Jekeli
(hence stationary) stochastic process, its covariance function (with zero means) is essentially the correlation function defined for a particular realization of the process. These concepts were applied to the disturbing gravitational potential. Based on the fractal nature of Earth’s topography and its relationship to the gravitational field, the power spectral density (PSD) of the disturbing potential was shown to behave like a power law at higher spatial frequencies. This provides the basis for the definition and determination of an analytic model for the covariance function that offers mutually consistent cross covariances (and PSDs) among its various derivatives, including vertical derivatives. Once established for a particular region, such models have numerous applications from least-squares collocation (and the related kriging) to more mundane procedures such as interpolation and filtering. Furthermore, they are ideally suited to generating a synthetic field for use in simulation studies of potential theory, as well as Monte Carlo statistical analyses in estimation theory. The details of such applications are beyond the present scope but are readily formulated. The developed reciprocal distance model is quite versatile when combined linearly using appropriate parameters and is able to represent the PSD of the disturbing potential (and any of its derivatives) with different spectral amplitudes depending on the region in question. Two examples are provided in which a combination of 15 such reciprocal distance components is fitted accurately to the empirical gravity anomaly PSD in either smooth or rough regional fields. 
Although limited to some extent by being isotropic (for the vertical derivatives, only), the resulting models are completely analytic in both spatial and frequency domains; and thus, the computed cross covariances and cross PSDs of all derivatives of the disturbing potential are mutually consistent, which is particularly important in estimation and error analysis studies.
The global representation of the gravitational field in terms of spherical harmonics has many applications that are, in fact, becoming more and more local as the computational capability increases and models are expanded to higher maximum degree, $n_{\max}$. The most recent global model, EGM2008, includes coefficients complete up to $n_{\max} = 2{,}190$, and the historical trend has been to develop models with increasingly high global resolution as more and more globally distributed data become available. However, such high-degree models also face the potential problem of divergence near the Earth's surface (below the sphere of convergence) and must always submit to the justifiable criticism that they are inefficient local representations of the field. In fact, the two PSDs presented here are based on the classical local approximation, the planar approximation, with traditional Fourier (sinusoidal) basis functions. Besides being limited by the planar approximation, the Fourier basis functions, in the strictest sense, still have global support for nonperiodic functions. However, there exists a vast recent development of local-support representations of the gravitational field using splines on the sphere, including tensor-product splines (e.g., Schumaker and Traas 1991), radial basis functions (Schreiner 1997; Freeden et al. 1998), and splines on sphere-like surfaces (Alfeld et al. 1996); see also Jekeli (2005). Representations of the gravitational field using these splines, particularly
the radial basis functions and the Bernstein-Bézier polynomials used by Alfeld et al., depend strictly on local data, and the models can easily be modified by the addition or modification of individual data. Thus, they also do not depend on regularly distributed data, as do the spherical harmonic and Fourier series representations. On the other hand, these global support models based on regular data distributions lead to particularly straightforward and mutually consistent transformations among the PSDs of all derivatives of the gravitational potential, which greatly facilitates the modeling of their correlations. For irregularly scattered data, the splines lend themselves to a multiresolution representation of the field on the sphere, analogous to wavelets in Cartesian space. This has been developed for the tensor-product splines by Lyche and Schumaker (2000) and for the radial basis splines by Freeden et al. (1998); see also Fengler et al. (2004). For the Bernstein-Bézier polynomial splines, a multiresolution model is also possible. How these newer constructive approximations can be adapted to correlation modeling with mutually consistent transformations (propagation of covariances and analogous PSDs) among all derivatives of the gravitational potential represents a topic for future development and analysis.
Appendix A
The planar reciprocal distance model, Eq. (80), for the covariance function of the disturbing potential is repeated here for convenience with certain abbreviations:
$$\phi_{T,T}(s_1, s_2; z_1, z_2) = \frac{\sigma^2}{M^{1/2}}, \tag{90}$$
where
$$M = \beta^2 + \alpha^2 s^2, \quad \beta = 1 + \alpha(z_1 + z_2), \quad s^2 = s_1^2 + s_2^2, \quad s_1 = x_1 - x_1', \quad s_2 = x_2 - x_2'. \tag{91}$$
The primed coordinates refer to the first subscripted function in the covariance, and the unprimed coordinates refer to the second function. The altitude levels for these functions are $z_1$ and $z_2$, respectively. Derivatives of the disturbing potential with respect to the coordinates are denoted $\partial T/\partial x_1 = T_{x_1}$, $\partial^2 T/(\partial x_1 \partial z) = T_{x_1 z}$, etc. The following expressions for the cross covariances are derived by repeatedly using Eqs. (42) and (54). The arguments for the resulting function are omitted but are the same as in Eq. (90):
$$\phi_{T_{x_1},T} = \frac{\sigma^2\alpha^2 s_1}{M^{3/2}} = -\phi_{T,T_{x_1}} \tag{92}$$
$$\phi_{T_{x_2},T} = \frac{\sigma^2\alpha^2 s_2}{M^{3/2}} = -\phi_{T,T_{x_2}} \tag{93}$$
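$$\phi_{T_z,T} = -\frac{\sigma^2\alpha\beta}{M^{3/2}} = \phi_{T,T_z} \tag{94}$$
$$\phi_{T_{x_1},T_{x_1}} = \frac{\sigma^2\alpha^2\left(M - 3\alpha^2 s_1^2\right)}{M^{5/2}} \tag{95}$$
$$\phi_{T_{x_1},T_{x_2}} = -\frac{3\sigma^2\alpha^4 s_1 s_2}{M^{5/2}} = \phi_{T_{x_2},T_{x_1}} \tag{96}$$
$$\phi_{T_{x_1},T_z} = -\frac{3\sigma^2\alpha^3\beta s_1}{M^{5/2}} = -\phi_{T_z,T_{x_1}} \tag{97}$$
$$\phi_{T_{x_2},T_{x_2}} = \frac{\sigma^2\alpha^2\left(M - 3\alpha^2 s_2^2\right)}{M^{5/2}} \tag{98}$$
$$\phi_{T_{x_2},T_z} = -\frac{3\sigma^2\alpha^3\beta s_2}{M^{5/2}} = -\phi_{T_z,T_{x_2}} \tag{99}$$
$$\phi_{T_z,T_z} = \frac{\sigma^2\alpha^2\left(2M - 3\alpha^2 s^2\right)}{M^{5/2}} = \phi_{T_{x_1},T_{x_1}} + \phi_{T_{x_2},T_{x_2}} \tag{100}$$
$$\phi_{T,T_{x_1x_1}} = -\phi_{T_{x_1},T_{x_1}} = \phi_{T_{x_1x_1},T} \tag{101}$$
$$\phi_{T,T_{x_1x_2}} = -\phi_{T_{x_1},T_{x_2}} = \phi_{T_{x_1x_2},T} \tag{102}$$
$$\phi_{T,T_{x_1z}} = -\phi_{T_{x_1},T_z} = -\phi_{T_{x_1z},T} \tag{103}$$
$$\phi_{T,T_{x_2x_2}} = -\phi_{T_{x_2},T_{x_2}} = \phi_{T_{x_2x_2},T} \tag{104}$$
$$\phi_{T,T_{x_2z}} = -\phi_{T_{x_2},T_z} = -\phi_{T_{x_2z},T} \tag{105}$$
$$\phi_{T,T_{zz}} = \phi_{T_z,T_z} = \phi_{T_{zz},T} \tag{106}$$
$$\phi_{T_{x_1},T_{x_1x_1}} = \frac{3\sigma^2\alpha^4 s_1\left(-3M + 5\alpha^2 s_1^2\right)}{M^{7/2}} = -\phi_{T_{x_1x_1},T_{x_1}} \tag{107}$$
$$\phi_{T_{x_1},T_{x_1x_2}} = \frac{3\sigma^2\alpha^4 s_2\left(-M + 5\alpha^2 s_1^2\right)}{M^{7/2}} = -\phi_{T_{x_1x_2},T_{x_1}} = \phi_{T_{x_2},T_{x_1x_1}} = -\phi_{T_{x_1x_1},T_{x_2}} \tag{108}$$
$$\phi_{T_{x_1},T_{x_1z}} = \frac{3\sigma^2\alpha^3\beta\left(-M + 5\alpha^2 s_1^2\right)}{M^{7/2}} = \phi_{T_{x_1z},T_{x_1}} = -\phi_{T_z,T_{x_1x_1}} = -\phi_{T_{x_1x_1},T_z} \tag{109}$$
$$\phi_{T_{x_1},T_{x_2x_2}} = \frac{3\sigma^2\alpha^4 s_1\left(-M + 5\alpha^2 s_2^2\right)}{M^{7/2}} = -\phi_{T_{x_2x_2},T_{x_1}} = \phi_{T_{x_2},T_{x_1x_2}} = -\phi_{T_{x_1x_2},T_{x_2}} \tag{110}$$
$$\phi_{T_{x_1},T_{x_2z}} = \frac{15\sigma^2\alpha^5\beta s_1 s_2}{M^{7/2}} = \phi_{T_{x_2z},T_{x_1}} = -\phi_{T_z,T_{x_1x_2}} = \phi_{T_{x_2},T_{x_1z}} = -\phi_{T_{x_1x_2},T_z} = \phi_{T_{x_1z},T_{x_2}} \tag{111}$$
$$\phi_{T_{x_1},T_{zz}} = \frac{3\sigma^2\alpha^4 s_1\left(4\beta^2 - \alpha^2 s^2\right)}{M^{7/2}} = -\phi_{T_{zz},T_{x_1}} = -\phi_{T_z,T_{x_1z}} = \phi_{T_{x_1z},T_z} \tag{112}$$
$$\phi_{T_{x_2},T_{x_2x_2}} = \frac{3\sigma^2\alpha^4 s_2\left(-3M + 5\alpha^2 s_2^2\right)}{M^{7/2}} = -\phi_{T_{x_2x_2},T_{x_2}} \tag{113}$$
$$\phi_{T_{x_2},T_{x_2z}} = \frac{3\sigma^2\alpha^3\beta\left(-M + 5\alpha^2 s_2^2\right)}{M^{7/2}} = \phi_{T_{x_2z},T_{x_2}} = -\phi_{T_z,T_{x_2x_2}} = -\phi_{T_{x_2x_2},T_z} \tag{114}$$
$$\phi_{T_{x_2},T_{zz}} = \frac{3\sigma^2\alpha^4 s_2\left(4\beta^2 - \alpha^2 s^2\right)}{M^{7/2}} = -\phi_{T_{zz},T_{x_2}} = -\phi_{T_z,T_{x_2z}} = \phi_{T_{x_2z},T_z} \tag{115}$$
$$\phi_{T_z,T_{zz}} = \frac{3\sigma^2\alpha^3\beta\left(-2M + 5\alpha^2 s^2\right)}{M^{7/2}} = \phi_{T_{zz},T_z} \tag{116}$$
$$\phi_{T_{x_1x_1},T_{x_1x_1}} = \frac{3\sigma^2\alpha^4\left(3M^2 - 30M\alpha^2 s_1^2 + 35\alpha^4 s_1^4\right)}{M^{9/2}} \tag{117}$$
$$\phi_{T_{x_1x_1},T_{x_1x_2}} = \frac{15\sigma^2\alpha^6 s_1 s_2\left(-3M + 7\alpha^2 s_1^2\right)}{M^{9/2}} = \phi_{T_{x_1x_2},T_{x_1x_1}} \tag{118}$$
$$\phi_{T_{x_1x_1},T_{x_1z}} = \frac{15\sigma^2\alpha^5\beta s_1\left(-3M + 7\alpha^2 s_1^2\right)}{M^{9/2}} = -\phi_{T_{x_1z},T_{x_1x_1}} \tag{119}$$
$$\phi_{T_{x_1x_1},T_{x_2x_2}} = \frac{3\sigma^2\alpha^4\left(M^2 - 5M\alpha^2 s^2 + 35\alpha^4 s_1^2 s_2^2\right)}{M^{9/2}} = \phi_{T_{x_2x_2},T_{x_1x_1}} = \phi_{T_{x_1x_2},T_{x_1x_2}} \tag{120}$$
$$\phi_{T_{x_1x_1},T_{x_2z}} = \frac{15\sigma^2\alpha^5\beta s_2\left(-M + 7\alpha^2 s_1^2\right)}{M^{9/2}} = -\phi_{T_{x_2z},T_{x_1x_1}} = -\phi_{T_{x_1z},T_{x_1x_2}} = \phi_{T_{x_1x_2},T_{x_1z}} \tag{121}$$
$$\phi_{T_{x_1x_1},T_{zz}} = \frac{3\sigma^2\alpha^4\left(-4M^2 + 5M\alpha^2 s_2^2 + 35\beta^2\alpha^2 s_1^2\right)}{M^{9/2}} = \phi_{T_{zz},T_{x_1x_1}} = -\phi_{T_{x_1z},T_{x_1z}} \tag{122}$$
$$\phi_{T_{x_1x_2},T_{x_2x_2}} = \frac{15\sigma^2\alpha^6 s_1 s_2\left(-3M + 7\alpha^2 s_2^2\right)}{M^{9/2}} = \phi_{T_{x_2x_2},T_{x_1x_2}} \tag{123}$$
$$\phi_{T_{x_1x_2},T_{x_2z}} = \frac{15\sigma^2\alpha^5\beta s_1\left(-M + 7\alpha^2 s_2^2\right)}{M^{9/2}} = -\phi_{T_{x_2z},T_{x_1x_2}} = -\phi_{T_{x_1z},T_{x_2x_2}} = \phi_{T_{x_2x_2},T_{x_1z}} \tag{124}$$
$$\phi_{T_{x_1z},T_{zz}} = \frac{15\sigma^2\alpha^5\beta s_1\left(3M - 7\beta^2\right)}{M^{9/2}} = -\phi_{T_{zz},T_{x_1z}} \tag{125}$$
$$\phi_{T_{x_2x_2},T_{x_2x_2}} = \frac{3\sigma^2\alpha^4\left(3M^2 - 30M\alpha^2 s_2^2 + 35\alpha^4 s_2^4\right)}{M^{9/2}} \tag{126}$$
$$\phi_{T_{x_2x_2},T_{x_2z}} = \frac{15\sigma^2\alpha^5\beta s_2\left(-3M + 7\alpha^2 s_2^2\right)}{M^{9/2}} = -\phi_{T_{x_2z},T_{x_2x_2}} \tag{127}$$
$$\phi_{T_{x_2x_2},T_{zz}} = \frac{3\sigma^2\alpha^4\left(-4M^2 + 5M\alpha^2 s_1^2 + 35\beta^2\alpha^2 s_2^2\right)}{M^{9/2}} = \phi_{T_{zz},T_{x_2x_2}} = -\phi_{T_{x_2z},T_{x_2z}} \tag{128}$$
$$\phi_{T_{x_2z},T_{zz}} = \frac{15\sigma^2\alpha^5\beta s_2\left(3M - 7\beta^2\right)}{M^{9/2}} = -\phi_{T_{zz},T_{x_2z}} \tag{129}$$
$$\phi_{T_{zz},T_{zz}} = \frac{3\sigma^2\alpha^4\left(8\beta^4 - 24\beta^2\alpha^2 s^2 + 3\alpha^4 s^4\right)}{M^{9/2}} = \phi_{T_{x_1z},T_{x_1z}} + \phi_{T_{x_2z},T_{x_2z}} \tag{130}$$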
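The closed forms of this appendix can be spot-checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the reciprocal distance covariance of Eq. (90) and compares the closed forms for the $(T_{x_1}, T_{x_1})$ and $(T_z, T_z)$ covariances, Eqs. (95) and (100), with central finite differences. The function names, the step size `h`, and the sample parameter values are illustrative assumptions.

```python
import math

def rd_terms(s1, s2, z1, z2, alpha):
    """Return (beta, M) of Eq. (91) for lags s1, s2 and altitudes z1, z2."""
    beta = 1.0 + alpha * (z1 + z2)
    return beta, beta**2 + alpha**2 * (s1**2 + s2**2)

def phi_TT(s1, s2, z1, z2, sigma2, alpha):
    """Reciprocal distance covariance of the disturbing potential, Eq. (90)."""
    _, m = rd_terms(s1, s2, z1, z2, alpha)
    return sigma2 / math.sqrt(m)

def phi_Tx1Tx1(s1, s2, z1, z2, sigma2, alpha):
    """Closed form of Eq. (95)."""
    _, m = rd_terms(s1, s2, z1, z2, alpha)
    return sigma2 * alpha**2 * (m - 3.0 * alpha**2 * s1**2) / m**2.5

def phi_TzTz(s1, s2, z1, z2, sigma2, alpha):
    """Closed form of Eq. (100)."""
    _, m = rd_terms(s1, s2, z1, z2, alpha)
    s_sq = s1**2 + s2**2
    return sigma2 * alpha**2 * (2.0 * m - 3.0 * alpha**2 * s_sq) / m**2.5

def fd_Tx1Tx1(s1, s2, z1, z2, sigma2, alpha, h):
    """-d^2(phi)/d(s1)^2 by central differences (one primed, one unprimed x1-derivative)."""
    f = lambda a: phi_TT(a, s2, z1, z2, sigma2, alpha)
    return -(f(s1 + h) - 2.0 * f(s1) + f(s1 - h)) / h**2

def fd_TzTz(s1, s2, z1, z2, sigma2, alpha, h):
    """d^2(phi)/(d z1 d z2) by a mixed central difference."""
    f = lambda a, b: phi_TT(s1, s2, a, b, sigma2, alpha)
    return (f(z1 + h, z2 + h) - f(z1 + h, z2 - h)
            - f(z1 - h, z2 + h) + f(z1 - h, z2 - h)) / (4.0 * h * h)

# Illustrative single-component parameters (assumed values, in m^4/s^4, 1/m, and m).
ARGS = dict(s1=2.0e5, s2=1.0e5, z1=0.0, z2=1.0e3, sigma2=3300.0, alpha=9.69e-7)
```

With these sample values, `fd_TzTz(**ARGS, h=1.0e3)` and `phi_TzTz(**ARGS)` should agree to within the finite-difference truncation error; the same kind of check can be applied to any other entry of the list.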
Appendix B
The hybrid PSD/covariance function of the disturbing potential, given by Eq. (85), can be shown to be
$$S_{T,T}(f_1; s_2; z_1, z_2) = \frac{2\sigma^2}{\alpha} K_0(2\pi f_1 d), \tag{131}$$
where $K_0$ is the modified Bessel function of the second kind and zero order, and
$$d = \sqrt{\frac{\beta^2}{\alpha^2} + s_2^2}. \tag{132}$$
It is the along-track PSD if $s_2 = 0$. In the following hybrid PSD/covariances of the derivatives of $T$, the modified Bessel function of the second kind and first order, $K_1$, also appears. Both Bessel functions always have the argument $2\pi f_1 d$, and the arguments of the hybrid PSD/covariances are the same as in Eq. (131).
$$S_{T,T_{x_1}} = i 2\pi f_1 S_{T,T} = -S_{T_{x_1},T} \tag{133}$$
$$S_{T,T_{x_2}} = -\frac{2\sigma^2 (2\pi f_1)\, s_2}{\alpha d} K_1 = -S_{T_{x_2},T} \tag{134}$$
$$S_{T,T_z} = -\frac{2\sigma^2 (2\pi f_1)\, \beta}{\alpha^2 d} K_1 = S_{T_z,T} \tag{135}$$
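$$S_{T_{x_1},T_{x_1}} = (2\pi f_1)^2 S_{T,T} \tag{136}$$
$$S_{T_{x_1},T_{x_2}} = i 2\pi f_1 S_{T_{x_2},T} = S_{T_{x_2},T_{x_1}} \tag{137}$$
$$S_{T_{x_1},T_z} = -i 2\pi f_1 S_{T,T_z} = -S_{T_z,T_{x_1}} \tag{138}$$
$$S_{T_{x_2},T_{x_2}} = \frac{2\sigma^2 (2\pi f_1)}{\alpha d}\left[\left(1 - \frac{2 s_2^2}{d^2}\right) K_1 - 2\pi f_1 \frac{s_2^2}{d} K_0\right] \tag{139}$$
$$S_{T_{x_2},T_z} = -\frac{2\sigma^2 (2\pi f_1)\, \beta s_2}{\alpha^2 d^3}\left(2 K_1 + 2\pi f_1 d\, K_0\right) = -S_{T_z,T_{x_2}} \tag{140}$$
$$S_{T_z,T_z} = S_{T_{x_1},T_{x_1}} + S_{T_{x_2},T_{x_2}} \tag{141}$$
$$S_{T,T_{x_1x_1}} = -S_{T_{x_1},T_{x_1}} = S_{T_{x_1x_1},T} \tag{142}$$
$$S_{T,T_{x_1x_2}} = -S_{T_{x_1},T_{x_2}} = S_{T_{x_1x_2},T} \tag{143}$$
$$S_{T,T_{x_1z}} = -S_{T_{x_1},T_z} = -S_{T_{x_1z},T} \tag{144}$$
$$S_{T,T_{x_2x_2}} = -S_{T_{x_2},T_{x_2}} = S_{T_{x_2x_2},T} \tag{145}$$
$$S_{T,T_{x_2z}} = -S_{T_{x_2},T_z} = -S_{T_{x_2z},T} \tag{146}$$
$$S_{T,T_{zz}} = S_{T_z,T_z} = S_{T_{zz},T} \tag{147}$$
$$S_{T_{x_1},T_{x_1x_1}} = i (2\pi f_1)^3 S_{T,T} = -S_{T_{x_1x_1},T_{x_1}} \tag{148}$$
$$S_{T_{x_1},T_{x_1x_2}} = (2\pi f_1)^2 S_{T,T_{x_2}} = -S_{T_{x_1x_2},T_{x_1}} = S_{T_{x_2},T_{x_1x_1}} = -S_{T_{x_1x_1},T_{x_2}} \tag{149}$$
$$S_{T_{x_1},T_{x_1z}} = (2\pi f_1)^2 S_{T,T_z} = S_{T_{x_1z},T_{x_1}} = -S_{T_z,T_{x_1x_1}} = -S_{T_{x_1x_1},T_z} \tag{150}$$
$$S_{T_{x_1},T_{x_2x_2}} = i 2\pi f_1 S_{T_{x_2},T_{x_2}} = -S_{T_{x_2x_2},T_{x_1}} = S_{T_{x_2},T_{x_1x_2}} = -S_{T_{x_1x_2},T_{x_2}} \tag{151}$$
$$S_{T_{x_1},T_{x_2z}} = i 2\pi f_1 S_{T_{x_2},T_z} = S_{T_{x_2z},T_{x_1}} = -S_{T_z,T_{x_1x_2}} = S_{T_{x_2},T_{x_1z}} = -S_{T_{x_1x_2},T_z} = S_{T_{x_1z},T_{x_2}} \tag{152}$$
$$S_{T_{x_1},T_{zz}} = -i 2\pi f_1 S_{T_z,T_z} = -S_{T_{zz},T_{x_1}} = -S_{T_z,T_{x_1z}} = S_{T_{x_1z},T_z} \tag{153}$$
$$S_{T_{x_2},T_{x_2x_2}} = -\frac{2\sigma^2 (2\pi f_1)\, s_2}{\alpha d^3}\left[\left(6 - \frac{8 s_2^2}{d^2} - (2\pi f_1 s_2)^2\right) K_1 + 2\pi f_1 d\left(3 - \frac{4 s_2^2}{d^2}\right) K_0\right] = -S_{T_{x_2x_2},T_{x_2}} \tag{154}$$
$$S_{T_{x_2},T_{x_2z}} = -\frac{2\sigma^2 (2\pi f_1)\, \beta}{\alpha^2 d^3}\left[\left(2 - \frac{8 s_2^2}{d^2} - (2\pi f_1 s_2)^2\right) K_1 + 2\pi f_1 d\left(1 - \frac{4 s_2^2}{d^2}\right) K_0\right] = S_{T_{x_2z},T_{x_2}} = -S_{T_z,T_{x_2x_2}} = -S_{T_{x_2x_2},T_z} \tag{155}$$
$$S_{T_{x_2},T_{zz}} = -S_{T_{x_2},T_{x_1x_1}} - S_{T_{x_2},T_{x_2x_2}} = -S_{T_{zz},T_{x_2}} = -S_{T_z,T_{x_2z}} = S_{T_{x_2z},T_z} \tag{156}$$
$$S_{T_z,T_{zz}} = S_{T_{x_1},T_{x_1z}} + S_{T_{x_2},T_{x_2z}} = S_{T_{zz},T_z} \tag{157}$$
$$S_{T_{x_1x_1},T_{x_1x_1}} = (2\pi f_1)^4 S_{T,T} \tag{158}$$
$$S_{T_{x_1x_1},T_{x_1x_2}} = i (2\pi f_1)^3 S_{T_{x_2},T} = S_{T_{x_1x_2},T_{x_1x_1}} \tag{159}$$
$$S_{T_{x_1x_1},T_{x_1z}} = -i (2\pi f_1)^3 S_{T,T_z} = -S_{T_{x_1z},T_{x_1x_1}} \tag{160}$$
$$S_{T_{x_1x_1},T_{x_2x_2}} = (2\pi f_1)^2 S_{T_{x_2},T_{x_2}} = S_{T_{x_2x_2},T_{x_1x_1}} = S_{T_{x_1x_2},T_{x_1x_2}} \tag{161}$$
$$S_{T_{x_1x_1},T_{x_2z}} = (2\pi f_1)^2 S_{T_{x_2},T_z} = -S_{T_{x_2z},T_{x_1x_1}} = -S_{T_{x_1z},T_{x_1x_2}} = S_{T_{x_1x_2},T_{x_1z}} \tag{162}$$
$$S_{T_{x_1x_1},T_{zz}} = -(2\pi f_1)^2 S_{T_z,T_z} = S_{T_{zz},T_{x_1x_1}} = -S_{T_{x_1z},T_{x_1z}} \tag{163}$$
$$S_{T_{x_1x_2},T_{x_2x_2}} = i 2\pi f_1 S_{T_{x_2x_2},T_{x_2}} = S_{T_{x_2x_2},T_{x_1x_2}} \tag{164}$$
$$S_{T_{x_1x_2},T_{x_2z}} = -i 2\pi f_1 S_{T_{x_2},T_{x_2z}} = -S_{T_{x_2z},T_{x_1x_2}} = -S_{T_{x_1z},T_{x_2x_2}} = S_{T_{x_2x_2},T_{x_1z}} \tag{165}$$
$$S_{T_{x_1x_2},T_{zz}} = -i 2\pi f_1 S_{T_{x_2z},T_z} = S_{T_{zz},T_{x_1x_2}} = -S_{T_{x_1z},T_{x_2z}} = -S_{T_{x_2z},T_{x_1z}} \tag{166}$$
$$S_{T_{x_1z},T_{zz}} = S_{T_{x_1x_1},T_{x_1z}} + S_{T_{x_2x_2},T_{x_1z}} = -S_{T_{zz},T_{x_1z}} \tag{167}$$
$$S_{T_{x_2x_2},T_{x_2x_2}} = \frac{2\sigma^2 (2\pi f_1)}{\alpha d^3}\left\{2\pi f_1 d\left[3 - \frac{24 s_2^2}{d^2} + \frac{24 s_2^4}{d^4} + (2\pi f_1 s_2)^2 \frac{s_2^2}{d^2}\right] K_0 + 2\left[3 - \frac{24 s_2^2}{d^2} - 3 (2\pi f_1 s_2)^2 + \frac{24 s_2^4}{d^4} + 4 (2\pi f_1 s_2)^2 \frac{s_2^2}{d^2}\right] K_1\right\} \tag{168}$$
$$S_{T_{x_2x_2},T_{x_2z}} = -\frac{2\sigma^2 (2\pi f_1)\, \beta s_2}{\alpha^2 d^5}\left\{2\pi f_1 d\left[12 - \frac{24 s_2^2}{d^2} - (2\pi f_1 s_2)^2\right] K_0 + \left[24 - \frac{48 s_2^2}{d^2} + 3 (2\pi f_1 d)^2 - 8 (2\pi f_1 s_2)^2\right] K_1\right\} = -S_{T_{x_2z},T_{x_2x_2}} \tag{169}$$
$$S_{T_{x_2x_2},T_{zz}} = -S_{T_{x_1x_1},T_{x_2x_2}} - S_{T_{x_2x_2},T_{x_2x_2}} = S_{T_{zz},T_{x_2x_2}} = -S_{T_{x_2z},T_{x_2z}} \tag{170}$$
$$S_{T_{x_2z},T_{zz}} = S_{T_{x_1x_1},T_{x_2z}} + S_{T_{x_2x_2},T_{x_2z}} = -S_{T_{zz},T_{x_2z}} \tag{171}$$
$$S_{T_{zz},T_{zz}} = S_{T_{x_1z},T_{x_1z}} + S_{T_{x_2z},T_{x_2z}} \tag{172}$$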
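The first-derivative entries of this appendix can be checked against finite differences of the hybrid PSD itself. The sketch below is an illustration under stated assumptions rather than the author's procedure: it evaluates $K_0$ and $K_1$ from the standard integral representation $K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,dt$ with a trapezoidal rule (so that only the standard library is needed), and compares Eqs. (134) and (135) with central differences of Eq. (131). The function names, the quadrature settings, and the sample parameters are assumptions.

```python
import math

def bessel_k(nu, x, n=2000):
    """K_nu(x), x > 0, from int_0^inf exp(-x cosh t) cosh(nu t) dt (trapezoidal rule)."""
    t_max = math.acosh(1.0 + 750.0 / x)  # integrand is negligible beyond this point
    h = t_max / n
    total = 0.5 * math.exp(-x)           # half weight at t = 0; far endpoint is ~0
    for i in range(1, n + 1):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def hybrid_terms(f1, s2, z1, z2, alpha):
    """Return (beta, d, u) with d from Eq. (132) and u = 2*pi*f1*d."""
    beta = 1.0 + alpha * (z1 + z2)
    d = math.sqrt(beta**2 / alpha**2 + s2**2)
    return beta, d, 2.0 * math.pi * f1 * d

def S_TT(f1, s2, z1, z2, sigma2, alpha):
    """Hybrid PSD/covariance of the disturbing potential, Eq. (131)."""
    _, _, u = hybrid_terms(f1, s2, z1, z2, alpha)
    return 2.0 * sigma2 / alpha * bessel_k(0, u)

def S_T_Tx2(f1, s2, z1, z2, sigma2, alpha):
    """Closed form of Eq. (134)."""
    _, d, u = hybrid_terms(f1, s2, z1, z2, alpha)
    return -2.0 * sigma2 * (2.0 * math.pi * f1) * s2 / (alpha * d) * bessel_k(1, u)

def S_T_Tz(f1, s2, z1, z2, sigma2, alpha):
    """Closed form of Eq. (135)."""
    beta, d, u = hybrid_terms(f1, s2, z1, z2, alpha)
    return -2.0 * sigma2 * (2.0 * math.pi * f1) * beta / (alpha**2 * d) * bessel_k(1, u)

# Illustrative parameters (assumed): f1 in cy/m, lengths in m, sigma2 in m^4/s^4.
P = dict(f1=1.0e-6, s2=1.0e5, z1=0.0, z2=1.0e3, sigma2=3300.0, alpha=9.69e-7)

def central_difference(name, h):
    """Central difference of S_TT with respect to the parameter called `name`."""
    up, down = dict(P), dict(P)
    up[name] += h
    down[name] -= h
    return (S_TT(**up) - S_TT(**down)) / (2.0 * h)
```

Here `central_difference("s2", 100.0)` should reproduce `S_T_Tx2(**P)` and `central_difference("z2", 100.0)` should reproduce `S_T_Tz(**P)`, up to truncation error. Derivatives with respect to $x_1$ need no such check, since they reduce to multiplication by powers of $i 2\pi f_1$.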
References
Alfeld P, Neamtu M, Schumaker LL (1996) Fitting scattered data on sphere-like surfaces using spherical splines. J Comput Appl Math 73:5–43
Baranov V (1957) A new method for interpretation of aeromagnetic maps: pseudo-gravimetric anomalies. Geophysics 22:359–383
Brown RG (1983) Introduction to random signal analysis and Kalman filtering. Wiley, New York
de Coulon F (1986) Signal theory and processing. Artech House, Dedham
Fengler MJ, Freeden W, Michel V (2004) The Kaiserslautern multiscale geopotential model SWITCH-03 from orbit perturbations of the satellite CHAMP and its comparison to models EGM96, UCPH2002_02_05, EIGEN-1S and EIGEN-2. Geophys J Int 157:499–514
Forsberg R (1985) Gravity field terrain effect computations by FFT. Bull Géod 59(4):342–360
Forsberg R (1987) A new covariance model, for inertial gravimetry and gradiometry. J Geophys Res 92(B2):1305–1310
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere, with applications in geomathematics. Clarendon, Oxford
Heller WG, Jordan SK (1979) Attenuated white noise statistical gravity model. J Geophys Res 84(B9):4680–4688
Helmert FR (1884) Die Mathematischen und Physikalischen Theorien der Höheren Geodäsie, vol 2. BD Teubner, Leipzig
Hofmann-Wellenhof B, Moritz H (2005) Physical geodesy. Springer, Berlin
Jeffreys H (1955) Two properties of spherical harmonics. Q J Mech Appl Math 8(4):448–451
Jekeli C (1991) The statistics of the Earth's gravity field, revisited. Manuscr Geod 16(5):313–325
Jekeli C (2005) Spline representations of functions on a sphere for geopotential modeling. Report no. 475, Geodetic Science, Ohio State University, Columbus. http://www.geology.osu.edu/~jekeli.1/OSUReports/reports/report_475.pdf
Jordan SK (1972) Self-consistent statistical models for the gravity anomaly, vertical deflections, and the undulation of the geoid. J Geophys Res 77(20):3660–3669
Jordan SK, Moonan PJ, Weiss JD (1981) State-space models of gravity disturbance gradients. IEEE Trans Aerosp Electron Syst AES 17(5):610–619
Kaula WM (1966) Theory of satellite geodesy. Blaisdell, Waltham
Lauritzen SL (1973) The probabilistic background of some statistical methods in physical geodesy. Report no. 48, Geodaestik Institute, Copenhagen
Lyche T, Schumaker LL (2000) A multiresolution tensor spline method for fitting functions on the sphere. SIAM J Sci Comput 22(2):724–746
Mandelbrot B (1983) The fractal geometry of nature. Freeman, San Francisco
Marple SL (1987) Digital spectral analysis with applications. Prentice-Hall, Englewood Cliffs
Martinec Z (1998) Boundary-value problems for gravimetric determination of a precise geoid. Springer, Berlin
Maybeck PS (1979) Stochastic models, estimation, and control, vols I and II. Academic, New York
Milbert DG (1991) A family of covariance functions based on degree variance models and expressible by elliptic integrals. Manuscr Geod 16:155–167
Moritz H (1976) Covariance functions in least-squares collocation. Report no. 240, Department of Geodetic Science, Ohio State University, Columbus
Moritz H (1978) Statistical foundations of collocation. Report no. 272, Department of Geodetic Science, Ohio State University, Columbus
Moritz H (1980) Advanced physical geodesy. Abacus Press, Tunbridge Wells
Olea RA (1999) Geostatistics for engineers and earth scientists. Kluwer Academic, Boston
Pavlis NK, Holmes SA, Kenyon SC, Factor JF (2012a) The development and evaluation of earth gravitational model (EGM2008). J Geophys Res 117:B04406. doi:10.1029/2011JB008916
Pavlis NK, Holmes SA, Kenyon SC, Factor JF (2012b) Correction to "The development and evaluation of Earth Gravitational Model (EGM2008)". J Geophys Res 118:2633. doi:10.1002/jgrb.50167
Priestley MB (1981) Spectral analysis and time series analysis. Academic, London
Rummel R, Yi W, Stummer C (2011) GOCE gravitational gradiometry. J Geod 85:777–790
Schreiner M (1997) Locally supported kernels for spherical spline interpolation. J Approx Theory 89:172–194
Schumaker LL, Traas C (1991) Fitting scattered data on sphere-like surfaces using tensor products of trigonometric and polynomial splines. Numer Math 60:133–144
Tscherning CC (1976) Covariance expressions for second and lower order derivatives of the anomalous potential. Report no. 225, Department of Geodetic Science, Ohio State University, Columbus. http://geodeticscience.osu.edu/OSUReports.htm
Tscherning CC, Rapp RH (1974) Closed covariance expressions for gravity anomalies, geoid undulations and deflections of the vertical implied by anomaly degree variance models. Report no. 208, Department of Geodetic Science, Ohio State University, Columbus. http://geodeticscience.osu.edu/OSUReports.htm
Turcotte DL (1987) A fractal interpretation of topography and geoid spectra on the Earth, Moon, Venus, and Mars. J Geophys Res 92(B4):E597–E601
Watts AB (2001) Isostasy and flexure of the lithosphere. Cambridge University Press, Cambridge
Inverse Resistivity Problems in Computational Geoscience
Alemdar Hasanov (Hasanoğlu) and Balgaisha Mukanova
Contents
1 Introduction
2 Scientific Relevance of Coefficient Inverse Problems in Geomathematics
3 Least-Square (Quasisolution) Approach for Resistivity Prospecting Problem
3.1 Formulation of Inverse Problems Corresponding to Two Models of a Medium
3.2 Quasisolution Method
4 Numerical Method Based on Conjugate Gradient Algorithm
4.1 Gradient Formulas
4.2 The Solution to Direct Problems
4.3 A Gradient Method Algorithm
4.4 Numericals
5 Future Directions
6 Conclusion
References
Abstract
We study coefficient inverse problems arising in modeling of resistivity prospecting problems. Numerical simulations are investigated in the cases of vertically and cylindrically layered medium. Conductivity coefficients are assumed to be sufficiently smooth 1D functions. The model leads to an inverse problem of identification of an unknown coefficient (conductivity) in an elliptic equation in R2 inside a slab or in a cylinder. The direct problem is formulated as a mixed BVP in R2 . Measured data are assumed to be available on the upper boundary
A. Hasanov () Mathematics and Computer Science, Izmir University, Izmir, Turkey e-mail: [email protected] B. Mukanova Eurasian National University, Astana, Kazakhstan © Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_62
of the medium or along the axis of the well. A logarithmic transformation is applied to the unknown coefficient, and the inverse problem is studied as a minimization problem for the residual functional. A numerical method is discussed for interpreting the data of a resistivity prospecting in both considered models of layered medium. The method is implemented for realistic conductivity distributions, with both noise-free and noisy data.
1 Introduction
The resistivity sounding method entered geophysical prospecting in 1927, following the work of the Schlumberger brothers. The main ideas of the method are described in Stefanescu and Shlumberger (1930). The method is based on measurements of surface potentials produced by currents injected into a medium. The corresponding mathematical models were first stated by Slichter (1933) and Langer (1933). In most practical cases, the model of a vertically layered medium is used as a preliminary approximation to the resistivity distribution of the medium, and the conductivity function is assumed to be piecewise constant. Data interpretation is then based on the layer stripping method, first proposed by Pekeris (1940). The modern formulation and applications of that technique are described in the review book by Sylvester (2000). The method is attractive due to its simplicity and computational efficiency. The results of data interpretation can be improved further by using other models of the medium and/or additional prospecting methods. Here we discuss how the resistivity sounding problem can be solved by using a model of the medium with continuously distributed electrical properties. We consider only cylindrically and vertically layered media.
2 Scientific Relevance of Coefficient Inverse Problems in Geomathematics
A mathematical model of resistivity prospecting is based on the equations of the field potential written for a nonhomogeneous medium. One needs to determine the resistivity distribution of the medium by using some additional measurements on the accessible boundary. Thus, the problem reduces to a coefficient inverse problem (CIP) for an elliptic equation. The original formulation of Langer (1933) is regarded as the classical statement of the resistivity sounding problem. The model of Langer assumes a vertical electrical sounding in which the electrical properties of the medium depend only on depth and a source electrode is placed on the surface of the medium. Tikhonov (1949) also studied this model and rigorously proved the uniqueness of the solution to the inverse problem. This problem is an example of a coefficient inverse problem that is severely ill-posed (Alessandrini 1988). The general principles of the solution to such problems, including quasisolution and regularization methods, are stated by Ivanov (1963) and Tikhonov and Arsenin (1977). Nowadays, the theory of coefficient inverse problems (CIPs) is one of the most active areas of applied mathematics.
The best-known application is electrical impedance tomography, used in medicine, defectoscopy, and geophysics. Note that the first complete statement of the coefficient inverse problem (CIP) for an elliptic equation and an elliptic system of equations was presented by Hasanov (1995). A review of the literature in this field of geophysical research can be found in the book by Spichak (2011). The newest methods for solving CIPs, with many numerical examples, are described in the book by Beilina and Klibanov (2012). The resistivity prospecting method is used for the vertically layered model of a medium and for well logging as well. Well logging has been studied extensively in the oil and gas industry since the 1940s (Archie 1942; Dvoretckiy and Yarmakhov 1998; Epov et al. 2010; Peng 1997; Tabarovsky et al. 1994; Onegova and Epov 2011, and references therein). Archie (1942) demonstrated for the first time how the electrical log can be used to provide qualitative indications of the presence of oil and gas reservoirs. Kaufman and Dashevsky (2003) provide an introduction to the basic principles and techniques of well electromagnetic sounding. The numerical solutions to the direct problem, when the conductivity is a given function, have been considered by many authors for different well logging tools (see, for instance, Epov et al. 2010; Geng et al. 2012; Lv et al. 2009, and references therein). In Peng (1997), the spontaneous potential well logging inverse problem is formulated as a variational problem for an elliptic equation, in which the resistivity of the objective layer appears as a coefficient. The existence of solutions to the model is then proven, and sufficient conditions to ensure the uniqueness of the solution are obtained. In Tabarovsky et al. (1994) and Pen'kovskii and Korsakova (2010), the inverse problem has been formulated for an inductive logging tool and several numerical results are obtained.
The first successful attempt at a numerical solution to the problem posed by Langer (1933) and Tikhonov (1949) was briefly described in the unfinished research of Alekseev et al. (1989). These authors employed a regularization method, then smoothed the solution in each iteration step, and obtained some examples of recovery of the unknown coefficient. In the paper by Mukanova and Orunkhanov (2010), the problem is solved numerically for various assumed conductivity distributions. Unlike Alekseev et al., neither a regularization of the residual functional nor a smoothing procedure is used. Instead, a two-stage procedure to recover the unknown coefficient is proposed that leads to a successful numerical solution of the considered inverse problem. In the first step, the logarithmic derivative, $p(z)$, of the unknown conductivity $\sigma(z)$ is recovered. In the second step, the function $\sigma(z)$ is found using an analytical formula. In the paper by Mukanova (2012), this method is applied to the well logging problem when the electrical properties of the medium depend only on the distance to the polar axis and do not depend on the depth. In addition, whereas the papers cited above (Langer 1933; Tikhonov 1949; Alekseev et al. 1989) consider the case when the source electrode is placed on the surface of the vertically layered medium, it is assumed that the current source is placed inside a well and that the surrounding medium is cylindrically layered. The problem is expressed in cylindrical coordinates
$(r, \varphi, z)$ and uses a statement of a direct problem described in the book by Dvoretckiy and Yarmakhov (1998). This statement differs from the classical treatment of Langer and Tikhonov in both the boundary conditions and the geometry of the physical domain.
3 Least-Square (Quasisolution) Approach for Resistivity Prospecting Problem
3.1 Formulation of Inverse Problems Corresponding to Two Models of a Medium
1. We consider two different models of the medium, denoted by the abbreviations M1 and M2:
M1. Assume that the medium is vertically layered, i.e., the conductivity function $\sigma(z)$ depends on the depth $z$ only. Suppose that its value $\sigma_0 = \sigma(0)$ on the surface of the medium is given and that $\sigma(z)$ changes continuously along $z \in [0, H]$, where $H$ is the thickness of the slab.
M2. Assume that the conductivity of the medium in a well has a given value, $\sigma_w = \mathrm{const}$, and let $r_w$ be the radius of the well. Suppose that the well is surrounded by a penetration zone, $r_w \le r \le r_e$, in which the conductivity of the medium changes continuously in the radial direction. Outside the penetration zone, the containing environment has another known and constant conductivity, $\sigma_e = \mathrm{const}$ for $r > r_e$.
For convenience, we introduce a number of dimensionless variables. Let the values $H$, $\sigma_0$ and $r_w$, $\sigma_w$ be the units of length and conductivity for models M1 and M2, respectively. Let the current source be placed at the origin, with current amplitude $I$. We take
$$M = \frac{I}{4\pi\sigma_0 H} \quad \text{and} \quad M = \frac{I}{4\pi\sigma_w r_w} \tag{1}$$
to be the corresponding units of potential for models M1 and M2. In the case of model M1, we assume that the set $S$ of dimensionless conductivity functions $\sigma(z)$ satisfies the following conditions:
$$S := \{\sigma(z) \in C^2[0, 1] : \sigma(0) = 1,\ \sigma'(0) = 0;\ 0 < \sigma_1 \le \sigma(z) \le \sigma_2\}.$$
For the model M2,
$$\sigma_{\mathrm{total}}(r) = \begin{cases} 1, & 0 \le r < 1, \\ \sigma(r), & 1 \le r \le r_1, \\ \sigma_\infty = \mathrm{const}, & r_1 < r < \infty, \end{cases}$$
and the set of admissible conductivity functions $S$ is defined as follows:
$$\sigma(r) \in S = \{\sigma(r) : 0 < \sigma_1 \le \sigma(r) \le \sigma_2 < \infty,\ \sigma(r) \in C^2[1, r_1]\},$$
where $r_1 = r_e/r_w$ and $\sigma_\infty = \sigma_e/\sigma_w$. The commonly used mathematical model of resistivity prospecting is an equation for the field potential $u(r, z)$. Let us formulate the coefficient inverse problems (CIPs) in terms of this model in the cases M1 and M2:
Case M1 (CIP1): Let the function $u(r, z)$ be the solution to the following boundary value problem
$$\begin{cases} \dfrac{\sigma(z)}{r}\dfrac{\partial}{\partial r}\left(r\dfrac{\partial u}{\partial r}\right) + \dfrac{\partial}{\partial z}\left(\sigma(z)\dfrac{\partial u}{\partial z}\right) = 0, & 0 < r < \infty,\ 0 < z < 1, \\ \dfrac{\partial u}{\partial z}\Big|_{z=0} = -\delta(r), \\ \dfrac{\partial u}{\partial r}\Big|_{r=0} = 0, \quad u(r, z)|_{z=1} = 0, \\ \lim\limits_{r \to \infty} u(r, z) = 0. \end{cases} \tag{2}$$
Determine the unknown electric conductivity coefficient $\sigma(z) \in S$ from the measured data $u_1(r)$, where
$$\frac{\partial u}{\partial r}\Big|_{z=0} = u_1(r). \tag{3}$$
Case M2 (CIP2): Let the function $u(r, z)$ be the solution to the following boundary value problem
$$\begin{cases} \dfrac{1}{r}\dfrac{\partial}{\partial r}\left(r\sigma(r)\dfrac{\partial u(r, z)}{\partial r}\right) + \sigma(r)\dfrac{\partial^2 u(r, z)}{\partial z^2} = 0, & 0 < r < \infty,\ -\infty < z < +\infty, \\ \dfrac{\partial u}{\partial r}\Big|_{r=0} = -\delta(r), \\ \lim\limits_{r \to \infty} u(r, z) = 0, \\ \lim\limits_{z \to \pm\infty} u(r, z) = 0. \end{cases} \tag{4}$$
Determine the unknown electric conductivity coefficient $\sigma(r) \in S$ from the measured data $U(z)$, where
$$u(r, z)|_{r=0} = U(z). \tag{5}$$
Remark: in the sequel we formulate CIP2 anew.
2. Let us first reformulate the direct problems.
Case M1. Note that the boundary condition at $z = 1$ in the statement (2) is an approximate form of the requirement that the potential should vanish at infinity. Namely, the condition at infinity is shifted to the boundary $z = 1$.
Let us apply the Hankel transform to the problem (2) with respect to the variable $r$. Then we obtain the two-point problem for a second order ordinary differential equation:
$$\frac{d}{dz}\left[\sigma(z)\frac{dV}{dz}\right] - \lambda^2\sigma(z)V = 0, \tag{6}$$
with the following boundary conditions:
$$\frac{dV}{dz}\Big|_{z=0} = -1, \quad V|_{z=H} = 0. \tag{7}$$
Here the function
$$V(\lambda, z) = \int_0^\infty u(r, z) J_0(\lambda r)\, r\, dr \tag{8}$$
is a Hankel transform of the potential $u(r, z)$. Having the function $V(\lambda, z)$, we can derive the solution to the problem (2) by the following inversion formula:
$$u(r, z) = \int_0^\infty V(\lambda, z) J_0(\lambda r)\, \lambda\, d\lambda.$$
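For a given conductivity profile, the two-point problem (6)–(7) can be solved with a few lines of code. The following is a minimal sketch, not the authors' implementation: it uses a conservative second-order finite-difference scheme on the dimensionless interval $0 \le z \le 1$ and a tridiagonal (Thomas) solve. The flux condition $dV/dz = -1$ at $z = 0$ is an assumption about the sign convention and is exposed as the `flux` parameter; the function names and grid size are likewise illustrative.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (lower[0] and upper[-1] are unused)."""
    n = len(diag)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * cp[i - 1]
        cp[i] = upper[i] / denom
        dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def solve_forward_m1(sigma, lam, n=400, flux=-1.0):
    """Solve (sigma V')' - lam^2 sigma V = 0 on [0, 1], V'(0) = flux, V(1) = 0.

    sigma is a callable giving the dimensionless conductivity; returns (z, V).
    """
    h = 1.0 / n
    z = [i * h for i in range(n + 1)]
    s_half = [sigma(0.5 * (z[i] + z[i + 1])) for i in range(n)]  # cell midpoints
    lower, diag = [0.0] * n, [0.0] * n
    upper, rhs = [0.0] * n, [0.0] * n
    # Boundary row: integrate the equation over the half cell [0, h/2].
    diag[0] = -(s_half[0] / h + 0.5 * h * lam**2 * sigma(z[0]))
    upper[0] = s_half[0] / h
    rhs[0] = sigma(z[0]) * flux
    for i in range(1, n):
        lower[i] = s_half[i - 1] / h**2
        upper[i] = s_half[i] / h**2 if i < n - 1 else 0.0  # V(1) = 0 drops out
        diag[i] = -(s_half[i - 1] + s_half[i]) / h**2 - lam**2 * sigma(z[i])
    v = solve_tridiagonal(lower, diag, upper, rhs)
    return z, v + [0.0]
```

For a homogeneous medium, $\sigma \equiv 1$, the exact solution under the same assumptions is $V(\lambda, z) = \sinh(\lambda(1 - z))/(\lambda\cosh\lambda)$, which gives a convenient check of the discretization before a variable profile is substituted.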
Case M2. In the cylindrically symmetric case, the governing equation is given by (4). Further we consider the solution on the cylindrical layer $1 < r < r_1$. Then the function $u(r, z)$ should satisfy the condition
$$\lim_{z \to \pm\infty} u(r, z) = 0, \quad 1 < r < r_1. \tag{9}$$
Now, we need boundary conditions at $r = 1$ and $r = r_1$. One should express the continuity of the potential $u(r, z)$ and the current density $\sigma(r)\,\partial u/\partial r$ at the boundaries $r = 1$ and $r = r_1$. We will derive them later. First, let us apply the Fourier transform to the electric potential $u(r, z)$ with respect to $z$:
$$W(r, s) = \int_0^\infty u(r, z)\cos(sz)\, dz. \tag{10}$$
Then
$$u(r, z) = \frac{2}{\pi}\int_0^\infty W(r, s)\cos(sz)\, ds. \tag{11}$$
Introduce the notation
$$a(r) = r\sigma(r). \tag{12}$$
Then, the transformed governing equation is the following ordinary differential equation with parameter $s$:
$$\frac{d}{dr}\left(a(r)\frac{dW}{dr}\right) - a(r)\, s^2\, W(r, s) = 0, \quad s \in [0, \infty),\ r \in (1, r_1). \tag{13}$$
Boundary conditions for Eq. (13) are first derived in Dvoretckiy and Yarmakhov (1998). The proof is available in Mukanova (2012). It has been shown that the boundary conditions at $r = 1$ and $r = r_1$ are the following:
$$W'(1, s) = k(s)W(1, s) - l(s), \quad W'(r_1, s) = m(s)W(r_1, s), \tag{14}$$
where the functions
$$k(s) = sI_1(s)/I_0(s), \quad l(s) = 1/I_0(s), \quad m(s) = -sK_1(sr_1)/K_0(sr_1) \tag{15}$$
are defined in terms of the modified Bessel functions $K_1$, $K_0$ and $I_0$, $I_1$. These conditions express the continuity of the potential and the current at the boundaries $r = 1$ and $r = r_1$.
Remark. Another form of the condition at $z = 1$ for the model M1 could be derived using the technique used for the model M2; but that technique requires the value of $\sigma(1)$, which is unknown due to the physical meaning of the problem.
Statements (6)–(7) and (13)–(14) comprise the forward problem for models M1 and M2, respectively, when the conductivity function is known.
3. Now we reformulate the coefficient inverse problem in case M2. To state an inverse problem in the layer $1 < r < r_1$, we first consider the solution inside the well. Suppose that the solution is just the potential of a point source in a small region around the origin and can in general be expressed in the form
$$u(r, z) = \frac{1}{\sqrt{r^2 + z^2}} + \Phi(r, z), \tag{16}$$
where the function $\Phi(r, z)$ is bounded, vanishes as $z \to \pm\infty$, and has a well-defined Fourier transform (with respect to $z$). In the region where the conductivity function is constant, i.e., for $r \in (0, 1)$ (and for $r \in (r_1, \infty)$ as well), the function $W(r, s)$ is a solution to the modified Bessel equation
$$W'' + \frac{1}{r}W' - s^2 W = 0,$$
1852
A. Hasanov and B. Mukanova
whose general solution can be expressed in terms of modified Bessel functions of the first and second kind of order zero: W .r; s/ D
.s/.s/I0 .sr/ C &.s/K0 .sr/:
(17)
From the formula 1
2 p D 2 2 r Cz
Z1 K0 .sr/ cos.sz/ds 0
that holds for Bessel functions and expressions (16) and (17), we find that inside the well, the function W .r; s/ can be written as W .r; s/ D
.s/I0 .sr/ C K0 .sr/:
(18)
Then by (16), Z W .0; s/ D
1
.s/ D
˚.0; z/ cos.sz/d z
(19)
0
is the Fourier transform of the function ˚.0; z/. The function ˚.0; z/ is known from measured data: ˚.0; z/ D U .z/
1 z
then the solution to equation (13) on on the boundary r D 1 of the well is given by W .1; s/ D
.s/I0 .s/ C K0 .s/ '.s/:
(20)
Thus, the considered inverse problem is formulated as follows: CIP2.
Find the function .r/ a.r/=r using the solution of the BVP stated in equation (13) and conditions (14), satisfying the additional condition (20), where the function '.s/ is given.
We have thus reduced the inverse problem CIP2 to equation (13) with conditions (14) and (20), which is similar to the statement for the model M1 except for the different boundary conditions. We can therefore apply the same numerical method to both cases.
3.2
Quasisolution Method
Due to measurement errors, the problems CIP1 and CIP2 might not have a solution in any suitable class of admissible coefficients. For this reason, we use the quasisolution method (Ivanov 1963) to solve the stated inverse problems. First, we make some preliminary transformations. Let us introduce the logarithmic derivative of the coefficient in Eqs. (6) and (13):

$$p(z) = (\ln\sigma(z))' = \frac{\sigma'}{\sigma}, \quad\text{and}\quad p(r) = (\ln a(r))' = \frac{(r\sigma(r))'}{r\sigma} = \frac{\sigma'}{\sigma} + \frac{1}{r}. \tag{21}$$

It has been shown in Mukanova and Orunkhanov (2010, Appendix) that the measured data are sensitive to a reflection factor and not sensitive to the contrast of the electrical properties of the adjacent medium. The function $p$ introduced above is just a continuous analog of a reflection factor. It is therefore preferable to express the CIPs in terms of this function. Evidently, the conductivity function can be expressed via the reflection factor function $p$ as follows:

$$\sigma(z) = \exp\left(\int_0^z p(\zeta)\,d\zeta\right), \tag{22}$$

in the case M1, and

$$\sigma(r) = \frac{1}{r}\exp\left(\int_1^r p(\rho)\,d\rho\right), \tag{23}$$

in the case M2. To construct residual functionals, we reformulate the CIPs above.

Case M1. Let $\sigma(z)$ be a given coefficient and $p(z)$ its logarithmic derivative. Denote by $u = u(r,z;p)$ the unique solution to the direct problem (2), corresponding to the coefficient $\sigma(z)$. Introduce the operator

$$\Gamma[p] := \left.\frac{\partial u(r,z;p)}{\partial r}\right|_{z=0}.$$

Then the CIP1 can be formulated in the following operator form (Hasanov 1997):

$$\Gamma(p)(r) = u_1(r), \qquad r \in [0,\infty).$$

Therefore, the CIP1 can be reduced to solving the operator equation above.
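Formulas (22) and (23), which recover the conductivity from the reflection factor function, can be evaluated on a grid with a running quadrature. The sketch below is illustrative only: the trapezoidal rule, the grids, and the function names are assumptions, and the two sample reflection functions were chosen because their conductivities are known in closed form.

```python
import math

def cumulative_trapezoid(values, grid):
    """Running trapezoidal integral of sampled values, starting from 0 at grid[0]."""
    out = [0.0]
    for i in range(1, len(grid)):
        h = grid[i] - grid[i - 1]
        out.append(out[-1] + 0.5 * h * (values[i] + values[i - 1]))
    return out

def sigma_from_p_layered(p, z):
    """Case M1, formula (22): sigma(z) = exp(int_0^z p); the grid z must start at 0."""
    return [math.exp(v) for v in cumulative_trapezoid(p, z)]

def sigma_from_p_radial(p, r):
    """Case M2, formula (23): sigma(r) = exp(int_1^r p) / r; the grid r must start at 1."""
    integral = cumulative_trapezoid(p, r)
    return [math.exp(v) / ri for v, ri in zip(integral, r)]

n = 200
z = [i / n for i in range(n + 1)]                       # depth grid on [0, 1]
sigma_z = sigma_from_p_layered([0.5] * (n + 1), z)      # p = 1/2 gives sigma = exp(z/2)

r = [1.0 + 2.0 * i / n for i in range(n + 1)]           # radial grid on [1, 3]
sigma_r = sigma_from_p_radial([1.0 / ri for ri in r], r)  # p = 1/r gives sigma = 1

print(sigma_z[-1], math.exp(0.5))
print(sigma_r[-1])
```

Working with $p$ instead of $\sigma$ has the side effect that the recovered conductivity is positive by construction, since it is always the exponential of a real number.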
Now, define the residual functional (Tikhonov and Arsenin 1977)

$$J(p) = \frac{1}{2}\int_0^\infty \left(\Gamma(p) - u_1(r)\right)^2 r\,dr, \tag{24}$$

and consider the minimization problem

$$J(p_*) = \inf_{p \in P} J(p), \qquad p_* \in P, \tag{25}$$

in the class of admissible reflection coefficients

$$P = \{p(z) \in C^1[0,H],\ p(z) = (\ln\sigma(z))',\ \sigma(z) \in S\}. \tag{26}$$

The function $\sigma_*(z)$ calculated via $p_*$ by formula (22) is defined to be a quasisolution of the inverse problem CIP1. Define the transformed measured data

$$\varphi(\lambda) = \int_0^\infty u_1(r)J_1(\lambda r)\,r\,dr.$$

Due to the unitary property of the Hankel transformation, differentiation rules for transformed functions, and formulas (8), the residual functional is equal to

$$J(p) = \frac{1}{2}\int_0^\infty \left(\varphi(\lambda) - V(\lambda,0)\right)^2 \lambda\,d\lambda. \tag{27}$$

We will consider the above minimization problem (24) and (25) for the transformed functional (27) in the set of the admissible reflection factor functions $p(z)$. In the same way, we introduce a residual functional in the case CIP2. Let

$$J[p] = \frac{1}{2}\int_{s_1}^{s_2} \left(W(1,s) - \varphi(s)\right)^2 ds \tag{28}$$

be the residual functional, where $0 \le s_1 < s_2 < \infty$. We construct a quasisolution to CIP2 defined in (13), (14), and (20), by minimizing the functional (28) with respect to the function $p(r)$.
4
Numerical Method Based on Conjugate Gradient Algorithm
4.1
Gradient Formulas
Thus, we first need to reformulate the problem in terms of reflection functions; then we have to solve the minimization problem numerically. The most common way to do so is to use gradient methods. There exist different ways to obtain gradient formulas. In most practical cases, the gradient is expressed via the solution to the corresponding adjoint problem. In our case, however, the gradient is expressed in closed form via the solution to the direct problem. Note that a closed form is preferable for obtaining higher numerical accuracy. In Mukanova (2009, 2012), the formulas of Fréchet derivatives of the considered functionals are obtained in the following forms:

CIP1:

$$\nabla J[p] = \sigma(z)\int_0^\infty \left(V(\lambda,0) - \varphi(\lambda)\right)V(\lambda,z)V'(\lambda,z)\,\lambda^2\,d\lambda; \tag{29}$$

CIP2:

$$\nabla J[p] = a(r)\int_{s_1}^{s_2} l^{-1}(s)\left(W(1,s) - \varphi(s)\right)W(r,s)W'(r,s)\,ds. \tag{30}$$

Here the functions $\sigma(z)$ and $a(r)$ should be replaced by their corresponding expressions via $p(\cdot)$. The expressions (29) and (30) are similar to each other. The differences are the weighting coefficient $l^{-1}(s)$ and the statement of boundary conditions in the direct problems.
4.2
The Solution to Direct Problems
When implementing any gradient method, one needs to solve the direct problem many times. The number of repetitions depends on the number of iterations of the gradient method and on the number of grid points for the Fourier transform parameter. Evidently, the total number of direct-problem solutions might be very high, so the method should be both accurate and efficient. Because the direct problems formulated above are very simple, they can be solved by standard finite difference (FDM) or finite element (FEM) methods.
4.3
A Gradient Method Algorithm
The minimization problem (25) for residuals (27) and (28) can be solved using the following iterative algorithm (see, for instance, Vasil'ev 1981):

(1) Specify the value of $\varepsilon$ used in the termination criterion (see step (4)) and the tolerances with which the parameter $\alpha_n$ and the minimum of $J[p^{(n)} - \alpha_n q^{(n)}]$ are determined in step (3).

(2) Choose an initial guess $p^{(0)}(z) = (\ln\sigma^{(0)}(z))'$ (case M1) or $p^{(0)}(r) = (\ln(r\sigma^{(0)}(r)))'$ (case M2).

(3) Find the new values of $p$ and $q$ using the conjugate gradient method formulas:

$$\begin{aligned}
q^{(0)} &= \nabla J[p^{(0)}], \quad q^{(1)} = q^{(0)}, \quad q^{(n+1)} = q^{(n)} - \alpha_n\left(\nabla J[p^{(n)}] - \beta_n q^{(n-1)}\right), && n = 1, 2, 3, \ldots\\
p^{(n+1)} &= p^{(n)} - \alpha_n q^{(n)}, && n = 0, 1, 2, \ldots
\end{aligned} \tag{31}$$

where the coefficient $\beta_n$ is computed as

$$\beta_n = \frac{\langle \nabla J[p^{(n)}], \nabla J[p^{(n-1)}]\rangle}{\|\nabla J[p^{(n-1)}]\|^2}$$

and the value $\alpha_n$ is defined by the conditions

$$\alpha_n \ge 0, \qquad J[p^{(n)} - \alpha_n q^{(n)}] = \min_{\alpha \in [0,\alpha_{\max}]} J[p^{(n)} - \alpha q^{(n)}]. \tag{32}$$

(4) Repeat step (3) until the following stopping criterion is satisfied:

$$\max\left(|J[p^{(n)}]|, \|\nabla J[p^{(n)}]\|\right) < \max\left(\|\omega\|_{L^2[s_1,s_2]}, \varepsilon\right). \tag{33}$$

Here, $\|\omega\|_{L^2[s_1,s_2]}$ is the estimated norm of the additive noise in the measurements.

Let us give several notes on the practical use of the gradient method:

(a) It is important to start with very small values of the parameter $\alpha_n$; in our practice it was about $10^{-5}$.
(b) When choosing an initial guess, it is recommended to set it to the lowest possible values of the function $\sigma(\cdot)$.
(c) It is useful to use a logarithmic grid for the Fourier or Hankel transformation parameter, like $\xi = \ln(1 + s)$.
(d) The most time-consuming step in an implementation of the method is the determination of $\alpha_n$ by (32); it is especially important to choose a convenient value of $\alpha_{\max}$. Computations show that it is useful to set $\alpha_{\max} \sim \|\nabla J[p^{(n-1)}]\|^{-1}$.
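The overall structure of steps (1)–(4) can be illustrated on a toy problem. The sketch below is not the authors' scheme: it uses a generic Fletcher–Reeves conjugate gradient update (an assumption, chosen because it is standard), a golden-section search for the bounded line search of the type (32), and a stopping rule modelled on (33) with zero noise. The quadratic test functional, the fixed `alpha_max`, and all function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def golden_section(f, lo, hi, tol=1e-10):
    """Minimize a unimodal scalar function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                      # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                            # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def conjugate_gradient(J, grad, p0, eps=1e-6, alpha_max=2.0, max_iter=200):
    """Generic Fletcher-Reeves CG with a bounded line search on [0, alpha_max]."""
    p = list(p0)
    g = grad(p)
    q = list(g)                          # first direction: the gradient itself
    for n in range(max_iter):
        # Stopping rule in the spirit of (33), with the noise norm set to zero.
        if max(abs(J(p)), math.sqrt(dot(g, g))) < eps:
            return p, n
        alpha = golden_section(
            lambda a: J([pi - a * qi for pi, qi in zip(p, q)]), 0.0, alpha_max)
        p = [pi - alpha * qi for pi, qi in zip(p, q)]
        g_new = grad(p)
        beta = dot(g_new, g_new) / dot(g, g)   # Fletcher-Reeves coefficient (assumed variant)
        q = [gi + beta * qi for gi, qi in zip(g_new, q)]
        if dot(q, g_new) <= 0.0:               # safeguard: restart with steepest descent
            q = list(g_new)
        g = g_new
    return p, max_iter

# Toy quadratic functional with known minimizer `target` and minimum value 0.
weights = [1.0, 4.0, 9.0]
target = [1.0, -2.0, 0.5]

def J_toy(p):
    return 0.5 * sum(w * (pi - ti) ** 2 for w, pi, ti in zip(weights, p, target))

def grad_toy(p):
    return [w * (pi - ti) for w, pi, ti in zip(weights, p, target)]

p_star, iterations = conjugate_gradient(J_toy, grad_toy, [0.0, 0.0, 0.0])
print(p_star, iterations)
```

In the inverse problem itself, each evaluation of `J` inside the line search costs one solution of the direct problem for every grid value of the transform parameter, which is why note (d) singles out the choice of $\alpha_{\max}$.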
4.4
Numericals
The method described above has been tested with noisy and noise-free synthetic data. Computations show that the conditions under which the solution can be obtained numerically are similar for models M1 and M2 (see Mukanova 2012; Mukanova and Orunkhanov 2010).

Fig. 1 A recovery of the increasing $\sigma(z)$ and corresponding $p(z)$ in the case of noise-free synthetic measured data

The results of numerical simulations of CIP1 in the case of monotonically increasing $\sigma(z)$ are depicted in Fig. 1, left-hand panel. The corresponding reflection factor functions $p(z)$ are compared in the right-hand panel. To check the stability of the method, we introduced random additive noise into the synthetic measured data. The model of the noise is described in detail in Mukanova and Orunkhanov (2010) and Mukanova (2012). The noise is represented as a sum of harmonics with random amplitudes. The number of harmonics is equal to the number of grid points. Different cases of noise functions with their Fourier transforms are depicted in Fig. 2. The noise level is expressed as the ratio between the maximum of the noise and the maximum of the measured data. The results obtained for a noise level equal to 5 % are presented in Fig. 3. The left-hand panels of the figures represent the transformed noised measured data and the data that correspond to the initial guess. The most favorable case is when the function $\sigma(z)$ is decreasing. The results obtained with a 5 % noise level are compared in Fig. 4. When the conductivity is an increasing function, admissible results are obtained up to a 2.5 % noise level (see Fig. 5). The most unfavorable case occurs when the conductivity has a local minimum. The reason for this is an "overshadowing" effect. We show an example corresponding to this case with a 1 % noise level (Fig. 6). In general, the results obtained for the cylindrically symmetric case turn out to be significantly worse than those for a vertical-layered medium model and require further improvement. In particular, the conductivity function can be recovered with satisfactory quality for a contrast ratio $a_{\max}/a_{\min} \le 10$ and for simple distributions of $\sigma(r)$. More acceptable results are obtained for the cases where $\sigma(\cdot)$ is monotonic and for those with a local maximum near an available boundary. We show several numerical results obtained for CIP2 in Figs. 7 and 8.
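The noise description above (a sum of harmonics with random amplitudes, one harmonic per grid point, scaled to a prescribed ratio of maxima) can be sketched as follows. The exact noise model is given in the cited articles and is not reproduced here; the sine basis, the uniform amplitude distribution, the sample data, and the function names are assumptions made for illustration.

```python
import math
import random

def harmonic_noise(grid, n_harmonics, rng):
    """Sum of sine harmonics with random amplitudes, sampled on the grid."""
    length = grid[-1] - grid[0]
    amplitudes = [rng.uniform(-1.0, 1.0) for _ in range(n_harmonics)]
    return [
        sum(a * math.sin(math.pi * (k + 1) * (x - grid[0]) / length)
            for k, a in enumerate(amplitudes))
        for x in grid
    ]

def add_noise(data, grid, level, seed=0):
    """Add harmonic noise so that max|noise| / max|data| equals `level`."""
    rng = random.Random(seed)
    noise = harmonic_noise(grid, len(grid), rng)   # as many harmonics as grid points
    scale = level * max(abs(d) for d in data) / max(abs(v) for v in noise)
    return [d + scale * v for d, v in zip(data, noise)]

n = 50
grid = [10.0 * i / n for i in range(n + 1)]
data = [1.0 / (1.0 + x) for x in grid]             # hypothetical smooth "measured" data
noisy = add_noise(data, grid, level=0.05)

achieved = max(abs(a - b) for a, b in zip(noisy, data)) / max(abs(d) for d in data)
print(achieved)
```

Fixing the seed makes a noise realization reproducible, which helps when comparing reconstructions for different noise levels on the same realization.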
Fig. 2 Different examples of noise functions (panels "noises") and their Fourier transforms (panels "Transformed noises")
Fig. 3 Transformed noised measured data, initial guess, and recovered functions in the case of a conductivity function with a maximum for different 5 % random noises (left-hand panels: $\varphi(\lambda)$ noised, recovered, and initial; right-hand panels: $\sigma(z)$ recovered, synthetic, and initial, plotted against $z$)
5
Future Directions
We described how the data interpretation process can be extended by taking into consideration layered medium models with 1D smooth conductivity functions. Another relatively simple type of medium model is a non-layered one with a piecewise constant conductivity function. Such models are useful if one considers a medium with several local inclusions or with buried relief. We expect that the direct problems in these cases can be solved efficiently and with high accuracy by using the integral equations method. Then, different directions in solving inverse problems are possible.
Fig. 4 Recovered conductivity distributions with 5 % noised data in the case of monotonically decreasing $\sigma(z)$

Fig. 5 Recovered conductivity distributions with 2.5 % noised data in the case of monotonically increasing $\sigma(z)$
6
Conclusion
The considered mathematical models form the next step in the complexity of medium models as compared with standard layered models. The discussed models are still simple and efficient and can be used as an additional alternative for interpreting measurement data.
Fig. 6 Recovered conductivity distribution with 1 % noised data in the case of $\sigma(z)$ having a local minimum

Fig. 7 Examples of numerical solutions to CIP2 with 1 % noise level of measured data (exact and recovered $\sigma(r)$ plotted against $r$)

Fig. 8 The recovery of conductivity distributions with 5 % noised data in the cases of $\sigma(z)$ having local extrema
References
Alekseev AS, Tcheverda VA, Niambaa Sh (1989) Optimization method for solving the inverse problem of geophysical prospecting by electric means under direct current for vertically-inhomogeneous media. In: Vogel A et al (eds) Inverse Modeling in Exploration Geophysics. Vieweg & Sohn, Braunschweig/Wiesbaden, pp 171–189
Alessandrini G (1988) Stable determination of conductivity by boundary measurements. Appl Anal 27:153–172
Archie GE (1942) The electrical resistivity log as an aid in determining some reservoir characteristics. AIME Trans 146:54–62
Beilina L, Klibanov MV (2012) Approximate global convergence and adaptivity for coefficient inverse problems. Springer, New York/Dordrecht/Heidelberg/London
Dvoretckiy PI, Yarmakhov IG (1998) Electromagnetic and hydrodynamic methods in oil and gas deposit exploration. Nedra, Moscow (in Russian)
Epov MI, Mironov VL, Muzalevskiy KV, Yeltsov IN (2010) UWB electromagnetic borehole logging tool. In: IEEE International Symposium on Geoscience and Remote Sensing, Honolulu, HI, USA, pp 3565–3567
Geng M, Liang H, Yin H, Liu D, Gao Y (2012) Numerical simulation in whole space for resistivity logging through casing under approximate conditions. Procedia Eng 29:3600–3607
Hasanov A (1995) An inverse coefficient problem for an elasto-plastic medium. SIAM J Appl Math 55:1736–1752
Hasanov A (1997) Inverse coefficient problems for monotone potential operators. Inverse Probl 13:1265–1278
Ivanov VK (1963) On ill-posed problems. Math Sb 61(103) 2:211–223
Kaufman AA, Dashevsky YA (2003) Principles of induction logging. Elsevier, Amsterdam
Langer RE (1933) An inverse problem in differential equations. Am Math Soc Bull 39:814–820
Lv W-G, Chu Z-T, Zhao X-Q, Fan Y-X, Song R-L, Han W (2009) Simulation of electromagnetic wave logging response in deviated wells based on vector finite element method. Chin Phys Lett 26:014102
Mukanova B (2009) An inverse resistivity problem: 1. Lipschitz continuity of the gradient of the objective functional. Appl Anal 88:749–765
Mukanova B (2012) A numerical solution to the well resistivity-sounding problem in the axisymmetric case. Inverse Probl Sci Eng. doi:10.1080/17415977.2012.727085. Taylor & Francis
Mukanova B, Orunkhanov M (2010) Inverse resistivity problem: geoelectric uncertainty principle and numerical reconstruction method. Math Comput Simul 80:2091–2108
Onegova EV, Epov MI (2011) 3D simulation of transient electromagnetic field for geosteering horizontal wells. Russ Geol Geophys 52(7):725–729
Pekeris SL (1940) Direct method of interpretation in resistivity prospecting. Geophysics 5(1):31–42
Pen'kovskii VI, Korsakova NK (2010) The new method of data interpretation of well electromagnetic sounding. Inverse Probl Sci Eng 18(7):983–995
Peng Y-J (1997) An inverse problem in petroleum exploitation. Inverse Probl 13:1533
Slichter LV (1933) The interpretation of resistivity prospecting method for horisontal structures. Physics 4:307–311
Spichak V (2011) Electromagnetic sounding of the Earth's Interior. Elsevier, Amsterdam
Stefanescu SS, Shlumberger C (1930) Sur la distribution electrique potencielle dans une terrain a couches horizontals, homogenes et isotropes. J Phys Radium 7:132–141
Sylvester J (2000) Layer stripping. In: Colton D et al (eds) Surveys on solution methods for inverse problems. Springer, New York
Tabarovsky LA, Bear DR, Mezzatesta A (1994) Induction logging: resolution analysis and optimal tool design using block spectrum analysis. In: SPWLA 35th Annual Logging Symposium, 19–22 June, Tulsa, Oklahoma, Society of Petrophysicists & Well Log Analysts, pp 1–19
Tikhonov AN (1949) About uniqueness of geoelectrics problem solution. Dokl Acad Sci USSR 69(6):797–800
Tikhonov A, Arsenin V (1977) Solution of ill-posed problems. Wiley, New York
Vasil'ev FP (1981) Methods for solving extremal problems. Nauka, Moscow
Identification of Current Sources in 3D Electrostatics Aron Sommer, Andreas Helfrich-Schkarbanenko, and Vincent Heuveline
Contents
1 Introduction
2 Direct Problem
3 Inverse Problem
3.1 Preliminary
3.2 Main Result
4 Numerical Implementation
4.1 Direct Problem
4.2 Inverse Problem
5 Numerical Examples
5.1 Noise in the Data
5.2 Real-Life Data
6 Summary and Outlook
References

A. Sommer
Institut für Informationsverarbeitung (TNT), Leibniz Universität Hannover, Hannover, Germany
A. Helfrich-Schkarbanenko
Institute for Applied and Numerical Mathematics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
V. Heuveline
Engineering Mathematics and Computing Lab (EMCL), Karlsruhe Institute of Technology, Karlsruhe, Germany
Institute for Applied and Numerical Mathematics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
University of Heidelberg, Interdisciplinary Center for Scientific Computing, Engineering Mathematics and Computing Lab, Heidelberg, Germany
© Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_85
Abstract
Motivated by passive airborne geoexploration, we consider a source identification problem. This problem setting arises in electrostatics, and it turns out to be a linear, ill-posed inverse problem. After developing a theoretical framework for the corresponding elliptic forward problem, we illustrate an approach for reconstructing current sources from local electric potential data. A pseudo-solution is obtained by means of Tikhonov regularization. The performance of the method is shown by three-dimensional synthetic and real-life numerical examples. For numerical modeling, we choose the finite element method provided by COMSOL Multiphysics and use MATLAB for developing a reconstruction algorithm.
1
Introduction
In general, inverse problems arise in many branches of mathematics and science including tomography, nondestructive testing, physics, geophysics, and many other fields. Their objective is the conversion of observed measurements into information about a (physical) system that we are interested in (Kirsch 1996; Rieder 2003). The inverse problem considers the "inverse" to the forward problem, which relates the system parameters to the data that we observe. Inverse problems can be classified as linear or non-linear, and each of these classes is in turn divided into ill-posed and well-posed problems. Inverse problems are mostly ill-posed in the sense of Hadamard (1915), who introduced this term w.r.t. ordinary differential equations. That means that at least one of the following three properties is violated: existence, uniqueness, or stability of the solution. It is this that makes inverse problems challenging and mathematically interesting. This class of problems had a profound influence on mathematics and led to the founding of a new field of study, the calculus of variations. Furthermore, inverse problems have led to major physical advances, perhaps the most popular of which was the discovery of the planet Neptune after predictions made by Le Verrier and Adams on the basis of inverse perturbation theory (Groetsch 1993). We investigate a source identification problem (or inverse source problem) which arises in electrostatics and is motivated by passive airborne geoexploration. It turns out to be a linear and ill-posed inverse problem (Sommer 2012). By source identification we mean a reconstruction of the electric current density based on some local electric potential measurements for a given electrical conductivity. The term passive exploration means that the investigated physical system contains its own excitation. In contrast, an active exploration implies a measurement methodology which contains an excitation device.
The reconstruction of electrical conductivity based on some measurements is called a parameter identification problem. It is a non-linear, ill-posed problem (Helfrich-Schkarbanenko 2011), and it is not an objective of this work. In the following, we briefly introduce the physical system and the inverse problem we are interested in. A hydrocarbon reservoir excited by weak seismic activity generates characteristic electromagnetic waves which propagate through the medium. Here, the shape of the reservoir coincides with the support of the current source density. We aim to reconstruct this support based on local measurements of the electric field in the air.

Fig. 1 Scenario for the problem

Let $\Omega \subset \mathbb{R}^3$ be a bounded domain representing the ground and the air layer, cf. Fig. 1. By $\sigma$ we denote the electrical conductivity, by $u$ the electric potential, and $f := -\nabla\cdot J_b$ defines the density of the electric current stimulating $u$, where $J_b$ is the bound current density with $[J_b]_{\mathrm{SI}} = \mathrm{A\,m^{-2}}$ (Korovkin et al. 2007, Chap. 1.3). Starting with the Maxwell equations and assuming a time-harmonic, low-frequency regime, cf. Sommer (2012), we obtain the Laplace equation

$$-\nabla\cdot(\sigma\nabla u) = f \quad \text{in } \Omega. \tag{1}$$

The Robin boundary condition on $\partial\Omega$ completes (1) to an elliptic boundary value problem, which is the starting point of our investigation. After turning it into a weak formulation, we apply the finite element method (FEM) for numerical modeling of the forward problem. Facing the inverse problem, we have to reconstruct $f$ in $\Omega$ on the basis of some local electric potential data, assuming $\sigma$ is a given parameter. By local data we mean the restriction $u|_\Gamma$ onto a curve $\Gamma \subset \Omega$, which is a one-dimensional manifold, Fig. 1. The local character of the measurements leads to a non-injective forward operator, and thus the inverse problem is ill-posed, cf. the analytical example in Sect. 3. That means the forward operator cannot be inverted in general. However, applying Tikhonov regularization we achieve a unique pseudo-solution. In Hanke and Rundell (2011) the authors consider a related problem for $\sigma \equiv \mathrm{const}$ in $\Omega$ with the Cauchy data on $\partial\Omega$ in the two- and three-dimensional cases. They propose an algorithm for the determination of the support of $f$ by solving a simpler "equivalent point source problem." A corresponding full-space problem for homogeneous material parameters was solved for the electromagnetic field via multipole expansions (Marengo and Devaney 1999). For the minimum $L^2$ norm current source distribution, the authors assumed measurements on a sphere containing the support of the stimulation. The reconstruction of the current distribution $\sigma\nabla u$ in a bounded three-dimensional domain from its magnetic field observed on the boundary was considered in Kress (2002). Note that (1) models steady-state, diffusive phenomena with source term $f$. Thus, the results achieved in this work are applicable, for example, to steady-state heat problems (then, $u$ would be the temperature and $\sigma$ the thermal diffusion coefficient) or underground steady-state aquifers (then, $u$ would represent the hydraulic head and $\sigma$ the aquifer transmissivity coefficient), cf. Groetsch (1993, Sect. 3.5). This chapter is organized as follows: In Sect. 2, we consider the strong formulation of the forward problem and derive its weak formulation, which is fundamental for the FEM implementation. The ill-posedness of the inverse problem is demonstrated in Sect. 3. Furthermore, we set up a minimization problem equivalent to the Tikhonov regularized inverse problem and prove its unique solvability by applying a result of Tikhonov in combination with the Sobolev Embedding Theorem and a diffeomorphism. Section 4 covers the numerical implementation based on FEM, followed by Sect. 5 containing some 3D synthetic and real-life examples and numerical investigations.
2
Direct Problem
The analysis of the forward problem, i.e., the determination of $u$ for given $f$ in a conducting medium, is the first step in facing the corresponding inverse problem. Let $\Omega \subset \mathbb{R}^3$ be a bounded domain with $C^2$-boundary $\partial\Omega$, which ensures a regularity property of $u$, see Theorem 1. By $\nu$ we denote the outer unit normal vector on $\partial\Omega$. In general, we assume $\operatorname{supp}(f) \cap \Gamma = \emptyset$. Note that the model reduction from the Maxwell equations to the Laplace equation bounds the diameter $L$ of $\Omega$ by requiring

$$\mu\sigma\omega L^2\left(1 + \frac{\omega\varepsilon}{\sigma}\right) \ll 1,$$

where $\mu$ is the magnetic permeability and $\omega$ the maximum occurring frequency of the source term in the Maxwell model. For further details see Sommer (2012). The strong formulation of the direct problem is given as follows:

Problem 1 (Strong Formulation). Let $f \in C(\overline{\Omega})$ with $\operatorname{supp}(f) \subset \Omega$, $\sigma \in C^1(\Omega) \cap C(\overline{\Omega})$ and $g \in L^\infty_{>0}(\partial\Omega)$. Find $u \in C^2(\Omega) \cap C^1(\overline{\Omega})$ that fulfills the partial differential equations

$$-\nabla\cdot(\sigma\nabla u) = f \quad \text{in } \Omega,$$

$$\sigma\frac{\partial u}{\partial\nu} + gu = 0 \quad \text{on } \partial\Omega.$$

To tackle this problem numerically by FEM, we need its weak formulation. Hence, in the weak sense the electric potential has to be sought in a Sobolev space of an order reduced by one, i.e., $u \in H^1(\Omega)$. The Sobolev space $H^m(\Omega)$, see e.g., Brenner and Scott (1994), consists of all functions $u \in L^2(\Omega)$ such that for every multi-index $\alpha$ with $|\alpha| \le m$, the weak partial derivative $D^\alpha u$ belongs to $L^2(\Omega)$, i.e.,

$$H^m(\Omega) := \left\{u \in L^2(\Omega) : D^\alpha u \in L^2(\Omega)\ \ \forall\, |\alpha| \le m\right\}.$$
Multiplying each side of Laplace's equation by a test function $v \in H^1(\Omega)$, integrating over $\Omega$ and applying integration by parts yields the weak formulation of Problem 1. That representation is more convenient for numerical implementation (Braess 2003).

Problem 2 (Weak Formulation). Let $f \in L^2(\Omega)$ with $\operatorname{supp}(f) \subset \Omega$ and $\sigma$ as provided by Problem 1. Find $u \in H^1(\Omega)$ that fulfills the integral equation

$$\int_\Omega \sigma\nabla u\cdot\nabla v\,dx + \int_{\partial\Omega} guv\,ds = \int_\Omega fv\,dx \quad \text{for all } v \in H^1(\Omega). \tag{2}$$

Because of the measurement methodology, see the operator defined in (4), and the Sobolev Embedding Theorem 3, we need a stronger regularity for the inverse problem. Assuming some restrictions on $\sigma$, $g$, and the boundary $\partial\Omega$, the following regularity theorem prescribes the space for the solution $u$.

Theorem 1. Let $\Omega$ be a bounded $C^2$-domain. Assume $\sigma$ is Lipschitz in $\overline{\Omega}$ with $0 < C^{-1} < \sigma(x) < C$ for all $x \in \Omega$, $f \in L^2(\Omega)$ and $0 \le g(x) \le g_0$ a.e. on $\partial\Omega$. Then $u \in H^2(\Omega)$ and

$$\|u\|_{H^2(\Omega)} \le C\left(\|u\|_0 + \|f\|_0\right).$$

A general form of this theorem can be found in Gilbarg and Trudinger (2001, Chap. 8.4) and Salsa (2008, Thm. 8.13, 8.14). For this more regular case we denote the forward operator mapping a given source onto the electric potential by

$$\Lambda: L^2(\Omega) \to H^2(\Omega), \quad f \mapsto u.$$

Because of the properties of an integral operator, $\Lambda$ is a bounded linear operator (Sommer 2012). It represents the governing physics. So, the brief formulation of Problem 2 reads: For given $f$ solve the equation

$$\Lambda[f] = u. \tag{3}$$

The uniform ellipticity of the differential operator $-\nabla\cdot(\sigma\nabla)$ is provided by the condition (Salsa 2008, Chap. 8.5 (8.46)) 0
0 and concentric layers, see Fig. 3. The ball $B_c(0)$ represents the ground and $\Omega\setminus B_c(0)$ the air layer. Inside each layer the functions $\sigma$ and $f$ are constant, and in particular $\sigma \equiv 1$ in $B_c(0)$. Here, since $\sigma$ is piecewise constant, the original Strong Formulation 1 has to be reformulated as a Transmission Problem, see Sommer (2012) for details. Then, because of the fundamental solution of Laplace's equation (Evans 2008, Thm. 2.2.1) and the simplicity of $f$ and $\sigma$, the potentials $u_1$ and $u_2$ can be represented explicitly as follows:

$$u_1(x) = \begin{cases}
\dfrac{4}{3|x|}\left(b_1^3 - a_1^3\right), & \text{for } c < |x| \le R,\\[2mm]
\dfrac{1}{3|x|}\left(b_1^3 - a_1^3\right) + \dfrac{1}{c}\left(b_1^3 - a_1^3\right), & \text{for } b_1 \le |x| \le c,\\[2mm]
\dfrac{1}{2}\left(b_1^2 - \dfrac{2a_1^3}{3|x|} - \dfrac{|x|^2}{3}\right) + \dfrac{1}{c}\left(b_1^3 - a_1^3\right), & \text{for } a_1 < |x| < b_1,\\[2mm]
\dfrac{1}{2}\left(b_1^2 - a_1^2\right) + \dfrac{1}{c}\left(b_1^3 - a_1^3\right), & \text{for } 0 \le |x| \le a_1,
\end{cases}$$

$$u_2(x) = \begin{cases}
\dfrac{8}{3|x|}\left(b_2^3 - a_2^3\right), & \text{for } c < |x| \le R,\\[2mm]
\dfrac{2}{3|x|}\left(b_2^3 - a_2^3\right) + \dfrac{2}{c}\left(b_2^3 - a_2^3\right), & \text{for } b_2 \le |x| \le c,\\[2mm]
b_2^2 - \dfrac{2a_2^3}{3|x|} - \dfrac{|x|^2}{3} + \dfrac{2}{c}\left(b_2^3 - a_2^3\right), & \text{for } a_2 < |x| < b_2,\\[2mm]
b_2^2 - a_2^2 + \dfrac{2}{c}\left(b_2^3 - a_2^3\right), & \text{for } 0 \le |x| \le a_2.
\end{cases}$$

Here the parameters $a_1$, $b_1$, $a_2$, and $b_2$ are still arbitrary with $0 < a_1 < b_1 < c < R < \infty$ and $0 < a_2 < b_2 < c$. To achieve the identity $u_1 \equiv u_2$ in the air layer we have to set

$$b_1^3 - a_1^3 = 2\left(b_2^3 - a_2^3\right).$$

Obviously we obtain the non-injectivity of the operator $T\Lambda$ for local measurements, since for $c \le |x| \le R$ we have the identity

$$u_1(|x|) = T\Lambda[f_1](x) = T\Lambda[f_2](x) = u_2(|x|),$$

while $f_1 \not\equiv f_2$, see Fig. 4. This result leads to the following lemma.

Lemma 1. $T\Lambda$ is non-injective.

Fig. 2 Domains, co-domains and operators acting in Inverse Problem 3: $\Lambda: L^2(\Omega) \to H^2(\Omega)$, $T: H^2(\Omega) \to L^2(\Gamma)$, and $(T\Lambda)^{-1}$ from $L^2(\Gamma)$ back to $L^2(\Omega)$. Note that $\Omega \subset \mathbb{R}^3$ and $\Gamma$ is a curve in $\Omega$

Fig. 3 Two different current density sources, see black layers, generating the same potential field $u$ in the air layer. (a) $f_1 \equiv 1$ in $B_{b_1}(0)\setminus B_{a_1}(0)$ and $f_1 \equiv 0$ else. (b) $f_2 \equiv 2$ in $B_{b_2}(0)\setminus B_{a_2}(0)$ and $f_2 \equiv 0$ else

Fig. 4 Analytic solutions $u_1$ and $u_2$ of the radially symmetric problem generated by two different current sources. Chosen parameters: $a_1 = 1$, $b_1 = a_2 = 4$, $b_2 = \sqrt[3]{\tfrac{1}{2}\left(b_1^3 - a_1^3\right) + a_2^3}$, $c = 7$ and $R = 10$
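The non-injectivity can be checked numerically with the parameters of Fig. 4. The short sketch below evaluates only the air-layer branches of $u_1$ and $u_2$; the function names and the sample radii are assumptions made for illustration.

```python
def u1_air(x, a1, b1):
    """Air-layer branch of u_1 (source f_1 = 1 between radii a1 and b1)."""
    return 4.0 * (b1 ** 3 - a1 ** 3) / (3.0 * x)

def u2_air(x, a2, b2):
    """Air-layer branch of u_2 (source f_2 = 2 between radii a2 and b2)."""
    return 8.0 * (b2 ** 3 - a2 ** 3) / (3.0 * x)

# Parameters of Fig. 4.
a1, b1, a2, c, R = 1.0, 4.0, 4.0, 7.0, 10.0
b2 = (0.5 * (b1 ** 3 - a1 ** 3) + a2 ** 3) ** (1.0 / 3.0)   # enforces b1^3 - a1^3 = 2 (b2^3 - a2^3)

radii = [7.5, 8.0, 9.0, 10.0]                                # sample points in the air layer
differences = [abs(u1_air(x, a1, b1) - u2_air(x, a2, b2)) for x in radii]
print(b2, differences)
```

The two sources occupy different shells and have different strengths, yet every measurement taken in the air layer is identical, which is exactly the statement of Lemma 1.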
We show that a solution of the Inverse Problem 3 exists, but it is not unique in general. In Devaney and Sherman (1982) the fields radiated by spherically symmetric time-harmonic sources satisfying the Helmholtz equation in a homogeneous medium are used to illustrate how little can be learned about a source from knowledge of the radiated field outside of the source volume. Note that the corresponding discrete version of Problem 3 is strongly underdetermined (Sommer 2012). However, applying the Tikhonov regularization (Tikhonov 1963), which was independently developed by Phillips (1962) as well, we can enforce the uniqueness of a pseudo-solution. The idea of Tikhonov consists of perturbing an operator via a spectral shift to enforce the uniqueness of a pseudo-solution (Rieder 2003). Doing so, we obtain the following Regularized Inverse Problem 4 for local measurements, which plays the central role in this chapter.

Problem 4 (Regularized Inverse Problem). Let $u_d \in L^2(\Gamma)$ be the local measurements on a curve $\Gamma$. Solve the minimization problem

\[
\operatorname*{arg\,min}_{f \in L^2(\Omega)} \Bigl\{ \tfrac{1}{2}\,\|T_\Gamma \Lambda[f] - u_d\|^2_{L^2(\Gamma)} + \tfrac{\alpha}{2}\,\|L[f]\|^2_{L^2(\Omega)} \Bigr\} \tag{5}
\]

for a bounded linear operator $L: L^2(\Omega) \to L^2(\Omega)$, which is continuously invertible on its image, and $\alpha > 0$. We denote the solution by $f_{L,\alpha}$.
That is, we seek an $f$ that minimizes the data error $\|T_\Gamma \Lambda[f] - u_d\|^2_{L^2(\Gamma)}$ while simultaneously taking into account a priori information about the true solution, which is encoded by $L$.
3.1 Preliminary
Before we consider the unique solvability of Problem 4, we recall some definitions from differential geometry, as well as the cone property of a domain.

Definition 1 (Toponogov 2006, Def. 1.2.1). A regular $k$-fold continuously differentiable curve (or path) $\Gamma$ in the space $\mathbb{R}^3$ is described by a homeomorphism (parametrization) $\varphi: I \to \mathbb{R}^3$, where $I := [a, b]$ is the parameter interval, satisfying the following conditions:
1. $\varphi \in C^k$, $k \ge 1$;
2. The rank of $\varphi$ is maximal (equal to 1).

This definition excludes any self-intersections of $\Gamma$, and $\Gamma \setminus \{\varphi(a), \varphi(b)\}$ is a one-dimensional submanifold. Note that a regular curve of class $C^k$, $k \ge 1$, is diffeomorphic to a line segment (Toponogov 2006). In a rectangular Cartesian coordinate system $\Gamma$ is determined by its so-called parametric functions $\varphi(t) := (x(t), y(t), z(t))^\top$, where $t \in [a, b]$. The first condition in Definition 1 means that $x$, $y$ and $z$ belong to class $C^k$, and the second condition means that the derivatives $x'$, $y'$, $z'$ cannot vanish simultaneously for any $t$.

Definition 2 (Adams and Fournier 2003, Def. 4.6). The domain $\Omega$ satisfies the cone condition if there exists a finite cone $C$ such that each $x \in \Omega$ is the vertex of a finite cone $C_x$ contained in $\Omega$ and congruent to $C$.

For the unique solvability of the Regularized Inverse Problem 4 we apply the following Theorem, Remark and Sobolev embedding theorem. First we introduce the Tikhonov functional

\[
J_{L,\alpha}(x) := \tfrac{1}{2}\,\|Kx - y\|_Y^2 + \tfrac{\alpha}{2}\,\|Lx\|_Z^2 \quad \text{for } x \in X
\]

for a given bounded linear operator $K: X \to Y$ and $y \in Y$. So, the term in (5) is exactly of that form. The following Theorem establishes the relation between that functional and the corresponding normal equation.

Theorem 2. Let both $K: X \to Y$ and $L: X \to Z$ be bounded linear operators between Hilbert spaces and $K^*$ the adjoint operator to $K$. If $\alpha > 0$ and $L$ is continuously invertible on its image, then the Tikhonov functional $J_{L,\alpha}$ has a
unique minimum $x_{L,\alpha} \in X$. This minimum $x_{L,\alpha}$ is the unique solution of the normal equation

\[ (K^* K + \alpha L^* L)\, x = K^* y. \]

So, solving the minimization problem (5) is equivalent to dealing with the corresponding normal equation. Note that $\alpha L^* L$ in (5) shifts the nonnegative eigenvalues of $K^* K$ away from zero and makes $K^* K + \alpha L^* L$ invertible.

Remark 8. For the case that $L$ is the identity operator, the proof of Theorem 2 can be found in Kirsch (1996, Thm. 2.11). If $L$ is continuously invertible on its image, then there exists $\beta > 0$ with

\[ \beta\,\|x\|_X \le \|Lx\|_Z \quad \text{for all } x \in X. \]

According to Rieder (2003, Thm. 8.1.15) this estimate is sufficient to extend the proof in Kirsch (1996, Thm. 2.11) to the general case given in Theorem 2. Note that the solution $x_{L,\alpha} \in X$ is unique and depends continuously on $y \in Y$.

Because of the measurement methodology we apply the Sobolev Embedding Theorem and generalize the results via a diffeomorphism.

Theorem 3 (Sobolev Embedding Theorem, Adams and Fournier 2003, Thm. 4.12). Let $\Omega \subset \mathbb{R}^n$ satisfy the cone condition and let $\Omega_k$ be a $k$-dimensional domain generated by the intersection of $\Omega$ with a $k$-dimensional plane in $\mathbb{R}^n$, where $1 \le k \le n$. If one of the two requirements
1. $2m < n$ and $n - 2m < k$, or
2. $2m = n$
holds for $j, m \in \mathbb{N}$ with $j \ge 0$ and $m \ge 0$, then the following embedding exists:

\[ H^{j+m}(\Omega) \hookrightarrow H^j(\Omega_k). \]

Theorem 4 (Dobrowolski 2006, Thm. 6.8). Let $x \in \Omega$, $y \in \Omega'$ and $d: \Omega' \to \Omega$ be a $C^m$-diffeomorphism, so that $Du(y) = u(d(y))$ is the transformation of the function $u: \Omega \to \mathbb{R}$. Then the map $D: H^m(\Omega) \to H^m(\Omega')$ is bijective and bounded with bounded inverse, which means that

\[ c_1\,\|u\|_{m,\Omega} \le \|Du\|_{m,\Omega'} \le c_2\,\|u\|_{m,\Omega}. \]

Theorem 5 (Transformation Formula, Königsberger 2004, Thm. 9.1). Let $\Omega$, $\Omega'$ be open subsets of $\mathbb{R}^n$ and $d: \Omega' \to \Omega$ an invertible diffeomorphism. Then
the function $u: \Omega \to \mathbb{R}$ is integrable if and only if $y \mapsto u(d(y))\,|\det(\operatorname{grad}(d))|$ is integrable. In this case the following equation holds:

\[
\int_{d(\Omega')} u(x)\, dx = \int_{\Omega'} u(d(y))\, |\det(\operatorname{grad}(d(y)))|\, dy.
\]

The term $\det(\operatorname{grad}(d(y))) =: J(y)$ is called the Jacobian or functional determinant of $d$. Now we can present the main result of this chapter.
3.2 Main Result
Theorem 6. Let $\Gamma$ be a regular curve of class $C^2$. Then the regularized inverse problem 4 has a unique pseudo-solution $f_{L,\alpha} \in L^2(\Omega)$.

Remark 9. Deriving the weak formulation 2 requires less regularity on $\partial\Omega$ than $C^2$, namely $\Omega$ has to be a Lipschitz domain, see Steinbach (2008, Def. 2.1). Furthermore, Theorem 3 assumes a domain that fulfills the cone condition, see Definition 2. We emphasize that a $C^2$ boundary of a bounded domain satisfies the Lipschitz condition, cf. Steinbach (2008, Def. 2.1), and every bounded Lipschitz domain satisfies the cone condition (Adams and Fournier 2003, Chap. 4). If $\Gamma$ does not start and end on the boundary $\partial\Omega$, $\Gamma$ can be extended to another curve which has this property. Then Theorem 6 is applicable again.

Proof. In the first part of the proof we show that the linear restriction operator $T_\Gamma$, defined in (4), is well defined. If $\Gamma$ is a line segment, the proof consists of applying the Sobolev Embedding Theorem 3 (valid only for hyperplanes) and the Remark in the preliminary. To show that $T_\Gamma$ is well defined for general $\Gamma$, we apply a smooth diffeomorphism $d$ with the properties given by Theorems 4 and 5 to map $\Gamma$ diffeomorphically to a line segment and then apply the Sobolev Embedding Theorem 3. The second part of the proof consists in showing the uniqueness of the pseudo-solution $f_{L,\alpha}$ by using the well-known Theorem 2 for the Tikhonov-Phillips regularization and the attached Remark.

Part 1. Without loss of generality we choose $d$ as an automorphism, such that $\Omega = \Omega'$. For the sake of simplicity let $\Gamma$ be a curve whose starting and end points lie on the boundary $\partial\Omega$. The preimage of the curve $\Gamma$ under a $C^2$-diffeomorphism $d: \Omega' \to \Omega$ has to be a line segment $\Gamma'$ which starts and ends on the boundary $\partial\Omega'$, as well. For clarification see Fig. 5. By assumption $u \in H^2(\Omega)$ holds. By Theorem 4 there exists an operator $D: H^2(\Omega) \to H^2(\Omega')$, which depends (in general nonlinearly) on $d$ and has a
Fig. 5 Diffeomorphism $d$ applied on $\Omega'$ and on $\Gamma'$, respectively
bounded inverse. The connection between $D$ and $d$ is, cf. Dobrowolski (2006, Sect. 6.3):

\[ Du(y) = u(d(y)) = u(x), \]

where $x \in \Omega$, $y \in \Omega'$ and hence $Du \in H^2(\Omega')$ holds. Since $\Omega = \Omega'$, the regularity property of $\partial\Omega$ transfers onto $\partial\Omega'$. Now, applying the Sobolev Embedding Theorem 3 yields

\[ H^2(\Omega') \hookrightarrow H^0(\Gamma') = L^2(\Gamma'). \tag{6} \]

This embedding can be described by a bounded linear (embedding) operator $T_{\Gamma'}: H^2(\Omega') \to L^2(\Gamma')$ such that $T_{\Gamma'} Du(y) = Du(y)$ for all $y \in \Gamma'$ holds. Consequently $T_{\Gamma'}$ is well defined and $(Du)|_{\Gamma'} = T_{\Gamma'} Du \in L^2(\Gamma')$. Now we have to show that $u|_\Gamma = T_\Gamma u \in L^2(\Gamma)$ holds w.r.t. the given diffeomorphism $d$. For a $C^2$-diffeomorphism $d$ the estimate

\[ 0 < |J(y)| < c \tag{7} \]

holds for all $y \in \Omega'$ by Dobrowolski (2006, Chap. 6, p. 100). Thus, the lengths of $\Gamma'$ and $\Gamma$ are finite. This implies the following estimate.
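\[
\begin{aligned}
\|u\|^2_{L^2(\Gamma)} &= \int_{\Gamma} |u(x)|^2\, dx = \int_{d(\Gamma')} |u(x)|^2\, dx \\
&\overset{\text{Theorem 5}}{=} \int_{\Gamma'} |u(d(y))|^2\, |\det(\operatorname{grad}(d(y)))|\, dy = \int_{\Gamma'} |u(d(y))|^2\, |J(y)|\, dy \\
&\overset{(7)}{\le} c \int_{\Gamma'} |Du(y)|^2\, dy < \infty.
\end{aligned}
\]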
The last estimate follows from $(Du)|_{\Gamma'} \in L^2(\Gamma')$ due to the Sobolev Embedding Theorem 3. Hence $u|_\Gamma \in L^2(\Gamma)$, and the operator $T_\Gamma: H^2(\Omega) \to L^2(\Gamma)$ is well defined for $C^2$-smooth curves $\Gamma$.

Part 2. Since $T_\Gamma$ and $\Lambda$ are bounded linear operators, their composition $T_\Gamma \Lambda: L^2(\Omega) \to L^2(\Gamma)$ is also a bounded linear operator with the estimate $\|T_\Gamma \Lambda\| \le \|T_\Gamma\|\,\|\Lambda\|$, so that Theorem 2 can be applied. Because $L$ is assumed to be continuously invertible, the operator $L^* L$ has non-negative eigenvalues and the operator $\Lambda^* T_\Gamma^* T_\Gamma \Lambda + \alpha L^* L$ is continuously invertible for $\alpha > 0$, as well (Rieder 2003, Chap. 4). Thus, according to Theorem 2, the unique pseudo-solution $f_{L,\alpha} \in L^2(\Omega)$ of the regularized inverse Problem 4 can be computed for fixed $L$ and fixed $\alpha$ by

\[ f_{L,\alpha} = (\Lambda^* T_\Gamma^* T_\Gamma \Lambda + \alpha L^* L)^{-1} \Lambda^* T_\Gamma^* [u_d], \tag{8} \]

which completes the proof. $\square$
Remark 10. The first parameter $L$ of the pseudo-solution $f_{L,\alpha}$ can be chosen according to the problem setting. We will touch on this topic in Sect. 4.2. The determination of an optimal $\alpha$ turns out to be a challenging task, in particular w.r.t. real-life scenarios and data. For this purpose we apply the L-curve criterion, see Sect. 4.2.
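The effect of the spectral shift in Theorem 2 can be seen on a very small example. The sketch below is only an illustration under stated assumptions: it uses a hypothetical underdetermined matrix $K$ with one measurement and two unknowns, takes $L$ as the identity, and solves the normal equation $(K^\top K + \alpha L^\top L)x = K^\top y$ with a plain Gaussian elimination; the helper names are not taken from the chapter.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def gram(m):
    """Return the Gram matrix m^T m."""
    cols = len(m[0])
    return [[sum(row[i] * row[j] for row in m) for j in range(cols)]
            for i in range(cols)]


def tikhonov(k, l, y, alpha):
    """Solve (K^T K + alpha L^T L) x = K^T y for the pseudo-solution x."""
    ktk, ltl = gram(k), gram(l)
    n = len(ktk)
    lhs = [[ktk[i][j] + alpha * ltl[i][j] for j in range(n)] for i in range(n)]
    rhs = [sum(k[r][i] * y[r] for r in range(len(k))) for i in range(n)]
    return solve_linear(lhs, rhs)


# One measurement, two unknowns: K^T K alone is singular (toy data).
K = [[1.0, 1.0]]
L = [[1.0, 0.0], [0.0, 1.0]]
y = [2.0]
x = tikhonov(K, L, y, alpha=0.5)
print(x)
```

Without the term $\alpha L^\top L$ the system matrix would be singular, since every $x$ with $x_1 + x_2 = 2$ fits the data; with $\alpha > 0$ the solver returns a single pseudo-solution.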
4 Numerical Implementation
After the theoretical framework, we turn to the numerical implementation of both the forward and the inverse problem. Due to the Céa lemma (Brenner and Scott 1994) we can discretize the forward weak problem, for
example via FEM by means of Galerkin (Braess 2003), without disturbing the unique solvability. The Sobolev space $H^1(\Omega)$ is modeled by a finite dimensional subspace $H_h^1(\Omega)$ consisting of $n$ basis functions $\{\varphi_i\}_{i=1}^n$, which are linear finite elements in this work. That means the direct and the inverse problem each have $n$ degrees of freedom. In Sects. 2 and 3 we assumed $\Omega$ to have a $C^2$ boundary. We justify the modeling of $\Omega$ by means of tetrahedra since the boundary of a Lipschitz domain $\Omega_L$ can be approximated arbitrarily closely by a $C^2$ boundary via the convolution

\[ \Omega := \Omega_\varepsilon = \Omega_L * \varphi_\varepsilon, \]

where $\varphi_\varepsilon$ is a mollifier satisfying $\lim_{\varepsilon \to 0} \varphi_\varepsilon(x) = \delta(x)$ with $\delta$ the Dirac impulse in $\mathbb{R}^3$ (Adams and Fournier 2003, Chap. 2). In particular, the approximation of identity

\[ \lim_{\varepsilon \to 0} \Omega_\varepsilon = \lim_{\varepsilon \to 0} \Omega_L * \varphi_\varepsilon = \Omega_L \]

and the inclusion

\[ \operatorname{supp} \Omega_\varepsilon = \operatorname{supp}(\Omega_L * \varphi_\varepsilon) \subset \operatorname{supp} \Omega_L \oplus \operatorname{supp} \varphi_\varepsilon \]

hold, where $\oplus$ indicates the Minkowski addition. In the following we emphasize by bold letters that the numerical linear operators are matrices and the discrete functions are represented by vectors. A detailed procedure can be found in Sommer (2012).
4.1 Direct Problem
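Let $H_h^1(\Omega)$ be the finite dimensional subspace of $H^1(\Omega)$. The weak formulation 2 of the Forward Problem 1 is given in matrix notation by

\[ \mathbf{A}\mathbf{u} = \mathbf{M}\mathbf{f} \quad \text{for all } \varphi_k \in H_h^1(\Omega), \]

where $\mathbf{A} \in \mathbb{R}^{n \times n}$ is the stiffness matrix with components

\[ \mathbf{A}_{i,j} = \int_\Omega \sigma\, \nabla\varphi_i \cdot \nabla\varphi_j\, dx + \int_{\partial\Omega} g\, \varphi_i \varphi_j\, ds, \tag{9} \]

$\mathbf{M} \in \mathbb{R}^{n \times n}$ is the mass matrix with entries

\[ \mathbf{M}_{i,j} = \int_\Omega \varphi_i \varphi_j\, dx, \tag{10} \]

$\mathbf{u} \in \mathbb{R}^n$ is the nodal electric potential vector and $\mathbf{f} \in \mathbb{R}^n$ is the nodal current density source vector. Taking $\mathbf{u} = (u_1, \ldots, u_n)^\top \in \mathbb{R}^n$ we show that $\mathbf{A}$ is a positive definite matrix:

\[
\begin{aligned}
\mathbf{u}^\top \mathbf{A} \mathbf{u}
&= \sum_{i=1}^n \sum_{j=1}^n u_i \Bigl( \int_\Omega \sigma\, \nabla\varphi_i \cdot \nabla\varphi_j\, dx + \int_{\partial\Omega} g\, \varphi_i \varphi_j\, ds \Bigr) u_j \\
&= \int_\Omega \sigma \sum_{i=1}^n \nabla(u_i \varphi_i) \cdot \sum_{j=1}^n \nabla(u_j \varphi_j)\, dx + \int_{\partial\Omega} g \sum_{i=1}^n u_i \varphi_i \sum_{j=1}^n u_j \varphi_j\, ds \\
&= \int_\Omega \sigma\, |\nabla u_h|^2\, dx + \int_{\partial\Omega} g\, |u_h|^2\, ds \ge 0,
\end{aligned}
\]

where $u_h := \sum_{j=1}^n u_j \varphi_j \in H_h^1(\Omega)$. Since $\sigma, g > 0$ a.e., bounded and measurable, equality holds if and only if $\nabla u_h \equiv 0$ in $\Omega$ and $u_h \equiv 0$ on $\partial\Omega$. Hence, $\mathbf{u}^\top \mathbf{A} \mathbf{u}$ is zero only if $\mathbf{u} \equiv 0$, that is, $\mathbf{u}^\top \mathbf{A} \mathbf{u} > 0$ whenever $\mathbf{u} \ne 0$. That means the stiffness matrix $\mathbf{A}$ is positive definite and thus invertible. Based on Eq. (9) we can now set up the discrete forward operator by

\[ \boldsymbol{\Lambda} := \mathbf{A}^{-1} \mathbf{M}. \]

The existence of $\mathbf{A}^{-1}$ and its positive definiteness follow from the positive definiteness of $\mathbf{A}$ as well as from the Céa lemma (Brenner and Scott 1994). Analogously to the arguments shown above, the mass matrix $\mathbf{M}$ is symmetric and positive definite, too. Hence, the product $\boldsymbol{\Lambda}$ is invertible, but in general neither positive definite nor symmetric.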
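The role of the boundary term in (9) can be illustrated in one space dimension. The following sketch is not the three-dimensional tetrahedral discretization used in the chapter; it assumes linear elements on the unit interval, a constant conductivity `sigma` and a constant boundary coefficient `g` at both end points, and all names are illustrative.

```python
def assemble_1d(n, sigma, g):
    """Stiffness matrix A and mass matrix M for n linear elements on (0, 1)."""
    h = 1.0 / n
    size = n + 1
    A = [[0.0] * size for _ in range(size)]
    M = [[0.0] * size for _ in range(size)]
    for e in range(n):                      # element between nodes e and e+1
        for i in (0, 1):
            for j in (0, 1):
                A[e + i][e + j] += sigma / h * (1.0 if i == j else -1.0)
                M[e + i][e + j] += h / 6.0 * (2.0 if i == j else 1.0)
    A[0][0] += g                            # boundary term at x = 0
    A[n][n] += g                            # boundary term at x = 1
    return A, M


def quad_form(mat, v):
    """Evaluate the quadratic form v^T mat v."""
    size = len(v)
    return sum(v[i] * mat[i][j] * v[j] for i in range(size) for j in range(size))


A, M = assemble_1d(4, sigma=2.0, g=0.5)
ones = [1.0] * 5
print("constant vector, u^T A u =", quad_form(A, ones))
print("constant vector, u^T M u =", quad_form(M, ones))
```

For the constant vector the gradient part vanishes, so $\mathbf{u}^\top \mathbf{A}\mathbf{u}$ consists of the boundary contribution only; with $g = 0$ it would be zero and $\mathbf{A}$ would be singular.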
4.2 Inverse Problem
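The set $\Gamma_h$ of measuring grid points,

\[ \Gamma_h := \{ p_i \mid p_i \text{ is a measuring grid point},\ i = 1, \ldots, k \}, \]

is a discrete FE model for the path $\Gamma$. We start with the implementation of the operator $T_\Gamma$ given in (4). Let $\mathbf{T} \in \mathbb{R}^{k \times n}$, where $k \le n$ is the number of measuring grid points. Its components are

\[
\mathbf{T}_{i,j} := \begin{cases}
1, & \text{if the measuring grid point } i \text{ has the global grid point number } j,\\
0, & \text{else},
\end{cases}
\]

such that

\[ \mathbf{u}_d := \mathbf{u}|_{\Gamma_h} = \mathbf{T} \boldsymbol{\Lambda} \mathbf{f}. \tag{11} \]

Applying (8) the pseudo-solution $\mathbf{f}_{L,\alpha}$ can be computed by

\[ \mathbf{f}_{L,\alpha} = \bigl( \boldsymbol{\Lambda}^\top \mathbf{T}^\top \mathbf{T} \boldsymbol{\Lambda} + \alpha \mathbf{L}^\top \mathbf{L} \bigr)^{-1} \boldsymbol{\Lambda}^\top \mathbf{T}^\top \mathbf{u}_d, \quad \alpha > 0, \tag{12} \]

for experimental data $\mathbf{u}_d$ or synthetic measurements $\mathbf{u}_d$ generated by (11). In general the pseudo-solution $\mathbf{f}_{L,\alpha}$ depends on the penalty operator $\mathbf{L}$ and on the regularization parameter $\alpha$. To find the optimal $\alpha$ we apply the L-curve criterion (Hansen 1998, Chap. 4.6), which is a heuristic parameter choice rule. It examines the graph

\[ \bigl( \|T_\Gamma \Lambda[f_{L,\alpha}] - u_d\|^2_{L^2(\Gamma)},\ \|L[f_{L,\alpha}]\|^2_{L^2(\Omega)} \bigr) \]

as a function of $\alpha \in [\alpha_{\min}, \alpha_{\max}]$ and identifies the optimal $\alpha$ as the one of maximal curvature. For the discretized problem the L-curve consists of a finite set of points

\[ \bigl( \|\mathbf{T} \boldsymbol{\Lambda} \mathbf{f}_{L,\alpha_i} - \mathbf{u}_d\|_2^2,\ \|\mathbf{L} \mathbf{f}_{L,\alpha_i}\|_2^2 \bigr), \quad i = 1, \ldots, m. \]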
Numerical experiments show that the penalty operator $\mathbf{L}$ has a strong influence on the pseudo-solution $\mathbf{f}_{L,\alpha}$. Some possible L-matrices are listed in Table 1. If the vertical position of the current source support is available, we can formulate a very effective penalty operator $\mathbf{D}$. It forces the pseudo-solution to be damped above a chosen depth $z_c$ and to vanish above $z_s$, so that its support has to lie below the depth value $z_s$. Let $p_i^z$ be the $z$-coordinate of the $i$-th grid point $p_i$; then we choose a diagonal matrix $\mathbf{D}$ with the matrix elements

\[
\mathbf{D}_{i,i} := \begin{cases}
\beta, & \text{if } p_i^z \ge z_s,\\[0.5ex]
-\dfrac{\beta - 1}{(z_s - z_c)^2}\,(p_i^z - z_s)^2 + \beta, & \text{if } z_c < p_i^z < z_s,\\[0.5ex]
1, & \text{if } p_i^z \le z_c,
\end{cases}
\tag{13}
\]

where $z_s$ is the stop band depth, $z_c$ is the cutoff depth and $\beta$ is the damping factor. We choose these matrix entries of $\mathbf{D}$ to represent a continuous monotonic damping function depending on the depth of each grid point. Note that all eigenvalues of $\mathbf{D}$ are greater than or equal to one.

Table 1 A set of L-matrices; $\mathbf{I} \in \mathbb{R}^{n \times n}$ is the identity matrix. $\mathbf{M} = \langle \varphi_i, \varphi_j \rangle_{L^2(\Omega)}$ is given in (10), $\mathbf{G} = \langle \nabla\varphi_i, \nabla\varphi_j \rangle_{L^2(\Omega)}$ and $\mathbf{D}$ from (13) are symmetric, positive definite matrices

Regularization               Impact on pseudo-solution                          L^T L
Zero-order                   Minimal amplitude                                  I
Zero-order with L2-norm      Minimal amplitude involving grid element size     M
Zero-order with damping      A priori information about depth                  D
First-order with L2-norm     Minimal gradient involving grid element size      G

For the numerical implementation we apply the physics modeling and simulation software COMSOL Multiphysics 4.2, which is based on FEM, to generate the mesh and to solve the elliptic direct problem in weak form 2. The inverse problem solver is implemented in MATLAB R2012b, which is connected to COMSOL by the COMSOL MATLAB LiveLink to import the necessary components, such as the stiffness matrix $\mathbf{A}$, into the MATLAB environment. In Colton and Kress (1992, pp. 133, 304) the authors coin the expression inverse crime to denote the act of employing the same model to generate, as well as to invert, synthetic data. Moreover, they warn against committing the inverse crime, “in order to avoid trivial inversion” and go on to state: “it is crucial that the synthetic data be obtained by a forward solver which has no connection to the inverse solver.” Thus, the inverse problem mesh is substantially coarser than the direct problem mesh and it does not contain any information about the structure of the source. Is there a risk of committing the inverse crime in real-life problems? We see no such risk since the forward problem solver for real data is unknown (Wirgin 2008).
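The L-curve criterion of this section can be sketched for a discrete set of regularization parameters. The example below is a simplified illustration and makes several assumptions that are not taken from the chapter: the points are plotted in log-log scale, the curvature is approximated by the circle through three neighbouring points (Menger curvature), and the toy problem is diagonal with the identity as penalty matrix.

```python
import math


def menger_curvature(p, q, r):
    """Curvature of the circle through three points (0 if they are collinear)."""
    twice_area = abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))
    sides = math.dist(p, q) * math.dist(q, r) * math.dist(p, r)
    return 0.0 if sides == 0.0 else 2.0 * twice_area / sides


def lcurve_corner(points):
    """Index of the interior point with maximal discrete curvature."""
    return max(range(1, len(points) - 1),
               key=lambda i: menger_curvature(points[i - 1], points[i], points[i + 1]))


def lcurve_points(s, y, alphas):
    """Log-log L-curve points for a diagonal toy problem K = diag(s), L = I."""
    points = []
    for alpha in alphas:
        # Tikhonov solution f_i = s_i y_i / (s_i^2 + alpha), residual s_i f_i - y_i
        residual = sum((alpha * yi / (si * si + alpha)) ** 2 for si, yi in zip(s, y))
        penalty = sum((si * yi / (si * si + alpha)) ** 2 for si, yi in zip(s, y))
        points.append((math.log10(residual), math.log10(penalty)))
    return points


s = [1.0, 0.1, 0.01]                 # hypothetical singular values
y = [1.0, 0.1, 0.05]                 # hypothetical (noisy) data coefficients
alphas = [10.0 ** e for e in range(-8, 1)]
best = lcurve_corner(lcurve_points(s, y, alphas))
print("alpha with maximal curvature:", alphas[best])
```

Because only interior points are compared, the sampled interval $[\alpha_{\min}, \alpha_{\max}]$ has to be wide enough to contain the corner of the curve.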
5 Numerical Examples
Consider a ground cuboid $\Omega := (0, 20) \times (0, 30) \times (-6, 2)$ with length unit km. Its discretization contains $n \approx 1.8 \cdot 10^4$ degrees of freedom. The conductivity $\sigma$ is set to a constant value of $4 \cdot 10^{-2}\ \mathrm{S\,m^{-1}}$ in the ground and $10^{-5}\ \mathrm{S\,m^{-1}}$ in the air. The curve $\Gamma_h$, on which the measurement takes place, is represented by six lines, see Fig. 6. We model a dipole current source density by

\[
f(x) := \begin{cases}
1, & \text{for } x \in (8, 12) \times (11, 15) \times (-3, -2),\\
-1, & \text{for } x \in (8, 12) \times (15, 19) \times (-3, -2),\\
0, & \text{else},
\end{cases}
\]

where $[f]_{SI} = \mathrm{A\,m^{-3}}$ since $f = \nabla \cdot J_b$. The first step is generating synthetic data $\mathbf{u}_d$ by (11). So, we have to solve the system of linear equations (9) representing the Direct Problem 3, see Fig. 6. Subsequently, we generate another (coarser) mesh for computing the pseudo-solution $\mathbf{f}_{L,\alpha}$ from these synthetic data $\mathbf{u}_d$. Assuming the minimal depth of the source, we use the penalty operator $\mathbf{L}^\top \mathbf{L} := \mathbf{D}$ and obtain the results presented
Fig. 6 Illustration for the Direct Problem (9). The current source f generates the measurements ud , see the colors at six lines in the air layer (in parts faded out)
in Fig. 7. In general the reconstruction of the depth position of the current source density is a challenging task. Thus, in our experiments we applied the operator D for penalty purpose.
5.1 Noise in the Data
In the previous computations we considered exact data. Here we aim to show the behavior of the reconstruction algorithm w.r.t. noise in the data. Assuming additive white Gaussian noise $\mathbf{u}^\delta := \mathbf{u} + \boldsymbol{\delta}$ with noise level $\delta$, we compute the pseudo-solution $\mathbf{f}^\delta_{L,\alpha}$, see Fig. 8b, d. Only the components $f^\delta_{L,\alpha,i}$ which fulfill

\[ 0.3 \min \mathbf{f}^\delta_{L,\alpha} > f^\delta_{L,\alpha,i} \quad \text{or} \quad f^\delta_{L,\alpha,i} > 0.3 \max \mathbf{f}^\delta_{L,\alpha} \]

are visible. We notice that even a signal-to-noise ratio of 10 dB does not cause much noise in the solution, cf. Fig. 8b, d. Note that the regularization parameter $\alpha$ has
Fig. 7 Illustration for Inverse Problem (12). Data $\mathbf{u}_d$, see six lines in the air layer of $\Omega$, and the pseudo-solution $\mathbf{f}_{L,\alpha}$, see the iso-surfaces in the middle. Note that $f$ was designed as a dipole
to be adapted for every noise level individually. Otherwise the computation of the pseudo-solutions would fail in general. To model a channel with additive white Gaussian noise we applied the MATLAB function awgn(u, snr, 'measured'). It measures the energy of the desired vector signal u and adds white Gaussian noise to it. The variable snr describes the signal-to-noise ratio (SNR) in dB.
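The noise model described above (measure the signal power, then add white Gaussian noise at a prescribed SNR in dB) can be mimicked in a few lines. This is a sketch of the described behavior in plain Python, not a reimplementation of the MATLAB routine; the function name `add_awgn` and the sine test signal are illustrative assumptions.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so that the expected SNR equals snr_db.

    The signal power is measured from the samples themselves.
    """
    power = sum(v * v for v in signal) / len(signal)
    noise_power = power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


rng = random.Random(0)                                   # fixed seed for repeatability
clean = [math.sin(0.01 * i) for i in range(20000)]       # hypothetical signal
noisy = add_awgn(clean, 10.0, rng)

p_signal = sum(v * v for v in clean) / len(clean)
p_noise = sum((a - b) ** 2 for a, b in zip(noisy, clean)) / len(clean)
print("empirical SNR in dB:", 10.0 * math.log10(p_signal / p_noise))
```

The empirical SNR fluctuates around the requested value because the noise realization is random; for long signals the deviation is small.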
5.2 Real-Life Data
In (airborne) electromagnetic exploration one measures the amplitude of the magnetic field or its component(s) instead of the electric potential. In general, the conversion from magnetic field to electric potential is non-linear. Via a linear approximation, this conversion can be represented by a bounded linear operator, such that the presented approach based on the Tikhonov regularization can be applied again (Sommer 2012). Numerical investigation shows that the choice of the optimal regularization parameter $\alpha$ is more difficult than in the synthetic case (Sommer 2012). In detail, the L-curve criterion unfortunately yields non-unique pseudo-solutions. We find that applying $\mathbf{L}^\top \mathbf{L} = \mathbf{G}$, see Table 1, for the penalty term in (12) leads to a sufficiently smooth L-curve. Similar to the synthetic case,
Fig. 8 Measurements at six lines with corresponding reconstruction depending on SNR (panel titles: 500 m height, 1000 m height; horizontal axes: north-south, east-west). The regularization parameter $\alpha$ is adapted to the SNR. (a) Measurements $\mathbf{u}_d$ with SNR = 30 dB. (b) Reconstructed $f$ for SNR = 30 dB. Points represent FE nodes. (c) Measurements $\mathbf{u}_d$ with SNR = 10 dB. (d) Reconstructed $f$ for SNR = 10 dB. Points represent FE nodes
the measurements in practice are stable against relatively high noise levels. One example of reconstruction from experimental data is shown in Fig. 9. The number of degrees of freedom for the corresponding FE model is 14103. Applying $\mathbf{L}^\top \mathbf{L} = \mathbf{G}$ for the penalty term yields a compact support of $\mathbf{f}_{L,\alpha}$. The parameter $\alpha$ is chosen by means of the L-curve criterion, Fig. 9a.
6 Summary and Outlook
We discussed a linear inverse problem derived from an elliptic boundary value problem that arises in electrostatics. It turned out that the direct problem is well-posed and uniquely solvable by the Lax-Milgram Lemma. The corresponding restricted inverse problem, i.e., identification of the current source density $f$ from local electric potential data $u_d$, is not uniquely solvable and consequently it is
Fig. 9 Reconstruction results for real-life measurement data. (a) L-curve for experimental measurements (axes: $\|u_{\mathrm{data}} - u\|_2$ and $\|f\|_2$). Red crosses mark the neighborhood of the optimal $\alpha$; the third red cross represents the optimal $\alpha$. (b) Reconstructed $f_{L,\alpha}$ for optimal $\alpha$. Points represent FE nodes. Measurements took place at six flight lines in North-South direction (faded out) in the middle of the air layer
ill-posed. The uniqueness of a pseudo-solution was achieved by means of the Tikhonov regularization. For the corresponding theoretical framework we applied the Sobolev Embedding Theorem and a smooth diffeomorphism. In particular, the penalty term of the Tikhonov functional incorporates a priori information about the solution $f$ and so stabilizes the reconstruction. Remember that (1) models steady-state diffusive phenomena in general, not only electrostatic ones. The FEM is an attractive tool to model, discretize and solve the continuous problem numerically. For generating the mesh and assembling the stiffness matrix we used the COMSOL Multiphysics software. The inverse problem solver was developed in the MATLAB environment. Because of the high mesh resolution it makes sense to use high performance computers for this problem. The numerical experiments confirm that different penalty operators deliver different pseudo-solutions. So, further analysis of this relation and providing the algorithm with a suitably adapted penalty term is one focus of our research. Moreover, in the context of the inverse problem the electrical conductivity in the ground is actually unknown and has to be reconstructed, as well. This leads to a non-linear ill-posed inverse problem (of parameter identification class), which we can address by means of iterative Tikhonov regularized methods. Doing so, we could increase the quality of the current source density reconstruction.

Acknowledgements We thank Jörg Bäuerle for his fruitful comments.
References
Adams RA, Fournier JJF (2003) Sobolev spaces, 2nd edn. Elsevier, Amsterdam
Braess D (2003) Finite Elemente. Theorie, schnelle Löser und Anwendungen in der Elastizitätstheorie. Springer, Berlin Heidelberg New York
Brenner SC, Scott LR (1994) The mathematical theory of finite element methods. Springer, New York
Colton D, Kress R (1998) Inverse acoustic and electromagnetic scattering theory. Springer, Berlin
Devaney AJ, Sherman G (1982) Nonuniqueness in inverse source and scattering problems. IEEE Trans Antennas Propag 30(5):1034–1037
Dobrowolski M (2006) Angewandte Funktionalanalysis. Funktionalanalysis, Sobolev-Räume und elliptische Differentialgleichungen. Springer, Berlin Heidelberg New York
Evans LC (2008) Partial differential equations. Graduate studies in mathematics, vol 19. American Mathematical Society, Providence, Rhode Island
Gilbarg D, Trudinger NS (2001) Elliptic partial differential equations of second order, 2nd edn. Springer, Berlin
Groetsch CW (1993) Inverse problems in the mathematical sciences. Vieweg, Braunschweig
Hadamard J (1915) Four lectures on mathematics. Columbia University Press, New York
Hanke M, Rundell W (2011) On rational approximation methods for inverse source problems. Inverse Probl Imaging (IPI) 5(1):185–202
Hansen P (1998) Rank-deficient and discrete ill-posed problems. Numerical aspects of linear inversion. SIAM, Philadelphia
Helfrich-Schkarbanenko A (2011) Elektrische Impedanztomografie in der Geoelektrik. Dissertation, Karlsruher Institut für Technologie
Kirsch A (1996) An introduction to the mathematical theory of inverse problems. Springer, New York
Korovkin NV, Chechurin VL, Hayakawa M (2007) Inverse problems in electric circuits and electromagnetics. Springer, New York
Königsberger K (2004) Analysis 2, 5. korrigierte Auflage. Springer, Berlin Heidelberg New York
Kress R, Kühn L, Potthast R (2002) Reconstruction of a current distribution from its magnetic field. Inverse Probl 18:1127–1146
Marengo EA, Devaney AJ (1999) The inverse source problem of electromagnetics: linear inversion formulation and minimum energy solution. IEEE Trans Antennas Propag 47(2):410–412
Phillips DL (1962) A technique for the numerical solution of certain integral equations of the first kind. J Assoc Comput Mach 9:84–97
Rieder A (2003) Keine Probleme mit Inversen Problemen. Eine Einführung in ihre stabile Lösung. Vieweg Verlag, Wiesbaden
Salsa S (2008) Partial differential equations in action. From modeling to theory. Springer, Milan
Sommer A (2012) Passive Erdölexploration aus der Luft – Theorie und Numerik eines linearen inversen Problems. Diploma thesis, Karlsruher Institut für Technologie
Steinbach O (2008) Numerical approximation methods for elliptic boundary value problems. Finite and boundary elements. Springer, New York
Tikhonov AN (1963) Solution of incorrectly formulated problems and the regularization method. Sov Math Dokl 4:1035–1038
Toponogov VA (2006) Differential geometry of curves and surfaces. A concise guide. Birkhäuser, Boston
Wirgin A (2008) The inverse crime. arXiv:math-ph/0401050v1
Transmission Tomography in Seismology
Guust Nolet
Contents
1 Introduction
2 Key Issues
2.1 Linearity
2.2 Solving the Forward Problem
2.3 Kernel Computation
2.4 Regularization of Large Matrix Systems
2.5 Adjoint Inversion
2.6 Resolution Analysis
3 Fundamental Results
4 Future Directions
5 Conclusion
References
Abstract
This chapter summarizes three important methods for seismic transmission tomography: the interpretation of delays in onset times of seismic phases using ray theory, of cross-correlation delays using finite-frequency methods, and of full waveforms using adjoint techniques. Delay-time techniques differ importantly in one key aspect from full waveform inversions in that they are more linear. The inverse problem for onset times is usually small enough that it can be solved by matrix inversion; for waveform inversions gradient searches are generally needed, and for cross-correlation delays the solver depends on the size of the problem. Onset times can simply be interpreted using the approximations of geometrical optics (ray theory). For cross-correlation delays one can use ray theory to compute the linearized dependency on model perturbations in a volume around the ray, if the observed phase travels a well-identified raypath. However, for diffracted pulses or headwaves, numerical solvers for the wavefield are needed. This is also the case for waveform inversions. Whatever the technique that is used, the resulting linearized system is usually underdetermined and needs to be regularized. Progress in the near future is to be expected from efforts to densify the network of seismometers and extending it to the oceanic domain, as well as from the continued growth in the power of supercomputing that will soon push waveform inversions to embrace the full frequency range of observed seismic signals.

G. Nolet, Geosciences Azur, Université de Nice, Sophia Antipolis, France
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_58
1 Introduction
Efforts to image the subsurface using seismic observations divide broadly into two groups: those that use reflected waves and those that use transmitted waves. Reflection seismology is very much like depth sounding with sonar on a ship: we know the speed of sound in water, and the arrival time for the reflected wave can therefore be converted into depth to the sea bottom. As the ship moves on, the reflection times trace a line of sea bottom depth. Similarly, on land we can observe reflecting surfaces using a large spread of geophones and an array of sources that move on like a ship. If the reflector is not horizontal, and the reflection point is thus not located vertically beneath the source, methods of “migration” exist to take reflector topography into account. Though the velocity in the subsurface is a priori unknown, an approximate value can be deduced from the shape of the reflection curve from each source. Reflector imaging is a crucial exploration tool for the oil and gas industry. Obtaining a more reliable estimate of the subsurface velocity is needed to improve the imaging, but this is difficult to obtain from reflected waves. Transmitted waves are better suited to estimating the velocity and its variations in two or three dimensions. The first attempts at transmission tomography were made in the exploration industry by Bois et al. (1971) between two boreholes. The field developed mostly outside of industry, however, where lack of large arrays of sensors made transmission tomography the most promising tool to image the deep three-dimensional structure of the Earth. Nolet (2008) gives an account of the development of transmission tomography in the past 40 years. More recently, the line between reflection and transmission tomography has become less sharp; the deployment of very dense arrays over hundreds or thousands of km sometimes allows for the imaging of the deep Earth using reflected waves.
On the other hand, the need for a more precise model of the shallow subsurface velocity motivates the industry to record reflected waves at large distance (“wide angle”) and abandon traditional migration (backprojection) algorithms for more sophisticated “full waveform inversions” to image energy that has been both reflected and
transmitted. The model to be retrieved from such data thus consists of parameters of very different character: one attempts to map the topography of discontinuities at the same time as the variations of seismic velocities within each layer. These velocities, in turn, depend on density $\rho$, shear modulus $\mu$, and bulk modulus $\kappa$: for compressional waves one has a velocity $\alpha = \sqrt{(\kappa + \tfrac{4}{3}\mu)/\rho}$, whereas for shear waves $\beta = \sqrt{\mu/\rho}$. As a rule, these velocities are easier to determine than the density and elastic parameters separately. Information on the density can only be obtained if accurate amplitudes are available: reflection coefficients, for example, depend on the seismic impedances $\rho\alpha$ and $\rho\beta$. But amplitudes are influenced by many other factors such as attenuation and focusing/defocusing, and accurate estimation of density with seismic waves only has so far proven to be elusive. Apart from the crucial role that reflection seismics (and more recently more tomographic techniques like waveform inversion) play in the search for oil and gas reservoirs hidden at depths down to half a dozen kilometers, seismic transmission tomography is the only tool that allows us to map structural anomalies with a useful resolution down to the fluid core or even to the center of the Earth. Aside from transmission tomography, the observation of the normal mode frequencies and their splitting due to lateral heterogeneity of the Earth has contributed to constrain the very long wavelength structure in our planet’s interior. A discussion of this topic is beyond the scope of this chapter, which concentrates on transmission tomography. Interested readers are referred to the textbook by Dahlen and Tromp (1998).
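The dependence of the two body-wave velocities on density and elastic moduli can be written down directly. The following minimal sketch uses the standard isotropic relations $\alpha = \sqrt{(\kappa + 4\mu/3)/\rho}$ and $\beta = \sqrt{\mu/\rho}$; the function name and the numerical values are illustrative assumptions chosen only to be of a plausible order of magnitude for mantle rock.

```python
import math


def seismic_velocities(kappa, mu, rho):
    """Return (alpha, beta): compressional and shear velocity.

    kappa: bulk modulus, mu: shear modulus, rho: density (consistent units).
    """
    alpha = math.sqrt((kappa + 4.0 * mu / 3.0) / rho)
    beta = math.sqrt(mu / rho)
    return alpha, beta


# Illustrative values in SI units (Pa, Pa, kg/m^3), not taken from the chapter.
alpha, beta = seismic_velocities(kappa=130e9, mu=80e9, rho=3300.0)
print(f"alpha = {alpha:.0f} m/s, beta = {beta:.0f} m/s")

# In a fluid the shear modulus vanishes: no shear waves, alpha = sqrt(kappa / rho).
print(seismic_velocities(kappa=2.2e9, mu=0.0, rho=1000.0))
```

Many different combinations of $\kappa$, $\mu$ and $\rho$ give the same pair of velocities, which is one way to see why velocities are easier to determine than density and elastic parameters separately.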
2
Key Issues
Traditionally, seismic tomography has long relied on the approximations of geometrical optics to model the travel time of a seismic wave with a line integral along a seismic “ray”:
$$T = \int_P \frac{ds}{c(\mathbf{r})}, \tag{1}$$
where $T$ is the observed time it takes the wave to travel from the earthquake or explosion source to the receiver, $c$ is the wave speed at location $\mathbf{r}$, and $P$ indicates a path satisfying Snel’s law. We use $c$ as a general notation: it stands for $\alpha$ or $\beta$ depending on the nature of the observed wave; for acoustic waves in fluids, the notation $c = \sqrt{\kappa/\rho}$ is often used as well. If $T$ is determined by picking the “onset” of the wave, which by definition satisfies Fermat’s principle of a stationary travel time, (1) provides the correct theory for its interpretation, since minimizing $T$ leads to Euler-Lagrange equations that are equivalent to Snel’s law (Nolet 2008). Our knowledge of the velocity structure of the Earth is sufficient to calculate the path $P$ with first-order accuracy. The fact that the travel time is stationary, and thus insensitive to small errors in the path, allows us to view (1) as quasi-linear in the “slowness” $c^{-1}$. If it is not sufficiently linear, it can be linearized:
G. Nolet
$$\delta T = -\int_P \frac{1}{c_0(\mathbf{r})^2}\,\delta c(\mathbf{r})\,ds, \tag{2}$$
where $\delta T = T_{\rm obs} - T_0$ is the difference between the observed travel time and the one predicted by the background model velocity $c_0(\mathbf{r})$. The inversion problem is then handled by repeated application of (2) while adapting the trajectory $P$ to the adjusted model $c_0 + \delta c$. Equation (2) is easy to use in tomography, and the ray-theoretical approach on which it is based still dominates the field.
Ray-theoretical tomography has a number of limitations, though. The onset of a wave is often difficult to observe in the presence of noise. There exist reflected phases that follow not a strict minimum time path but a “minimax” time path (the time is at a stationary maximum with respect to the position of the reflecting point), for which a sharp onset would not exist even in the absence of noise. And, most importantly, when one observes only the onset of a wave, one ignores information that resides in the rest of the seismogram. In particular, energy diffracted around small heterogeneities influences the waveform even if it arrives after the onset. Finally, ray theory is inadequate to model amplitude variations since the ray-theoretical dependence of a wave amplitude on $c(\mathbf{r})$ is highly nonlinear and leads to amplitude variations that are far larger than observed for global seismic waves at typical frequencies of 0.1–0.3 Hz (Tibuleac et al. 2003).
To improve on ray-theoretical seismology, Luo and Schuster (1991) and Dahlen et al. (2000) developed, in exploration seismics and global seismology respectively, the theory to interpret travel times estimated by picking the time of the maximum in the cross-correlation $\gamma(t)$ between the observed wave arrival $u(t)$ and a synthetic seismogram $u_0(t)$ computed for a “background” or “starting” model $m_0$:
$$\gamma(t) = \int u(t')\,u_0(t' - t)\,dt'. \tag{3}$$
Note that, in contrast to many other applications of cross-correlograms, we do not require that $u(t)$ and $u_0(t)$ are the same waveforms that may perhaps only differ in amplitude and noise content. If they are the same, it means we are in a domain where ray theory is valid (absence of scattered or diffracted energy), and the maximum of the cross-correlation simply gives us the same delay for the observed wave as the onset time would give us. The importance of (3) is that it allows us to interpret energy arriving after the onset, which may move the time of the maximum of $\gamma(t)$ away from the ray-theoretical delay. If we low-pass the signal, we are forced to include more of the later arriving energy in the cross-correlation window; this means that energy that has ventured further away from the raypath may influence the delay. The size of the region in the Earth that influences the delay depends thus on the frequency of the wave, as we shall see. In order to interpret the cross-correlation travel time, we assume that $u(t)$ has a form slightly different from $u_0(t)$:
$$u(t) = u_0(t) + \delta u(t). \tag{4}$$
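Denoting the autocorrelation of $u_0(t)$ by $\gamma_0(t)$, we find for the observed cross-correlation
$$\gamma_{\rm obs}(t) = \gamma_0(t) + \delta\gamma(t) = \int \left[u_0(t') + \delta u(t')\right] u_0(t' - t)\,dt', \tag{5}$$
which reaches a maximum after a delay $\delta T$:
$$\dot\gamma_{\rm obs}(\delta T) = \dot\gamma_0(\delta T) + \delta\dot\gamma(\delta T) = \dot\gamma_0(0) + \ddot\gamma_0(0)\,\delta T + \delta\dot\gamma(0) + O(\delta^2) = 0. \tag{6}$$
Since $\dot\gamma_0(0) = 0$, we find to first order
$$\delta T = -\frac{\delta\dot\gamma(0)}{\ddot\gamma_0(0)} = \frac{\int_{-\infty}^{\infty} \dot u_0(t')\,\delta u(t')\,dt'}{\int_{-\infty}^{\infty} \ddot u_0(t')\,u_0(t')\,dt'} = \frac{\mathrm{Re}\int_0^{\infty} i\omega\,u_0(\omega)^{*}\,\delta u(\omega)\,d\omega}{\int_0^{\infty} \omega^2\,u_0(\omega)^{*}\,u_0(\omega)\,d\omega}. \tag{7}$$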
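The relation between the cross-correlation delay of (3) and its first-order estimate in the time-domain form of (7) can be checked numerically. The following is a minimal sketch that assumes a Gaussian pulse as a hypothetical arrival and a perturbed seismogram that is simply a delayed copy of it; the function names, the sampling, and the delays are illustrative choices, not part of the original text.

```python
import math

DT = 0.05                                        # sampling interval in s (assumed)
TIMES = [k * DT - 8.0 for k in range(321)]       # time axis from -8 s to 8 s

def pulse(t, width=1.0):
    """Gaussian pulse used as a hypothetical wave arrival."""
    return math.exp(-(t / width) ** 2)

def delay_by_cross_correlation(u, u0, dt):
    """Lag (in s) that maximizes gamma(t) = sum over t' of u(t') * u0(t' - t)."""
    n = len(u)
    best_lag, best_val = 0, -float("inf")
    for lag in range(-(n - 1), n):
        val = sum(u[k] * u0[k - lag] for k in range(max(0, lag), min(n, n + lag)))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag * dt

def delay_linearized(u, u0, dt):
    """First-order delay: integral(du0/dt * delta_u) / integral(d2u0/dt2 * u0)."""
    num = den = 0.0
    for k in range(1, len(u0) - 1):
        du0 = (u0[k + 1] - u0[k - 1]) / (2.0 * dt)              # central difference
        ddu0 = (u0[k + 1] - 2.0 * u0[k] + u0[k - 1]) / dt ** 2  # second difference
        num += du0 * (u[k] - u0[k]) * dt
        den += ddu0 * u0[k] * dt
    return num / den

def compare(true_delay):
    """Return (cross-correlation delay, linearized delay) for a delayed pulse."""
    u0 = [pulse(t) for t in TIMES]
    u = [pulse(t - true_delay) for t in TIMES]
    return delay_by_cross_correlation(u, u0, DT), delay_linearized(u, u0, DT)

small = compare(0.2)   # delay much shorter than the pulse width
large = compare(1.5)   # delay comparable to the pulse width
print(small, large)
```

For the small delay the two estimates agree closely, whereas for the large delay the first-order expression underestimates the delay, because $\delta u$ is then no longer a small perturbation.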
The last expression in the frequency domain was obtained with Parseval’s theorem and a Fourier sign convention $e^{i\omega t}$ for the time signal. Equation (7) does not yet give us a direct link between an observed delay and the velocity structure of the Earth $c(\mathbf{r})$ as (1) does. For that we need one more linearization that relates perturbations $\delta c(\mathbf{r})$ in the background velocity $c_0(\mathbf{r})$ to perturbations $\delta u(\omega)$ in the wave arrival. Born theory, a first-order scattering theory, is the vehicle required for this. The wavefield satisfies the elastodynamic equations, which we write symbolically (using boldface to acknowledge that the displacement is a vector field even if we may observe only one component):
$$\mathbf{A}_0\mathbf{u}_0 = \mathbf{f}, \tag{8}$$
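where $\mathbf{f}$ represents the source and $\mathbf{A}_0$ is an operator representing the second-order differential equations for elastic motion with density $\rho$ and elastic coefficients $c_{iklm}$ for the background model, or more generally
$$(\mathbf{A}_0\mathbf{u})_i = \rho\,\frac{\partial^2 u_i}{\partial t^2} - \sum_{klm}\frac{\partial}{\partial x_k}\left(c_{iklm}\,\frac{\partial u_m}{\partial x_l}\right). \tag{9}$$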
If we discretize the model as well as time, $\mathbf{A}_0$ and its boundary conditions are represented by a matrix that operates on a displacement field represented by the vector $\mathbf{u}_0$. If $\mathbf{f} = \delta(\mathbf{x} - \mathbf{x}')\,\delta(t)\,\hat{\mathbf{e}}_j$, a unit force in direction $j$ applied at time $t = 0$ at the point $\mathbf{x}'$, we denote the $i$th component of the solution by the Green’s function $G_{ij}(\mathbf{x}, \mathbf{x}'; t)$. For a more general force distribution, the solution is then
$$u_i(\mathbf{x}, t) = \sum_j \int_{-\infty}^{\infty}\int_V G_{ij}(\mathbf{x}, \mathbf{x}'; t - t')\,f_j(\mathbf{x}', t')\,dV'\,dt'. \tag{10}$$
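The perturbed system satisfies
$$\mathbf{A}\mathbf{u} = [\mathbf{A}_0 + \delta\mathbf{A}]\,(\mathbf{u}_0 + \delta\mathbf{u}) = \mathbf{f}, \tag{11}$$
with
$$(\delta\mathbf{A}\,\mathbf{u})_i = \delta\rho\,\frac{\partial^2 u_i}{\partial t^2} - \sum_{klm}\frac{\partial}{\partial x_k}\left(\delta c_{iklm}\,\frac{\partial u_m}{\partial x_l}\right), \tag{12}$$
or
$$\mathbf{A}_0\,\delta\mathbf{u} = -\delta\mathbf{A}\,\mathbf{u}_0 + O(\delta^2). \tag{13}$$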
We thus see that $\delta\mathbf{u}$ satisfies the same equations as $\mathbf{u}_0$, but the source term is replaced by $-\delta\mathbf{A}\,\mathbf{u}_0$. The heterogeneities in the medium act as a source of scattered energy. Since $\delta\mathbf{A}$ is linear in the perturbations $\delta\rho$ and $\delta c_{jklm}$, we have effectively linearized the inverse problem for these model parameters; the expression for the perturbed wavefield in a general anisotropic medium with its elastic moduli described by a fourth-order tensor $c_{jklm}$ is given with some algebra by inserting the scattering source term into (10):
$$\delta u_i(\mathbf{x}, t) = -\sum_j \int_0^t\!\!\int_V \left[\delta\rho(\mathbf{x}')\,G_{ij}(\mathbf{x}, \mathbf{x}'; t - t')\,\ddot u_{0j}(\mathbf{x}', t') + \sum_{klm}\frac{\partial G_{ij}(\mathbf{x}, \mathbf{x}'; t - t')}{\partial x'_k}\,\delta c_{jklm}(\mathbf{x}')\,\frac{\partial u_{0m}(\mathbf{x}', t')}{\partial x'_l}\right] dV'\,dt'. \tag{14}$$
Even though only 21 of the 81 constants $c_{jklm}$ are independent, in practice it is highly unrealistic to work with so many elastic constants whose spatial variations can never be resolved, and one usually assumes an isotropic Earth,
$$c_{jklm} = \left(\kappa - \tfrac{2}{3}\mu\right)\delta_{jk}\delta_{lm} + \mu\left(\delta_{jl}\delta_{km} + \delta_{jm}\delta_{kl}\right) \tag{15}$$
(with $\delta_{ij}$ the Kronecker delta), or anisotropy with a single symmetry axis. Since the expression (7) is linear in $\delta u$, and $\delta u$ itself can be linearized using Born theory, (7) also represents a linearized relationship between the cross-correlation delay $\delta T$ and the perturbations in the model density and elastic parameters. In other words,
$$\delta T = \int K_T(\mathbf{r})\,\delta m(\mathbf{r})\,d^3\mathbf{r}, \tag{16}$$
where we write $\delta m(\mathbf{r})$ for the perturbation in any one of the model parameters (or a combination of them) and where the integration is over the volume of the Earth where the kernel $K_T(\mathbf{r})$ is not negligible. If the volume sensitivity (16) is used for the interpretation, one speaks of finite-frequency tomography. If the wave arrivals are also filtered in different frequency bands, essentially capturing the dispersion in $\delta T$, it is called multiple-frequency tomography.
2.1
Linearity
The linearity of the problem is a key issue. If we estimate $\delta T$ by cross-correlation, we rely on three linearizations to establish the dependence of this delay on perturbations in density and elastic parameters: the change in the location of the maximum in the cross-correlation function $\gamma(t)$ in (6), the change in $\gamma$ itself as defined by (5), and the change in the wavefield $u_0$ as obtained from the Born approximation (14). If the heterogeneity in the Earth is smooth, ray theory is valid: a pulse-like arrival can be delayed by $\delta T$ and may change amplitude by focusing, but its shape remains intact (there is no “dispersion”). In that case $\delta T$ is a linear function of the elastic perturbation: a perturbation double in amplitude, or over a layer thickness twice as large, will double $\delta T$. The limitations of Born theory have no effect on this: Born is used to establish the functional derivative kernel $K_T(\mathbf{r})$ in the limit $\delta m \to 0$. This derivative is always correct for very small perturbations, but as long as the delay $\delta T$ depends linearly on $\delta m$, we can use this derivative over a much larger range of perturbations than would be permitted by the Born approximation itself.
If the heterogeneity in the Earth has sharp transitions generating reflections which reflect again (“second order scattering”), the linearity assumed in Born theory is affected if the impedance contrasts are strong enough that the energy transferred to scattered waves is non-negligible. In this case, the waveform will be affected and the cross-correlation delay becomes frequency dependent. Mercerat and Nolet (2013) test the linearity of the cross-correlation delays as a function of model heterogeneity and dominant frequency of the signal. Figure 1 shows the result for a pulse propagating in a model with random heterogeneity with a scale length close to the dominant wavelength of the wave. The observed delay approximately doubles when the r.m.s. velocity perturbation is doubled from 5 % to 10 %.
Though there is some scatter around the purely linear relationship (the solid line in Fig. 1), the errors implied by this scatter are generally acceptable when compared to the observational errors. This validates the assumption of linearity for many applications in global tomography, where seismic anomalies below the crust rarely exceed 10 %. For crustal applications, or for near-surface studies, nonlinearity may pose a problem, though.
In full waveform tomography, the observed seismogram $u$ itself is the datum and we invert for the difference with the predicted seismogram; for one of the components,
$$\delta u(t) = u(t) - u_0(t) = \int K_u(\mathbf{r}, t)\,\delta m(\mathbf{r})\,d^3\mathbf{r}, \tag{17}$$
where the time-dependent kernel $K_u(\mathbf{r}, t)$ is implicitly given in (14).
Fig. 1 Left: a random model with a Gaussian autocorrelation, a standard deviation 300 m/s (5 %) in seismic velocity, and a correlation length of 12 m. Only a slice through the model is shown. Right: measured cross-correlation delays for a set of seismograms computed from 4 sources placed in boreholes at the vertical edges of the slice shown on the left, both for the 5 % model and for a model with the anomalies amplified by a factor of 2 (10 %). The dominant wavelength is 24 m. The line denotes the delay predictions for a fully linear relationship (After Mercerat and Nolet (2013))
It is important to notice that in this case, even if ray theory is valid, a third linearization enters into consideration that is not needed in the case of delay-time interpretation, since we need to assume that a first-order Taylor expansion for the delay $\delta T$ is valid:
$$u(t) = u_0(t - \delta T) \approx u_0(t) - \delta T\,\dot u_0(t), \tag{18}$$
which is clearly limited to ıT much less than the dominant period of the pulse. Even if ıT can still be correctly estimated using linearized theory (e.g., when ray theory is valid), the waveform perturbation becomes nonlinear. Waveform inversions are therefore much more nonlinear than delay-time inversions. In addition, their dependence on the amplitude of the observed seismogram requires an accurate knowledge of the amplitude response of the instrument as well as of impedance effects of the soil directly beneath the instrument.
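How quickly the Taylor expansion (18) breaks down can be gauged with a small numerical experiment. The sketch below assumes a hypothetical Gaussian-windowed cosine pulse and measures the relative misfit between the exact waveform perturbation $u_0(t - \delta T) - u_0(t)$ and its linearized form $-\delta T\,\dot u_0(t)$; the pulse shape, the function names, and the chosen delays are illustrative assumptions.

```python
import math

def pulse(t, period):
    """Gaussian-windowed cosine: a hypothetical pulse with the given dominant period."""
    return math.cos(2.0 * math.pi * t / period) * math.exp(-(t / period) ** 2)

def pulse_rate(t, period):
    """Time derivative of the pulse, by a centered finite difference."""
    h = 1e-5 * period
    return (pulse(t + h, period) - pulse(t - h, period)) / (2.0 * h)

def waveform_linearization_error(delay, period, n=2001):
    """Relative L2 misfit between the exact delta_u and -delay * du0/dt."""
    t_min, t_max = -5.0 * period, 5.0 * period
    dt = (t_max - t_min) / (n - 1)
    num = den = 0.0
    for k in range(n):
        t = t_min + k * dt
        exact = pulse(t - delay, period) - pulse(t, period)
        linear = -delay * pulse_rate(t, period)
        num += (exact - linear) ** 2
        den += exact ** 2
    return math.sqrt(num / den)

PERIOD = 10.0  # dominant period in s (assumed)
err_small = waveform_linearization_error(0.01 * PERIOD, PERIOD)
err_large = waveform_linearization_error(0.5 * PERIOD, PERIOD)
print(err_small, err_large)
```

In this sketch the misfit is a few percent for a delay of one hundredth of the period but exceeds 100 % for a delay of half a period, in line with the requirement that $\delta T$ be much less than the dominant period.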
2.2
Solving the Forward Problem
In practice, we face a choice of how to compute $\mathbf{u}_0$ and with that the force term in (13). For very complex systems we have little choice but to discretize $\mathbf{A}$ and use purely numerical methods such as a finite difference solver or the spectral element method (Luo and Schuster 1991; Tromp et al. 2005). For sufficiently smooth models in which the wave travels as one coherent pulse-like arrival, we may approximate the Green’s function in the spectral domain as
$$G_{ij}(\mathbf{x}_s, \mathbf{x}_r, \omega) \approx \frac{\mathcal{A}_{ij}(\omega)}{R_{rs}}\,e^{-i\omega T_{rs}}, \tag{19}$$
where $\mathcal{A}_{ij}$ defines the amplitude and polarization of the wave radiated from the source $s$ in the direction of the receiver $r$ and where ray theory provides the geometrical spreading $R_{rs}$ and the travel time $T_{rs}$. Equation (19) is known as the ray-theoretical solution. Its validity is limited to phases with a well-defined trajectory, away from focal points or caustic surfaces. Surface waves, headwaves, or other diffracted waves cannot be modeled using ray theory in this way and require a more numerical treatment. Fortunately, efficient numerical methods are available that rely on the symmetry of the background model. For frequencies below about 0.1 Hz, summation of normal modes can be used (Zhao et al. 2000). The direct solution method, a Galerkin-type method, can compute synthetic seismograms up to 2 Hz (Kawai et al. 2006). A 2D version of the spectral element method, applied to a spherically symmetric Earth model, provides another alternative (Nissen-Meyer et al. 2007).
2.3
Kernel Computation
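Wavefields are reciprocal: a force in direction $\hat{\mathbf{e}}_1$ observed on a seismometer component in direction $\hat{\mathbf{e}}_2$ yields the same seismogram that we obtain if we interchange the source and receiver locations and observe an $\hat{\mathbf{e}}_1$ component from a source in the $\hat{\mathbf{e}}_2$ direction. This property is used to significantly reduce the computational effort needed to compute $\delta u(\omega)$ in (7) and (14). Formally
$$G_{ij}(\mathbf{x}, \mathbf{x}'; t - t') = G_{ji}(\mathbf{x}', \mathbf{x}; t - t'), \tag{20}$$
so that we only need to compute the wavefield $\mathbf{u}_0$ from a source location $\mathbf{x}_s$, and the Green’s function for a source at (receiver) location $\mathbf{x}_r$:
$$\delta u_i(\mathbf{x}_r, t) = -\sum_j \int_0^t\!\!\int_V \left[\delta\rho(\mathbf{x}')\,G_{ji}(\mathbf{x}', \mathbf{x}_r; t - t')\,\ddot u_{0j}(\mathbf{x}', t') + \sum_{klm}\frac{\partial G_{ji}(\mathbf{x}', \mathbf{x}_r; t - t')}{\partial x'_k}\,\delta c_{jklm}(\mathbf{x}')\,\frac{\partial u_{0m}(\mathbf{x}', t')}{\partial x'_l}\right] dV'\,dt'. \tag{21}$$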
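Substitution of (21) and (15) into (7) gives the kernels that define the linearized relationship between the delay and perturbations in $\rho$, $\kappa$, and $\mu$ (see also Sect. 2.5). To translate this parameterization into more convenient perturbations of density and seismic velocity ($\rho$, $\alpha$, $\beta$), we use
$$K_\alpha = 2\rho\alpha\,K_\kappa, \qquad K_\beta = 2\rho\beta\left(K_\mu - \tfrac{4}{3}K_\kappa\right), \qquad K'_\rho = \frac{\kappa}{\rho}\,K_\kappa + \beta^2 K_\mu + K_\rho,$$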
where the accent on the density kernel indicates that we vary $\rho$ while keeping the seismic velocities, rather than $\kappa$ and $\mu$, constant. In practice the sensitivity to density is weak and generally ignored. So far we assumed we have to use finite difference or spectral element methods to compute the Green’s functions from source and receiver locations. But if we use a smooth background model and the ray-theoretical Green’s function (19) to find the kernel expressions for well-defined arrivals such as P or S, we are able to find analytical kernel expressions, directly in terms of the seismic velocity. If, in addition, we neglect differences in the amplitudes $\mathcal{A}_{ij}(\omega)$ between the direct wave and the wave that departs in the direction of the scatterer, as well as the directivity of the scatterer, the expression for the kernel $K_c$ (where $c$ stands for $\alpha$ or $\beta$) becomes
$$K_c(\mathbf{x}') = -\frac{1}{2\pi\,c(\mathbf{x}_r)\,c(\mathbf{x}')}\,\frac{R_{rs}}{R_{xr}R_{xs}}\,\frac{\int_0^{\infty} \omega^3\,|u_0(\omega)|^2 \sin[\omega\,\Delta T(\mathbf{x}')]\,d\omega}{\int_0^{\infty} \omega^2\,|u_0(\omega)|^2\,d\omega}, \tag{22}$$
where $\mathbf{x}'$ is the location of the scatterer, $R_{xr}$ the geometrical spreading of a ray from $r$ to the scatterer, and $\Delta T(\mathbf{x}')$ the extra travel time of the detour path via the scatterer relative to the direct ray. Though this expression is somewhat simplified, it captures the essential differences between ray-theoretical and finite-frequency sensitivity and is usually accurate enough to be used in this form. An example of a kernel computed in this way is shown in Fig. 2. It is ironic that we can use ray theory to improve on ray theory, but the use of (22) can speed up the computation by two to three orders of magnitude with respect to the spectral element method (Mercerat and Nolet 2012), so it is certainly worth the effort. For more complete expressions that include amplitude variations and angle dependence of the scattering as well as possible phase shifts caused by supercritical reflections or passage of a caustic, or for kernels for alternative delay-time definitions, see Nolet (2008).
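The behavior of the frequency-dependent factor in the kernel (22), that is, the ratio of the two integrals over $\omega$, can be explored with a short numerical sketch. It assumes a Gaussian power spectrum $|u_0(\omega)|^2$ centered on a 20 s period; the spectral shape, the bandwidth, and all names are hypothetical choices made only for illustration, and the geometrical prefactor of (22) is left out.

```python
import math

def power_spectrum(omega, omega0, sigma):
    """Assumed Gaussian power spectrum |u0(omega)|^2 (illustrative choice)."""
    return math.exp(-0.5 * ((omega - omega0) / sigma) ** 2)

def frequency_factor(detour_time, period=20.0, rel_bandwidth=0.25, n=4000):
    """Ratio of the two frequency integrals in the kernel, by midpoint quadrature."""
    omega0 = 2.0 * math.pi / period
    sigma = rel_bandwidth * omega0
    d_omega = (omega0 + 8.0 * sigma) / n       # integrate well past the spectral peak
    num = den = 0.0
    for k in range(n):
        omega = (k + 0.5) * d_omega
        p = power_spectrum(omega, omega0, sigma)
        num += omega ** 3 * p * math.sin(omega * detour_time) * d_omega
        den += omega ** 2 * p * d_omega
    return num / den

on_ray = frequency_factor(0.0)          # scatterer on the geometrical ray
first_zone = frequency_factor(3.0)      # small detour time (first Fresnel zone)
second_zone = frequency_factor(13.0)    # larger detour time (second Fresnel zone)
print(on_ray, first_zone, second_zone)
```

Because the kernel is proportional to minus this factor, the sketch reproduces the two features visible in Fig. 2: zero sensitivity on the geometrical ray, where the detour time vanishes, and a sign reversal of the sensitivity at larger detour times.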
2.4
Regularization of Large Matrix Systems
For the discretization of the model, we face several choices, all of which can be written in the form $\delta c(\mathbf{x}) = \sum_{j=1}^{M} m_j h_j(\mathbf{x})$, using a set of basis functions $h_j(\mathbf{x})$, $j = 1, \ldots, M$. For the basis $h_j$ we can choose a local parameterization into cells or “voxels” ($h_j = 1$ in voxel $j$, 0 outside), a global parameterization involving spherical harmonics, or a compromise between the two using wavelets.
Fig. 2 An example of a travel time kernel $K_\alpha(\mathbf{x})$ (Eq. 22) for a P wave arrival with a dominant period of 20 s. The color scale indicates values of the kernel in $10^{-7}$ s/km$^3$. Note the zero sensitivity at the location of the geometrical raypath in the center and the existence of the second Fresnel zone with reversed sensitivity. The black line plots the value of the kernel at a cross section through its center
Equations (2) and (7) for delays, or (14) for waveform data, can then be written in a general matrix notation:
$$\mathbf{A}\mathbf{m} = \mathbf{d}. \tag{23}$$
The regularization is done by minimizing a penalty function, the generic form of which is
$$J(\mathbf{m}) = \tfrac{1}{2}\,\|\mathbf{A}\mathbf{m} - \mathbf{d}\| + R(\mathbf{m}). \tag{24}$$
For the data misfit one usually adopts the Euclidean norm (least squares fit). $R(\mathbf{m})$ is a measure of the size and/or complexity of the model that we wish to keep under control. There are many different choices, but the most important in seismic tomography are (Loris 2015)
$$R(\mathbf{m}) = \int \delta c(\mathbf{x})^2\,dV \quad \text{(Tikhonov or norm damping)}, \tag{25}$$
$$R(\mathbf{m}) = \int |\nabla\delta c(\mathbf{x})|\,dV \quad \text{(total variation or Laplacian smoothing)}, \tag{26}$$
$$R(\mathbf{m}) = \sum_j |m_j| \ \text{for a wavelet basis } h_j \quad \text{(compressed sensing)}. \tag{27}$$
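As an illustration of norm damping, the following minimal sketch minimizes $\tfrac12|\mathbf{A}\mathbf{m}-\mathbf{d}|^2 + \tfrac12\lambda|\mathbf{m}|^2$ for a tiny underdetermined system of two rays crossing three voxels. The explicit damping weight `lam`, the fixed-step gradient iteration, and the toy matrix are assumptions made for this example; they are not prescribed by the text.

```python
def matvec(a, x):
    """Return A x for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def rmatvec(a, y):
    """Return A^T y for a matrix stored as a list of rows."""
    return [sum(a[i][j] * y[i] for i in range(len(a))) for j in range(len(a[0]))]

def damped_least_squares(a, d, lam, step=0.3, n_iter=500):
    """Minimize 0.5*|A m - d|^2 + 0.5*lam*|m|^2 by fixed-step gradient descent."""
    m = [0.0] * len(a[0])
    for _ in range(n_iter):
        residual = [p - q for p, q in zip(matvec(a, m), d)]
        gradient = [g + lam * m_j for g, m_j in zip(rmatvec(a, residual), m)]
        m = [m_j - step * g_j for m_j, g_j in zip(m, gradient)]
    return m

def squared_norm(m):
    return sum(m_j * m_j for m_j in m)

# Hypothetical toy problem: two rays, three voxels, unit path length per voxel.
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
D = [2.0, 2.0]

weak = damped_least_squares(A, D, lam=0.01)
strong = damped_least_squares(A, D, lam=1.0)
print(weak, strong)
```

With two data and three unknowns the undamped problem has no unique solution; the damping selects a small-norm model, and increasing `lam` shrinks the model further at the expense of the data fit. The step size must stay below 2 divided by the largest eigenvalue of $\mathbf{A}^T\mathbf{A} + \lambda\mathbf{I}$ for the iteration to converge.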
2.5
Adjoint Inversion
If a waveform inversion is pursued, the inverse problem easily becomes too large to fit in the memory of computer clusters. But even in finite-frequency or multiple-frequency delay-time inversions, the large matrix size may be prohibitive. In that case one may attempt to find a solution by searching in the model space along the gradient of the penalty function $J(\mathbf{m})$, using “adjoint” equations to compute the gradient at each step. Since the gradient is recomputed anyway, adjoint inversions lend themselves very well to highly nonlinear problems, such as full waveform inversions. In travel time tomography of the Earth’s mantle, nonlinearity is weak, and it can be much more efficient to store the rows of the matrix $\mathbf{A}$ on disk than to recompute them. Excellent descriptions of adjoint inversion for waveform as well as delay-time tomography can be found in Tromp et al. (2005) and Fichtner et al. (2006). Here I first give a simple example of adjoint inversion for the matrix system (23), followed by the more complicated case for waveform inversion. If we use the Euclidean norm $\|\cdot\| = |\cdot|^2$ in (24), the gradient of $J$ with respect to the elements of the model vector $\mathbf{m}$ is
$$\nabla J = \frac{\partial J}{\partial \mathbf{m}} = \mathbf{A}^T(\mathbf{A}\mathbf{m} - \mathbf{d}) = \mathbf{A}^T\mathbf{r}, \tag{28}$$
where $\mathbf{A}^T$ is the transpose of $\mathbf{A}$ and $\mathbf{r}$ is a vector of travel time residuals. For simplicity we ignore here the contribution of the regularization term $R$. The computation of the data residuals $\mathbf{r}$ is straightforward and is usually done “on the fly” while computing each row of the matrix $\mathbf{A}$, since each row corresponds to one particular source-station pair. For the multiplication of $\mathbf{r}$ with $\mathbf{A}^T$, we again need the matrix. We can either recompute it or read it back from disk; the first approach is doable for a homogeneous background model, but quickly becomes too slow even for simple layered models. There is an apparent difficulty that we compute (and store or read) the matrix in row order, whereas the multiplication with $\mathbf{A}^T$ is normally done in column order (which is row order for the transpose matrix). This, however, is not really needed, as the following pseudo-code for a row-order multiplication shows.
To compute $\mathbf{g} = \mathbf{A}^T\mathbf{r}$:
    set g = 0
    for i = 1, N
        read row i from disk (or compute it)
        for j = 1, M
            g_j + A_ij r_i → g_j
The function of $\mathbf{A}^T$ is to project the data residuals back into the model space. The residuals observed in a particular station are redistributed along the raypaths to that station. Where raypaths cross, and the sign of the residuals is the same, their sum will create a visible anomaly. Thus, once the gradient $\mathbf{A}^T\mathbf{r}$ has been obtained, the model can be updated. The simplest form would be
$$\mathbf{m}_{\rm iter+1} = \mathbf{m}_{\rm iter} + \alpha\,\nabla J, \tag{29}$$
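where the optimal step size $\alpha$ (negative, since the step is taken against the gradient) is typically found through quadratic interpolation on three values of $J(\mathbf{m}_{\rm iter} + \alpha\nabla J)$, where $\alpha$ is near $-2J/|\mathbf{g}|^2$ with $\mathbf{g} = \nabla J$. Convergence can be sped up using conjugate gradients (Fletcher and Reeves 1964). If we compute the kernels using a full waveform algorithm rather than ray theory, we must substitute the Born approximation (21) for $\delta u$ in (7); limiting the time integral to the cross-correlation window $[0, T]$ for the $i$th component of the seismogram, this gives
$$\delta T(\mathbf{x}) = -\frac{1}{E}\int_0^T \dot u_{0i}(\mathbf{x}, t)\,\sum_j \int_0^t\!\!\int_V \left[\delta\rho(\mathbf{x}')\,G_{ji}(\mathbf{x}', \mathbf{x}; t - t')\,\ddot u_{0j}(\mathbf{x}', t') + \sum_{klm}\frac{\partial G_{ji}(\mathbf{x}', \mathbf{x}; t - t')}{\partial x'_k}\,\delta c_{jklm}(\mathbf{x}')\,\frac{\partial u_{0m}(\mathbf{x}', t')}{\partial x'_l}\right] dV'\,dt'\,dt, \tag{30}$$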
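where $E = \sum_i \int_{-\infty}^{\infty} \ddot u_{0i}(t)\,u_{0i}(t)\,dt$. A similar “backprojection” interpretation can be obtained if we identify the travel time adjoint field as
$$u^{\dagger}_j(\mathbf{x}', \mathbf{x}, T - t') = \frac{1}{E}\int_0^{T - t'} G_{ji}(\mathbf{x}', \mathbf{x}; T - t - t')\,\dot u_{0i}(\mathbf{x}, T - t)\,dt, \tag{31}$$
which is generated from the “adjoint source”:
$$f^{\dagger}_i(\mathbf{x}', t) = \frac{1}{E}\,\dot u_{0i}(\mathbf{x}, T - t)\,\delta(\mathbf{x}' - \mathbf{x}). \tag{32}$$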
Equations (31) and (32) can be used to compute the kernel in (30) using a numerical algorithm that gives the wavefield in complicated media, in which the ray-theoretical expression (22) is not valid. They can also be used to interpret cross-correlation delays for arbitrary parts of the seismogram that are not identifiable as a ray arrival. This, for example, is needed in the case of headwaves such as the Pn wave that travels along the crust-mantle interface, or the core-diffracted Pdiff wave. Note that the cross-correlation delay itself is not needed to generate the adjoint field; in view of the weak nonlinearity, the computation can thus be done once and for all even if the system is solved by gradient searches. Equation (30) has the form (16): the kernel is implicitly defined by this equation. If we do a full waveform inversion, i.e., if we invert (21) directly, this is not the case. We define the penalty function:
$$J = \frac{1}{2}\sum_{r=1}^{N}\int_0^T |u_{0i}(\mathbf{x}_r, t) - u_i(\mathbf{x}_r, t)|^2\,dt. \tag{33}$$
A summation over components $i$ can be implicitly assumed in case we invert for more than one component of the wavefield. Perturbing the model gives a perturbation in $J$:
$$\delta J = \sum_{r=1}^{N}\int_0^T \left[u_{0i}(\mathbf{x}_r, t) - u_i(\mathbf{x}_r, t)\right]\delta u_i(\mathbf{x}_r, t)\,dt. \tag{34}$$
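Tromp et al. (2005) show that the adjoint field is now given by
$$u^{\dagger}_j(\mathbf{x}', t') = \sum_{r=1}^{N}\sum_i \int_0^{T - t'} G_{ji}(\mathbf{x}', \mathbf{x}_r; T - t - t')\left[u_{0i}(\mathbf{x}_r, T - t) - u_i(\mathbf{x}_r, T - t)\right]dt, \tag{35}$$
with an adjoint source that sums the waveform discrepancies in all receivers:
$$f^{\dagger}_i(\mathbf{x}, t) = \sum_{r=1}^{N}\left[u_{0i}(\mathbf{x}_r, T - t) - u_i(\mathbf{x}_r, T - t)\right]\delta(\mathbf{x} - \mathbf{x}_r), \tag{36}$$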
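and substituting this, and the expression for the perturbed field (21) in (34), gives again a kernel interpretation of the form
$$\delta J = \int \left[K_\rho(\mathbf{x}')\,\delta\rho(\mathbf{x}') + K_\kappa(\mathbf{x}')\,\delta\kappa(\mathbf{x}') + K_\mu(\mathbf{x}')\,\delta\mu(\mathbf{x}')\right]dV', \tag{37}$$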
but this formulation cannot be used in practice since the matrices to be stored are too large. Instead, the backprojection is done as in the case of (28) by computing the wavefield back from each receiver using the virtual sources (36). However, in contrast to the backprojection of travel time delays, the source terms depend on the observed misfit in each station. Note that, even if we invert for only one component, all three components enter in the computation of the gradient – since horizontal components are often much noisier than the vertical ones, this is not a trivial observation.
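The row-order backprojection $\mathbf{g} = \mathbf{A}^T\mathbf{r}$ of (28) and the gradient update (29) discussed earlier in this section can be sketched for the matrix system (23) as follows. This is a minimal sketch under stated assumptions: the rows are held in a Python list rather than read from disk, the regularization term is ignored, and the step size is the exact line-search value for a quadratic misfit, $\alpha = -|\mathbf{g}|^2/|\mathbf{A}\mathbf{g}|^2$, which is swapped in here for the three-point quadratic interpolation mentioned in the text. All names and the toy data are hypothetical.

```python
def backproject(rows, r, n_model):
    """Row-order computation of g = A^T r: each row is used once, in storage order."""
    g = [0.0] * n_model
    for i, row in enumerate(rows):          # in practice: read row i from disk
        for j, a_ij in enumerate(row):
            g[j] += a_ij * r[i]
    return g

def residuals(rows, m, d):
    """Return r = A m - d, again processing one row at a time."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, m)) - d_i
            for row, d_i in zip(rows, d)]

def steepest_descent(rows, d, n_model, n_iter=100):
    """Minimize 0.5*|A m - d|^2 with the update m <- m + alpha * grad J."""
    m = [0.0] * n_model
    for _ in range(n_iter):
        g = backproject(rows, residuals(rows, m, d), n_model)
        a_g = [sum(a_ij * g_j for a_ij, g_j in zip(row, g)) for row in rows]
        denom = sum(x * x for x in a_g)
        if denom == 0.0:                    # gradient vanished: converged
            break
        alpha = -sum(x * x for x in g) / denom   # negative: step against the gradient
        m = [m_j + alpha * g_j for m_j, g_j in zip(m, g)]
    return m

# Hypothetical toy problem: two rays crossing three voxels.
ROWS = [[1.0, 1.0, 0.0],
        [0.0, 1.0, 1.0]]
DATA = [1.0, 3.0]

gradient_example = backproject(ROWS, [1.0, 2.0], 3)
model = steepest_descent(ROWS, DATA, 3)
print(gradient_example, model)
```

In the backprojection, the middle voxel is crossed by both rays and therefore receives the sum of both residuals, which is the mechanism by which crossing raypaths build up an anomaly.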
2.6
Resolution Analysis
Errors in the data propagate into the solution. Moreover, since the problem is almost always underdetermined, the regularization introduces a bias into the solution obtained. Only for the smallest tomographic problems are we able to compute the a posteriori covariance matrix of the solution. For large problems, the bias can be studied by generating a synthetic data set $\mathbf{d}_{\rm synt}$ for a known model $\mathbf{m}_{\rm synt}$, solving the system $\mathbf{A}\mathbf{m} = \mathbf{d}_{\rm synt}$, and comparing the solution $\mathbf{m}$ with $\mathbf{m}_{\rm synt}$. If one adds an error to the synthetic data with a distribution equal to that estimated for the real data, and if one repeats the exercise many times for the same $\mathbf{m}_{\rm synt}$ but different realizations of the errors, the covariance matrix of the solution can be estimated as well. Since the synthetic model often takes the form of voxels with alternating positive and negative anomalies, typically of a few percent, such tests are widely referred to as “checkerboard tests,” referring to the checkerboard-like image of $\mathbf{m}_{\rm synt}$ when plotted in two-dimensional cross sections.
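A toy version of such a test shows how strongly the outcome depends on the ray geometry. The sketch below assumes a 2 × 2 grid of voxels, noise-free synthetic data, and a simple fixed-step gradient (Landweber) iteration started from a zero model; the ray sets, names, and solver are hypothetical and serve only to illustrate the principle.

```python
def landweber(a, d, step=0.3, n_iter=500):
    """Fixed-step gradient iteration for A m = d, started from m = 0."""
    n_model = len(a[0])
    m = [0.0] * n_model
    for _ in range(n_iter):
        r = [sum(a_ij * m_j for a_ij, m_j in zip(row, m)) - d_i
             for row, d_i in zip(a, d)]
        g = [sum(a[i][j] * r[i] for i in range(len(a))) for j in range(n_model)]
        m = [m_j - step * g_j for m_j, g_j in zip(m, g)]
    return m

def checkerboard_test(rays, m_synt):
    """Generate d_synt = A m_synt and return the model recovered from it."""
    d_synt = [sum(a_ij * m_j for a_ij, m_j in zip(row, m_synt)) for row in rays]
    return landweber(rays, d_synt)

# Voxels ordered as [top-left, top-right, bottom-left, bottom-right].
CHECKERBOARD = [1.0, -1.0, -1.0, 1.0]

# Rays along the two rows and the two columns of the grid.
ROW_COLUMN_RAYS = [[1.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0],
                   [1.0, 0.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0, 1.0]]
# The same rays plus one diagonal ray through top-left and bottom-right.
WITH_DIAGONAL = ROW_COLUMN_RAYS + [[1.0, 0.0, 0.0, 1.0]]

blind = checkerboard_test(ROW_COLUMN_RAYS, CHECKERBOARD)
resolved = checkerboard_test(WITH_DIAGONAL, CHECKERBOARD)
print(blind, resolved)
```

With only row and column rays every synthetic datum is zero, so the checkerboard lies in the null space and the recovered model vanishes; adding a single diagonal ray makes the pattern recoverable in this noise-free case.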
3
Fundamental Results
A number of intriguing and important discoveries have been made using global tomography and ray theory: for example, it has become clear that the oceanic lithosphere can subduct to great depths in the mantle, though this behavior varies with the tectonic setting (Fig. 3). Two major “superplumes” exist in the Southern Hemisphere just above the core, under South Africa and under the Society Islands in the Pacific. The origin and nature of these features are still debated, but strong indications exist that they are chemically distinct and intrinsically denser than the surrounding mantle and may have been in existence since the formation of our planet (Forte et al. 2010). These superplumes are the most pronounced of a series of much narrower lower mantle plumes, first discovered with finite-frequency tomography by Montelli et al. (2004), who combined cross-correlation delay data with onset time data. The difference in sensitivity between the two types of delays to features with a typical length scale of several hundred km allowed the imaging of plumes with diameters of 400 km and larger. These rise more or less vertically from the core-mantle boundary (see Fig. 4) and are located beneath volcanic islands such as Hawaii or Cape Verde. The surface separating the upper and lower mantle shows topography of several tens of kilometers; since this surface is a silicate phase transition, this shows that strong lateral temperature variations must exist in the interior of our planet (Lawrence and Shearer 2008). Most surprising is that lateral variations persist to the very center of the Earth, despite its high temperature (more than 5,000 °C) and pressure (365 GPa): the solid inner core shows an Eastern and Western hemisphere with different seismic velocity and anisotropy (e.g., Irving and Deuss 2011).
4
Future Directions
The theory of seismic wave propagation is by now well developed, and stable algorithms exist that can predict seismic motion up to frequencies that approach 1 Hz; with exaflop computing facilities widely believed to be within reach before 2020, this means we can soon compute wavefields over the full observable frequency range. More fundamental progress is thus only to be expected on the side of the observations. Current seismic networks severely undersample the wavefield, even in dedicated high-density deployments such as USArray with an average station distance of 70 km. Except for a few ocean island stations, and temporary and expensive deployment of seismometers on the ocean floor, the oceanic domain is void of sensors, severely hampering global tomography. Acoustic sensors (hydrophones) mounted on robotic floats that drift with the ocean currents may soon start to provide us with observations of P wave arrivals, the delays of which will be instrumental to resolve anomalies in the mantle beneath the oceans.
Finite-frequency methods or numerical algorithms to compute the seismogram in a laterally heterogeneous Earth make it in principle also possible to interpret amplitudes. Amplitudes are influenced by focusing and defocusing effects, as well as by the intrinsic attenuation of the rock. If the attenuation can be reliably inferred from seismic observations, important constraints on temperature can be obtained. The interpretation of amplitudes in terms of 3D structure is however still at an early stage, partly because the instrument response is usually better known for the phase, and with GPS clock corrections, the time keeping has become very reliable. In contrast, the recorded amplitude of the seismic signal is influenced by the impedance of the local structure directly beneath the seismograph and is thus less certain than the phase. This presents an obstacle that will be important to overcome in the near future.
Fig. 3 Vertical mantle sections across the Tonga-Kermadec arc, where oceanic lithosphere subducts back into the mantle. The top two rows show perturbations in the P velocity $\alpha$ in two different models, obtained with transmission tomography. For comparison, the models in the lower half of the figure show the perturbation in the S velocity $\beta$ for the same locations, obtained with low frequency data (normal modes). The velocity variations are relative to a spherical average. Blue colors represent fast, red slow; to first order these can be interpreted as cold and hot regions in the mantle, respectively. The amplitude scale is different among the four models, as indicated by the numbers below each image. Circles denote earthquakes, clear indicators of active subduction (From Fukao et al. (2001), reproduced with permission from the American Geophysical Union)
Fig. 4 If one vertically averages the P velocity anomalies over the deepest 1,000 km of the lower mantle in a tomographic model, so as to emphasize features with vertical continuity, the two superplumes present under South Africa and the South Pacific become apparent, as do others, such as Hawaii and the Canary Island plume. As in Fig. 3, reddish colors indicate slow, presumably hot, mantle rock (From Montelli et al. (2006), with permission from the American Geophysical Union)
5
Conclusion
Since its introduction in seismology more than 30 years ago, seismic tomography has become an indispensable tool to study the interior of our planet and is increasingly finding applications in our search for natural resources and in the monitoring of activities underground, ranging from the evolution of volcanic activity or the pumping of a gas reservoir to the detection of tunnels intended to escape border controls. The theory of seismic transmission tomography has evolved in the past decade from the simple ray-theoretical approach toward finite-frequency interpretation of cross-correlation delays and toward adjoint inversion of full waveforms. With these new methods we are able to extract significantly more information from observed seismograms than was possible until recently. Most of the improvement in the near future will come from an increase in data, rather than from theoretical improvements. The growing size of the inverse problem will require a continued growth in the power of supercomputers or new mathematical techniques to reduce the linearized system without significant loss of information.
References
Bois P, la Porte M, Lavergne M, Thomas G (1971) Essai de determination automatique des vitesses sismiques par mesures entre puits. Geophys Prospect 19:42–81
Dahlen FA, Hung S-H, Nolet G (2000) Fréchet kernels for finite-frequency traveltimes – I. Theory. Geophys J Int 141:157–174
Dahlen FA, Tromp J (1998) Theoretical global seismology. Princeton University Press, Princeton
Fichtner A, Bunge H-P, Igel H (2006) The adjoint method in seismology I. Theory. Phys Earth Planet Inter 157:86–104
Fletcher R, Reeves C (1964) Function minimization by conjugate gradients. Comput J 7:149–154
Forte AM, Sandrine Q, Moucha R, Simmons NA, Grand SP, Mitrovica JX, Rowley DB (2010) Joint seismic-geodynamic-mineral physical modelling of African geodynamics: a reconciliation of deep-mantle convection with surface geophysical constraints. Earth Planet Sci Lett 295:329–341
Fukao Y, Widiyantoro S, Obayashi M (2001) Stagnant slabs in the upper and lower mantle transition region. Rev Geophys 39:291–323
Irving JCE, Deuss A (2011) Hemispherical structure in inner core velocity anisotropy. J Geophys Res 116:B04307
Kawai K, Takeuchi N, Geller RJ (2006) Complete synthetic seismograms up to 2 Hz for transversely isotropic spherically symmetric media. Geophys J Int 164:411–424
Lawrence JF, Shearer PM (2008) Imaging mantle transition zone thickness with SdS-SS finite-frequency sensitivity kernels. Geophys J Int 174:143–158
Loris I (2015) Numerical algorithms for non-smooth optimization applicable to seismic recovery. In: Freeden et al. (eds) Handbook of Geomathematics, 2nd edn. Springer
Luo Y, Schuster GT (1991) Wave-equation travel time tomography. Geophysics 56:645–653
Mercerat D, Nolet G (2012) Comparison of ray-based and adjoint-based sensitivity kernels for body-wave seismic tomography. Geophys Res Lett 39:L12301
Mercerat ED, Nolet G (2013) On the linearity of cross-correlation delay times in finite-frequency tomography. Geophys J Int 192:681–687
Montelli R, Nolet G, Dahlen FA, Masters G (2006) A catalogue of deep mantle plumes: new results from finite-frequency tomography. Geochem Geophys Geosys (G3) 7:Q11007
Montelli R, Nolet G, Dahlen FA, Masters G, Engdahl ER, Hung S-H (2004) Finite frequency tomography reveals a variety of plumes in the mantle. Science 303:338–343
Nissen-Meyer T, Dahlen FA, Fournier A (2007) Spherical-earth Fréchet sensitivity kernels. Geophys J Int 168:1051–1066
Nolet G (2008) A breviary of seismic tomography. Cambridge University Press, Cambridge
Tibuleac IM, Nolet G, Michaelson C, Koulakov I (2003) P wave amplitudes in a 3-D Earth. Geophys J Int 155:1–10
Tromp J, Tape C, Liu Q (2005) Seismic tomography, adjoint methods, time reversal and banana-doughnut kernels. Geophys J Int 160:195–216
Zhao L, Jordan TH, Chapman CH (2000) Three-dimensional Fréchet kernels for seismic delay times. Geophys J Int 141:558–576
Numerical Algorithms for Non-smooth Optimization Applicable to Seismic Recovery Ignace Loris
Contents
1 Introduction
2 Basic Concepts
2.1 Convex Functions and Their Subdifferentials
2.2 Projections and Proximity Operators
3 Iterative Algorithms for Convex Optimization
3.1 A Minimization Problem Involving a Single Convex Function
3.2 A Minimization Problem Involving Two Convex Functions
3.3 Penalized Least Squares Minimization
4 Application to Linear Inverse Problems
4.1 Least Squares Recovery
4.2 Robust Recovery
4.3 Uniform Recovery
5 Example
6 Conclusions
References
Abstract
Inverse problems in seismic tomography are often cast in the form of an optimization problem involving a cost function composed of a data misfit term and regularizing constraint or penalty. Depending on the noise model that is assumed to underlie the data acquisition, these optimization problems may be non-smooth. Another source of lack of smoothness (differentiability) of the cost function may arise from the regularization method chosen to handle the ill-posed nature of the inverse problem. A numerical algorithm that is well suited to handle minimization problems involving two non-smooth convex functions and two linear operators is studied. The emphasis lies on the use of some simple proximity operators that allow for the iterative solution of non-smooth convex optimization problems. Explicit formulas for several of these proximity operators are given and their application to seismic tomography is demonstrated.
I. Loris, Université libre de Bruxelles, Bruxelles, Belgium e-mail: [email protected] © Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_65
1 Introduction
Global seismic tomography deals with the determination of the Earth’s inner structure based on the measurement of earthquake data (Nolet 2008). Inverse problems in global seismic tomography are typically plagued by a lack of measurement data that would allow for a unique determination of the seismic wave-speed anomaly in the Earth’s mantle. In addition, available data is contaminated by measurement noise. Assuming a linear relationship between wave-speed anomaly $u$ and measurement data $y$, seismic recovery can be written as an ill-posed linear problem $Ku = y$. Its solution may not exist or may not be unique; the singular value spectrum of the measurement operator $K$, in combination with the noise, may make recovery unstable. Inverse problems of this type are often regularized by changing them into minimization problems involving two parts. A first part relates to the minimizing of the difference between observed (noisy) data $y$ and predicted data $Ku$, while a second part plays the role of regularizer. Regularization may be achieved by penalizing “large” solutions (Tikhonov regularization (Tikhonov 1963)) or by constraining the solution to lie in a bounded set (Ivanov regularization (Ivanov 1976)).
In this tutorial paper, a number of algorithms applicable to optimization problems of this kind are studied. Let $f(Ku)$ represent the data misfit term, measuring the goodness of fit between predicted data $Ku$ and experimental data $y$. We will assume that the function $f$ is a “simple” nonnegative convex function. In this context, “simple” should be understood as $f$ possessing a proximity operator that can be computed easily. Similarly, we will assume that the penalty function has the form $g(Au)$, where $A$ is a linear operator and $g$ is a simple convex non-negative function (possessing an easy-to-compute proximity operator). The function $g$ thus serves to impose a penalty or a constraint on the set of possible models with the same data misfit $f$.
Having introduced the two matrices $K$ and $A$, and the corresponding functions $f$ and $g$, the problem we wish to solve is
\[ \arg\min_u \; f(Ku) + g(Au). \tag{1} \]
Whereas the function $f$ and the matrix $K$ are dictated by the experimental setup, the function $g$ (and the matrix $A$) is chosen by the scientist to impose desirable properties on the reconstruction $u$. In case the data is contaminated by Gaussian noise, the function $f(Ku)$ is just the usual least squares function $\|y - Ku\|_2^2/2$. However, if the data is affected by outliers in the noise, a more robust data misfit term may be needed. One well-known example is the sum of absolute deviations $\|y - Ku\|_1$, which puts less weight on large deviations than ordinary least squares and which is therefore less sensitive to outliers. Early use in seismic tomography is found in Claerbout and Muir (1973), Taylor et al. (1979), and Santosa and Symes (1986). Yet another noise model could assume a uniform distribution of the noise, whereby one would need to treat the minimization of the function $\|y - Ku\|_\infty$. Clearly, the $\ell_1$-norm and the $\ell_\infty$-norm provide two examples of non-smooth (non-differentiable) functions. Moreover, a single problem may contain a combination of different noise models.
Even if Gaussian noise is assumed, non-differentiability of the cost function in minimization problem (1) may be introduced through the regularization term $g$. Traditionally, this has often just been the $\ell_2$-norm squared $\|u\|_2^2$ of the unknown $u$ (for imposing a limit on the size of $u$) or the $\ell_2$-norm squared of the gradient of $u$ (for imposing some smoothness on the seismic wave speed $u$). Recently (Daubechies et al. 2004; Bruckstein et al. 2009), the use of the $\ell_1$-norm as a way of imposing certain a priori information on the solution of linear inverse problems has gained popularity. One example consists of using the $\ell_1$-norm to impose sparsity on $u$ or, more precisely, on a linear transformation $Au$ of the vector $u$. Another example is the so-called total variation regularization (Rudin et al. 1992), where the $\ell_1$-norm of the local gradient is used to impose small local variations, while allowing for sharper discontinuities than those admitted by the use of the $\ell_2$-norm squared of the local gradient. Applications of these techniques in seismic recovery are, e.g., discussed in Loris et al. (2007), Herrmann et al. (2008), Herrmann and Hennenfent (2008), Loris et al. (2010), and Gholami and Siahkoohi (2010).
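To make the difference between the three noise models concrete, the following minimal sketch evaluates the three misfit terms mentioned above on a small residual vector, with and without a single outlier. The function name `misfits` and the toy data are illustrative assumptions and do not come from the chapter.

```python
import numpy as np

def misfits(y, Ku):
    """Evaluate the three data misfit terms for the residual y - Ku."""
    r = y - Ku
    return {
        "least_squares": 0.5 * np.sum(r**2),  # ||y - Ku||_2^2 / 2 (Gaussian noise)
        "absolute": np.sum(np.abs(r)),        # ||y - Ku||_1 (robust to outliers)
        "uniform": np.max(np.abs(r)),         # ||y - Ku||_inf (uniform noise)
    }

# Toy data (assumed for illustration): predicted data is zero,
# observed data is small noise, once with one gross outlier.
predicted = np.zeros(5)
clean = np.array([0.1, -0.1, 0.1, -0.1, 0.1])
with_outlier = clean.copy()
with_outlier[0] = 10.0

m_clean = misfits(clean, predicted)
m_outlier = misfits(with_outlier, predicted)

# Factor by which each misfit grows because of the single outlier.
growth = {name: m_outlier[name] / m_clean[name] for name in m_clean}
print(growth)
```

In this toy case the least squares misfit grows far more than the sum of absolute deviations, which is the sense in which the $\ell_1$ misfit is less sensitive to outliers.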
The goal of this paper is to provide an introduction and guide to the use of proximity operator-based algorithms for convex minimization problems of type (1). Such proximity operators and their connection to optimization are discussed in Sect. 2. A small number of iterative optimization algorithms will be discussed in Sect. 3. The aim of this paper is not to compare different algorithms and their speed of convergence. Indeed, many iterative algorithms can be written for solving the same problem (see, e.g., Esser 2010; Esser et al. 2010). For the same reason, we will not systematically formulate the most general version of each algorithm but restrict ourselves to one that is of sufficient practical use. Problem (1) is symmetric with respect to interchanging $f$ and $g$ (and $K$ and $A$). However, in the proposed algorithm, we will treat the two functions $f$ and $g$ (and the two operators $K$ and $A$) in a slightly different manner. The advantage of this is that the conditions on $K$ and $A$ for convergence of the iterative algorithm are uncoupled. We shall argue that this is the more natural thing to do as the operator $K$ is fixed by the physics of the data collection, whereas the operator $A$ is determined by assumptions underlying the regularization. The proposed iterative algorithms are applied to linear inverse problems in Sect. 4, assuming different noise models (e.g., robust and uniform noise models).
Section 5 describes a synthetic inverse problem in global seismic tomography that demonstrates the use of these algorithms and noise models. In particular, the total variation regularization (on an irregular triangular or tetrahedral grid) of a global seismic tomographic problem is demonstrated.
2 Basic Concepts
In this section, a number of basic concepts of convex analysis are introduced. The most important one is that of the proximity operator, which is a generalization of the projection on a convex set. It will form the basis of the algorithms discussed in Sects. 3 and 4. The emphasis is on presenting relevant examples explicitly, rather than giving proofs in full detail. An introduction to convex optimization may be found in Rockafellar (1997) and Boyd and Vandenberghe (2004).
2.1 Convex Functions and Their Subdifferentials
As is well known, a set $C \subset \mathbb{R}^d$ is said to be convex if
\[ u, v \in C \;\Rightarrow\; \lambda u + (1-\lambda) v \in C \tag{2} \]
for all $\lambda \in [0,1]$. A function $f: \mathbb{R}^d \to \bar{\mathbb{R}}$ is said to be convex if
\[ f(\lambda u + (1-\lambda) v) \le \lambda f(u) + (1-\lambda) f(v) \tag{3} \]
for all points $u, v \in \mathbb{R}^d$ and for $\lambda \in [0,1]$. The convex functions that we will be mostly interested in are expressed in terms of the $\ell_1$-norm, the $\ell_2$-norm (or Euclidean norm), and the $\ell_\infty$-norm (or max-norm). They are defined as
\[ \|u\|_1 = \sum_i |u_i|, \qquad \|u\|_2 = \Big( \sum_i |u_i|^2 \Big)^{1/2} \qquad \text{and} \qquad \|u\|_\infty = \max_i |u_i| \tag{4} \]
for any $u \in \mathbb{R}^d$. In some applications, it makes sense to use a mixed norm of type $\|u\|_{1,2} = \sum_i \sqrt{|u_{i,1}|^2 + |u_{i,2}|^2}$ for a vector $u \in \mathbb{R}^{2d}$. This will be the case for the example of Sect. 5. To each of these three convex functions corresponds a ball of radius $R$ around the origin:
\[ B_R^{(p)} = \{ u \mid \|u\|_p \le R \}, \qquad \text{for } p = 1, 2, \infty. \tag{5} \]
The indicator functions of these convex sets,
\[ i_{B_R^{(p)}}(u) = \begin{cases} 0 & \|u\|_p \le R \\ +\infty & \|u\|_p > R \end{cases} \tag{6} \]
($p = 1, 2, \infty$), are also convex functions according to definition (3). The convex dual $f^*$ of a convex function $f$ is defined as
\[ f^*(u) = \sup_v \; \langle v, u \rangle - f(v). \tag{7} \]
A straightforward calculation shows that the dual function of $f(u) = \frac{1}{2}\|u\|_2^2$ is simply $f^*(u) = \frac{1}{2}\|u\|_2^2$. One can also show the following relations:
\[ f(u) = \|u\|_p \quad \Rightarrow \quad f^*(u) = i_{B_1^{(q)}}(u) \tag{8} \]
(where $1/p + 1/q = 1$). We will use this property often in the following subsection.
In case of convex functions, the notion of derivative can be extended to non-differentiable functions. Let $f: \mathbb{R}^d \to \bar{\mathbb{R}}$ be a convex function. The subdifferential $\partial f(u)$ of $f$ at the point $u \in \mathbb{R}^d$ is the set of vectors $w \in \mathbb{R}^d$ such that
\[ f(v) \ge f(u) + \langle w, v - u \rangle \qquad \forall v. \tag{9} \]
In case confusion is possible, one can explicitly indicate the independent variable and write $\partial_u f$ instead of $\partial f$. The elements $w$ of the set $\partial f(u)$ are called the subgradients of $f$ at $u$. Even for a non-differentiable function, the subgradient can still be interpreted as the slope of a line (or plane) touching the function $f$ from below at the point $u$ (more than one such touching line/plane can exist). In case the function $f$ is differentiable at $u$, the subdifferential reduces to a single vector (the usual gradient). If the subdifferential is a singleton, it is often identified with its only element. A simple example of the subdifferential of a non-differentiable convex function is
\[ f: \mathbb{R} \to \mathbb{R}: f(u) = |u| \quad \Rightarrow \quad \partial f(u) = \begin{cases} \{-1\} & u < 0 \\ [-1, 1] & u = 0 \\ \{1\} & u > 0. \end{cases} \tag{10} \]
The subdifferential can be used to express the minimization of a convex function $f$:
\[ \hat{u} = \arg\min_u f(u) \quad \Leftrightarrow \quad 0 \in \partial f(\hat{u}). \tag{11} \]
Indeed, $\hat{u} = \arg\min_u f(u)$ corresponds to $f(u) \ge f(\hat{u})$ for all $u$ or equivalently $f(u) \ge f(\hat{u}) + \langle 0, u - \hat{u} \rangle$ for all $u$. This is just saying that $0 \in \partial f(\hat{u})$.
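The subgradient inequality (9) can be checked numerically for the absolute value function. The sketch below is only an illustration: the helper name `is_subgradient_of_abs` and the test grid are assumptions, and a finite grid can of course only falsify, not prove, the inequality.

```python
import numpy as np

# Grid of test points v on which inequality (9) is checked (an assumed choice).
V_GRID = np.linspace(-5.0, 5.0, 2001)

def is_subgradient_of_abs(w, u):
    """Check f(v) >= f(u) + w * (v - u) for f = |.| at all grid points v."""
    lower_bound = np.abs(u) + w * (V_GRID - u)
    return bool(np.all(np.abs(V_GRID) >= lower_bound - 1e-12))

# At u = 0 every slope in [-1, 1] touches |u| from below.
print(is_subgradient_of_abs(0.3, 0.0))   # slope inside [-1, 1]
print(is_subgradient_of_abs(1.2, 0.0))   # slope outside [-1, 1]
# At u = 2 the function is differentiable and only the slope 1 qualifies.
print(is_subgradient_of_abs(1.0, 2.0))
print(is_subgradient_of_abs(0.5, 2.0))
```

In particular, the slope $w = 0$ passes the check at $u = 0$, which is the optimality condition (11) for the minimizer of $|u|$.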
Let us, as an example, express the conditions for minimizing the function of formula (1), where $K$ and $A$ are two linear operators (matrices), in terms of the subdifferentials of $f$ and $g$. We find that
\[ u = \arg\min f(Ku) + g(Au) \quad \Leftrightarrow \quad 0 \in K^T \partial f(Ku) + A^T \partial g(Au), \tag{12} \]
or in other words, there exist $v$ and $w$ such that
\[ 0 = K^T v + A^T w, \qquad v \in \partial f(Ku) \quad \text{and} \quad w \in \partial g(Au). \tag{13} \]
In the next subsection, we will rewrite the inclusions $v \in \partial f(Ku)$ and $w \in \partial g(Au)$ as algebraic equalities. This will allow us to rewrite minimization problem (1) as a system of algebraic equations. For solving these equations, we will then write iterative algorithms.
2.2 Projections and Proximity Operators
As this tutorial paper is aimed at geoscientists, the goal of this subsection is not to give the properties of the proximity operators in their full mathematical generality. A more detailed overview, with applications to optimization problems, may also be found in Combettes and Wajs (2005) and Combettes and Pesquet (2011).
Projection on a Convex Set
The projection of the vector $u$ on the (nonempty) closed convex set $C$ is defined as the closest point in $C$ to $u$:
\[ P_C(u) = \arg\min_{v \in C} \|u - v\|_2^2. \tag{14} \]
For the $\ell_p$-balls $B_R^{(p)}$, these projections can be explicitly calculated. The projection $P_R^{(\infty)}$ on the convex set $B_R^{(\infty)}$ is given by
\[ P_R^{(\infty)}(u)_i = \begin{cases} u_i & |u_i| \le R \\ R\, u_i / |u_i| & |u_i| > R \end{cases} \tag{15} \]
(component-wise calculation). The projection $P_R^{(2)}$ on the Euclidean ball $B_R^{(2)}$ is given by
\[ P_R^{(2)}(u) = \begin{cases} u & \|u\|_2 \le R \\ R\, u / \|u\|_2 & \|u\|_2 > R. \end{cases} \tag{16} \]
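The two explicit projections on the max-norm ball and on the Euclidean ball translate directly into a few lines of NumPy. This is a minimal sketch; the function names are illustrative assumptions and not part of the chapter.

```python
import numpy as np

def project_linf_ball(u, R):
    """Projection on the max-norm ball of radius R: clip each component to [-R, R]."""
    return np.clip(u, -R, R)

def project_l2_ball(u, R):
    """Projection on the Euclidean ball of radius R: rescale u if it lies outside."""
    norm = np.linalg.norm(u)
    if norm <= R:
        return u.copy()
    return (R / norm) * u

print(project_linf_ball(np.array([3.0, -0.5, -2.0]), 1.0))
print(project_l2_ball(np.array([3.0, 4.0]), 1.0))
```

Both operations cost a single pass over the vector, which is what makes them attractive as building blocks of iterative algorithms.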
for $N > M$. As there exists a converging subsequence, the right-hand side can be made as small as one desires by choosing $M$ large enough. It follows that the left-hand side tends to zero for $N \to \infty$. In other words, the whole sequence $(u^{(n)}, v^{(n)})$ (and not just a subsequence) converges to the limit $(u^\dagger, v^\dagger)$. □
As algorithm (42) implies that $u^{(n+1)} = \bar{u}^{(n+1)} - K^T (v^{(n+1)} - v^{(n)})$ and hence $u^{(n)} = \bar{u}^{(n)} - K^T (v^{(n)} - v^{(n-1)})$, it follows that the variable $u^{(n)}$ can be eliminated from iteration (42). One finds the equivalent algorithm:
\[ \begin{cases} u^{(n+1)} = u^{(n)} - K^T \big( 2 v^{(n)} - v^{(n-1)} \big) \\ v^{(n+1)} = \mathrm{prox}_{f^*} \big( v^{(n)} + K u^{(n+1)} \big) \end{cases} \tag{49} \]
(where we have also dropped the bar from the variable $\bar{u}$). Algorithm (42) (or (49)) is well known. It is a special case of algorithm 1 of Chambolle and Pock (2010), of algorithm A1 of Zhang et al. (2011), or of the so-called PDHGMp algorithm of Esser et al. (2010). It is also possible to introduce variable step lengths (depending on iteration step $n$) (see, e.g., Esser (2010, page 78)). Algorithm (42) (or equivalently algorithm (49)) can also be used to minimize a sum of convex functions. The following result is a direct consequence of the previous proposition.
Proposition 2. Let $f_i: \mathbb{R}^{d_i} \to \bar{\mathbb{R}}$ be $m$ convex functions, let $K_i: \mathbb{R}^d \to \mathbb{R}^{d_i}$ be $m$ linear operators with $\big\| \sum_{i=1}^m K_i^T K_i \big\|_2 < 1$, and suppose that a minimizer of the problem
\[ \arg\min_u \sum_{i=1}^m f_i(K_i u) \tag{50} \]
exists. The algorithm
\[ \begin{cases} u^{(n+1)} = u^{(n)} - \sum_{i=1}^m K_i^T \big( 2 v_i^{(n)} - v_i^{(n-1)} \big) \\ v_i^{(n+1)} = \mathrm{prox}_{f_i^*} \big( v_i^{(n)} + K_i u^{(n+1)} \big) \qquad i = 1 \ldots m \end{cases} \tag{51} \]
($u^{(0)}, v_1^{(0)}, \ldots, v_m^{(0)}$ are arbitrary) converges to a minimizer of problem (50).
Proof. It suffices to introduce a matrix $K$ and a convex function $f$ by
\[ K = \begin{pmatrix} K_1 \\ \vdots \\ K_m \end{pmatrix} \qquad \text{and} \qquad f(v_1, \ldots, v_m) = f_1(v_1) + f_2(v_2) + \ldots + f_m(v_m), \tag{52} \]
to rewrite problem (50) into the form required for applying Proposition 1 (and algorithm (49)). Keeping in mind Property 2 of proximity operators, one has that
\[ \mathrm{prox}_f(v_1, \ldots, v_m) = \big( \mathrm{prox}_{f_1}(v_1), \ldots, \mathrm{prox}_{f_m}(v_m) \big) \tag{53} \]
and similarly for $\mathrm{prox}_{f^*}$. Algorithm (49) applied to problem (50) therefore takes the form (51), where we have also split the auxiliary variable $v$ of algorithm (49) in $m$ parts $(v_1, \ldots, v_m)$, with $v_i \in \mathbb{R}^{d_i}$. In this case, the condition for convergence on the matrices $K_i$ becomes $\|K\|_2 < 1$, which is equivalent to $\|K^T K\|_2 < 1$ or $\big\| \sum_{i=1}^m K_i^T K_i \big\|_2 < 1$. □
The $m$ proximity operator steps can be executed in parallel, which may be of interest in large-scale computing on a computer cluster. Unfortunately, the single condition $\big\| \sum_{i=1}^m K_i^T K_i \big\|_2 < 1$ mixes all the matrices $K_i$ in just one inequality. In some applications, it may be more convenient to have separate conditions on the matrices $K_i$, e.g., $\|K_i\|_2 < 1$. Indeed, instead of a parallel algorithm, one could prefer a sequential algorithm that updates the unknown $u^{(n)}$ after each use of a block $K_i$. Such algorithms are called block iterative, or row-action algorithms (if each $K_i$ consists of just a single row). Such a minimization modality is of great interest, e.g., in computer tomography (in medical imaging) where the matrix $K$ is simply too large to be stored in memory (each block of rows $K_i$ is recomputed on the fly). Certain existing row-action methods combine an iteratively reweighted least squares approach with an approximation of the cost function (Defrise et al. 2011; Nikazad et al. 2012). Apart from use in domains where the sensing matrix $K$ is simply too large to be stored in memory, such algorithms could also exhibit potentially superior performance (faster convergence). We will not discuss the problem of solving problem (50) by a block-iterative algorithm in general. We will however look at the particular case $m = 2$ in more detail in the next subsection. It turns out that a relatively simple modification of algorithm (51) can be made with the desired properties.
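Iteration (49) can be sketched in a few lines for one concrete choice of $f$. As an assumption made for illustration only, take the sum of absolute deviations $f(v) = \|v - y\|_1$; a short calculation with definition (7) then gives $\mathrm{prox}_{f^*}(x)$ as the projection of $x - y$ on the unit max-norm ball. The test matrix, data, and function names below are hypothetical.

```python
import numpy as np

def prox_conj_l1_misfit(x, y):
    """prox of f* for f(v) = ||v - y||_1: project x - y on the unit max-norm ball."""
    return np.clip(x - y, -1.0, 1.0)

def algorithm_49(K, y, n_iter=300):
    """Iteration (49) for arg min_u ||K u - y||_1, assuming ||K||_2 < 1."""
    u = np.zeros(K.shape[1])
    v = np.zeros(K.shape[0])
    v_prev = v.copy()
    for _ in range(n_iter):
        # u-update uses the extrapolated dual variable 2 v^(n) - v^(n-1).
        u = u - K.T @ (2.0 * v - v_prev)
        # v-update: keep the old v as v^(n-1) and apply the proximity operator.
        v_prev, v = v, prox_conj_l1_misfit(v + K @ u, y)
    return u

# Small consistent test problem (assumed): K has orthogonal columns scaled so
# that its spectral norm is 0.9, as required by the convergence condition.
Q, _ = np.linalg.qr(np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0],
                              [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]]))
K = 0.9 * Q
u_true = np.array([0.5, -0.3])
y = K @ u_true

u_rec = algorithm_49(K, y)
print(u_rec)
```

For data that are fitted exactly, as here, the minimizer is the model that generated the data; replacing entries of `y` by gross outliers is a natural next experiment, with the caveat that the number of iterations needed may then be larger.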
3.2 A Minimization Problem Involving Two Convex Functions
We finally come to the main optimization problem (1) discussed in this paper. Let $f: \mathbb{R}^{d_1} \to \bar{\mathbb{R}}$ and $g: \mathbb{R}^{d_2} \to \bar{\mathbb{R}}$ be two convex functions. Let $K: \mathbb{R}^d \to \mathbb{R}^{d_1}$ and $A: \mathbb{R}^d \to \mathbb{R}^{d_2}$ be two linear operators. We are interested in iterative algorithms for solving the optimization problem:
\[ \arg\min_u \; f(Ku) + g(Au). \tag{54} \]
As problem (54) is merely a special case ($m = 2$) of problem (50), one can use algorithm (51) which we write down here for future reference.
Proposition 3. Let $f: \mathbb{R}^{d_1} \to \bar{\mathbb{R}}$ and $g: \mathbb{R}^{d_2} \to \bar{\mathbb{R}}$ be two convex functions. Let $K: \mathbb{R}^d \to \mathbb{R}^{d_1}$ and $A: \mathbb{R}^d \to \mathbb{R}^{d_2}$ be two linear operators with $\| K^T K + A^T A \|_2 < 1$, and suppose that a minimizer of the problem
\[ \arg\min_u \; f(Ku) + g(Au) \tag{55} \]
exists. The algorithm
\[ \begin{cases} u^{(n+1)} = u^{(n)} - K^T \big( 2 v^{(n)} - v^{(n-1)} \big) - A^T \big( 2 w^{(n)} - w^{(n-1)} \big), \\ v^{(n+1)} = \mathrm{prox}_{f^*} \big( v^{(n)} + K u^{(n+1)} \big), \\ w^{(n+1)} = \mathrm{prox}_{g^*} \big( w^{(n)} + A u^{(n+1)} \big), \end{cases} \tag{56} \]
($u^{(0)}$, $v^{(0)}$, and $w^{(0)}$ are arbitrary) converges to a minimizer of problem (55).
Proof. See Proposition 2. □
The mixing of the operators $K$ and $A$ in the single condition $\| K^T K + A^T A \|_2 < 1$ is not so convenient in practice. As mentioned in the introduction, the linear operator $K$ models the relationship between unknown $u$ and data $y$, and a second (unrelated) linear operator $A$ is chosen for its regularizing properties in combination with a suitable penalty function $g$. As $A$ is merely a choice, one would often like to explore several possibilities for it. In that case, separate conditions on $K$ and $A$ are more convenient. The following algorithm has exactly this property.
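Proposition 4. Let $f: \mathbb{R}^{d_1} \to \bar{\mathbb{R}}$ and $g: \mathbb{R}^{d_2} \to \bar{\mathbb{R}}$ be two convex functions. Let $K: \mathbb{R}^d \to \mathbb{R}^{d_1}$ and $A: \mathbb{R}^d \to \mathbb{R}^{d_2}$ be two linear operators with $\|K\|_2 < 1$ and $\|A\|_2 < 1$, and suppose that a minimizer of the problem
\[ \arg\min_u \; f(Ku) + g(Au) \tag{57} \]
exists. The algorithm
\[ \begin{cases} \bar{u}^{(n+1)} = u^{(n)} - K^T \big( 2 v^{(n)} - v^{(n-1)} \big) - A^T w^{(n)}, \\ w^{(n+1)} = \mathrm{prox}_{g^*} \big( w^{(n)} + A \bar{u}^{(n+1)} \big), \\ u^{(n+1)} = u^{(n)} - K^T \big( 2 v^{(n)} - v^{(n-1)} \big) - A^T w^{(n+1)}, \\ v^{(n+1)} = \mathrm{prox}_{f^*} \big( v^{(n)} + K u^{(n+1)} \big), \end{cases} \tag{58} \]
($u^{(0)}$, $\bar{u}^{(0)}$, $v^{(0)}$, and $w^{(0)}$ are arbitrary) converges to a minimizer of problem (57).
Proof. Let $(\hat{u}, \hat{v}, \hat{w})$ again be a solution of the equations that characterize the minimizers of problem (57) (see Property 6):
\[ \hat{u} = \hat{u} - K^T \hat{v} - A^T \hat{w}, \qquad \hat{w} = \mathrm{prox}_{g^*} ( \hat{w} + A \hat{u} ), \qquad \text{and} \qquad \hat{v} = \mathrm{prox}_{f^*} ( \hat{v} + K \hat{u} ). \tag{59} \]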
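Applying Property 3, with $t^+ = w^{(n+1)}$, $t^- = w^{(n)}$, $t = \hat{w}$, and $\Delta = A \bar{u}^{(n+1)}$, to the second line of algorithm (58) leads to the inequality:
\[ \| w^{(n+1)} - \hat{w} \|_2^2 \le \| w^{(n)} - \hat{w} \|_2^2 - \| w^{(n+1)} - w^{(n)} \|_2^2 + 2 \langle w^{(n+1)} - \hat{w}, A \bar{u}^{(n+1)} \rangle + 2 g^*(\hat{w}) - 2 g^*(w^{(n+1)}). \]
Applying Property 3, with $t^+ = \hat{w}$, $t^- = \hat{w}$, $t = w^{(n+1)}$, and $\Delta = A \hat{u}$, to the second equation of (59) leads to
\[ \| \hat{w} - w^{(n+1)} \|_2^2 \le \| \hat{w} - w^{(n+1)} \|_2^2 - \| \hat{w} - \hat{w} \|_2^2 + 2 \langle \hat{w} - w^{(n+1)}, A \hat{u} \rangle + 2 g^*(w^{(n+1)}) - 2 g^*(\hat{w}). \]
The two previous inequalities add up to
\[ \| w^{(n+1)} - \hat{w} \|_2^2 \le \| w^{(n)} - \hat{w} \|_2^2 - \| w^{(n+1)} - w^{(n)} \|_2^2 + 2 \langle w^{(n+1)} - \hat{w}, A ( \bar{u}^{(n+1)} - \hat{u} ) \rangle. \tag{60} \]
We can proceed similarly with the fourth line of algorithm (58) and the last equation in the list (59) to find
\[ \| v^{(n+1)} - \hat{v} \|_2^2 \le \| v^{(n)} - \hat{v} \|_2^2 - \| v^{(n+1)} - v^{(n)} \|_2^2 + 2 \langle v^{(n+1)} - \hat{v}, K ( u^{(n+1)} - \hat{u} ) \rangle. \tag{61} \]
Finally, the third equation in algorithm (58) and the first equation in the list (59) lead to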
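\[ \| u^{(n+1)} - \hat{u} \|_2^2 = \| u^{(n)} - \hat{u} \|_2^2 - \| u^{(n+1)} - u^{(n)} \|_2^2 + 2 \langle u^{(n+1)} - \hat{u}, -K^T ( v^{(n)} - \hat{v} ) - K^T ( v^{(n)} - v^{(n-1)} ) - A^T ( w^{(n+1)} - \hat{w} ) \rangle. \tag{62} \]
Summing the three inequalities (60)–(62) results in
\[ \begin{aligned} \| u^{(n+1)} - \hat{u} \|_2^2 &+ \| v^{(n+1)} - \hat{v} \|_2^2 + \| w^{(n+1)} - \hat{w} \|_2^2 \\ \le\; & \| u^{(n)} - \hat{u} \|_2^2 + \| v^{(n)} - \hat{v} \|_2^2 + \| w^{(n)} - \hat{w} \|_2^2 \\ & - \| u^{(n+1)} - u^{(n)} \|_2^2 - \| v^{(n+1)} - v^{(n)} \|_2^2 - \| w^{(n+1)} - w^{(n)} \|_2^2 \\ & + 2 \langle w^{(n+1)} - \hat{w}, A ( \bar{u}^{(n+1)} - \hat{u} ) \rangle \\ & + 2 \langle v^{(n+1)} - \hat{v}, K ( u^{(n+1)} - \hat{u} ) \rangle \\ & + 2 \langle u^{(n+1)} - \hat{u}, -K^T ( v^{(n)} - \hat{v} ) - K^T ( v^{(n)} - v^{(n-1)} ) - A^T ( w^{(n+1)} - \hat{w} ) \rangle \end{aligned} \tag{63} \]
where we replace $\bar{u}^{(n+1)} - \hat{u}$ in the fourth line by $u^{(n+1)} - \hat{u} - A^T ( w^{(n)} - w^{(n+1)} )$ (by virtue of the first and third line of algorithm (58)):
\[ \begin{aligned} \| u^{(n+1)} - \hat{u} \|_2^2 &+ \| v^{(n+1)} - \hat{v} \|_2^2 + \| w^{(n+1)} - \hat{w} \|_2^2 \\ \le\; & \| u^{(n)} - \hat{u} \|_2^2 + \| v^{(n)} - \hat{v} \|_2^2 + \| w^{(n)} - \hat{w} \|_2^2 \\ & - \| u^{(n+1)} - u^{(n)} \|_2^2 - \| v^{(n+1)} - v^{(n)} \|_2^2 - \| w^{(n+1)} - w^{(n)} \|_2^2 \\ & + 2 \langle w^{(n+1)} - \hat{w}, A ( u^{(n+1)} - \hat{u} - A^T ( w^{(n)} - w^{(n+1)} ) ) \rangle \\ & + 2 \langle v^{(n+1)} - \hat{v}, K ( u^{(n+1)} - \hat{u} ) \rangle \\ & + 2 \langle u^{(n+1)} - \hat{u}, -K^T ( v^{(n)} - \hat{v} ) - K^T ( v^{(n)} - v^{(n-1)} ) - A^T ( w^{(n+1)} - \hat{w} ) \rangle. \end{aligned} \tag{64} \]
Some terms cancel, and $-2 \langle w^{(n+1)} - \hat{w}, A A^T ( w^{(n)} - w^{(n+1)} ) \rangle$ is replaced by $\| A^T ( w^{(n+1)} - \hat{w} ) \|_2^2 + \| A^T ( w^{(n+1)} - w^{(n)} ) \|_2^2 - \| A^T ( w^{(n)} - \hat{w} ) \|_2^2$ resulting in
\[ \begin{aligned} \| u^{(n+1)} - \hat{u} \|_2^2 &+ \| v^{(n+1)} - \hat{v} \|_2^2 + \| w^{(n+1)} - \hat{w} \|_A^2 \\ \le\; & \| u^{(n)} - \hat{u} \|_2^2 + \| v^{(n)} - \hat{v} \|_2^2 + \| w^{(n)} - \hat{w} \|_A^2 \\ & - \| u^{(n+1)} - u^{(n)} \|_2^2 - \| v^{(n+1)} - v^{(n)} \|_2^2 - \| w^{(n+1)} - w^{(n)} \|_A^2 \\ & + 2 \langle u^{(n+1)} - \hat{u}, K^T ( v^{(n+1)} - v^{(n)} ) - K^T ( v^{(n)} - v^{(n-1)} ) \rangle. \end{aligned} \tag{65} \]
In this expression, we have introduced $\|w\|_A = \sqrt{ \|w\|_2^2 - \|A^T w\|_2^2 }$, which is a norm as a consequence of the condition $\|A\|_2 < 1$.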
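Introducing new auxiliary variables,
\[ \hat{z} = K \hat{u} \qquad \text{and} \qquad z^{(n)} = K u^{(n)} - v^{(n)} + v^{(n-1)}, \tag{66} \]
one sees that
\[ \begin{aligned} - \| v^{(n+1)} & - v^{(n)} \|_2^2 + 2 \langle u^{(n+1)} - \hat{u}, K^T ( v^{(n+1)} - v^{(n)} ) - K^T ( v^{(n)} - v^{(n-1)} ) \rangle \\ =\; & - \| K u^{(n+1)} - z^{(n+1)} \|_2^2 + 2 \langle K ( u^{(n+1)} - \hat{u} ), K u^{(n+1)} - z^{(n+1)} - K u^{(n)} + z^{(n)} \rangle \\ =\; & - \| K u^{(n+1)} - z^{(n+1)} \|_2^2 + 2 \langle K ( u^{(n+1)} - \hat{u} ), K ( u^{(n+1)} - u^{(n)} ) \rangle \\ & + 2 \langle K ( u^{(n+1)} - \hat{u} ), z^{(n)} - \hat{z} \rangle - 2 \langle K ( u^{(n+1)} - \hat{u} ), z^{(n+1)} - \hat{z} \rangle \\ =\; & - \| K u^{(n+1)} - z^{(n+1)} \|_2^2 + \| K ( u^{(n+1)} - \hat{u} ) \|_2^2 + \| K ( u^{(n+1)} - u^{(n)} ) \|_2^2 - \| K ( \hat{u} - u^{(n)} ) \|_2^2 \\ & + \| K ( u^{(n+1)} - \hat{u} ) \|_2^2 + \| z^{(n)} - \hat{z} \|_2^2 - \| K ( u^{(n+1)} - \hat{u} ) - ( z^{(n)} - \hat{z} ) \|_2^2 \\ & - \| K ( u^{(n+1)} - \hat{u} ) \|_2^2 - \| z^{(n+1)} - \hat{z} \|_2^2 + \| K ( u^{(n+1)} - \hat{u} ) - ( z^{(n+1)} - \hat{z} ) \|_2^2 \\ =\; & \| K ( u^{(n+1)} - \hat{u} ) \|_2^2 + \| K ( u^{(n+1)} - u^{(n)} ) \|_2^2 - \| K ( \hat{u} - u^{(n)} ) \|_2^2 \\ & + \| z^{(n)} - \hat{z} \|_2^2 - \| K u^{(n+1)} - z^{(n)} \|_2^2 - \| z^{(n+1)} - \hat{z} \|_2^2 \end{aligned} \]
(the last step uses $\hat{z} = K \hat{u}$),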
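where we have repeatedly used the identity $2 \langle a, b \rangle = \|a\|_2^2 + \|b\|_2^2 - \|a - b\|_2^2$. Inequality (65) can now be rewritten as
\[ \begin{aligned} \| u^{(n+1)} - \hat{u} \|_K^2 &+ \| v^{(n+1)} - \hat{v} \|_2^2 + \| w^{(n+1)} - \hat{w} \|_A^2 + \| z^{(n+1)} - \hat{z} \|_2^2 \\ \le\; & \| u^{(n)} - \hat{u} \|_K^2 + \| v^{(n)} - \hat{v} \|_2^2 + \| w^{(n)} - \hat{w} \|_A^2 + \| z^{(n)} - \hat{z} \|_2^2 \\ & - \| u^{(n+1)} - u^{(n)} \|_K^2 - \| w^{(n+1)} - w^{(n)} \|_A^2 - \| K u^{(n+1)} - z^{(n)} \|_2^2 \end{aligned} \tag{67} \]
where we have introduced $\|u\|_K = \sqrt{ \|u\|_2^2 - \|K u\|_2^2 }$, which is a norm as a result of the condition $\|K\|_2 < 1$. It follows from inequality (67) that the sequence $(u^{(n)}, v^{(n)}, w^{(n)}, z^{(n)})$ is bounded and that there must exist a converging subsequence $(u^{(n_j)}, v^{(n_j)}, w^{(n_j)}, z^{(n_j)})$. We call the limit of this subsequence $(u^\dagger, v^\dagger, w^\dagger, z^\dagger)$. Inequality (67) also implies that
\[ \begin{aligned} \sum_{n=M}^{N-1} \Big( \| u^{(n+1)} & - u^{(n)} \|_K^2 + \| w^{(n+1)} - w^{(n)} \|_A^2 + \| K u^{(n+1)} - z^{(n)} \|_2^2 \Big) \\ \le\; & \| u^{(M)} - \hat{u} \|_K^2 + \| v^{(M)} - \hat{v} \|_2^2 + \| w^{(M)} - \hat{w} \|_A^2 + \| z^{(M)} - \hat{z} \|_2^2 \\ & - \| u^{(N)} - \hat{u} \|_K^2 - \| v^{(N)} - \hat{v} \|_2^2 - \| w^{(N)} - \hat{w} \|_A^2 - \| z^{(N)} - \hat{z} \|_2^2. \end{aligned} \tag{68} \]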
This implies that $\| u^{(n+1)} - u^{(n)} \|_K^2 + \| w^{(n+1)} - w^{(n)} \|_A^2 + \| K u^{(n+1)} - z^{(n)} \|_2^2$ tends to zero as $n \to \infty$ and therefore that $u^{(n+1)} - u^{(n)}$, $w^{(n+1)} - w^{(n)}$, and $K u^{(n+1)} - z^{(n)}$ tend to zero as $n \to \infty$. We also find that
\[ \| v^{(n)} - v^{(n-1)} \|_2 \overset{(66)}{=} \| K u^{(n)} - z^{(n)} \|_2 = \| K u^{(n+1)} - z^{(n)} + K ( u^{(n)} - u^{(n+1)} ) \|_2 \le \| K u^{(n+1)} - z^{(n)} \|_2 + \| K ( u^{(n)} - u^{(n+1)} ) \|_2 \xrightarrow{n \to \infty} 0. \]
Similarly, we also have that
\[ \| z^{(n+1)} - z^{(n)} \|_2 = \| K ( u^{(n+1)} - u^{(n)} ) - v^{(n+1)} + v^{(n)} + v^{(n)} - v^{(n-1)} \|_2 \le \| K ( u^{(n+1)} - u^{(n)} ) \|_2 + \| v^{(n+1)} - v^{(n)} \|_2 + \| v^{(n)} - v^{(n-1)} \|_2 \xrightarrow{n \to \infty} 0. \]
It follows that $(u^{(n_j+1)}, v^{(n_j+1)}, w^{(n_j+1)}, z^{(n_j+1)})$ therefore also converges to $(u^\dagger, v^\dagger, w^\dagger, z^\dagger)$. As $(u^{(n_j+1)}, v^{(n_j+1)}, w^{(n_j+1)}, z^{(n_j+1)})$ and $(u^{(n_j)}, v^{(n_j)}, w^{(n_j)}, z^{(n_j)})$ are related by formulas (58) and (66), and as $\mathrm{prox}_{f^*}$ and $\mathrm{prox}_{g^*}$ are continuous, it follows that $(u^\dagger, v^\dagger, w^\dagger, z^\dagger)$ is a fixed point of algorithm (58). Therefore, $u^\dagger$ is a minimizer of problem (57). Choosing $(\hat{u}, \hat{v}, \hat{w}, \hat{z}) = (u^\dagger, v^\dagger, w^\dagger, z^\dagger)$ in relation (68) implies
\[ \| u^{(N)} - u^\dagger \|_K^2 + \| v^{(N)} - v^\dagger \|_2^2 + \| w^{(N)} - w^\dagger \|_A^2 + \| z^{(N)} - z^\dagger \|_2^2 \le \| u^{(M)} - u^\dagger \|_K^2 + \| v^{(M)} - v^\dagger \|_2^2 + \| w^{(M)} - w^\dagger \|_A^2 + \| z^{(M)} - z^\dagger \|_2^2 \tag{69} \]
for $N > M$. As $(u^{(n)}, v^{(n)}, w^{(n)}, z^{(n)})$ possesses a converging subsequence, it follows that the whole sequence $(u^{(n)}, v^{(n)}, w^{(n)}, z^{(n)})$ converges to $(u^\dagger, v^\dagger, w^\dagger, z^\dagger)$. □
In contrast to algorithm (56), algorithm (58) appears asymmetric with respect to $f$ and $g$ (and $K$ and $A$). By rewriting the latter algorithm, it is possible to better understand the origin of this asymmetry. Introducing an additional auxiliary variable $\tilde{u}$, algorithm (58) can equivalently be rewritten as
\[ \begin{cases} \bar{u}^{(n+1)} = \tilde{u}^{(n)} - K^T v^{(n)} - A^T w^{(n)}, \\ w^{(n+1)} = \mathrm{prox}_{g^*} \big( w^{(n)} + A \bar{u}^{(n+1)} \big), \\ u^{(n+1)} = \tilde{u}^{(n)} - K^T v^{(n)} - A^T w^{(n+1)}, \\ v^{(n+1)} = \mathrm{prox}_{f^*} \big( v^{(n)} + K u^{(n+1)} \big), \\ \tilde{u}^{(n+1)} = \tilde{u}^{(n)} - K^T v^{(n+1)} - A^T w^{(n+1)}. \end{cases} \tag{70} \]
Indeed, from the third and last equation it follows that $\tilde{u}^{(n+1)} = u^{(n+1)} + K^T ( v^{(n)} - v^{(n+1)} )$ or $\tilde{u}^{(n)} = u^{(n)} + K^T ( v^{(n-1)} - v^{(n)} )$. The auxiliary variable $\tilde{u}^{(n)}$ can therefore be eliminated from system (70), and one recovers algorithm (58). In form (70), it is clear how the symmetry between $f$ (and $K$) and $g$ (and $A$) is broken. A predict-update step is performed with respect to both the auxiliary variables $w$ and $v$, sequentially instead of the parallel update in algorithm (56). The symmetry between $f$ and $g$ (and $K$ and $A$) is therefore broken simply by performing the $w$ update step before the $v$ update. The $v$ update uses an already updated $u$. From the proof of Proposition 4, it is also clear that this technique is not immediately generalizable to the minimization of a sum of $m$ functions (as in problem (50)). Indeed, the proof shows that the operator $K$ ends up in the norm used for the variable $u^{(n)}$ instead of in the norm used for the auxiliary variable $v^{(n)}$ (see, e.g., expression (67)). The matrix $A$ only shows up in the norm applied to the variable $w^{(n)}$. A special case of this algorithm, where $f$ is the indicator function of the $\ell_2$-ball and $g$ is the $\ell_1$-norm, was introduced in Loris and Verhoeven (2012). Apart from the two proximity operators $\mathrm{prox}_{f^*}$ and $\mathrm{prox}_{g^*}$, one also needs to perform four matrix-vector multiplications per iteration step.
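The equivalence of forms (58) and (70) can be checked numerically. The sketch below makes several assumptions for illustration that are not taken from the chapter: a least squares misfit $f(v) = \|v - y\|_2^2/2$ (for which a short calculation gives $\mathrm{prox}_{f^*}(x) = (x - y)/2$), an $\ell_1$ penalty $g = \lambda \|\cdot\|_1$ (for which $\mathrm{prox}_{g^*}$ is the projection on the max-norm ball of radius $\lambda$), a random matrix $K$, and a first-difference matrix $A$, both scaled to spectral norm 0.9. Both runs start from zero with $v^{(-1)} = v^{(0)}$.

```python
import numpy as np

def prox_f_conj(x, y):
    # Assumed f(v) = ||v - y||_2^2 / 2, which gives prox_{f*}(x) = (x - y) / 2.
    return 0.5 * (x - y)

def prox_g_conj(x, lam):
    # Assumed g(w) = lam * ||w||_1: projection on the max-norm ball of radius lam.
    return np.clip(x, -lam, lam)

def run_58(K, A, y, lam, n_iter):
    """Algorithm (58), started from zero with v^(-1) = v^(0)."""
    u = np.zeros(K.shape[1])
    v = np.zeros(K.shape[0])
    w = np.zeros(A.shape[0])
    v_prev = v.copy()
    for _ in range(n_iter):
        base = u - K.T @ (2.0 * v - v_prev)
        u_bar = base - A.T @ w                 # predictor
        w = prox_g_conj(w + A @ u_bar, lam)
        u = base - A.T @ w                     # corrector with updated w
        v_prev, v = v, prox_f_conj(v + K @ u, y)
    return u, v, w

def run_70(K, A, y, lam, n_iter):
    """Algorithm (70): same iterates, four matrix-vector products per step."""
    u_tilde = np.zeros(K.shape[1])
    u = u_tilde.copy()
    v = np.zeros(K.shape[0])
    w = np.zeros(A.shape[0])
    Ktv, Atw = K.T @ v, A.T @ w
    for _ in range(n_iter):
        u_bar = u_tilde - Ktv - Atw
        w = prox_g_conj(w + A @ u_bar, lam)
        Atw = A.T @ w
        u = u_tilde - Ktv - Atw
        v = prox_f_conj(v + K @ u, y)
        Ktv = K.T @ v
        u_tilde = u_tilde - Ktv - Atw
    return u, v, w

rng = np.random.default_rng(0)
K = rng.standard_normal((8, 5))
K *= 0.9 / np.linalg.norm(K, 2)                # enforce ||K||_2 < 1
A = np.diff(np.eye(5), axis=0)                 # first differences, 4 x 5
A *= 0.9 / np.linalg.norm(A, 2)                # enforce ||A||_2 < 1
y = rng.standard_normal(8)

u58, v58, w58 = run_58(K, A, y, lam=0.1, n_iter=50)
u70, v70, w70 = run_70(K, A, y, lam=0.1, n_iter=50)
print(np.max(np.abs(u58 - u70)))
```

The two runs agree up to rounding errors, and `run_70` makes the count of four matrix-vector products per iteration explicit by caching $K^T v$ and $A^T w$.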
3.3 Penalized Least Squares Minimization
In this final subsection, we treat the minimization of a penalized least squares functional with a penalty of type $g(Au)$, where $\mathrm{prox}_g$ is known but $\mathrm{prox}_{g(A\,\cdot)}$ is not known. The subdifferential of the quadratic part can be computed explicitly. The use of two proximity operators can then be avoided.
Proposition 5. Let $g: \mathbb{R}^{d_2} \to \bar{\mathbb{R}}$ be a convex function. Let $K: \mathbb{R}^d \to \mathbb{R}^{d_1}$ and $A: \mathbb{R}^d \to \mathbb{R}^{d_2}$ be two linear operators with $\|K\|_2 < \sqrt{2}$ and $\|A\|_2 < 1$, and suppose that a minimizer of the problem
\[ \arg\min_u \; \frac{1}{2} \| K u - y \|_2^2 + g(Au) \tag{71} \]
exists. The algorithm
\[ \begin{cases} \bar{u}^{(n+1)} = u^{(n)} + K^T \big( y - K u^{(n)} \big) - A^T w^{(n)}, \\ w^{(n+1)} = \mathrm{prox}_{g^*} \big( w^{(n)} + A \bar{u}^{(n+1)} \big), \\ u^{(n+1)} = u^{(n)} + K^T \big( y - K u^{(n)} \big) - A^T w^{(n+1)}, \end{cases} \tag{72} \]
($u^{(0)}$, $\bar{u}^{(0)}$, and $w^{(0)}$ are arbitrary) converges to a minimizer of (71).
Proof. See Loris and Verhoeven (2011). A generalization of algorithm (72) later appeared in Chen et al. (2013). □
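Algorithm (72) can be sketched as follows. To have a problem with a known answer, the example assumes (purely for illustration) diagonal operators $K = A = 0.9\,I$ and the penalty $g = \lambda \|\cdot\|_1$, so that $\mathrm{prox}_{g^*}$ is the projection on the max-norm ball of radius $\lambda$. Substituting $t = 0.9\,u$ shows that the minimizer is then the soft-thresholded data divided by 0.9. All names and numbers are hypothetical.

```python
import numpy as np

def algorithm_72(K, A, y, prox_g_conj, n_iter=500):
    """Iteration (72) for arg min_u ||K u - y||_2^2 / 2 + g(A u)."""
    u = np.zeros(K.shape[1])
    w = np.zeros(A.shape[0])
    for _ in range(n_iter):
        grad_step = u + K.T @ (y - K @ u)   # gradient step on the quadratic part
        u_bar = grad_step - A.T @ w         # predictor
        w = prox_g_conj(w + A @ u_bar)      # dual update
        u = grad_step - A.T @ w             # corrector with updated w
    return u

def soft_threshold(t, lam):
    """Closed-form minimizer of (s - t)^2 / 2 + lam * |s| over s."""
    return np.sign(t) * np.maximum(np.abs(t) - lam, 0.0)

lam = 0.5
K = 0.9 * np.eye(3)                          # ||K||_2 < sqrt(2)
A = 0.9 * np.eye(3)                          # ||A||_2 < 1
y = np.array([2.0, 0.2, -1.5])

u_rec = algorithm_72(K, A, y, lambda x: np.clip(x, -lam, lam))
u_expected = soft_threshold(y, lam) / 0.9
print(u_rec, u_expected)
```

For a total variation penalty, one would replace `A` by a suitably scaled difference operator; no closed-form solution is available in that case, which is precisely when an iteration of type (72) becomes useful.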
Here too, step-length parameters can be introduced (see, e.g., formula (86) in Sect. 4). Many other algorithms can be written to solve problems of type (71) (see, e.g., Daubechies et al. (2007), Zhu and Chan (2008), Beck and Teboulle (2009), Bredies (2009), Afonso et al. (2010), Chambolle and Pock (2010), Esser et al. (2010), and Zhang et al. (2011)). When $A$ is the identity, algorithm (72) reduces to the following proximal algorithm.
Proposition 6. Let $g: \mathbb{R}^d \to \bar{\mathbb{R}}$ be a convex function and suppose that a minimizer of the problem
\[ \arg\min_u \; \frac{1}{2} \| K u - y \|_2^2 + g(u) \tag{73} \]
exists. The algorithm
\[ u^{(n+1)} = \mathrm{prox}_{\alpha g} \big( u^{(n)} + K^T ( y - K u^{(n)} ) \big) \tag{74} \]
($u^{(0)}$ is arbitrary) converges to a minimizer of (73) if $\|K\|$
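0, there exists $k_0(\varepsilon)$ such that for all $k, l \ge k_0(\varepsilon)$,
\[ \sum_{\substack{m,n=0 \\ K^\wedge(m,n) \ne 0}}^{\infty} \big( K^\wedge(m,n) \big)^{-2} \sum_{j=1}^{2n+1} \langle F_k - F_l, G_{m,n,j} \rangle^2_{L^2(B)} < \varepsilon^2. \tag{13} \]
Due to (10) and (11), the sequence $(K^\wedge(m,n))_{m,n \in \mathbb{N}_0}$ must be bounded. Let $\sup_{m,n \in \mathbb{N}_0} |K^\wedge(m,n)| =: \sigma$. Thus,
\[ \| F_k - F_l \|^2_{L^2(B)} \le \sigma^2 \sum_{\substack{m,n=0 \\ K^\wedge(m,n) \ne 0}}^{\infty} \big( K^\wedge(m,n) \big)^{-2} \sum_{j=1}^{2n+1} \langle F_k - F_l, G_{m,n,j} \rangle^2_{L^2(B)} < \sigma^2 \varepsilon^2 \]
for all $k, l \ge k_0(\varepsilon)$. Hence, $(F_k)_{k \in \mathbb{N}_0}$ is a Cauchy sequence in $L^2(B)$. Let $F$ be the $L^2(B)$-limit of the sequence. Since strong convergence implies weak convergence, one gets $\langle F, G_{m,n,j} \rangle_{L^2(B)} = \lim_{k \to \infty} \langle F_k, G_{m,n,j} \rangle_{L^2(B)} = 0$ for all $(m, n, j)$ with $K^\wedge(m,n) = 0$. Moreover, (13) implies that for all $N \in \mathbb{N}$,
\[ \sum_{\substack{m+n \le N \\ K^\wedge(m,n) \ne 0}} \big( K^\wedge(m,n) \big)^{-2} \sum_{j=1}^{2n+1} \langle F_k - F_l, G_{m,n,j} \rangle^2_{L^2(B)} < \varepsilon^2 \]
for every $k, l \ge k_0(\varepsilon)$ with $\varepsilon > 0$ arbitrary but fixed. This yields
\[ \sum_{\substack{m,n=0 \\ K^\wedge(m,n) \ne 0}}^{\infty} \big( K^\wedge(m,n) \big)^{-2} \sum_{j=1}^{2n+1} \lim_{l \to \infty} \langle F_k - F_l, G_{m,n,j} \rangle^2_{L^2(B)} = \sum_{\substack{m,n=0 \\ K^\wedge(m,n) \ne 0}}^{\infty} \big( K^\wedge(m,n) \big)^{-2} \sum_{j=1}^{2n+1} \langle F_k - F, G_{m,n,j} \rangle^2_{L^2(B)} \le \varepsilon^2 \]
for all $k \ge k_0(\varepsilon)$. Thus, $\| F_k - F \|_{\mathcal{H}} \le \varepsilon$ for all $k \ge k_0(\varepsilon)$, where $k_0(\varepsilon)$ can be obtained for each $\varepsilon > 0$. Hence, $(F_k)_{k \in \mathbb{N}_0}$ converges to $F$ in the sense of $\mathcal{H}$. Finally, $F \in \mathcal{H}$ since $\| F \|_{\mathcal{H}} \le \| F - F_k \|_{\mathcal{H}} + \| F_k \|_{\mathcal{H}} < +\infty$.
Note that the approach presented here does not require the completion of a preliminary pre-Hilbert space in contrast to the previous approaches.
Lemma 1 (Sobolev Lemma). Every $\mathcal{H}$ in Definition 9 is a subspace of $C(B \setminus \{0\})$. If the basis system of type I is used, $\mathcal{H}$ is also a subspace of $C(B)$.
Proof. The lemma is a consequence of the uniform convergence of the $L^2(B)$-Fourier series of each $F \in \mathcal{H}$, which can be proved by using the Cauchy-Schwarz inequality, Definition 9, the proof of Theorem 14, and the summability conditions: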
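\[ \Bigg| \sum_{\substack{n+m \ge N \\ K^\wedge(n,m) \ne 0}} \sum_{j=1}^{2n+1} \langle F, G_{m,n,j} \rangle_{L^2(B)} \, G_{m,n,j}(x) \Bigg| \le \Bigg( \sum_{\substack{n+m \ge N \\ K^\wedge(n,m) \ne 0}} \sum_{j=1}^{2n+1} \big( K^\wedge(m,n) \big)^{-2} \langle F, G_{m,n,j} \rangle^2_{L^2(B)} \Bigg)^{1/2} \Bigg( \sum_{\substack{n+m \ge N \\ K^\wedge(n,m) \ne 0}} \sum_{j=1}^{2n+1} \big( K^\wedge(m,n) \big)^{2} \big( G_{m,n,j}(x) \big)^2 \Bigg)^{1/2}. \tag{14} \]
The right-hand side converges to 0 as $N \to \infty$. Note that $G^{\mathrm{II}}_{m,n,j}$ is discontinuous at 0 for $n > 0$. Therefore, the continuity of the limit of the series can only be proved at $B \setminus \{0\}$ in the case of type II.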
Theorem 19. Every $H\left((K^\wedge(m,n)), X, B\right)$ in Definition 9 is a reproducing kernel Hilbert space. The corresponding reproducing kernel is the product series associated to the sequence $\left((K^\wedge(m,n))^2\right)$:
\[
K_H(x,y) = \sum_{m,n=0}^{\infty} \sum_{j=1}^{2n+1} \left(K^\wedge(m,n)\right)^2 G^X_{m,n,j}(x)\, G^X_{m,n,j}(y); \quad x,y \in B.
\]

Proof. Let $L_x : H \to \mathbb{R}$, $x \in B$, be the evaluation functional: $L_x F := F(x)$. In analogy to (14), one obtains
\[
|L_x F| \le \|F\|_H \left( \sum_{m,n=0}^{\infty} \left(K^\wedge(m,n)\right)^2 \sum_{j=1}^{2n+1} \left(G_{m,n,j}(x)\right)^2 \right)^{1/2}.
\]
Due to (10) and (11), respectively, and Theorem 14, the linear functional $L_x$ is, consequently, bounded. Thus, Theorem 15 yields the existence of the reproducing kernel. Its representation is an immediate consequence of Theorems 16 and 14 and the summability conditions (10) and (11), since (12) is the Parseval identity in $H$ for the orthonormal system
\[
\left\{ K^\wedge(m,n)\, G_{m,n,j} \right\}_{m,n\in\mathbb{N}_0;\ K^\wedge(m,n)\neq 0;\ j=1,\dots,2n+1}
\]
for the following reason:
\[
\langle F_1, F_2\rangle_H
= \sum_{\substack{m,n=0 \\ K^\wedge(m,n)\neq 0}}^{\infty} \sum_{j=1}^{2n+1} \left(K^\wedge(m,n)\right)^{-2} \langle F_1, G_{m,n,j}\rangle_{L^2(B)} \langle F_2, G_{m,n,j}\rangle_{L^2(B)}
= \sum_{\substack{m,n=0 \\ K^\wedge(m,n)\neq 0}}^{\infty} \sum_{j=1}^{2n+1} \langle F_1, K^\wedge(m,n) G_{m,n,j}\rangle_H \langle F_2, K^\wedge(m,n) G_{m,n,j}\rangle_H; \quad F_1, F_2 \in H.
\]

4
Splines
Reproducing kernel Hilbert spaces are a starting point for the construction of localized basis functions on the ball. This concept is based on works for functions in $L^2[0,1]$ (see Engl 1982, 1983a) and on the concept of spherical splines in Freeden (1981a,b, 1999) and Freeden et al. (1998). The splines on the 3d-ball discussed here were first established for type II in Amirbekyan (2007) and Amirbekyan and Michel (2008) and for type I in Berkel (2009) and Berkel and Michel (2010). Note that general domains are treated in Engl (1983b) and Amirbekyan (2007). Furthermore, it is worth noting that in Nashed and Wahba (1974, 1975) an approximation method based on reproducing kernel Hilbert spaces is presented which is related to the method explained here but different in the technical details and realization.

In this section, the unknown function is assumed to be an element of the Sobolev space $H\left((K^\wedge(m,n)), X, B\right) =: H$ with $X \in \{\mathrm{I}, \mathrm{II}\}$. A finite number of data related to $F \in H$ is assumed to be known, which is represented by
\[
\mathcal{F}^l F = b_l \quad \forall\, l = 1,\dots,N, \tag{15}
\]
where each $\mathcal{F}^l : H \to \mathbb{R}$ is a linear and continuous functional. These functionals are used for the definition of the splines.

Definition 10. Every $S \in H$ of the form $S(y) = \sum_{k=1}^{N} a_k \mathcal{F}^k_x K_H(x,y)$, $y \in B$, is called a spline in $H$ (with respect to $\mathcal{F}^1,\dots,\mathcal{F}^N$).

The determination of $S$ as a solution of (15) yields the system of linear equations
\[
\sum_{k=1}^{N} a_k \mathcal{F}^l_y \mathcal{F}^k_x K_H(x,y) = b_l \quad \forall\, l = 1,\dots,N. \tag{16}
\]
Due to Theorem 17, the corresponding matrix is a Gramian matrix since
\[
\mathcal{F}^l_y \mathcal{F}^k_x K_H(x,y) = \left\langle \mathcal{F}^k_x K_H(x,\cdot),\ \mathcal{F}^l_y K_H(y,\cdot) \right\rangle_H.
\]
Therefore, the matrix is regular if and only if $\sum_{k=1}^{N} c_k \mathcal{F}^k_x K_H(x,\cdot) = 0$ only has the trivial solution $c_k = 0$ $\forall\, k = 1,\dots,N$. On the other hand, this statement is equivalent to the requirement that
\[
\left( \sum_{k=1}^{N} c_k \mathcal{F}^k \right) F = \left\langle \sum_{k=1}^{N} c_k \mathcal{F}^k_x K_H(x,\cdot),\ F \right\rangle_H = 0 \quad \forall\, F \in H
\]
only has the trivial solution $c_k = 0$ $\forall\, k = 1,\dots,N$. Therefore, the Gramian matrix is regular if and only if the system $\{\mathcal{F}^1,\dots,\mathcal{F}^N\}$ is linearly independent. This leads to the following results.

Lemma 2. Let $S = \sum_{k=1}^{N} a_k \mathcal{F}^k_x K_H(x,\cdot)$ be a spline in $H$. Then $\langle S, F\rangle_H = \sum_{k=1}^{N} a_k \mathcal{F}^k F$ for all $F \in H$.

Theorem 20 (Existence and Uniqueness of the Interpolating Spline). The interpolation problem $\mathcal{F}^l S = b_l$ $\forall\, l = 1,\dots,N$ yields a unique spline $S$ in $H$ (with respect to $\mathcal{F}^1,\dots,\mathcal{F}^N$) for every choice of $(b_1,\dots,b_N)^{\mathrm{T}} \in \mathbb{R}^N$ if and only if $\{\mathcal{F}^1,\dots,\mathcal{F}^N\}$ is linearly independent.
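The linear system (16) can be illustrated with a minimal sketch in a toy one-dimensional setting. Everything concrete below is an assumption made for illustration and is not taken from the chapter: a cosine system on [0, 1] stands in for the basis functions on the ball, the symbol (1 + i)^(-2) is hypothetical, the series is truncated after 40 terms, and the functionals are point evaluations at four hypothetical nodes.

```python
import math

def basis(i, x):
    """Orthonormal cosine system on [0, 1]; a toy stand-in for G_{m,n,j}."""
    return 1.0 if i == 0 else math.sqrt(2.0) * math.cos(i * math.pi * x)

# Hypothetical symbol K^(i) = (1 + i)^(-2), truncated after 40 terms.
SYMBOL = [1.0 / (1.0 + i) ** 2 for i in range(40)]

def kernel(x, y):
    """Truncated reproducing kernel: sum_i K^(i)^2 g_i(x) g_i(y)."""
    return sum(k * k * basis(i, x) * basis(i, y) for i, k in enumerate(SYMBOL))

def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - s) / a[r][r]
    return x

def interpolating_spline(nodes, data):
    """Solve the Gram system for point evaluations and return the spline."""
    gram = [[kernel(xk, xl) for xk in nodes] for xl in nodes]
    coeffs = solve(gram, data)
    return lambda y: sum(a * kernel(xk, y) for a, xk in zip(coeffs, nodes))

nodes = [0.1, 0.35, 0.6, 0.9]   # hypothetical data locations
data = [1.0, -0.5, 0.25, 2.0]   # hypothetical data values b_l
spline = interpolating_spline(nodes, data)
residual = max(abs(spline(x) - b) for x, b in zip(nodes, data))
```

For point evaluations the Gram matrix entries reduce to kernel values at pairs of nodes, which is why no quadrature is needed in this sketch; other functionals (such as integrals along rays) would have to be applied to the kernel in each argument.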
Note that a regular Gramian matrix is automatically positive definite. In what follows, the linear independence of $\{\mathcal{F}^1,\dots,\mathcal{F}^N\}$ is always assumed to be valid. The obtained spline has some important properties, which can be derived as follows: Let $S = \sum_{k=1}^{N} a_k \mathcal{F}^k_x K_H(x,\cdot)$ and $\tilde S = \sum_{k=1}^{N} \tilde a_k \mathcal{F}^k_x K_H(x,\cdot)$ be two splines in $H$ and $F \in H$ with $\mathcal{F}^k F = \mathcal{F}^k S$ for all $k = 1,\dots,N$. Then Lemma 2 yields the identity
\[
\langle S - \tilde S, F - S\rangle_H = \sum_{k=1}^{N} (a_k - \tilde a_k)\, \mathcal{F}^k (F - S) = 0.
\]
Hence,
\[
\|F - \tilde S\|^2_H = \|F - S + S - \tilde S\|^2_H = \|F - S\|^2_H + \|S - \tilde S\|^2_H.
\]
Note that this is also true if $\tilde S \equiv 0$.

Theorem 21 (First and Second Minimum Property). Let an arbitrary function $F \in H$ be given and $S$ be the spline in $H$ with respect to $\mathcal{F}^1,\dots,\mathcal{F}^N$ such that $\mathcal{F}^k S = \mathcal{F}^k F$ for all $k = 1,\dots,N$. Then
\[
\|S\|_H = \min\left\{ \|\tilde F\|_H \;\middle|\; \tilde F \in H \text{ with } \mathcal{F}^k \tilde F = \mathcal{F}^k F \text{ for all } k = 1,\dots,N \right\},
\]
\[
\|F - S\|_H = \min\left\{ \|F - \tilde S\|_H \;\middle|\; \tilde S \text{ spline in } H \text{ with respect to } \mathcal{F}^1,\dots,\mathcal{F}^N \right\},
\]
where in both cases the minimizer is unique.

The minimum properties can be interpreted as follows:
1. Considering $\|\cdot\|_H$ as a kind of non-smoothness measure (coefficients corresponding to high degrees get large factors $(K^\wedge(m,n))^{-2}$), one can regard $S$ as the "smoothest" interpolant in $H$.
2. $S$ is the best approximation (in the $H$-metric) to $F$ among all splines.

The following theorems on convergence and error estimates are known. The proofs can be found in Amirbekyan (2007) and Amirbekyan and Michel (2008).

Theorem 22 (Error Estimate). Let $F$ and $S$ be given as in Theorem 21. Then
\[
\min_{\substack{\mathcal{G}\in H^* \\ \|\mathcal{G}\|_{H^*}=1}} |\mathcal{G}F - \mathcal{G}S| \le 2\Lambda \|F\|_H \quad\text{and}\quad \|F - S\|_H \le 2\sqrt{\Lambda}\, \|F\|_H,
\]
where $H^*$ is the dual space of $H$ and
\[
\Lambda := \sup_{\substack{\mathcal{G}\in H^* \\ \|\mathcal{G}\|_{H^*}=1}} \ \min_{\mathcal{J}\in \operatorname{span}\{\mathcal{F}^1,\dots,\mathcal{F}^N\}} \|\mathcal{G} - \mathcal{J}\|_{H^*}.
\]
In a sense, the quantity $\Lambda$ measures the radius of the largest "gap" in the set of functionals $\mathcal{F}^1,\dots,\mathcal{F}^N$ representing the given data.
Theorem 23 (Convergence Theorem). Let a linearly independent system $\{\mathcal{F}^1, \mathcal{F}^2, \dots\} \subset H^*$ be given. For every $N \in \mathbb{N}$ and every $F \in H$, let the spline $S_{F,N}$ in $H$ with respect to $\mathcal{F}^1,\dots,\mathcal{F}^N$ be determined by
\[
\mathcal{F}^l S_{F,N} = \mathcal{F}^l F \quad \forall\, l = 1,\dots,N.
\]
Then the following statements are equivalent:
(i) $\lim_{N\to\infty} \left| \mathcal{G}F - \mathcal{G}S_{F,N} \right| = 0$ $\forall\, F \in H$ $\forall\, \mathcal{G} \in H^*$.
(ii) $\lim_{N\to\infty} \|F - S_{F,N}\|_H = 0$ $\forall\, F \in H$.
(iii) The system $\{\mathcal{F}^1, \mathcal{F}^2, \dots\}$ is complete in $H^*$.

Some further known enhancements should be mentioned here.
• A Lagrangian basis can be constructed by setting
\[
L_j(y) := \sum_{k=1}^{N} a_{k,j}\, \mathcal{F}^k_x K_H(x,y) \quad \forall\, y \in B
\]
with $\sum_{k=1}^{N} a_{k,j}\, \mathcal{F}^l_y \mathcal{F}^k_x K_H(x,y) = \delta_{j,l}$ $\forall\, j,l = 1,\dots,N$ (i.e., $\mathcal{F}^l L_j = \delta_{j,l}$ $\forall\, j,l = 1,\dots,N$). Following Theorem 20, one immediately gets that
\[
S = \sum_{j=1}^{N} \left( \mathcal{F}^j F \right) L_j
\]
is the interpolating spline.
• By adding a constant $\lambda > 0$ to the diagonal of the matrix in (16), one replaces the interpolation problem by an approximation problem. This corresponds to a Tikhonov-type regularization of the (usually) ill-conditioned matrix. The obtained coefficients $a_1,\dots,a_N$ yield a spline $S$ which is the unique minimizer of the functional
\[
H \ni F \mapsto \sum_{j=1}^{N} \left( b_j - \mathcal{F}^j F \right)^2 + \lambda \|F\|^2_H.
\]
• A spline-based multiresolution analysis can be constructed. This is done by choosing different sequences $\left(K_J^\wedge(m,n)\right)_{m,n\in\mathbb{N}_0}$, $J \in \mathbb{N}_0$. Provided that some conditions are satisfied, this produces a nested sequence of Sobolev spaces $H_J$. The idea is that larger Sobolev spaces correspond to splines that require more data. In comparison to Theorem 23, the different splines $S_{F,N_J}$ now belong to different Sobolev spaces $H_J$ that are chosen in accordance with the used functionals $\mathcal{F}^1,\dots,\mathcal{F}^{N_J}$. A corresponding convergence result can be proved. Such a theory was first developed for harmonic functions on the 3d-ball (based on $\{G^{\mathrm{I}}_{0,n,j}\}_{n,j}$, i.e., $K^\wedge(m,n) = 0$ if $m > 0$) in Fengler et al. (2006) and Michel and Wolf (2008). It was then extended to the general use of type I or type II basis functions in Berkel (2009).
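The effect of adding a positive constant to the diagonal of the matrix in (16) can be checked numerically. The following sketch assumes a hypothetical, nearly singular 2×2 Gramian matrix and hypothetical data; the name `lam` for the regularization constant is chosen here. It only compares splines with each other, using that a spline with coefficient vector a has the data values (G a) and the squared norm a^T G a, which follows from Lemma 2.

```python
def solve2(m, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(b[0] * m[1][1] - m[0][1] * b[1]) / det,
            (m[0][0] * b[1] - m[1][0] * b[0]) / det]

def smoothing_functional(gram, a, b, lam):
    """Data misfit plus lam * ||S||_H^2 for the spline with coefficients a."""
    fit = [sum(gram[j][k] * a[k] for k in range(2)) for j in range(2)]
    norm_sq = sum(a[j] * fit[j] for j in range(2))  # a^T G a
    misfit = sum((b[j] - fit[j]) ** 2 for j in range(2))
    return misfit + lam * norm_sq

gram = [[1.0, 0.999], [0.999, 1.0]]  # hypothetical, ill-conditioned Gramian
b = [1.0, 1.1]                       # hypothetical data
lam = 1e-2                           # hypothetical regularization constant

a_interp = solve2(gram, b)
shifted = [[gram[0][0] + lam, gram[0][1]], [gram[1][0], gram[1][1] + lam]]
a_smooth = solve2(shifted, b)

value_smooth = smoothing_functional(gram, a_smooth, b, lam)
value_interp = smoothing_functional(gram, a_interp, b, lam)
```

In this toy case the interpolating coefficients are large with alternating signs, whereas the regularized coefficients are an order of magnitude smaller, which is the stabilizing effect described in the text.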
5
Scaling Functions and Wavelets
5.1
For the Approximation of Functions
Product series as they were defined above can be used to construct scaling functions and wavelets (see also, e.g., Michel (2005a) and Berkel (2009), in particular).

Definition 11. A sequence $\{\Phi_J\}_{J\in\mathbb{N}_0}$ of product series is called a scaling function if the following conditions are satisfied:
(i) $\lim_{J\to\infty} |\Phi_J^\wedge(m,n)| = 1$ for all $m,n \in \mathbb{N}_0$.
(ii) $|\Phi_J^\wedge(m,n)| \le |\Phi_{J+1}^\wedge(m,n)|$ for all $m,n,J \in \mathbb{N}_0$.
(iii) $\sum_{m,n=0}^{\infty} n \left(\Phi_J^\wedge(m,n)\right)^2 < +\infty$ for all $J \in \mathbb{N}_0$.

Obviously, conditions (i) and (ii) imply that
\[
|\Phi_J^\wedge(m,n)| \le 1 \quad \forall\, m,n,J \in \mathbb{N}_0 \tag{17}
\]
and condition (iii) guarantees that $\Phi_J \in L^2(B \times B)$; see Theorem 14.

Definition 12. Two sequences $\{\Psi_J\}_{J\in\mathbb{N}_0}$ and $\{\tilde\Psi_J\}_{J\in\mathbb{N}_0}$ of product series are called a primal and a dual wavelet, respectively, corresponding to the scaling function $\{\Phi_J\}_{J\in\mathbb{N}_0}$, if the following conditions are satisfied:
(i) $\Psi_J^\wedge(m,n)\, \tilde\Psi_J^\wedge(m,n) = \left(\Phi_{J+1}^\wedge(m,n)\right)^2 - \left(\Phi_J^\wedge(m,n)\right)^2$ for all $m,n,J \in \mathbb{N}_0$.
(ii) $\sum_{m,n=0}^{\infty} n \left(\Psi_J^\wedge(m,n)\right)^2 < +\infty$ and $\sum_{m,n=0}^{\infty} n \left(\tilde\Psi_J^\wedge(m,n)\right)^2 < +\infty$ for all $J \in \mathbb{N}_0$.

Definition 13. If $K$ is a product series with $\sum_{m,n=0}^{\infty} n \left(K^\wedge(m,n)\right)^2 < +\infty$ and $F \in L^2(B)$ is a given function, then
\[
(K * F)(x) := \int_B K(x,y)\, F(y)\, dy, \quad x \in B,
\]
is called the convolution of $K$ and $F$.
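The conditions on the symbols of a scaling function and of the associated wavelets can be explored with a small numerical sketch. The exponential symbol below, the choice of identical primal and dual wavelet symbols, and the coefficient decay of the test function are all hypothetical choices made for illustration; the sum over j is lumped into one coefficient per pair (m, n).

```python
import math

def phi(J, m, n):
    """Hypothetical scaling-function symbol exp(-(m + n) / 2^J)."""
    return math.exp(-(m + n) / 2.0 ** J)

def psi(J, m, n):
    """Primal = dual wavelet symbol, so that psi^2 = phi_{J+1}^2 - phi_J^2."""
    gap = phi(J + 1, m, n) ** 2 - phi(J, m, n) ** 2
    return math.sqrt(max(gap, 0.0))

def coefficient(m, n):
    """Hypothetical size of the L2 Fourier coefficients of F for (m, n)."""
    return 1.0 / (1.0 + m + n) ** 2

def approximation_error(J, cutoff=20):
    """L2 error of the twice-convolved approximation, truncated at cutoff."""
    total = 0.0
    for m in range(cutoff):
        for n in range(cutoff):
            total += (phi(J, m, n) ** 2 - 1.0) ** 2 * coefficient(m, n) ** 2
    return math.sqrt(total)

errors = [approximation_error(J) for J in range(8)]

# Scale step for one pair (m, n): the squared symbol at scale 2 plus the
# wavelet contributions of scales 2..5 gives the squared symbol at scale 6.
m, n = 3, 5
rebuilt = phi(2, m, n) ** 2 + sum(psi(J, m, n) ** 2 for J in range(2, 6))
target = phi(6, m, n) ** 2
```

With this symbol, the errors decrease monotonically in J, and the telescoping sum reproduces the finer scale up to rounding, which mirrors the role of condition (i) in Definition 12.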
These tools prepare the way to show the standard features of a wavelet method such as an approximate identity, a multiresolution analysis, and a scale step property.

Lemma 3. If $K$ is a product series with $\sum_{m,n=0}^{\infty} n \left(K^\wedge(m,n)\right)^2 < +\infty$ and $F \in L^2(B)$ is a given function, then $K * F \in L^2(B)$ and
\[
K * F = \sum_{m,n=0}^{\infty} \sum_{j=1}^{2n+1} K^\wedge(m,n)\, \langle F, G_{m,n,j}\rangle_{L^2(B)}\, G_{m,n,j}
\]
in the sense of $L^2(B)$. This lemma is an immediate consequence of the Parseval identity.

Theorem 24 (Approximate Identity). If $\{\Phi_J\}_{J\in\mathbb{N}_0}$ is a scaling function, then
\[
\lim_{J\to\infty} \left\| \Phi_J^{(2)} * F - F \right\|_{L^2(B)} = 0
\]
for all $F \in L^2(B)$, where $\Phi_J^{(2)} * F := \Phi_J * (\Phi_J * F)$.

Proof. Due to Lemma 3,
\[
\left\| \Phi_J^{(2)} * F - F \right\|^2_{L^2(B)} = \sum_{m,n=0}^{\infty} \sum_{j=1}^{2n+1} \left[ \left(\Phi_J^\wedge(m,n)\right)^2 - 1 \right]^2 \langle F, G_{m,n,j}\rangle^2_{L^2(B)}.
\]
Note that (17) and the Parseval identity for $\|F\|^2_{L^2(B)}$ imply that the limit $J \to \infty$ may be interchanged with $\sum_{m,n=0}^{\infty}$ above. Hence, condition (i) of Definition 11 yields the desired result.

Theorem 25 (Multiresolution Analysis). If $\{\Phi_J\}_{J\in\mathbb{N}_0}$ is a scaling function, then the spaces
\[
V_J := H\left( \left( \left(\Phi_J^\wedge(m,n)\right)^2 \right), X, B \right), \quad J \in \mathbb{N}_0,
\]
$X \in \{\mathrm{I}, \mathrm{II}\}$ fixed, which are called the scale spaces, constitute a multiresolution analysis, i.e.:
(a) $V_0 \subset \dots \subset V_J \subset V_{J+1} \subset \dots \subset L^2(B)$ for all $J \in \mathbb{N}_0$.
(b) $\overline{\bigcup_{J\in\mathbb{N}_0} V_J}^{\|\cdot\|_{L^2(B)}} = L^2(B)$.

Proof. Note that, for all $H \in L^2(B)$, the equivalency
\[
H \in V_J \iff \exists\, F \in L^2(B):\ H = \Phi_J^{(2)} * F
\]
is valid. Now, let $\Phi_J^{(2)} * F$ be an arbitrary element of $V_J$. Then $f \in L^2(B)$ is defined by $\langle f, G_{m,n,j}\rangle$ […]

[…] 0, then $b_{m,n,j} := Y_{n,j}$ $\forall\, m,n,j$, $\mathcal{Y} = L^2(\Omega)$ is a possible constellation; cf. Sect. 6.1). The need of a regularization method occurs, in particular, if $T$ is a compact operator, i.e., if
\[
\lim_{N\to\infty} \sup_{m+n\ge N} |\tau_{m,n}| = 0. \tag{18}
\]
The regularization method is now constructed by defining the scaling function $\{\Phi_J\}_{J\in\mathbb{N}_0}$ as a product series
\[
\Phi_J(x,z) = \sum_{m,n=0}^{\infty} \sum_{j=1}^{2n+1} \Phi_J^\wedge(m,n)\, G_{m,n,j}(x)\, b_{m,n,j}(z); \quad x \in B,\ z \in D;
\]
with
(i) $\Phi_J^\wedge(m,n) = 0$ for all $(m,n)$ with $\tau_{m,n} = 0$ and all $J \in \mathbb{N}_0$.
(ii) $|\Phi_J^\wedge(m,n)| \le |\Phi_{J+1}^\wedge(m,n)|$ for all $m,n,J \in \mathbb{N}_0$.
(iii) $\lim_{J\to\infty} \Phi_J^\wedge(m,n) = \tau_{m,n}^{-1}$ for all $(m,n)$ with $\tau_{m,n} \neq 0$.
(iv) $\sum_{m,n=0}^{\infty} n \left(\Phi_J^\wedge(m,n)\right)^2 < +\infty$ for all $J \in \mathbb{N}_0$.

Note that one can embed these scaling functions in a Hilbert space of functions on $B \times D$ with the inner product
\[
\langle F_1, F_2\rangle := \int_B \langle F_1(x,\cdot), F_2(x,\cdot)\rangle_{\mathcal{Y}}\, dx.
\]
A convolution is then defined by $(\Phi_J * y)(x) := \langle \Phi_J(x,\cdot), y\rangle_{\mathcal{Y}}$ in the sense of $L^2(B)$. One gets that
\[
\lim_{J\to\infty} \|\Phi_J * y - f\|_{L^2(B)} = 0,
\]
where $f$ is uniquely determined by
\[
Tf = y, \quad f \in \overline{\operatorname{span}\left\{ G_{m,n,j} \,\middle|\, \tau_{m,n} \neq 0 \right\}}^{\|\cdot\|_{L^2(B)}},
\]
provided that $y$ is in the image of $T$. Note that the convolution with the scaling function is not iterated here, in contrast to Sect. 5.1. However, choosing in Sect. 5.1 a product series with coefficients $\left(\left(\Phi_J^\wedge(m,n)\right)^2\right)_{m,n\in\mathbb{N}_0}$ instead of $\Phi_J$, one also gets a scaling function, where the iteration becomes obsolete. The constructed method represents a regularization since the operators $R_J : \mathcal{Y} \to L^2(B)$, $y \mapsto \Phi_J * y$ are continuous for every $J \in \mathbb{N}_0$ due to conditions (i) and (iv) above:
\[
\|\Phi_J * y\|^2_{L^2(B)} = \sum_{\substack{m,n=0 \\ \tau_{m,n}\neq 0}}^{\infty} \sum_{j=1}^{2n+1} \left(\Phi_J^\wedge(m,n)\right)^2 \langle y, b_{m,n,j}\rangle^2_{\mathcal{Y}} \le \sup_{m,n\in\mathbb{N}_0} \left(\Phi_J^\wedge(m,n)\right)^2 \|y\|^2_{\mathcal{Y}}.
\]
As a matter of fact, conditions (iii) and (iv) are antagonists of each other due to (18). The faster the convergence in (18) is, i.e., the more unstable the inversion of $Tf = y$ is, the more difficult it is to find a scaling function that satisfies both conditions (iii) and (iv). Nevertheless, it is always possible to find a scaling function (see Michel 2002b). Wavelets can also be constructed with respect to these scaling functions. Again, a corresponding scale step property can be proved (see Michel (2002b) for further details).
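The interplay of the conditions on a regularizing scaling function can be made concrete with a truncated-inverse filter on a diagonal toy problem. This is a minimal sketch under stated assumptions: a single index i replaces the pair (m, n), the singular values 2^(-i), the coefficients of the solution, and the noise level are hypothetical, and the cut-off symbol is only one of many admissible choices.

```python
import math

N_MODES = 30
tau = [2.0 ** (-i) for i in range(N_MODES)]        # hypothetical singular values
f_true = [1.0 / (1 + i) for i in range(N_MODES)]   # hypothetical coefficients of f
y = [t * c for t, c in zip(tau, f_true)]           # exact data, T f = y

def phi(J, i):
    """Cut-off symbol: inverse singular value up to index J, zero beyond."""
    return 1.0 / tau[i] if i <= J else 0.0

def regularize(J, data):
    """Coefficients of the regularized solution for the given data."""
    return [phi(J, i) * d for i, d in enumerate(data)]

def l2_norm(v):
    return math.sqrt(sum(c * c for c in v))

scales = list(range(0, 30, 5))
errors = [l2_norm([a - b for a, b in zip(regularize(J, y), f_true)])
          for J in scales]

# Stability: a data perturbation is amplified by at most sup_i |phi(J, i)|.
noise = [1e-6] * N_MODES
J = 10
noisy = [a + e for a, e in zip(y, noise)]
amplified = l2_norm([a - b for a, b in zip(regularize(J, noisy),
                                            regularize(J, y))])
bound = max(phi(J, i) for i in range(N_MODES)) * l2_norm(noise)
```

The error for exact data decreases with the scale, while the bound on the noise amplification grows like the largest inverse singular value that is kept, so the scale has to balance both effects.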
6
Applications
6.1
The Inverse Gravimetric Problem
The inverse gravimetric problem is concerned with the resolution of the Fredholm integral equation of the first kind
\[
\gamma \int_B \frac{\varrho(x)}{|x-y|}\, dx = V(y), \quad V|_{\mathbb{R}^3\setminus B} \text{ given},\ \varrho|_B \text{ unknown},
\]
which is given by Newton's law of gravitation ($\gamma$ is the gravitational constant, which is omitted in the further considerations). This problem is not uniquely solvable. For a survey article on this problem and the non-uniqueness of the solution, see Michel and Fokas (2008). Assuming that $\varrho$ has an expansion in $L^2(B)$ of the form
\[
\varrho(x) = \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} \varrho_{n,j}(|x|)\, Y_{n,j}\!\left(\frac{x}{|x|}\right), \quad x \in B, \tag{19}
\]
and using
\[
\frac{1}{|x-y|} = \sum_{m=0}^{\infty} \frac{|x|^m}{|y|^{m+1}}\, P_m\!\left(\frac{x}{|x|}\cdot\frac{y}{|y|}\right)
= \sum_{m=0}^{\infty} \sum_{k=1}^{2m+1} \frac{|x|^m}{|y|^{m+1}}\, \frac{4\pi}{2m+1}\, Y_{m,k}\!\left(\frac{x}{|x|}\right) Y_{m,k}\!\left(\frac{y}{|y|}\right), \quad |x| < |y|,
\]
(see, e.g., Freeden et al. 1998, p. 44), one gets
\[
(T\varrho)(y) := \int_B \frac{\varrho(x)}{|x-y|}\, dx
= \int_0^\beta \sum_{n,m=0}^{\infty} \sum_{j=1}^{2n+1} \sum_{k=1}^{2m+1} r^2 \varrho_{n,j}(r)\, \frac{r^m}{|y|^{m+1}}\, \frac{4\pi}{2m+1}\, Y_{m,k}\!\left(\frac{y}{|y|}\right) \int_\Omega Y_{n,j}(\xi)\, Y_{m,k}(\xi)\, d\omega(\xi)\, dr
\]
\[
= \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} \int_0^\beta \varrho_{n,j}(r)\, r^{n+2}\, dr\ \frac{4\pi}{2n+1}\, |y|^{-n-1}\, Y_{n,j}\!\left(\frac{y}{|y|}\right) \tag{20}
\]
(see also Michel and Fokas 2008, Theorem 2.1). This result is an expansion in outer harmonics. Thus, the expansion coefficients of the harmonic potential $V|_{\mathbb{R}^3\setminus B}$ can be linked to the integrals
\[
\int_0^\beta \varrho_{n,j}(r)\, r^{n+2}\, dr.
\]
By imposing appropriate constraints, a unique solution can be obtained (see also Michel and Fokas (2008) and the references therein). One possibility is to require the harmonicity of the unknown density function $\varrho$: $\Delta\varrho = 0$. In terms of the basis of type I, this corresponds to an expansion
\[
\varrho(x) = \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} \varrho_{n,j,\mathrm{I}}\, G^{\mathrm{I}}_{0,n,j}(x)
= \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} \varrho_{n,j,\mathrm{I}}\, \sqrt{\frac{2n+3}{\beta^3}} \left(\frac{|x|}{\beta}\right)^{n} Y_{n,j}\!\left(\frac{x}{|x|}\right), \quad x \in B,
\]
in the sense of $L^2(B)$. Setting this expansion in (19), one gets for (20) the identity
\[
(T\varrho)(y) = \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} \varrho_{n,j,\mathrm{I}}\, \sqrt{\frac{2n+3}{\beta^3}} \int_0^\beta r^{2n+2}\, dr\ \beta^{-n}\, \frac{4\pi}{2n+1}\, |y|^{-n-1}\, Y_{n,j}\!\left(\frac{y}{|y|}\right)
\]
\[
= \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} \varrho_{n,j,\mathrm{I}}\, \sqrt{\frac{\beta^3}{2n+3}}\, \frac{4\pi}{2n+1}\, \frac{\beta^n}{|y|^{n+1}}\, Y_{n,j}\!\left(\frac{y}{|y|}\right), \quad |y| \ge \beta,
\]
where the convergence for $|y| = \beta$ is given in the sense of $L^2(\partial B)$. Let now $V$ be given at a sphere $\Omega_\sigma$ with center 0 and radius $\sigma \ge \beta$. Then the expansion coefficients for the orthonormal basis $\left\{ \frac{1}{\sigma} Y_{n,j}\!\left(\frac{\cdot}{\sigma}\right) \right\}_{n\in\mathbb{N}_0;\ j=1,\dots,2n+1}$ of $L^2(\Omega_\sigma)$ must satisfy
\[
\left\langle V|_{\Omega_\sigma},\ \frac{1}{\sigma} Y_{n,j}\!\left(\frac{\cdot}{\sigma}\right) \right\rangle_{L^2(\Omega_\sigma)} = \varrho_{n,j,\mathrm{I}}\, \sqrt{\frac{\beta^3}{2n+3}}\, \frac{4\pi}{2n+1} \left(\frac{\beta}{\sigma}\right)^{n}.
\]
This shows the following:
• The corresponding operator is compact since the singular values converge to zero. If $\sigma > \beta$, this convergence is exponential (exponentially ill-posed problem). The reason is that the downward continuation problem becomes a part of the problem.
• The spline method in Sect. 4 and the wavelet-based regularization method in Sect. 5.2 are applicable to the inverse gravimetric problem.

Note that satellite missions yield derivatives of the gravitational potential. For instance, CHAMP (Challenging Minisatellite Payload) and GRACE (Gravity Recovery And Climate Experiment) yield first-order derivatives. For this reason, it is also interesting to discuss the expansion of the gradient of the potential. If
\[
(T_1\varrho)(y) := \nabla_y \int_B \frac{\varrho(x)}{|x-y|}\, dx,
\]
then the separation $y = r\xi$, $r = |y| > \beta$, $\xi \in \Omega$, yields
\[
(T_1\varrho)(y) = \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} \varrho_{n,j,\mathrm{I}}\, \sqrt{\frac{\beta^3}{2n+3}}\, \frac{4\pi}{2n+1} \left( \xi \frac{\partial}{\partial r} + \frac{1}{r} \nabla^*_\xi \right) \frac{\beta^n}{r^{n+1}}\, Y_{n,j}(\xi)
\]
\[
= \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} \varrho_{n,j,\mathrm{I}}\, \sqrt{\frac{\beta^3}{2n+3}}\, \frac{4\pi}{2n+1}\, (-n-1)\, \frac{\beta^n}{r^{n+2}}\, y^{(1)}_{n,j}(\xi)
+ \sum_{n=1}^{\infty} \sum_{j=1}^{2n+1} \varrho_{n,j,\mathrm{I}}\, \sqrt{\frac{\beta^3}{2n+3}}\, \frac{4\pi}{2n+1}\, \frac{\beta^n}{r^{n+2}}\, \sqrt{n(n+1)}\, y^{(2)}_{n,j}(\xi).
\]
Further details and explicit formulae for second-order derivatives, which are relevant in the case of the GOCE (Gravity Field and Steady-State Ocean Circulation Explorer) mission, can be found in Michel (2005b). Results of the application of the wavelet and spline methods to the inverse gravimetric problem are shown in Figs. 1 and 2. For further details and numerical results of wavelet-based methods for the inverse gravimetric problem, see Michel (1999, 2002a,b, 2005b) and Michel and Fokas (2008), and for corresponding spline methods, see Fengler et al. (2006) and Michel and Wolf (2008).
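The decay of the singular values for the harmonic density case, as read off from the coefficient relation above, can be tabulated with a short sketch. The normalization of the ball radius to 1 and the orbit height of 500 km over an Earth radius of 6371 km are assumptions made here for illustration, as are all variable names.

```python
import math

def singular_value(n, beta, sigma):
    """sqrt(beta^3 / (2n+3)) * 4*pi / (2n+1) * (beta / sigma)^n."""
    return (math.sqrt(beta ** 3 / (2 * n + 3))
            * 4.0 * math.pi / (2 * n + 1)
            * (beta / sigma) ** n)

beta = 1.0                            # ball radius, normalized (assumption)
sigma_surface = 1.0                   # data given on the surface of the ball
sigma_orbit = 1.0 + 500.0 / 6371.0    # hypothetical satellite height

degrees = list(range(0, 101, 20))
surface = [singular_value(n, beta, sigma_surface) for n in degrees]
orbit = [singular_value(n, beta, sigma_orbit) for n in degrees]

for n, s_surf, s_orb in zip(degrees, surface, orbit):
    print(f"n = {n:3d}   surface: {s_surf:.3e}   orbit: {s_orb:.3e}")
```

On the surface the decay is only algebraic in the degree, whereas at orbit height the additional factor (beta/sigma)^n makes it exponential; this is the downward continuation effect mentioned in the text.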
6.2
Normal Mode Tomography
Normal mode tomography is concerned with the determination of structures inside the Earth out of frequency anomalies of the free oscillations of the Earth. These anomalies are represented as great circle integrals of a so-called splitting function $\eta$:
\[
\delta\omega(\xi) = \frac{1}{2\pi} \oint \eta(\zeta)\, dl(\zeta), \tag{21}
\]
where $\xi \in \Omega$ is the pole of the great circle. The splitting function is related to the expansion coefficients $\delta m_{n,j}$ of a vector $\delta m$ representing the relative deviations of the compressional velocity, the shear velocity, and the mass density from a given reference model. This relation is given by
\[
\eta = \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} \int_0^\beta k_n(r) \cdot \delta m_{n,j}(r)\, dr\ Y_{n,j}
\]
in the sense of $L^2(\Omega)$, where the vectorial functions $k_n$ are implicitly given. For further details, see, e.g., Woodhouse and Dahlen (1978), Li et al. (1991), Dahlen and Tromp (1998), and Nolet (2008). Keeping the velocities fixed, one obtains a scalar problem for the density to which the spline method in Sect. 4 is applicable. Figure 3 shows a numerical result. Note that this result contains similarities to a result in Ishii and Tromp (1999), which was obtained by a different method. In Berkel (2009) and Berkel and Michel (2010), a vectorial version of the spline method is developed for the determination of all three components of $\delta m$. Note that a joint inversion of gravity and normal mode data by the spline method in Sect. 4 is also possible. An important drawback of normal mode tomography is the fact that the great circle integral in (21) cancels out all odd degrees. This is a severe non-uniqueness problem, which may cause phantoms in the solution. However, recent advances (see Irving et al. 2009) show perspectives for the recovery of odd-degree structures of the Earth by analyzing so-called cross-coupling normal modes.

Fig. 1 Wavelet-based harmonic density determination: regularization $\Phi_J * \frac{\partial^2 V}{\partial r^2}$ (left) and details $\tilde\Psi_J * \Psi_J * \frac{\partial^2 V}{\partial r^2}$ (right) at scales $J = 3$ (top) to $J = 8$ (bottom), plotted at a sphere with radius $0.999\beta$, where EGM96 from degree 3 was taken as $V$ on a 200 km "orbit" (Images from Michel 2005b)

Fig. 2 Spline approximation of the harmonic density out of heterogeneous gravitational data: above South America, data at a denser grid and at a lower height were used, whereas globally data at a coarser grid and a bigger height were used. This mixture of the data was used for the inversion via a spline method. The result shows that the spline spatially adapts its resolution to the data quality (from Michel and Wolf 2008); see the reference for further details. Color scale from −40 to 40 kg/m³
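The cancellation of odd degrees by the great circle integral in (21) follows from the antipodal symmetry of a great circle and can be checked numerically. The sketch below averages a function over the great circle belonging to a given pole; the pole, the test functions (odd and even polynomials restricted to the unit sphere), and all helper names are hypothetical choices for illustration.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    length = math.sqrt(sum(c * c for c in a))
    return tuple(c / length for c in a)

def great_circle_mean(func, pole, samples=720):
    """Approximate (1 / (2 pi)) * integral of func over the great circle
    whose pole is the given unit vector (periodic trapezoidal rule)."""
    helper = (1.0, 0.0, 0.0) if abs(pole[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(pole, helper))   # unit vector on the great circle
    v = cross(pole, u)                   # completes the orthonormal frame
    total = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        point = tuple(math.cos(t) * uc + math.sin(t) * vc
                      for uc, vc in zip(u, v))
        total += func(point)
    return total / samples

pole = normalize((1.0, 2.0, 2.0))                  # hypothetical pole

odd_part = lambda p: p[0] + p[2] ** 3              # degrees 1 and 3 only
even_part = lambda p: p[2] ** 2                    # degrees 0 and 2 only

mean_odd = great_circle_mean(odd_part, pole)
mean_even = great_circle_mean(even_part, pole)
```

The odd test function averages to zero for every pole, so data of the form (21) carry no information about it, while the even test function gives a pole-dependent value, here (1 − pole_z²)/2.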
6.3
Travel Time Tomography
In travel time tomography, the travel times of seismic waves for source-receiver pairs $(S_k, R_k) \in (\Omega_\beta)^2$, $k = 1,\dots,N$, are given, where the space-dependent velocity $V$ is unknown. One possible modeling is the use of the equation
\[
\int_{\gamma_k} \frac{1}{V(x)}\, dl(x) = T_k; \quad k = 1,\dots,N,
\]
where $\gamma_k$ represents a path from the source $S_k$ to the receiver $R_k$, which is derived from seismic ray theory (see, e.g., Aki and Richards 2002). For obtaining a linear problem, one assumes that $\gamma_k$ is given by a reference model and does not depend on $V$, and one considers the slowness $S := \frac{1}{V}$ as the unknown function. Note that one can use surface waves, where $S \in L^2(\Omega)$ is unknown, or body waves, where $S \in L^2(B)$ has to be determined. Both problems were solved by a spline method, where in the latter case the splines in Sect. 4 were used. For further details and numerical results, the reader is referred to Amirbekyan (2007), Amirbekyan and Michel (2008), and Amirbekyan et al. (2008).

Fig. 3 Spline approximation of the relative mass density variations at the radius 3,670 km, calculated by inverting normal mode data (From Berkel 2009, p. 141). Color scale from −0.015 to 0.01 kg/m³

Note that an alternative modeling, which has been common in recent years, uses the integral equation
\[
\delta T_k = \int_B \left( K_\alpha(x)\, \frac{\delta\alpha}{\alpha_M}(x) + K_{\tilde\beta}(x)\, \frac{\delta\tilde\beta}{\tilde\beta_M}(x) + K_\varrho(x)\, \frac{\delta\varrho}{\varrho_M}(x) \right) dx,
\]
where $(\alpha_M, \tilde\beta_M, \varrho_M)$ is a reference model for the compressional velocity, the shear velocity, and the mass density; $\delta\alpha$, $\delta\tilde\beta$, and $\delta\varrho$ are the unknown real deviations from this model; and $\delta T_k$ is the known difference between the real travel time and the travel time predicted by the reference model in the case of a source-receiver pair $(S_k, R_k)$. Moreover, the Fréchet kernels $K_\alpha$, $K_{\tilde\beta}$, and $K_\varrho$ are given and depend, e.g., on the considered source-receiver pair. The values of these kernels are large in a (banana-shaped) neighborhood of the ray between source and receiver but are small immediately close to the ray such that a cross section looks like a doughnut. Due to this visual interpretation, these kernels are also called banana-doughnut kernels. For further details, see, e.g., Dahlen and Tromp (1998) and Nolet (2008) and the references therein. It appears to be likely that the vectorial spline approximation method from Berkel (2009) and Berkel and Michel (2010) is also applicable to this inverse problem. The latter approach based on Fréchet kernels takes into account wave scattering and uses a more realistic finite-frequency modeling, whereas the former approach based on seismic rays corresponds to an infinite-frequency modeling.
6.4
Inverse EEG and MEG
The inverse problems associated with electroencephalography (EEG) and magnetoencephalography (MEG) are examples of non-geoscientific inverse problems that are nevertheless related to the problems discussed here. They are concerned with the determination of the electric current distribution in the human brain out of measurements of the electric and the magnetic field outside the brain. These inverse problems require an appropriate approximation method on the 3d-ball for a numerical implementation. It turns out that a spline method similar to the one presented in Sect. 4 yields promising results; see Fokas et al. (2012). Note that the use of a Lagrangian basis is interesting here since almost real-time solutions can be calculated.
7
Conclusions and Outlook
Two different orthonormal bases are known for $L^2(B)$, where each has advantages and disadvantages. Both systems can be used to construct a reproducing kernel-based spline interpolation/approximation method and a wavelet method. The quality of the spline is spatially adapted to the (possibly heterogeneous) structure of the given data. The wavelet method represents a zooming-in tool that yields approximations of the unknown function at different levels of resolution, where the differences of the approximations can reveal certain details of the function. Moreover, the spline and the wavelet method both can regularize ill-posed inverse problems. The presented tools are applicable to several tomographic problems in the geosciences but also in medical imaging. The product kernels, which were used here for the construction of the spline and the wavelet method, are localized basis functions, in contrast to orthogonal polynomials. A recently developed algorithmic improvement (the Regularized Functional Matching Pursuit, RFMP) yields the possibility of combining arbitrary systems of trial functions to obtain a sparse representation of the solution. For further details, see Fischer (2011), Fischer and Michel (2012, 2013a,b), and the author's other article in this handbook.
References
Aki K, Richards PG (2002) Quantitative seismology, 2nd edn. University Science Books, Sausalito
Akram M, Amina I, Michel V (2011) A study of differential operators for particular complete orthonormal systems on a 3d ball. IJPAM-Int J Pure Appl Math 73:489–506
Amirbekyan A (2007) The application of reproducing kernel based spline approximation to seismic surface and body wave tomography: theoretical aspects and numerical results. PhD thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern. http://kluedo.ub.uni-kl.de/volltexte/2007/2103/index.html
Amirbekyan A, Michel V (2008) Splines on the three-dimensional ball and their application to seismic body wave tomography. Inverse Probl 24:015022 (25pp)
Amirbekyan A, Michel V, Simons FJ (2008) Parameterizing surface-wave tomographic models with harmonic spherical splines. Geophys J Int 174:617–628
Ballani L, Engels J, Grafarend EW (1993) Global base functions for the mass density in the interior of a massive body (Earth). Manuscr Geod 18:99–114
Berkel P (2009) Multiscale methods for the combined inversion of normal mode and gravity variations. PhD thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern, Shaker, Aachen
Berkel P, Michel V (2010) On mathematical aspects of a combined inversion of gravity and normal mode variations by a spline method. Math Geosci 42:795–816
Dahlen FA, Tromp J (1998) Theoretical global seismology. Princeton University Press, Princeton
Davis P (1963) Interpolation and approximation. Blaisdell Publishing Company, Waltham
Dufour HM (1977) Fonctions orthogonales dans la sphère – résolution théorique du problème du potentiel terrestre. Bull Geod 51:227–237
Dunkl CF, Xu Y (2001) Orthogonal polynomials of several variables. In: Doran R, Ismail M, Lam T-Y, Lutwak E (eds.) Encyclopedia of mathematics and its applications, vol 81. Cambridge University Press, Cambridge
Engl H (1982) On least-squares collocation for solving linear integral equations of the first kind with noisy right-hand side. Boll Geod Sci Aff 41:291–313
Engl H (1983a) On the convergence of regularization methods for ill-posed linear operator equations. In: Hämmerlin G, Hoffmann KH (eds.) Improperly posed problems and their numerical treatment, ISNM 63. Birkhäuser, Basel, pp 81–95
Engl H (1983b) Regularization by least-squares collocation. In: Deuflhard P, Hairer E (eds.) Numerical treatment of inverse problems in differential and integral equations. Birkhäuser, Boston, pp 345–354
Fengler M, Michel D, Michel V (2006) Harmonic spline-wavelets on the 3-dimensional ball and their application to the reconstruction of the earth's density distribution from gravitational data at arbitrarily shaped satellite orbits. ZAMM-Z Angew Math Me 86:856–873
Fischer D (2011) Sparse regularization of a joint inversion of gravitational data and normal mode anomalies. PhD thesis, Geomathematics Group, Department of Mathematics, University of Siegen, Verlag Dr. Hut, Munich
Fischer D, Michel V (2012) Sparse regularization of inverse gravimetry – case study: spatial and temporal mass variations in South America. Inverse Probl 28:065012 (34pp)
Fischer D, Michel V (2013a) Inverting GRACE gravity data for local climate effects. J Geod Sci 3:151–162
Fischer D, Michel V (2013b) Automatic best-basis selection for geophysical tomographic inverse problems. Geophys J Int 193:1291–1299
Fokas AS, Hauk O, Michel V (2012) Electro-magneto-encephalography for the three-shell model: numerical implementation via splines for distributed current in spherical geometry. Inverse Probl 28:035009 (28pp)
Freeden W (1981a) On approximation by harmonic splines. Manuscr Geod 6:193–244
Freeden W (1981b) On spherical spline interpolation and approximation. Math Methods Appl Sci 3:551–575
Freeden W (1999) Multiscale modelling of spaceborne geodata. Teubner, Leipzig
Freeden W, Michel V (2004) Multiscale potential theory with applications to geoscience. Birkhäuser, Boston
Freeden W, Schreiner M (2009) Spherical functions of mathematical geosciences – a scalar, vectorial, and tensorial setup. Springer, Heidelberg
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere – with applications to geomathematics. Oxford University Press, Oxford
Freud G (1971) Orthogonal polynomials. Pergamon, Budapest
Irving JCE, Deuss A, Woodhouse JH (2009) Normal mode coupling due to hemispherical anisotropic structure in earth's inner core. Geophys J Int 178:962–975
Ishii M, Tromp J (1999) Normal-mode and free-air gravity constraints on lateral variations in velocity and density of earth's mantle. Science 285:1231–1236
Li X-D, Giardini D, Woodhouse JH (1991) Large-scale even-degree structure of the earth from splitting of long-period normal modes. J Geophys Res 96:551–577
Michel V (1999) A multiscale method for the gravimetry problem – theoretical and numerical aspects of harmonic and anharmonic modelling. PhD thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern, Shaker, Aachen
Michel V (2002a) Scale continuous, scale discretized and scale discrete harmonic wavelets for the outer and the inner space of a sphere and their application to an inverse problem in geomathematics. Appl Comput Harmon Anal 12:77–99
Michel V (2002b) A multiscale approximation for operator equations in separable Hilbert spaces – case study: reconstruction and description of the earth's interior. Habilitation thesis, Shaker Verlag, Aachen
Michel V (2005a) Wavelets on the 3-dimensional ball. Proc Appl Math Mech 5:775–776
Michel V (2005b) Regularized wavelet-based multiresolution recovery of the harmonic mass density distribution from data of the earth's gravitational field at satellite height. Inverse Probl 21:997–1025
Michel V (2013) Lectures on constructive approximation – Fourier, spline, and wavelet methods on the real line, the sphere, and the ball. Birkhäuser, Boston
Michel V, Fokas AS (2008) A unified approach to various techniques for the non-uniqueness of the inverse gravimetric problem and wavelet-based methods. Inverse Probl 24:045019 (25pp)
Michel V, Wolf K (2008) Numerical aspects of a spline-based multiresolution recovery of the harmonic mass density out of gravity functionals. Geophys J Int 173:1–16
Müller C (1966) Spherical harmonics. Springer, Berlin
Nashed MZ, Wahba G (1974) Convergence rates of approximate least squares solutions of linear integral and operator equations of the first kind. Math Comput 28:69–80
Nashed MZ, Wahba G (1975) Some exponentially decreasing error bounds for a numerical inversion of the Laplace transform. J Math Anal Appl 52:660–668
Nikiforov AF, Uvarov VB (1988) Special functions of mathematical physics – a unified introduction with applications. Translated from the Russian by RP Boas. Birkhäuser, Basel
Nolet G (2008) A breviary of seismic tomography. Cambridge University Press, Cambridge
Schröder P, Sweldens W (1995) Spherical wavelets: efficiently representing functions on the sphere. In: Proceedings of the 22nd annual conference on computer graphics and interactive techniques. Los Angeles, ACM, New York, pp 161–172
Szegö G (1939) Orthogonal polynomials. AMS Colloquium Publications, XXIII, Providence
Tscherning CC (1996) Isotropic reproducing kernels for the inner of a sphere or spherical shell and their use as density covariance functions. Math Geol 28:161–168
Woodhouse JH, Dahlen FA (1978) The effect of a general aspherical perturbation of the free oscillations of the earth. Geophys J R Astron Soc 53:335–354
RFMP: An Iterative Best Basis Algorithm for Inverse Problems in the Geosciences Volker Michel
Contents
1 Introduction
2 Constellation of the Problem
3 The Algorithm
4 Properties of the Algorithm
5 Discussion of Some Dictionaries
6 Numerical Results
7 Conclusions
References
Abstract
We summarize a recently introduced greedy algorithm for inverse problems in geomathematics. The algorithm is able to combine heterogeneous systems of trial functions to construct a stable approximation to the solution of the given ill-posed problem. The representation of this approximation with respect to the trial functions of mixed types is sparse in the sense that substantially fewer trial functions are used than are available. Some new theoretical results about the method are also proved here.
1 Introduction
Numerous systems of trial functions are available today for the resolution of problems on the sphere or the ball. For the sphere itself, the spherical harmonics are available as a very popular system of orthogonal polynomials (see, e.g., Müller 1966;
V. Michel () Geomathematics Group, University of Siegen, Siegen, Germany e-mail: [email protected]; www.geomathematics-siegen.de © Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_93
Heiskanen and Moritz 1981; Freeden et al. 1998; Michel 2013). Their features and the algorithms for their application are well known. For instance, a fast Fourier transform which is also applicable to non-equispaced data is available in Keiner et al. (2009). Whereas spherical harmonics have particular advantages, they also have drawbacks which become evident, e.g., if spatial irregularities in the data (like regionally varying noise levels or strongly scattered data grids) occur or if the data are only given, or are only to be analyzed, regionally. For such applications, localized trial functions have been developed in recent decades. Examples of methods of this kind are a spline method based on reproducing kernels (see Freeden 1981a,b; Freeden et al. 1998) and numerous wavelet methods. A small and incomplete selection of the latter kind can be found in Schröder and Sweldens (1995), Freeden and Windheuser (1996), Holschneider (1996), Freeden and Schreiner (1998), Antoine and Vandergheynst (1999), and Gerhards (2011). Moreover, the development of spherical Slepian functions (see Wieczorek and Simons (2005), Simons and Dahlen (2006), Simons et al. (2006), Wieczorek and Simons (2007), and Dahlen and Simons (2008) as well as FJ Simons’ article in this handbook) made it possible to construct trial functions which are tailored to a particular spherical subregion of interest. Also for applications on the ball, global orthogonal trial functions (based on Dufour 1977; Ballani et al. 1993; Tscherning 1996) are available as well as localized trial functions, which can be used for spline interpolation and approximation (see Fengler et al. 2006; Amirbekyan and Michel 2008; Berkel 2009; Berkel and Michel 2010; Berkel et al. 2011) as well as for a wavelet analysis (see Michel 2002, 2005a,b). Note that, in particular, ill-posed inverse problems have also been handled with such tools.
For a survey on these methods, see the textbook Michel (2013) and the author’s other article in this handbook. All these methods have their own advantages and disadvantages. There is, certainly, no such thing as a “perfect” method. For instance, global basis functions are ideal to represent global trends by low degree polynomials, but high-frequency structures or perturbations which are restricted to subdomains can cause problems. Moreover, not all localized trial functions can be used for all kinds of problems. Some trial functions are isotropic and are, therefore, a good choice to solve differential or integral equations represented by isotropic operators. However, non-isotropic operators or non-isotropic noise can be handled better with other approaches, which are themselves not ideal for isotropic problems. Moreover, localized trial functions usually provide us with the possibility to vary their localization, i.e., the size of the “hat” of the hat function. However, not all numerical algorithms associated with such localized functions allow the combination of different levels of localization, although this would certainly bring some advantages, since not all local phenomena are of the same spatial size. Furthermore, some numerical algorithms for the calculation of an expansion of the unknown solution in a selected basis are based on the resolution of a system of linear equations. Large data sizes consequently limit the numerical realization, since the inversion of the system becomes too time-consuming or too unstable in such cases. For some methods, sophisticated
numerical algorithms which compensate for such problems are known (see, e.g., the approach in Gutting (2012) for a spherical spline method), whereas such techniques are unknown for some other methods at present. Moreover, there exist methods which can do without the resolution of a system of linear equations, but require a numerical integration. Numerical integration methods on the sphere are well-known for regular data grids. For irregular data grids on the sphere, a promising method was developed in Gräf et al. (2009). However, at present, the largest polynomial exactness which could be reached is given by degree 1,024, whereas some problems in gravitational field modeling and analysis require higher exactness degrees. Furthermore, the development of quadrature rules for subdomains of the sphere is still in its infancy (see Mhaskar (2004a,b) and Beckmann et al. (2012) for first approaches).

In Mallat and Zhang (1993), a greedy algorithm which is called a Matching Pursuit (MP) is developed for the interpolation of data and is applied to problems in a Euclidean setting. This algorithm is further elaborated and extended to the use of kernel-based trial functions in Vincent and Bengio (2002). In Fischer (2011) and Fischer and Michel (2012), we used the ideas of this Matching Pursuit to develop an algorithm which has three additional features: first, it allows the resolution of inverse problems (represented by a set of functionals), i.e., the data and the solution need not be in the same space anymore and they are connected by equations. Second, a Tikhonov regularization was included to stabilize ill-posed problems.
Third, in the numerical implementation, we used trial functions which are relevant for geoscientific problems (more precisely, a three-dimensional ball was used as a domain of the unknown function as it occurs for tomographic inverse problems; for the treatment of a sphere, i.e., the surface of a ball, as the domain, see Michel and Telschow (2014) and Telschow (2014)). This new algorithm, which contains the classical Matching Pursuit as a particular case, is, indeed, applicable to ill-posed inverse problems in the geosciences as we demonstrated in the numerical experiments in Fischer (2011) and Fischer and Michel (2012, 2013a,b). The particular advantages of this novel technique, which is called the Regularized Functional Matching Pursuit (RFMP), are as follows:
• Trial functions of different types (e.g., global trial functions such as polynomials and localized hat-like functions with different hat widths) can be combined to represent the solution. This allows the combination of the advantages that different kinds of basis functions provide. The algorithm itself chooses a best basis (in a particular sense explained further below) out of a large set of possible trial functions. This set is called a dictionary and typically is constructed as a union of different basis systems, i.e., it is overcomplete.
• The approximation is calculated iteratively, i.e., step by step the next element of the best basis is selected by the RFMP. As a consequence, intermediate results can also be analyzed in the sense of a multiresolution.
• It is neither necessary to solve a system of linear equations nor to use a quadrature rule. Therefore, the method is more robust with respect to the use of large data
grids (connected to high expectations on the accuracy of the result) and with respect to the handling of strongly irregular data grids. The latter is, in particular, also achieved by taking the least-squares error as a basis for the search for the solution. Hence, data points which are very close to each other simply cause similar summands in the calculation of the error functional but do not cause almost identical rows in a matrix to be inverted.

The purpose of this article is to summarize the previous publications on the RFMP, to explain the algorithm, to prove the theoretical results, and to show some examples of previous numerical experiments. These objectives are reflected in the outline of this paper. In Sect. 2, we describe the general setting of those inverse problems which can be solved with the RFMP. In Sect. 3, we derive the formulae of the RFMP and formulate the algorithm. In Sect. 4, we prove the theoretical properties of the RFMP, where we are able to partially extend the previous knowledge about the properties of our algorithm. In Sect. 5, we investigate the conditions of the convergence theorem proved in Sect. 4 in the case of particular examples of trial functions and geomathematical inverse problems. Furthermore, in Sect. 6, we recapitulate some examples of numerical results obtained with the RFMP and report our further experiences from numerical experiments. Finally, in Sect. 7, we conclude the paper with a summary and an outlook on forthcoming publications.
2 Constellation of the Problem
We consider inverse problems of the form

$$\mathcal{F}^k F = y_k, \qquad k = 1, \dots, l,$$

where $y = (y_1, \dots, y_l)^{\mathrm{T}} \in \mathbb{R}^l$ is given, each $\mathcal{F}^k : L^2(D) \to \mathbb{R}$ ($k = 1, \dots, l$) is a linear and continuous functional, $F \in L^2(D)$ is unknown, and $D \subset \mathbb{R}^d$ is a measurable domain (in particular, $D$ could be the unit sphere $\Omega \subset \mathbb{R}^3$ or the ball $B \subset \mathbb{R}^3$ with radius $\beta > 0$). Furthermore, $L^2(D)$ stands for the usual Hilbert space of (equivalence classes of almost everywhere identical) square-integrable functions on $D$. For some examples of (tomographic) inverse problems in the case $D = B$, see the author’s other article in this handbook. We will summarize the $l$ functionals $\mathcal{F}^k$ in the form of a vector in the operator $\mathcal{F} : L^2(D) \to \mathbb{R}^l$. For the approximation of $F$, we have a set of trial functions $\mathcal{D} \subset L^2(D)$ available, which we call a dictionary. Without loss of generality, we assume that $0 \notin \mathcal{D}$. In the case $D = \Omega$, $\mathcal{D}$ could, for example, contain spherical harmonics, spline basis functions, wavelets, and Slepian functions. In the case $D = B$, $\mathcal{D}$ could, for instance, contain the corresponding counterparts of the global (i.e., polynomial) and localized trial functions known on the sphere. For further details on trial functions on the sphere and the ball, see the textbook Michel (2013) and the references therein.
3 The Algorithm
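The basic idea of the algorithm is to iteratively construct a sequence of approximations $(F_n)_n$ to the unknown function $F$ by consecutively adding further summands to the approximation. In other words, we start with $F_0 := 0$ (or some initial approximation) and continue with $F_{n+1} := F_n + \alpha_{n+1} d_{n+1}$ for all $n \in \mathbb{N}_0$. The new summand $\alpha_{n+1} d_{n+1}$ is chosen such that

$$\|y - \mathcal{F}(F_n + \alpha_{n+1} d_{n+1})\|^2_{\mathbb{R}^l} + \lambda \|F_n + \alpha_{n+1} d_{n+1}\|^2_{L^2(D)}$$

is minimized, where $\lambda \in \mathbb{R}_0^+$ is a regularization parameter. We first introduce some basic notations.

Definition 1. For a sequence of approximations $(F_n)_n \subset \operatorname{span} \mathcal{D} \subset L^2(D)$, we define the sequence of residuals $(R^n)_n \subset \mathbb{R}^l$ by $R^n := y - \mathcal{F} F_n$. Moreover, the family of mappings $J_\lambda : \mathbb{R}^l \times L^2(D) \times \mathcal{D} \times \mathbb{R} \to \mathbb{R}$, $\lambda \in \mathbb{R}_0^+$, is defined by

$$J_\lambda(y, F, d, \alpha) := \|y - \mathcal{F}(F + \alpha d)\|^2_{\mathbb{R}^l} + \lambda \|F + \alpha d\|^2_{L^2(D)}.$$

For the mapping $J_\lambda$, we, obviously, get (note that $\mathcal{F}$ is linear)

$$J_\lambda(y, F, d, \alpha) = \|y - \mathcal{F} F\|^2_{\mathbb{R}^l} - 2\alpha \langle y - \mathcal{F} F, \mathcal{F} d \rangle_{\mathbb{R}^l} + \alpha^2 \|\mathcal{F} d\|^2_{\mathbb{R}^l} + \lambda \|F\|^2_{L^2(D)} + 2\alpha\lambda \langle F, d \rangle_{L^2(D)} + \lambda\alpha^2 \|d\|^2_{L^2(D)}.$$

Hence, if we already know the first $n + 1$ approximations $F_0, \dots, F_n$, then

$$J_\lambda(y, F_n, d, \alpha) = \|R^n\|^2_{\mathbb{R}^l} - 2\alpha \langle R^n, \mathcal{F} d \rangle_{\mathbb{R}^l} + \alpha^2 \|\mathcal{F} d\|^2_{\mathbb{R}^l} + \lambda \|F_n\|^2_{L^2(D)} + 2\alpha\lambda \langle F_n, d \rangle_{L^2(D)} + \lambda\alpha^2 \|d\|^2_{L^2(D)}, \tag{1}$$

which has to be minimized to find $\alpha_{n+1} \in \mathbb{R}$ and $d_{n+1} \in \mathcal{D}$. As a consequence, we get

$$0 = \left. \frac{\partial}{\partial\alpha} J_\lambda(y, F_n, d_{n+1}, \alpha) \right|_{\alpha = \alpha_{n+1}} = -2 \langle R^n, \mathcal{F} d_{n+1} \rangle_{\mathbb{R}^l} + 2\alpha_{n+1} \|\mathcal{F} d_{n+1}\|^2_{\mathbb{R}^l} + 2\lambda \langle F_n, d_{n+1} \rangle_{L^2(D)} + 2\lambda\alpha_{n+1} \|d_{n+1}\|^2_{L^2(D)}$$

such that

$$\alpha_{n+1} = \frac{\langle R^n, \mathcal{F} d_{n+1} \rangle_{\mathbb{R}^l} - \lambda \langle F_n, d_{n+1} \rangle_{L^2(D)}}{\|\mathcal{F} d_{n+1}\|^2_{\mathbb{R}^l} + \lambda \|d_{n+1}\|^2_{L^2(D)}}.$$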
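The closed-form step size can be sanity-checked numerically. The following is a minimal sketch under stated assumptions: $L^2(D)$ is replaced by $\mathbb{R}^m$ with the Euclidean inner product, the operator is replaced by a random matrix `A`, and all names and numerical values (`A`, `F_n`, `d`, `lam`) are hypothetical choices made only for this illustration.

```python
import numpy as np

# Finite-dimensional stand-in (an assumption for illustration only):
# R^m plays the role of L^2(D), the matrix A plays the role of the operator.
rng = np.random.default_rng(1)
l, m = 5, 7
A = rng.standard_normal((l, m))
y = rng.standard_normal(l)
F_n = rng.standard_normal(m)   # current approximation
d = rng.standard_normal(m)     # one candidate dictionary element
lam = 0.3                      # regularization parameter

def J(alpha):
    """Tikhonov functional evaluated at F_n + alpha * d."""
    F_new = F_n + alpha * d
    return np.sum((y - A @ F_new) ** 2) + lam * np.sum(F_new ** 2)

# Closed-form step size obtained from the derivative condition.
R_n = y - A @ F_n
Fd = A @ d
alpha_star = (R_n @ Fd - lam * (F_n @ d)) / (Fd @ Fd + lam * (d @ d))

# J is a parabola in alpha, so every shifted step must give a larger value.
shifts = [-1.0, -0.1, -0.001, 0.001, 0.1, 1.0]
is_minimizer = all(J(alpha_star) < J(alpha_star + t) for t in shifts)
print(alpha_star, is_minimizer)
```

Since $J_\lambda$ is quadratic in $\alpha$ with positive leading coefficient $\|\mathcal{F} d\|^2 + \lambda \|d\|^2$, the stationary point is the unique minimizer, which is what the comparison against shifted step sizes checks.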
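We insert this result in (1) and obtain

$$\|R^{n+1}\|^2_{\mathbb{R}^l} + \lambda \|F_{n+1}\|^2_{L^2(D)} = J_\lambda(y, F_n, d_{n+1}, \alpha_{n+1})$$

$$= \|R^n\|^2_{\mathbb{R}^l} + \lambda \|F_n\|^2_{L^2(D)} - 2\, \frac{\left( \langle R^n, \mathcal{F} d_{n+1} \rangle_{\mathbb{R}^l} - \lambda \langle F_n, d_{n+1} \rangle_{L^2(D)} \right)^2}{\|\mathcal{F} d_{n+1}\|^2_{\mathbb{R}^l} + \lambda \|d_{n+1}\|^2_{L^2(D)}} + \left( \frac{\langle R^n, \mathcal{F} d_{n+1} \rangle_{\mathbb{R}^l} - \lambda \langle F_n, d_{n+1} \rangle_{L^2(D)}}{\|\mathcal{F} d_{n+1}\|^2_{\mathbb{R}^l} + \lambda \|d_{n+1}\|^2_{L^2(D)}} \right)^2 \left( \|\mathcal{F} d_{n+1}\|^2_{\mathbb{R}^l} + \lambda \|d_{n+1}\|^2_{L^2(D)} \right)$$

$$= \|R^n\|^2_{\mathbb{R}^l} + \lambda \|F_n\|^2_{L^2(D)} - \frac{\left( \langle R^n, \mathcal{F} d_{n+1} \rangle_{\mathbb{R}^l} - \lambda \langle F_n, d_{n+1} \rangle_{L^2(D)} \right)^2}{\|\mathcal{F} d_{n+1}\|^2_{\mathbb{R}^l} + \lambda \|d_{n+1}\|^2_{L^2(D)}}. \tag{2}$$

This motivates the following algorithm.

Algorithm 1 (Regularized Functional Matching Pursuit, RFMP). Let $y \in \mathbb{R}^l$, a linear and continuous operator $\mathcal{F} : L^2(D) \to \mathbb{R}^l$, and an initial approximation $F_0 \in L^2(D)$ be given.

1. Set $n := 0$ and $R^0 := y - \mathcal{F} F_0$, choose a stopping criterion (e.g., require $\|R^{n+1}\| < \varepsilon$ for a given $\varepsilon > 0$ or require $n + 1 \geq N$ for a given $N \in \mathbb{N}$), and choose a regularization parameter $\lambda \in \mathbb{R}_0^+$ (e.g., with the L-curve method).

2. Determine

$$d_{n+1} := \operatorname*{arg\,max}_{d \in \mathcal{D}} \frac{\left( \langle R^n, \mathcal{F} d \rangle_{\mathbb{R}^l} - \lambda \langle F_n, d \rangle_{L^2(D)} \right)^2}{\|\mathcal{F} d\|^2_{\mathbb{R}^l} + \lambda \|d\|^2_{L^2(D)}}, \tag{3}$$

$$\alpha_{n+1} := \frac{\langle R^n, \mathcal{F} d_{n+1} \rangle_{\mathbb{R}^l} - \lambda \langle F_n, d_{n+1} \rangle_{L^2(D)}}{\|\mathcal{F} d_{n+1}\|^2_{\mathbb{R}^l} + \lambda \|d_{n+1}\|^2_{L^2(D)}}, \tag{4}$$

and set $F_{n+1} := F_n + \alpha_{n+1} d_{n+1}$ and $R^{n+1} := R^n - \alpha_{n+1} \mathcal{F} d_{n+1}$.

3. If the stopping criterion is fulfilled, then $F_{n+1}$ is the output. Otherwise, increase $n$ by 1 and go to step 2.

Note that the choice of the optimal dictionary element $d_{n+1}$ is the most expensive part of the algorithm since every iteration step $n + 1$ requires the calculation of the fraction in (3) for each dictionary element $d \in \mathcal{D}$. For an efficient implementation, one should compute $\mathcal{F} d$, $\|\mathcal{F} d\|_{\mathbb{R}^l}$, and $\|d\|_{L^2(D)}$ for all $d \in \mathcal{D}$ in the preprocessing and store the corresponding vectors and scalars, respectively. Moreover, the formulae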
$$\langle R^{n+1}, \mathcal{F} d \rangle_{\mathbb{R}^l} = \langle R^n, \mathcal{F} d \rangle_{\mathbb{R}^l} - \alpha_{n+1} \langle \mathcal{F} d_{n+1}, \mathcal{F} d \rangle_{\mathbb{R}^l},$$

$$\langle F_{n+1}, d \rangle_{L^2(D)} = \langle F_n, d \rangle_{L^2(D)} + \alpha_{n+1} \langle d_{n+1}, d \rangle_{L^2(D)}$$

allow a fast update of the required inner products. Provided that enough memory is available, one can further reduce the computational expenses by calculating the inner products $\langle \mathcal{F} d, \mathcal{F} \tilde{d} \rangle_{\mathbb{R}^l}$ and $\langle d, \tilde{d} \rangle_{L^2(D)}$ for all pairs $(d, \tilde{d}) \in \mathcal{D}^2$ of dictionary elements in the preprocessing and by using them in these updates.
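The steps of Algorithm 1, including the preprocessing and the inner-product updates just described, can be sketched in a few lines. This is a minimal finite-dimensional illustration and not the implementation used in the cited experiments: it assumes that $L^2(D)$ is replaced by $\mathbb{R}^m$ with the Euclidean inner product, that the operator is a matrix `A`, and that the dictionary is a finite set of columns `atoms`. The function name `rfmp` and all parameter names are hypothetical.

```python
import numpy as np

def rfmp(A, y, atoms, lam, n_iter):
    """Finite-dimensional sketch of the RFMP with F_0 = 0.

    A     : (l, m) matrix standing in for the operator
    y     : (l,) data vector
    atoms : (m, N) matrix whose columns are the dictionary elements
    lam   : regularization parameter (>= 0)
    """
    # Preprocessing: operator applied to all atoms, denominators of (3)/(4),
    # and the Gram matrices used for the fast inner-product updates.
    Fd = A @ atoms
    denom = np.sum(Fd ** 2, axis=0) + lam * np.sum(atoms ** 2, axis=0)
    gram_data = Fd.T @ Fd          # <F d, F d~> for all pairs
    gram_dict = atoms.T @ atoms    # <d, d~> for all pairs

    F = np.zeros(A.shape[1])
    res_dot = Fd.T @ y                    # <R^0, F d>, since R^0 = y
    app_dot = np.zeros(atoms.shape[1])    # <F_0, d> = 0

    chosen, objective = [], []
    for _ in range(n_iter):
        numer = res_dot - lam * app_dot
        k = int(np.argmax(numer ** 2 / denom))   # selection rule (3)
        alpha = numer[k] / denom[k]              # step size (4)
        F += alpha * atoms[:, k]
        res_dot -= alpha * gram_data[k]          # update of <R^{n+1}, F d>
        app_dot += alpha * gram_dict[k]          # update of <F_{n+1}, d>
        chosen.append((k, alpha))
        # Diagnostic only: regularized data misfit of the current iterate.
        objective.append(np.sum((y - A @ F) ** 2) + lam * np.sum(F ** 2))
    return F, chosen, objective

# Hypothetical usage: an overcomplete dictionary built from the canonical
# basis plus additional random unit vectors.
rng = np.random.default_rng(0)
l, m = 20, 30
A = rng.standard_normal((l, m))
y = A @ rng.standard_normal(m)
atoms = np.hstack([np.eye(m), rng.standard_normal((m, 40))])
atoms /= np.linalg.norm(atoms, axis=0)
F_approx, chosen, objective = rfmp(A, y, atoms, lam=0.1, n_iter=200)
```

Storing both Gram matrices costs $O(N^2)$ memory for $N$ dictionary elements, which is the trade-off mentioned in the text; without them, each update requires the inner products with the newly chosen element to be recomputed.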
4 Properties of the Algorithm
In the following, we derive some theorems for the algorithm. We start with theorems which are valid in the regularized case ($\lambda > 0$) as well as the unregularized case ($\lambda = 0$).

Theorem 1. The sequence $\left( \|R^n\|^2_{\mathbb{R}^l} + \lambda \|F_n\|^2_{L^2(D)} \right)_n$ produced by the algorithm is monotonically decreasing and convergent.

Proof. The monotonicity is a direct consequence of (2). Moreover, since the monotonically decreasing sequence is bounded from below, it is convergent.

Theorem 1, in combination with (2) and (3), also shows that the strategy of the algorithm is to reduce the value $\|R^n\|^2_{\mathbb{R}^l} + \lambda \|F_n\|^2_{L^2(D)}$ as much as possible in each iteration step. This means that, in this respect, an optimal progress in the Tikhonov regularized data misfit is achieved. We can certainly not expect that the algorithm produces a sequence $(F_n)_n$ which converges to an exact solution $F$ of $\mathcal{F} F = y$ if $\lambda > 0$. This is also something that we do not desire if we try to solve an inverse problem which is ill-posed due to an unstable solution, since $F$ then depends discontinuously on $y$, i.e., $F$ is unstable with respect to small perturbations of the data vector $y$. Moreover, we certainly also cannot expect that the sequence $\left( \|R^n\|^2_{\mathbb{R}^l} + \lambda \|F_n\|^2_{L^2(D)} \right)_n$ converges to 0 since this would imply that $(F_n)_n$ converges to 0 and $y = 0$. We now prove a new theorem, which extends a result which we have previously published for the unregularized case only (see Theorem 4.5 in Fischer and Michel 2012).

Theorem 2 (Convergence Theorem). Let the dictionary $\mathcal{D}$ satisfy the following properties:

1. “Semi-frame condition”: There exists a constant $c > 0$ such that, for all expansions
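$H = \sum_{k=1}^{\infty} \beta_k d_k$ with $\beta_k \in \mathbb{R}$ and $d_k \in \mathcal{D}$, where the $d_k$ are not necessarily pairwise distinct but $\{ j \in \mathbb{N} \mid d_j = d_k \}$ is a finite set for each $k \in \mathbb{N}$, the following inequality is valid:

$$c\, \|H\|^2_{L^2(D)} \leq \sum_{k=1}^{\infty} \beta_k^2.$$

2. $C_1 := \inf_{d \in \mathcal{D}} \left( \|\mathcal{F} d\|^2_{\mathbb{R}^l} + \lambda \|d\|^2_{L^2(D)} \right) > 0$.

If the sequence $(F_n)_n$ is produced by the RFMP and no dictionary element is chosen infinitely often, then $(F_n)_n$ converges in $L^2(D)$ to $F_\infty := \sum_{n=1}^{\infty} \alpha_n d_n \in L^2(D)$. Moreover, the following holds true:

(1) If $\overline{\operatorname{span} \mathcal{D}}^{\|\cdot\|_{L^2(D)}} = L^2(D)$, $C_2 := \sup_{d \in \mathcal{D}} \|d\|_{L^2(D)} < +\infty$, and $\lambda \in \mathbb{R}_0^+$ is an arbitrary parameter, then $F_\infty$ solves

$$(\mathcal{F}^* \mathcal{F} + \lambda I) F_\infty = \mathcal{F}^* y,$$

where $\mathcal{F}^*$ is the adjoint operator corresponding to $\mathcal{F}$ and $I$ is the identity operator on $L^2(D)$. In other words,

$$\|y - \mathcal{F} F_\infty\|^2_{\mathbb{R}^l} + \lambda \|F_\infty\|^2_{L^2(D)} = \min_{F \in L^2(D)} \left( \|y - \mathcal{F} F\|^2_{\mathbb{R}^l} + \lambda \|F\|^2_{L^2(D)} \right),$$

where the minimizer is unique if $\lambda > 0$.

(2) If $\operatorname{span} \{ \mathcal{F} d \mid d \in \mathcal{D} \} = \mathbb{R}^l$ and $\lambda = 0$, then $F_\infty$ solves $\mathcal{F} F_\infty = y$.

Proof. According to (4) and (2), we have

$$\alpha_n^2 = \frac{1}{\|\mathcal{F} d_n\|^2_{\mathbb{R}^l} + \lambda \|d_n\|^2_{L^2(D)}} \cdot \frac{\left( \langle R^{n-1}, \mathcal{F} d_n \rangle_{\mathbb{R}^l} - \lambda \langle F_{n-1}, d_n \rangle_{L^2(D)} \right)^2}{\|\mathcal{F} d_n\|^2_{\mathbb{R}^l} + \lambda \|d_n\|^2_{L^2(D)}}$$

$$= \frac{1}{\|\mathcal{F} d_n\|^2_{\mathbb{R}^l} + \lambda \|d_n\|^2_{L^2(D)}} \left[ \|R^{n-1}\|^2_{\mathbb{R}^l} + \lambda \|F_{n-1}\|^2_{L^2(D)} - \|R^n\|^2_{\mathbb{R}^l} - \lambda \|F_n\|^2_{L^2(D)} \right]. \tag{5}$$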
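Hence, condition 2 of the theorem to be proved as well as Theorem 1 yield

$$\sum_{n=N}^{\infty} \alpha_n^2 \leq \frac{1}{C_1} \sum_{n=N}^{\infty} \left[ \|R^{n-1}\|^2_{\mathbb{R}^l} + \lambda \|F_{n-1}\|^2_{L^2(D)} - \|R^n\|^2_{\mathbb{R}^l} - \lambda \|F_n\|^2_{L^2(D)} \right]$$

$$= \frac{1}{C_1} \left[ \|R^{N-1}\|^2_{\mathbb{R}^l} + \lambda \|F_{N-1}\|^2_{L^2(D)} - \lim_{n \to \infty} \left( \|R^n\|^2_{\mathbb{R}^l} + \lambda \|F_n\|^2_{L^2(D)} \right) \right]. \tag{6}$$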
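Obviously, for $N \to \infty$, the right-hand side converges to 0 and, consequently, also the left-hand side. Thus, condition 1 of the theorem to be proved yields that $F_\infty := \sum_{n=1}^{\infty} \alpha_n d_n$ is a convergent series in $L^2(D)$:

$$\lim_{N \to \infty} \|F_\infty - F_{N-1}\|^2_{L^2(D)} = \lim_{N \to \infty} \left\| \sum_{n=N}^{\infty} \alpha_n d_n \right\|^2_{L^2(D)} \leq \frac{1}{c} \lim_{N \to \infty} \sum_{n=N}^{\infty} \alpha_n^2 = 0.$$

As a consequence, the sequence of summands must converge to 0, i.e., with (4), we get

$$\alpha_{n+1}^2 = \left( \frac{\langle R^n, \mathcal{F} d_{n+1} \rangle_{\mathbb{R}^l} - \lambda \langle F_n, d_{n+1} \rangle_{L^2(D)}}{\|\mathcal{F} d_{n+1}\|^2_{\mathbb{R}^l} + \lambda \|d_{n+1}\|^2_{L^2(D)}} \right)^2 \to 0 \quad \text{as } n \to \infty.$$

Moreover, with the continuity of $\mathcal{F}$, the boundedness of the dictionary in the case (1), and (3), we obtain, for all $d \in \mathcal{D}$, the inequality

$$\left( \frac{\langle R^n, \mathcal{F} d_{n+1} \rangle_{\mathbb{R}^l} - \lambda \langle F_n, d_{n+1} \rangle_{L^2(D)}}{\|\mathcal{F} d_{n+1}\|^2_{\mathbb{R}^l} + \lambda \|d_{n+1}\|^2_{L^2(D)}} \right)^2 \geq \frac{1}{\left( \|\mathcal{F}\|^2_{\mathcal{L}} + \lambda \right) \|d_{n+1}\|^2_{L^2(D)}} \cdot \frac{\left( \langle R^n, \mathcal{F} d_{n+1} \rangle_{\mathbb{R}^l} - \lambda \langle F_n, d_{n+1} \rangle_{L^2(D)} \right)^2}{\|\mathcal{F} d_{n+1}\|^2_{\mathbb{R}^l} + \lambda \|d_{n+1}\|^2_{L^2(D)}}$$

$$\geq \frac{1}{\left( \|\mathcal{F}\|^2_{\mathcal{L}} + \lambda \right) C_2^2} \cdot \frac{\left( \langle R^n, \mathcal{F} d \rangle_{\mathbb{R}^l} - \lambda \langle F_n, d \rangle_{L^2(D)} \right)^2}{\|\mathcal{F} d\|^2_{\mathbb{R}^l} + \lambda \|d\|^2_{L^2(D)}},$$

where $\|\mathcal{F}\|_{\mathcal{L}}$ is the usual operator norm given by

$$\|\mathcal{F}\|_{\mathcal{L}} := \sup_{F \in L^2(D),\ F \neq 0} \frac{\|\mathcal{F} F\|_{\mathbb{R}^l}}{\|F\|_{L^2(D)}}$$

for continuous operators. Hence,

$$\langle R^n, \mathcal{F} d \rangle_{\mathbb{R}^l} - \lambda \langle F_n, d \rangle_{L^2(D)} = \langle \mathcal{F}^* R^n - \lambda F_n, d \rangle_{L^2(D)} \to 0 \quad \text{as } n \to \infty$$

for all $d \in \mathcal{D}$ and, due to the first requirement on $\mathcal{D}$ in the case (1), also for all $d \in L^2(D)$. Thus, $(\mathcal{F}^* R^n)_n$ weakly converges to $\lambda F_\infty$. However, since $\mathcal{F}^* R^n = \mathcal{F}^* y - \mathcal{F}^* \mathcal{F} F_n$, the continuity of $\mathcal{F}$ and $\mathcal{F}^*$ implies that $(\mathcal{F}^* R^n)_n$ strongly converges to $\mathcal{F}^* y - \mathcal{F}^* \mathcal{F} F_\infty$. Consequently,

$$\mathcal{F}^* y - \mathcal{F}^* \mathcal{F} F_\infty = \lambda F_\infty.$$

Hence,

$$\mathcal{F}^* y = \left( \mathcal{F}^* \mathcal{F} + \lambda I \right) F_\infty. \tag{7}$$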
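Let now $F_\infty$ be a fixed solution of (7). It is well-known (see, e.g., Louis 1989, p. 89) that (note that $\mathcal{F}^* \mathcal{F} + \lambda I$ is self-adjoint)

$$\|y - \mathcal{F} F\|^2_{\mathbb{R}^l} + \lambda \|F\|^2_{L^2(D)} = \|y\|^2_{\mathbb{R}^l} - 2 \langle y, \mathcal{F} F \rangle_{\mathbb{R}^l} + \langle \mathcal{F} F, \mathcal{F} F \rangle_{\mathbb{R}^l} + \lambda \langle F, F \rangle_{L^2(D)}$$

$$= \|y\|^2_{\mathbb{R}^l} - 2 \langle \mathcal{F}^* y, F \rangle_{L^2(D)} + \left\langle \left( \mathcal{F}^* \mathcal{F} + \lambda I \right) F, F \right\rangle_{L^2(D)}$$

$$= \|y\|^2_{\mathbb{R}^l} - 2 \langle \mathcal{F}^* y, F \rangle_{L^2(D)} + \left\langle \left( \mathcal{F}^* \mathcal{F} + \lambda I \right) (F - F_\infty), F - F_\infty \right\rangle_{L^2(D)} + 2 \left\langle \left( \mathcal{F}^* \mathcal{F} + \lambda I \right) F_\infty, F \right\rangle_{L^2(D)} - \left\langle \left( \mathcal{F}^* \mathcal{F} + \lambda I \right) F_\infty, F_\infty \right\rangle_{L^2(D)}$$

$$= \|y\|^2_{\mathbb{R}^l} + \left\langle \left( \mathcal{F}^* \mathcal{F} + \lambda I \right) (F - F_\infty), F - F_\infty \right\rangle_{L^2(D)} - \langle \mathcal{F}^* y, F_\infty \rangle_{L^2(D)} \tag{8}$$

is minimal if $F = F_\infty$, since $\mathcal{F}^* \mathcal{F} + \lambda I$ is positive semi-definite:

$$\left\langle \left( \mathcal{F}^* \mathcal{F} + \lambda I \right) F, F \right\rangle_{L^2(D)} = \|\mathcal{F} F\|^2_{\mathbb{R}^l} + \lambda \|F\|^2_{L^2(D)} \geq 0 \quad \text{for all } F \in L^2(D).$$

Moreover, in the case $\lambda > 0$, (8) is minimal if and only if $F = F_\infty$, since $\mathcal{F}^* \mathcal{F} + \lambda I$ is positive definite. Note that this also implies that $F_\infty = (\mathcal{F}^* \mathcal{F} + \lambda I)^{-1} \mathcal{F}^* y$ is uniquely determined if $\lambda > 0$.

In the unregularized case (case (2), i.e., $\lambda = 0$), Theorem 1 yields that $(\|R^n\|_{\mathbb{R}^l})_n$ converges. Hence, there exists a convergent subsequence $(R^{n_j})_j$ with limit $R^\infty \in \mathbb{R}^l$. Moreover, (3), (5), and the convergence of the series in (6) yield that, for all $d \in \mathcal{D}$,

$$0 = \lim_{j \to \infty} \frac{\langle R^{n_j}, \mathcal{F} d_{n_j + 1} \rangle^2_{\mathbb{R}^l}}{\|\mathcal{F} d_{n_j + 1}\|^2_{\mathbb{R}^l}} \geq \lim_{j \to \infty} \frac{\langle R^{n_j}, \mathcal{F} d \rangle^2_{\mathbb{R}^l}}{\|\mathcal{F} d\|^2_{\mathbb{R}^l}} = \frac{\langle R^\infty, \mathcal{F} d \rangle^2_{\mathbb{R}^l}}{\|\mathcal{F} d\|^2_{\mathbb{R}^l}} \geq 0.$$

Due to the requirement that $\operatorname{span} \{ \mathcal{F} d \mid d \in \mathcal{D} \} = \mathbb{R}^l$, we get $R^\infty = 0$. Since $(\|R^n\|_{\mathbb{R}^l})_n$ is monotonically decreasing and every convergent subsequence converges to 0, the sequence $(\|R^n\|_{\mathbb{R}^l})_n$ converges to 0 and, thus, $(R^n)_n$ converges to $0 \in \mathbb{R}^l$. Finally, we use the continuity of $\mathcal{F}$ and conclude that

$$\mathcal{F} F_\infty = \lim_{n \to \infty} \mathcal{F} F_n = \lim_{n \to \infty} (y - R^n) = y.$$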
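The statement of case (1) of Theorem 2 can be illustrated with a small numerical experiment. The sketch below is only an illustration under strong assumptions and not an instance of the theorem: $L^2(D)$ is replaced by $\mathbb{R}^m$, the operator by a random matrix `A`, and the dictionary is finite, so that the hypothesis that no element is chosen infinitely often cannot hold. In this finite-dimensional setting with $\lambda > 0$ and a spanning dictionary, the greedy iteration nevertheless approaches the solution of the regularized normal equation. All names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
l, m = 4, 6
A = rng.standard_normal((l, m))     # stand-in for the operator
y = rng.standard_normal(l)          # data vector
lam = 0.5                           # regularization parameter

# Overcomplete dictionary: canonical basis plus ten random unit vectors.
extra = rng.standard_normal((m, 10))
extra /= np.linalg.norm(extra, axis=0)
atoms = np.hstack([np.eye(m), extra])

Fd = A @ atoms
denom = np.sum(Fd ** 2, axis=0) + lam * np.sum(atoms ** 2, axis=0)

F = np.zeros(m)
objective = []
for _ in range(5000):
    R = y - A @ F                              # current residual
    numer = Fd.T @ R - lam * (atoms.T @ F)     # numerator of (3) and (4)
    k = int(np.argmax(numer ** 2 / denom))     # greedy selection
    F = F + (numer[k] / denom[k]) * atoms[:, k]
    objective.append(np.sum((y - A @ F) ** 2) + lam * np.sum(F ** 2))

# Reference: direct solution of the regularized normal equation (7).
F_tik = np.linalg.solve(A.T @ A + lam * np.eye(m), A.T @ y)
error = np.linalg.norm(F - F_tik)
print(error)
```

The recorded values in `objective` also reflect Theorem 1: the regularized data misfit does not increase from one iteration to the next, up to rounding.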
Note that the requirements of (2) are easier to achieve than the requirements of (1) due to the finite dimension of $\mathbb{R}^l$ and the infinite dimension of $L^2(D)$. The results of Theorem 2 show that the algorithm, indeed, converges to the desired result. In the absence of a regularization, the limit $F_\infty$ of the algorithm is an exact solution, whereas the regularized case yields a solution of the Tikhonov regularized normal equation, i.e., we, indeed, obtain the minimal (regularized) data misfit, as we desired.

Since the RFMP contains the MP from Mallat and Zhang (1993) and Vincent and Bengio (2002) as a particular case, the convergence rate of the MP, which was proved in Mallat and Zhang (1993), can analogously also be proved for the RFMP in the case $\lambda = 0$, as we observed in Fischer (2011) and Fischer and Michel (2012). We state this result here. Note that the RFMP was called the FMP (Functional Matching Pursuit) in the case $\lambda = 0$ in Fischer (2011) and Fischer and Michel (2012).

Definition 2. Let a dictionary $\mathcal{D}$ be given. The corresponding correlation ratio is given, for each $v \in \mathbb{R}^l \setminus \{0\}$, by

$$\varrho(v) := \sup_{d \in \mathcal{D},\ \mathcal{F} d \neq 0} \frac{|\langle v, \mathcal{F} d \rangle_{\mathbb{R}^l}|}{\|v\|_{\mathbb{R}^l}\, \|\mathcal{F} d\|_{\mathbb{R}^l}}.$$

Furthermore, we define the worst case correlation ratio by

$$I(\varrho) := \inf_{v \in \mathbb{R}^l \setminus \{0\}} \varrho(v).$$

The following properties of the correlation ratio are obvious:
• The correlation ratio $\varrho(v)$ and, consequently, also $I(\varrho)$ are bounded from below by 0 and, due to the Cauchy-Schwarz inequality, bounded from above by 1.
• The correlation ratio $\varrho(v)$ is independent of $\|v\|$, i.e., $\varrho(rv) = \varrho(v)$ for all $r \in \mathbb{R} \setminus \{0\}$ and all $v \in \mathbb{R}^l \setminus \{0\}$.
• In the case $v = R^n$ and $\lambda = 0$, the correlation ratio $\varrho(R^n)$ is obtained for $d = d_{n+1}$ (see (3)).

Theorem 3 (Convergence Rate of the FMP). Let a function $F \in L^2(D)$, the corresponding data vector $y := \mathcal{F} F \in \mathbb{R}^l$, and a dictionary $\mathcal{D}$ with $\operatorname{span} \{ \mathcal{F} d \mid d \in \mathcal{D} \} = \mathbb{R}^l$ be given and let the sequence $(R^n)_n$ be produced by the RFMP with $\lambda = 0$ (i.e., the FMP). Then the residual converges exponentially to 0. More precisely,

$$\|R^n\|_{\mathbb{R}^l} \leq \|y\|_{\mathbb{R}^l} \left( 1 - I(\varrho)^2 \right)^{n/2}$$

for all $n \in \mathbb{N}$, where $I(\varrho) > 0$.

Moreover, the choice of a Tikhonov regularization term for the penalty term in $J_\lambda$ automatically inherits the well-known nice properties of the Tikhonov regularization (see, e.g., Rieder (2003) and the references therein). These are the stability (i.e., the continuous dependence of the solution with respect to the data vector) and the convergence with respect to the regularization parameter. We start with the following
theorem, which (slightly) extends our previous result on the stability of the solution (see Fischer 2011, Theorem 4.7; Fischer and Michel 2012, Theorem 5.4).

Theorem 4 (Stability of the Solution). Let the dictionary satisfy conditions 1, 2, and (1) of Theorem 2 and let $\lambda > 0$. Moreover, let $(y^k)_k \subset \mathbb{R}^l$ be a convergent sequence with limit $y \in \mathbb{R}^l$ and let, for each $k \in \mathbb{N}_0$, $F_{\infty,k} \in L^2(D)$ be the corresponding limit produced by the RFMP for the data vector $y^k$. Then $(F_{\infty,k})_k$ converges to the limit $F_\infty$ produced by the RFMP for the data vector $y$.

Proof. Due to Theorem 2, each $F_{\infty,k}$ is the unique minimizer of

$$L^2(D) \ni F \mapsto \|y^k - \mathcal{F} F\|^2_{\mathbb{R}^l} + \lambda \|F\|^2_{L^2(D)}. \tag{9}$$

From the well-known theory of Tikhonov regularization (see, e.g., Engl et al. 1989, Theorem 2.1; Rieder 2003, p. 241; Seidman and Vogel 1989, Theorem 2), we know that the minimizer of (9) converges to the minimizer of

$$L^2(D) \ni F \mapsto \|y - \mathcal{F} F\|^2_{\mathbb{R}^l} + \lambda \|F\|^2_{L^2(D)} \tag{10}$$

as $k \to \infty$. Moreover, the minimization of (10) is equivalent to the equation

$$\left( \mathcal{F}^* \mathcal{F} + \lambda I \right) F = \mathcal{F}^* y,$$

which is uniquely solved by the limit $F_\infty$ of the RFMP in the case of the data vector $y$, as we showed in the proof of Theorem 2. This completes the proof of Theorem 4.

The second theorem regarding the regularization addresses the convergence of the solution with respect to the regularization parameter.

Theorem 5 (Convergence of the Regularization). Let $y \in \mathcal{F}(L^2(D))$ be a given (exact) data vector and $(y^\varepsilon)_{\varepsilon > 0} \subset \mathbb{R}^l$ be a family of given perturbed data vectors with $\|y - y^\varepsilon\|_{\mathbb{R}^l} \leq \varepsilon$. Moreover, let $F^+$ be the minimum-norm solution of $\mathcal{F} F = y$, i.e.,

$$\|F^+\|_{L^2(D)} = \min \left\{ \|F\|_{L^2(D)} \,\middle|\, F \in L^2(D) \text{ and } \mathcal{F} F = y \right\},$$

which is obtained by the Moore-Penrose pseudoinverse $\mathcal{F}^+$ as $F^+ = \mathcal{F}^+ y$. Furthermore, let the regularization parameter $\lambda : \mathbb{R}^+ \to \mathbb{R}^+$ be chosen in dependence on the noise level $\varepsilon$ in the sense that

$$\lim_{\varepsilon \to 0+} \lambda(\varepsilon) = 0 = \lim_{\varepsilon \to 0+} \frac{\varepsilon^2}{\lambda(\varepsilon)}.$$
If $(F_{\infty,\varepsilon})_{\varepsilon > 0}$ denotes the family of limits produced by the RFMP in the case of the data vector $y^\varepsilon$ and the regularization parameter $\lambda(\varepsilon)$ for each $\varepsilon > 0$, provided that again conditions 1, 2, and (1) of Theorem 2 are satisfied, then

$$\lim_{\varepsilon \to 0+} \left\| F_{\infty,\varepsilon} - F^+ \right\|_{L^2(D)} = 0.$$

Proof. According to Theorem 2, the limit $F_{\infty,\varepsilon}$ uniquely minimizes

$$L^2(D) \ni F \mapsto \|y^\varepsilon - \mathcal{F} F\|^2_{\mathbb{R}^l} + \lambda(\varepsilon) \|F\|^2_{L^2(D)}. \tag{11}$$

From, e.g., Engl et al. (1996, Theorem 5.2), we know that the limit $\varepsilon \to 0+$ and the conditions stated above yield that the family of minimizers of (11) has the required convergence property.
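The parameter choice in Theorem 5 can be illustrated in a finite-dimensional setting. The following sketch is an illustration under stated assumptions, not part of the proof: the operator is replaced by a matrix `A` with prescribed singular values, the Tikhonov minimizer of (11) is computed directly by a linear solve instead of being produced by the RFMP, and the rule $\lambda(\varepsilon) = \varepsilon$ is just one admissible choice, since it satisfies $\lambda(\varepsilon) \to 0$ and $\varepsilon^2 / \lambda(\varepsilon) = \varepsilon \to 0$. All names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
l, m = 3, 6

# Matrix with prescribed singular values 1, 0.5, 0.2 (underdetermined system).
U, _ = np.linalg.qr(rng.standard_normal((l, l)))
V, _ = np.linalg.qr(rng.standard_normal((m, l)))   # orthonormal columns
A = U @ np.diag([1.0, 0.5, 0.2]) @ V.T

F_true = rng.standard_normal(m)
y = A @ F_true                        # exact data in the range of A
F_plus = np.linalg.pinv(A) @ y        # minimum-norm solution

noise_dir = rng.standard_normal(l)
noise_dir /= np.linalg.norm(noise_dir)

noise_levels = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
errors = []
for eps in noise_levels:
    y_eps = y + eps * noise_dir       # perturbed data with ||y - y_eps|| = eps
    lam = eps                         # admissible parameter choice rule
    F_eps = np.linalg.solve(A.T @ A + lam * np.eye(m), A.T @ y_eps)
    errors.append(np.linalg.norm(F_eps - F_plus))

for eps, err in zip(noise_levels, errors):
    print(eps, err)
```

A rule such as $\lambda(\varepsilon) = \varepsilon^2$ would violate the second limit condition, because $\varepsilon^2 / \lambda(\varepsilon)$ would then stay equal to 1.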
5 Discussion of Some Dictionaries
We discuss here some examples of inverse problems and dictionaries that are relevant for geomathematical problems. The list of examples is certainly not complete and not all theoretical problems that occur here have been solved so far. For reasons of simplicity, we assume here, without loss of generality, that all dictionary elements have been normalized, i.e., $\|d\|_{L^2(D)} = 1$ for all $d \in \mathcal{D}$. This has also been used in the majority of our numerical experiments since this assumption simplifies the calculation of the denominators of (3) and (4) in Algorithm 1. The primary objective of this section is the investigation of the conditions stated in Theorem 2. The assumption of a normalized dictionary has the consequence that the requirement on $C_1$ becomes trivial in the regularized case ($\lambda > 0$), whereas it (always) reduces to the requirement that $\inf_{d \in \mathcal{D}} \|\mathcal{F} d\|_{\mathbb{R}^l} > 0$ in the unregularized case ($\lambda = 0$). Moreover, we, obviously, have $C_2 = 1$ in the case of this assumption.

We focus here on two particular domains: the unit sphere $\Omega \subset \mathbb{R}^3$ and the ball $B \subset \mathbb{R}^3$ with radius $\beta > 0$. For both domains, a series of trial functions is available, which we will only briefly summarize here. For further details, see, e.g., the textbook Michel (2013) and the author’s other article in this handbook. We can, roughly, subdivide the trial functions into two types: global and localized trial functions. Examples of global trial functions are orthogonal polynomials such as the so-called spherical harmonics $\{Y_{n,j}\}_{n \in \mathbb{N}_0;\ j = 1, \dots, 2n+1}$ (see the first column of Fig. 1) on the unit sphere $\Omega$ as well as the systems $\{G^{\mathrm{I}}_{m,n,j}\}_{m,n \in \mathbb{N}_0;\ j = 1, \dots, 2n+1}$ and $\{G^{\mathrm{II}}_{m,n,j}\}_{m,n \in \mathbb{N}_0;\ j = 1, \dots, 2n+1}$ on the ball $B$, where it should be mentioned that the latter functions on $B$ are not polynomials in Cartesian coordinates. All three systems are orthonormal basis systems in $L^2(\Omega)$ and $L^2(B)$, respectively, and, therefore, trivially satisfy the semi-frame condition (condition 1 of Theorem 2) as well as condition (1) of Theorem 2.
Some localized trial functions are based on reproducing kernels $K_h(\cdot, \cdot)$ in the sense that $K_h(x, \cdot)$ is a hat function, where $x$ is the center of the hat and the parameter $h$ controls the localization, i.e., the “hat width”. They can be used for spline interpolation/approximation as well as for a wavelet analysis. A celebrated example of a localized trial function on the sphere is generated by the Abel-Poisson kernel in the sense that

$$K_h(\xi \cdot) : \Omega \ni \eta \mapsto K_h(\xi \cdot \eta) := \frac{1 - h^2}{4\pi \left( 1 + h^2 - 2h\, \xi \cdot \eta \right)^{3/2}},$$

see the second column of Fig. 1. Further examples on the sphere and the ball are known. For many inverse problems in geomathematics, singular value decompositions are known. This can, e.g., occur in the sense that the operator $\mathcal{F}$ satisfies

$$\mathcal{F} Y_{n,j} = \left( \sigma_n Y_{n,j}(\eta^k) \right)_{k = 1, \dots, l},$$
Fig. 1 Examples of functions which can be used as elements of a dictionary on the sphere are spherical harmonics (see the first column for examples of degree $n = 5$) as global trial functions and Abel-Poisson kernels (see the second column, where $h = 0.5$ (top), $h = 0.7$ (middle), and $h = 0.9$ (bottom) and, in each case, $\xi = (0, 1, 0)^{\mathrm{T}}$ were used), which are localized trial functions and are often used as spline or wavelet basis functions on the sphere
where $(\sigma_n)_n$ converges to 0 and $\{\eta^k\}_{k = 1, \dots, l}$ is a grid of pairwise distinct points on $\Omega$. An example is the downward continuation problem, where $\mathcal{F}$ maps a harmonic potential (such as the gravitational potential) from the surface of the Earth to a higher altitude (e.g., a satellite altitude). The inverse problem consists of the recovery of the potential at the surface from given data at the point grid $\{\eta^k\}_{k = 1, \dots, l}$ at the high altitude. In this example (see also the results in Telschow (2014) for the application of an enhanced version of the RFMP to the downward continuation problem), we have

$$\sigma_n = c\, \gamma^n \quad \text{for constants } c > 0 \text{ and } 0 < \gamma < 1. \tag{12}$$
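If we include all spherical harmonics basis functions $\{Y_{n,j}\}_{n \in \mathbb{N}_0;\ j = 1, \dots, 2n+1}$ in the dictionary $\mathcal{D}$, then the well-known maximum-norm estimate for spherical harmonics $\max_{\xi \in \Omega} |Y_{n,j}(\xi)| \leq \sqrt{\frac{2n+1}{4\pi}}$ (see, e.g., Freeden et al. 1998, Lemma 3.1.5 or Müller 1966, Lemma 8) yields

$$\|\mathcal{F} Y_{n,j}\|^2_{\mathbb{R}^l} = \sum_{k=1}^{l} \sigma_n^2 \left( Y_{n,j}(\eta^k) \right)^2 \leq \sum_{k=1}^{l} \sigma_n^2\, \frac{2n+1}{4\pi} = l\, \sigma_n^2\, \frac{2n+1}{4\pi} \to 0 \quad \text{as } n \to \infty.$$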
Hence, $\inf_{d \in \mathcal{D}} \|\mathcal{F} d\|_{\mathbb{R}^l} = 0$ in this case. This violation of condition 2 of Theorem 2 (in the unregularized case) cannot be compensated by renormalizing the dictionary elements for the following reason: let us replace the normalized basis $\{Y_{n,j}\}_{n \in \mathbb{N}_0;\ j = 1, \dots, 2n+1} \subset \mathcal{D}$ by a system $\{\beta_n Y_{n,j}\}_{n \in \mathbb{N}_0;\ j = 1, \dots, 2n+1} \subset \mathcal{D}$. We have to take into account the (possible) arbitrariness of the point grid $\{\eta^k\}_{k = 1, \dots, l}$ such that an assumption like $\max_{k = 1, \dots, l} |Y_{n,j}(\eta^k)| \geq \tau \sqrt{\frac{2n+1}{4\pi}}$ for all $n \in \mathbb{N}_0$ and all $j = 1, \dots, 2n+1$, where $\tau$ is a fixed constant with $0 < \tau < 1$, appears to be reasonable (note the maximum-norm estimate mentioned above). In this case, condition 2 of Theorem 2 is only satisfied (in the unregularized case), if

$$\inf_{n \in \mathbb{N}_0;\ j = 1, \dots, 2n+1} \|\mathcal{F}(\beta_n Y_{n,j})\|^2_{\mathbb{R}^l} = \inf_{n \in \mathbb{N}_0;\ j = 1, \dots, 2n+1} \sum_{k=1}^{l} \sigma_n^2 \beta_n^2 \left( Y_{n,j}(\eta^k) \right)^2 \geq \tau^2 \inf_{n \in \mathbb{N}_0} \sigma_n^2 \beta_n^2\, \frac{2n+1}{4\pi} > 0.$$
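To make the sizes involved concrete, the following sketch evaluates the upper bound $l\, \sigma_n^2 \frac{2n+1}{4\pi}$ for $\|\mathcal{F} Y_{n,j}\|^2$ and the size that a rescaling factor $\beta_n$ would need so that $\tau^2 \sigma_n^2 \beta_n^2 \frac{2n+1}{4\pi}$ stays at a fixed positive level. The constants `c`, `gamma`, `l`, `tau`, and `level` are hypothetical values chosen only for illustration; they are not taken from a specific satellite configuration.

```python
import math

# Hypothetical constants for sigma_n = c * gamma**n as in (12).
c, gamma = 1.0, 0.9
l = 100        # number of data points
tau = 0.5      # constant in the assumed lower bound on max_k |Y_{n,j}(eta^k)|

def upper_bound(n):
    """Upper bound l * sigma_n^2 * (2n+1) / (4*pi) for ||F Y_{n,j}||^2."""
    sigma_n = c * gamma ** n
    return l * sigma_n ** 2 * (2 * n + 1) / (4 * math.pi)

def required_beta(n, level=1.0):
    """beta_n for which tau^2 * sigma_n^2 * beta_n^2 * (2n+1)/(4*pi) equals level."""
    sigma_n = c * gamma ** n
    return math.sqrt(level * 4 * math.pi / (2 * n + 1)) / (tau * sigma_n)

degrees = [0, 50, 100, 200, 400]
bounds = [upper_bound(n) for n in degrees]
betas = [required_beta(n) for n in degrees]
for n, b, beta in zip(degrees, bounds, betas):
    print(n, b, beta)
```

The polynomial factor $2n+1$ cannot compensate the exponential decay of $\sigma_n^2$, so the bound tends to 0, while the required rescaling factor grows essentially like $\gamma^{-n}$.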
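As a consequence, we have to choose $(\beta_n)_n$ as an exponentially diverging sequence due to (12). This, however, violates both condition (1) of Theorem 2 since

$$C_2 = \sup_{d \in \mathcal{D}} \|d\|_{L^2(\Omega)} \geq \sup_{n \in \mathbb{N}_0;\ j = 1, \dots, 2n+1} \|\beta_n Y_{n,j}\|_{L^2(\Omega)} = \sup_{n \in \mathbb{N}_0} \beta_n = +\infty$$

and the semi-frame condition (condition 1 of Theorem 2) since

$$\left\| \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} \frac{1}{\beta_n}\, \beta_n Y_{n,j} \right\|^2_{L^2(\Omega)} = \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} 1 = +\infty,$$

but

$$\sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} \frac{1}{\beta_n^2} < +\infty.$$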
As a consequence, the convergence to a solution in the unregularized case cannot be guaranteed in the case of spherical harmonics. This might have the consequence that numerical instabilities occur if relatively low regularization parameters are used in combination with a dictionary with high degree spherical harmonics.

Let us discuss the following alternative: we insert Abel-Poisson kernels (with a grid of pairwise distinct points $\{\xi^i\}_{i \in \mathbb{N}_0}$ which is dense in $\Omega$)

$$K_h(\xi^i \cdot) = \frac{1 - h^2}{4\pi \left( 1 + h^2 - 2h\, \xi^i \cdot \right)^{3/2}} = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi}\, h^n P_n(\xi^i \cdot) = \sum_{n=0}^{\infty} h^n \sum_{j=1}^{2n+1} Y_{n,j}(\xi^i)\, Y_{n,j}$$

in the dictionary, where we used the addition theorem for spherical harmonics (see, e.g., Freeden et al. 1998, Theorem 3.1.3 or Müller 1966, Theorem 2) in this representation and $P_n$ is a Legendre polynomial of degree $n$. We now verify the existence of $C_1$ and $C_2$. Since $\mathcal{F}$ is linear and continuous, we get

$$\left\| \mathcal{F} K_h(\xi^i \cdot) \right\|^2_{\mathbb{R}^l} = \left\| \left( \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} h^n Y_{n,j}(\xi^i)\, \sigma_n Y_{n,j}(\eta^k) \right)_{k = 1, \dots, l} \right\|^2_{\mathbb{R}^l} = c^2 \left\| \left( \sum_{n=0}^{\infty} \frac{2n+1}{4\pi}\, h^n \gamma^n P_n(\xi^i \cdot \eta^k) \right)_{k = 1, \dots, l} \right\|^2_{\mathbb{R}^l}$$

$$= c^2 \sum_{k=1}^{l} \left( \sum_{n=0}^{\infty} (h\gamma)^n\, \frac{2n+1}{4\pi}\, P_n(\xi^i \cdot \eta^k) \right)^2 = c^2 \sum_{k=1}^{l} \left( K_{h\gamma}(\xi^i \cdot \eta^k) \right)^2 \geq c^2\, l\, K_{h\gamma}(-1)^2 > 0$$

for all $i$ such that $\inf_{i \in \mathbb{N}_0} \|\mathcal{F} K_h(\xi^i \cdot)\|_{\mathbb{R}^l} > 0$. Moreover, the fact that $P_n(1) = 1$ for all $n \in \mathbb{N}_0$ implies

$$\left\| K_h(\xi^i \cdot) \right\|^2_{L^2(\Omega)} = \left\langle K_h(\xi^i \cdot), K_h(\xi^i \cdot) \right\rangle_{L^2(\Omega)} = \sum_{n=0}^{\infty} \sum_{j=1}^{2n+1} h^{2n} \left( Y_{n,j}(\xi^i) \right)^2 = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi}\, h^{2n} P_n(\xi^i \cdot \xi^i) = K_{h^2}(1)$$

such that $\sup_{i \in \mathbb{N}_0} \|K_h(\xi^i \cdot)\|^2_{L^2(\Omega)} = K_{h^2}(1) < +\infty$. Hence, the Abel-Poisson kernel guarantees the existence of $C_1$ and $C_2$ in the unregularized and the regularized case. It remains, however, an open problem to verify the semi-frame condition for this particular dictionary (note that, in Freeden and Schreiner (1995), it is shown that the Abel-Poisson kernel constitutes a frame on the sphere; however, we use the word “frame” here in a different context). At least, it is well-known that the choice of a dense point grid $\{\xi^i\}_{i \in \mathbb{N}_0}$ in $\Omega$ yields a basis in $L^2(\Omega)$ (see Freeden et al. 1998, Corollary 6.4.3), i.e.,

$$\overline{\operatorname{span} \{ K_h(\xi^i \cdot) \}_{i \in \mathbb{N}_0}}^{\|\cdot\|_{L^2(\Omega)}} = L^2(\Omega),$$

which is required by condition (1) of Theorem 2.

Similar considerations are possible for the case where the ball $B$ is the domain of the unknown function. For example, the inverse gravimetric problem consists of the inversion of the gravitational potential for the mass density distribution inside the Earth and at its surface (see the survey article (Michel and Fokas 2008) for further details). In this case, we obtain a linear and continuous operator $\mathcal{F} : L^2(B) \to \mathbb{R}^l$ which satisfies
$$
F G^{\mathrm{I}}_{m,n,j} = \left( \delta_{m0}\, Q_n\, Y_{n,j}\!\left( \frac{x^k}{|x^k|} \right) \right)_{k=1,\dots,l},
$$

where $\{x^k\}_{k=1,\dots,l}$ is a grid of pairwise distinct points at the Earth’s surface or outside the Earth where the gravitational potential is given and $\delta_{m0}$ is the Kronecker delta. In the case of data given at the surface, $(Q_n)_n$ converges to 0 at the order $O(n^{-3/2})$. If the data are given outside the Earth, then $(Q_n)_n$ converges to 0 exponentially, since the downward continuation problem is involved.
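The two identities for the Abel-Poisson kernel used above (closed form versus Legendre series, and the value $K_{h^2}(1)$ of the squared $L^2$-norm) can be checked numerically. The following is only a minimal sketch in plain Python; the function names, the truncation degree `n_max`, and the sample value `h = 0.7` are assumptions made for illustration and do not come from the chapter.

```python
import math


def legendre(n_max, t):
    """Return [P_0(t), ..., P_{n_max}(t)] using the three-term recurrence
    (n+1) P_{n+1}(t) = (2n+1) t P_n(t) - n P_{n-1}(t)."""
    p = [1.0, t]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * t * p[n] - n * p[n - 1]) / (n + 1))
    return p[: n_max + 1]


def abel_poisson_closed(h, t):
    """Closed form of the Abel-Poisson kernel for 0 <= h < 1 and t in [-1, 1]."""
    return (1.0 - h * h) / (4.0 * math.pi * (1.0 + h * h - 2.0 * h * t) ** 1.5)


def abel_poisson_series(h, t, n_max=200):
    """Truncated Legendre series sum_n (2n+1)/(4 pi) h^n P_n(t)."""
    p = legendre(n_max, t)
    return sum((2 * n + 1) / (4.0 * math.pi) * h ** n * p[n]
               for n in range(n_max + 1))


def squared_norm_series(h, n_max=200):
    """Truncated series sum_n (2n+1)/(4 pi) h^(2n), i.e. the squared L2 norm."""
    return sum((2 * n + 1) / (4.0 * math.pi) * h ** (2 * n)
               for n in range(n_max + 1))


if __name__ == "__main__":
    h = 0.7  # illustrative kernel parameter
    for t in (-1.0, 0.3, 1.0):
        print(t, abel_poisson_closed(h, t), abel_poisson_series(h, t))
    print(abel_poisson_closed(h * h, 1.0), squared_norm_series(h))
```

Since the closed form is increasing in its argument, its smallest value on $[-1,1]$ is attained at $-1$ and is strictly positive, which is the property the lower bound for $C_1$ relies on.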
6 Numerical Results
We present here some numerical results which we obtained for the inverse gravimetric problem and which were previously published. The purpose of this section is to show the applicability of the RFMP and the particular features which become visible in the discussion of the numerical results. For each example, we refer to the corresponding papers for further details on the implementation. We also refer to Telschow (2014), where the downward continuation problem and the treatment of extremely irregular data grids are discussed. In all cases, the unknown function (mass density anomalies) is a function on the ball $B$. For this purpose, we used dictionaries which combine global trial functions (the orthogonal polynomials $G^{\mathrm{I}}_{m,n,j}$) for low degrees with localized trial functions of different types (i.e., hat functions with different hat widths). It is a particular feature of the RFMP that such a heterogeneous mixture of trial functions can be used for the calculation of the solution. The algorithm selects those trial functions which are (in some sense) the best choice to represent the result (see Algorithm 1). The obtained solution is sparse in the sense that substantially fewer trial functions than available are eventually used. We do not use any a priori information on the solution. In particular, the grid of centers of the hat functions in the dictionary is uniformly distributed over the investigated region.

In the first two examples, we consider the (static) gravitational potential model EGM2008 (Earth Gravitational Model 2008, see Pavlis et al. 2008). In both cases, the model is evaluated (starting from polynomial degree 3) on a regular regional point grid, slightly above the surface. The mass density anomaly is computed with the RFMP and plotted at the surface of the Earth in the corresponding region.
In the first case (from Fischer and Michel 2012), data at 25,440 points over South America are considered, the dictionary contains approximately 120,000 trial functions, and the RFMP is stopped after 20,000 iterations. The obtained result $F_{20{,}000}$ is shown on the left-hand side of Fig. 2, where typical mass anomalies like the Andes, the Caribbean, or traces of some tectonic structures can be identified. The right-hand side of the figure shows the centers of those localized trial functions which were chosen by the RFMP. Clearly, the algorithm prefers hat functions which are concentrated in those areas where the solution has a complicated detail structure.
[Figure 2: map of South America, approximately 80° W to 40° W and 40° S to 20° N; color scale from −5 to 5 kg/m³]
Fig. 2 The RFMP was used to invert EGM2008 data for mass anomalies in South America (left). The algorithm prefers localized trial functions which are concentrated in areas with a high detail density, as the plot of the chosen centers of the hat functions (right) shows; from Fischer and Michel (2012)
This is a reasonable choice and corresponds to the intended sparsity effect since many hat functions are rejected in areas where the solution has a low detail density.

The second example (see Fischer and Michel 2013a) has a very similar setup. This time, we use the same amount of data but over the Himalayas and India and a slightly larger dictionary (degrees $3,\dots,50$ for the polynomials instead of $3,\dots,8$) and stop the RFMP again after 20,000 iterations. However, in this example we compare the result with the result obtained for data with artificial noise. Figure 3 shows the result, where the first row corresponds to the noise-free case with $\lambda = 500$ and the second row corresponds to the case where we added 5 % uniformly distributed random noise (relative to the data, i.e., the perturbed data are $\hat{y}_i = y_i + 0.05\,\varepsilon_i\, y_i$ for some random values $\varepsilon_i \in [-1,1]$) and used the regularization parameter $\lambda = 600$. Again, the left column shows the approximate solution $F_{20{,}000}$ and the right column shows the centers of the localized trial functions chosen by the RFMP. These results show the same quality of the algorithm that the application to the data over South America revealed. The solution allows the identification of typical mass anomalies in the corresponding region, and the trial functions are chosen primarily in those areas where the solution has a high detail density. Moreover, the result of this second example shows that the method is, indeed, a regularization, i.e., the solution shows only a low sensitivity with respect to noise (see also Fig. 4, where the absolute difference of the two solutions is shown), though the original problem is ill-posed. Note also that the choice of the centers of the localized trial functions remains stable when noise is added.

[Figure 3: two rows of maps over the Himalayas and India, approximately 60° E to 105° E and 0° to 45° N; color scale from −5 to 5 kg/m³]
Fig. 3 A test similar to Fig. 2 was applied to the Himalayas and India. The second row shows the result obtained when 5 % uniformly distributed noise relative to the data was added to the data used for the first row. Obviously, the RFMP is, indeed, a regularization, i.e., the result of a data inversion is stable even if the underlying inverse problem is ill-posed; from Fischer and Michel (2013a)

In the third example (see Fischer and Michel 2012), we show the applicability of the RFMP to the identification of mass transports in data of the GRACE mission (Gravity Recovery and Climate Experiment, see WWW: CSR). The GRACE mission has provided us with monthly models of the Earth’s gravitational field since its launch in 2002. We computed the mean of all monthly potentials from July 2004 to June 2009 provided in Release 04, Level 2 of the Jet Propulsion Laboratory (WWW: JPL). This mean potential was subtracted from each monthly potential of the year 2008. Such differences of GRACE models are usually contaminated with noise, which becomes apparent in North-South oriented stripes in the results. We used the Freeden wavelets of cp-type (see Schreiner 1996; Freeden et al. 1998, pp. 295–296) to denoise the obtained model differences. The Freeden wavelets are general isotropic bandpass filters for spherical signals and are sufficient for our purposes. Note that the result could be further improved by using a more sophisticated filtering technique which has been particularly tailored for the GRACE stripes, such as the method introduced in Kusche (2007).
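The relative noise model of the second example (each datum perturbed by at most 5 % of its own magnitude, with a uniformly distributed factor in $[-1,1]$) can be sketched as follows. This is only an illustration in plain Python: the function name, the `seed` parameter, and the sample data are assumptions and are not taken from the cited papers.

```python
import random


def add_relative_uniform_noise(y, level=0.05, seed=0):
    """Return perturbed data y_i + level * eps_i * y_i with eps_i uniform in [-1, 1].

    `level=0.05` corresponds to the 5 % relative noise described in the text;
    `seed` only makes this illustration reproducible.
    """
    rng = random.Random(seed)
    return [yi + level * rng.uniform(-1.0, 1.0) * yi for yi in y]


if __name__ == "__main__":
    data = [1.0, -2.0, 3.5, 0.0]  # hypothetical potential values
    print(add_relative_uniform_noise(data))
```

By construction, the perturbation of each value is bounded by 5 % of its absolute value, so data values equal to zero remain unperturbed under this model.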
[Figure 4: color scale from 0 to 2.5 kg/m³]
Fig. 4 The absolute difference of the results obtained in the left column of Fig. 3 for the inversion of noise-free and noisy data confirms the stability of the regularization algorithm RFMP; from Fischer and Michel (2013a)
The denoised monthly differences were inverted with the RFMP for mass anomalies. The dictionary used was again a combination of low-degree orthogonal polynomials and localized trial functions with different hat widths, in analogy to the previous two examples. The data were given on a regular grid above South America with 11,990 points, the same regularization parameter $\lambda = 8.7128$ was used for all months, and the RFMP was stopped after 10,000 iterations. The results are shown in Figs. 5 and 6. We can clearly identify the mass transports in the Amazon area which are caused by the different rainy seasons in the northern and the southern part.

We recommend the cited references Fischer and Michel (2012) and Fischer and Michel (2013a) for further reading, since the numerical results are analyzed in more detail there and additional results are shown. In particular, the influence of the localized trial functions is visualized, where it becomes clear that local perturbations of the data only have a local influence on the model. Moreover, another example of an identification of mass transports by means of the RFMP is shown in Fischer and Michel (2013b), where some droughts and a flood in South America are recovered from GRACE data.
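The preprocessing step of the third example (subtracting a long-term mean from each monthly model) amounts to a componentwise mean and difference. The sketch below assumes, purely for illustration, that every monthly model is stored as a flat list of numbers of equal length (for instance, potential values on a fixed grid); the function names and this storage layout are hypothetical, and the wavelet denoising step is not included.

```python
def mean_field(fields):
    """Componentwise mean of several equally long fields (lists of floats)."""
    count = len(fields)
    return [sum(column) / count for column in zip(*fields)]


def monthly_anomalies(monthly_fields, baseline_fields):
    """Subtract the mean of `baseline_fields` from every field in `monthly_fields`."""
    mean = mean_field(baseline_fields)
    return [[value - m for value, m in zip(field, mean)]
            for field in monthly_fields]


if __name__ == "__main__":
    baseline = [[1.0, 2.0], [3.0, 4.0]]   # hypothetical long-term record
    months = [[2.0, 3.0], [5.0, 1.0]]     # hypothetical months to analyze
    print(monthly_anomalies(months, baseline))
```

In the setting described above, `baseline_fields` would play the role of the monthly potentials from July 2004 to June 2009 and `monthly_fields` the role of the twelve months of 2008.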
Fig. 5 A long-term mean was subtracted from each monthly model of the gravitational field provided by the GRACE mission for the year 2008. These differences were inverted for mass anomalies with the RFMP. The results are shown for January 2008 (top left) via March 2008 (top right) until June 2008 (bottom right). Typical mass transports due to different rainy seasons in the northern part and the southern part of the Amazon are visible in the results. Note, in particular, the mass surplus north of the equator in April 2008 (bottom left); from Fischer and Michel (2012)

Fig. 6 Here, the remaining months corresponding to Fig. 5 are shown from July 2008 (top left) via September 2008 (top right) until December 2008 (bottom right). Note, in particular, the mass surplus south of the equator in September and October 2008; from Fischer and Michel (2012)

7 Conclusions

The RFMP was presented as a best basis algorithm which is able to combine trial functions of different kinds (like spherical harmonics and spline/wavelet basis functions) to construct an approximate solution to a (geoscientific) ill-posed inverse problem. This is done by iteratively selecting an additional trial function from a large toolbox of available trial functions (called the dictionary) and adding the chosen trial function to the approximation obtained in the previous iteration step. The criterion for the selection of this next summand in the expansion of the approximate solution is the minimization of the data misfit plus a Tikhonov regularization term. In this sense, the selected system of trial functions is a “best basis” (note that we call it “best” here, although further enhancements of the RFMP are possible such as in Telschow (2014), where the number of trial functions to be chosen for a given accuracy level is additionally reduced).

Numerical experiments show that the algorithm is able to construct a stable and good approximation to the unknown function. Several application scenarios were used for the practical verification of the method. Further numerical experiments performed in our group also showed that the algorithm often starts by predominantly choosing orthogonal polynomials (i.e., global trial functions) in the first iteration steps. Later in the iteration, it chooses localized trial functions more and more often. In other words, at the beginning, a coarse “image” of the main features of the solution is constructed by using global trial functions. Later, the remaining data misfit can be better reduced by locally correcting the solution, which is best achieved by using localized trial functions. Furthermore, in some experiments, we could also show that, with an increasing number of iterations, functions which are more and more localized (i.e., smaller and smaller hats) are selected from the available set of localized trial functions. Thus, the algorithm does, indeed, make use of the possibility to use spline/wavelet basis functions with different levels of localization.

Moreover, some figures in this paper also illustrate the choice of the centers of the hat functions. It is obvious that the algorithm selects many more hat functions with centers in areas where the solution has a high detail density. This appears to be reasonable, since locally complicated structures have more degrees of freedom and, consequently, require more basis functions for their representation.
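The greedy selection described above can be illustrated in a finite-dimensional setting. The following is only a sketch under stated assumptions, not the implementation used for the results in this chapter: the unknown function is replaced by a coefficient vector, the operator by a matrix `forward` (a list of rows), the $L^2$ inner product by the Euclidean one, and all names are hypothetical. For a fixed dictionary element $d$, minimizing the data misfit plus $\lambda$ times the squared norm of the updated approximation over the coefficient gives the closed-form expressions used in the loop.

```python
def dot(u, v):
    """Euclidean inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, x):
    """Apply a matrix given as a list of rows to a vector."""
    return [dot(row, x) for row in matrix]


def rfmp_sketch(forward, y, dictionary, lam, n_iter):
    """Greedy minimization of |y - forward f|^2 + lam |f|^2 over the dictionary.

    In each step the pair (alpha, d) minimizing
        |residual - alpha * forward d|^2 + lam * |f + alpha * d|^2
    is chosen; earlier choices are kept fixed.
    Returns the final coefficient vector and the objective value after each step.
    """
    f = [0.0] * len(dictionary[0])
    residual = list(y)
    images = [matvec(forward, d) for d in dictionary]  # forward d, precomputed
    history = []
    for _ in range(n_iter):
        best = None
        for d, image in zip(dictionary, images):
            numerator = dot(residual, image) - lam * dot(f, d)
            denominator = dot(image, image) + lam * dot(d, d)
            if denominator == 0.0:
                continue  # this element influences neither misfit nor penalty
            gain = numerator * numerator / denominator  # decrease of the objective
            if best is None or gain > best[0]:
                best = (gain, numerator / denominator, d, image)
        if best is None:
            break
        _, alpha, d, image = best
        f = [fi + alpha * di for fi, di in zip(f, d)]
        residual = [ri - alpha * qi for ri, qi in zip(residual, image)]
        history.append(dot(residual, residual) + lam * dot(f, f))
    return f, history


if __name__ == "__main__":
    forward = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.0]]          # toy operator R^3 -> R^2
    dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    solution, objective = rfmp_sketch(forward, [1.0, 1.0], dictionary, 0.1, 10)
    print(solution, objective)
```

Because the coefficient zero is always admissible, the objective cannot increase from one step to the next in this sketch; a realistic dictionary would of course contain far more (and non-orthogonal) elements than the toy example.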
Furthermore, we also demonstrated that the algorithm is well suited to handling ill-posed problems, even in the presence of noise. We consider the RFMP to be a promising approach. It is worth investigating its properties further in the future and looking for additional applications. Several improvements of the algorithm are also possible. For example, norms other than the $L^2$-norm could be tested for the regularizing penalty term, where Sobolev norms are one possible alternative. Note that the choice of the $L^1$-norm or anything similar is probably not reasonable, as first investigations in our group suggest, though such a norm is commonly used for sparsity methods. The reason why this is probably not appropriate is the fact that the purpose of the penalty term in our approach is not to obtain sparsity in the solution but to regularize an ill-posed problem.

Another aspect that could be improved refers to the algorithm itself. At present, we keep all chosen dictionary elements and their corresponding coefficients fixed when we go to the next iteration step. The minimization could, however, be improved by reconsidering these previous choices in the next step. The ideal case would be the calculation of a best $n$-term approximation (see Temlyakov 2003) in the sense that we have to find, in iteration step $n$, coefficients $\alpha_1,\dots,\alpha_n \in \mathbb{R}$ and dictionary elements $d_1,\dots,d_n \in \mathcal{D}$ such that
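$$
\left\| y - F\!\left( \sum_{k=1}^{n} \alpha_k d_k \right) \right\|_{\mathbb{R}^l}
= \inf_{\substack{c_1,\dots,c_n\in\mathbb{R} \\ g_1,\dots,g_n\in\mathcal{D}}}
\left\| y - F\!\left( \sum_{k=1}^{n} c_k g_k \right) \right\|_{\mathbb{R}^l}
$$

or (taking into account that we want to add a regularization)

$$
\left\| y - F\!\left( \sum_{k=1}^{n} \alpha_k d_k \right) \right\|^2_{\mathbb{R}^l}
+ \lambda \left\| \sum_{k=1}^{n} \alpha_k d_k \right\|^2_{L^2(D)}
= \inf_{\substack{c_1,\dots,c_n\in\mathbb{R} \\ g_1,\dots,g_n\in\mathcal{D}}}
\left( \left\| y - F\!\left( \sum_{k=1}^{n} c_k g_k \right) \right\|^2_{\mathbb{R}^l}
+ \lambda \left\| \sum_{k=1}^{n} c_k g_k \right\|^2_{L^2(D)} \right).
$$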
This is, however, very time-consuming, so that appropriate trade-offs have to be found between the required CPU time on the one hand and the obtained sparsity and accuracy of the solution on the other hand. In Vincent and Bengio (2002), improvements of the MP in this regard are investigated. However, such techniques have not yet been published for a regularized version which handles ill-posed inverse problems. The challenges for future research mentioned here have already been partially tackled in our group and will be addressed in forthcoming publications.

Acknowledgements The support by the German Research Foundation (project number DFG MI 655/7-1) is gratefully acknowledged. Moreover, the author wishes to thank Roger Telschow for proofreading the manuscript.
References

Amirbekyan A, Michel V (2008) Splines on the three-dimensional ball and their application to seismic body wave tomography. Inverse Probl 24:015022 (25 pp)
Antoine JP, Vandergheynst P (1999) Wavelets on the 2-sphere: a group-theoretic approach. Appl Comput Harmon Anal 7:1–30
Ballani L, Engels J, Grafarend EW (1993) Global base functions for the mass density in the interior of a massive body (Earth). Manuscr Geod 18:99–114
Beckmann J, Mhaskar HN, Prestin J (2012) Quadrature formulas for integration of multivariate trigonometric polynomials on spherical triangles. Int J Geomath 3:119–138
Berkel P (2009) Multiscale methods for the combined inversion of normal mode and gravity variations. PhD thesis, Geomathematics Group, Department of Mathematics, University of Kaiserslautern, Shaker, Aachen
Berkel P, Michel V (2010) On mathematical aspects of a combined inversion of gravity and normal mode variations by a spline method. Math Geosci 42:795–816
Berkel P, Fischer D, Michel V (2011) Spline multiresolution and numerical results for joint gravitation and normal mode inversion with an outlook on sparse regularisation. Int J Geomath 1:167–204
Dahlen FA, Simons FJ (2008) Spectral estimation on a sphere in geophysics and cosmology. Geophys J Int 174:774–807
Dufour HM (1977) Fonctions orthogonales dans la sphère – résolution théorique du problème du potentiel terrestre. Bull Geod 51:227–237
Engl HW, Kunisch K, Neubauer A (1989) Convergence rates for Tikhonov regularization of nonlinear ill-posed problems. Inverse Probl 5:523–540
Engl HW, Hanke M, Neubauer A (1996) Regularization of inverse problems. Mathematics and its applications, vol 375. Kluwer Academic, Dordrecht
Fengler M, Michel D, Michel V (2006) Harmonic spline-wavelets on the 3-dimensional ball and their application to the reconstruction of the earth’s density distribution from gravitational data at arbitrarily shaped satellite orbits. ZAMM-Z Angew Math Me 86:856–873
Fischer D (2011) Sparse regularization of a joint inversion of gravitational data and normal mode anomalies. PhD thesis, Geomathematics Group, Department of Mathematics, University of Siegen, Verlag Dr. Hut, Munich
Fischer D, Michel V (2012) Sparse regularization of inverse gravimetry – case study: spatial and temporal mass variations in South America. Inverse Probl 28:065012 (34 pp)
Fischer D, Michel V (2013a) Automatic best-basis selection for geophysical tomographic inverse problems. Geophys J Int 193:1291–1299
Fischer D, Michel V (2013b) Inverting GRACE gravity data for local climate effects. J Geod Sci 3:151–162
Freeden W (1981a) On approximation by harmonic splines. Manuscr Geod 6:193–244
Freeden W (1981b) On spherical spline interpolation and approximation. Math Methods Appl Sci 3:551–575
Freeden W, Schreiner M (1995) Non-orthogonal expansions on the sphere. Math Methods Appl Sci 18:83–120
Freeden W, Schreiner M (1998) Orthogonal and non-orthogonal multiresolution analysis, scale discrete and exact fully discrete wavelet transform on the sphere. Constr Approx 14:493–515
Freeden W, Windheuser U (1996) Spherical wavelet transform and its discretization. Adv Comput Math 5:51–94
Freeden W, Gervens T, Schreiner M (1998) Constructive approximation on the sphere – with applications to geomathematics. Oxford University Press, Oxford
Gerhards C (2011) Spherical decompositions in a global and local framework: theory and application to geomagnetic modeling. Int J Geomath 1:205–256
Gräf M, Kunis S, Potts D (2009) On the computation of nonnegative quadrature weights on the sphere. Appl Comput Harmon Anal 27:124–132
Gutting M (2012) Fast multipole accelerated solution of the oblique derivative boundary value problem. Int J Geomath 3:223–252
Heiskanen WA, Moritz H (1981) Physical geodesy. Institute of Physical Geodesy, Technical University Graz/Austria (Reprint)
Holschneider M (1996) Continuous wavelet transforms on the sphere. J Math Phys 37:4156–4165
Keiner J, Kunis S, Potts D (2009) Using NFFT 3 – a software library for various non-equispaced fast Fourier transforms. ACM Trans Math Softw 36:Article 19 (30 pp)
Kusche J (2007) Approximate decorrelation and non-isotropic smoothing of time-variable GRACE-type gravity field models. J Geod 81:733–749
Louis AK (1989) Inverse und schlecht gestellte Probleme. Teubner, Stuttgart
Mallat SG, Zhang Z (1993) Matching pursuits with time-frequency dictionaries. IEEE Trans Signal Process 41:3397–3415
Mhaskar HN (2004a) Local quadrature formulas on the sphere. J Complex 20:753–772
Mhaskar HN (2004b) Local quadrature formulas on the sphere, II. In: Neamtu M, Saff EB (eds) Advances in constructive approximation. Nashboro Press, Brentwood, pp 333–344
Michel V (2002) Scale continuous, scale discretized and scale discrete harmonic wavelets for the outer and the inner space of a sphere and their application to an inverse problem in geomathematics. Appl Comput Harmon Anal 12:77–99
Michel V (2005a) Wavelets on the 3-dimensional ball. Proc Appl Math Mech 5:775–776
Michel V (2005b) Regularized wavelet-based multiresolution recovery of the harmonic mass density distribution from data of the earth’s gravitational field at satellite height. Inverse Probl 21:997–1025
Michel V (2013) Lectures on constructive approximation – Fourier, spline, and wavelet methods on the real line, the sphere, and the ball. Birkhäuser, Boston
Michel V, Fokas AS (2008) A unified approach to various techniques for the non-uniqueness of the inverse gravimetric problem and wavelet-based methods. Inverse Probl 24:045019 (25 pp)
Michel V, Telschow R (2014) A non-linear approximation method on the sphere. Int J Geomath, accepted for publication
Müller C (1966) Spherical harmonics. Springer, Berlin
Pavlis NK, Holmes SA, Kenyon SC, Factor JK (2008) An Earth gravitational model to degree 2160: EGM2008. General Assembly of the European Geosciences Union, Vienna
Rieder A (2003) Keine Probleme mit Inversen Problemen. Vieweg, Braunschweig
Schreiner M (1996) A pyramid scheme for spherical wavelets. AGTM report 170, Geomathematics Group, Kaiserslautern
Schröder P, Sweldens W (1995) Spherical wavelets: efficiently representing functions on the sphere. In: Proceedings of the 22nd annual conference on computer graphics and interactive techniques, Los Angeles. ACM, New York, pp 161–172
Seidman TI, Vogel CR (1989) Well posedness and convergence of some regularisation methods for non-linear ill posed problems. Inverse Probl 5:227–238
Simons FJ, Dahlen FA (2006) Spherical Slepian functions and the polar gap in geodesy. Geophys J Int 166:1039–1061
Simons FJ, Dahlen FA, Wieczorek MA (2006) Spatiospectral concentration on a sphere. SIAM Rev 48:504–536
Telschow R (2014) An Orthogonal Matching Pursuit for the Regularization of Spherical Inverse Problems. PhD thesis, Geomathematics Group, Department of Mathematics, University of Siegen
Temlyakov VN (2003) Nonlinear methods of approximation. Found Comput Math 3:33–107
Tscherning CC (1996) Isotropic reproducing kernels for the inner of a sphere or spherical shell and their use as density covariance functions. Math Geol 28:161–168
Vincent P, Bengio Y (2002) Kernel matching pursuit. Mach Learn 48:169–191
Wieczorek MA, Simons FJ (2005) Localized spectral analysis on the sphere. Geophys J Int 162:655–675
Wieczorek MA, Simons FJ (2007) Minimum-variance spectral analysis on the sphere. J Fourier Anal Appl 13:665–692
WWW Center for Space Research, University of Texas, Austin. http://www.csr.utexas.edu/grace/overview.html. Last accessed: 30 July 2013
WWW Jet Propulsion Laboratory, California Institute of Technology, Pasadena. http://podaac.jpl.nasa.gov/GRACE. Last accessed: 30 July 2013
Material Behavior: Texture and Anisotropy
Ralf Hielscher, David Mainprice, and Helmut Schaeben
Contents
1 Introduction
2 Scientific Relevance
3 Rotations and Crystallographic Orientations
3.1 Parametrizations and Embeddings
3.2 Harmonics
3.3 Kernels and Radially Symmetric Functions
3.4 Crystallographic Symmetries
3.5 Geodesics, Hopf Fibres, and Clifford Tori
4 Totally Geodesic Radon Transforms
4.1 Properties of the Spherical Radon Transform
5 Texture Analysis with Integral Orientation Measurements: Texture Goniometry Pole Intensity Data
6 Texture Analysis with Individual Orientation Measurements: Electron Backscatter Diffraction Data
7 Anisotropic Physical Properties
7.1 Effective Physical Properties of Crystalline Aggregates
7.2 Properties of Polycrystalline Aggregates with Texture
7.3 Properties of Polycrystalline Aggregates: An Example
8 Future Directions
9 Conclusions
References
R. Hielscher
Applied Functional Analysis, Technical University Chemnitz, Chemnitz, Germany

D. Mainprice
Géosciences UMR CNRS 5243, Université Montpellier 2, Montpellier, France

H. Schaeben
Geophysics and Geoinformatics, TU Bergakademie Freiberg, Freiberg, Germany

© Springer-Verlag Berlin Heidelberg 2015
W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_33
Abstract
This contribution is an attempt to present a self-contained and comprehensive survey of the mathematics and physics of the material behavior of rocks in terms of texture and anisotropy. Rocks are generally multiphase and polycrystalline, and each single crystallite is anisotropic with respect to its physical properties. Texture, i.e., the statistical and spatial distribution of crystallographic orientations, therefore becomes a constitutive characteristic and determines material behavior in first-order approximation, i.e., except for grain boundary effects. This chapter is in particular an account of modern mathematical texture analysis, explicitly clarifying terms, providing definitions and justifying their application, and emphasizing its major insights. Thus, mathematical texture analysis is brought back to the realm of spherical Radon and Fourier transforms, spherical approximation, and spherical probability, i.e., to the mathematics of spherical tomography.
1 Introduction
Quantitative analysis of crystals’ preferred orientation, or texture analysis, is historically important in metallurgy and has become increasingly applied to Earth science problems of the anisotropy of physical properties (e.g., seismic anisotropy) and the study of deformation processes (e.g., plasticity). The orientation probability density function $f: \mathrm{SO}(3) \to \mathbb{R}$, which is used to model the volume portion of crystallites $dV_g$ realizing a random crystallographic orientation $g$ within a polycrystalline specimen of volume $V$, is instrumental to the description of the preferred crystallographic orientation, i.e., texture, and to the computation of anisotropic material behavior due to texture. The orientation probability density function can practically be determined (i) from individual orientation measurements (electron backscatter diffraction data) by nonparametric kernel density estimation or (ii) from integral orientation measurements (X-ray, neutron, or synchrotron diffraction data) by resolving the largely ill-posed problem to invert experimentally accessible “pole figure” intensities interpreted as volume portions of crystallites $dV_{\pm h \| r}$ having the lattice plane normals $\pm h \in S^2$ coincide with the specimen direction $r \in S^2$. Mathematically, pole density functions are defined in terms of totally geodesic Radon transforms.

Mathematical proofs as well as algorithms and their numerics are generally omitted; instead, the reader is referred to original publications or standard textbooks where they apply, and the new open-source MATLAB® toolbox “MTEX” for texture analysis, created by Ralf Hielscher (Hielscher and Schaeben 2008a, see also http://code.google.com/p/mtex/), provides a practical numerical implementation of the methods described in the chapter. References are not meant to indicate or assign priorities; they were rather chosen according to practical reasons, such as accessibility.
2 Scientific Relevance
Several important physical properties of rocks of geophysical interest are controlled by the single-crystal properties of constituent minerals, e.g., thermal conductivity and seismic wave speed. Many minerals have strongly anisotropic mechanical and physical properties; hence, an accurate statistical description of the crystallographic orientation (or texture) of minerals in aggregates of geomaterials is essential for the prediction of bulk anisotropic properties. Furthermore, quantitative texture analysis of minerals provides a means of identifying the minerals that control the anisotropy of rock properties and the deformation mechanisms that generate texture, e.g., dislocation glide systems. The study of crystallographic orientation in rocks dates back to the work of Sander (1930), who in the 1920s made measurements with a petrological microscope and interpreted textures in terms of rock movement or flow patterns. Hence, it has been recognized for a long time that texture records important information on the history or evolution of a rock, which may reflect past conditions of temperature, pressure, and mechanical deformation. Recent advances have extended the field of texture analysis from naturally deformed specimens found on the surface to samples experimentally deformed at the high pressures of the Earth’s mantle (Mainprice et al. 2005) and core (Mao et al. 1998), corresponding to depths of hundreds or thousands of kilometers below the Earth’s surface. Texture analysis is an important tool for understanding the deformation behavior of experimentally deformed samples. New diffraction techniques using synchrotron high-intensity X-ray radiation on micron-sized high-pressure samples (Raterron and Merkel 2009) and the now widespread application of electron backscatter diffraction (EBSD) to a diverse range of geological samples (Prior et al. 2009) require a modern texture analysis that can be coherently applied to volume diffraction data, single orientation measurements, and plasticity-modeling schemes involving both types of data. The texture analysis proposed in this chapter constitutes a reply to these new requirements within a rigorous mathematical framework.
3
Rotations and Crystallographic Orientations
The special orthogonal group SO(3) is initially defined as the group of rotations $g \in \mathbb{R}^{3\times 3}$ with $\det(g) = +1$. It may be characterized as a differentiable manifold and, as a Riemannian manifold, endowed with a metric and a distance (Morawiec 2004). An orientation is defined to mean an instantaneous rotational configuration. Let $K_S = \{\mathbf{x}, \mathbf{y}, \mathbf{z}\}$ be a right-handed orthonormal specimen coordinate system, and let $K_C = \{\mathbf{a}, \mathbf{b}, \mathbf{c}\}$ be a right-handed orthonormal crystal coordinate system. Then, we call $g \in SO(3)$ the rotation of the coordinate system $K_C$ with respect to the coordinate system $K_S$ if it rotates the latter onto the former, i.e., if $g\mathbf{x} = \mathbf{a}$, $g\mathbf{y} = \mathbf{b}$, $g\mathbf{z} = \mathbf{c}$. Let $\mathbf{r} = (u, v, w)^T$ be a unit coordinate vector with respect to the specimen coordinate system $K_S$, and let $\mathbf{h} = (h, k, l)^T$ be the corresponding unit coordinate vector with respect to the crystallographic coordinate system $K_C$, i.e., both coordinate vectors represent the same direction such that

\[ u\mathbf{x} + v\mathbf{y} + w\mathbf{z} = h\mathbf{a} + k\mathbf{b} + l\mathbf{c}. \]

Then, the orientation $g \in SO(3)$, identified with a matrix $M(g) \in \mathbb{R}^{3\times 3}$, realizes the basis transformation between the coordinate systems, and we have

\[ M(g)\,\mathbf{h} = \mathbf{r}. \]

Informally, $\mathbf{h} \in S^2$ is referred to as the crystallographic direction, while $\mathbf{r} \in S^2$ is referred to as the specimen direction.

Initially, a crystallographic orientation should be thought of as an element $g \in O(3)$, the orthogonal group. Because of crystallographic symmetries, mathematically different orientations of a crystal may be physically equivalent. The set of all orientations that are equivalent to the identical orientation $g_0 = \mathrm{id}$ is called the point group $S_{\mathrm{point}}$ of the crystal. With this notation, the set of all orientations crystallographically symmetrically equivalent to an arbitrary orientation $g_0 \in O(3)$ becomes the left coset $g_0 S_{\mathrm{point}} = \{g_0 g \mid g \in S_{\mathrm{point}}\}$. The set of all these cosets is denoted $O(3)/S_{\mathrm{point}}$ and called the quotient orientation space, i.e., the orientation space modulo the crystallographic symmetry $S_{\mathrm{point}}$.

In the case of diffraction experiments, not only this crystallographic equivalence but also the equivalence imposed by diffraction itself has to be considered. Due to Friedel's law (Friedel 1913), equivalence of orientations with respect to diffraction is described by the Laue group $S_{\mathrm{Laue}}$, which is the point group of the crystal augmented by inversion, i.e., $S_{\mathrm{Laue}} = S_{\mathrm{point}} \otimes \{\mathrm{id}, -\mathrm{id}\}$. Since $O(3)/S_{\mathrm{Laue}} \cong SO(3)/(S_{\mathrm{Laue}} \cap SO(3))$, the cosets of equivalent orientations with respect to diffraction are completely represented by proper rotations. Likewise, when analyzing diffraction data for a preferred crystallographic orientation, it is sufficient to consider the restriction of the Laue group $G_{\mathrm{Laue}} \subset O(3)$ to its purely rotational part $\tilde{G}_{\mathrm{Laue}} = G_{\mathrm{Laue}} \cap SO(3)$.
Then, two orientations $g, g' \in SO(3)$ are called crystallographically symmetrically equivalent with respect to $\tilde{G}_{\mathrm{Laue}}$ if there is a symmetry element $q \in \tilde{G}_{\mathrm{Laue}}$ such that $gq = g'$. The left cosets $g\tilde{G}_{\mathrm{Laue}}$ define the classes of crystallographically symmetrically equivalent orientations. Thus, a crystallographic orientation is a left coset. These cosets define a partition of SO(3). A set of class representatives that contains exactly one element of each left coset or class is called a left transversal. It is not unique. If it is easily tractable with respect to a parametrization, it will be denoted $\mathbb{G}$. Analogously, two crystallographic directions $\mathbf{h}, \mathbf{h}' \in S^2$ are called crystallographically symmetrically equivalent if there is a symmetry element $q \in \tilde{G}_{\mathrm{Laue}}$ such that $q\mathbf{h} = \mathbf{h}'$.

The orientation probability density function $f$ of a polycrystalline specimen is defined as a function $f\colon SO(3) \to \mathbb{R}$ that models the relative frequencies of crystallographic orientations within the specimen by volume, i.e., $f(g)\,dg = dV_g / V$, and is normalized to

\[ \int_{SO(3)} f(g)\,dg = 8\pi^2, \tag{1} \]
where $dg$ denotes the rotationally invariant measure on SO(3). The orientation probability density function possesses the symmetry property

\[ f(g) = f(gq), \qquad g \in SO(3),\ q \in \tilde{G}_{\mathrm{Laue}}, \tag{2} \]
i.e., it is essentially defined on the quotient space $SO(3)/\tilde{G}_{\mathrm{Laue}}$. Crystallographic preferred orientations may also be represented by the pole density function $P\colon S^2 \times S^2 \to \mathbb{R}$, where $P(\mathbf{h}, \mathbf{r})$ models the relative frequencies with which a given crystallographic direction, any crystallographically symmetrically equivalent direction, or its antipodally symmetric direction $\pm\mathbf{h} \in S^2$ coincides with a given specimen direction $\mathbf{r} \in S^2$, i.e., $P(\mathbf{h}, \mathbf{r})\,d\mathbf{h} = P(\mathbf{h}, \mathbf{r})\,d\mathbf{r} = dV_{\pm\mathbf{h}\parallel\mathbf{r}} / V$, and is normalized to

\[ \int_{S^2} P(\mathbf{h}, \mathbf{r})\,d\mathbf{h} = \int_{S^2} P(\mathbf{h}, \mathbf{r})\,d\mathbf{r} = 4\pi. \]

Thus, it satisfies the symmetry relationships

\[ P(\mathbf{h}, \mathbf{r}) = P(-\mathbf{h}, \mathbf{r}) \quad\text{and}\quad P(\mathbf{h}, \mathbf{r}) = P(q\mathbf{h}, \mathbf{r}), \qquad \mathbf{h}, \mathbf{r} \in S^2,\ q \in \tilde{G}_{\mathrm{Laue}}, \]

i.e., it is essentially defined on $S^2/S_{\mathrm{Laue}} \times S^2$. Pole density functions are experimentally accessible as "pole figures" by X-ray, neutron, or synchrotron diffraction for some crystallographic forms $\mathbf{h}$, i.e., crystallographic directions and their crystallographically symmetrical equivalents. Orientation probability density and pole density functions are related to each other by the totally geodesic Radon transform.
3.1
Parametrizations and Embeddings
Parametrizations are a way of describing rotations in a quantitative manner. The two major parametrizations we shall consider are the intuitively most appealing parametrization of a rotation in terms of its angle $\omega \in [0, \pi]$ and axis $\mathbf{n} \in S^2$ of rotation and the parametrization in terms of Euler angles $(\alpha, \beta, \gamma)$, with $\alpha, \gamma \in [0, 2\pi)$ and $\beta \in [0, \pi]$. Of particular interest are the embeddings of rotations in $\mathbb{R}^{3\times 3}$ or in $S^3 \subset \mathbb{H}$, the skew field of real quaternions (Altmann 1986; Gürlebeck and Sprößig 1997; Hanson 2006; Kuipers 1999).
The eigenvalues of a rotation matrix $M(g) \in \mathbb{R}^{3\times 3}$ are given by $\lambda_1 = 1$ and $\lambda_{2,3} = e^{\pm i\omega}$, where $0 \le \omega \le \pi$. The argument $\omega$ of the eigenvalues $\lambda_{2,3}$ of a rotation matrix $M(g)$ can be uniquely determined by

\[ \operatorname{trace}(M(g)) = 1 + 2\cos\omega, \]

which defines the angle of rotation $\omega$. Furthermore, the axis $\mathbf{n} \in S^2$ of an arbitrary rotation given as a matrix with entries $M(g) = (m_{i,j})_{i,j=1,\ldots,3} \in \mathbb{R}^{3\times 3}$, with $M(g) \neq I_3$, is defined to be

\[ \mathbf{n} = \frac{1}{2\sin\omega}\,(m_{23} - m_{32},\ m_{31} - m_{13},\ m_{12} - m_{21})^T, \]

where $0 < \omega < \pi$ is the rotation angle. If we explicitly refer to the angle-axis parametrization of a rotation, we use the notation $g = g(\omega; \mathbf{n})$; accordingly, $\omega(g)$ and $\mathbf{n}(g)$ denote the angle and axis of the rotation $g$, respectively. The unit quaternion $q = \cos\frac{\omega}{2} + \mathbf{n}\sin\frac{\omega}{2}$, associated with the rotation $g = g(\omega; \mathbf{n})$, provides an embedding of the group SO(3) in the sphere $S^3 \subset \mathbb{H}$ of unit quaternions.

Euler's theorem states that any two right-handed orthonormal coordinate systems can be related by a sequence of rotations (not more than three) about coordinate axes, where two successive rotations must not be about the same axis. Then, any rotation $g$ can be represented as a sequence of three successive rotations about conventionally specified coordinate axes by three corresponding "Euler" angles, where the rotation axes of two successive rotations must be orthogonal. There exist 12 different choices of sets of axes of rotation (in terms of the coordinate axes of the initial coordinate system) to define corresponding Euler angles, and they are all in use somewhere. Euler angles $(\alpha, \beta, \gamma)$ usually define a rotation $g$ in terms of a sequence $g(\alpha, \beta, \gamma)$ of three successive rotations about conventionally fixed axes of the initial coordinate system, e.g., the first rotation by $\gamma \in [0, 2\pi)$ about the z-axis, the second by $\beta \in [0, \pi]$ about the y-axis, and the third by $\alpha \in [0, 2\pi)$ about the z-axis of the initial coordinate system, such that

\[ g(\alpha, \beta, \gamma) = g(\alpha; \mathbf{z})\,g(\beta; \mathbf{y})\,g(\gamma; \mathbf{z}). \tag{3} \]
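The angle-axis and quaternion relations above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes active, right-handed rotation matrices built with the Rodrigues formula (a different handedness convention flips the sign of the sine term and hence of the recovered axis), and all function names are hypothetical.

```python
import numpy as np

def rotation_matrix(omega, n):
    """Matrix of the rotation by angle omega about the axis n (Rodrigues formula).

    Assumption: active, right-handed rotation; n is normalized internally.
    """
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    K = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])  # cross-product matrix: K @ v == n x v
    return np.eye(3) + np.sin(omega) * K + (1.0 - np.cos(omega)) * (K @ K)

def rotation_angle(M):
    """Rotation angle in [0, pi] from trace(M) = 1 + 2 cos(omega)."""
    return np.arccos(np.clip((np.trace(M) - 1.0) / 2.0, -1.0, 1.0))

def unit_quaternion(omega, n):
    """Unit quaternion (w, x, y, z) = cos(omega/2) + n sin(omega/2)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    return np.concatenate(([np.cos(omega / 2.0)], np.sin(omega / 2.0) * n))

def rotate_with_quaternion(q, v):
    """Apply the rotation v -> q v q* to a 3-vector v without forming the matrix."""
    w, u = q[0], q[1:]
    return v + 2.0 * np.cross(u, np.cross(u, v) + w * v)

# Illustrative values only.
omega, n = 1.2, np.array([1.0, 2.0, 2.0]) / 3.0
M = rotation_matrix(omega, n)
v = np.array([0.3, -0.5, 0.8])
print(rotation_angle(M))                                   # recovers omega
print(M @ v, rotate_with_quaternion(unit_quaternion(omega, n), v))
```

The matrix and the quaternion act identically on any vector, and the trace formula recovers the rotation angle; the axis is the eigenvector of $M(g)$ for the eigenvalue 1.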
In texture analysis, Bunge's definition of the Euler angles $(\varphi_1, \Phi, \varphi_2)$ (Bunge 1982) of three successive rotations about conventionally fixed axes of rotation refers to the first rotation by $\varphi_1 \in [0, 2\pi)$ about the z-axis, the second by $\Phi \in [0, \pi]$ about the rotated x-axis, i.e., about $\mathbf{x}' = g(\varphi_1; \mathbf{z})\mathbf{x}$, and the third by $\varphi_2 \in [0, 2\pi)$ about the rotated z-axis, i.e., about $\mathbf{z}'' = g(\Phi; \mathbf{x}')\mathbf{z}$, such that

\[ g_{\mathrm{Bunge}}(\varphi_1, \Phi, \varphi_2) = g(\varphi_2; \mathbf{z}'')\,g(\Phi; \mathbf{x}')\,g(\varphi_1; \mathbf{z}). \]
Roe's or Matthies' Euler angles $(\alpha, \beta, \gamma)$ (Matthies et al. 1987; Roe 1965) of three successive rotations about conventionally fixed axes of rotation replace Bunge's second rotation about $\mathbf{x}'$ by a rotation about the $\mathbf{y}'$-axis, i.e., they refer to the first rotation by $\alpha \in [0, 2\pi)$ about the z-axis, the second by $\beta \in [0, \pi]$ about the rotated y-axis, i.e., about $\mathbf{y}' = g(\alpha; \mathbf{z})\mathbf{y}$, and the third by $\gamma \in [0, 2\pi)$ about the rotated z-axis, i.e., about $\mathbf{z}'' = g(\beta; \mathbf{y}')\mathbf{z}$, such that

\[ g_{\mathrm{RM}}(\alpha, \beta, \gamma) = g(\gamma; \mathbf{z}'')\,g(\beta; \mathbf{y}')\,g(\alpha; \mathbf{z}). \]

As can be shown by conjugation of rotations, the differently defined Euler angles are related by

\[ g(\alpha, \beta, \gamma) = g_{\mathrm{RM}}(\alpha, \beta, \gamma). \]

The Roe-Matthies notation then has the simple advantage that $(\alpha, \beta)$ are the spherical coordinates of the crystallographic direction $\mathbf{c}$ with respect to $K_S$.
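The equivalence of the fixed-axis and the rotated-axis Euler angle definitions can be verified numerically. The sketch below assumes active, right-handed rotation matrices; the helper names are hypothetical. It also checks the well-known conversion $\varphi_1 = \alpha + \pi/2$, $\Phi = \beta$, $\varphi_2 = \gamma - \pi/2$ between Bunge and Roe-Matthies angles, which is not stated in the text above and is included only as an additional consistency check.

```python
import numpy as np

def rot(omega, n):
    """Active right-handed rotation by omega about axis n (Rodrigues formula)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    K = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])
    return np.eye(3) + np.sin(omega) * K + (1.0 - np.cos(omega)) * (K @ K)

X, Y, Z = np.eye(3)  # axes of the initial (specimen) coordinate system

def euler_fixed(alpha, beta, gamma):
    """Eq. (3): rotations about fixed axes, gamma about z applied first."""
    return rot(alpha, Z) @ rot(beta, Y) @ rot(gamma, Z)

def euler_roe_matthies(alpha, beta, gamma):
    """Rotations about z, then the rotated y-axis, then the rotated z-axis."""
    y1 = rot(alpha, Z) @ Y
    z2 = rot(beta, y1) @ Z
    return rot(gamma, z2) @ rot(beta, y1) @ rot(alpha, Z)

def euler_bunge(phi1, Phi, phi2):
    """Rotations about z, then the rotated x-axis, then the rotated z-axis."""
    x1 = rot(phi1, Z) @ X
    z2 = rot(Phi, x1) @ Z
    return rot(phi2, z2) @ rot(Phi, x1) @ rot(phi1, Z)

alpha, beta, gamma = 0.7, 1.1, 2.3  # illustrative angles
g = euler_fixed(alpha, beta, gamma)
c = g @ Z  # image of the z-axis, i.e., the crystal direction c
```

With these definitions, `g` agrees with `euler_roe_matthies(alpha, beta, gamma)`, and `c` has spherical coordinates $(\alpha, \beta)$.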
3.2
Harmonics
Representation of rotations in terms of harmonics is a subject of representation theory as exposed in Gel'fand et al. (1963), Varshalovich et al. (1988), and Vilenkin and Klimyk (1991). Satisfying the representation property is the single most important characteristic of any useful system of functions for SO(3). Important tools for the mathematical analysis of orientation probability and pole density functions are harmonic functions on the rotation group SO(3) and on the two-dimensional sphere $S^2$, respectively. In fact, an orientation probability density function and its totally geodesic Radon transform share the same harmonic coefficients, which gives rise to a "harmonic approach" to the resolution of the inverse problem of texture analysis (Bunge 1965, 1969, 1982; Roe 1965). Furthermore, these harmonic coefficients are instrumental in computing the anisotropic macroscopic properties of a specimen, e.g., its thermal expansion, optical refraction index, electrical conductivity, or elastic properties, given the corresponding anisotropic properties of its single crystals.

Closely following the exposition in Hielscher (2007) and Hielscher and Schaeben (2008a), we give an explicit definition of harmonics, as there are many slightly different ways to define them, e.g., with respect to normalization, which reveal their disastrous impact only in the course of writing and checking software code. Harmonic analysis on the sphere is based on the Legendre polynomials $P_\ell\colon [-1, 1] \to \mathbb{R}$, $\ell \in \mathbb{N}_0$, where

\[ P_\ell(t) = \frac{1}{2^\ell\,\ell!}\,\frac{d^\ell}{dt^\ell}\left((t^2 - 1)^\ell\right), \]

and on the associated Legendre functions $P_\ell^k\colon [-1, 1] \to \mathbb{R}$, $\ell \in \mathbb{N}_0$, $k = 0, \ldots, \ell$,

\[ P_\ell^k(t) = \left(\frac{(\ell - k)!}{(\ell + k)!}\right)^{1/2} (1 - t^2)^{k/2}\,\frac{d^k}{dt^k} P_\ell(t). \]

In terms of the associated Legendre functions, we define the spherical harmonics $Y_\ell^k(\mathbf{r})$, $\ell \in \mathbb{N}_0$, $k = -\ell, \ldots, \ell$, by

\[ Y_\ell^k(\mathbf{r}) = \sqrt{\frac{2\ell + 1}{4\pi}}\;P_\ell^{|k|}(\cos\theta)\,e^{ik\rho}, \]

where $\theta, \rho \in \mathbb{R}$ are the polar coordinates $\rho \in [0, 2\pi)$, $\theta \in [0, \pi]$ of the vector

\[ \mathbf{r} = (\cos\rho\,\sin\theta,\ \sin\rho\,\sin\theta,\ \cos\theta)^T \in S^2. \]

By this definition, the spherical harmonics are normed to

\[ \int_{S^2} Y_\ell^k(\mathbf{r})\,\overline{Y_{\ell'}^{k'}(\mathbf{r})}\,d\mathbf{r} = \int_0^{\pi}\!\!\int_0^{2\pi} Y_\ell^k(\theta, \rho)\,\overline{Y_{\ell'}^{k'}(\theta, \rho)}\,\sin\theta\;d\rho\,d\theta = \delta_{\ell\ell'}\,\delta_{kk'} \]
and, hence, provide an orthonormal basis in $L^2(S^2)$. In order to define harmonic functions on SO(3), we use the parametrization of a rotation $g \in SO(3)$ in terms of Euler angles, Eq. (3). Now, we follow Nikiforov and Uvarov (1988) (see also Varshalovich et al. 1988; Kostelec and Rockmore 2003; Vollrath 2006) and define, for $\ell \in \mathbb{N}_0$, $k, k' = -\ell, \ldots, \ell$, the generalized spherical harmonics or Wigner-D functions as

\[ D_\ell^{kk'}(\alpha, \beta, \gamma) = e^{-ik\alpha}\,d_\ell^{kk'}(\cos\beta)\,e^{-ik'\gamma}, \]

where

\[ d_\ell^{kk'}(t) = s_{kk'}\,\frac{(-1)^{\ell - k}}{2^\ell}\,\sqrt{\frac{(\ell + k')!}{(\ell - k')!\,(\ell + k)!\,(\ell - k)!}}\;\sqrt{\frac{(1 - t)^{k - k'}}{(1 + t)^{k + k'}}}\;\frac{d^{\ell - k'}}{dt^{\ell - k'}}\left[(1 - t)^{\ell - k}(1 + t)^{\ell + k}\right] \]

and

\[ s_{kk'} = \begin{cases} 1, & k, k' \ge 0, \\ (-1)^{k}, & k' \ge 0,\ k < 0, \\ (-1)^{k'}, & k \ge 0,\ k' < 0, \\ (-1)^{k + k'}, & k, k' < 0. \end{cases} \]
The factor $s_{kk'}$ corrects for the normalization of the spherical harmonics, which differs slightly from that in Nikiforov and Uvarov (1988). The Wigner-D functions satisfy the representation property

\[ D_\ell^{kk'}(gq) = \sum_{j=-\ell}^{\ell} D_\ell^{kj}(g)\,D_\ell^{jk'}(q), \tag{4} \]

and by virtue of the Peter-Weyl theorem (cf. Vilenkin 1968), they are orthogonal in $L^2(SO(3))$, i.e.,

\[ \int_0^{2\pi}\!\!\int_0^{\pi}\!\!\int_0^{2\pi} D_\ell^{mn}(\alpha, \beta, \gamma)\,\overline{D_{\ell'}^{m'n'}(\alpha, \beta, \gamma)}\;d\alpha\,\sin\beta\,d\beta\,d\gamma = \frac{8\pi^2}{2\ell + 1}\,\delta_{\ell\ell'}\,\delta_{mm'}\,\delta_{nn'}. \]

Furthermore, they are related to the spherical harmonics by the representation property

\[ \sum_{k'=-\ell}^{\ell} D_\ell^{kk'}(g)\,Y_\ell^{k'}(\mathbf{h}) = Y_\ell^{k}(g\mathbf{h}), \qquad g \in SO(3),\ \mathbf{h} \in S^2. \]

Moreover, any (orientation density) function $f \in L^2(SO(3))$ has an associated harmonic or Fourier series expansion of the form

\[ f \sim \sum_{\ell=0}^{\infty}\,\sum_{k,k'=-\ell}^{\ell} \frac{(\ell + \frac12)^{1/2}}{2\pi}\,\hat{f}(\ell, k, k')\,D_\ell^{kk'}, \]

with harmonic or Fourier coefficients

\[ \hat{f}(\ell, k, k') = \frac{(\ell + \frac12)^{1/2}}{2\pi} \int_{SO(3)} f(g)\,\overline{D_\ell^{kk'}(g)}\,dg, \qquad \ell \in \mathbb{N}_0,\ k, k' = -\ell, \ldots, \ell. \]

Defined in this way, the classical Parseval identity

\[ \|\hat{f}\|_{\ell^2} = \|f\|_{L^2} \]

is fulfilled; with other normalizations, e.g., for Bunge's C coefficients, it is not.
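Because normalization conventions are a common source of software errors, the orthonormality of the spherical harmonics as defined above can be checked numerically. The following sketch uses NumPy's Legendre utilities; the function names and the quadrature sizes are illustrative choices, not part of the text.

```python
import numpy as np
from math import factorial
from numpy.polynomial.legendre import Legendre, leggauss

def assoc_legendre(l, k, t):
    """Normalized associated Legendre function P_l^k(t) for 0 <= k <= l."""
    norm = np.sqrt(factorial(l - k) / factorial(l + k))
    poly = Legendre.basis(l)
    if k > 0:
        poly = poly.deriv(k)  # k-th derivative of the Legendre polynomial P_l
    return norm * (1.0 - t**2) ** (k / 2.0) * poly(t)

def spherical_harmonic(l, k, theta, rho):
    """Y_l^k with polar angle theta in [0, pi] and azimuth rho in [0, 2 pi)."""
    amplitude = np.sqrt((2 * l + 1) / (4.0 * np.pi))
    return amplitude * assoc_legendre(l, abs(k), np.cos(theta)) * np.exp(1j * k * rho)

def inner_product(l1, k1, l2, k2, n=24):
    """L2(S^2) inner product by Gauss-Legendre (theta) and equispaced (rho) quadrature."""
    t, w = leggauss(n)                               # nodes and weights in t = cos(theta)
    rho = 2.0 * np.pi * np.arange(2 * n) / (2 * n)   # equispaced azimuth nodes
    T, R = np.meshgrid(np.arccos(t), rho, indexing="ij")
    vals = spherical_harmonic(l1, k1, T, R) * np.conj(spherical_harmonic(l2, k2, T, R))
    return np.sum(w[:, None] * vals) * (2.0 * np.pi / (2 * n))

print(abs(inner_product(2, 1, 2, 1)))   # close to 1
print(abs(inner_product(3, 2, 2, 2)))   # close to 0
```

For the low orders used here, both quadrature rules are exact up to rounding, so deviations from the Kronecker deltas indicate a normalization mismatch rather than a discretization error.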
3.3
Kernels and Radially Symmetric Functions
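In texture analysis, radially symmetric functions appear as unimodal bell-shaped model orientation density functions. Mathematically, they are defined as functions $\psi\colon SO(3) \to \mathbb{R}$ or $\varphi\colon S^2 \to \mathbb{R}$ that depend only on the distance to a center rotation $g_0 \in SO(3)$ or a center direction $\mathbf{r}_0 \in S^2$, respectively, i.e., we have

\[ \psi(g) = \psi(g') \quad\text{and}\quad \varphi(\mathbf{r}) = \varphi(\mathbf{r}') \]

for all rotations $g, g' \in SO(3)$ with $\omega(g g_0^{-1}) = \omega(g' g_0^{-1})$ and all directions $\mathbf{r}, \mathbf{r}' \in S^2$ with $\eta(\mathbf{r}, \mathbf{r}_0) = \eta(\mathbf{r}', \mathbf{r}_0)$, where $\omega(g g_0^{-1})$ denotes the rotational angle of the rotation $g g_0^{-1}$ and $\cos\eta(\mathbf{r}, \mathbf{r}_0) = \mathbf{r} \cdot \mathbf{r}_0$. Radially symmetric functions, both on the rotation group and on the sphere, have characteristic Fourier series expansions. More precisely, there exist Chebyshev coefficients $\hat\psi(\ell)$ and Legendre coefficients $\hat\varphi(\ell)$, $\ell \in \mathbb{N}_0$, respectively, with

\[ \hat\psi(\ell, k, k') = \hat\psi(\ell)\,\overline{D_\ell^{kk'}(g_0)} \quad\text{and}\quad \hat\varphi(\ell, k) = \hat\varphi(\ell)\,\frac{4\pi}{2\ell + 1}\,\overline{Y_\ell^k(\mathbf{r}_0)}, \]

such that

\[ \psi(g) \sim \sum_{\ell=0}^{\infty} \hat\psi(\ell) \sum_{k,k'=-\ell}^{\ell} D_\ell^{kk'}(g)\,\overline{D_\ell^{kk'}(g_0)} = \sum_{\ell=0}^{\infty} \hat\psi(\ell)\,U_{2\ell}\!\left(\cos\frac{\omega(g g_0^{-1})}{2}\right) \tag{5} \]

and

\[ \varphi(\mathbf{r}) \sim \sum_{\ell=0}^{\infty} \hat\varphi(\ell)\,\frac{4\pi}{2\ell + 1} \sum_{k=-\ell}^{\ell} Y_\ell^k(\mathbf{r})\,\overline{Y_\ell^k(\mathbf{r}_0)} = \sum_{\ell=0}^{\infty} \hat\varphi(\ell)\,P_\ell(\mathbf{r} \cdot \mathbf{r}_0). \tag{6} \]

Here, $U_\ell$, $\ell \in \mathbb{N}_0$, denote the Chebyshev polynomials of the second kind,

\[ U_\ell(\cos\omega) = \frac{\sin((\ell + 1)\omega)}{\sin\omega}, \qquad \ell \in \mathbb{N}_0,\ \omega \in (0, \pi), \tag{7} \]

with $U_\ell(1) = \ell + 1$ and $U_\ell(-1) = (-1)^\ell(\ell + 1)$.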
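A truncated series of the form of Eq. (5) is straightforward to evaluate with the three-term recurrence of the Chebyshev polynomials of the second kind. The sketch below is illustrative only: the function names are hypothetical, and the decaying coefficients in `coeffs` are made up for demonstration and do not correspond to any particular kernel from the literature.

```python
import numpy as np

def chebyshev_u(n, x):
    """Chebyshev polynomial of the second kind U_n(x) via U_{m+1} = 2x U_m - U_{m-1}."""
    x = np.asarray(x, dtype=float)
    u_prev, u = np.ones_like(x), 2.0 * x   # U_0 and U_1
    if n == 0:
        return u_prev
    for _ in range(n - 1):
        u_prev, u = u, 2.0 * x * u - u_prev
    return u

def radial_kernel(omega, coeffs):
    """Truncated Eq. (5): sum over l of coeffs[l] * U_{2l}(cos(omega / 2))."""
    x = np.cos(np.asarray(omega, dtype=float) / 2.0)
    return sum(c * chebyshev_u(2 * l, x) for l, c in enumerate(coeffs))

# Hypothetical, rapidly decaying Chebyshev coefficients (illustration only).
coeffs = [(2 * l + 1) * 0.7 ** (2 * l) for l in range(20)]
omega = np.linspace(0.0, np.pi, 7)
print(radial_kernel(omega, coeffs))   # bell-shaped profile peaked at omega = 0
```

For $\ell = 1$, the term $U_2(\cos(\omega/2)) = 1 + 2\cos\omega$ is exactly the trace of a rotation matrix with rotation angle $\omega$, which gives a quick sanity check of the argument $\cos(\omega/2)$.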
3.4
Crystallographic Symmetries
Accounting for crystallographic symmetries, Eq. (2), requires special provision. If the harmonics are to be explicitly symmetrized such that they are properly defined on a left transversal $\mathbb{G} \subset SO(3)$ only, then special attention must be paid to the preservation of the representation property, Eq. (4). Here, a different approach is pursued in terms of radially symmetric functions with known Chebyshev coefficients. Symmetrization is then actually done by summation,

\[ \psi_{\mathrm{cs}}(\omega(g g_0^{-1})) = \frac{1}{\#\tilde{G}_{\mathrm{Laue}}} \sum_{q \in \tilde{G}_{\mathrm{Laue}}} \psi(\omega(g q g_0^{-1})), \qquad g, g_0 \in SO(3). \tag{8} \]
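The symmetrization by summation in Eq. (8) can be sketched in a few lines. The example below is a sketch under stated assumptions: it uses the rotational group 222 (identity plus three twofold axes) as a stand-in for $\tilde{G}_{\mathrm{Laue}}$, and a hypothetical bell-shaped kernel `psi` that is not taken from the text; all names are illustrative.

```python
import numpy as np

def rot(omega, n):
    """Active right-handed rotation by omega about axis n (Rodrigues formula)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    K = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])
    return np.eye(3) + np.sin(omega) * K + (1.0 - np.cos(omega)) * (K @ K)

def angle(M):
    """Rotation angle from trace(M) = 1 + 2 cos(omega)."""
    return np.arccos(np.clip((np.trace(M) - 1.0) / 2.0, -1.0, 1.0))

# Rotational group 222: identity and twofold rotations about x, y, z.
GROUP_222 = [np.eye(3)] + [rot(np.pi, axis) for axis in np.eye(3)]

def psi(omega, kappa=10.0):
    """Hypothetical radially symmetric kernel, depending on the angle only."""
    return np.exp(kappa * (np.cos(omega) - 1.0))

def psi_cs(g, g0, group=GROUP_222):
    """Eq. (8): average of psi(omega(g q g0^{-1})) over all symmetry elements q."""
    return np.mean([psi(angle(g @ q @ g0.T)) for q in group])

g0 = rot(0.4, [1.0, 1.0, 0.0])   # center of the kernel (illustrative)
g = rot(1.0, [0.2, -0.3, 0.9])   # orientation at which the kernel is evaluated
print(psi_cs(g, g0), psi_cs(g @ GROUP_222[2], g0))   # equal by construction
```

Since the sum runs over the whole group, `psi_cs` is invariant under right multiplication of `g` by any symmetry element, which is exactly the symmetry property of Eq. (2), while the function remains defined on all of SO(3).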
It is emphasized that $\psi_{\mathrm{cs}}$, like $\psi$, is properly defined on SO(3), where numerical methods of fast summation are known (cf. Hielscher et al. 2010), which are, however, unknown for any subset of SO(3), such as $\mathbb{G}$. Moreover, the Fourier coefficients of $\psi_{\mathrm{cs}}$ can easily be computed from the Chebyshev coefficients of $\psi$,

\[ \hat\psi_{\mathrm{cs}}(\ell, k, k') = \frac{\hat\psi(\ell)}{\#\tilde{G}_{\mathrm{Laue}}} \sum_{q \in \tilde{G}_{\mathrm{Laue}}}\,\sum_{j=-\ell}^{\ell} D_\ell^{kj}(q)\,D_\ell^{jk'}(g_0). \]

3.5
Geodesics, Hopf Fibres, and Clifford Tori
Next, some sets of rotations that are instrumental for texture analysis, such as the (Hopf) fiber and the (Clifford) torus, are defined and characterized in terms of pairs (of sets) of unit vectors comprising an initial set of unit vectors and its image with respect to the elements of the set of rotations. The distance between two rotations $g_0$, $g$ is defined as the angle $\omega(g_0 g^{-1})$ of the composition of the rotation $g^{-1}$ followed by the rotation $g_0$. The distance of a rotation $g_0$ from a set of rotations $G$ is the infimum of all distances between the rotation and any element of the set of rotations, i.e., $d(g_0, G) = \inf_{g \in G} \omega(g_0 g^{-1})$. A one-dimensional submanifold of a Riemannian manifold is called a geodesic if it is locally the shortest path between any two of its points. Any geodesic $G \subset SO(3)$ can be parametrized by two unit vectors and is defined as the fiber

\[ G = G(\mathbf{h}, \mathbf{r}) = \{g \in SO(3) \mid g\mathbf{h} = \mathbf{r}\}, \tag{9} \]
where the vectors $\mathbf{h}, \mathbf{r} \in S^2$ are well defined up to the symmetry $G(\mathbf{h}, \mathbf{r}) = G(-\mathbf{h}, -\mathbf{r})$ (Hielscher 2007; Meister and Schaeben 2004). The geodesics induce a double fibration of SO(3) and may be referred to as Hopf fibers (Vajk 1995; Kreminski 1997; Chisholm, The sphere in three dimensions and higher: generalizations and special cases. Personal Communication, 2000). In terms of unit quaternions $q_1, q_2 \in S^3$ associated with rotations $g_1, g_2 \in SO(3)$, their geodesic $G(\mathbf{h}, \mathbf{r})$ corresponds to the great circle $C(q_1, q_2) \subset S^3$ with pure quaternions

\[ \mathbf{h} = q_1^* q_2, \qquad \mathbf{r} = q_2 q_1^*. \]

Given a pair of unit vectors $(\mathbf{h}, \mathbf{r}) \in S^2 \times S^2$ with $\mathbf{h} \times \mathbf{r} \neq 0$, the geodesic $G(\mathbf{h}, \mathbf{r})$ is associated with the great circle $C(q_1, q_2)$ of unit quaternions spanned by the orthonormal quaternions

\[ q_1 = \frac{1}{\|1 - \mathbf{r}\mathbf{h}\|}\,(1 - \mathbf{r}\mathbf{h}) = \cos\frac{\eta}{2} + \frac{\mathbf{h} \times \mathbf{r}}{\|\mathbf{h} \times \mathbf{r}\|}\,\sin\frac{\eta}{2}, \tag{10} \]
\[ q_2 = \frac{1}{\|\mathbf{h} + \mathbf{r}\|}\,(\mathbf{h} + \mathbf{r}) = 0 + \frac{\mathbf{h} + \mathbf{r}}{\|\mathbf{h} + \mathbf{r}\|}, \tag{11} \]

where $\eta = \eta(\mathbf{h}, \mathbf{r})$ denotes the angle between $\mathbf{h}$ and $\mathbf{r}$, i.e., $\cos\eta = \mathbf{h} \cdot \mathbf{r}$. Since $\mathbf{r}$ and $\mathbf{h}$ are pure unit quaternions, we get

\[ \mathbf{r}(1 - \mathbf{r}\mathbf{h}) = (1 - \mathbf{r}\mathbf{h})\mathbf{h} = \mathbf{h} + \mathbf{r}, \]

and $\|1 - \mathbf{r}\mathbf{h}\| = \|\mathbf{h} + \mathbf{r}\|$. Then, with Eqs. (10) and (11),

\[ \mathbf{r} q_1 = q_1 \mathbf{h} = q_2, \tag{12} \]

i.e., $\mathbf{r} q_1$ and $q_1 \mathbf{h}$ also represent rotations mapping $\mathbf{h}$ onto $\mathbf{r}$. Moreover, it should be noted that Eq. (12) implies

\[ q_2^*\,\mathbf{r}\,q_1 = 1, \qquad q_1\,\mathbf{h}\,q_2^* = 1, \]

which may be interpreted as a remarkable "factorization" of 1 (Meister and Schaeben 2004). The distance of an arbitrary rotation $g$ from the fiber $G(\mathbf{h}, \mathbf{r})$ or, referring to the quaternionic embedding, the distance of $q \in S^3$ from the circle $C(q_1, q_2)$ is given by

\[ d(g, G(\mathbf{h}, \mathbf{r})) = \frac12 \arccos(g\mathbf{h} \cdot \mathbf{r}), \qquad d(q, C(q_1, q_2)) = \frac12 \arccos(q\mathbf{h}q^* \cdot \mathbf{r}) \]

(Kunze 1991; Meister and Schaeben 2004).

Let $q_1, q_2, q_3$, and $q_4$ denote four mutually orthonormal quaternions. Then, the Clifford torus $T(q_1, q_2, q_3, q_4; \Theta) \subset S^3$ (Chisholm, The sphere in three dimensions and higher: generalizations and special cases. Personal Communication, 2000), defined as the set of quaternions

\[ q(s, t; \Theta) = (q_1 \cos s + q_2 \sin s)\cos\Theta + (q_3 \cos t + q_4 \sin t)\sin\Theta, \qquad s, t \in [0, 2\pi),\ \Theta \in [0, \pi/2], \tag{13} \]

consists of all great circles at distance $\Theta$ from the geodesic $C(q_1, q_2) \cong G(\mathbf{h}, \mathbf{r})$, i.e., $T(q_1, q_2, q_3, q_4; \Theta) = T(G(\mathbf{h}, \mathbf{r}); \Theta)$. It is associated with all rotations mapping $\mathbf{h}$ onto the small circle $c(\mathbf{r}; 2\Theta) \subset S^2$ and mapping the small circle $c(\mathbf{h}; 2\Theta) \subset S^2$ onto $\mathbf{r}$, respectively (Meister and Schaeben 2004). It should be noted that Eq. (13) can be suitably factorized (Meister and Schaeben 2004).
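The construction of the fiber $G(\mathbf{h}, \mathbf{r})$ from Eqs. (10) to (12) can be checked with a small quaternion calculation. This is a minimal sketch, assuming quaternions stored as arrays (w, x, y, z), the Hamilton product, and rotations acting as $\mathbf{v} \mapsto q\mathbf{v}q^*$; the helper names are hypothetical.

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    pw, pv, qw, qv = p[0], p[1:], q[0], q[1:]
    return np.concatenate(([pw * qw - pv @ qv], pw * qv + qw * pv + np.cross(pv, qv)))

def qconj(q):
    """Quaternion conjugate q*."""
    return np.concatenate(([q[0]], -q[1:]))

def pure(v):
    """Embed a 3-vector as a pure quaternion."""
    return np.concatenate(([0.0], v))

def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q: v -> q v q*."""
    return qmul(qmul(q, pure(v)), qconj(q))[1:]

def fiber_basis(h, r):
    """Orthonormal quaternions q1, q2 of Eqs. (10) and (11); requires h x r != 0."""
    a = np.array([1.0, 0.0, 0.0, 0.0]) - qmul(pure(r), pure(h))   # 1 - r h
    b = pure(h + r)
    return a / np.linalg.norm(a), b / np.linalg.norm(b)

def fiber_distance(q, h, r):
    """Distance of q from C(q1, q2) as given in the text: arccos(q h q* . r) / 2."""
    return 0.5 * np.arccos(np.clip(rotate(q, h) @ r, -1.0, 1.0))

h = np.array([0.0, 0.0, 1.0])          # illustrative crystal direction
r = np.array([1.0, 2.0, 2.0]) / 3.0    # illustrative specimen direction
q1, q2 = fiber_basis(h, r)
# Every point of the great circle spanned by q1 and q2 maps h onto r.
circle = [np.cos(t) * q1 + np.sin(t) * q2 for t in np.linspace(0.0, 2.0 * np.pi, 9)]
print([np.allclose(rotate(q, h), r) for q in circle])
```

The same helpers can be used to confirm Eq. (12), $\mathbf{r}q_1 = q_1\mathbf{h} = q_2$, and that the distance of every circle point from the fiber vanishes.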
4
Totally Geodesic Radon Transforms
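For any function $f$ integrable on each fiber $G(\mathbf{h}, \mathbf{r})$, the totally geodesic Radon transform $\mathcal{R}f$ assigns to $f$ its mean values along the fibers, i.e.,

\[ \mathcal{R}f(\mathbf{h}, \mathbf{r}) = \frac{1}{2\pi} \int_{G(\mathbf{h}, \mathbf{r})} f(g)\,dg. \tag{14} \]

It provides the density of the probability that the random crystal direction $g\mathbf{h}$ coincides with the specimen direction $\mathbf{r}$, given the random rotation $g$. Accounting for Friedel's law (Friedel 1913), according to which diffraction cannot distinguish between the positive and negative normal vector of a lattice plane, the pole density function is

\[ P(\mathbf{h}, \mathbf{r}) = \frac12\left(\mathcal{R}f(\mathbf{h}, \mathbf{r}) + \mathcal{R}f(-\mathbf{h}, \mathbf{r})\right) = \mathcal{X}f(\mathbf{h}, \mathbf{r}), \tag{15} \]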
where $\mathcal{X}f(\mathbf{h}, \mathbf{r})$ is also referred to as the basic crystallographic X-ray transform (Nikolayev and Schaeben 1999; Schaeben 2001). While the totally geodesic Radon transform possesses a unique inverse (Helgason 1994, 1999), the crystallographic X-ray transform does not. The kernel of the latter consists of the harmonics of odd order (Matthies 1979), see below. Further following Helgason (1994, 1999), the generalized totally geodesic Radon transform and the respective dual are well defined. The generalized totally geodesic Radon transform of a real function $f\colon SO(3) \to \mathbb{R}$ is defined as

\[ \mathcal{R}^{(\rho)}f(\mathbf{h}, \mathbf{r}) = \frac{1}{4\pi^2 \sin 2\rho} \int_{d(g, G(\mathbf{h}, \mathbf{r})) = \rho} f(g)\,dg. \]
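It associates with $f$ its mean values over the torus $T(G(\mathbf{h}, \mathbf{r}); \rho)$ with core $G(\mathbf{h}, \mathbf{r})$ and radius $\rho$ (Eq. 13). For $\rho \to 0$, the generalized totally geodesic Radon transform converges toward the totally geodesic Radon transform. Then, we may state the following theorem: The generalized totally geodesic Radon transform is equal to the spherically translated totally geodesic Radon transform, and it can be identified with the angle density function,

\[
\begin{aligned}
\left(\mathcal{T}^{(\rho)}[\mathcal{R}f]\right)(\mathbf{h}, \mathbf{r})
&= \frac{1}{2\pi \sin\rho} \int_{c(\mathbf{h}; \rho)} \mathcal{R}f(\mathbf{h}', \mathbf{r})\,d\mathbf{h}' \\
&= \frac{1}{2\pi \sin\rho} \int_{c(\mathbf{r}; \rho)} \mathcal{R}f(\mathbf{h}, \mathbf{r}')\,d\mathbf{r}' && (16) \\
&= \frac{1}{4\pi^2 \sin\rho} \int_{c(\mathbf{r}; \rho)} \int_{G(\mathbf{h}, \mathbf{r}')} f(g)\,dg\,d\mathbf{r}' && (17) \\
&= \frac{1}{4\pi^2 \sin\rho} \int_{T(G(\mathbf{h}, \mathbf{r}); \frac{\rho}{2})} f(g)\,dg && (18) \\
&= \frac{1}{4\pi^2 \sin\rho} \int_{d(g, G(\mathbf{h}, \mathbf{r})) = \frac{\rho}{2}} f(g)\,dg && (19) \\
&= \mathcal{R}^{(\rho/2)}f(\mathbf{h}, \mathbf{r}).
\end{aligned}
\]

Thus,

\[ \left(\mathcal{T}^{(\rho)}[\mathcal{R}f]\right)(\mathbf{h}, \mathbf{r}) = \mathcal{R}^{(\rho/2)}f(\mathbf{h}, \mathbf{r}) = \mathcal{A}f(\mathbf{h}, \mathbf{r}; \rho) \tag{20} \]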
(Bernstein et al. 2009). The angle density function $\mathcal{A}f(\mathbf{h}, \mathbf{r}; \rho)$ was introduced into texture analysis by Bunge, e.g., Bunge (1969, p. 44; 1982, p. 74) (with a false normalization). According to its definition, it is the mean value of the pole density function over a small circle $c(\mathbf{r}; \rho)$ centered at $\mathbf{r}$, a construct known in spherical approximation as the spherical translation $(\mathcal{T}^{(\rho)}[\mathcal{R}f])(\mathbf{h}, \mathbf{r})$. Thus, it is the density of the probability that the crystallographic direction $\mathbf{h}$ encloses the angle $\rho$, $0 \le \rho \le \pi$, with the specimen direction $\mathbf{r}$, given the orientation probability density function $f$. Equation (16), i.e., the commutation of the order of integration, has been observed without reference to Radon transforms or Ásgeirsson means and stated without proof (cf. Bunge 1969, p. 47; 1982, p. 76), not to mention purely geometric arguments. Nevertheless, its central role for the inverse Radon transform was recognized in Muller et al. (1981) by "rewriting" Matthies' inversion formula (Matthies 1979). It should be noted that

\[ \mathcal{A}f(\mathbf{h}, \mathbf{r}; 0) = \mathcal{R}f(\mathbf{h}, \mathbf{r}), \qquad \mathcal{A}f(\mathbf{h}, \mathbf{r}; \pi) = \mathcal{R}f(-\mathbf{h}, \mathbf{r}). \]
The key to the analytical inverse of the totally geodesic Radon transform is provided by the dual Radon transforms (Helgason 1999).
4.1
Properties of the Spherical Radon Transform
In this section, we compile the properties of the totally geodesic Radon transform which are fundamental to understand the mathematics of texture analysis.
Antipodal Symmetry
On its domain of definition, the one-dimensional Radon transform $\mathcal{R}f$ of any function $f\colon SO(3) \to \mathbb{R}$ has the symmetry property $\mathcal{R}f(-\mathbf{h}, -\mathbf{r}) = \mathcal{R}f(\mathbf{h}, \mathbf{r})$. The crystallographic X-ray transform satisfies the additional symmetry property $\mathcal{X}f(\mathbf{h}, \mathbf{r}) = \mathcal{X}f(-\mathbf{h}, \mathbf{r}) = \mathcal{X}f(\mathbf{h}, -\mathbf{r})$. Thus, pole figures correspond to the crystallographic X-ray transform, which is even in both arguments.
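Effect of Crystallographic Symmetry
If, for a symmetry group $\tilde{G}_{\mathrm{Laue}} \subset SO(3)$, a function $f$ satisfies $f(gq) = f(g)$ for all $g \in SO(3)$, $q \in \tilde{G}_{\mathrm{Laue}}$, its corresponding Radon transform $\mathcal{R}f$ satisfies

\[ \mathcal{R}f(q\mathbf{h}, \mathbf{r}) = \mathcal{R}[f(\circ\,q^{-1})](\mathbf{h}, \mathbf{r}) = \mathcal{R}[f(\circ)](\mathbf{h}, \mathbf{r}) = \mathcal{R}f(\mathbf{h}, \mathbf{r}) \quad\text{for all } q \in \tilde{G}_{\mathrm{Laue}}. \]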
Radial Symmetry
If the orientation probability density function is radially symmetric with respect to $g_0 \in SO(3)$, i.e., if it depends on the angle of rotation only, the Radon transform is radially symmetric too. More specifically and formally, let $f$ be of the form

\[ f(g) = f(\omega(g g_0^{-1})), \qquad g_0 \in SO(3). \]

Then, the Radon transform $\mathcal{R}f(\mathbf{h}, \mathbf{r})$ defined on $S^2 \times S^2$ is radially symmetric with respect to $\mathbf{r}_0 = g_0\mathbf{h}$, i.e., $\mathcal{R}f(\mathbf{h}, \circ)$ is radially symmetric with respect to $g_0\mathbf{h}$, and $\mathcal{R}f(\circ, \mathbf{r})$ is radially symmetric with respect to $g_0^{-1}\mathbf{r}$. Thus, the Radon transform reduces to a function $\mathcal{R}f(g_0\mathbf{h} \cdot \mathbf{r})$ defined on $[-1, 1]$ and may be thought of as depending on the angle $\eta = \arccos(g_0\mathbf{h} \cdot \mathbf{r}) \in [0, \pi]$. In particular, the Chebyshev coefficients $\hat\psi(\ell)$ of a radially symmetric orientation density function $\psi$ coincide with the Legendre coefficients of its Radon transform $\mathcal{R}\psi(\mathbf{h}, \circ)$, i.e.,

\[ \mathcal{R}\psi(\mathbf{h}, \mathbf{r}) \sim \sum_{\ell=0}^{\infty} \hat\psi(\ell)\,P_\ell(g_0\mathbf{h} \cdot \mathbf{r}), \qquad \mathbf{h}, \mathbf{r} \in S^2. \tag{21} \]
It should be noted that the radial symmetry of $f$ with respect to $g_0 \in SO(3)$ is necessary and sufficient for the radial symmetry of the transform $\mathcal{R}f(\mathbf{h}, \mathbf{r})$ with respect to $\mathbf{r}_0 = g_0\mathbf{h}$ (Schaeben 1997). If the orientation probability density function is a fiber symmetric function, i.e., if $f$ is of the form

\[ f(g) = f(g\mathbf{h}_0 \cdot \mathbf{r}_0), \qquad (\mathbf{h}_0, \mathbf{r}_0) \in S^2 \times S^2, \]

then $\mathcal{R}f(\mathbf{h}, \circ)$ is radially symmetric with respect to $\mathbf{r}_0$, and $\mathcal{R}f(\circ, \mathbf{r})$ is radially symmetric with respect to $\mathbf{h}_0$ (Hielscher 2007). Special cases of the even Bingham quaternion distribution on $S^3$ or, equivalently, of the Fisher-von Mises matrix distribution on SO(3) comprise a bimodal "bipolar," a circular "fiber," and an often overlooked spherical "surface" distribution (Kunze and Schaeben 2004).
Darboux Differential Equation
The Radon transform satisfies a Darboux-type differential equation

\[ (\Delta_{\mathbf{h}} - \Delta_{\mathbf{r}})\,\mathcal{R}f(\mathbf{h}, \mathbf{r}) = 0, \tag{22} \]

where $\Delta_{\mathbf{h}}$ denotes the Laplace-Beltrami operator applied with respect to $\mathbf{h} \in S^2$ (Savyolova 1994). Its general solution has been derived in terms of harmonics and in terms of characteristics, respectively (Nikolayev and Schaeben 1999).
Fourier Slice Theorem
The following well-known theorem, dating back to the origin of texture analysis, characterizes the relationship between the Fourier expansion of an orientation density function and its corresponding pole density function. Let $f \in L^2(SO(3))$ be an orientation density function with Fourier expansion

\[ f \sim \sum_{\ell=0}^{\infty}\,\sum_{k,k'=-\ell}^{\ell} \frac{(\ell + \frac12)^{1/2}}{2\pi}\,\hat{f}(\ell, k, k')\,D_\ell^{kk'}. \]

Then, the corresponding pole density function $P \in L^2(S^2 \times S^2)$, $P(\mathbf{h}, \mathbf{r}) = \frac12\left(\mathcal{R}f(\mathbf{h}, \mathbf{r}) + \mathcal{R}f(-\mathbf{h}, \mathbf{r})\right)$, possesses the associated Fourier expansion

\[ P(\mathbf{h}, \mathbf{r}) = \mathcal{X}f(\mathbf{h}, \mathbf{r}) \sim \sum_{\ell \in 2\mathbb{N}_0}\,\sum_{k,k'=-\ell}^{\ell} \frac{1}{(\ell + \frac12)^{1/2}}\,\hat{f}(\ell, k, k')\,Y_\ell^{k'}(\mathbf{h})\,\overline{Y_\ell^{k}(\mathbf{r})}. \tag{23} \]

The theorem states that the Radon transform preserves the order of harmonics,

\[ \mathcal{R}D_\ell^{kk'}(\mathbf{h}, \mathbf{r}) = \frac{2\pi}{\ell + \frac12}\,Y_\ell^{k'}(\mathbf{h})\,\overline{Y_\ell^{k}(\mathbf{r})}, \tag{24} \]

and, moreover, that a function $f\colon SO(3) \to \mathbb{R}$ and its Radon transform $\mathcal{R}f\colon S^2 \times S^2 \to \mathbb{R}$ have the same harmonic coefficients up to scaling. In particular, it states that the crystallographic X-ray transform, Eq. (15), of any odd-order harmonic vanishes, i.e.,

\[ \mathcal{X}D_\ell^{kk'} \equiv 0 \quad\text{for all odd } \ell. \tag{25} \]

Thus, the crystallographic X-ray transform has a nontrivial kernel comprising the harmonics of odd order. For a modern account of the Fourier slice theorem, the reader is referred to Hielscher et al. (2008).
Range
The range of an operator $A\colon D \to Y$ is defined as the subspace of all functions $P \in Y$ such that there is a function $f \in D$ with $Af = P$. In the case of the Radon transform, a characterization of the range can be derived directly from Eq. (24).
More specifically, the image of $L^2(SO(3))$ with respect to the Radon transform can be derived by comparison of Eq. (23) with

\[ u(\mathbf{h}, \mathbf{r}) \sim \sum_{\ell \in 2\mathbb{N}_0}\,\sum_{k,k'=-\ell}^{\ell} \hat{u}(\ell, k, k')\,Y_\ell^{k'}(\mathbf{h})\,\overline{Y_\ell^{k}(\mathbf{r})}, \]

resulting in

\[
\begin{aligned}
\mathcal{R}L^2(SO(3)) &= \Big\{ u(\mathbf{h}, \mathbf{r}) = \sum \frac{1}{(\ell + \frac12)^{1/2}}\,\hat{f}(\ell, k, k')\,Y_\ell^{k'}(\mathbf{h})\,\overline{Y_\ell^{k}(\mathbf{r})} \;\Big|\; \sum \big(\hat{f}(\ell, k, k')\big)^2 < \infty \Big\} \\
&= \Big\{ u(\mathbf{h}, \mathbf{r}) = \sum \hat{u}(\ell, k, k')\,Y_\ell^{k'}(\mathbf{h})\,\overline{Y_\ell^{k}(\mathbf{r})} \;\Big|\; \sum (\ell + \tfrac12)\,\big(\hat{u}(\ell, k, k')\big)^2 < \infty \Big\}.
\end{aligned}
\]

[...] of a statistically uniform sample are linked by effective macroscopic moduli $C^*$ and $S^*$ that obey Hooke's law of linear elasticity,

\[ C^*_{ijkl} = \langle\sigma_{ij}\rangle\,\langle\varepsilon_{kl}\rangle^{-1}, \qquad S^*_{ijkl} = \langle\varepsilon_{ij}\rangle\,\langle\sigma_{kl}\rangle^{-1}, \]

where $\langle\varepsilon_{ij}\rangle = \frac{1}{V}\int \varepsilon_{ij}(\mathbf{r})\,d\mathbf{r}$, $\langle\sigma_{ij}\rangle = \frac{1}{V}\int \sigma_{ij}(\mathbf{r})\,d\mathbf{r}$, $V$ is the volume, and the notation $\langle\cdot\rangle$ denotes an ensemble average. The stress $\sigma(\mathbf{r})$ and strain $\varepsilon(\mathbf{r})$ distributions in a real polycrystal vary discontinuously at the surface of grains. By replacing the real polycrystal with a "statistically uniform" sample, we are assuming that the stress $\sigma(\mathbf{r})$ and strain $\varepsilon(\mathbf{r})$ vary slowly and continuously with position $\mathbf{r}$.

A number of methods are available for determining the effective macroscopic modulus of an aggregate. We make the simplifying assumption that there is no significant interaction between grains, which for fully dense polycrystalline aggregates is justified by the agreement between theory and experiments for the methods we present here. However, these methods are not appropriate for aggregates that contain voids, cracks, or pores filled with liquids or gases, as the elastic contrast between the different microstructural elements will be too high and we cannot ignore elastic interactions in such cases. The classical method that takes grain interaction into account is the self-consistent method based on the Eshelby inclusion model (e.g., Eshelby 1957; Hill 1965), which can also account for the shape of the microstructural elements.

The simplest and best-known averaging techniques for obtaining estimates of the effective elastic constants of polycrystals are the Voigt (1928) and Reuss (1929) averages. These averages only use the volume fraction of each phase, the orientation, and the elastic constants of the single crystals or grains. In terms of statistical probability functions, these are first-order bounds, as only the first-order correlation function, which is the volume fraction, is used. Note that no information about the shape or position of neighboring grains is used.
The Voigt average is found by simply assuming that the strain field is everywhere constant (i.e., $\varepsilon(\mathbf{r})$ is independent of $\mathbf{r}$) and hence the strain is equal to its mean value in each grain. The strain at every position is set equal to the macroscopic strain of the sample. $C^*$ is then estimated by a volume average of the local stiffnesses $C(g_i)$ with orientation $g_i$ and volume fraction $V_i$,

\[ C^* \approx C^{\mathrm{Voigt}} = \Big[\sum_i V_i\,C(g_i)\Big]. \]
The Reuss average is found by assuming that the stress field is everywhere constant. The stress at every position is set equal to the macroscopic stress of the sample. $C^*$ or $S^*$ is then estimated by the volume average of the local compliances $S(g_i)$,

\[ C^* \approx C^{\mathrm{Reuss}} = \Big[\sum_i V_i\,S(g_i)\Big]^{-1}, \qquad S^* \approx S^{\mathrm{Reuss}} = \Big[\sum_i V_i\,S(g_i)\Big], \]

and

\[ C^{\mathrm{Voigt}} \neq C^{\mathrm{Reuss}} \quad\text{and}\quad C^{\mathrm{Voigt}} \neq \big[S^{\mathrm{Reuss}}\big]^{-1}. \]

These two estimates are not equal for anisotropic solids, with the Voigt average being an upper bound and the Reuss average a lower bound. A physical estimate of the moduli should lie between the Voigt and Reuss bounds, as the stress and strain distributions are expected to be somewhere between uniform strain (Voigt bound) and uniform stress (Reuss bound). Hill (1952) observed that the arithmetic mean (and the geometric mean) of the Voigt and Reuss bounds, sometimes called the Hill or Voigt-Reuss-Hill (VRH) average, is often close to experimental values. The VRH average has no theoretical justification. As it is much easier to calculate the arithmetic mean of the Voigt and Reuss elastic tensors, all authors have tended to apply the Hill average as an arithmetic mean. In Earth sciences, the Voigt, Reuss, and Hill averages have been widely used for averages of oriented polyphase rocks (e.g., Crosson and Lin 1971). Although the Voigt and Reuss bounds are often far apart for anisotropic materials, they still provide the limits within which the experimental data should be found.

Several authors have searched for a geometric mean of oriented polycrystals using the exponent of the average of the natural logarithms of the eigenvalues of the stiffness matrix (Matthies and Humbert 1993). Their choice of this averaging procedure was guided by the fact that the ensemble average elastic stiffness $\langle C\rangle$ should equal the inverse of the ensemble average elastic compliance, $\langle S\rangle^{-1}$, which is not true, for example, of the Voigt and Reuss estimates. A method of determining the geometric mean for arbitrary orientation distributions has been developed (Matthies and Humbert 1993). The method derives from the fact that a stable elastic solid must have an elastic strain energy that is positive. It follows from this that the eigenvalues of the elastic matrix must all be positive.
Comparison between Voigt, Reuss, Hill, and self-consistent estimates shows that the geometric mean provides estimates very close to the self-consistent method but at considerably reduced computational complexity (Matthies and Humbert 1993). The condition that the macroscopic polycrystal elastic stiffness $\langle C\rangle$ must equal the inverse of the aggregate elastic compliance, $\langle S\rangle^{-1}$, would appear to be a powerful physical constraint on the averaging method (Matthies and Humbert 1993). However, the arithmetic (Hill) and geometric means are also very similar (Mainprice and Humbert 1994), which tends to suggest that they are just mean estimates with no additional physical significance. The wide separation between the Voigt and Reuss bounds for anisotropic materials is caused by the fact that the microstructure is not fully described by such averages. However, despite the fact that these methods do not take into account such basic information as the position or the shape of grains, several studies have shown that the Voigt and Hill averages are within 5–10 % of experimental values for crystalline rocks. For example, Barruol and Kern (1996) showed for several anisotropic lower-crust and upper-mantle rocks from the Ivrea zone in Italy that the Voigt average is within 5 % of the experimentally measured velocity.
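The Voigt, Reuss, Hill, and geometric-mean estimates described above can be sketched for a single-phase aggregate as follows. This is an illustration under stated assumptions, not the procedure of any particular texture package: the elastic tensors are stored in a 6×6 Kelvin (Mandel) matrix form so that matrix inversion and matrix logarithms correspond to the tensor operations, the cubic constants are made-up values in GPa, the grain orientations are random, and all function names are hypothetical.

```python
import numpy as np

PAIRS = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
SCALE = np.array([1.0, 1.0, 1.0, np.sqrt(2.0), np.sqrt(2.0), np.sqrt(2.0)])

def cubic_stiffness(c11, c12, c44):
    """Fourth-rank stiffness tensor of a cubic crystal in its crystal frame."""
    C = np.zeros((3, 3, 3, 3))
    for i in range(3):
        for j in range(3):
            C[i, i, j, j] += c12
            C[i, j, i, j] += c44
            C[i, j, j, i] += c44
        C[i, i, i, i] += c11 - c12 - 2.0 * c44
    return C

def rotate_tensor(C, g):
    """Rotate a fourth-rank tensor into the specimen frame by the orientation g."""
    return np.einsum("ia,jb,kc,ld,abcd->ijkl", g, g, g, g, C)

def to_matrix(C):
    """6x6 Kelvin (Mandel) matrix; its inverse represents the compliance tensor."""
    M = np.empty((6, 6))
    for I, (i, j) in enumerate(PAIRS):
        for J, (k, l) in enumerate(PAIRS):
            M[I, J] = SCALE[I] * SCALE[J] * C[i, j, k, l]
    return M

def random_rotations(count, seed=0):
    """Random proper rotations from the QR factorization of Gaussian matrices."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(count):
        Q, R = np.linalg.qr(rng.normal(size=(3, 3)))
        Q = Q * np.sign(np.diag(R))
        if np.linalg.det(Q) < 0:
            Q[:, 0] = -Q[:, 0]   # enforce det = +1
        out.append(Q)
    return out

def sym_fun(M, fun):
    """Apply a scalar function to a symmetric matrix via its eigenvalues."""
    vals, vecs = np.linalg.eigh(M)
    return (vecs * fun(vals)) @ vecs.T

def voigt(mats, w):
    return sum(wi * m for wi, m in zip(w, mats))

def reuss(mats, w):
    return np.linalg.inv(sum(wi * np.linalg.inv(m) for wi, m in zip(w, mats)))

def geometric(mats, w):
    return sym_fun(sum(wi * sym_fun(m, np.log) for wi, m in zip(w, mats)), np.exp)

grains = random_rotations(50)
w = np.full(len(grains), 1.0 / len(grains))        # equal volume fractions V_i
crystal = cubic_stiffness(240.0, 120.0, 80.0)      # made-up constants (GPa)
mats = [to_matrix(rotate_tensor(crystal, g)) for g in grains]
C_voigt, C_reuss = voigt(mats, w), reuss(mats, w)
C_hill = 0.5 * (C_voigt + C_reuss)
C_geo = geometric(mats, w)
```

In this representation, `C_voigt - C_reuss` is positive semidefinite, and the inverse of `C_geo` equals the geometric mean of the compliances, which is the property that motivated the geometric mean. For an isotropic crystal ($c_{11} - c_{12} = 2c_{44}$), all four estimates coincide.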
7.2 Properties of Polycrystalline Aggregates with Texture
The orientation of crystals in a polycrystal can be measured by volume diffraction techniques (e.g., X-ray or neutron diffraction) or by individual orientation measurements (e.g., U-stage and optical microscope, electron channeling, or electron backscattered diffraction (EBSD)). In addition, numerical simulations of polycrystalline plasticity also produce populations of crystal orientations at mantle conditions (e.g., Tommasi et al. 2004). An orientation, often given the letter $g$, of a grain or crystal in sample coordinates can be described by the rotation matrix between crystal and sample coordinates. In practice, it is convenient to describe the rotation by a triplet of Euler angles, e.g., $g = (\varphi_1, \Phi, \varphi_2)$ as defined by Bunge (1982). One should be aware that there are many different definitions of Euler angles that are used in the physical sciences. The orientation distribution function (ODF) $f(g)$ is defined as the volume fraction of orientations, with an orientation in the interval between $g$ and $g + dg$ in a space containing all possible orientations, given by

$$\frac{\Delta V}{V} = \int f(g)\, dg,$$

where $\Delta V / V$ is the volume fraction of crystals with orientation $g$, $f(g)$ is the texture function, and $dg = \frac{1}{8\pi^2} \sin\Phi \, d\varphi_1 \, d\Phi \, d\varphi_2$ is the volume of the region of integration in orientation space. To calculate the seismic properties of a polycrystal, one must evaluate the elastic properties of the aggregate. In the case of an aggregate with a crystallographic texture, the anisotropy of the elastic properties of the single crystal must be taken into account. A potential complication is that the Cartesian frame defined by orthogonal crystallographic directions that is used to report the elastic tensor of the single crystal may not be the same as the Euler angle reference frame used in texture analysis (e.g., MTEX) or measurement (e.g., EBSD) packages. To account for this difference, a rotation may be required to bring the crystallographic frame of the tensor into coincidence with the Euler angle frame,
$$C_{ijkl}(g^E) = T_{ip}\, T_{jq}\, T_{kr}\, T_{lt}\, C_{pqrt}(g^T),$$

where $C_{ijkl}(g^E)$ is the elastic property in the Euler reference frame and $C_{pqrt}(g^T)$ is the elastic property in the original tensor reference frame; both frames are in crystal coordinates. The transformation matrix $T_{ij}$ is constructed from the angles between the two sets of perpendicular crystallographic axes, which form the rows and columns of the orthogonal transformation or rotation matrix (see Nye 1957). For each orientation $g$, the single-crystal properties have to be rotated into the specimen coordinate frame using the orientation or rotation matrix $g_{ij}$,

$$C_{ijkl}(g) = g_{ip}\, g_{jq}\, g_{kr}\, g_{lt}\, C_{pqrt}(g^E),$$

where $C_{ijkl}(g)$ is the elastic property in sample coordinates, $g_{ij} = g(\varphi_1, \Phi, \varphi_2)$ is the measured orientation in sample coordinates, and $C_{pqrt}(g^E)$ is the elastic property in crystal coordinates of the Euler frame. We can rewrite the above equation as

$$C_{ijkl}(g) = T_{ijklpqrt}(g)\, C_{pqrt}(g^E)$$

with

$$T_{ijklpqrt}(g) = \frac{\partial x_i}{\partial x_p} \frac{\partial x_j}{\partial x_q} \frac{\partial x_k}{\partial x_r} \frac{\partial x_l}{\partial x_t} = g_{ip}\, g_{jq}\, g_{kr}\, g_{lt}.$$

The elastic properties of the polycrystal may be calculated by integration over all possible orientations of the ODF. Bunge (1982) has shown that the integration is given as

$$\langle C_{ijkl} \rangle = \int g_{ip}\, g_{jq}\, g_{kr}\, g_{lt}\, C_{pqrt}(g^E)\, f(g)\, dg = \int C_{ijkl}(g)\, f(g)\, dg,$$

where $\langle C_{ijkl} \rangle$ are the elastic properties of the aggregate and $\int f(g)\, dg = 1$. The integral on SO(3) can be calculated efficiently using the numerical methods available in MTEX. We can also regroup the texture-dependent part of the integral as $\langle T_{ijklpqrt} \rangle$,

$$\langle T_{ijklpqrt} \rangle\, C_{pqrt}(g^E) = \left[ \int T_{ijklpqrt}(g)\, f(g)\, dg \right] C_{pqrt}(g^E).$$
We can evaluate $\langle T_{ijklpqrt} \rangle$ analytically in terms of generalized spherical harmonic coefficients for specific crystal and sample symmetries (e.g., Ganster and Geiss 1985; Johnson and Wenk 1986; Morris 2006; Zuo et al. 1989). The minimum texture information required to calculate the elastic properties consists of the even-order coefficients and a series expansion to order 4, which derive from the centrosymmetry and the fourth rank of the elasticity tensor, respectively. The direct consequence of this is
that only a limited number of pole figures are required to define the ODF, e.g., 1 for cubic and hexagonal and 2 for tetragonal and trigonal crystal symmetries. Alternatively, elastic properties may be determined by simple summation of individual orientation measurements

$$\langle C_{ijkl} \rangle = \sum g_{ip}\, g_{jq}\, g_{kr}\, g_{lt}\, C_{pqrt}(g^E)\, V(g) = \sum C_{ijkl}(g)\, V(g),$$

where $V(g)$ is the volume fraction of grains in orientation $g$. For example, the Voigt average of the rock for $m$ mineral phases of volume fraction $V(m)$ is given as

$$\langle C_{ijkl} \rangle^{\mathrm{Voigt}} = \sum_m V(m)\, \langle C_{ijkl} \rangle_m.$$
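The rotation of the single-crystal tensor and the summation over individual orientations can be sketched as follows. This is only an illustration under stated assumptions, not the MTEX implementation: the helper names are hypothetical, the 6 × 6 input is assumed to be in standard Voigt notation, and `rotation_from_bunge` assumes one particular reading of the Bunge angles, namely the matrix $R_z(\varphi_1) R_x(\Phi) R_z(\varphi_2)$ that converts crystal-frame components into sample-frame components. Other Euler conventions require a different matrix, possibly its transpose.

```python
import numpy as np

# Index pairs (i, j) for the Voigt indices 1..6: 11, 22, 33, 23, 13, 12.
VOIGT_PAIRS = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]


def voigt_to_tensor(c66):
    """Expand a 6x6 Voigt stiffness matrix into the 3x3x3x3 tensor C_ijkl."""
    c = np.zeros((3, 3, 3, 3))
    for a, (i, j) in enumerate(VOIGT_PAIRS):
        for b, (k, l) in enumerate(VOIGT_PAIRS):
            value = c66[a][b]
            c[i, j, k, l] = c[j, i, k, l] = value
            c[i, j, l, k] = c[j, i, l, k] = value
    return c


def tensor_to_voigt(c):
    """Contract C_ijkl back into the 6x6 Voigt matrix."""
    return np.array([[c[i, j, k, l] for (k, l) in VOIGT_PAIRS]
                     for (i, j) in VOIGT_PAIRS])


def rotation_from_bunge(phi1, Phi, phi2):
    """Rotation matrix Rz(phi1) Rx(Phi) Rz(phi2), angles in radians (assumed convention)."""
    def rz(a):
        return np.array([[np.cos(a), -np.sin(a), 0.0],
                         [np.sin(a), np.cos(a), 0.0],
                         [0.0, 0.0, 1.0]])

    def rx(a):
        return np.array([[1.0, 0.0, 0.0],
                         [0.0, np.cos(a), -np.sin(a)],
                         [0.0, np.sin(a), np.cos(a)]])

    return rz(phi1) @ rx(Phi) @ rz(phi2)


def rotate_tensor(g, c):
    """C_ijkl(g) = g_ip g_jq g_kr g_lt C_pqrt."""
    return np.einsum("ip,jq,kr,lt,pqrt->ijkl", g, g, g, g, c)


def voigt_average(rotations, c_crystal, volume_fractions):
    """Weighted sum of rotated tensors; the fractions are normalized to one."""
    v = np.asarray(volume_fractions, dtype=float)
    v = v / v.sum()
    return sum(vn * rotate_tensor(g, c_crystal) for vn, g in zip(v, rotations))
```

For a polyphase rock, the same `voigt_average` call can be repeated per phase and the phase averages combined with the phase volume fractions $V(m)$.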
The final step is the calculation of the three seismic phase velocities by solving the eigenvalue problem for the Christoffel tensor ($T_{ik}$). The Christoffel tensor is symmetrical because of the symmetry of the elastic constants, and hence,

$$T_{ik} = C_{ijkl}\, n_j n_l = C_{jikl}\, n_j n_l = C_{ijlk}\, n_j n_l = C_{klij}\, n_j n_l = T_{ki}.$$

The Christoffel tensor is also invariant upon the change of sign of the propagation direction $n$, as the elastic tensor is not sensitive to the presence or absence of a center of symmetry, being a centrosymmetric physical property. Because the elastic strain energy $\tfrac{1}{2} C_{ijkl}\, \varepsilon_{ij}\, \varepsilon_{kl}$ of a stable crystal is always positive and real (e.g., Nye 1957), the eigenvalues of the $3 \times 3$ Christoffel tensor (being a Hermitian matrix) are three positive real values of the wave moduli $M$ corresponding to $\rho V_p^2$, $\rho V_{s1}^2$, and $\rho V_{s2}^2$ of the plane waves propagating in the direction $n$, where $\rho$ is the density. The three eigenvectors of the Christoffel tensor are the polarization directions (also called vibration, particle movement, or displacement vectors) of the three waves; as the Christoffel tensor is symmetrical, the three eigenvectors, and hence the polarization vectors, are mutually perpendicular. In the most general case, there are no particular angular relationships between the polarization directions $p$ and the propagation direction $n$; however, typically the P-wave polarization direction is nearly parallel and the two S-wave polarizations are nearly perpendicular to the propagation direction, and they are termed quasi-P or quasi-S waves. If the P-wave and two S-wave polarizations are exactly parallel and perpendicular to the propagation direction, respectively, which may happen along a symmetry direction, then the waves are termed pure P and pure S or pure modes. In general, the three waves have polarizations that are perpendicular to one another and propagate in the same direction with different velocities, with $V_p > V_{s1} > V_{s2}$.
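As an illustration of the step just described, the sketch below builds $T_{ik} = C_{ijkl} n_j n_l$ for one propagation direction and extracts velocities and polarizations from its eigen-decomposition. The function names are hypothetical, the stiffness is assumed to be a 6 × 6 Voigt matrix in GPa, and the density is assumed to be in g/cm³, so that velocities come out in km/s. The eigenvalues are treated as the wave moduli $\rho V^2$.

```python
import numpy as np

# Index pairs (i, j) for the Voigt indices 1..6: 11, 22, 33, 23, 13, 12.
VOIGT_PAIRS = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]


def voigt_to_tensor(c66):
    """Expand a 6x6 Voigt stiffness matrix into the 3x3x3x3 tensor C_ijkl."""
    c = np.zeros((3, 3, 3, 3))
    for a, (i, j) in enumerate(VOIGT_PAIRS):
        for b, (k, l) in enumerate(VOIGT_PAIRS):
            value = c66[a][b]
            c[i, j, k, l] = c[j, i, k, l] = value
            c[i, j, l, k] = c[j, i, l, k] = value
    return c


def christoffel_velocities(c66, density, direction):
    """Return (velocities, polarizations) for one propagation direction.

    velocities:    array [Vp, Vs1, Vs2], sorted from fastest to slowest.
    polarizations: 3x3 array whose rows are the matching unit eigenvectors.
    """
    n = np.asarray(direction, dtype=float)
    n = n / np.linalg.norm(n)
    c = voigt_to_tensor(c66)
    christoffel = np.einsum("ijkl,j,l->ik", c, n, n)
    # eigh returns real eigenvalues in ascending order for a symmetric matrix.
    moduli, vectors = np.linalg.eigh(christoffel)
    order = np.argsort(moduli)[::-1]
    velocities = np.sqrt(moduli[order] / density)
    polarizations = vectors[:, order].T
    return velocities, polarizations
```

Calling the function for every direction on a grid of the hemisphere gives the data needed for velocity and shear-wave splitting stereograms.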
7.3 Properties of Polycrystalline Aggregates: An Example
Metamorphic reactions and phase transformations often result in specific crystallographic relations between minerals. A specific orientation relationship between two minerals is defined by choosing any orientation descriptor that is convenient,
e.g., a pair of parallel crystallographic features, Euler angle triplet, rotation matrix, or rotation axis and angle. The two minerals may have the same or different crystal symmetries. The composition may be the same, as in polymorphic phase transitions, or different, as in dehydration or oxidization reactions. Recently, Boudier et al. (2009) described the orientation relationship between olivine and antigorite serpentine crystal structures by two pairs of planes and directions that are parallel in both minerals:

relation 1: $(100)_{\mathrm{Olivine}} \parallel (001)_{\mathrm{Antigorite}}$ and $[001]_{\mathrm{Olivine}} \parallel [010]_{\mathrm{Antigorite}}$
relation 2: $(010)_{\mathrm{Olivine}} \parallel (001)_{\mathrm{Antigorite}}$ and $[001]_{\mathrm{Olivine}} \parallel [010]_{\mathrm{Antigorite}}$

Such relationships are called Burgers orientation relationships in metallurgy. The relation is used in the present study to calculate the Euler angle triplet, which characterizes the rotation of the crystal axes of antigorite into coincidence with those of olivine. Olivine is hydrated to form antigorite, and in the present case, the rotational point group symmetry of olivine (orthorhombic) and antigorite (monoclinic) results in four symmetrically equivalent new mineral orientations (see Mainprice et al. (1990) for details) because of the symmetry of the olivine that is transformed. The orientation of the $n$ symmetrically equivalent antigorite minerals is given by

$$g^{\mathrm{Antigorite}}_{n=1,\dots,4} = \Delta g^{\mathrm{Olivine-Antigorite}} \cdot S^{\mathrm{Olivine}}_n \cdot g^{\mathrm{Olivine}},$$

where $\Delta g^{\mathrm{Olivine-Antigorite}}$ is the rotation between olivine and newly formed antigorite, $S^{\mathrm{Olivine}}_n$ are the rotational point group symmetry operations of olivine, and $g^{\mathrm{Olivine}}$ is the orientation of an olivine crystal. $\Delta g$ is defined by the Burgers relationships given above, where relation 1 is $\Delta g = (\varphi_1, \Phi, \varphi_2) = (88.6°, 90.0°, 0.0°)$ and relation 2 is $\Delta g = (178.6°, 90.0°, 0.0°)$. Note that the values of the Euler angles of $\Delta g$ will depend on the right-handed orthonormal crystal coordinate system chosen for the orthorhombic olivine and the monoclinic antigorite. In this example, for olivine $K_C = \{a, b, c\}$, and for antigorite $K_C = \{a^*, b, c\}$. The measurement of the texture of antigorite is often unreliable using EBSD because of sample preparation problems. We will use $\Delta g$, which may be expressed as a mineral or phase misorientation function (Bunge and Weiland 1988) as

$$F^{\mathrm{Olivine-Antigorite}}(\Delta g) = \int f^{\mathrm{Olivine}}(g)\, f^{\mathrm{Antigorite}}(\Delta g \cdot g)\, dg$$
to predict the texture of antigorite from the measured texture of olivine. We will only use relation 1 of Boudier et al. (2009) because this relation was found to have a much higher frequency in their samples. As our model olivine texture, illustrated in Fig. 3, we used the olivine texture database of Ben Ismail and Mainprice (1998), consisting of 110 samples and over 10,000 individual measurements made with an optical microscope equipped with a five-axis universal stage. The olivine model texture has the [100] axes aligned with the lineation and the [010] axes normal to the foliation.
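The product that generates the four symmetrically equivalent antigorite orientations from one olivine orientation can be sketched as below. This is an illustration only: the function names are hypothetical, and the helper `bunge_matrix_deg` assumes that each orientation is represented by the passive matrix $Z(\varphi_2) X(\Phi) Z(\varphi_1)$ built from Bunge angles in degrees. The four operations listed are the proper rotations of the orthorhombic point group (identity and the two-fold axes about a, b, and c).

```python
import numpy as np


def bunge_matrix_deg(phi1, Phi, phi2):
    """Passive rotation matrix Z(phi2) X(Phi) Z(phi1), angles in degrees (assumed convention)."""
    a, b, c = np.radians([phi1, Phi, phi2])

    def z(t):
        return np.array([[np.cos(t), np.sin(t), 0.0],
                         [-np.sin(t), np.cos(t), 0.0],
                         [0.0, 0.0, 1.0]])

    def x(t):
        return np.array([[1.0, 0.0, 0.0],
                         [0.0, np.cos(t), np.sin(t)],
                         [0.0, -np.sin(t), np.cos(t)]])

    return z(c) @ x(b) @ z(a)


# Proper rotations of orthorhombic olivine: identity and two-fold axes about a, b, c.
S_OLIVINE = [np.diag(d) for d in ([1.0, 1.0, 1.0],
                                  [1.0, -1.0, -1.0],
                                  [-1.0, 1.0, -1.0],
                                  [-1.0, -1.0, 1.0])]

# Relation 1 of the text, expressed with the Euler angles quoted there.
DELTA_G_RELATION_1 = bunge_matrix_deg(88.6, 90.0, 0.0)


def equivalent_antigorite_orientations(g_olivine, delta_g=DELTA_G_RELATION_1):
    """Return the four matrices delta_g . S_n . g_olivine, n = 1..4."""
    return [delta_g @ s @ g_olivine for s in S_OLIVINE]
```

Under the convention assumed here, `DELTA_G_RELATION_1` sends the olivine [001] axis onto the antigorite [010] axis, which agrees with relation 1; with a different Euler convention the matrix, and possibly the order of the product, would change.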
The texture of the antigorite, calculated using phase misorientation functions, and the pole figures (Fig. 3) clearly show that the Burgers orientation relationships between olivine and antigorite are statistically respected in the aggregates. The seismic properties of the 100 % olivine and antigorite aggregates were calculated using the methods described in Sect. 7.2 for individual orientations, using the elastic single-crystal tensors for olivine (Abramson et al. 1997) and antigorite (Pellenq et al. 2009), respectively. The numerical methods for the seismic calculations are described by Mainprice (1990). The seismic velocities are calculated for propagation directions on a 5° grid in the lower hemisphere. The percentage anisotropy ($A$) is defined here as $A = 200\,(V_{\max} - V_{\min})/(V_{\max} + V_{\min})$. The $V_p$ anisotropy is found by searching the hemisphere over all possible propagation directions for the maximum and minimum values of $V_p$. There are in general two orthogonally polarized S-waves for each propagation direction with different velocities in an anisotropic medium. The anisotropy $AV_s$ can then be defined for each direction, with one S-wave having the maximum velocity and the other the minimum velocity. Contoured lower-hemisphere stereograms of P-wave velocity ($V_p$), percentage shear-wave anisotropy ($AV_s$), also called shear-wave splitting, as well as polarization ($V_{s1}$) of the fastest S-wave are shown in Fig. 4. The seismic properties show a major change in the orientation of the fast direction of compressional wave propagation from parallel to the lineation ($X$) in the olivine aggregate to normal to the foliation ($Z$) in the antigorite aggregate. In addition, there is a dramatic change of orientation of the polarization (or vibration) of the fastest S-wave (S1) from parallel to the ($XY$) foliation plane in the olivine aggregate to perpendicular to the foliation ($Z$) in the antigorite aggregate.
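The anisotropy measures defined above reduce to a few lines of arithmetic. The sketch below is an assumption-laden illustration: the function names and the particular azimuth/inclination layout of the 5° grid are hypothetical, and the velocities are taken to be precomputed for each grid direction.

```python
import numpy as np


def percent_anisotropy(v_max, v_min):
    """A = 200 (Vmax - Vmin) / (Vmax + Vmin), in percent."""
    return 200.0 * (v_max - v_min) / (v_max + v_min)


def lower_hemisphere_grid(step_deg=5.0):
    """Unit propagation directions with z <= 0 on a regular angular grid."""
    azimuth = np.deg2rad(np.arange(0.0, 360.0, step_deg))
    # Inclination 0 is horizontal, 90 degrees points straight down.
    inclination = np.deg2rad(np.arange(0.0, 90.0 + step_deg / 2.0, step_deg))
    az, inc = np.meshgrid(azimuth, inclination)
    directions = np.stack([np.cos(inc) * np.cos(az),
                           np.cos(inc) * np.sin(az),
                           -np.sin(inc)], axis=-1)
    return directions.reshape(-1, 3)


def vp_anisotropy(vp_values):
    """P-wave anisotropy from the extreme Vp values found over all directions."""
    vp = np.asarray(vp_values, dtype=float)
    return percent_anisotropy(vp.max(), vp.min())


def shear_wave_splitting(vs1, vs2):
    """AVs for each direction from the fast (vs1) and slow (vs2) S-wave velocities."""
    return percent_anisotropy(np.asarray(vs1, dtype=float),
                              np.asarray(vs2, dtype=float))
```

Note the difference in scope: `vp_anisotropy` is a single number for the whole hemisphere, whereas `shear_wave_splitting` returns one value per propagation direction, which is what is contoured in an $AV_s$ stereogram.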
The remarkable changes in seismic properties associated with hydration of olivine and its transformation to antigorite have been invoked to explain the changes in orientation of S-wave polarization of the upper mantle between back arc and mantle wedge in subduction zones (Faccenda et al. 2008; Katayama et al. 2009; Kneller et al. 2008).
8 Future Directions
Although quantitative texture analysis has been formally available since the publication of H.-J. Bunge's classical book (Bunge 1969), many of the original concepts only applied to single-phase aggregates of metals. Extension of these methods was rapidly made to the lower crystal symmetry typical of rock-forming minerals and the lower sample symmetry corresponding to naturally deformed rocks. The relationship of neighboring crystal orientations, called misorientation, has also now been widely studied. However, most rocks are polymineralic or polyphase, and the extension of quantitative texture analysis to polyphase materials has been slow to develop because a universal mathematical framework is missing. A coherent framework will encompass misorientation between crystals of the same phase and between crystals of different phases. Future research in this area based on the mathematical framework of this chapter will provide a coherent and efficient theoretical and numerical methodology. Other future developments will include methods to quantify the statistical sampling of the orientation space of different types of data.

Fig. 3 Olivine CPO of the Ben Ismail and Mainprice (1998) database and the corresponding antigorite CPO calculated using the phase misorientation function described in the text. Horizontal black lines on the pole figures mark the foliation (XY) plane of the olivine aggregates, and the lineation (X) is East-West. Contours in multiples of a uniform distribution. Lower hemisphere equal area projection

Fig. 4 The calculated seismic properties of the olivine and antigorite polycrystals with pole figures shown in Fig. 3. Vp is the compressional wave velocity, AVs is the shear-wave splitting or birefringence anisotropy as a percentage as defined in the text, and Vs1 polarization is the vibration direction of the fastest S-wave. Horizontal black lines on the pole figures mark the foliation (XY) plane of the olivine aggregates, and the lineation (X) is East-West. Lower hemisphere equal area projection
9 Conclusions
Forty years after Bunge's pioneering Mathematische Methoden der Texturanalyse (Bunge 1969), which is most likely the single most influential textbook besides its English translation (Bunge 1982), this contribution to the Handbook of Geomathematics presents elements of mathematical texture analysis as part of mathematical tomography. The "fundamental relationship" of an orientation distribution and its corresponding "pole figures" was identified as a totally geodesic Radon transform on SO(3) or $S^3 \subset \mathbb{H}$. Being a Radon transform, pole figures are governed by an ultrahyperbolic or Darboux-type differential equation, the meaning of which was furiously denied at its first appearance. In fact, this differential equation opened a new dimension, and its general solution, both in terms of harmonics and characteristics, suggested a novel approach by radial basis functions, featuring a compromise of sufficiently good localization in spatial and frequency domains. Availability of fast Fourier methods for spheres and SO(3) was the necessary prerequisite to put the mathematics of texture analysis into practice, as provided by the free and open-source toolbox MTEX.
References
Abramson EH, Brown JM, Slutsky LJ, Zaug J (1997) The elastic constants of San Carlos olivine to 17 GPa. J Geophys Res 102:12253–12263
Altmann SL (1986) Rotations, quaternions and double groups. Clarendon, Oxford
Barruol G, Kern H (1996) P and S waves velocities and shear wave splitting in the lower crustal/upper mantle transition (Ivrea Zone). Experimental and calculated data. Phys Earth Planet Int 95:175–194
Ben Ismail W, Mainprice D (1998) An olivine fabric database: an overview of upper mantle fabrics and seismic anisotropy. Tectonophysics 296:145–157
Bernier JV, Miller MP, Boyce DE (2006) A novel optimization-based pole-figure inversion method: comparison with WIMV and maximum entropy methods. J Appl Cryst 39:697–713
Bernstein S, Schaeben H (2005) A one-dimensional radon transform on SO(3) and its application to texture goniometry. Math Methods Appl Sci 28:1269–1289
Bernstein S, Hielscher R, Schaeben H (2009) The generalized totally geodesic Radon transform and its application in texture analysis. Math Methods Appl Sci 32:379–394
Boudier F, Baronnet A, Mainprice D (2009) Serpentine mineral replacements of natural olivine and their seismic implications: oceanic lizardite versus subduction-related antigorite. J Pet. doi:10.1093/petrology/egp049
Bunge HJ (1965) Zur Darstellung allgemeiner Texturen. Z Metallk 56:872–874
Bunge HJ (1969) Mathematische Methoden der Texturanalyse. Akademie-Verlag, New York
Bunge HJ (1982) Texture analysis in materials science. Butterworths, Boston
Bunge HJ, Weiland H (1988) Orientation correlation in grain and phase boundaries. Textures Microstruct 7:231–263
Cowley JM (1995) Diffraction physics, 3rd edn. North-Holland personal library. North-Holland, Oxford
Crosson RS, Lin JW (1971) Voigt and Reuss prediction of anisotropic elasticity of dunite. J Geophys Res 76:570–578
Epanechnikov VA (1969) Nonparametric estimates of a multivariate probability density. Theor Probl Appl 14:153–158
Eshelby JD (1957) The determination of the elastic field of an ellipsoidal inclusion, and related problems. Proc R Soc Lond A 241:376–396
Faccenda M, Burlini L, Gerya T, Mainprice D (2008) Fault-induced seismic anisotropy by hydration in subducting oceanic plates. Nature 455:1097–1101
Fengler MJ, Freeden W, Gutting M (2006) The Spherical Bernstein Wavelet. Int J Pure Appl Math 31:209–230
Forsyth JB (1988) Single crystal diffractometry. In: Newport RJ, Rainford BD, Cywinski R (eds) Neutron scattering at a pulsed source. Adam Hilger, Bristol, pp 177–188
Friedel G (1913) Sur les symétries cristallines que peut révéler la diffraction des rayons Röntgen. C R Acad Sci 157:1533–1536
Ganster J, Geiss D (1985) Polycrystalline simple average of mechanical properties in the general (triclinic) case. Phys Stat Sol (B) 132:395–407
Gel'fand IM, Minlos RA, Shapiro ZYa (1963) Representations of the rotation and Lorentz groups and their application. Pergamon, Oxford
Gürlebeck K, Sprößig W (1997) Quaternionic and Clifford calculus for physicists and engineers. Wiley, New York
Hall P, Watson GS, Cabrera J (1987) Kernel density estimation with spherical data. Biometrika 74:751–762
Hammond C (1997) The basics of crystallography and diffraction. Oxford University Press, Oxford
Hanson AJ (2006) Visualizing quaternions. Morgan Kaufmann, San Francisco
Helgason S (1984) Groups and geometric analysis. Academic, New York/Orlando
Helgason S (1994) Geometric analysis on symmetric spaces. Mathematical surveys and monographs, vol 39. American Mathematical Society, New York/Orlando
Helgason S (1999) The Radon transform, 2nd edn. Birkhäuser Boston, Boston
Hielscher R (2007) The Radon transform on the rotation group – inversion and application to texture analysis. PhD thesis, TU Bergakademie Freiberg
Hielscher R, Schaeben H (2008a) A novel pole figure inversion method: specification of the MTEX algorithm. J Appl Cryst 41:1024–1037
Hielscher R, Schaeben H (2008b) MultiScale texture modeling. Math Geosci 40:63–82
Hielscher R, Potts D, Prestin J, Schaeben H, Schmalz M (2008) The Radon transform on SO(3): a Fourier slice theorem and numerical inversion. Inverse Probl 24:025011 (21p)
Hielscher R, Prestin J, Vollrath A (2010) Fast summation of functions on SO(3). Math Geosci 42:773–794
Hill R (1952) The elastic behaviour of a crystalline aggregate. Proc Phys Soc Lond Ser A 65:349–354
Hill R (1965) A self consistent mechanics of composite materials. J Mech Phys Solids 13:213–222
Johnson GC, Wenk HR (1986) Elastic properties of polycrystals with trigonal crystal and orthorhombic specimen symmetry. J Appl Phys 60:3868–3875
Katayama I, Hirauchi KI, Michibayashi K, Ando JI (2009) Trench-parallel anisotropy produced by serpentine deformation in the hydrated mantle wedge. Nature 461:1114–1118. doi:10.1038/nature08513
Kneller EA, Long MD, van Keken PE (2008) Olivine fabric transitions and shear wave anisotropy in the Ryukyu subduction system. Earth Planet Sci Lett 268:268–282
Kostelec PJ, Rockmore DN (2003) FFTs on the rotation group. Santa Fe institute working papers series paper, 03-11-060
Kreminski R (1997) Visualizing the Hopf fibration. Math Educ Res 6:9–14
Kuipers JB (1999) Quaternions and rotation sequences – a primer with applications to orbits, aerospace, and virtual reality. Princeton University Press, Princeton
Kunze K (1991) Zur quantitativen Texturanalyse von Gesteinen: Bestimmung, Interpretation und Simulation von Quarzgefügen. PhD thesis, RWTH Aachen
Kunze K, Schaeben H (2004) The Bingham distribution of rotations and its spherical Radon transform in texture analysis. Math Geol 36:917–943
Mainprice D (1990) A FORTRAN program to calculate seismic anisotropy from the lattice preferred orientation of minerals. Comput Geosci 16:385–393
Mainprice D, Humbert M (1994) Methods of calculating petrophysical properties from lattice preferred orientation data. Surv Geophys 15:575–592 (Special Issue Seismic properties of crustal and mantle rocks: laboratory measurements and theoretical calculations)
Mainprice D, Humbert M, Wagner F (1990) Phase transformations and inherited lattice preferred orientation: implications for seismic properties. Tectonophysics 180:213–228
Mainprice D, Tommasi A, Couvy H, Cordier P, Frost DJ (2005) Pressure sensitivity of olivine slip systems: implications for the interpretation of seismic anisotropy of the Earth's upper mantle. Nature 433:731–733
Mao HK, Shu J, Shen G, Hemley RJ, Li B, Singh AK (1998) Elasticity and rheology of iron above 220 GPa and the nature of the Earth's inner core. Nature 396:741–743
Matthies S (1979) On the reproducibility of the orientation distribution function of texture samples from pole figures (ghost phenomena). Phys Stat Sol (B) 92:K135–K138
Matthies S, Humbert M (1993) The realization of the concept of a geometric mean for calculating physical constants of polycrystalline materials. Phys Stat Sol (B) 177:K47–K50
Matthies S, Vinel GW, Helming K (1987) Standard distributions in texture analysis, vol I. Akademie Verlag, New York
Meister L, Schaeben H (2004) A concise quaternion geometry of rotations. Math Methods Appl Sci 28:101–126
Morawiec A (2004) Orientations and rotations. Springer, Berlin
Morris PR (2006) Polycrystal elastic constants for triclinic crystal and physical symmetry. J Appl Cryst 39:502–508. doi:10.1107/S0021889806016645
Muller J, Esling C, Bunge HJ (1981) An inversion formula expressing the texture function in terms of angular distribution function. J Phys 42:161–165
Nye JF (1957) Physical properties of crystals – their representation by tensors and matrices. Oxford University Press, Oxford
Nikiforov AF, Uvarov VB (1988) Special functions in mathematical physics. Birkhäuser Boston, Boston
Nikolayev DI, Schaeben H (1999) Characteristics of the ultrahyperbolic differential equation governing pole density functions. Inverse Probl 15:1603–1619
Pellenq RJM, Mainprice D, Ildefonse B, Devouard B, Baronnet A, Grauby O (2009) Atomistic calculations of the elastic properties of antigorite at upper mantle conditions: application to the seismic properties in subduction zones. EPSL submitted
Prior DJ, Mariani E, Wheeler J (2009) EBSD in the Earth Sciences: applications, common practice and challenges. In: Schwartz AJ, Kumar M, Adams BL, Field DP (eds) Electron backscatter diffraction in materials science. Springer, Berlin
Randle V, Engler O (2000) Texture analysis: macrotexture, microtexture, and orientation mapping. Gordon and Breach Science, New York
Raterron P, Merkel S (2009) In situ rheological measurements at extreme pressure and temperature using synchrotron X-ray diffraction and radiography. J Synchrotron Radiat 16:748–756
Reuss A (1929) Berechnung der Fließgrenze von Mischkristallen auf Grund der Plastizitätsbedingung für Einkristalle. Z Angew Math Mech 9:49–58
Roe RJ (1965) Description of crystallite orientation in polycrystal materials III. General solution to pole figure inversion. J Appl Phys 36:2024–2031
Rosenblatt M (1956) Remarks on some nonparametric estimates of a density function. Ann Math Stat 27:832–837
Sander B (1930) Gefügekunde der Gesteine mit besonderer Berücksichtigung der Tektonite. Springer, Berlin, p 352
Savyolova TI (1994) Inverse formulae for orientation distribution function. In: Bunge HJ (ed) Proceedings of the tenth international conference on textures of materials (Materials Science Forum 157–62), pp 419–421
Schaeben H (1982) Fabric-diagram contour precision and size of counting element related to sample size by approximation theory methods. Math Geol 14:205–216 [Erratum: Math Geol 15:579–580]
Schaeben H (1997) A simple standard orientation density function: the hyperspherical de la Vallée Poussin kernel. Phys Stat Sol (B) 200:367–376
Schaeben H (1999) The de la Vallée Poussin standard orientation density function. Textures Microstruct 33:365–373
Schaeben H, Sprößig W, van den Boogaart KG (2001) The spherical X-ray transform of texture goniometry. In: Brackx F, Chisholm JSR, Soucek V (eds) Clifford analysis and its applications. Proceedings of the NATO advanced research workshop Prague, 30 Oct–3 Nov, 2000, pp 283–291
Schaeben H, Hielscher R, Fundenberger J-J, Potts D, Prestin J (2007) Orientation density function-controlled pole probability density function measurements: automated adaptive control of texture goniometers. J Appl Cryst 40:570–579
Schwartz AJ, Kumar M, Adams BL (2000) Electron back scatter diffraction in materials science. Kluwer Academic, Dordrecht
Scott DW (1992) Multivariate density estimation – theory, practice, and visualization. Wiley, New York
Tommasi A, Mainprice D, Cordier P, Thoraval C, Couvy H (2004) Strain-induced seismic anisotropy of wadsleyite polycrystals: constraints on flow patterns in the mantle transition zone. J Geophys Res 109:B12405, 1–10
Vajk KM (1995) Spin space and the strange properties of rotations. MSc thesis, UC Santa Cruz
Van den Boogaart KG (2002) Statistics for Individual Crystallographic Orientation Measurements. PhD thesis, TU Bergakademie Freiberg
Van den Boogaart KG, Hielscher R, Prestin J, Schaeben H (2007) Kernel-based methods for inversion of the Radon transform on SO(3) and their applications to texture analysis. J Comput Appl Math 199:122–140
Van Houtte P (1980) A method for orientation distribution function analysis from incomplete pole figures normalized by an iterative method. Mater Sci Eng 43:7–11
Van Houtte P (1984) A new method for the determination of texture functions from incomplete pole figures – comparison with older methods. Textures Microstruct 6:137–162
Varshalovich D, Moskalev A, Khersonski V (1988) Quantum theory of angular momentum. World Scientific, Singapore
Vilenkin NJ (1968) Special functions and the theory of group representations. American Mathematical Society, Providence
Vilenkin NJ, Klimyk AU (1991) Representation of Lie groups and special functions, vol 1. Kluwer Academic, Dordrecht
Voigt W (1928) Lehrbuch der Kristallphysik. Teubner-Verlag, Leipzig
Vollrath A (2006) Fast Fourier transforms on the rotation group and applications. Diploma thesis, Universität zu Lübeck
Watson GS (1969) Density estimation by orthogonal series. Ann Math Stat 40:1496–1498
Watson GS (1983) Statistics on spheres. Wiley, New York
Wenk HR (1985) Preferred orientation in deformed metals and rocks: an introduction to modern texture analysis. Academic, New York
Zuo L, Xu J, Liang Z (1989) Average fourth-rank elastic tensors for textured polycrystalline aggregates without symmetry. J Appl Phys 66:2338–2341
Rayleigh Wave Dispersive Properties of a Vector Displacement as a Tool for P- and S-Wave Velocities Near Surface Profiling
Andrey Konkov, Andrey Lebedev, and Sergey Manakov
Contents
1 Introduction
2 Scientific Relevance
3 Key Issues
4 Experimental and Theoretical Results
4.1 Data Acquisition and Processing
4.2 Forward and Inverse Problem Solution
4.3 Results and Discussion
5 Future Directions
6 Conclusions and Acknowledgments
References
Abstract
This article presents a simultaneous analysis of the frequency dependence of the Rayleigh wave velocity and of the frequency dependence of the ratio of the amplitudes of the particle displacement projections at the ground surface. It is shown that this combined analysis makes it possible to evaluate both the shear wave velocity and the Poisson ratio profile in a horizontally layered medium. On the one hand, data inversion that includes the Poisson ratio profile reduces the ambiguity of the inverse problem solution. On the other hand, the Poisson ratio is a well-known indicator for remotely distinguishing the fluid saturation of rocks and soils as well as the nature of the bonds between structural elements. Therefore, the proposed seismic data inversion provides valuable information for practical applications. Experiments were carried out at different times on the same site. Their results support the use of the proposed method for remote in situ diagnostics of the degree of fluid saturation of porous media.

A. Konkov • A. Lebedev • S. Manakov
The Institute of Applied Physics of the Russian Academy of Sciences, Nizhny Novgorod, Russia
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_98
1 Introduction
In situ analysis of the surface layer is of interest for seismic engineering. The correlation between seismic and geological properties was established long ago (e.g., Gorjainov and Ljachowickij 1979), which enables the remote study of underground structures. The detection of very shallow underground structures and buried objects is of primary concern in many site investigations for civil and environmental engineering purposes. The use of ground-penetrating radar is limited in soil formations of moderate electrical conductivity. In such cases, seismic techniques become almost the only ones that can be applied. Some examples of mobile seismic equipment for resolving shallow underground structures, especially karst cavities, which represent a serious problem for the western part of Russia, were discussed in a review (Lebedev and Malekhanov 2003). The stability of constructions is largely dependent on the shear stiffness, the pore fluid saturation, and the lithology of near-surface geological structures. Therefore, the determination of the corresponding parameters is of practical importance. Shear stiffness depends mainly on the shear wave velocity, while shear strength is mainly related to fluid content and lithological structure, both of which are linked with the value of the Poisson ratio (Nikitin 1981). Empirical relations exist that link the shear modulus with the shear strength (Nikitin 1981). Although these relations give only an approximate strength value, they are very useful, especially for artificially homogenized soils used in civil engineering, because in this case the correlation between shear modulus and shear strength is well pronounced (Nikitin 1981). It is well known that static and dynamic moduli differ (Mavko et al. 2009). Nevertheless, seismic methods are the only ones available to determine the elastic properties of rocks and soils in natural conditions. The goal of the present paper is to describe new approaches to near-surface profiling based on the Rayleigh wave.
2 Scientific Relevance
A number of seismoacoustic methods exist for studying natural media in situ (Hatton et al. 1986; Sheriff and Geldart 1995; Yilmas 2001). Methods based on surface wave analysis are a particular case. Because a homogeneous elastic medium has no vertical scale, the Rayleigh wave on the boundary of a homogeneous elastic half-space is nondispersive (Landau et al. 1986). In layered media, vertical scales are defined by the depths of the layer boundaries, and as a result, the Rayleigh wave becomes dispersive, with a set of Rayleigh wave modes (Aki and Richards 1980). The appearance of the Rayleigh wave dispersion is mainly due to
Rayleigh Wave Dispersive Properties of a Vector Displacement as a Tool for P-. . .
Fig. 1 The qualitative dependence of the penetration depth of the Rayleigh wave fundamental mode on frequency
the dependence of shear wave velocity on depth. By varying the frequency, one can assess the medium parameters at different depths (see Fig. 1). As one can see, the high frequencies “account for” the shallowest depths, while the low ones penetrate to deeper layers. The SASW (Spectral Analysis of Surface Waves) method has a long history (Aki and Richards 1980). The method originated in the 1950s and became popular in the late 1980s with the development of high-powered computers and multichannel digital data acquisition systems (Stokoe et al. 1989). Nowadays, the multichannel version of SASW is called MASW, and its description can be found in several papers (e.g., Park et al. 1999 and the MASW-related site http://masw.com/). We do not distinguish between SASW and MASW below and use the abbreviation SASW for brevity. The main advantage of SASW is the simplicity of its realization. One needs at least two geophones and a source (e.g., impulsive) placed on the line connecting them at a distance sufficient for neglecting the near-field effects of the source in the frequency band of interest. In the simplest case, the analysis of the dependence of the cross-spectrum phase of the two received signals on frequency, $\varphi(\omega)$, allows one to determine the Rayleigh wave velocity as $C_R = \omega d/\varphi$, where $\omega$ is the cyclic frequency and $d$ is the distance between the geophones (Stokoe et al. 1989). A more complicated realization of the SASW method using arrays of receiving geophones requires gating and/or F-K filtering procedures to distinguish the Rayleigh wave contribution (Park et al. 1999). Descriptions of the corresponding procedures and examples of their application can be found in Hatton et al. (1986) and Yilmas (2001).
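As a minimal sketch of the two-geophone estimate described above, the following Python example recovers $C_R = \omega d/\varphi$ from the cross-spectrum phase of two synthetic sinusoidal records. The function names, the 20 Hz test tone, the 2 m spacing, and the 150 m/s velocity are illustrative assumptions rather than values from the measurements. The phase is taken without unwrapping, so the sketch is valid only while the travel-time lag between the geophones is shorter than half a period.

```python
import cmath
import math


def fourier_coefficient(signal, dt, freq_hz):
    """Single-frequency discrete Fourier coefficient of a sampled record."""
    omega = 2.0 * math.pi * freq_hz
    return sum(s * cmath.exp(-1j * omega * n * dt) for n, s in enumerate(signal))


def rayleigh_velocity(sig_near, sig_far, dt, freq_hz, spacing_m):
    """Estimate C_R = omega * d / phi from the cross-spectrum phase phi.

    Assumes a single plane Rayleigh wave and a phase lag below pi
    (no phase unwrapping is attempted in this sketch).
    """
    x_near = fourier_coefficient(sig_near, dt, freq_hz)
    x_far = fourier_coefficient(sig_far, dt, freq_hz)
    phase = cmath.phase(x_near * x_far.conjugate())  # phi(omega), radians
    return 2.0 * math.pi * freq_hz * spacing_m / phase


# Synthetic check with hypothetical values: 20 Hz tone, d = 2 m, C_R = 150 m/s.
dt, freq, d, c_true = 1e-3, 20.0, 2.0, 150.0
t = [n * dt for n in range(1000)]            # 1 s record, 20 full periods
near = [math.sin(2 * math.pi * freq * ti) for ti in t]
far = [math.sin(2 * math.pi * freq * (ti - d / c_true)) for ti in t]
c_est = rayleigh_velocity(near, far, dt, freq, d)
print(f"estimated C_R = {c_est:.1f} m/s")
```

Repeating the estimate over a set of frequencies would give a dispersion curve $C_R(\omega)$; real records would additionally need windowing and phase unwrapping, which are left out here.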
A. Konkov et al.
In addition to the ease of SASW realization, there are other benefits. First, the most commonly used vertical force sources spend nearly 60 % of their mechanical energy on exciting the Rayleigh wave (Miller and Pursey 1954), and about 95 % of the energy in subsurface layers is carried by the Rayleigh wave (Bondarev 2003). This fact is usually considered a hindrance (so-called ground roll (Aki and Richards 1980)) in commercial large-scale seismic exploration (Sheriff and Geldart 1995; Yilmas 2001), while it is a benefit in engineering seismics (Nikitin 1981). The exponential-like decay of the Rayleigh wave amplitude with depth allows one to “vary” the thickness of the probed layer simply by changing the frequency of analysis. Second, the amplitude of the Rayleigh wave, as of any surface wave, decays as the inverse square root of distance, so it attenuates less than body waves and provides a higher signal-to-noise ratio for geophones distant from the source. It is well known (e.g., Landau et al. 1986) that the Rayleigh wave velocity at the boundary of a homogeneous elastic half-space depends weakly on the Poisson ratio ($\nu$). Because of that, in SASW applications, the Poisson ratio is usually set to $\nu = 1/3$, which is typical for dry rocks, or close to $\nu = 0.5$ for mud rocks or sandy porous rocks saturated with water. A detailed review, including the state of the art in SASW, is presented in Maraschini (2008). This “standard” realization of the SASW method allows one to determine the shear wave profile and to evaluate the degree of consolidation via the shear strength of the medium. While these material properties are of great importance, one can conclude that the SASW method as it is used now provides no information about the Poisson ratio. On the other hand, rock physics points to the Poisson ratio as a clue for classifying rock lithology. Mineralogy affects rock velocities in two ways. 
The most obvious and direct way is through the increase of the bulk and shear moduli as mineral grains are pressed against each other by confining pressure. Indirectly, mineralogy controls the cementation, pore/crack distribution, and other structural properties of the rock (Winkler and Murphy 1995). One of the efficient ways to classify rocks acoustically is to use Pickett and Castagna’s empirical relations for the body wave velocity ratio $V_S/V_P$ (Mavko et al. 2009). Similar, although less accurate, expressions are known for soils (Nikitin 1981). Existing models of granular rocks are discussed in Mavko et al. (2009), providing a solid basis for better understanding the underlying physics. The value of the $V_S/V_P$ ratio depends directly on the Poisson ratio (Landau et al. 1986). From these considerations, the possibility of determining the value of $\nu$ directly from seismic data looks very attractive and promising. The “standard” SASW solution is ambiguous because neither the Poisson ratio nor the density is determined in the seismic data inversion; both are only predefined. In near-surface layers down to depths of several tens of meters, the density does not change significantly (Sheriff and Geldart 1995). Therefore, if one incorporates the Poisson ratio in the seismic inversion, the ambiguity is reduced. The Poisson ratio together with the shear velocity defines the compressional body wave velocity $V_P$. Therefore, including the Poisson ratio in the analysis is almost enough to provide an unambiguous seismic inversion. All the above considerations were the basic motivation for our work.
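The link between the $V_S/V_P$ ratio and the Poisson ratio used in this argument is the standard relation for an isotropic elastic medium, $\nu = (1 - 2r^2)/(2(1 - r^2))$ with $r = V_S/V_P$. The short Python sketch below evaluates it in both directions; the function names are hypothetical and chosen only for illustration.

```python
import math


def poisson_from_velocities(vp, vs):
    """Poisson ratio of an isotropic elastic medium from body wave velocities."""
    r2 = (vs / vp) ** 2
    return (1.0 - 2.0 * r2) / (2.0 * (1.0 - r2))


def vp_from_vs_and_poisson(vs, nu):
    """Compressional velocity implied by a shear velocity and a Poisson ratio."""
    return vs * math.sqrt(2.0 * (1.0 - nu) / (1.0 - 2.0 * nu))


for ratio in (0.1, 0.2, 0.5, 0.6):
    nu = poisson_from_velocities(1.0, ratio)
    print(f"V_S/V_P = {ratio:.1f} -> nu = {nu:.3f}")
```

The second function illustrates why fixing $\nu$ together with $V_S$ determines $V_P$: a shear velocity profile plus a Poisson ratio profile is equivalent to knowing both body wave velocities.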
In the “standard” realization of the SASW technique, only the dispersive properties of the Rayleigh wave are analyzed. No amplitude analysis of the ground particle displacement is performed. In a homogeneous elastic half-space, the ratio of the horizontal projection of the displacement amplitude to the vertical one at the surface ($U_x/U_z$) is a function of the Poisson ratio (White 1983). This ratio is a monotonic function of the Poisson ratio, increasing from $U_x/U_z = 0.54$ to $U_x/U_z = 0.78$ as the Poisson ratio decreases from $\nu = 0.5$ to $\nu = 0$. It can be assumed that the frequency dependence of this ratio could be used to assess the Poisson ratio profile in a vertically stratified medium.
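The half-space values quoted above can be checked numerically. The sketch below solves the classical Rayleigh equation $(2 - \xi)^2 = 4\sqrt{(1 - \xi)(1 - \gamma\xi)}$, with $\xi = (C_R/V_S)^2$ and $\gamma = (V_S/V_P)^2$, by bisection and then evaluates the surface ratio as $U_x/U_z = 2\sqrt{1 - \xi}/(2 - \xi)$. This closed form follows from the textbook half-space solution and is an assumption of this sketch, not a formula taken from the chapter; the function name is hypothetical.

```python
import math


def rayleigh_halfspace(nu):
    """Return (C_R / V_S, |U_x / U_z|) at the surface of a homogeneous half-space."""
    gamma = (1.0 - 2.0 * nu) / (2.0 * (1.0 - nu))   # (V_S / V_P)^2

    def secular(xi):                                 # xi = (C_R / V_S)^2
        return (2.0 - xi) ** 2 - 4.0 * math.sqrt((1.0 - xi) * (1.0 - gamma * xi))

    lo, hi = 0.01, 1.0          # secular < 0 at lo, > 0 at hi for 0 <= nu <= 0.5
    for _ in range(60):         # plain bisection
        mid = 0.5 * (lo + hi)
        if secular(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    xi = 0.5 * (lo + hi)
    ellipticity = 2.0 * math.sqrt(1.0 - xi) / (2.0 - xi)
    return math.sqrt(xi), ellipticity


for nu in (0.0, 0.25, 1.0 / 3.0, 0.5):
    velocity_ratio, ratio = rayleigh_halfspace(nu)
    print(f"nu = {nu:.2f}: C_R/V_S = {velocity_ratio:.3f}, U_x/U_z = {ratio:.3f}")
```

Over the whole range of $\nu$ the velocity ratio $C_R/V_S$ changes by less than 10 %, while $U_x/U_z$ changes by roughly 40 %, which is why the projection ratio is the more sensitive indicator of the Poisson ratio.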
3 Key Issues
A horizontally homogeneous layered medium was considered. During the research, it was found that the variation of the $U_x/U_z$ ratio with frequency can provide sufficient information about the vertical distribution of the Poisson ratio. This fact made it possible to modify the “standard” SASW method so as to restore the profile of the Poisson ratio with depth. The analysis of the dependence of the Rayleigh and body wave velocities on depth is of interest for assessing the friability of soil (Bachrach 1999). In an earlier experiment (Averbakh et al. 2008), a slow logarithmic relaxation in time, which was associated with the presence of metastable states, was observed. The presence and density of such states are apparently related to the porosity of the soil along with the capability of grains to perform mutual microscopic movements. Thus, the development of effective methods for in situ diagnostics of granular media makes it possible to address fundamental problems of the mechanics of granular materials. This study, in our opinion, is of both scientific interest (the assessment of the degree of consolidation of a granular medium and the degree of its saturation with liquid) and practical interest (the assessment of slope stability). In this paper, we propose a development of the SASW method based on the joint analysis of the frequency dependence of the Rayleigh wave velocity (dispersion characteristics) and the frequency dependence of the ratio between the horizontal and vertical projections of the displacement amplitudes. To our knowledge, the joint inversion of these two relations has not been performed before. It can be assumed that the lack of research in this field is related to the implementation complexity of the corresponding algorithms. 
Moreover, the analysis of the displacement projection ratio requires additional receiving equipment, while geophysicists solving applied problems usually aim to cover a broader area of exploration. To summarize, the goals of this paper are as follows:
(i) Development of the SASW method by considering the frequency dependence of the ratio between the horizontal and vertical projections of the displacement amplitudes in the Rayleigh wave
(ii) Reconstruction of the vertical profile of the Poisson ratio from seismic data obtained in field conditions
(iii) Assessment of the influence of weather conditions on the characteristics of the medium as a trial implementation of an in situ scheme for media diagnostics and monitoring
4 Experimental and Theoretical Results
4.1 Data Acquisition and Processing
Two experiments were carried out, in July 2009 and October 2011, at the same site: the geophysical test area of the Institute of Applied Physics, located about 30 km from Nizhny Novgorod. The near-surface sedimentary layers at this site are very typical for European Russia and consist of clayey sand, sandy clay, and clay, which give way to carbonate rock formations at deep horizons (the sediment thickness is about 100 m on average). All engineering communications and buildings are erected on such soils. For this reason, the site is very suitable for developing seismoacoustic methods for remote diagnostics in situ. Seismic waves were excited by an impulsive source in 2011 and by a vibrational source in 2009. A broadband source of electromagnetic type producing a vertical force directed downward (Fig. 2) was used as the vibrational source (Averbakh et al. 2008). The emitted signal with linear frequency modulation (chirp signal) in the frequency range of 50–500 Hz was generated by a computer-controlled digital-to-analog device and was applied to the source through conditioning electric circuits. In the case of the vibrational source, 100 snapshots were coherently accumulated and then averaged. The principal advantage of coherent sources is the ability to accumulate the signal in order to increase the probing depth and enhance the resolution at moderate and low levels of excitation (Averbakh et al. 2008; Lebedev and Malekhanov 2003). The vibrational source provided penetration of the Rayleigh wave only to small depths because its excitation is inefficient at low frequencies. The impulsive source was used in 2011 to excite more efficiently the low frequencies of 10–20 Hz, which were almost impossible to excite with the vibrational source because its resonant frequency is near 16 Hz. The impulsive source was realized as a set of 10 consecutive hammer blows delivered on a plate of area $S = 0.1$ m², without any accumulation (see below). 
According to the data obtained with the accelerometer mounted on the plate, its acceleration reached 5 km/s² with a very short duration of approximately 1 ms (black line in Fig. 3). The signal is rich in high frequencies that did not appear in the seismic waves and correspond rather to plate ringing. Within the seismic frequency band of 0–100 Hz, where the plate was rigid and not deformed, the acceleration amplitudes were about 300 m/s² (gray line in Fig. 3). With the known mass of the plate (21 kg), the pulse force of the source from a hammer blow is estimated as $F_0 \approx 6\cdot 10^4$ N.
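The benefit of the coherent accumulation used with the vibrational source can be illustrated with a small simulation: averaging $N$ repeated records of the same sweep leaves the signal unchanged and reduces uncorrelated noise by about $\sqrt{N}$. The sketch below assumes Gaussian noise, a 0.5 ms sampling interval, and a noise level of 5 (in units of the signal amplitude); all of these, as well as the function names, are illustrative assumptions and not parameters of the field experiment.

```python
import math
import random


def chirp(n_samples, dt, f_start, f_stop):
    """Linear frequency sweep from f_start to f_stop over the record length."""
    rate = (f_stop - f_start) / (n_samples * dt)
    return [math.sin(2.0 * math.pi * (f_start * n * dt + 0.5 * rate * (n * dt) ** 2))
            for n in range(n_samples)]


def stack(snapshots):
    """Coherent accumulation: sample-by-sample average of repeated records."""
    count = len(snapshots)
    return [sum(column) / count for column in zip(*snapshots)]


def rms(values):
    """Root-mean-square value of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


rng = random.Random(0)                        # fixed seed for reproducibility
signal = chirp(2048, 0.5e-3, 50.0, 500.0)     # 50-500 Hz sweep (assumed sampling)
noise_level = 5.0                             # hypothetical, well above the signal
snapshots = [[s + rng.gauss(0.0, noise_level) for s in signal] for _ in range(100)]

single_error = rms([a - s for a, s in zip(snapshots[0], signal)])
stacked_error = rms([a - s for a, s in zip(stack(snapshots), signal)])
print(f"noise RMS: single {single_error:.2f}, stacked {stacked_error:.2f}")
```

With 100 snapshots the residual noise drops by roughly a factor of ten. The averaging only helps when every excitation produces the same medium response, which is the reason it is applicable to a repeatable vibrational source.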
Fig. 2 Vertical vibrator used in the experiment of 2009
Fig. 3 Typical acceleration recording in the experiment of 2011. Negative values correspond to downward direction
The amplitude of the vertical velocity recorded by the nearest geophone was $|V_z| = 0.8$ mm/s, and the corresponding force exciting elastic waves was $F = 5{,}000$ N, which is about $1/10$ of the pulse force value estimated above. This points to significant plastic deformations in the area of excitation. A simple estimate of the applied stress gives $F_0/S \approx 0.6$ MPa, while the shear modulus measured
at the same site was 20 MPa (Averbakh et al. 2008). Therefore, the deformation in the area of excitation was about $0.03$ for the pulse source, which can definitely be considered very large. For the vibrational source, the deformation level was about three orders of magnitude lower, and the excited waves were almost linear, with only small nonlinear distortions in the area close to the vibrator. Because of the considerable plasticity in the area of impulse excitation, no averaging of snapshots could be performed, and each record was considered individually. The signal-to-noise ratio was much greater for impulse excitation than for vibrational excitation, and the frequency band of the excited waves was within 0–40 Hz with a maximum at 20 Hz. As a result, deeper penetration of the Rayleigh wave was observed in the experiment of 2011 than in the experiment of 2009. Measurements with the vibrational source were performed in July 2009. The weather was hot and dry, which is typical for July in western Russia. The upper soil layers were consolidated due to the evaporation of liquid from the pores. The Rayleigh wave velocity was expected to be about 150 m/s, as recorded previously at the same site for similar weather conditions and season (Averbakh et al. 2008, 2009). Experiments with the impulsive source were carried out in mid-October. At that time, the upper layers were saturated with moisture because of abundant precipitation, which is also typical for this season and geographical location. Therefore, by comparing the experimental results for different seasons, one can judge the possibility of remote in situ diagnostics of the fluid saturation of a porous medium. The experiment layout is shown in Fig. 4. The marked directions depict the measured velocity recorded by the linear array of vector receivers. The arrows on the left show the displacement vector directions recorded by the pairs of geophones. 
The signals were recorded by two digital multichannel engineering seismic stations “Lakkolit X-M2” (see www.geotechru.com for technical details) equipped with receiving geophones calibrated for use in the 10–500 Hz band. Vector reception was performed using geophones with vertical and longitudinal horizontal polarizations located pairwise. By changing the vibrator position, it was additionally verified that the traces were nearly homogeneous along the array of geophones. Due to the symmetry of excitation by the vibrator producing a vertical force, the horizontal transverse displacement was negligible. The total number of geophones was 48. The geophones were placed equidistantly (24 for each of the projections), thus forming a linear receiving array. The distance between pairs was $\Delta x = 1$ m in the experiment with the vibrational source and $\Delta x = 2$ m in the experiment with the low-frequency impulsive source. The corresponding recording times of one snapshot were 1,024 and 3,072 ms, respectively. The whole wave responses of the medium are shown in Fig. 5. These responses include not only hodographs corresponding to Rayleigh waves but also hodographs corresponding to the reflection from the boundary at 15 m depth (water-table horizon) and to the head wave refracted at the boundary at $6.3$ m depth. As expected, given the large depths of these boundaries and presumably weak seasonal perturbations, these hodographs almost coincide for the 2009 and 2011 experiments. The contributions of the Rayleigh waves (marked in Fig. 5) differ between the two experiments, which clearly points to seasonal changes in the soil layers above 6 m depth.
Fig. 4 Experiment layout. On the right side, there is a photo of a geophone pair (vector receiver)
Fig. 5 Comparison of whole wave responses in the experiments of 2009 (red) and 2011 (blue). The data for the $U_x$-projection are shown on the left and the data for the $U_z$-projection on the right
To extract the Rayleigh wave contribution from the entire seismic response, F-K filtering was used. This is a very effective procedure of seismic data processing (Hatton et al. 1986; Yilmas 2001) which transforms the time series at each offset of a seismic snapshot into the spatial ($k$-space) and temporal ($\omega$-space) frequency domains. The phase velocity is equal to $V = \omega/k$, where $k$ is the wave number. Because of that, F-K filtering is very suitable for analyzing the dispersive characteristics of waves. The Rayleigh wave contribution is marked on the F-K spectra depicted in Fig. 6. On the spectra obtained from the horizontal and vertical receivers, the characteristic lines corresponding to the Rayleigh wave were identified. Their slope determines the phase velocity. The projection ratio was calculated by dividing the spatiotemporal spectra at the points corresponding to the Rayleigh wave contribution. As in the case of a one-dimensional Fourier transform, a two-dimensional spectrum is subject to aliasing, an effect that causes different signals (in our case, the signals obtained from the receiver array) to become indistinguishable. This leads to a visible periodicity (white stripes with the same slopes) in Fig. 6. The effect of aliasing is due to the sampling of the received medium response in time and space (Hatton et al. 1986). It occurs when either the temporal or the spatial frequencies are above the corresponding Nyquist frequencies. These frequencies are expressed
Fig. 6 F-K spectra of the signal received from vertical (on the left) and horizontal (on the right) geophones (above – the case of impulsive source; below – the case of vibrational source)
in terms of the temporal ($\Delta t = 1$ ms) or spatial ($\Delta x$) sampling intervals: $\omega_N = \pi/\Delta t \simeq 3{,}142$ s$^{-1}$ and $k_N = \pi/\Delta x$. For the experiments conducted in 2009 and 2011, the spatial Nyquist frequencies are $k_N^{(2009)} \simeq 3.14$ m$^{-1}$ and $k_N^{(2011)} \simeq 1.57$ m$^{-1}$, respectively. The unambiguous frequencies in both spectra are $|\omega| \le \omega_N$ and $|k| \le k_N$. In both experiments, the data acquisition in the time–frequency domain satisfies the requirements of the Nyquist theorem, and the emitted frequencies were below the Nyquist one. The velocity of the Rayleigh wave was about $C_R \approx 80$–$220$ m/s (see below), depending on frequency (layer depth). It is easy to see that some wave numbers $k = \omega/C_R$ do not satisfy the Nyquist theorem, and aliasing occurred in the experiment of 2011: $\max(k) > k_N$. However, this is not a hindrance to obtaining unambiguous results.
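The Nyquist limits above follow directly from the sampling intervals, and the aliasing condition can be checked in a few lines. In the Python sketch below the function names are hypothetical, and the pairing of 40 Hz with the slowest quoted velocity of 80 m/s is an assumed worst case for the 2011 layout, not a measured point of the dispersion curve.

```python
import math


def nyquist_angular_frequency(dt):
    """Temporal Nyquist limit omega_N = pi / dt, in rad/s."""
    return math.pi / dt


def nyquist_wavenumber(dx):
    """Spatial Nyquist limit k_N = pi / dx, in rad/m."""
    return math.pi / dx


def is_spatially_aliased(freq_hz, phase_velocity, dx):
    """True when k = omega / C_R exceeds the spatial Nyquist wave number."""
    k = 2.0 * math.pi * freq_hz / phase_velocity
    return k > nyquist_wavenumber(dx)


print(round(nyquist_angular_frequency(1e-3)))          # dt = 1 ms
print(round(nyquist_wavenumber(1.0), 2), round(nyquist_wavenumber(2.0), 2))
# Assumed worst case for the 2 m spacing: 40 Hz at the slowest quoted velocity.
print(is_spatially_aliased(40.0, 80.0, 2.0))
```

An aliased Rayleigh branch reappears in the F-K plane shifted by $2k_N$, so a branch that is known to be continuous and to pass through the origin can still be followed across the wrap-around.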
4.2 Forward and Inverse Problem Solution
The standard way to calculate the Rayleigh wave velocity in a layered medium is described in Aki and Richards (1980). The propagator matrix can be defined in different
ways. We use the standard description of the displacement and stress fields via scalar and vector potentials. This way is straightforward, although not the best from an analytical point of view (see Aki and Richards 1980 for details). For a computer realization, the details of the description are not important. As the theory is well known and described in detail (Aki and Richards 1980), we restrict ourselves to a brief reminder of the calculation scheme. The measured physical values are assumed to be associated with wave motion. The displacement vector can be represented as a superposition of contributions from scalar and vector potentials. Due to the linearity of the problem and the wave nature of the displacement field, both potentials are governed by wave equations of Helmholtz type. The displacement vector and two components of the stress tensor must be continuous across every layer boundary. The medium beneath the deepest layer cannot contain waves coming from infinity, and all physical values have to remain bounded as the depth increases (Sommerfeld conditions or causality requirements). Finally, the boundary conditions at the free surface require all acting forces to be zero. These simple rules make it possible to set up the matrix equations for the wave amplitudes for each pair of parameters $(\omega, k)$, where $k$ is the projection of the wave vector onto the direction of wave propagation. In fact, $(\omega, k)$ are the same values as in the F-K filtering procedure, except in the near-field zone close to the source. The resulting bulky expressions are omitted here since they can be found in many sources (e.g., Aki and Richards 1980). They lead to matrix equations for the amplitudes of the potentials in each layer. Zeros of the determinant of this matrix correspond to surface waves (the Rayleigh wave fundamental and higher modes) as well as waveguide propagation modes (channel waves [ibid.]). Among these wave motions, only the Rayleigh fundamental mode has zero cutoff frequency and can propagate at arbitrarily low frequencies. 
Zeros of the matrix determinant were calculated starting from the lowest frequencies. As the frequency was increased, the value found at the previous step was used as the initial approximation for the next step, and so on. The analysis of the F-K spectra of the signals showed that higher modes appear at frequencies above 130 Hz, where their examination did not make sense because of the low signal-to-noise ratio. Thus, the measured dispersion characteristic corresponds to the Rayleigh wave fundamental mode. For each root of the determinant, the Rayleigh wave phase velocity and the displacement projection ratio were calculated as functions of depth. So, the measured parameters in our model were the Rayleigh wave phase velocity and the displacement projection ratio. The unknown parameters were the number of layers, their thicknesses, and the velocities of the pressure and shear waves (recall that the “standard” realization of the SASW method implies that the ratio between the body wave velocities is specified from a priori considerations). The medium parameters were retrieved by minimizing the root-mean-square deviation between the measured and calculated frequency dependencies of the Rayleigh wave velocity and the projection ratio. The expressions for the calculated values include not the density itself but the ratio between the densities of adjacent layers. During the investigation, it was found that the density is not a key parameter and, at least for the experimental data received,
a density variation within reasonable limits of 20 % perturbs the dispersion characteristic and the projection ratio by no more than 5 %. Therefore, the density could be specified as an arbitrary constant for all layers (a density ratio equal to one) without a significant effect on the final results. To find the parameters (perform the inversion), we used both the stochastic simulated annealing method for preliminary parameter estimation and a Newton gradient search for parameter adjustment. The simulated annealing algorithm was used to find an initial approximation for the gradient method. The annealing method does not require the calculation of derivatives, which increases the computational efficiency. The gradient search was applied when the parameter values were close to optimal (stable values over a long annealing time) or when the goal function surface became smooth. In the first phase of the optimization procedure, we additionally “froze” the interface depths in order to reduce the computation time. If a reasonable self-consistent result was obtained, a more accurate analysis was performed afterward, using the derived information as an initial approximation and taking all the parameters into account. All these tricks were used to perform the calculations within a reasonable time on standard desktop computers. We also faced a dilemma concerning the choice of the number of layers in the mathematical model. On the one hand, we had to increase it, because the parameterized model should represent the real structure as precisely as possible in order to match the calculated and measured frequency dependencies. On the other hand, to avoid difficulties caused by the possible ill-posedness of the problem, we had to decrease the number of layers. 
This is a well-known fact in the practice of inverse problems: because of noise (both in the measurements and from the inadequacy of the idealized model to the real world), increasing the level of detail does not lead to an accurate and stable solution (interesting geophysical examples can be found in Hatton et al. 1986). We resolved this problem in the following way: the number of layers was initially large, and during the solution of the inverse problem, it was decreased step by step by merging layers when the medium parameters found in adjacent layers became too close or a layer became too thin.
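The two-stage strategy described above (a derivative-free global search followed by a local refinement) can be sketched on a deliberately simplified problem. The example below is not the layered-medium inversion of this chapter: the forward model is a homogeneous half-space, the unknowns are only $V_S$ and $\nu$, the "data" are synthetic, and the local stage is a simple coordinate search with step halving used as a stand-in for the Newton gradient search. The search bounds, cooling schedule, step sizes, and all function names are assumptions made for illustration.

```python
import math
import random

BOUNDS = [(50.0, 300.0), (0.0, 0.49)]        # assumed search box: V_S (m/s), nu


def forward(vs, nu):
    """Toy forward model: (C_R, U_x/U_z) of a homogeneous half-space."""
    gamma = (1.0 - 2.0 * nu) / (2.0 * (1.0 - nu))
    lo, hi = 0.01, 1.0
    for _ in range(60):                       # bisection on the Rayleigh equation
        xi = 0.5 * (lo + hi)
        if (2.0 - xi) ** 2 < 4.0 * math.sqrt((1.0 - xi) * (1.0 - gamma * xi)):
            lo = xi
        else:
            hi = xi
    xi = 0.5 * (lo + hi)
    return vs * math.sqrt(xi), 2.0 * math.sqrt(1.0 - xi) / (2.0 - xi)


def misfit(params, data):
    """Root of the summed squared relative deviations of velocity and ratio."""
    c_model, r_model = forward(*params)
    return math.hypot((c_model - data[0]) / data[0], (r_model - data[1]) / data[1])


def clamp(params):
    """Keep a trial parameter vector inside the search box."""
    return [min(max(p, lo), hi) for p, (lo, hi) in zip(params, BOUNDS)]


def anneal(data, rng, steps=2000):
    """Derivative-free global stage: Metropolis acceptance, linear cooling."""
    current = [rng.uniform(lo, hi) for lo, hi in BOUNDS]
    cost = misfit(current, data)
    best, best_cost = current, cost
    for i in range(steps):
        temperature = 0.5 * (1.0 - i / steps) + 1e-4
        trial = clamp([p + rng.gauss(0.0, 0.1 * (hi - lo))
                       for p, (lo, hi) in zip(current, BOUNDS)])
        trial_cost = misfit(trial, data)
        if trial_cost < cost or rng.random() < math.exp((cost - trial_cost) / temperature):
            current, cost = trial, trial_cost
            if cost < best_cost:
                best, best_cost = current, cost
    return best


def refine(start, data, base_steps=(5.0, 0.02), tol=1e-7, max_passes=10000):
    """Local stage: coordinate search with step halving (stand-in for Newton)."""
    point, cost, scale = list(start), misfit(start, data), 1.0
    for _ in range(max_passes):
        if scale <= tol:
            break
        improved = False
        for i, step in enumerate(base_steps):
            for sign in (1.0, -1.0):
                trial = list(point)
                trial[i] += sign * scale * step
                trial = clamp(trial)
                trial_cost = misfit(trial, data)
                if trial_cost < cost:
                    point, cost, improved = trial, trial_cost, True
        if not improved:
            scale *= 0.5
    return point


observed = forward(105.0, 0.20)               # synthetic "measurement"
vs_fit, nu_fit = refine(anneal(observed, random.Random(1)), observed)
print(f"V_S = {vs_fit:.1f} m/s, nu = {nu_fit:.3f}")
```

Even in this toy setting the division of labor is visible: the projection ratio constrains $\nu$ almost independently of $V_S$, and the phase velocity then fixes $V_S$. For a layered model the same loop would run over frequency-dependent curves and a much larger parameter vector.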
4.3 Results and Discussion
Figure 7 depicts the measured and calculated frequency dependencies of the Rayleigh wave phase velocity ($C_R$) and the displacement projection ratio ($U_x/U_z$) in the case of the impulsive source. The signal obtained from the impulsive source had its maximal spectral power density in the vicinity of 20 Hz, with a significant weakening of the spectral components at frequencies above 45 Hz and below 5–10 Hz. Therefore, data analysis in these frequency bands was hardly possible. Frequencies below 15 Hz were also excluded from consideration because the geophones have a resonance at this frequency. At lower frequencies, their sensitivity decreases as the square of frequency, and below 5 Hz, it is further limited by the input filters of the seismic stations. The shaded gray areas show the measurement results for 10 consecutive hammer blows delivered on the plate.
Fig. 7 Frequency dependencies in the case of impulsive source. Solid red lines are consistent with the inverse problem solution
Figure 8 depicts the measured and calculated frequency dependencies of the Rayleigh wave phase velocity ($C_R$) and the displacement projection ratio ($U_x/U_z$) in the case of the vibrational source. The shaded gray area shows the scatter of the data when the Rayleigh wave velocity is analyzed separately for each of the displacement projections. The measured values of the ratio of the displacement projection amplitudes are marked with symbols on the right plot. The frequency band was 50–500 Hz (it was impossible to radiate lower frequencies because of the vibrator resonance in the vicinity of 20 Hz). The frequencies above 200–250 Hz were subject to strong attenuation, presumably due to scattering by local inhomogeneities of the medium (grass, shrubs, roots), and so were characterized by a low signal-to-noise ratio. Therefore, the dispersion characteristics for these frequencies are not given. The values of the projection ratio at frequencies above 125 Hz were also excluded from consideration because of the scatter caused by noise. Since the data displayed in Figs. 7 and 8 correspond to measurements made at different times, we have not combined them in one figure covering the frequency band 15–250 Hz. The solid red lines in Figs. 7 and 8 depict the values calculated for the medium parameters obtained from the inverse problem solution as described above. These parameters are listed in Table 1 (the corresponding profiles are depicted in Fig. 9). The Rayleigh wave penetration depth is approximately equal to half the wavelength. For the data received in 2011 (with the impulsive source), at the average (over all the layers) velocity of 147 m/s, the Rayleigh wave penetrates to depths of 1.5–5 m in the frequency band 15–45 Hz. For the data obtained in 2009 (with the vibrational source), the Rayleigh wave velocity is 124 m/s on average and the penetration depth is 0.25–1.2 m in the frequency band 50–250 Hz. 
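The depth ranges quoted above follow from the half-wavelength rule, $h \approx \lambda/2 = C_R/(2f)$. A minimal Python check using the average velocities and frequency bands stated in the text (the function name is hypothetical):

```python
def penetration_depth(velocity, freq_hz):
    """Half-wavelength rule of thumb: depth ~ lambda / 2 = C_R / (2 f)."""
    return velocity / (2.0 * freq_hz)


for label, velocity, band in (("2011, impulsive", 147.0, (15.0, 45.0)),
                              ("2009, vibrational", 124.0, (50.0, 250.0))):
    shallow = penetration_depth(velocity, band[1])   # highest frequency
    deep = penetration_depth(velocity, band[0])      # lowest frequency
    print(f"{label}: {shallow:.2f}-{deep:.2f} m")
```

Because a single average velocity is used for each experiment, the result is only an order-of-magnitude guide; in a dispersive medium each frequency has its own phase velocity.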
Rayleigh wave modes are one component of the whole wave solution for a layered medium, which also includes body waves, head waves, etc. (Aki and Richards 1980). The
Fig. 8 Frequency dependencies in the case of vibrational source. Solid red lines are consistent with the inverse problem solution
Table 1 Parameters of the layered medium resulting from the inversion of the data received in 2009 and 2011: the depth of the layer interface $z$, the compression and shear wave velocities in the layers ($V_P$ and $V_S$, respectively), and the corresponding Poisson ratio $\nu$

Data received in 2009 (vibrational source)
Layer No.   z (m)   V_P (m/s)   V_S (m/s)   ν
1           –       171         105         0.20
2           0.4     264         166         0.17

Data received in 2011 (impulsive source)
Layer No.   z (m)   V_P (m/s)   V_S (m/s)   ν
1           –       279         86          0.44
2           0.5     315         106         0.43
3           0.8     216         148         0.06
4           1.3     285         201         0
5           2.3     258         183         0
6           4.1     379         222         0.24
type of source used influences the mode amplitudes only, while the Rayleigh wave structure is maintained. Because of that, we can compare the two profiles although the source types were different in the two experiments. The Rayleigh wave penetration depths at the margins of the frequency bands are approximately the same in the experiments of 2009 and 2011. Due to this, we were able to distinguish medium changes caused by different weather conditions (fluid saturation). It is important to note the larger shear wave velocity (and so shear stiffness) and lower Poisson ratio in the experiment conducted in 2009 compared with the experiment of 2011. The experiment of 2009 was performed in hot and dry summer weather (early July). In this case, it can be assumed that the evaporation of fluid from the pore space led to the formation of capillary menisci and, as a consequence, the grains were strongly pressed together by capillary forces (e.g., Averbakh et al. 2010). The shear stiffness of the medium at depths down to 0.5–1 m increased, while the Poisson ratio decreased. The experiment of 2011 was
Fig. 9 Inversion results for experiments of 2011 (on the left) and 2009 (on the right). Shear and pressure wave velocity profiles are depicted with red and black lines, respectively. The Poisson ratio for each layer is expressed numerically
carried out on 11–12 October. At that time, the ground was saturated with rainwater, while there was no precipitation during the measurements. The clay particles composing the soil swell due to the intercalation of water between very thin clay lamellae and the creation of electric double layers (Sokolov 2000). Therefore, the bonds involving clay particles are weakened. The elastic moduli, especially the shear ones, should decrease. On the other hand, the saturation of the pore space with water leads to an increase in the bulk stiffness. Both factors are revealed in the data shown in Fig. 9. As a result, the P-wave velocity should increase, as should the Poisson ratio. In the first approximation, the seismic properties of soils with a small percentage of clay particles (e.g., sandy clay) can be described by rock physics theories (Gorjainov and Ljachowickij 1979). If one applies Gassmann’s theory (see Mavko et al. 2009 for details) to determine the Poisson ratio of the soil saturated with fluid, then the value obtained will be within $\nu \in [0.49, 0.499]$ for any reasonable porosity and other governing parameters. This result clearly does not correspond to the data in Table 1. This points to either partial saturation or a strong influence of clay–water interactions. The evaluation of clay–water interaction effects is very complicated and beyond the scope of this paper. Helpful information for comparing the results obtained (Table 1) with known ones can be found in Gorjainov and Ljachowickij (1979). According to this book, the body wave velocity ratio of dry soils with clay content is $V_S/V_P = 0.5$–$0.6$, while full saturation leads to $V_S/V_P = 0.1$–$0.2$. These values correspond to the Poisson ratio $\nu_{dry} = 0.22$–$0.33$ for dry soil and $\nu_{sat} = 0.48$–$0.49$ for fully saturated soil. Although these values are less than the prediction of Gassmann’s theory, they are still greater than those in Table 1. We presume that the disagreement is mainly due to partial saturation. 
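The Gassmann estimate mentioned above can be reproduced with a short calculation. The sketch below starts from a dry frame characterized by $V_S$ and $\nu_{dry}$, applies Gassmann's fluid substitution to the bulk modulus, keeps the shear modulus unchanged, and converts the result back to a Poisson ratio. The mineral modulus (36 GPa, quartz), the fluid modulus (2.25 GPa, water), the dry density (1800 kg/m³), and the porosities are assumed illustrative values, not parameters reported in this chapter; the dry-frame velocity and Poisson ratio echo the first 2009 layer of Table 1.

```python
def gassmann_poisson(vs_dry, nu_dry, density, porosity,
                     k_mineral=36e9, k_fluid=2.25e9):
    """Poisson ratio after fluid substitution with Gassmann's equation.

    The default moduli (quartz grains, water) are assumptions for illustration.
    The density change on saturation cancels out of the Poisson ratio.
    """
    shear = density * vs_dry ** 2                        # G, unchanged by the fluid
    k_dry = 2.0 * shear * (1.0 + nu_dry) / (3.0 * (1.0 - 2.0 * nu_dry))
    k_sat = k_dry + (1.0 - k_dry / k_mineral) ** 2 / (
        porosity / k_fluid + (1.0 - porosity) / k_mineral - k_dry / k_mineral ** 2)
    return (3.0 * k_sat - 2.0 * shear) / (2.0 * (3.0 * k_sat + shear))


for porosity in (0.2, 0.35, 0.5):
    nu_sat = gassmann_poisson(105.0, 0.20, 1800.0, porosity)
    print(f"porosity {porosity:.2f}: nu_sat = {nu_sat:.4f}")
```

For such a soft frame the saturated bulk modulus is controlled almost entirely by the pore fluid, so the predicted Poisson ratio stays very close to 0.5 whatever the porosity.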
Two reasons can be argued in favor of this presumption. First, it is known that fluid
Fig. 10 S-wave profile comparison obtained from experiments of 2009 (red line), 2011 (blue line), and 2006 (black line). Black line corresponding to the top layer is dashed because the upper boundary could not be determined precisely due to experiment’s specifics
saturation of soils with clay particles is a complicated process (Gorjainov and Ljachowickij 1979). At the initial stage of saturation, the bonds between grains are strengthened, while water is bound by the surface of the clay particles. Bond weakening was observed only at almost full saturation. Therefore, because of the bond strengthening, the Poisson ratio found (Table 1) may be smaller than expected. Second, the Poisson ratio values in layers No. 3–5 appear consistent with the case of dry granular media (Mavko et al. 2009), which points to possible downward drainage of water through these layers. The known lithological structure (see below) and the location of the water table at a depth of 15 m also agree with the drainage presumption. The site where the Rayleigh wave experiments were carried out is used by us for many other experimental investigations aimed at developing novel methods in geophysics. It is therefore useful to compare the profiles shown in Fig. 9 with the shear wave profile obtained earlier for the depth range of 1–15 m. The cross-well profiling experiment with an SH-wave source was performed in June 2006, and the weather conditions were almost the same as in the 2009 experiment described here. A very accurate phase method was used to obtain the S-wave velocity profile for low-contrast boundaries (details can be found in Averbakh et al. (2012)). All three S-wave velocity profiles are shown in Fig. 10. The shear wave velocity of the upper layer coincides with the shear wave velocity in the second layer of the 2009 profile. In addition, the same boundary at a depth of 4 m is observed in the experiments with SH-wave and impulsive sources. These facts support the reasonableness of the data obtained. One more interesting peculiarity can be distinguished in the S-wave velocity variation between the data of 2006 and 2011. The visible difference in data received
within the depths 1–4 m, together with the near coincidence of the data at a depth of 5 m, presumably points to water infiltration down to a depth of 4 m under the conditions of the 2011 experiment. According to geological data (core analysis) obtained while drilling a water supply well several hundred meters from the measurement site, the first 2–5 m of depth consist of clayey soil, which is replaced by sandy clay below 5 m. Clayey soil contains more clay particles, which can swell on water saturation, than sandy clay does. We presume that this was the reason for the observed changes, although more thorough analysis and experiments are required.
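The velocity ratios and Poisson ratios quoted in the discussion above are linked by the standard relation for an isotropic elastic medium, $\nu = (1 - 2r^2)/(2(1 - r^2))$ with $r = V_S/V_P$. The following Python sketch may help in checking such conversions; the function name `poisson_ratio` is an illustrative choice and not part of the authors' processing chain.

```python
def poisson_ratio(vs_over_vp: float) -> float:
    """Poisson ratio of an isotropic elastic medium from the ratio V_S / V_P.

    Uses nu = (1 - 2 r^2) / (2 (1 - r^2)) with r = V_S / V_P.
    The ratio must satisfy 0 <= r < 1 for the formula to be defined.
    """
    if not 0.0 <= vs_over_vp < 1.0:
        raise ValueError("V_S / V_P must lie in [0, 1)")
    r2 = vs_over_vp ** 2
    return (1.0 - 2.0 * r2) / (2.0 * (1.0 - r2))


if __name__ == "__main__":
    # Velocity ratios quoted in the text for dry and fully saturated clayey soils.
    for ratio in (0.5, 0.6, 0.1, 0.2):
        print(f"V_S/V_P = {ratio:.1f}  ->  nu = {poisson_ratio(ratio):.3f}")
```

Evaluating the function at the bounds of the quoted velocity ratio intervals gives a quick consistency check against the Poisson ratio ranges stated for dry and fully saturated soil.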
5 Future Directions
In the summer of 2012, field experiments were conducted on the same test area to determine how the Rayleigh wave characteristics vary during artificial saturation of the soil with water (43 l, about 11 gal, of water was poured gradually onto a control area of 1 m²). Preliminary results of the data processing show that the experimental setup with a coherent source allows evaluation (including quantitative evaluation) of the effect of water saturation on the medium characteristics. Thus, all the prerequisites appear to be in place for promising studies in this direction. They reveal new possibilities for instantaneous diagnostics and monitoring of natural media in situ, which can have both engineering and ecological value.
6 Conclusions and Acknowledgments
In summary, the following conclusions can be drawn. The proposed method has shown its efficacy. By combining the Rayleigh wave velocity dispersion with the frequency dependence of the ratio between the horizontal and vertical projections of the displacement vector, one can recover both the bulk and shear wave velocity profiles. This eliminates the ambiguity of the inversion in the well-known SASW method. Comparison with data obtained by other techniques supports the correctness of the Rayleigh wave data processing presented here. The changes found in the shear modulus and the Poisson ratio point to an accurate determination of the horizon where fluid infiltrates. The authors would like to thank their colleagues N.I. Vasilinenko for help in arranging the experiments and Dr. V.S. Averbach for invaluable discussions. The work was partly supported by RFBR grants No. 11-05-00774, 11-0201419, 13-05-97053, and 13-05-97061 and by the program of fundamental scientific investigations “Coherent acoustic fields and signals.”
References
Aki K, Richards P (1980) Quantitative seismology, theory and methods. W.H. Freeman, San Francisco
Averbakh V, Lebedev A, Maryshev A, Talanov V (2008) The diagnostics of unconsolidated media acoustic properties in the field conditions. Acoust Phys 54(4):526–537
Averbakh V, Lebedev A, Maryshev A, Talanov V (2009) Observation of slow dynamics effects in nonconsolidated media under in situ conditions. Acoust Phys 55(2):211–217
Averbakh V, Bredikhin V, Lebedev A, Manakov S (2010) Acoustic spectroscopy of fluid saturation effects in carbonate rock. Acoust Phys 56(6):794–806
Averbakh V, Lebedev A, Manakov S, Talanov V (2012) Phase method of cross-well profiling using coherent SH-waves. Acoust Phys 58(5):596–602
Bachrach R (1999) High resolution shallow seismic subsurface characterization. PhD thesis, Department of Geophysics, Stanford University
Bondarev V (2003) Seismic survey basis. UGGGA, Yekaterinburg
Gorjainov N, Ljachowickij F (1979) Seismic methods in engineering geology. Nedra, Moscow
Hatton L, Worthington MH, Makin J (1986) Seismic data processing: theory and practice. Blackwell Scientific, London
Landau LD, Lifshitz EM (1986) Theory of elasticity. Butterworth-Heinemann, Oxford
Lebedev A, Malekhanov A (2003) Coherent seismoacoustics. Radiophys Quantum Electron 46:523–538
Maraschini M (2008) A new approach for the inversion of Rayleigh and Scholte waves in site characterization. PhD thesis, Torino Polytechnic University
Mavko G, Mukerji T, Dvorkin J (2009) The rock physics handbook: tools for seismic analysis in porous media. Cambridge University Press, Cambridge
Miller G, Pursey H (1954) The field and radiation impedance of mechanical radiators on the free surface of a semi-infinite isotropic solid. Proc R Soc (Lond) A223:521–541
Nikitin V (1981) Engineering seismic basics. Moscow State University, Moscow
Park C, Miller R, Xia J (1999) Multichannel analysis of surface waves. Geophysics 64(3):800–808
Sheriff R, Geldart L (1995) Exploration seismology. Cambridge University Press, New York
Sokolov V (2000) Clays and their properties. Soros Educ J 6(9):59–65
Stokoe K, Rix G, Nazarian S (1989) In situ seismic testing of surface waves. In: Proceedings of 12th international conference on soil mechanics and foundation engineering, vol 1, Rio de Janeiro, pp 331–334
White J (1983) Underground sound, application of seismic waves. Elsevier, New York
Winkler K, Murphy W (1995) Acoustic velocity and attenuation in porous rocks. In: A handbook of physical constants, vol 3. American Geophysical Union, Washington, DC, pp 20–34
Yilmaz O (2001) Seismic data analysis. Society of Exploration Geophysicists, Tulsa
Simulation of Land Management Effects on Soil N2O Emissions Using a Coupled Hydrology-Biogeochemistry Model on the Landscape Scale
Martin Wlotzka, Vincent Heuveline, Steffen Klatt, Edwin Haas, David Kraus, Klaus Butterbach-Bahl, Philipp Kraft, and Lutz Breuer
Contents
1 Introduction
2 Models
2.1 Biogeochemistry
2.2 Hydrology
3 Model Coupling
3.1 Consecutive Operator Splitting
M. Wlotzka (✉)
University of Heidelberg, Interdisciplinary Center for Scientific Computing, Engineering Mathematics and Computing Lab, Heidelberg, Germany
V. Heuveline
Engineering Mathematics and Computing Lab (EMCL), Karlsruhe Institute of Technology, Karlsruhe, Germany
Institute for Applied and Numerical Mathematics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
University of Heidelberg, Interdisciplinary Center for Scientific Computing, Engineering Mathematics and Computing Lab, Heidelberg, Germany
S. Klatt • E. Haas • D. Kraus • K. Butterbach-Bahl
Karlsruhe Institute of Technology (KIT), Institute of Meteorology and Climate Research, Garmisch-Partenkirchen, Germany
P. Kraft • L. Breuer
Institute of Landscape Ecology and Resources Management, Justus-Liebig-University of Giessen, Giessen, Germany
© Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_86
3.2 Simulation with Python
3.3 Concurrent Operator Splitting
3.4 Simulation with OpenPALM
4 Application
5 Conclusion
References
Abstract
Agricultural soils are the primary anthropogenic source of atmospheric N2O. Greenhouse gas (GHG) emissions from soils are mainly the result of microbial processes such as nitrification/denitrification. These processes depend strongly on environmental factors like temperature, moisture, soil and vegetation properties, or the land management. Therefore, emissions occur with a high spatial and temporal variability, giving rise to hot spots and hot moments. Quantifying sources and sinks of GHG like CO2, N2O, and CH4 for natural, agricultural, and forest ecosystems is crucial for our understanding of impacts of land management on the biosphere-atmosphere exchange of GHG and for the development of mitigation options. GHG exchange from soils is driven by complex microbial and plant nutrient turnover processes, and it is the net result of all physicochemical and biological processes involved in production, consumption, and transport. Process-oriented biogeochemical models are useful tools for integrating our knowledge of the key processes and drivers to estimate carbon and nitrogen (C and N) trace gas emissions from soils. In this study we have coupled the LandscapeDNDC ecosystem model to the CMF (Catchment Modeling Framework) hydrology model, generating a modeling system capable of assessing C and N cycling and its feedbacks to crop growth and microbial processes on the landscape scale. The coupling approach, which uses the parallel MPI-based OpenPALM coupler, enables the simulation of lateral exchange of nutrients (nitrate) with the soil water fluxes and thereby the assessment of C and N cycling on the landscape scale. We describe the coupling approach and present simulation results of crop growth, nutrient cycling, and resulting nitrous oxide emissions on a virtual landscape.
1 Introduction
Agriculture contributes considerably to global greenhouse gas emissions. The agricultural sector is mainly a source of the greenhouse gases methane (CH4) and nitrous oxide (N2O). Besides the production of methane in livestock and manure, both CH4 and N2O volatilize from agricultural soils. Microbial processes in the soil generate these gases. For example, CH4 levels can result from methanogenesis and methanotrophy in wetlands or rice paddies (Cicerone and Shetter 1981; Wassmann et al. 1993), and N2O can result from nitrification and denitrification in arable soils or grasslands (Firestone and Davidson 1989). The microbial processes depend strongly on environmental factors like temperature, moisture, soil and vegetation properties, and the anthropogenic land management. Thus, greenhouse
gas emissions from soils exhibit a high degree of temporal and spatial variability (Butterbach-Bahl et al. 2004b; Li et al. 2005; Del Grosso et al. 2005, 2010; Blagodatsky et al. 2011). A reliable measurement-based estimate of the regional greenhouse gas source strength of soils is often not available. Obtaining one might require a combination of bottom-up and top-down approaches, such as chamber measurements, eddy covariance flux measurements, remote sensing, and tall-tower measurements of greenhouse gas fluxes (Schulze et al. 2010). Many countries apply the IPCC emission factor methodology (IPCC 2006; IPCC 2007) for calculating and reporting the national greenhouse gas sink and source strength of soils (Dämmgen and Grünhage 2002; Luttich et al. 2007; Mander et al. 2010). This methodology assumes a linear relation between the intensity of activities like cultivation of crops or application of fertilizer and the emissions resulting from these activities. The emission factor approach has shortcomings, since it cannot account for temporal and spatial variations. It can hardly cover emission hot spots due to soil or climatic conditions or hot moments due to nitrogen fertilizer application. Moreover, the emission factor methodology does not allow the development of region- and site-specific mitigation strategies. To characterize spatial and temporal patterns of ecosystem greenhouse gas exchange, more detailed approaches are needed (Butterbach-Bahl et al. 2004a; Del Grosso et al. 2005; Li et al. 2005; Smith et al. 2008). Recent studies (Chatskikh et al. 2005; Werner et al. 2007; Beheydt et al. 2007; Blagodatsky et al. 2011; Chirinda et al. 2011; Haas et al. 2012) have shown that process-based biogeochemical models allow the simulation of soil greenhouse gas emissions for a range of ecosystem types. These models predominantly target the site scale.
They have been used for calculating regional and national N-trace gas emission inventories, but generating, handling, and assessing input and output data, as well as the simulation runtime, remain major obstacles. For that reason some models, e.g., ECOSSE (Smith et al. 2010), use different levels of process description and input information when running in site or regional mode. Models which have been designed for regional to continental scale applications, like the dynamic vegetation models LPJmL (Zaehle et al. 2005; Bondeau et al. 2007), ORCHIDEE-STICS (de Noblet-Ducoudre et al. 2004; Gervois et al. 2008), or JULES (Van den Hoof et al. 2011), mostly focus on ecosystem C dynamics over decades up to centuries. For performance reasons these models use a low level of detail in the process descriptions. Some models even lack N-cycle simulations. To our knowledge, all existing models which have been used for calculating regional or national greenhouse gas inventories so far are one-dimensional. They neglect lateral matter exchange with adjacent simulation units, which can be driven by topographical differences. However, lateral fluxes may significantly affect the carbon, nitrogen, and water cycles of the simulated ecosystem. For example, the biosphere-atmosphere exchange in riparian zones depends largely on water and nutrient input from the surrounding landscape (Haas et al. 2012). Up to now, water and nutrient fluxes have been considered either by complementing existing hydrological models with mostly simple biogeochemical features (Pohlert et al. 2007a,b; Kemanian et al. 2011) or by offline coupling of biogeochemical models with hydrological transport models (Cui et al. 2005).
In this article, we show that a two-way coupling of biogeochemical and hydrological models allows consistent simulations with a high level of detail. We use LandscapeDNDC (Haas et al. 2012) from the denitrification-decomposition (DNDC) model family in combination with a process-based plant growth model for simulating the carbon and nitrogen turnover in the soil. LandscapeDNDC has been coupled with the Catchment Modeling Framework (CMF) (Kraft et al. 2011) for simulating the water flow and the transport of nutrients in the saturated and unsaturated zone. We construct the coupled model prototype by compiling both models into libraries and making them accessible via a coupling program in Python. The lab-scale Python prototype is used to prove the consistency of the coupling approach. For large-scale application of the coupled modeling system, we use the parallel software coupler OpenPALM. This model coupling approach enables us to perform consistent simulations of landscape scale carbon and nitrogen cycling, including lateral water and nutrient transport and feedbacks to the biogeochemistry and plant physiology. Using high performance computing resources, we aim at simulations from the landscape to the regional scale in space and over decades in time.
2 Models
Process-oriented biogeochemical models are useful tools for integrating our knowledge of the key processes and drivers to estimate carbon and nitrogen (C and N) trace gas emissions from soils. GHG exchange from soils is driven by complex microbial and plant nutrient turnover processes, and it is the net result of all physicochemical and biological processes involved in production, consumption, and transport.
2.1 Biogeochemistry
In the following we will outline the most important processes, which directly influence nitrous oxide (N2O) emissions, namely, microbiological nitrification and denitrification and diffusive gas transport. Nitrification is an aerobic process and thus bound to oxygen contained within the soil matrix. Denitrification, in contrast, is an anaerobic process, which is strongly inhibited in the presence of even low concentrations of oxygen (Groffman et al. 2009). Both processes depend on the so-called anaerobic volume fraction $f_{av}$, which is calculated in dependency on oxygen concentration. Let $\Omega \subset \mathbb{R}$ be a one-dimensional vertical soil column, then the anaerobic volume fraction is defined as
\[ f_{av} : \Omega \to [0,1], \qquad f_{av}(x) := e^{-\alpha \sqrt{c_{O_2}}}, \]
with $\alpha > 0$ being an appropriate scaling parameter. This definition is used for nitrification as well as denitrification rate calculations. Nitrification encompasses two steps, firstly turning ammonium (NH$_4^+$) to nitrite (NO$_2^-$) and secondly turning nitrite to nitrate (NO$_3^-$). Nitrous oxide is produced as by-product of the first step depending on temperature ($T$), water content ($\theta$), pH, and nitrification rate:
\[ \partial_t c_{N_2O} = K_{N_2O}\, f_{tm,N_2O}(T,\theta)\, f_{pH,N_2O}\, \partial_t c_{NO_2}. \]
The nitrification rate is given by
\[ \partial_t c_{NO_2} = K_{NO_2}\, \frac{2\, c_{am}(T,\theta)}{\dfrac{1}{f_{pH,NO_2}} + \dfrac{1}{f_{NH_4}}}\, f_{av}. \]
KN2 O , KNO2 , and KNH4 are process-specific constants, which can be partly derived from experimental measurements. Following the concept of microbial activity after Blagodatsky and Richter (1998) the term cam .T; / represents the current active part of total microbial biomass (cm ): cam .T; / WD cm ftm;C .T; /: The microbial activity ftm;C .T; / again depends on temperature and water content. The function ftm;C W R RC 0 ! Œ0; 1 is given by the harmonic mean of factors accounting for temperature and water content, respectively. The water content dependency is modeled by a Weibull distribution. The terms fpH W RC 0 ! Œ0; 1 and fNH4 W RC 0 ! Œ0; 1 are reduction factors accounting for unfavorable pH and ammonium concentration. In contrast to nitrification as nitrate production process, denitrification describes the stepwise reduction of nitrate to molecular nitrogen (N2 ) via nitrite, nitric oxide (NO), and nitrous oxide. Description of denitrification was proposed in Leffelaar and Wessel (1988). Let D fNO 3 ; NO2 ; NO; N2 Og be the set of the concentrations of reducible N compounds . The terms @t c denote consumption rates of the next higher oxidized N compound. The turnover rate of each N compound 2 is given by P 1 0 C 2 B P @t c D cam .T; / @ fC fO2 fpH; @t c ; C m P A Y K C 2
2
where $\mu$, $Y$, and $m$ are constants representing microbial growth rate, yield, and maintenance respiration for each N compound, respectively. As for nitrification there is for each step of denitrification a specific pH reduction given by $f_{pH,N} : \mathbb{R}_0^+ \to [0,1]$. Moreover, denitrification is limited by carbon availability $f_C : \mathbb{R}_0^+ \to [0,1]$. Oxygen consumption occurs due to heterotrophic and autotrophic respiration
while oxygen supply is determined by soil diffusion, which in turn strongly depends on soil water content. The diffusion for volatile compounds 2 , is calculated using the Fickian ansatz: @t c D @ ŒD
;e ./@
c
with D ;e ./ being the species typical diffusion coefficient D reduced by the effective diffusion coefficient De ./ depending on water content, porosity n, and temperature T : D
;e ./ WD D
De ./ D D
.1 /1 n2
T T0
3
The processes described above can be written in a closed form as
\[ \partial_t c = f_{bgc}(\theta, c, t). \tag{1} \]
LandscapeDNDC
LandscapeDNDC (Haas et al. 2012) is a newly developed ecosystem model including granular functionalities for simulating biogeochemical carbon and nitrogen (C and N) cycling, plant growth, and the water cycle on the site and regional scale. It belongs to the process-based biogeochemical models, which simulate ecosystem functioning on the basis of the underlying plant physiological, soil microbial, and physicochemical processes. The model enables the combined simulation of different ecosystems on different temporal and spatial scales. All calculations are structured in a modular form representing different ecosystem components and functionalities, like plant growth, water dynamics, microclimate, soil biogeochemistry, and microbiology. Land use management for agricultural and forest ecosystems, such as fertilization, tillage, harvest, thinning, and others, can also be simulated. For the simulation of forest ecosystems, PNET-N-DNDC (Stange et al. 2000), an advanced DNDC version for forest ecosystems, has been implemented and integrated into LandscapeDNDC. This includes the PNET-N-DNDCtropica for tropical forest systems from Kiese et al. (2005) and Werner et al. (2007). For modeling arable and grassland ecosystems, the DNDC functionalities regarding crop growth processes and agricultural management activities are included in LandscapeDNDC. In contrast to the different versions of the DNDC model (agricultural DNDC, Forest-DNDC, Wetland-DNDC), LandscapeDNDC is built upon one generalized soil biogeochemical process description that is applied to the different ecosystems (arable, grassland, forest). The model uses meteorological data (e.g., maximum and minimum air temperatures, precipitation, radiation) as well as management data (e.g., seeding/harvesting, tillage, fertilizer application) with a time resolution of at least a day as driving input.
Furthermore, information about soil and vegetation properties (e.g., texture, pH, crop types) serves as initialization parameters to calculate daily rates of plant N uptake, litter production, mineralization, nitrification, denitrification, and others.
Discretization
LandscapeDNDC enables the simulation of ecosystem processes on the regional scale incorporating many site-scale simulations (kernels) each representing any of the supported ecosystems. However, no spatial relationships are associated with kernels, i.e., a LandscapeDNDC simulation region is a set of independent single sites. Each site may therefore take the shape of an arbitrary polygon within an unstructured grid. In addition to the longitude-latitude grid, soil horizons are discretized according to available soil strata properties (e.g., bulk density, porosity) for various depths $d_l$ $(l = 1, \dots, L)$. Requiring $d_{l-1} < d_l$ and choosing $d_0 = 0$ yield stratum heights $H_l = d_l - d_{l-1}$. Finally, soil layer heights $h_k$ are given by
\[ h_{\eta(l-1)+1}, \dots, h_{\eta(l)} = \frac{H_l}{\kappa_l} \qquad \text{with} \qquad \eta(l) = \sum_{j=1}^{l} \kappa_j, \]
where $\kappa_l \in \mathbb{N}$ is an optional discretization parameter defaulting to 1. The total number of soil layers is hence equivalent to $\eta(L)$. Note that soil layers in the same stratum have initially identical properties. For the proposed modeling setup, all kernels use the same soil layer discretization.
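The layer construction described above can be illustrated with a short Python sketch. It assumes that strata are given by their lower boundary depths and that each stratum is split into an integer number of equally thick layers (defaulting to 1); the function and argument names are hypothetical and do not reflect the LandscapeDNDC source code.

```python
from typing import List, Optional, Sequence


def soil_layer_heights(
    depths: Sequence[float],
    subdivisions: Optional[Sequence[int]] = None,
) -> List[float]:
    """Return the heights of all soil layers, ordered from top to bottom.

    depths       -- strictly increasing lower boundaries d_1 < ... < d_L (d_0 = 0)
    subdivisions -- number of equal layers per stratum (assumed default: 1 each)
    """
    if subdivisions is None:
        subdivisions = [1] * len(depths)
    if len(subdivisions) != len(depths):
        raise ValueError("need one subdivision count per stratum")

    heights: List[float] = []
    previous_depth = 0.0  # d_0 = 0
    for depth, count in zip(depths, subdivisions):
        if depth <= previous_depth:
            raise ValueError("depths must be strictly increasing and positive")
        if count < 1:
            raise ValueError("subdivision counts must be positive integers")
        stratum_height = depth - previous_depth  # H_l = d_l - d_{l-1}
        heights.extend([stratum_height / count] * count)
        previous_depth = depth
    return heights


if __name__ == "__main__":
    # Three strata ending at 0.2 m, 0.5 m and 1.0 m, split into 2, 1 and 5 layers.
    print(soil_layer_heights([0.2, 0.5, 1.0], [2, 1, 5]))
```

The total number of layers returned equals the sum of the subdivision counts, and the layer heights sum to the depth of the deepest stratum.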
Communication
Communication and neighborhood relations among kernels, e.g., to address lateral transport of nutrients between adjacent kernels, are not handled by LandscapeDNDC but superimposed by external transport models by means of code coupling. An example using the hydrological model framework CMF is the subject of this study.
Implementation
Due to the design concept of treating kernels as independent subtasks, LandscapeDNDC can store them in a simple array-type data structure without any additional information. Simulations are then carried out by iterating over the list of kernels and running each one for a single time step. It is worth pointing out that having synchronized kernels is essential for the proposed coupling scheme to work. For small numbers of kernels, a nearly linear speedup can be achieved by using multi-core processing units. LandscapeDNDC uses OpenMP to execute kernels in parallel on a single multi-core CPU. For large simulations, input data and kernel states require substantial amounts of memory and input data refresh time. Data refreshes occur periodically after an input buffer’s content has been completely consumed. At this point input data is read from disk into each kernel’s input data buffers. Prominent examples of streamed input data are driving forces like climate. Both of these issues can be alleviated by parallelizing LandscapeDNDC and deploying it on multiprocessor systems or clusters. LandscapeDNDC supports MPI parallelization and scales perfectly for all numbers of nodes up to the total number of kernels. Every node processes a partition of the set of kernels. Such a domain decomposition is imposed by the selection strategy for kernels used by
all LandscapeDNDC instances. For example, a simple round-robin scheme would select for node $r$ from the full domain all kernels $K$ where
\[ K \bmod P = r \qquad (P \in \mathbb{N} \text{ is the number of processing units}) \]
is satisfied. Here, kernels are gaplessly numbered starting at 0.
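The round-robin selection rule can be sketched in a few lines of Python. This is only an illustration of the partitioning idea under the stated numbering convention; the function name `kernels_for_node` is hypothetical, and no MPI calls are involved.

```python
from typing import List


def kernels_for_node(rank: int, num_nodes: int, num_kernels: int) -> List[int]:
    """Indices of the kernels assigned to node `rank` by round-robin selection.

    Kernels are numbered 0 .. num_kernels - 1; node `rank` receives every
    kernel K with K mod num_nodes == rank.
    """
    if num_nodes < 1:
        raise ValueError("at least one processing unit is required")
    if not 0 <= rank < num_nodes:
        raise ValueError("rank must lie in 0 .. num_nodes - 1")
    # Stepping by num_nodes from `rank` visits exactly the indices with
    # K mod num_nodes == rank.
    return list(range(rank, num_kernels, num_nodes))


if __name__ == "__main__":
    for r in range(3):
        print(r, kernels_for_node(r, num_nodes=3, num_kernels=8))
```

Each kernel is assigned to exactly one node, and the partition sizes differ by at most one, which keeps the load balanced when all kernels have similar cost.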
2.2 Hydrology
Water plays an important role in the emission of trace gases. Soil moisture can form a barrier for gas exchange between soil air and atmospheric air. It promotes the formation of temporary anaerobic zones in the soil. Furthermore, water is the most important and effective transport medium for reactive nitrogen in the soil, both vertically and laterally. The formation of trace gas emission hot spots and hot moments often depends on the water quantity and quality (Groffman et al. 2009). Complex, process-based biogeochemical models often include submodels for water transport in order to simulate the water-filled pore space dynamically in time and the transport of nutrients (Li et al. 1992; Parton et al. 1994; Haas et al. 2012). However, water flow is limited to a one-dimensional, vertical domain to reduce complexity in these models. Regional coverage is maintained by sequentially calculating a large number of soil columns without any interaction. In reality, however, subsurface water flow and solute transport are nonlinear three-dimensional processes.
Water Flow in Porous Media and Transport of a Dissolved Substance
Soil is a porous medium from the mathematical point of view. Porous media can be defined as a portion of space with the following properties:
• The space is occupied by a number of phases. A phase is defined as that portion of the space which is occupied by a material with uniform properties and which is separated from other materials by a well-defined interface.
• At least one of the phases is solid.
• The phases are distributed throughout the whole space.
The solid phase of the porous media of interest in this work is the soil. The voids or open spaces between the particles of the soil are referred to as pores. The pore space is occupied by water and air, which are the two other phases. Let the soil occupy a domain $\Omega \subset \mathbb{R}^3$. An important property of such a water-bearing formation is the porosity
\[ n : \Omega \to [0,1], \qquad n(x) := \lim_{\mathrm{vol}(V) \to 0} \frac{\text{volume of the pores in } V}{\mathrm{vol}(V)}, \]
where $V \subset \Omega$ contains the point $x \in \Omega$ and $\mathrm{vol}(V)$ is the volume of $V$. This definition is subject to the continuum paradox: on the one hand, the limit is necessary to allow the description of phenomena at a point by means of infinitesimal calculus; on the other hand, the volume $V$ should contain a meaningful ensemble average over pores of many different sizes. The volumetric water content of a soil can be defined in a similar way, namely,
\[ \theta : \Omega \times (0,T) \to [0,1], \qquad \theta(x,t) := \lim_{\mathrm{vol}(V) \to 0} \frac{\text{volume of water in } V \text{ at time } t}{\mathrm{vol}(V)}, \]
where $(0,T)$ with some $T > 0$ is the time interval under consideration. Clearly, $\theta \le n$ since the maximum water content is achieved when the pores are entirely filled. This state is called full saturation and then $\theta = n$ holds by definition. In general, when water enters the soil, a certain amount of entrapped air might reside in dead-end pores, which cannot be displaced by water. Then the maximum water content is denoted $\theta_s$, which is called the state of satiation. However, the two terms are sometimes used interchangeably in the literature (Brutsaert 2005). The residual water content $\theta_r$ consists of the moisture in dead-end pores or otherwise so strongly held that it is unavailable for flow. It is convenient to define the effective saturation or wetness as
\[ \theta_e := \frac{\theta - \theta_r}{\theta_s - \theta_r}. \]
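As a small illustration of the effective saturation just defined, the following Python sketch maps a volumetric water content between the residual and the satiated value onto the unit interval. The function name and the numerical values in the usage lines are illustrative assumptions, not taken from the models discussed here.

```python
def effective_saturation(theta: float, theta_r: float, theta_s: float) -> float:
    """Effective saturation (wetness): (theta - theta_r) / (theta_s - theta_r).

    theta   -- volumetric water content
    theta_r -- residual water content
    theta_s -- water content at satiation, must exceed theta_r
    """
    if not theta_r < theta_s:
        raise ValueError("theta_s must be larger than theta_r")
    return (theta - theta_r) / (theta_s - theta_r)


if __name__ == "__main__":
    # Hypothetical soil: residual content 0.05, satiated content 0.45.
    for theta in (0.05, 0.25, 0.45):
        print(theta, effective_saturation(theta, theta_r=0.05, theta_s=0.45))
```

The result is 0 at the residual water content and 1 at satiation, so conductivity parameterizations can be written in terms of a dimensionless wetness.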
Conservation of Mass
We give a brief idea how the conservation of mass leads to the basic equations which govern the water flow in porous media and the transport of a dissolved substance. The mass $m_w$ of the amount of water, which resides in a control volume $V \subset \Omega$ of the porous media, is given by
\[ m_w = \int_V \rho\, \theta \, dv, \]
where $\rho$ denotes the density of the water. Similarly, the mass $m_s$ of the solute which is dissolved in the water contained in $V$ is given by
\[ m_s = \int_V \theta\, c \, dv, \]
where $c : \Omega \times (0,T) \to \mathbb{R}$ denotes the concentration of the solute. The physical principle of conservation of mass gives
d d mw D dt dt d d ms D dt dt
Z
Z dv D
Z
V
@t ./ C r .u/ dv D 0; Z
V
c dv D
Z @t .c/ C r .cu/ dv D
V
V
dv; V
where u W ˝ .0; T / ! R3 denotes the velocity field of the water and W ˝ .0; T / ! R represents any sources or sinks of the substance. The equalities follow from the Reynolds transport theorem (Aris 1989). We denote by q D u the volumetric water flux, i.e., the volume of water crossing unit area perpendicular to the direction of flow in unit time. The equations above hold for any control volume V ˝ at any time t 2 .0; T /, and therefore lead to the continuity equations @t C r q D 0 @t .c/ C r .cq/ D
(water flow);
(2a)
(transport):
(2b)
The water density has dropped out of Eq. (2a) since it is convenient to assume it being constant in hydrology.
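The step from the integral balances to Eqs. (2a) and (2b) can be spelled out as follows. This is only a sketch: it assumes a constant density $\rho$, as stated in the text, and integrands smooth enough that an integral vanishing over every control volume forces the integrand itself to vanish. The symbols are read as $\theta$ (water content), $u$ (velocity), $q = \theta u$ (volumetric flux), and $\sigma$ (source term).

\[
\begin{aligned}
0 &= \int_V \partial_t(\rho\,\theta) + \nabla \cdot (\rho\,\theta\, u) \, dv
   = \rho \int_V \partial_t \theta + \nabla \cdot q \, dv
   \quad \text{for all } V \subset \Omega
   &&\Longrightarrow\quad \partial_t \theta + \nabla \cdot q = 0, \\
0 &= \int_V \partial_t(\theta\, c) + \nabla \cdot (c\, \theta\, u) - \sigma \, dv
   \quad \text{for all } V \subset \Omega
   &&\Longrightarrow\quad \partial_t(\theta\, c) + \nabla \cdot (c\, q) = \sigma.
\end{aligned}
\]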
Darcy’s Law for Saturated Flow in Porous Media

Let $p : \Omega \times (0,T) \to \mathbb{R}$ denote the water pressure in the soil. The pressure $p$ can be expressed with an equivalent height $\psi$ of a water column as a hydrostatic pressure $p = \rho g \psi$, where $g$ denotes the gravitational acceleration. The quantity $\psi$ is also called pressure head. This leads to the definition of the hydraulic head $h : \Omega \times (0,T) \to \mathbb{R}$,
\[ h := \psi + z, \]
where $z$ is the coordinate on the vertical axis relative to some reference height. In an 1856 report on the public fountains and water supply for the city of Dijon (France), Henry Darcy presented the results of his experiments on the seepage of water through a pipe filled with sand. He found that the rate of flow through a sand layer was directly proportional to the cross-sectional area of the sand column and to the difference of the hydraulic head across the layer and inversely proportional to the length of the sand column (Brutsaert 2005; Simmons 2008). More precisely, Darcy’s law for the saturated flow of water in porous media can be stated as
\[ q = -K \nabla h, \tag{3} \]
where $K$ denotes the proportionality factor known as hydraulic conductivity and $q$ is the volumetric flux.
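Darcy's proportionalities can be checked with a few lines of Python. The following is a minimal sketch for a homogeneous one-dimensional column, reading Darcy's law with the usual minus sign (flow from high to low head); the function names and the numerical values of the sand column are hypothetical and chosen only for illustration.

```python
def darcy_flux(K, h_upstream, h_downstream, length):
    """Volumetric flux q = -K * dh/dx along a 1-D column (units follow K)."""
    if length <= 0:
        raise ValueError("length must be positive")
    grad_h = (h_downstream - h_upstream) / length  # finite-difference head gradient
    return -K * grad_h


def discharge(K, h_upstream, h_downstream, length, area):
    """Flow rate Q = q * A through a cross-section of the given area."""
    return darcy_flux(K, h_upstream, h_downstream, length) * area


# Hypothetical sand column: K = 1e-4 m/s, 0.5 m long, 0.2 m head drop, 0.1 m^2 area.
q = darcy_flux(K=1e-4, h_upstream=1.2, h_downstream=1.0, length=0.5)
Q = discharge(K=1e-4, h_upstream=1.2, h_downstream=1.0, length=0.5, area=0.1)
print(f"q = {q:.2e} m/s, Q = {Q:.2e} m^3/s")
```

Doubling the column length in this sketch halves the discharge, and doubling the cross-sectional area doubles it, which mirrors the proportionalities reported by Darcy.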
Richards Equation for Unsaturated Flow in Porous Media

Buckingham postulated in 1907 that Darcy’s law is also valid for a soil which is only partly saturated (Narasimhan 2005). In this case the hydraulic conductivity is a function of the water content, $K = K(\theta)$ (Brutsaert 2005). For a medium in the fully saturated state $\theta = \theta_s$, the saturated conductivity is denoted $K_s := K(\theta_s)$. The relative conductivity is defined as
\[ K_r := \frac{K}{K_s}. \]
The functional relation between the conductivity and the water content is a characteristic property of the medium. Two well-established models of $K_r$ as a function of $\theta_e$ are given as follows:
1. Brooks and Corey (1964) developed the parameterization
\[ K_r^{BC}(\theta_e) := \theta_e^{\beta}, \tag{4} \]
which is based on the material-specific constant $\beta$. $K_r^{BC}$ is called Brooks-Corey retention curve.
2. Based on Mualem’s model of soil water retention, Mualem (1976) and Van Genuchten (1980) developed the parameterization
\[ K_r^{GM}(\theta_e) := \sqrt{\theta_e}\, \left[ 1 - \left( 1 - \theta_e^{1/m} \right)^m \right]^2, \tag{5} \]
where the material-specific parameter $m \in (0,1)$ must be derived from experiments. $K_r^{GM}$ is called Van Genuchten-Mualem retention curve.

Substituting $q$ according to Darcy’s law (3) in the continuity equation (2a) yields
\[ \partial_t \theta - \nabla \cdot \left( K(\theta) \nabla h \right) = 0, \tag{6} \]
which is known as the Richards equation (Richards 1931; Brutsaert 2005).
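The two parameterizations of the relative conductivity can be evaluated directly. The sketch below assumes the forms $K_r^{BC}(\theta_e) = \theta_e^{\beta}$ and $K_r^{GM}(\theta_e) = \sqrt{\theta_e}\,[1 - (1 - \theta_e^{1/m})^m]^2$; the exponent name `beta`, the function names, the clipping of $\theta_e$ to $[0,1]$, and all numerical values are assumptions made for illustration, not calibrated soil data.

```python
import math


def effective_saturation(theta, theta_r, theta_s):
    """Wetness theta_e = (theta - theta_r) / (theta_s - theta_r)."""
    if not theta_r < theta_s:
        raise ValueError("theta_r must be smaller than theta_s")
    theta_e = (theta - theta_r) / (theta_s - theta_r)
    return min(1.0, max(0.0, theta_e))  # clipping is an added safeguard


def kr_brooks_corey(theta_e, beta):
    """Power-law relative conductivity with a material-specific exponent."""
    return theta_e ** beta


def kr_van_genuchten_mualem(theta_e, m):
    """Van Genuchten-Mualem relative conductivity for m in (0, 1)."""
    if not 0.0 < m < 1.0:
        raise ValueError("m must lie in (0, 1)")
    inner = 1.0 - (1.0 - theta_e ** (1.0 / m)) ** m
    return math.sqrt(theta_e) * inner ** 2


# Hypothetical soil: theta_r = 0.05, theta_s = 0.45, saturated conductivity 1e-5 m/s.
K_s = 1e-5
for theta in (0.10, 0.25, 0.45):
    te = effective_saturation(theta, theta_r=0.05, theta_s=0.45)
    K = K_s * kr_van_genuchten_mualem(te, m=0.5)
    print(f"theta = {theta:.2f}  theta_e = {te:.3f}  K = {K:.3e} m/s")
```

Both curves return 0 for a dry medium ($\theta_e = 0$) and 1 at satiation ($\theta_e = 1$), so that $K = K_s K_r$ recovers the saturated conductivity.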
Boundary Conditions

Let $\Gamma_s \subset \partial\Omega$ denote the upward part of the boundary representing the soil surface. The boundary conditions on $\Gamma_s$ reflect meteorological and environmental factors like precipitation and evaporation. Precipitation rates can be taken from recorded measurement data or from weather forecasts. Evaporation rates are calculated on the basis of recorded or forecasted soil and air temperature, solar radiation, and the wetness of the near-surface soil. The surface boundary condition for Eq. (2a) reads
\[ q = q^{in} - q^{out}(\theta) \quad \text{on } \Gamma_s, \]
where $q^{in}$ represents precipitation inflow and $q^{out}$ denotes the outflow caused by evaporation. The surface boundary condition for Eq. (2b) reads
\[ c\, q = c_{in}\, q^{in} \quad \text{on } \Gamma_s. \]
Here, $c_{in}$ denotes the concentration of the substance in the precipitated water. The neighborhood of the lowest point on the boundary, where water flows out of the area of study, is denoted by $\Gamma_{out}$. We impose a Dirichlet boundary condition on the hydraulic head,
\[ h = h_{out} \quad \text{on } \Gamma_{out}. \]
The value of $h_{out}$ is chosen such that it causes an outflow due to a pressure difference. The other parts of the boundary are treated as solid walls through
\[ q = 0 \quad \text{on } \partial\Omega \setminus (\Gamma_s \cup \Gamma_{out}). \]
Finite Volume Discretization

We give a short description of a finite volume discretization of the flow and transport problem. Equations (2a) and (2b) are integrated over a control volume $V \subset \Omega$, and the divergence theorem (Heuser 2002) is applied, yielding
\[ \int_V \partial_t \theta \, dv + \int_{\partial V} q \cdot n \, ds = 0, \tag{7a} \]
\[ \int_V \partial_t(\theta\, c) \, dv + \int_{\partial V} c\, q \cdot n \, ds = \int_V \sigma \, dv, \tag{7b} \]
where $\partial V$ denotes the boundary of $V$ and $n$ denotes the outer unit normal field on $\partial V$. Let the computational domain $\Omega$ be covered by a mesh $\Omega_h = \bigcup_{i=1}^{N} C_i$ of polyhedral, nonoverlapping cells. The water content in cell $C_i$ is defined as
\[ w_i := \int_{C_i} \theta \, dv \quad (i = 1, \dots, N), \]
and the solute content is
\[ s_i := \int_{C_i} \theta\, c \, dv \quad (i = 1, \dots, N). \]
Clearly, the average concentration of the solute in cell $C_i$ is $c_i = s_i / w_i$.
Taking the control volume as any of the mesh cells, the boundary integrals in Eqs. (7a) and (7b) turn into sums of fluxes to adjacent cells:
\[ \int_{\partial C_i} q \cdot n \, ds = \sum_{j \in \mathcal{N}_i} q_{ij} A_{ij} \quad (i = 1, \dots, N), \]
\[ \int_{\partial C_i} c\, q \cdot n \, ds = \sum_{j \in \mathcal{N}_i} c_{ij}\, q_{ij} A_{ij} \quad (i = 1, \dots, N), \]
where $\mathcal{N}_i$ is the index set of the adjacent cells $C_j$ sharing an interface of area $A_{ij}$ with cell $C_i$. Thus, a system of ordinary differential equations for the water and solute content of the cells results:
\[ \dot{w}_i + \sum_{j \in \mathcal{N}_i} q_{ij} A_{ij} = 0 \quad (i = 1, \dots, N), \tag{8a} \]
\[ \dot{s}_i + \sum_{j \in \mathcal{N}_i} c_{ij}\, q_{ij} A_{ij} = \bar{\sigma}_i \quad (i = 1, \dots, N), \tag{8b} \]
where $\bar{\sigma}_i = \int_{C_i} \sigma \, dv$. The boundary conditions can easily be incorporated in this notation. In general, the flux $q_{ij} = q(w_i, w_j)$ from cell $C_i$ to $C_j$ is a function of the water content. For the Richards equation (6), the fluxes are approximated as
\[ q_{ij} = \sqrt{K(w_i) K(w_j)} \; \frac{h_i - h_j}{\| x_i - x_j \|_2}, \tag{9} \]
where $K(w_i)$ is the conductivity, $h_i$ is the hydraulic head of cell $C_i$, and $x_i$ is the coordinate vector of some reference point of cell $C_i$. Here, the gradient of the hydraulic head is approximated by a finite difference, and the conductivity is taken as the geometric mean of the two cells. In Eq. (8b), the value of $c_{ij}$ is taken as
\[ c_{ij} = \begin{cases} c_i, & \text{if } q_{ij} > 0, \\ c_j, & \text{if } q_{ij} < 0, \\ 0, & \text{else}, \end{cases} \tag{10} \]
which is the average concentration of the substance in the cell from which the flux originates. For the sake of simplicity, we subsume the system of ordinary differential equations (8) in the notation
\[ \partial_t (w, s) = f_{hyd}(w, s, t). \tag{11} \]
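The semidiscrete system can be sketched in a few lines of Python. The function below evaluates the right-hand side of Eqs. (8a) and (8b) with the flux approximation (9) and the upwind choice (10). It is an illustration only, not the CMF implementation: the data layout (a list of interfaces `(i, j, area)`, each listed once) and the callables `conductivity` and `head` are hypothetical, and the toy closure relations in the usage part are chosen purely to make the example run.

```python
import math


def f_hyd(w, s, connections, x, conductivity, head, sigma=None):
    """Right-hand side of Eqs. (8a)-(8b): returns (dw/dt, ds/dt) as lists."""
    n = len(w)
    dw = [0.0] * n
    ds = list(sigma) if sigma is not None else [0.0] * n
    for i, j, area in connections:  # each interface is listed once
        dist = math.dist(x[i], x[j])
        k_ij = math.sqrt(conductivity(w[i]) * conductivity(w[j]))  # geometric mean
        q_ij = k_ij * (head(i, w[i]) - head(j, w[j])) / dist       # Eq. (9)
        if q_ij > 0:            # upwind concentration, Eq. (10)
            c_ij = s[i] / w[i]
        elif q_ij < 0:
            c_ij = s[j] / w[j]
        else:
            c_ij = 0.0
        water = q_ij * area
        solute = c_ij * water
        dw[i] -= water          # outflow from cell i ...
        dw[j] += water          # ... is inflow to cell j, since q_ji = -q_ij
        ds[i] -= solute
        ds[j] += solute
    return dw, ds


# Two hypothetical cells 10 m apart; toy closures K(w) = 1e-5 * w and h_i = w_i.
w = [0.4, 0.2]
s = [0.04, 0.0]
x = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
dw, ds = f_hyd(w, s, [(0, 1, 1.0)], x,
               conductivity=lambda wi: 1e-5 * wi,
               head=lambda i, wi: wi)
print(dw, ds)
```

Because every interface flux is subtracted from one cell and added to its neighbor, the sketch conserves water and solute exactly in the absence of sources and boundary fluxes.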
Catchment Modelling Framework

The Catchment Modelling Framework (CMF) (Kraft et al. 2011) is a C++ library for creating hydrological simulation models. It offers a variety of classes and functions which represent the ingredients of a finite volume discretization as proposed by Qu and Duffy (2007). Hydrological models are set up as a network using node and connection objects. The nodes of the network are the cells of $\Omega_h$. They have the hydrological meaning of a water storage keeping track of the water content. They are equipped with material-specific constants, boundary conditions, and a position in the geometry. The connection objects represent the edges of the network between adjacent cells. They contain the definition of the flux approximation. We use the Richards approximation (9) between adjacent cells and evaporation and rainfall on the soil surface. CMF also provides conceptual equations for water transport for large-scale applications as well as flux connections for surface water, which are not shown in this paper.

For modeling the transport of dissolved substances, CMF attaches a solute storage object to each water storage. The solute storages keep track of the solute content in the cells. Both storage types provide methods to calculate the time derivative of their state. For each cell $C_i \in \Omega_h$, the corresponding water storage computes $\dot{w}_i$, and the attached solute storage computes $\dot{s}_i$ by iterating over the flux connections and evaluating the contributions using (9) and (10). Thus, the CMF model calculates the right-hand-side function $f_{hyd}$ of Eq. (11). This function evaluation is parallelized with OpenMP for shared memory machines. For the time integration, CMF provides a number of well-known ODE solvers, ranging from a naive explicit Euler and a classical Runge-Kutta-Fehlberg method to the implicit, multistep, and error-controlled CVODE solver by Hindmarsh et al. (2005), which we use for this study.
3 Model Coupling

The space-discretized models for the biogeochemical processes resulting from (1) and the water flow and transport of dissolved substances (11) form the coupled system of ordinary differential equations:
\[ \partial_t (w, s) = f_{hyd}(w, s, t) \quad \text{in } [0,T] \quad \text{(water flow and solute transport)}, \tag{12a} \]
\[ \partial_t s = f_{bgc}(w, s, t) \quad \text{in } [0,T] \quad \text{(biogeochemical processes)}, \tag{12b} \]
\[ w(0) = w_0, \tag{12c} \]
\[ s(0) = s_0. \tag{12d} \]
We propose two time-stepping schemes for solving the coupled system. Both schemes employ an operator splitting such that the global time steps of the coupled system are composed of local time steps of the individual models. The time interval $[0,T]$ is divided into a sequence of discrete time steps
\[ 0 = t_0 < t_1 < \dots < t_{n_{max}} = T. \]
3.1 Consecutive Operator Splitting

A simple global time-stepping scheme uses the consecutive operator splitting described in Algorithm 2. In each cycle of the time loop, the hydrological model is solved for the current time interval (line 3). This gives the water state $w_{n+1}$ for the next time step and an intermediate result $s^{(h)}_{n+1}$ for the solute state. These states are taken as input for the biogeochemical model. It is subsequently solved for the same time interval (line 4), yielding the solute state $s_{n+1}$ for the next time step.

Algorithm 2 Consecutive operator splitting
1: Set initial values $w_0$, $s_0$, $n = 0$.
2: while $n < n_{max}$ do
3:   Solve $\partial_t (w, s) = f_{hyd}(w, s, t)$ in $[t_n, t_{n+1}]$, $w(t_n) = w_n$, $s(t_n) = s_n$ to obtain $w_{n+1}$, $s^{(h)}_{n+1}$.
4:   Solve $\partial_t s = f_{bgc}(w_{n+1}, s, t)$ in $[t_n, t_{n+1}]$, $s(t_n) = s^{(h)}_{n+1}$ to obtain $s_{n+1}$.
5:   $n \leftarrow n + 1$
6: end while
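The structure of Algorithm 2 can be sketched in plain Python. The sketch below is not the coupling application described in this chapter: an explicit Euler integrator with fixed sub-steps stands in for the local solvers (the study itself uses CVODE for the hydrological model), and the two toy right-hand sides are hypothetical placeholders for $f_{hyd}$ and $f_{bgc}$.

```python
def euler(f, y, t0, t1, substeps=10):
    """Integrate dy/dt = f(y, t) from t0 to t1 with explicit Euler sub-steps."""
    dt = (t1 - t0) / substeps
    t = t0
    for _ in range(substeps):
        y = [yi + dt * fi for yi, fi in zip(y, f(y, t))]
        t += dt
    return y


def consecutive_splitting(f_hyd, f_bgc, w0, s0, times):
    """Algorithm 2: hydrology first, then biogeochemistry on the same interval."""
    w, s = list(w0), list(s0)
    nw = len(w)
    for t0, t1 in zip(times[:-1], times[1:]):
        # Line 3: the hydrological model advances (w, s) jointly.
        def rhs_h(y, t):
            dw, ds = f_hyd(y[:nw], y[nw:], t)
            return dw + ds
        y = euler(rhs_h, w + s, t0, t1)
        w_next, s_h = y[:nw], y[nw:]
        # Line 4: biogeochemistry starts from s_h and sees the new water state.
        s = euler(lambda s_, t: f_bgc(w_next, s_, t), s_h, t0, t1)
        w = w_next
    return w, s


def toy_hyd(w, s, t):
    """Hypothetical hydrology: water and solute drain at a rate of 0.1 per unit time."""
    return [-0.1 * wi for wi in w], [-0.1 * si for si in s]


def toy_bgc(w, s, t):
    """Hypothetical biogeochemistry: solute production proportional to water content."""
    return [0.05 * wi for wi in w]


w_end, s_end = consecutive_splitting(toy_hyd, toy_bgc, [1.0], [0.2], [0.0, 1.0, 2.0])
print(w_end, s_end)
```

The second solve depends on the output of the first, so the two models necessarily run one after the other within each global time step.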
3.2 Simulation with Python

A straightforward way to simulate the coupled system is to create an application which contains both models. We decided to implement this coupling application in Python (www.python.org). The Python programming language is freely available for a wide range of machines like laptops, workstations, and clusters. We compiled the models into Python modules using SWIG (www.swig.org). The SWIG tool generates interface code for accessing the C++ models from Python, so that the model objects and functions can be used in Python applications. This makes it easy to set up simulations on various platforms, and it enables fast prototyping. We implemented the consecutive operator-splitting scheme in a Python coupling application. The models can exchange data by accessing global variables. The internal OpenMP parallelization of the models is not affected by the SWIG wrapping. Thus, the simulation application can be run in parallel on shared memory platforms. The Python coupling scheme is depicted in Fig. 1.
3.3 Concurrent Operator Splitting

A more sophisticated global time-stepping scheme uses the concurrent operator splitting described in Algorithm 3. In each cycle of the time loop, the models are solved for one time interval using only input data which is available from the last time step (lines 3 and 4). This allows for concurrent computations in both models. The hydrological model produces the water state $w_{n+1}$ for the next time step and the intermediate result $s^{(h)}_{n+1}$ for the transported solute. The biogeochemical model computes the intermediate solute state $s^{(b)}_{n+1}$. The difference $s^{(b)}_{n+1} - s_n$ represents the production or loss of NO$_3^-$ due to the soil chemical processes. The solute state $s_{n+1}$ for the next time step is taken as the sum of the transported intermediate result and the production or loss term (line 5).

As in the consecutive operator splitting case, each of the models can be implemented in parallel. The concurrent operator splitting introduces a second level of parallelism. The models can compute the results concurrently in each iteration of the time loop, since their input depends only on data from the last time step. This makes it possible to employ individual parallelization concepts for the models and to run them on individual sets of processors.

Fig. 1 Coupling scheme for simulations with Python

Algorithm 3 Concurrent operator splitting
1: Set initial values $w_0$, $s_0$, $n = 0$.
2: while $n < n_{max}$ do
3:   Solve $\partial_t (w, s) = f_{hyd}(w, s, t)$ in $[t_n, t_{n+1}]$, $w(t_n) = w_n$, $s(t_n) = s_n$ to obtain $w_{n+1}$, $s^{(h)}_{n+1}$.
4:   Solve $\partial_t s = f_{bgc}(w_n, s, t)$ in $[t_n, t_{n+1}]$, $s(t_n) = s_n$ to obtain $s^{(b)}_{n+1}$.
5:   Set $s_{n+1} = s^{(h)}_{n+1} + s^{(b)}_{n+1} - s_n$.
6:   $n \leftarrow n + 1$
7: end while
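The data independence exploited by Algorithm 3 can be illustrated with Python's standard `concurrent.futures` module. This is a sketch under stated assumptions, not the OpenPALM setup used in this study: explicit Euler with fixed sub-steps stands in for the local solvers, the toy right-hand sides are hypothetical, and Python threads only demonstrate that both solves can be submitted at once (pure-Python code gains no real speedup from threads because of the global interpreter lock).

```python
from concurrent.futures import ThreadPoolExecutor


def euler(f, y, t0, t1, substeps=10):
    """Integrate dy/dt = f(y, t) from t0 to t1 with explicit Euler sub-steps."""
    dt = (t1 - t0) / substeps
    t = t0
    for _ in range(substeps):
        y = [yi + dt * fi for yi, fi in zip(y, f(y, t))]
        t += dt
    return y


def concurrent_splitting(f_hyd, f_bgc, w0, s0, times):
    """Algorithm 3: both models start from (w_n, s_n) and run side by side."""
    w, s = list(w0), list(s0)
    nw = len(w)
    with ThreadPoolExecutor(max_workers=2) as pool:
        for t0, t1 in zip(times[:-1], times[1:]):
            w_n, s_n = w, s

            def rhs_h(y, t):
                dw, ds = f_hyd(y[:nw], y[nw:], t)
                return dw + ds

            def rhs_b(s_, t, w_n=w_n):
                return f_bgc(w_n, s_, t)

            fut_h = pool.submit(euler, rhs_h, w_n + s_n, t0, t1)  # line 3
            fut_b = pool.submit(euler, rhs_b, s_n, t0, t1)        # line 4
            y, s_b = fut_h.result(), fut_b.result()
            w, s_h = y[:nw], y[nw:]
            # Line 5: transported solute plus biogeochemical production or loss.
            s = [sh + sb - sn for sh, sb, sn in zip(s_h, s_b, s_n)]
    return w, s


def toy_hyd(w, s, t):
    """Hypothetical hydrology: water and solute drain at a rate of 0.1 per unit time."""
    return [-0.1 * wi for wi in w], [-0.1 * si for si in s]


def toy_bgc(w, s, t):
    """Hypothetical biogeochemistry: solute production proportional to water content."""
    return [0.05 * wi for wi in w]


w_end, s_end = concurrent_splitting(toy_hyd, toy_bgc, [1.0], [0.2], [0.0, 1.0, 2.0])
print(w_end, s_end)
```

Compared with the consecutive scheme, the biogeochemical solve uses the old water state $w_n$, which is what removes the dependency between the two solves within one global time step.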
3.4 Simulation with OpenPALM
In order to exploit the opportunities for parallelism offered by the concurrent operator splitting, we use the OpenPALM software coupler tool (Piacentini and the PALM Group 2003; Buis et al. 2006) for the simulation. The fundamental idea of OpenPALM is to consider complex simulations as a coupled application. The OpenPALM approach is to divide the computations into a number of tasks and to define a coupling algorithm which controls the execution and interaction of these tasks. Each task can be implemented individually, which makes it possible to reuse existing codes with minimal modifications for building a computational component that can be coupled to other components.

OpenPALM features both levels of parallelism. On the one hand, independent tasks can run concurrently on separate sets of processors. OpenPALM handles the concurrent execution of such tasks, establishes an intercommunication context between them, and provides a synchronization mechanism. On the other hand, OpenPALM is able to couple components which are internally parallelized, using OpenMP as well as MPI. For such components, OpenPALM establishes a private intracommunication context and manages data exchange between different sets of processes.

In order to ensure modularity of the components and full flexibility for defining a coupling algorithm, OpenPALM implements the endpoint communication scheme. It provides two basic communication routines for sending and receiving data objects, PALM_Put and PALM_Get, respectively. Components can use these routines to exchange data without needing a reference to the origin or destination of the communication. Instead, PALM_Put and PALM_Get announce a request for sending and receiving data. OpenPALM acts as a broker on these requests, i.e., it derives the actual communication pattern from the current state of the coupling algorithm and organizes a rendezvous of the components which are involved in the communication.
In order to avoid deadlocks between senders and receivers, OpenPALM provides buffer storage for pending communications. In the OpenPALM terminology, a computational component which can be executed in a coupling algorithm is called a unit. The task represented by a unit may vary from simple algebraic operations over linear or nonlinear solvers up to complex physical models. The granularity of the tasks can be freely chosen by the user. Units are defined by so-called identity cards which can be recognized by OpenPALM. The identity card of a unit basically provides information about its data objects which may be exchanged with other units and, in case of a parallel unit, the data distribution. Units can be implemented in Fortran, C and C++. Existing codes can easily be turned into a unit by creating an appropriate identity card and by changing the original program entry point (program statement in Fortran or main function in C/C++) into a normal subroutine or function. OpenPALM spawns separate executables for calling the unit functions. Inside the unit source code, the PALM_Put and PALM_Get primitives may be used for communication with other units. According to the endpoint communication scheme, units do not need to know their communication partners, but OpenPALM arranges the connection.
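The idea behind the endpoint communication scheme can be mimicked with a toy broker in Python. The sketch below is purely conceptual and does not use or reproduce the OpenPALM API: the `Broker` class, its `put` and `get` methods, the object name `"water_state"`, and the two unit functions are all hypothetical. It only illustrates that the units name what they send or receive, never their partner, and that buffering lets a sender proceed before the receiver is ready.

```python
import queue
import threading


class Broker:
    """Toy endpoint broker: units name what they exchange, never with whom."""

    def __init__(self):
        self._buffers = {}
        self._lock = threading.Lock()

    def _buffer(self, name):
        with self._lock:
            return self._buffers.setdefault(name, queue.Queue())

    def put(self, name, obj):
        self._buffer(name).put(obj)  # buffered, so the sender does not block

    def get(self, name, timeout=5.0):
        return self._buffer(name).get(timeout=timeout)


def hydrology_unit(broker, steps):
    """Hypothetical unit that publishes a water state after each step."""
    w = 1.0
    for _ in range(steps):
        w *= 0.9
        broker.put("water_state", w)


def biogeochemistry_unit(broker, steps, out):
    """Hypothetical unit that consumes the water state without knowing its origin."""
    for _ in range(steps):
        out.append(broker.get("water_state"))


broker = Broker()
received = []
threads = [
    threading.Thread(target=hydrology_unit, args=(broker, 3)),
    threading.Thread(target=biogeochemistry_unit, args=(broker, 3, received)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(received)
```

In a real coupler the routing between endpoints is derived from the coupling algorithm; here the shared object name plays that role.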
Fig. 2 Coupling scheme for simulations with OpenPALM
This modularity allows units for a coupled application to be developed individually in collaborative projects and existing units to be reused in any coupling algorithm. We built two units, one for the hydrological model using CMF and one for the biogeochemical model using LandscapeDNDC. The CMF unit uses the OpenMP parallelization as described in Sect. 2.2. The MPI parallelization of the LandscapeDNDC unit is based on a domain decomposition. The biogeochemical model kernels are distributed among a number of computational nodes. Each node can advance its model kernels for one time step independently of the others. Both units use the PALM_Put and PALM_Get routines to exchange the water and solute states and to transfer meteorological data. The coupling scheme is depicted in Fig. 2.
4 Application

We designed a numerical experiment to show the feasibility of the described approach for quantifying the ecosystem carbon and nitrogen cycling on the landscape scale. The computational domain $\Omega_h$ with $N = 13{,}120$ cells was a virtual landscape forming a valley. The horizontal discretization yielded $40 \times 41$ cells with a size of 10 by 10 m each. The vertical discretization consisted of 8 soil layers of different heights (5, 5, 20, 30, 30, 30, 30, and 50 cm). We placed a stream outlet $\Gamma_{out}$ with $h_{out} = 0.1$ m belowground near the lowest point of the geometry; see Fig. 3. Precipitation and evaporation fluxes occurred at the surface boundary $\Gamma_s$. All other parts of the boundary were equipped with the no-flow condition $q = 0$. The hydraulic head was set to 0.5 m below surface as initial condition. The initial NO$_3^-$ concentration was set to $c = 10^{-3} e^{-4d}$ kg per m$^3$, where $d$ denotes the soil depth. Temporal resolution of the data exchange of the coupled system was 1 day.

Fig. 3 Computational mesh $\Omega_h$ with $N = 13{,}120$ cells

The landscape represented an intensive agricultural system of maize rotations. Fertilization with organic and inorganic N was more than 300 kg N/ha/a. It was split into several applications in the growing season in spring and after harvest in autumn and applied uniformly on the surface. The domain represented an average pre-alpine climate with approx. 800 mm average precipitation and an average temperature of 9.6 °C throughout the surface. The geometry enforced interflow transport of nutrients from upland into the riparian zone, entering the stream which was formed at the bottom of the slopes.

Nitrate and ammonium play a significant role in plant nutrition. Due to the low mobility of ammonium, we only accounted for the nitrate transport. Nitrate was formed by microbial processes converting NH$_4^+$ via NO$_2^-$ into NO$_3^-$ under aerobic conditions as described in Sect. 2. It was also the substrate for denitrification, the microbial transformation of NO$_3^-$ into N$_2$, which occurred under anaerobic conditions.

In Fig. 4 the soil water redistribution after a simulation time of 420 days is illustrated. Rainfall events and corresponding water fluxes along the slopes resulted in saturated soil water conditions along the riparian zone. Soil nitrate concentrations following water fluxes are shown in Fig. 5. Nitrate was formed by mineralization and consecutive nitrification of plant litter, and it was percolated into deeper soil layers. It was transported towards the riparian zone where most of the excess nitrate was discharged towards the surface water due to soil water saturation.

Fig. 4 Soil water content [mm/m$^3$] after 420 days

Fig. 5 Nitrate (NO$_3^-$) concentration [kg N/ha] in the soil water after 420 days

Fig. 6 The spatial distribution of the plant growth/gross primary production [kg DW/ha] as a result of differences in nutrient availability
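The mesh dimensions and the initial condition quoted for the experiment can be cross-checked with a short calculation. The sketch below assumes that the initial nitrate profile reads $c = 10^{-3} e^{-4d}$ kg per m$^3$ with the depth $d$ in meters, and it evaluates the profile at the layer midpoints; both the reading of the formula and the midpoint evaluation are assumptions made for illustration.

```python
import math

nx, ny = 40, 41                               # horizontal cells, 10 m x 10 m each
layer_cm = [5, 5, 20, 30, 30, 30, 30, 50]     # soil layer thicknesses from the text

n_cells = nx * ny * len(layer_cm)             # total number of mesh cells
total_depth_m = sum(layer_cm) / 100.0         # soil column depth in meters
area_ha = (nx * 10.0) * (ny * 10.0) / 1e4     # horizontal extent in hectares


def initial_nitrate(depth_m):
    """Assumed initial NO3 concentration c = 1e-3 * exp(-4 d) in kg per m^3."""
    return 1e-3 * math.exp(-4.0 * depth_m)


# Evaluate the profile at the midpoint of each layer (an illustrative choice).
midpoints = []
top = 0.0
for thickness_cm in layer_cm:
    thickness = thickness_cm / 100.0
    midpoints.append(top + thickness / 2.0)
    top += thickness
c0 = [initial_nitrate(d) for d in midpoints]

print(n_cells, total_depth_m, area_ha)
for d, c in zip(midpoints, c0):
    print(f"depth {d:.3f} m: c = {c:.3e} kg/m^3")
```

The cell count $40 \times 41 \times 8$ reproduces the stated $N = 13{,}120$, the layers add up to a 2 m soil column, and the assumed profile decays monotonically with depth.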
The vegetation development feedback due to the nutrient availability in the landscape resulted in a biomass production gradient along the slopes. Figure 6 illustrates the strong variations of the gross primary productivity of the plant growth. Downslope regions with large nutrient and water availability showed a strong increase in plant growth as there was no limitation by nitrogen, whereas upslope regions were less productive. The fertilization regime supplied enough excess nitrogen to be redistributed within the landscape.

Figure 8 reflects hot spots of denitrification, which is illustrated by the yearly accumulated N$_2$ emissions. Fully saturated conditions in the riparian zone led to anaerobic conditions, which in turn led to the complete reduction of nitrate to dinitrogen. Under such conditions almost all N$_2$O which was produced during denitrification was further reduced to N$_2$. In contrast, the transition from complete anaerobic to aerobic zones in higher altitudes yielded increased N$_2$O emissions because denitrification was not completed. As a consequence of this incomplete process, N$_2$O diffused to the surface and was emitted to the atmosphere. The transition zone between aerobic and anaerobic regions as well as the temporal availability of nitrogen varied. These variations caused emission patterns that form hot spot regions around the riparian zone as illustrated in Figs. 7 and 8.

Fig. 7 The spatial distribution of the nitrous oxide (N$_2$O) emissions [kg N/ha] as a result of the nitrification/denitrification process of nutrient availability

Fig. 8 The spatial distribution of the dinitrogen emissions [kg N/ha]. Dinitrogen is the end product of the denitrification process transforming nitrate via nitrite into dinitrogen by denitrifying microorganisms at anaerobic soil conditions
5 Conclusion

The results of the simulation with the coupled model system show that considering lateral transport of water and nutrients yields more realistic regional emission patterns due to spatial gradients in nutrient availability and their feedbacks to crop growth and microbial activity. Although the modeling approach has not yet been validated, the presented simulation results highlight the generation of indirect N$_2$O emissions caused by lateral nutrient redistribution and near-surface water eutrophication. These effects are well known and have been observed before. IPCC (2007) acknowledges them in its methodologies to assess the source strengths of the associated GHG emissions by the use of specific emission factors. These factors are constant and do not reflect any local conditions, e.g., climate, management, hydrology, and soil properties. The coupled simulations do reflect many effects associated with the nutrient transport and have therefore proved to be a powerful tool to assess the effects of nutrient cycling within landscapes. This is a fundamental requirement for the identification of hot spots and hot moments of GHG emissions and for their mitigation.

Acknowledgements This work was supported by the German Research Foundation (DFG) under research grants HE 4760/4-1 and BU 1173/12-1.
References

Aris R (1989) Vectors, tensors, and the basic equations of fluid mechanics. Dover, New York
Beheydt D, Boeckx P, Sleutel S, Li C, Van Cleemput O (2007) Validation of DNDC for 22 long-term N2O field emission measurements. Atmos Environ 41(29):6196–6211
Blagodatsky S, Richter O (1998) Microbial growth in soil and nitrogen turnover: a theoretical model considering the activity state of microorganisms. Soil Biol Biochem 30(13):1743–1755. doi:10.1016/S0038-0717(98)00028-5, http://www.sciencedirect.com/science/article/pii/S0038071798000285
Blagodatsky SA, Grote R, Kiese R, Werner C, Butterbach-Bahl K (2011) Modelling of microbial carbon and nitrogen turnover in soil with special emphasis on N-trace gases emission. Plant Soil 346:297–330
Bondeau A, Smith PC, Zaehle S, Schaphoff S, Lucht W, Cramer W, Gerten D, Lotze-Campen H, Mueller C, Reichstein M, Smith B (2007) Modelling the role of agriculture for the 20th century global terrestrial carbon balance. Glob Change Biol 13(3):679–706
Brooks RH, Corey AT (1964) Hydraulic properties of porous media. Hydrology Papers No 3, Colorado State University
Brutsaert W (2005) Hydrology. Cambridge University Press, Cambridge/New York
Buis S, Piacentini A, Declat D, the PALM Group (2006) Palm: a computational framework for assembling high performance computing applications. Concurr Comput Pract Exp 18:231–245
Butterbach-Bahl K, Kesik M, Miehle P, Papen H, Li C (2004a) Quantifying the regional source strength of N-trace gases across agricultural and forest ecosystems with process based models. Plant Soil 260(1/2):311–329
Butterbach-Bahl K, Kock M, Willibald G, Hewett B, Buhagiar S, Papen H, Kiese R (2004b) Temporal variations of fluxes of NO, NO2, N2O, CO2, and CH4 in a tropical rain forest ecosystem. Glob Biogeochem Cycles 18:GB3012
Chatskikh D, Olesen JE, Berntsen J, Regina K, Yamulki S (2005) Simulation of effects of soils, climate and management on N2O emission from grasslands. Biogeochemistry 76(3):395–419
Chirinda N, Kracher D, Lægdsmand M, Porter JR, Olesen JE, Petersen BM, Doltra J, Kiese R, Butterbach-Bahl K (2011) Simulating soil N2O emissions and heterotrophic CO2 respiration in arable systems using FASSET and MoBiLE-DNDC. Plant Soil 343(1–2):251–260
Cicerone RJ, Shetter JD (1981) Sources of atmospheric methane: measurements in rice paddies and a discussion. J Geophys Res 86(C8):7203–7209
Cui J, Li C, Sun G, Trettin C (2005) Linkage of MIKE SHE to wetland-DNDC for carbon budgeting and anaerobic biogeochemistry simulation. Biogeochemistry 72(2):147–167
Dämmgen U, Grünhage L (2002) Trace gas emissions from German agriculture as obtained from the application of simpler or default methodologies. Environ Pollut 117(1):23–34
Del Grosso SJ, Parton WJ, Mosier AR, Holland EA, Pendall E, Schimel DS, Ojima DS (2005) Modeling soil CO2 emissions from ecosystems. Biogeochemistry 73(1):71–91
Del Grosso SJ, Ogle SM, Parton WJ, Breidt FJ (2010) Estimating uncertainty in N2O emissions from US cropland soils. Glob Biogeochem Cycles 24:GB1009
de Noblet-Ducoudre N, Gervois S, Ciais P, Viovy N, Brisson N, Seguin B, Perrier A (2004) Coupling the soil-vegetation-atmosphere-transfer scheme ORCHIDEE to the agronomy model STICS to study the influence of croplands on the European carbon and water budgets. In: Agronomie, laboratory of sciences of the climate & environment, Gif Sur Yvette, pp 397–407
Firestone M, Davidson E (1989) Microbiological basis of NO and N2O production and consumption in soil. In: Exchange of trace gases between terrestrial ecosystems and the atmosphere. Wiley, Chichester/New York, pp 7–21
Van Genuchten MT (1980) A closed form equation for predicting the hydraulic conductivity of unsaturated soils. Soil Sci Soc Am J 44:892–898
Gervois S, Ciais P, de Noblet-Ducoudre N, Brisson N, Vuichard N, Viovy N (2008) Carbon and water balance of European croplands throughout the 20th century. Glob Biogeochem Cycles 22:GB2018
Groffman PM, Butterbach-Bahl K, Fulweiler RW, Gold AJ, Morse JL, Stander EK, Tague C, Tonitto C, Vidon P (2009) Challenges to incorporating spatially and temporally explicit phenomena (hotspots and hot moments) in denitrification models. Biogeochemistry 93(1–2):49–77
Haas E, Klatt S, Froehlich A, Kraft P, Werner C, Kiese R, Grote R, Breuer L, Butterbach-Bahl K (2012) LandscapeDNDC: a process model for simulation of biosphere-atmosphere-hydrosphere exchange processes at site and regional scale. Landsc Ecol. doi:10.1007/s10980-012-9772-x
Heuser H (2002) Lehrbuch der Analysis. Teubner, Stuttgart
Hindmarsh AC, Brown PN, Grant KE, Lee SL, Serban R, Shumaker DE, Woodward CS (2005) SUNDIALS: suite of nonlinear and differential/algebraic equation solvers. ACM Transactions on Mathematical Software. Tech. rep., LLNL technical report UCRL-JP-200037
IPCC (2006) 2006 IPCC guidelines for national greenhouse gas inventories: volume 4 agriculture, forestry and other land use. Tech. rep., Prepared by the National Greenhouse Gas Inventories Programme, Hayama
IPCC (2007) Climate change 2007: the physical science basis – contribution of working group to the fourth assessment report of the intergovernmental panel on climate change. Tech. rep., Intergovernmental Panel on Climate Change, Cambridge/New York
Kemanian AR, Julich S, Manoranjan VS, Arnold JR (2011) Integrating soil carbon cycling with that of nitrogen and phosphorus in the watershed model SWAT: theory and model testing. Ecol Model 222(12):1913–1921
Kiese R, Li C, Hilbert DW, Papen H, Butterbach-Bahl K (2005) Regional application of PNET-N-DNDC for estimating the N2O source strength of tropical rainforests in the wet tropics of Australia. Glob Change Biol 11(1):128–144. doi:10.1111/j.1365-2486.2004.00873.x
Kraft P, Vaché KB, Frede HG, Breuer L (2011) A hydrological programming language extension for integrated catchment models. Environ Model Softw 26:828–830
Leffelaar PA, Wessel WW (1988) Denitrification in a homogeneous, closed system: experiment and simulation. Soil Sci 146(5):335–349
Li C, Frolking S, Frolking TA (1992) A model of nitrous-oxide evolution from soil driven by rainfall events. 1. Model structure and sensitivity. J Geophys Res 97(D9):9759–9776
Li C, Frolking S, Butterbach-Bahl K (2005) Carbon sequestration in arable soils is likely to increase nitrous oxide emissions, offsetting reductions in climate radiative forcing. Clim Change 72(3):321–338
Luttich M, Dämmgen U, Haenel HD, Eurich-Menden B, Dohler H, Osterburg B (2007) Calculations of emissions from German agriculture – National Emission Inventory Report (NIR) 2007 for 2005. Tech. rep., Landbauforschung Völkenrode, Institut für Agrarrelevante Klimaforschung Johann Heinrich von Thünen Institut (vTI) Bundesforschungsinstitut für Ländliche Räume, Wald und Fischerei
Mander U, Uuemaa E, Kull A, Kanal A, Maddison M, Soosaar K, Salm JO, Lesta M, Hansen R, Kuller R, Harding A, Augustin J (2010) Assessment of methane and nitrous oxide fluxes in rural landscapes. Landsc Urban Plan 98:172–181
Mualem Y (1976) A new model for predicting the hydraulic conductivity of unsaturated porous media. Water Resour Res 12:513–522
Narasimhan TN (2005) Buckingham, 1907: an appreciation. Vadose Zone J 4:434–441
Parton WJ, Ojima DS, Cole CV, Schimel DS (1994) A general model for soil organic matter dynamics: sensitivity to litter chemistry, texture and management, quantitative modeling of soil forming processes. Tech. rep., Soil Science Society of America
Piacentini A, the PALM Group (2003) Palm: a dynamic parallel coupler. Lect Notes Comput Sci 2565:479–492
Pohlert T, Breuer L, Huisman JA, Frede HG (2007a) Assessing the model performance of an integrated hydrological and biogeochemical model for discharge and nitrate load predictions. Hydrol Earth Syst Sci 11(2):997–1011
Pohlert T, Huisman JA, Breuer L, Frede HG (2007b) Integration of a detailed biogeochemical model into SWAT for improved nitrogen predictions - model development, sensitivity, and GLUE analysis. Ecol Model 203(3–4):215–228
Qu Y, Duffy CJ (2007) A semidiscrete finite volume formulation for multiprocess watershed simulation. Water Resour Res 43:W08419
Richards LA (1931) Capillary conduction of liquids through porous mediums. Physics 1:318–333
Schulze ED, Ciais P, Luyssaert S, Schrumpf M, Janssens IA, Thiruchittampalam B, Theloke J, Saurat M, Bringezu S, Lelieveld J, Lohila A, Rebmann C, Jung M, Bastviken D, Abril G, Grassi G, Leip A, Freibauer A, Kutsch W, Don A, Nieschulze J, Boerner A, Gash JH, Dolman AJ (2010) The European carbon balance. Part 4: integration of carbon and other trace-gas fluxes. Glob Change Biol 16(5):1451–1469
Simmons C (2008) Henry Darcy (1803–1858): immortalised by his scientific legacy. Hydrogeol J 1023–1038. doi:10.1007/s10040-008-0304-3
Smith P, Martino D, Cai Z, Gwary D, Janzen H, Kumar P, McCarl B, Ogle S, O’Mara F, Rice C, Scholes B, Sirotenko O, Howden M, McAllister T, Pan G, Romanenkov V, Schneider U, Towprayoon S, Wattenbach M, Smith J (2008) Greenhouse gas mitigation in agriculture. Philos Trans R Soc B Biol Sci 363(1492):789–813
Smith J, Gottschalk P, Bellarby J, Chapman S, Lilly A, Towers W, Bell J, Coleman K, Nayak D, Richards M, Hillier J, Flynn H, Wattenbach M, Aitkenhead M, Yeluripati J, Farmer J, Milne R, Thomson A, Evans C, Whitmore A, Falloon P, Smith P (2010) Estimating changes in Scottish soil carbon stocks using ECOSSE. I. Model description and uncertainties. Clim Res 45(1):179–192
Simulation of Land Management Effects on Soil N2 O Emissions Using a. . .
2231
Stange F, Butterbach-Bahl K, Papen H, Zechmeister-Boltenstern S, Li C, Aber J (2000) A process-oriented model of N2O and NO emissions from forest soils 2. Sensitivity analysis and validation. J Geophys Res Atmos 105:4385–4398 Van den Hoof C, Hanert E, Vidale PL (2011) Simulating dynamic crop growth with an adapted land surface model - JULES-SUCROS: model development and validation. Agric For Meteorol 151(2):137–153 Wassmann R, Papen H, Rennenberg H (1993) Methane emission from rice paddies and possible mitigation strategies. Chemosphere 26(1–4):201–217 Werner C, Butterbach-Bahl K, Haas E, Hickler T, Kiese R (2007) A global inventory of N2O emissions from tropical rainforest soils using a detailed biogeochemical model. Glob Biogeochem Cycles 21(3). doi:10.1029/2006GB002909, http://dx.doi.org/10.1029/2006GB002909 Zaehle S, Sitch S, Smith B, Hatterman F (2005) Effects of parameter uncertainties on the modeling of terrestrial biosphere dynamics. Glob Biogeochem Cycles 19:GB3020
Part V Statistical and Stochastic Methods
An Introduction to Prediction Methods in Geostatistics
Ralf Korn and Alexandra Kochendörfer

Geostatistics offers a way of describing the spatial continuity of natural phenomena and provides adaptations of classical regression techniques to take advantage of this continuity.
Isaaks and Srivastava, 1989
Contents
1 Introduction
2 Basic Concepts: Random Field, Stationarity, and Variogram
3 Best Linear Prediction and Kriging
  3.1 Simple Kriging
  3.2 Ordinary Kriging
  3.3 Universal Kriging
  3.4 Kriging in the Gaussian Setting
4 Further Aspects of Interpolation Methods
  4.1 Indicator Kriging
  4.2 Multivariate Kriging and Cokriging
  4.3 Further Kriging
The authors gratefully acknowledge financial support by the Bundesministerium für Umwelt, Naturschutz und Reaktorsicherheit for the project “GEOFÜND”.

R. Korn: Department of Mathematics, University of Kaiserslautern, Kaiserslautern, Germany; Department of Financial Mathematics, Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany

A. Kochendörfer: Department of Financial Mathematics, Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany
© Springer-Verlag Berlin Heidelberg 2015 W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_46
5 An Example with Synthetic Data
6 Some Final Remarks
References
Abstract
In this survey we present various classical geostatistical prediction methods with a focus on interpolation methods that are known as Kriging. For this, we introduce basic concepts in spatial statistics, such as random field, stationarity, and variogram. Then, the main types of Kriging interpolation methods such as simple, ordinary, and universal Kriging are derived as best linear predictors in the mean squared sense. We further comment on multivariate and nonlinear generalizations such as cokriging or indicator Kriging and their aspects of application. Finally, we demonstrate the performance of Kriging prediction with the help of synthetic data.
1 Introduction
When working with spatial data, we are usually given a finite set of measurements of some continuous quantity at different locations in space. This quantity can be, e.g., temperature, the porosity of some medium, the height of trees, or radiation, and it is not necessarily one dimensional. In geothermal feasibility studies, for example, one is interested in the flow rate and the temperature of groundwater, both crucial aspects for the success of the drilling. Natural questions that arise given some measurements are the following: Can we say something about the value of the varying quantity in areas where no measurements are available, i.e., can we predict its value in those areas? What is the probability that these values exceed a certain level? How precise is our estimation?

The simplest way to answer the first question is to use a suitable deterministic interpolation method such as spline interpolation or polynomial interpolation. However, such a deterministic approach cannot answer the last two questions, which play a crucial role in feasibility studies. To answer them, a stochastic approach is necessary. More precisely, we consider the value of the quantity of interest at some location $s \in \mathbb{R}^d$ to be modeled by a random variable $Z(s)$. This allows us to interpret the set of measurements $(z(s_1), \ldots, z(s_n))$ as a set of realizations of these variables at different locations $s_1, \ldots, s_n \in D$, where $D \subset \mathbb{R}^d$, and hence to apply statistical inference methods to the data.

In this survey, we will particularly focus on the presentation of popular geostatistical tools for prediction, which are all centered around the class of interpolation methods summarized under the name of Kriging. In doing so, we will first introduce some necessary stochastic concepts, then survey the prediction methods, and finally demonstrate aspects of their application.
2 Basic Concepts: Random Field, Stationarity, and Variogram
In this section, we introduce some essential concepts of geostatistics for spatial data. As mentioned above, we interpret the set of spatial measurements as realizations of random variables at different locations in space. Below, we formalize this interpretation using the definition of a random field and consider a useful special case of random fields: the Gaussian random field. The methods of geostatistical interpolation (prediction), which we present in the following sections, require the understanding of basic concepts such as the stationarity of a random field and the variogram, which we also introduce below.

Definition 1. A random field $(Z(s) : s \in \mathbb{R}^d)$ is a collection of random variables all defined on the probability space $(\Omega, \mathcal{F}, P)$.

According to this definition, a random field is represented by the mapping

$$Z : \mathbb{R}^d \times \Omega \to \mathbb{R}, \qquad (s, \omega) \mapsto Z(s, \omega),$$

where for a fixed location $s \in \mathbb{R}^d$ the mapping $Z(s, \cdot)$ is a random variable. For simplicity, we often write $Z(s)$ instead of $Z(s, \cdot)$. The case of $s \in \mathbb{R}_+$ is often used for modeling time, and the resulting random field $(Z(s) : s \in \mathbb{R}_+)$ is then called a stochastic process or a time series. Such processes are widely used in, e.g., financial mathematics for modeling prices of stocks, interest rates, and other financial quantities. As we concentrate on geographic quantities in the current setting, we use random fields with space as parameter for modeling. Our primary aim is to examine the spatial relationship between the variables $Z(s)$ and to use this knowledge for prediction.

For a given random field $(Z(s) : s \in \mathbb{R}^d)$ with mean function $\mu(s) := E[Z(s)]$, covariance function

$$C(s_i, s_j) := \mathrm{Cov}(Z(s_i), Z(s_j)) = E[Z(s_i) Z(s_j)] - E[Z(s_i)]\, E[Z(s_j)],$$

and given data at locations $s_1, \ldots, s_n$, we introduce the following notations:

$$\mathbf{Z} := (Z(s_1), \ldots, Z(s_n))^\top, \quad \boldsymbol{\mu} := (\mu(s_1), \ldots, \mu(s_n))^\top, \quad \mathbf{C} := \mathrm{Cov}(\mathbf{Z}, \mathbf{Z}^\top) = (C(s_i, s_j))_{i,j=1,\ldots,n},$$

$$\mathbf{c} := \mathrm{Cov}(Z(s_0), \mathbf{Z}) = (C(s_0, s_i))_{i=1,\ldots,n}, \quad \mathbf{1} := (1, \ldots, 1)^\top.$$
Fig. 1 Ocean surface as a realization of a random field
Here, $\mathbf{Z}$ is the $n$-dimensional vector of observations, $\boldsymbol{\mu}$ is the $n$-dimensional mean vector of $\mathbf{Z}$, $\mathbf{C}$ is the $n \times n$ covariance matrix of $\mathbf{Z}$, and $\mathbf{c}$ is an $n$-dimensional vector of covariances between a reference observation $Z(s_0)$ and $\mathbf{Z}$.

A good example to visualize and explain the concept of random fields is the height of the ocean surface above a given level. Figure 1 shows the surface $(Z_t(s, \omega) : s \in D)$ somewhere in the ocean $D \subset \mathbb{R}^2$ at some fixed time $t \ge 0$ for some possible scenario $\omega \in \Omega$. For a fixed $\omega$, the realization of a random field is a deterministic function of space.

A random field is fully described by its finite-dimensional distributions, i.e., by the probabilities

$$P(Z(s_1) < z_1, \ldots, Z(s_k) < z_k)$$

for every $s_1, \ldots, s_k \in \mathbb{R}^d$ and each $k \in \mathbb{N}$. As in reality we are typically given only one set of realizations, it is impossible to draw conclusions concerning the law of a random field without further assumptions. We could extract more information from the measurements if we assume that the distributions are invariant under arbitrary translation of the points. This property of a random field is called stationarity. It means that the examined property is homogeneous in space such as in Fig. 2.

Definition 2. A random field $(Z(s) : s \in \mathbb{R}^d)$ is strictly stationary if for any $s_1, \ldots, s_k \in \mathbb{R}^d$ and any $k \in \mathbb{N}$, $h \in \mathbb{R}^d$, we have

$$P(Z(s_1) < z_1, \ldots, Z(s_k) < z_k) = P(Z(s_1 + h) < z_1, \ldots, Z(s_k + h) < z_k).$$

As it is difficult to justify this assumption on a given set of measurements, we relax the concept of stationarity by introducing second-order stationarity or weak stationarity.

Definition 3. A random field with a constant mean function

$$E[Z(s_i)] = \mu$$

and a covariance function which only depends on the difference between two points,

$$\mathrm{Cov}(Z(s_i), Z(s_j)) = c(s_i - s_j)$$

for any $s_i, s_j \in \mathbb{R}^d$ and a given function $c : \mathbb{R}^d \to \mathbb{R}$, is called weakly stationary (or, simply, stationary). If additionally the covariance function only depends on the distance $\|s_i - s_j\|$ between two points, the random field is called isotropic. Random fields where the covariance $\mathrm{Cov}(Z(s_i), Z(s_j))$ also depends on the direction of $s_i - s_j$ are called anisotropic (see Fig. 3).

Fig. 2 An example of a stationary random field

Fig. 3 Examples of anisotropic random fields

To weaken the above assumption of a stationary random field, we consider the so-called intrinsic random fields for a more realistic modeling of its spatial behavior.
Intrinsic stationarity here means that the variance of the differences $Z(s_i) - Z(s_j)$ only depends on the separation $s_i - s_j$ between the locations.

Definition 4. A random field is called intrinsic if for every $h \in \mathbb{R}^d$ the differences $Z(s + h) - Z(s)$ are weakly stationary. More precisely, we have

$$E[Z(s + h) - Z(s)] = \langle a, h \rangle, \qquad \mathrm{Var}(Z(s + h) - Z(s)) = 2\gamma(h),$$

for a suitable constant $a \in \mathbb{R}^d$ and a function $\gamma$, the so-called variogram function.

The advantage of assuming an intrinsic random field is that we only need to estimate (or to model) the variogram function instead of the covariance function, which can be very challenging to estimate. We will focus on this topic in detail in the following sections. Note that weak stationarity implies the intrinsic property as we have

$$\mathrm{Var}(Z(s + h) - Z(s)) = \mathrm{Var}(Z(s + h)) + \mathrm{Var}(Z(s)) - 2\,\mathrm{Cov}(Z(s + h), Z(s)) = c(0) + c(0) - 2c(h) = 2(c(0) - c(h)).$$

Thus, in the stationary case, comparing with $\mathrm{Var}(Z(s + h) - Z(s)) = 2\gamma(h)$, we have $\gamma(h) = c(0) - c(h)$. On the other hand, an intrinsic random field does not have to be stationary. For example, consider $(Z(s) : s \in \mathbb{R}_+)$ to be a one-dimensional Brownian motion, i.e., a continuous stochastic process with independent and stationary increments with

$$Z(t) - Z(s) \sim N(0, t - s) \quad \text{for all } 0 \le s < t.$$

Then, $(Z(s) : s \in \mathbb{R}_+)$ is not stationary since $\mathrm{Var}(Z(s)) = s$ depends on the location $s$. However, $\mathrm{Var}(Z(s + t) - Z(s)) = t$ depends only on the lag $t$. Hence, the Brownian motion is intrinsic.

The most convenient and popular example of a random field is the Gaussian random field. To introduce it, let us recapitulate some facts about the normal distribution. A random variable $Z$ is normally distributed with mean $\mu$ and variance $\sigma^2$, denoted by $N(\mu, \sigma^2)$, if $Z$ has a density given by

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right).$$

A random vector $\mathbf{Z}$ is said to follow a multivariate normal distribution if every finite linear combination of its components is normally distributed, i.e., for every $\boldsymbol{\lambda} \in \mathbb{R}^n$, $\boldsymbol{\lambda}^\top \mathbf{Z}$ is normally distributed.
Definition 5. $(Z(s); s \in \mathbb{R}^d)$ is a Gaussian random field if for all $s_1, \ldots, s_n \in \mathbb{R}^d$ the random vector $(Z(s_1), \ldots, Z(s_n))$ is multivariate normally distributed.

A Gaussian random field is fully described by its mean function $\mu(s)$ and covariance function $C(s, s')$. We denote a multivariate normally distributed random vector $\mathbf{Z}$ by

$$\mathbf{Z} \sim N(\boldsymbol{\mu}, \mathbf{C}).$$
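Because a Gaussian random field is fully described by its mean and covariance function, a realization on finitely many locations can be drawn from the corresponding multivariate normal distribution. The following is only a minimal sketch of this idea, not the authors' procedure: the exponential covariance, the zero mean, the 10 x 10 grid, and the small diagonal jitter are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 10 x 10 grid of locations s_1, ..., s_n in R^2.
xs, ys = np.meshgrid(np.arange(10.0), np.arange(10.0))
pts = np.column_stack([xs.ravel(), ys.ravel()])

# Assumed isotropic covariance C(s, s') = exp(-||s - s'|| / 3) and mean mu(s) = 0.
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
C = np.exp(-dist / 3.0)
mu = np.zeros(len(pts))

# If C = L L^T and xi ~ N(0, I), then mu + L xi ~ N(mu, C).
# The tiny jitter on the diagonal only guards against round-off.
L = np.linalg.cholesky(C + 1e-10 * np.eye(len(pts)))
field = (mu + L @ rng.standard_normal(len(pts))).reshape(xs.shape)
```

The Cholesky factorization is one common choice for small grids; for large grids other simulation techniques may be preferable.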
3 Best Linear Prediction and Kriging
Equipped with the above concepts, we are now going to introduce the basic methods of prediction. They will turn out to be linear predictors, i.e., the data enter the prediction procedure in the form of linear combinations. Historically, the work of D.G. Krige, who in his Master’s thesis and an accompanying paper (Krige 1951) introduced interpolation methods to a particular mining problem, can be seen as the start of the application of interpolation methods in geostatistics. Krige’s work was then put into a rigorous mathematical framework and generalized in numerous papers and books by G. Matheron (see, e.g., Matheron (1962) for the start of this work). Matheron not only translated Krige’s papers but also coined the term Kriging as a synonym for applying interpolation methods in space based on least-squares principles.
3.1 Simple Kriging
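In this section, we follow the approach given in Stein (1999). At a particular reference location $s_0$, we want to predict the value $Z(s_0)$ using realizations $(Z(s_1), \ldots, Z(s_n))$ of a random field $(Z(s) : s \in \mathbb{R}^d)$. We call any measurable function applied to the realizations a predictor and denote this function by $\hat{Z}(s_0)$. For now, we restrict our attention only to linear predictors, i.e., predictors of the form $\hat{Z}(s_0) = \lambda_0 + \boldsymbol{\lambda}^\top \mathbf{Z}$, where $\lambda_0 \in \mathbb{R}$ and $\boldsymbol{\lambda} \in \mathbb{R}^n$. The aim is to find $\lambda_0$ and $\boldsymbol{\lambda}$ such that the mean squared error

$$E[(Z(s_0) - \hat{Z}(s_0))^2]$$

is minimized. We suppose that $\mu(s)$ and $C(s_i, s_j)$ exist and calculate the mean squared error by adding and subtracting the terms $E[Z(s_0)]^2$, $2E[Z(s_0)]\,E[\boldsymbol{\lambda}^\top \mathbf{Z}]$, and $(E[\boldsymbol{\lambda}^\top \mathbf{Z}])^2$:

$$\begin{aligned}
E[(Z(s_0) - \lambda_0 - \boldsymbol{\lambda}^\top \mathbf{Z})^2]
&= (E[Z(s_0)] - \lambda_0 - E[\boldsymbol{\lambda}^\top \mathbf{Z}])^2 + E[Z(s_0)^2] - E[Z(s_0)]^2 \\
&\quad - 2\left(E[Z(s_0)\,\boldsymbol{\lambda}^\top \mathbf{Z}] - E[Z(s_0)]\,E[\boldsymbol{\lambda}^\top \mathbf{Z}]\right) + E[(\boldsymbol{\lambda}^\top \mathbf{Z})^2] - E[\boldsymbol{\lambda}^\top \mathbf{Z}]^2 \\
&= (\mu(s_0) - \lambda_0 - \boldsymbol{\lambda}^\top \boldsymbol{\mu})^2 + C(s_0, s_0) - 2\boldsymbol{\lambda}^\top \mathbf{c} + \boldsymbol{\lambda}^\top \mathbf{C} \boldsymbol{\lambda}.
\end{aligned}$$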
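Clearly, for every $\boldsymbol{\lambda} \in \mathbb{R}^n$, the quadratic term in the last line is minimized by

$$\lambda_0 = \mu(s_0) - \boldsymbol{\lambda}^\top \boldsymbol{\mu}.$$

It remains to minimize the variance

$$\mathrm{Var}(Z(s_0) - \hat{Z}(s_0)) = C(s_0, s_0) - 2\boldsymbol{\lambda}^\top \mathbf{c} + \boldsymbol{\lambda}^\top \mathbf{C} \boldsymbol{\lambda}.$$

First, we show that there exists $\boldsymbol{\lambda} \in \mathbb{R}^n$ such that $\mathbf{C}\boldsymbol{\lambda} = \mathbf{c}$. Notice that $\mathbf{C}$ is a symmetric positive-semidefinite matrix. Let $\boldsymbol{\alpha} \in \ker \mathbf{C}$. We then have $\mathrm{Var}(\boldsymbol{\alpha}^\top \mathbf{Z}) = \boldsymbol{\alpha}^\top \mathbf{C} \boldsymbol{\alpha} = 0$; hence, the random variable $\boldsymbol{\alpha}^\top \mathbf{Z}$ is almost surely equal to a constant. Consequently, for any $\boldsymbol{\nu} \in \mathbb{R}^n$, we have $\mathrm{Cov}(\boldsymbol{\nu}^\top \mathbf{Z}, \boldsymbol{\alpha}^\top \mathbf{Z}) = \boldsymbol{\nu}^\top \mathbf{C} \boldsymbol{\alpha} = 0$; hence, $\mathbf{C}\boldsymbol{\alpha} = 0$. Moreover, $\mathrm{Cov}(Z(s_0), \boldsymbol{\alpha}^\top \mathbf{Z}) = \mathbf{c}^\top \boldsymbol{\alpha} = 0$. Hence for any $\boldsymbol{\alpha} \in \ker \mathbf{C}$, $\mathbf{c}^\top \boldsymbol{\alpha} = 0$, which means that $\mathbf{c} \perp \ker \mathbf{C}$ and consequently $\mathbf{c} \in \mathrm{Image}(\mathbf{C}^\top) = \mathrm{Image}(\mathbf{C})$. Thus, there exists $\boldsymbol{\lambda} \in \mathbb{R}^n$ with $\mathbf{C}\boldsymbol{\lambda} = \mathbf{c}$. The variance attains its minimum at this $\boldsymbol{\lambda}$, since for every $\boldsymbol{\beta} \in \mathbb{R}^n$ we have the following inequality:

$$\begin{aligned}
\mathrm{Var}(Z(s_0) - (\boldsymbol{\lambda} + \boldsymbol{\beta})^\top \mathbf{Z})
&= C(s_0, s_0) - 2\boldsymbol{\lambda}^\top \mathbf{c} + \boldsymbol{\lambda}^\top \mathbf{C} \boldsymbol{\lambda} + \boldsymbol{\beta}^\top \mathbf{C} \boldsymbol{\beta} + 2(\mathbf{C}\boldsymbol{\lambda} - \mathbf{c})^\top \boldsymbol{\beta} \\
&= C(s_0, s_0) - 2\boldsymbol{\lambda}^\top \mathbf{c} + \boldsymbol{\lambda}^\top \mathbf{C} \boldsymbol{\lambda} + \boldsymbol{\beta}^\top \mathbf{C} \boldsymbol{\beta} \\
&\ge C(s_0, s_0) - 2\boldsymbol{\lambda}^\top \mathbf{c} + \boldsymbol{\lambda}^\top \mathbf{C} \boldsymbol{\lambda}.
\end{aligned}$$

Consequently, the predictor $\hat{Z}(s_0) = \lambda_0 + \boldsymbol{\lambda}^\top \mathbf{Z}$ is optimal regarding the mean squared error, and we call it the best linear predictor (BLP). $\hat{Z}(s_0)$ is determined uniquely through $\lambda_0$ and $\boldsymbol{\lambda}$ in the sense that for any other minimizer $\lambda_0', \boldsymbol{\lambda}'$ of the mean squared error we have

$$E[(\lambda_0 + \boldsymbol{\lambda}^\top \mathbf{Z} - \lambda_0' - \boldsymbol{\lambda}'^\top \mathbf{Z})^2] = 0.$$

If $\mathbf{C}$ is invertible, we can derive the BLP explicitly in terms of covariances and obtain the so-called

Simple Kriging Weights

$$\boldsymbol{\lambda} = \mathbf{C}^{-1} \mathbf{c}, \qquad \lambda_0 = \mu(s_0) - \mathbf{c}^\top \mathbf{C}^{-1} \boldsymbol{\mu}, \qquad \sigma_{SK}^2 := E[(Z(s_0) - \hat{Z}(s_0))^2] = C(s_0, s_0) - \mathbf{c}^\top \mathbf{C}^{-1} \mathbf{c}.$$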
Note that to obtain the optimal solution we did not require the random field to be stationary, intrinsic, or Gaussian. We only assumed that the covariance function and the mean function of the random field are known. Unfortunately, this is typically not the case. In practice, the mean and covariance functions have to be estimated from the measurements. The estimation error is not included in the mean squared error which is minimized by the BLP. We are going to relax the unrealistic assumption of a known mean in the next section but build on the just introduced concept of the BLP.
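The simple Kriging formulas can be turned into a few lines of linear algebra. The sketch below is illustrative only: the function name `simple_kriging`, the covariance $\exp(-|h|)$, the constant known mean 2, and the three data points are assumptions made for the example, not part of the derivation above.

```python
import numpy as np

def simple_kriging(C, c, mu, mu0, c00, z):
    """Return (weights, lambda_0, prediction, variance) of the best linear predictor.

    C: covariance matrix of the data, c: covariances between Z(s_0) and the data,
    mu: mean vector of the data, mu0: mean at s_0, c00: C(s_0, s_0), z: observed values.
    """
    lam = np.linalg.solve(C, c)      # lambda = C^{-1} c
    lam0 = mu0 - lam @ mu            # lambda_0 = mu(s_0) - c^T C^{-1} mu
    return lam, lam0, lam0 + lam @ z, c00 - c @ lam

# Hypothetical one-dimensional example with an assumed covariance exp(-|h|) and known mean 2.
cov = lambda a, b: np.exp(-np.abs(np.subtract.outer(a, b)))
sites = np.array([0.0, 1.0, 3.0])
z = np.array([1.5, 2.4, 1.9])
mu = np.full(3, 2.0)
C = cov(sites, sites)

# Prediction at the unobserved location s_0 = 2.
lam, lam0, pred, var = simple_kriging(C, cov(sites, 2.0), mu, 2.0, 1.0, z)

# Prediction at the observed location s_0 = 1 reproduces the observation with zero variance.
_, _, pred_at_site, var_at_site = simple_kriging(C, cov(sites, 1.0), mu, 2.0, 1.0, z)
```

Solving the linear system with `np.linalg.solve` avoids forming $\mathbf{C}^{-1}$ explicitly, which is usually the numerically safer choice.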
3.2 Ordinary Kriging
Ordinary Kriging is a method of finding the BLP without knowing the mean. However, we have to assume that the relevant random field is intrinsically stationary, i.e., the mean is constant, and the spatial dependencies are fully described by the variogram function. Note that the mean squared error is the sum of the variance of the prediction error and the squared bias:

$$E[(Z(s_0) - \hat{Z}(s_0))^2] = \mathrm{Var}[Z(s_0) - \hat{Z}(s_0)] + \left(E[Z(s_0) - \hat{Z}(s_0)]\right)^2.$$

Assuming a constant mean, the bias term becomes

$$\left(E[Z(s_0) - \hat{Z}(s_0)]\right)^2 = \left(\left(1 - \sum_{i=1}^n \lambda_i\right) \mu(s_0) - \lambda_0\right)^2.$$

Thus, for the bias to vanish, we do not have to know $\mu(s_0)$ if we require $\sum_{i=1}^n \lambda_i = 1$ and set $\lambda_0 = 0$. In the light of this unbiasedness constraint, we can use the Lagrangian method to obtain the optimal weights for the BLP, which in the case of a strictly positive definite covariance matrix $\mathbf{C}$ are given as the

Ordinary Kriging Weights

$$\boldsymbol{\lambda} = \boldsymbol{\Gamma}^{-1} \left(\boldsymbol{\gamma} + \mathbf{1}\, \frac{1 - \mathbf{1}^\top \boldsymbol{\Gamma}^{-1} \boldsymbol{\gamma}}{\mathbf{1}^\top \boldsymbol{\Gamma}^{-1} \mathbf{1}}\right),$$

where

$$\boldsymbol{\gamma} = (\gamma(s_0 - s_1), \ldots, \gamma(s_0 - s_n))^\top, \qquad \Gamma_{i,j} = \gamma(s_i - s_j),$$

and $\gamma(\cdot)$ denotes the variogram of the underlying random field. Plugging in these weights, we obtain the prediction error as

$$\sigma_{OK}^2 := E[(Z(s_0) - \hat{Z}(s_0))^2] = \boldsymbol{\gamma}^\top \boldsymbol{\Gamma}^{-1} \boldsymbol{\gamma} - \frac{(\mathbf{1}^\top \boldsymbol{\Gamma}^{-1} \boldsymbol{\gamma} - 1)^2}{\mathbf{1}^\top \boldsymbol{\Gamma}^{-1} \mathbf{1}}.$$

Note that the existence and uniqueness of the BLP are given if and only if the covariance matrix $\mathbf{C}$ is strictly positive definite.
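As a numerical illustration of the ordinary Kriging weights and the prediction error, the following sketch evaluates both formulas directly from variogram values. The function name `ordinary_kriging`, the variogram $\gamma(h) = 1 - \exp(-|h|)$, and the data are hypothetical choices for the example.

```python
import numpy as np

def ordinary_kriging(Gamma, gamma, z):
    """Ordinary Kriging weights, prediction, and variance from variogram values."""
    ones = np.ones(len(z))
    Gi_gamma = np.linalg.solve(Gamma, gamma)   # Gamma^{-1} gamma
    Gi_ones = np.linalg.solve(Gamma, ones)     # Gamma^{-1} 1
    denom = ones @ Gi_ones                     # 1^T Gamma^{-1} 1
    lam = Gi_gamma + Gi_ones * (1.0 - ones @ Gi_gamma) / denom
    var = gamma @ Gi_gamma - (ones @ Gi_gamma - 1.0) ** 2 / denom
    return lam, lam @ z, var

# Assumed variogram gamma(h) = 1 - exp(-|h|) on hypothetical 1-D data; the mean is not needed.
vario = lambda a, b: 1.0 - np.exp(-np.abs(np.subtract.outer(a, b)))
sites = np.array([0.0, 1.0, 3.0])
z = np.array([1.5, 2.4, 1.9])
Gamma = vario(sites, sites)
gamma0 = vario(sites, 2.0)

lam, pred, var = ordinary_kriging(Gamma, gamma0, z)
```

By construction the weights sum to one, so the predictor is unbiased for any constant mean, and the returned variance agrees with the direct expression $2\boldsymbol{\lambda}^\top \boldsymbol{\gamma} - \boldsymbol{\lambda}^\top \boldsymbol{\Gamma} \boldsymbol{\lambda}$ for the variance of the prediction error.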
3.3 Universal Kriging
Figure 4 shows an example of a Gaussian random field with a quadratic trend. In this case, the assumption of (intrinsic) stationarity is no longer valid. The universal Kriging approach detrends the random field implicitly. Universal Kriging allows the mean function of a random field to be a linear combination of some basis functions $f_l$, $l = 0, \ldots, L$, with constant unknown weights $\alpha_l$; hence,

$$\mu(s) = \sum_{l=0}^{L} \alpha_l f_l(s).$$

Fig. 4 Gaussian random field with quadratic trend

The most common choice for the basis functions is polynomials of degree smaller than or equal to two. If we decompose the mean squared error using this assumption about the mean, the bias term will be

$$E[\hat{Z}(s_0) - Z(s_0)] = \sum_{i=1}^{n} \lambda_i \sum_{l=0}^{L} \alpha_l f_l(s_i) - \sum_{l=0}^{L} \alpha_l f_l(s_0).$$

This equation can also be written as

$$E[\hat{Z}(s_0) - Z(s_0)] = \sum_{l=0}^{L} \alpha_l \left(\sum_{i=1}^{n} \lambda_i f_l(s_i) - f_l(s_0)\right).$$

For the bias to vanish, we set $\lambda_0 = 0$, and we have to find $\lambda_i$, $i = 1, \ldots, n$, which fulfill the following $L + 1$ conditions:

$$\sum_{i=1}^{n} \lambda_i f_l(s_i) = f_l(s_0), \qquad l = 0, \ldots, L.$$

Here, we see that the optimal weights do not depend on the coefficients $\alpha_l$, which are chosen for the modeling of the mean. Hence, the predictor is unbiased for every choice of the constants $\alpha_l$. Under these conditions, we minimize the variance of the prediction error using Lagrange multipliers. In particular, we have to solve a system of $n + L + 1$ equations. The solution exists and is unique if and only if the covariance matrix is strictly positive definite and the basis functions are linearly independent. However, one has to rely on numerical methods here. An explicit analytical formula for the optimal solution, the universal Kriging weights, is typically not available.
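One way to set up the system of $n + L + 1$ equations numerically is sketched below. The block form used here, $\mathbf{C}\boldsymbol{\lambda} + \mathbf{F}\mathbf{m} = \mathbf{c}$ together with $\mathbf{F}^\top \boldsymbol{\lambda} = \mathbf{f}_0$, follows from the Lagrangian of the constrained variance minimization; the names `universal_kriging_weights`, `F`, `f0`, the covariance $\exp(-|h|)$, and the linear trend basis are assumptions for illustration.

```python
import numpy as np

def universal_kriging_weights(C, c, F, f0):
    """Solve the (n + L + 1)-dimensional system for weights and Lagrange multipliers.

    F[i, l] = f_l(s_i) holds the basis functions at the data sites, f0[l] = f_l(s_0).
    """
    n, q = F.shape
    A = np.block([[C, F], [F.T, np.zeros((q, q))]])
    sol = np.linalg.solve(A, np.concatenate([c, f0]))
    return sol[:n], sol[n:]

# Hypothetical 1-D example: trend basis f_0(s) = 1, f_1(s) = s, assumed covariance exp(-|h|).
cov = lambda a, b: np.exp(-np.abs(np.subtract.outer(a, b)))
sites = np.array([0.0, 1.0, 2.0, 4.0])
s0 = 3.0
F = np.column_stack([np.ones_like(sites), sites])
f0 = np.array([1.0, s0])
lam, multipliers = universal_kriging_weights(cov(sites, sites), cov(sites, s0), F, f0)

# Data lying exactly on a linear trend are reproduced exactly by the predictor.
z_trend = 2.0 + 3.0 * sites
pred = lam @ z_trend
```

The last two lines illustrate the unbiasedness conditions: since the weights reproduce every basis function at $s_0$, data that follow the trend exactly are predicted without error, independently of the trend coefficients.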
3.4 Kriging in the Gaussian Setting
So far, we considered only linear predictors. In the general case, the best predictor for the actual value in the mean squared sense is the conditional expectation given the observations, $E[Z(s_0) \mid \mathbf{Z}]$. Indeed, it can be shown by elementary means that for all square-integrable $\mathbf{Z}$-measurable random variables $Y$, we have

$$E\left[(E[Z(s_0) \mid \mathbf{Z}] - Z(s_0))^2 \mid \mathbf{Z}\right] \le E\left[(Y - Z(s_0))^2 \mid \mathbf{Z}\right]$$

for any distribution of $(Z(s) : s \in \mathbb{R}^d)$ (Bauer 1991, Theorem 15.8). The Gaussian random field is a rare case where we can compute the conditional expectation explicitly. This fact yields a very useful result.

Theorem 1. For a Gaussian random field, the best linear predictor is optimal in the mean squared sense among all (also nonlinear) predictors, and we also have

$$Z(s_0) \mid \mathbf{Z} \sim N\left(\hat{Z}(s_0),\; E[(\hat{Z}(s_0) - Z(s_0))^2]\right).$$

Note that this means that in the Gaussian setting, we can use a suitable Kriging method as given above and can be sure that we have used the best possible predictor in the mean squared sense.
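Theorem 1 also gives a direct route to the exceedance probabilities asked for in the introduction: given the Kriging prediction and the Kriging variance, the conditional distribution of $Z(s_0)$ is normal. The helper below is a hypothetical sketch (the name `exceedance_probability` and the numbers in the usage line are assumptions) that uses only the Python standard library.

```python
import math

def exceedance_probability(pred, var, level):
    """P(Z(s_0) > level | data) when Z(s_0) | data ~ N(pred, var)."""
    if var <= 0.0:
        # Degenerate case, e.g., prediction at an observed location.
        return 1.0 if pred > level else 0.0
    u = (level - pred) / math.sqrt(var)
    # 1 - Phi(u), written with the complementary error function.
    return 0.5 * math.erfc(u / math.sqrt(2.0))

# Hypothetical numbers: predicted temperature 62 with Kriging variance 16, required level 60.
p = exceedance_probability(pred=62.0, var=16.0, level=60.0)
```

This computation is only justified under the Gaussian assumption; for other random fields the Kriging variance does not determine the conditional distribution.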
4 Further Aspects of Interpolation Methods
After having presented the basic Kriging approaches, we will briefly comment on some important variants and generalizations. A detailed treatment is beyond the scope of this contribution, and we will point to the relevant literature where appropriate.
4.1 Indicator Kriging
If we have zero-one data (“success or nonsuccess” or “group membership or not”), then we are in a slightly different situation than in the foregoing sections, as we do not have continuous data. However, we can interpret zero-one data either as a discretization of continuous data where only the crossing of a certain level is relevant for our application (such as in geothermal exploration, where a water temperature above a certain level is necessary for the success of the whole project) or simply as an indicator that states whether a particular property is observed at a data point or not. A typical application is then the estimation of the probability of observing this property when measuring the relevant data at a (not yet considered) point in space. Still, we can use the Kriging approach as a possible prediction method. The use of indicator functions in the standard Kriging approach goes back to Journel (1983). We use the standard convention of an indicator function compatible with the distribution function $F_Z(x) = P(Z \le x)$,

$$\tilde{1}_{Z_{\min}}(x) = \begin{cases} 1 & \text{if } x \le Z_{\min}, \\ 0 & \text{if } x > Z_{\min}, \end{cases}$$

for a given value $Z_{\min}$. Assuming that the underlying random process $Z(s)$ is strictly stationary, one obtains

$$\mu_Z(s) = F_{Z(s)}(Z_{\min}) = E\left[\tilde{1}_{Z_{\min}}(Z(s))\right], \qquad 2\gamma_Z(h) = \mathrm{Var}\left(\tilde{1}_{Z_{\min}}(Z(s + h)) - \tilde{1}_{Z_{\min}}(Z(s))\right)$$

for $s, h \in \mathbb{R}^d$. Note in particular that we have to use a new variogram, the indicator variogram. We then calculate the mean squared error optimal weights $\lambda_i$, the

Indicator Kriging Weights

$$\boldsymbol{\lambda} = \boldsymbol{\Gamma}_Z^{-1} \left(\boldsymbol{\gamma}_Z + \mathbf{1}\, \frac{1 - \mathbf{1}^\top \boldsymbol{\Gamma}_Z^{-1} \boldsymbol{\gamma}_Z}{\mathbf{1}^\top \boldsymbol{\Gamma}_Z^{-1} \mathbf{1}}\right).$$

Note further that the indicator Kriging estimator is the mean squared optimal estimator for the conditional probability to observe a success at location $s_0$ for each variable given the already observed data, i.e., for

$$P(Z(s_0) \le Z_{\min} \mid Z(s_1), \ldots, Z(s_n)).$$

It is important to note that although we consider a weighted sum of 0–1-variables, the indicator Kriging estimator might not (!) deliver a probability as the weights are not necessarily nonnegative. To ensure nonnegativity of the indicator Kriging weights, one has to solve a suitably constrained optimization problem. This can easily be achieved by standard software tools. However, the Kriging weights obtained in this way do not in general have an explicit form as simple as the ones above.
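The indicator transform and the indicator variogram can be illustrated on a toy data set. In the sketch below, the helper names, the regularly spaced one-dimensional data, and the use of the classical estimator $\tfrac{1}{2}\,\mathrm{mean}\big((v(s + h) - v(s))^2\big)$ for the empirical variogram are assumptions for illustration; the final lines merely demonstrate that a weighted sum of indicators with a negative weight can fall outside $[0, 1]$.

```python
import numpy as np

def indicator(z, z_min):
    """Indicator data: 1 where z <= z_min, 0 where z > z_min."""
    return (np.asarray(z) <= z_min).astype(float)

def empirical_variogram(values, lag):
    """Classical estimator 0.5 * mean((v(s + lag) - v(s))^2) on a regular 1-D grid."""
    diffs = values[lag:] - values[:-lag]
    return 0.5 * np.mean(diffs ** 2)

z = np.array([1.0, 5.0, 2.0, 7.0, 3.0, 8.0])  # hypothetical measurements
ind = indicator(z, z_min=4.0)
gamma_1 = empirical_variogram(ind, 1)
gamma_2 = empirical_variogram(ind, 2)

# Hypothetical weights summing to one, one of them negative: the weighted sum of
# indicators leaves [0, 1]. Clipping is a crude fix, not the constrained
# optimization mentioned in the text.
raw = np.array([1.2, -0.3, 0.1]) @ np.array([1.0, 0.0, 1.0])
clipped = float(np.clip(raw, 0.0, 1.0))
```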
4.2 Multivariate Kriging and Cokriging
There are two straightforward arguments for considering multivariate Kriging. One is the fact that more than one quantity has to be predicted as a basis of a decision (such as water pressure and temperature in geothermal drilling or quantity and quality of an ore in mining) for further investigation of a possible place. Another practical situation is that the data contain not only samples of the random field of interest but also measurements of secondary random fields. The realizations of the secondary variables are not necessarily measured at the same locations as the realizations of the random field of interest; the two sets of locations can even be disjoint. Nevertheless, the information in the secondary variables, even if they have no obvious explanatory meaning, can and should be used for prediction. However, where we do not have measurements of them at all data points, we have to predict them, too, to make use of them. Both these situations call for methods of multivariate Kriging, i.e., the joint (least-squares optimal) prediction of more than one variable. This is then referred to as cokriging. It comes in the same versions as the one-dimensional Kriging methods of the last section; in essence, one has to replace variables by suitable vectors and vectors by suitable matrices (see Chiles and Delfiner (2012, Chapter 5) for a detailed treatment).

However, before we start with multivariate Kriging, we need to adapt our notation to the multivariate setting. We now consider $p$ random fields over a joint domain $D$ and denote them by $\{Z_i(s) : s \in D \subset \mathbb{R}^d\}$ for $i = 1, \ldots, p$. Furthermore, we assume that all random fields are intrinsically stationary. The realizations of the random field $Z_i(s)$ are measured at locations $S_i := \{s_k \in D,\; k = 1, \ldots, N_i\}$. We denote the corresponding vector of observations by

$$\mathbf{Z}_i := (Z_i(s_1), \ldots, Z_i(s_{N_i}))^\top.$$

We denote the objective of prediction, which can belong to one of the observed random fields, by $Z(s_0)$ and use analogous notations as before for mean and covariance functions:

$$\boldsymbol{\mu}_i := (\mu_i(s_1), \ldots, \mu_i(s_{N_i}))^\top, \qquad \mathbf{C}_{ij} := \mathrm{Cov}(\mathbf{Z}_i, \mathbf{Z}_j^\top), \qquad \mathbf{c}_{i0} := \mathrm{Cov}(Z(s_0), \mathbf{Z}_i).$$

As we now have a whole set of random fields, the linear predictor in the multivariate case is given by the sum over all realizations and all samples

$$Z^*(s_0) = \lambda_0 + \sum_i \sum_k \lambda_{ik} Z_i(s_k) = \lambda_0 + \sum_i \boldsymbol{\lambda}_i^\top \mathbf{Z}_i,$$

where $\lambda_0 \in \mathbb{R}$ and $\boldsymbol{\lambda}_i = (\lambda_{i1}, \ldots, \lambda_{iN_i})^\top \in \mathbb{R}^{N_i}$. To demonstrate briefly how the multivariate prediction works, we assume that the mean of each random field is known, i.e., we perform simple cokriging. The approach is the same as in the case of simple Kriging. First, we look at the mean squared error of the linear predictor $Z^*(s_0)$. By setting

$$\lambda_0 = \mu_1(s_0) - \sum_i \boldsymbol{\lambda}_i^\top \boldsymbol{\mu}_i,$$

we eliminate the bias. Thus, it remains to minimize the variance

$$\mathrm{Var}(Z(s_0) - Z^*(s_0)) = \sum_i \sum_j \boldsymbol{\lambda}_i^\top \mathbf{C}_{ij} \boldsymbol{\lambda}_j - 2 \sum_i \boldsymbol{\lambda}_i^\top \mathbf{c}_{i0} + \mathrm{Var}(Z(s_0)),$$

which leads to the simple cokriging system

$$\sum_j \mathbf{C}_{ij} \boldsymbol{\lambda}_j = \mathbf{c}_{i0}, \qquad i = 1, \ldots, p.$$

The resulting cokriging variance is then given by

$$\sigma_{CK}^2 = \mathrm{Var}(Z(s_0)) - \sum_i \boldsymbol{\lambda}_i^\top \mathbf{c}_{i0}.$$

With regard to performing cokriging, everything is thus completely analogous to the one-dimensional case; there are only more equations to solve to obtain the optimal weights. However, the major difficulty lies in the required input. In contrast to simple Kriging, we need to model and estimate not only one variogram function but also the cross-variograms

$$\gamma_{ij}(h) = \mathrm{Var}(Z_i(s + h) - Z_j(s))$$

for all pairs $i, j = 1, \ldots, p$. Note that for the estimation of this cross-variogram, the samples of different random fields do not have to have the same locations. Only the separation distance from the sample in $Z_i$ to the sample in $Z_j$ is taken into account. Also, we have to consider that the secondary variables are often measured in different units; thus, the above formula makes no sense without modification. Hence, before we can estimate variograms, we need to make the data from different random fields comparable by dividing the $i$th sample by the square root of its estimated variogram, $(\hat{\gamma}_{ii}(h_0))^{1/2}$, for some fixed $h_0 \in \mathbb{R}^d$ (Cressie 1991, Chapter 3.2.3).
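For two random fields, the simple cokriging system is a single block linear system. The sketch below assumes, purely for illustration, a proportional cross-covariance model $C_{ij}(h) = B_{ij}\,\rho(h)$ with a positive definite matrix `B` and an exponential correlation `rho`; this model, the site locations, and all names are hypothetical and not taken from the text. The target is the first field at $s_0 = 2$, where only the secondary field has been observed.

```python
import numpy as np

def rho(a, b, length=2.0):
    """Assumed exponential correlation between 1-D location arrays a and b."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / length)

B = np.array([[1.0, 0.6], [0.6, 1.0]])       # assumed variances and cross-covariance at lag 0
S = [np.array([0.0, 1.0, 3.0]),              # sites of the primary field Z_1
     np.array([0.5, 2.0, 2.5])]              # sites of the secondary field Z_2
s0 = np.array([2.0])                         # target: Z_1 at s_0 = 2

# Block matrix (C_ij) and stacked right-hand side (c_i0) of the simple cokriging system.
K = np.block([[B[i, j] * rho(S[i], S[j]) for j in range(2)] for i in range(2)])
k0 = np.concatenate([B[0, i] * rho(S[i], s0)[:, 0] for i in range(2)])

lam = np.linalg.solve(K, k0)                 # stacked weights (lambda_1, lambda_2)
var_ck = B[0, 0] - lam @ k0                  # cokriging variance

# For comparison: simple Kriging variance using the primary data only.
c10 = B[0, 0] * rho(S[0], s0)[:, 0]
var_sk = B[0, 0] - c10 @ np.linalg.solve(B[0, 0] * rho(S[0], S[0]), c10)
```

In this setup the cokriging variance cannot exceed the simple Kriging variance, since the secondary data only enlarge the set of admissible linear predictors.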
4.3 Further Kriging
There are many more variants of Kriging, such as probability Kriging, an extension of indicator Kriging in which, besides 0–1-variables, secondary variables are also used for the prediction of a desired probability. Other popular names are disjunctive Kriging (i.e., cokriging of indicator functions) or factorial Kriging (i.e., cokriging with independent random fields). However, before using multivariate or nonlinear methods, one should always consider the additional effort and the additional statistical uncertainties of estimating additional variograms. It might be appropriate in many cases to consider a deterministic, nonlinear influence of additional variables and then use universal Kriging and stick with the original variogram.
5 An Example with Synthetic Data
In this section, we demonstrate how the best linear prediction works using synthetic data. For this, we generate a Gaussian random field on the grid $[1, 100] \times [1, 100]$ with mesh size 1. That is, we generate 10,000 normally distributed random variables with a given covariance structure that defines the spatial relationships between these random variables. These spatial relationships are fully described by the corresponding variogram function. We cannot choose an arbitrary function for modeling the variogram, since by definition the function has to be conditionally negative definite. Thus, for any locations $s_1, \ldots, s_n$ and weights $\alpha_1, \ldots, \alpha_n$ with $\sum_{i=1}^n \alpha_i = 0$, we must have

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \gamma(s_j - s_i) \le 0.$$
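The inequality can be checked numerically for a concrete variogram. The sketch below uses the Brownian motion from Sect. 2, for which $\mathrm{Var}(Z(s + t) - Z(s)) = t$ and hence $\gamma(h) = |h| / 2$; the random locations, the random weights, and the number of trials are arbitrary choices for illustration, and such a check can of course only falsify, not prove, conditional negative definiteness.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random 1-D locations and the variogram matrix gamma(s_j - s_i) = |s_j - s_i| / 2.
s = rng.uniform(0.0, 100.0, size=20)
G = 0.5 * np.abs(s[:, None] - s[None, :])

# Evaluate the quadratic form for many weight vectors that sum to zero.
worst = -np.inf
for _ in range(1000):
    alpha = rng.standard_normal(20)
    alpha -= alpha.mean()              # enforce sum(alpha) = 0
    worst = max(worst, alpha @ G @ alpha)
```

For this variogram, the quadratic form equals minus the variance of $\sum_i \alpha_i Z(s_i)$, so the largest value found, `worst`, stays below zero.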
Only then the definition of the variogram makes sense as it is ensured that the covariance matrix is positive semidefinite. There is a wide range of variogram models of which we mention only the most common ones (Cressie 1991, p. 58ff).

Nugget Model. Empirical variograms often have a discontinuity at the origin. This means that even for very small separation distances, the variogram is significantly larger than zero. Mathematically, this cannot happen for random fields with spatial continuity. This effect is called the nugget effect. It can be explained by measurement errors, but the more realistic explanation is the variation in the data on a scale which is smaller than the minimal distance $h$ used in the calculation. The nugget effect model is given by

$$\gamma(h) = \begin{cases} 0, & h = 0, \\ c_0, & h \ne 0, \end{cases}$$

with some positive constant $c_0$. The nugget model is usually used in combination with other variogram models. It implies that there is no spatial dependency in the random field, i.e., it models the white noise part in the data (see Fig. 5).

Fig. 5 Random field without spatial dependency

Spherical Model. Here, we assume

$$\gamma(h) = \begin{cases} 0, & h = 0, \\ c_0 + c_s \left(\dfrac{3}{2} \dfrac{h}{a_s} - \dfrac{1}{2} \left(\dfrac{h}{a_s}\right)^3\right), & 0 < h \le a_s, \\ c_0 + c_s, & h \ge a_s, \end{cases}$$

with $c_0, c_s, a_s \ge 0$. The constant $a_s$ is called the correlation range; it defines the distance below which the data remain correlated. $\gamma(a_s) = c_0 + c_s$ is called the sill. It is a positive constant at which the function reaches its plateau. The spherical variogram is linear for small separation distances and is flat for large distances.

Gaussian Model. This model is used for fields with strong spatial continuity, i.e., the examined quantity varies slowly in space. This effect is caused by the parabolic behavior near the origin. The function is given by

$\gamma(h) =$
Hohenpeißenberg. However, their oscillations s are nearly of equal size (0.8), and so are the auto-correlations r(1): the correlation between the averages of two consecutive years amounts to 0.29 to 0.36. We will see below how much of this is due to the long-term trend of the series. Discussion of the rows Winter to Autumn: The winter data have the largest oscillations s and small auto-correlations r(1). (The r(1) values of the autumn data are even smaller, indicating practically no correlation.) The time series plot of the winter series (lower plot of Fig. 1) reflects the s and r(1) values of the table. In comparison with the upper plot of the annual means, it shows a high fluctuation with no distinct trend, coming closer to the plot of a pure random series.
2.3 Precipitation Series
Again, we have drawn two time series plots for each station: the yearly sums (upper plot) and the winter sums (lower plot). The plots for Hohenpeißenberg (1879–2008) are shown in Fig. 2; those for Karlsruhe (1876–2008) and Potsdam (1893–2008) can be found on the author's homepage.
2260
H. Pruscha
Fig. 1 Annual temperature means (top) and winter temperature means (bottom) in (°C), Hohenpeißenberg, 1781–2008; with a fitted polynomial of fourth order (dashed line), with centered moving (10-year) averages (inner solid line), and with the total average over all 228 years (horizontal dots)
Table 3 gives descriptive statistical measures: the total precipitation amount h (height in mm), the standard deviation s, and the auto-correlation of first order r(1). The annual precipitation amount at the mountain Hoher Peißenberg is twice the amount in Potsdam. The oscillation values s stand in the same order as the amounts h. This differs from the temperature results, where all three s values were nearly the same. Note that the precipitation scale has a genuine zero point, but the temperature scale has none (which is relevant for us).
Statistical Analysis of Climate Series
2261
Table 2 Descriptive measures of the seasonal and annual temperature data in (°C) for the three stations

         Hohenp. (n = 228)       Karlsruhe (n = 210)     Potsdam (n = 116)
         m       s      r(1)     m       s      r(1)     m       s      r(1)
Winter   -1.36   1.74   0.08      1.76   1.89   0.11      0.19   2.09   0.13
Spring    5.56   1.32   0.16     10.18   1.08   0.24      8.45   1.15   0.19
Summer   14.25   1.08   0.20     18.72   1.07   0.25     17.40   1.01   0.15
Autumn    6.96   1.30   0.01     10.19   1.03   0.05      8.91   1.08   0.08
Year      6.35   0.84   0.29     10.22   0.80   0.33      8.78   0.81   0.36
Fig. 2 Annual precipitation amounts (top) and winter precipitation amounts (bottom) in (dm), Hohenpeißenberg, 1879–2008; with a fitted polynomial of fourth order (dashed line), with centered moving (10-years) averages (inner solid line) and with the total average over all 130 years (horizontal dots)
Table 3 Descriptive measures of the seasonal and annual precipitation amount in (mm) for the three stations

         Hohenp. (n = 130)      Karlsruhe (n = 133)    Potsdam (n = 116)
         h       s     r(1)     h      s     r(1)      h      s     r(1)
Winter     166    54   0.14     168     54   0.04      130     37   0.06
Spring     264    73   0.23     178     56   0.11      131     42   0.02
Summer     453    94   0.12     228     70   0.20      196     60   0.03
Autumn     246    79   0.04     189     64   0.01      133     43   0.22
Year     1,130   173   0.27     762    135   0.01      590     97   0.08
Compared with the other three seasons, the winter precipitation has the smallest total h and the smallest oscillation s (winter temperature had the largest s). While the precipitation series of winter and year in Karlsruhe and Potsdam, with their small r(1) coefficients, resemble series of uncorrelated variables (also called pure random series), the series at Hohenpeißenberg do not (see also Sect. 4).
3 Temperature Trends
In this section, we study the long-term trend of temperature over the last two centuries.
3.1 Comparison of the Last Two Centuries
While temperature decreases in the nineteenth century, it increases in the twentieth century, see Fig. 3. We report the following results:
1. The regression coefficients (slopes) $b = b_{\mathrm{Temp}|\mathrm{Year}}$ of the two separately fitted straight lines $\hat{y}_t = a + b\,t$ are tested against the hypothesis of a zero slope. The level 0.01 bound for the test statistic $T = |r| / \sqrt{1 - r^2}$ is $t_{98;\,0.995} / \sqrt{98} = 0.265$. Herein, the correlation coefficient $r = b\,(s_{\mathrm{Year}} / s_{\mathrm{Temp}})$ is the dimension-free version of b. As Table 4 informs us, the negative trend in the nineteenth century and the positive trend in the twentieth century are statistically well confirmed (at Hohenpeißenberg and in Karlsruhe). The test assumes uncorrelated residuals $e_t = y_t - \hat{y}_t$. This can be substantiated using the auto-correlation function of the $e_t$ (not shown, but see Sect. 5 for similar analyses).
2. The total means $m_1$ and $m_2$ of the two centuries do not differ very much from each other and from the total mean m of the whole series, see Table 4. The average $m_3$ over the last 20 years is significantly larger than m, $m_1$, and $m_2$ [0.01 level]; that is immediately confirmed by a two sample test, even after a
Fig. 3 Annual temperature means (°C) Hohenpeißenberg, 1781–2008 (top), Karlsruhe, 1799–2008 (bottom); with fitted straight line for each century, compare also Schönwiese et al. (1993). The fitted line for the last 20 years is also shown (dashed line)
correction, discussed in the following. The warming in the last two decades is well established by our data. When applying tests and confidence intervals to time series data, the effect of auto-correlation should be taken into account. To compensate, the sample size n is to be reduced to an effective sample size $n_{\mathrm{eff}}$. As an example, we treat the confidence
Table 4 Statistical measures for the temperature (°C) of the last two centuries and of the last 20 years

Hohenpeißenberg
Period       Mean value   Stand. dev.   Regress. b*100   Correl. r   Test T
19th cent.    6.129       0.843         -0.763           -0.262      0.271
20th cent.    6.445       0.747          1.006            0.390      0.423
1989–2008     7.448       0.670          2.561            0.226

Karlsruhe
Period       Mean value   Stand. dev.   Regress. b*100   Correl. r   Test T
19th cent.   10.114       0.845         -1.079           -0.370      0.398
20th cent.   10.219       0.689          0.988            0.416      0.457
1989–2008    11.357       0.547          2.836            0.307
interval for the true mean value of a climate variable, let's say the long-term temperature mean. On the basis of an observed mean value $\bar{y}$, a standard deviation s and an auto-correlation function r(h), it is
\[ \left[\, \bar{y} - u_0 \frac{s}{\sqrt{n_{\mathrm{eff}}}},\ \ \bar{y} + u_0 \frac{s}{\sqrt{n_{\mathrm{eff}}}} \,\right], \qquad u_0 = u_{1-\alpha/2}, \]
with $u_\gamma$ being the $\gamma$-quantile of the N(0, 1)-distribution (a large n is assumed), and with von Storch and Zwiers (1999) and Brockwell and Davis (2006)
\[ n_{\mathrm{eff}} = \frac{n}{1 + 2 \sum_{k=1}^{n-1} (1 - k/n)\, r(k)}. \]
For an AR(1)-process with an auto-correlation r = r(1) of first order, we have $n_{\mathrm{eff}} = n\,(1 - r)/(1 + r)$. Hohenpeißenberg: n = 228, r = 0.289, $\bar{y}$ = 6.352, s = 0.844 lead to $n_{\mathrm{eff}}$ = 125.76 and thus to a 99 % confidence interval [6.158, 6.546]. Karlsruhe: n = 210, r = 0.332, $\bar{y}$ = 10.216, s = 0.802 lead to $n_{\mathrm{eff}}$ = 105.32 and so to a 99 % confidence interval [10.015, 10.417]. In both cases, at least 18 of the last 20 yearly temperature means lie above the upper 99 % confidence limit, reinforcing result 2 above. The winter temperatures show the same pattern, but in a weakened form: the fall and the rise of the straight lines are no longer significant (see result 1), and 14 of the last 20 winter temperature means lie above the upper 99 % limit.
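The effective sample size and the resulting confidence interval can be sketched in a few lines of Python. This is only an illustration, not the author's code: the function names are hypothetical, and the normal quantile is taken from the standard library. With the Hohenpeißenberg figures quoted above it should give values close to those in the text.

```python
from math import sqrt
from statistics import NormalDist


def effective_sample_size(n, acf):
    """n_eff = n / (1 + 2 * sum_{k=1}^{n-1} (1 - k/n) * r(k)); acf[k-1] holds r(k)."""
    total = sum((1.0 - k / n) * r_k for k, r_k in enumerate(acf[: n - 1], start=1))
    return n / (1.0 + 2.0 * total)


def effective_sample_size_ar1(n, r1):
    """AR(1) shortcut quoted in the text: n_eff = n (1 - r) / (1 + r)."""
    return n * (1.0 - r1) / (1.0 + r1)


def mean_confidence_interval(y_bar, s, n_eff, level=0.99):
    """Large-sample interval y_bar -/+ u0 * s / sqrt(n_eff), with u0 = u_{1 - alpha/2}."""
    u0 = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    half_width = u0 * s / sqrt(n_eff)
    return y_bar - half_width, y_bar + half_width


# Hohenpeissenberg figures from the text: n = 228, r(1) = 0.289, mean 6.352, s = 0.844
n_eff_h = effective_sample_size_ar1(228, 0.289)
ci_h = mean_confidence_interval(6.352, 0.844, n_eff_h)
print(round(n_eff_h, 2), [round(v, 3) for v in ci_h])
```

The general formula needs an estimated auto-correlation function as input; the AR(1) shortcut only needs r(1), which is why it is used for the worked numbers.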
3.2 Historical Temperature Variation
Statistical results are formal statements; they alone do not allow substantial statements on Earth warming. In particular, a prolongation of the upward lines of Fig. 3 would be dubious. An inspection of temperature variability of the last thousand
years reveals that a trend (on a shorter time scale) could turn out to be part of the normal variation of the climate system (Schönwiese 1979; von Storch and Zwiers 1999).
4 Correlation: From Yearly to Daily Data
Scattergrams and correlation coefficients are defined for a bivariate sample (x1 , y1 ), . . . , (xn , yn ), where two variables, x and y, are measured n-times at comparable objects.
4.1 Auto-correlation Coefficient
How strongly is an observation at time point t (named x) correlated with the observation at the succeeding time point t + 1 (named y)? That is, we are dealing now with the case that x and y are the same variable (e.g., temperature Tp) but observed at different time points,
\[ x = Tp(t), \qquad y = Tp(t + 1). \]
The scattergram of Fig. 5 (left) presents the 12 · 228 monthly temperature means at Hohenpeißenberg. The corresponding correlation coefficient is r = r(1) = 0.79. The large value is due to the seasonal effects, i.e., to the course of the monthly temperatures over the year. It contains, so to say, much redundant information. In order to adjust, we first calculate the seasonal effects as the total averages for each month, $m_{\mathrm{jan}}, \ldots, m_{\mathrm{dec}}$, together forming the seasonal component. Figure 4 gives the seasonal component for the three stations in the form of histograms. Then we build seasonally adjusted data by subtracting from each monthly mean the corresponding seasonal effect. The scattergram of Fig. 5 (right) has the correlation coefficient r = 0.15, which is much smaller than the r = 0.79 from above for the non-adjusted data. Tables 5 and 6 give auto-correlations $r(1) = r(Y_t, Y_{t+1})$ of climate variables for two successive time points. We deal with the variables Y = yearly, quarterly, monthly, daily temperature and precipitation. The r(1) coefficients for day were obtained from the 1,460 consecutive daily temperature and precipitation records of the years 2004–2007. Besides the auto-correlation r(1) of the non-adjusted variables (put in parentheses), we present the r(1) coefficient for the adjusted variables without parentheses. Herein adjustment refers to the removal of the trend component (here a polynomial of order 4) in the case of year, quarter, and days. In the latter case, the polynomial
Fig. 4 Monthly temperatures; total averages at the three climate stations (panels: Temp Hohenpeißenberg, Temp Karlsruhe, Temp Potsdam; horizontal axis: Month, vertical axis: Temperature (°C))
Fig. 5 Monthly temperature means TP = Y, Hohenpeißenberg, 1781–2008. Scattergrams Y(t + 1) over Y(t) with n = 12 · 228 − 1 points (axes: TP[t], TP[t+1]); left: original (not adjusted) variables, with correlation r = 0.79; right: seasonally adjusted variables (i.e., removal of monthly total averages), with correlation r = 0.15
was drawn over the 365 days of the year (removal of the seasonal component in the case of month). Note that the non-adjusted temperature variables do not have negative auto-correlations (persistence), but some precipitation variables do (switch over). In the following, only the outcomes for the adjusted series, i.e., the figures of Tables 5 and 6 not in parentheses, are discussed.
Temperature: As is to be expected, the auto-correlation of the daily data set is large. Those in the case of month, season, and year are smaller. The seasonal auto-correlations are (with one exception) smaller than the yearly ones; in particular, the winter → succeeding winter correlation is small.
Precipitation: Only at the mountain Hohenpeißenberg does the auto-correlation of yearly data differ distinctly from zero. Here the precipitation series has more inner structure than the series of Karlsruhe or Potsdam; see also the conclusions
Table 5 Auto-correlation $r(1) = r(Y_t, Y_{t+1})$ for climate variables (Hohenpeißenberg), without (in parentheses) and with adjustment

                        Temperature                   Precipitation
Succession              n       r(Y_t, Y_{t+1})       n       r(Y_t, Y_{t+1})
Year → succeed. Year    227     (0.289) 0.118         129     (0.273) 0.184
Winter → succ. Wi       227     (0.077) 0.011         129     (0.144) -0.006
Summer → succ. Su       227     (0.198) 0.104         129     (0.116) 0.168
Winter → succ. Su       227     (0.162) 0.101         129     (0.219) 0.170
Summer → succ. Wi       227     (0.061) 0.016         129     (0.025) 0.098
Month → succ. Mo        2,735   (0.787) 0.152         1,559   (0.377) 0.012
Day → succeed. Day      1,460   (0.932) 0.825         1,460   (0.271) 0.250
Table 6 Auto-correlation $r(1) = r(Y_t, Y_{t+1})$ for climate variables (Karlsruhe), without (in parentheses) and with adjustment

                        Temperature                   Precipitation
Succession              n       r(Y_t, Y_{t+1})       n       r(Y_t, Y_{t+1})
Year → succeed. Year    209     (0.332) 0.110         132     (0.009) 0.005
Winter → succ. Wi       209     (0.113) 0.060         132     (0.041) 0.082
Summer → succ. Su       209     (0.250) 0.064         132     (0.201) 0.230
Winter → succ. Su       209     (0.175) 0.121         132     (0.104) 0.127
Summer → succ. Wi       209     (0.119) 0.052         132     (0.084) 0.067
Month → succeed. Mo     2,519   (0.811) 0.197         1,595   (0.071) 0.029
Day → succeed. Day      1,460   (0.962) 0.867         1,460   (0.162) 0.157
(Sect. 7). In complete contrast to the temperature situation, the correlations of the daily precipitation data are (perhaps against expectations) relatively small, and those of the monthly data are nearly negligible. What is the meaning of a particular r(1) value when we are at time t and the immediately succeeding observation (at time t + 1) is to be predicted?
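The seasonal adjustment of Sect. 4.1 and the first-order auto-correlation can be illustrated with a short Python sketch. The data below are synthetic (a sine-shaped seasonal cycle plus Gaussian noise, chosen only for illustration); they are not the station records, and the function names are hypothetical.

```python
import math
import random


def lag1_autocorrelation(series):
    """Correlation coefficient r(1) of the pairs (Y(t), Y(t+1))."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def seasonally_adjust(monthly, period=12):
    """Subtract from each value the total average of its calendar month."""
    effects = []
    for m in range(period):
        values = monthly[m::period]
        effects.append(sum(values) / len(values))
    return [v - effects[i % period] for i, v in enumerate(monthly)]


# Synthetic monthly series over 50 years: seasonal cycle plus noise (assumption)
rng = random.Random(0)
monthly = [8.0 * math.sin(2 * math.pi * (i % 12) / 12) + rng.gauss(0.0, 1.5)
           for i in range(12 * 50)]
r_raw = lag1_autocorrelation(monthly)
r_adj = lag1_autocorrelation(seasonally_adjust(monthly))
print(round(r_raw, 2), round(r_adj, 2))
```

As in the comparison of the two scattergrams of Fig. 5, the unadjusted coefficient is dominated by the seasonal cycle, while the adjusted one reflects only what remains after the monthly averages are removed.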
4.2 Prediction of Above-Average Values
Assume that we have calculated a certain auto-correlation $r(1) = r(Y_t, Y_{t+1})$. Assume further that we have just observed an above-average (or an extreme) value of $Y_t$. What is the probability P that the next observation $Y_{t+1}$ will be above-average (or extreme), too? To tackle this problem, let X and Y denote two random variables, with the coefficient $\rho = \rho_{X,Y}$ of the true correlation between them. We ask for the probability that an observation X, being greater than a certain threshold value $Q^x$, is followed by an observation Y exceeding a threshold $Q^y$. If the X-value exceeds $Q^x$, then Table 7 gives (broken down according to the coefficient $\rho$) the probabilities P for the event that the Y-value exceeds $Q^y$. As threshold values we choose quantiles $Q_\gamma$ (also called
Table 7 Conditional probabilities for exceeding threshold values Q

                                    Correlation ρ = ρ_{X,Y}
Conditional probability             0.00   0.10   0.20   0.30   0.40   0.50   0.60   0.70
P(Y > Q^y_0.50 | X > Q^x_0.50)      0.50   0.53   0.56   0.60   0.63   0.67   0.70   0.75
P(Y > Q^y_0.75 | X > Q^x_0.75)      0.25   0.29   0.34   0.40   0.45   0.51   0.57   0.64
P(Y > Q^y_0.90 | X > Q^x_0.90)      0.10   0.14   0.17   0.24   0.29   0.39   0.45   0.53
$\gamma \cdot 100$ % percentiles), for $\gamma$ = 0.5, 0.75, 0.90. These threshold values could also be called: average value (more precisely a 50 % value), upper 25 % value, upper 10 % value, respectively. Each entry in Table 7 is calculated by means of 40,000 simulations of a pair (X, Y) of two-dimensional Gaussian random variables.
Examples: Assume that X turns out to exceed the X-average $Q^x_{0.50}$ (X-value being above-average). Then the probability that Y is above-average, too, equals 50 % for $\rho$ = 0; 60 % for $\rho$ = 0.30; 70 % for $\rho$ = 0.60. If X exceeds $Q^x_{0.90}$ (X being an upper 10 % value), the probability that Y is an upper 10 % value, too, equals 10 % for $\rho$ = 0; 24 % for $\rho$ = 0.30; 45 % for $\rho$ = 0.60. In the sequel X and Y will denote climate variables, where Y follows X in time.
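A simulation of the kind used for Table 7 might look as follows. This is a sketch under assumptions, not the author's program: X and Y are taken as standard normal with correlation ρ, the thresholds are the theoretical N(0, 1) quantiles, and the function name and seed are arbitrary. For the median threshold the estimate can be checked against the closed form 1/2 + arcsin(ρ)/π, which holds for a bivariate Gaussian pair.

```python
import random
from math import asin, pi, sqrt
from statistics import NormalDist


def conditional_exceedance(rho, gamma, n_sim=40_000, seed=1):
    """Estimate P(Y > Q_gamma | X > Q_gamma) for a standard bivariate Gaussian pair."""
    rng = random.Random(seed)
    q = NormalDist().inv_cdf(gamma)  # same quantile for X and Y, both N(0, 1)
    hits = 0
    exceed = 0
    for _ in range(n_sim):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = z1
        y = rho * z1 + sqrt(1.0 - rho * rho) * z2  # corr(x, y) = rho
        if x > q:
            exceed += 1
            if y > q:
                hits += 1
    return hits / exceed


for rho in (0.0, 0.3, 0.6):
    simulated = conditional_exceedance(rho, 0.50)
    exact = 0.5 + asin(rho) / pi  # closed form, valid for the median threshold only
    print(rho, round(simulated, 2), round(exact, 2))
```

With 40,000 simulated pairs the estimates carry a sampling error of a few thousandths for the median threshold and somewhat more for the upper 10 % threshold, where only about a tenth of the pairs enter the conditional count.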
Application to Climate Data
Once again, only the outcomes for the adjusted series, i.e., the figures in Tables 5 and 6 not in parentheses, are discussed. The absolute values |r(1)| of most auto-correlations fall into the interval from 0.0 to 0.2. The ratio of hits (when observing an above-average climate value and predicting the same for the next observation) lies between 50 and 56 % (according to Table 7). This is to be compared with the 50 % obtained when pure guessing via "coin tossing" is applied. These modest chances of a successful prediction will find their empirical counterparts in Table 8. The daily temperatures, with r(1) > 0.70, have a ratio above 75 % for the prediction above-average → above-average. If we have an upper 10 % day, then we can predict the same for the next day with success probability above 53 % (to be compared with 10 % when merely guessing).
Folk Sayings
Folk (or country) sayings about weather relate to a narrow region (probably not covered here), a particular time epoch (here centuries are involved), and to the crop (Malberg 2003). The former weather observers (from the country or from monasteries), without modern measuring, recording, and evaluation equipment, were pioneers of weather forecasting. The following sayings are selected from Malberg (2003) and from popular sources. We kept the German language, but we have transcribed them in Table 8.
Table 8 Hit ratio of the rules 1–6. Explanations in the text

Ex   X → Y               r(X, Y) Hohen.
1    Tp Dec → Tp Jan     0.13
1    Tp Dec → Tp Feb     0.11
2    Tp Sep → Tp Oct     0.14
3    Tp Nov → Tp May     0.05
3    Pr Nov → Pr May     0.02
4    Tp Aug → Tp Feb     0.08
4    Pr Aug → Pr Feb     0.04
5    Tp Sum → Tp Win     0.06
5    Pr Sum → Pr Win     0.02
6    Tp Win → Tp Sum     0.16
6    Pr Win → Pr Sum     0.22

P: 0.54 0.53 0.54 0.54 0.51 0.50 0.53 0.51 0.51 0.49 0.50 0.50 0.55 0.55 0.45 0.57 0.43
[>|>] [>|>] [>|>] [] [>|>] [>|>] [>|>] [>|>] [] [>|>] [] [>|>] [] [>| or a < sign
Persistence rules
Ex. 1: Ist Dezember lind → der ganze Winter ein Kind
Ex. 2: Kühler September → kalter Oktober
Six-months rules
Ex. 3: Der Mai kommt gezogen wie der November verflogen
Ex. 4: Wie der August war → wird der künftige Februar
Yearly-balance rules
Ex. 5: Wenn der Sommer warm ist → so der Winter kalt
Ex. 6: Wenn der Winter kalt ist → so der Sommer warm
The columns of Table 8 present: the transcription of the weather rules 1–6, with Tp standing for temperature and Pr for precipitation; the correlation coefficient r from the Hohenpeißenberg data; the conditional probability $P(Y > Q^y_{0.5} \mid X > Q^x_{0.5})$ belonging to the r-value according to Table 7; and the percentage $\%[Y > \bar{y} \mid X > \bar{x}]$ of cases in which an above-average X-value is followed by an above-average Y-value. This is given for Berlin-Dahlem 1908–1987 (Malberg 2003), Karlsruhe, Hohenpeißenberg. Rule 2 aims at the percentage $\%[Y < \bar{y} \mid X < \bar{x}]$, rule 5 at $\%[Y < \bar{y} \mid X > \bar{x}]$, rule 6 at $\%[Y > \bar{y} \mid X < \bar{x}]$. These percentages are presented, too, in addition to the percentage $\%[Y > \bar{y} \mid X > \bar{x}]$. The hit ratios, gained from the Hohenpeißenberg and from the Karlsruhe data, are rather poor and cannot confirm the rules. At most, the persistence rules find a weak
confirmation. In some cases, another version of the rule (Ex. 2) or even the opposite rule (Ex. 5, Ex. 6) is suggested by our data. In connection with the summer/winter prognoses in Ex. 5, the figures of the table favor a precipitation rule more than a temperature rule. With one (two) exceptions, the Berlin-Dahlem series yields higher hit ratios than the series from Hohenpeißenberg (Karlsruhe). The reason could be that the Dahlem series is shorter and is perhaps (climatically) nearer to the place of origin of the rules. Note that the theoretical P values from Table 7 are consistent with the empirical percentages in Table 8 (both evaluated for Hohenpeißenberg). There are two or three exceptions, which relate to the precipitation data.
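The empirical hit ratios of the rules, i.e., percentages such as %[Y > ȳ | X > x̄], can be computed with a small helper. The sketch below is illustrative only: the function name and the toy data are hypothetical, and the two flags select which of the four variants ([>|>], [<|<], [<|>], [>|<]) is counted.

```python
def hit_ratio(x, y, x_above=True, y_above=True):
    """Share of cases with X above (below) its mean in which Y is above (below) its mean.

    x[i] and y[i] form one pair, e.g. December and following January of the same winter.
    """
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)

    def on_side(value, mean, above):
        return value > mean if above else value < mean

    selected = [b for a, b in zip(x, y) if on_side(a, x_bar, x_above)]
    hits = sum(1 for b in selected if on_side(b, y_bar, y_above))
    return hits / len(selected)


# Toy data (assumption, not station records)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
print(hit_ratio(x, y))                 # variant [>|>]
print(hit_ratio(x, y, y_above=False))  # variant [<|>]: Y below, given X above
```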
5 Model and Prediction: Yearly Data
In the following we discuss statistical models, which can reveal (i) the mechanism of how a climate series evolves, and can support (ii) the prediction of climate values in the near future. Time series models of the ARMA-type will stand in the center of our analysis.
5.1 Differences, Prediction, Summation
Let Y be the time series of N yearly climate records; i.e., we have the data Y(t), t = 1, ..., N. In connection with time series modeling and prediction, the trend of the series is removed preferably by forming differences of consecutive time series values. From the series Y, we thus arrive at the time series X, with
\[ X(t) = Y(t) - Y(t-1), \quad t = 2, \ldots, N, \qquad [X(1) = 0]. \tag{1} \]
Table 9 Differences X of temperature means [°C] in consecutive years

Station   N     Mean    Stand.dev.   r(1)     r(2)    r(3)
Hohenp.   228   0.002   1.002        -0.469   0.019   0.076
Karlsr.   210   0.011   0.921        -0.489   0.057   0.052
Potsd.    116   0.018   0.924        -0.420   0.074   0.197

Table 9 shows that the yearly changes X of temperature have mean 0 and an average deviation (from the mean 0) of 1 (°C), at all three stations. The first-order auto-correlations r(1) of the differences X lie in the range −0.4 ... −0.5. After an increase of temperature there follows, as a tendency, an immediate decrease in the next year, and vice versa. We consider now the differenced time series X(t) as sufficiently "trend-free" and try to fit an ARMA(p,q)-model. Such a model obeys the equation
\[ X(t) = \alpha_p X(t-p) + \cdots + \alpha_2 X(t-2) + \alpha_1 X(t-1) + \beta_q e(t-q) + \cdots + \beta_2 e(t-2) + \beta_1 e(t-1) + e(t), \tag{2} \]
with error (residual) variables e(t). For each time point t, we can calculate a prognosis $\hat{X}(t)$ for the next observation X(t), called ARMA-prediction. This is done on the basis of the preceding observations X(t−1), X(t−2), ... in the following way. Equation (2) is converted into
\[ e(t) = X(t) - \bigl(\alpha_p X(t-p) + \cdots + \alpha_1 X(t-1)\bigr) - \bigl(\beta_q e(t-q) + \cdots + \beta_1 e(t-1)\bigr) \tag{3} \]
for t = 1, ..., n. Here, the first q error variables e and the first p observation variables X must be predefined. Then the further error variables can be recursively gained from Eq. (3). The prognosis $\hat{X}(t)$ for the next observation X(t) uses Eq. (2), setting e(t) to zero, while the other variables e(t−1), e(t−2), ... are recursively gained as described under (3). We have then the ARMA-prediction
\[ \hat{X}(t) = \alpha_p X(t-p) + \cdots + \alpha_2 X(t-2) + \alpha_1 X(t-1) + \beta_q e(t-q) + \cdots + \beta_2 e(t-2) + \beta_1 e(t-1). \tag{4} \]
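The recursion behind Eqs. (3) and (4) can be written down directly. The following Python sketch assumes that the coefficients are already estimated and, as one possible way of predefining the start values, treats all X and e before the start of the series as zero; the function name and the numbers in the usage lines are illustrative only.

```python
def arma_predictions(x, alpha, beta):
    """One-step ARMA(p, q) predictions X_hat(t) per Eq. (4) and residuals e(t) per Eq. (3).

    alpha[i-1] multiplies X(t-i), beta[j-1] multiplies e(t-j).
    Assumption of this sketch: X and e before the start of the series are zero.
    """
    p, q = len(alpha), len(beta)
    x_hat = []
    e = []
    for t in range(len(x)):
        ar_part = sum(alpha[i - 1] * x[t - i] for i in range(1, p + 1) if t - i >= 0)
        ma_part = sum(beta[j - 1] * e[t - j] for j in range(1, q + 1) if t - j >= 0)
        x_hat.append(ar_part + ma_part)  # Eq. (4): e(t) is set to zero
        e.append(x[t] - x_hat[t])        # Eq. (3): residual once X(t) is observed
    return x_hat, e


# Illustrative ARMA(1,1) coefficients and a short series (not estimated from data)
x_hat, e = arma_predictions([1.0, 2.0, 0.0], alpha=[0.5], beta=[0.2])
print(x_hat, e)
```

Each prediction uses only observations and residuals from earlier time points, which is what makes the residuals computable one step at a time.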
The goodness of the prediction and hence the goodness-of-fit of the ARMA-model is assessed by the mean sum of squared errors, more precisely, by
\[ \mathrm{RootMSQ} = \sqrt{ \frac{1}{N} \sum_{t=1}^{N} \bigl( X(t) - \hat{X}(t) \bigr)^2 }. \tag{5} \]
From the differenced series X we get back the original series Y by summation (also called integration). The prediction $\hat{Y}(t)$ for Y(t) is gained by
\[ \hat{Y}(t) = Y(t-1) + \hat{X}(t), \quad t = 2, \ldots, N; \qquad \hat{Y}(1) = Y(1). \]
Note that the calculation of $\hat{Y}(t)$ uses information up to time t − 1 only. Due to $X(t) - \hat{X}(t) = Y(t) - \hat{Y}(t)$, the prediction $\hat{Y}(t)$ for Y(t) is as good as the prediction $\hat{X}(t)$ for X(t), namely by (5)
\[ \mathrm{RootMSQ} = \sqrt{ \frac{1}{N} \sum_{t=1}^{N} \bigl( Y(t) - \hat{Y}(t) \bigr)^2 }. \tag{6} \]
This procedure is called the ARIMA-method; the variables $\hat{Y}(t)$ are referred to as ARIMA-predictions for Y(t).
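The differencing, the summation back to the original scale, and the equality of the two RootMSQ expressions (5) and (6) can be checked numerically. The sketch below uses hypothetical function names and made-up numbers; the predictions for the differenced series are simply given (with the first one set to 0, matching X(1) = 0), not produced by a fitted model.

```python
from math import sqrt


def difference(y):
    """Eq. (1): X(1) = 0 and X(t) = Y(t) - Y(t-1) for t >= 2."""
    return [0.0] + [y[t] - y[t - 1] for t in range(1, len(y))]


def integrate_predictions(y, x_hat):
    """Y_hat(1) = Y(1), Y_hat(t) = Y(t-1) + X_hat(t); uses information up to t-1 only."""
    return [y[0]] + [y[t - 1] + x_hat[t] for t in range(1, len(y))]


def root_msq(actual, predicted):
    """Root of the mean squared prediction error, Eqs. (5) and (6)."""
    n = len(actual)
    return sqrt(sum((a - b) ** 2 for a, b in zip(actual, predicted)) / n)


# Made-up yearly values and hypothetical predictions for the differences
y = [6.0, 6.5, 6.2, 7.0]
x = difference(y)
x_hat = [0.0, 0.1, -0.2, 0.1]
y_hat = integrate_predictions(y, x_hat)
print(root_msq(x, x_hat), root_msq(y, y_hat))  # both expressions agree
```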
Table 10 ARIMA-method for the annual temperature means; coefficients, goodness-of-fit, prediction. H = Hohenpeißenberg, K = Karlsruhe, P = Potsdam

     Order   ARMA-coefficients                     Root    ARIMA-prediction
     p, q    α_i                β_j                MSQ     2006–2008             2009
H    2,2     0.639, 0.105       0.177, 0.667       0.768   7.25, 7.40, 7.49      7.47
K    2,2     0.266, 0.070       0.567, 0.281       0.707   11.25, 11.33, 11.42   11.44
P    3,1     0.15, 0.04, 0.24   0.915              0.726   9.44, 9.69, 9.82      9.70

5.2 Yearly Temperature Means
Y(t) denotes now the temperature mean of year t and X(t), according to Eq. (1), the differenced series, i.e., the series of the yearly changes. It is X to which an ARMA(p,q)-model is fitted. We choose order numbers (p, q) as small as possible, such that an increase of these numbers brings no essential improvement of the goodness measure RootMSQ. For the Hohenpeißenberg and Karlsruhe data we get p = q = 2, and therefore the ARMA(2,2)-model
\[ X(t) = \alpha_2 X(t-2) + \alpha_1 X(t-1) + \beta_2 e(t-2) + \beta_1 e(t-1) + e(t) \tag{7} \]
(for Potsdam we obtain p = 3, q = 1). Table 10 shows the estimated coefficients $\alpha_i$ and $\beta_j$. As a rule, at least one $\alpha$ and one $\beta$ is significantly different from zero. Further, the table offers the forecasts for the 3 years 2006–2008 as well as for the year 2009, each time on the basis of the preceding years. The actual observations 2006–2008 are slightly underestimated; compare the data excerpt in the Appendix and Fig. 6 (lower plot). This plot also shows the smoothing character of the predictions. For a clearer presentation, we confine ourselves to the reproduction of the last 50 years (but for calculating the coefficients $\alpha$, $\beta$ the whole series was used).
5.3 Comparison with Moving Averages
Alternatively, prediction according to the method of left-sided moving averages can be chosen. As prediction $\hat{Y}(t)$ for Y(t) at time point t, the average of the preceding observations Y(t−1), Y(t−2), ..., Y(t−k) is taken. The depth number k denotes the number of years involved in the average. Once again we calculate the goodness of this prediction method by Eq. (6). Table 11 demonstrates that for a depth k smaller than 11 the RootMSQ-values of the ARIMA-method are not improved. Note that the latter method only needed p + q = 4 coefficients (but see also the remark in the conclusions).
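The left-sided moving-average prediction is easy to sketch in Python. The function names are hypothetical, and one detail is an assumption of this sketch: the first k time points, for which no full window of predecessors exists, are left out of the RootMSQ; the text does not say how they were handled.

```python
from math import sqrt


def moving_average_predictions(y, k):
    """Y_hat(t) = average of Y(t-1), ..., Y(t-k), for all t with k predecessors."""
    return [sum(y[t - k:t]) / k for t in range(k, len(y))]


def root_msq_moving_average(y, k):
    """RootMSQ of the depth-k predictions over the time points that have a prediction."""
    predictions = moving_average_predictions(y, k)
    actual = y[k:]
    return sqrt(sum((a - b) ** 2 for a, b in zip(actual, predictions)) / len(actual))


# Toy series with a steady rise (assumption): the prediction lags behind
y = [1.0, 2.0, 3.0, 4.0, 5.0]
print(moving_average_predictions(y, 2), root_msq_moving_average(y, 2))
```

Scanning k over a range of depths and comparing the resulting RootMSQ values with that of the ARIMA-method gives a comparison of the kind shown in Table 11.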
Fig. 6 Hohenpeißenberg, annual temperature means, 1781–2008. Top: Differenced time series, with ARMA-predictions (dashed line) and with residual values (as circles o). Bottom: Time series of annual temperature means [°C], together with the ARIMA-prediction (dashed line). The last 50 years are shown
ARIMA-Residuals
Having calculated the ARIMA-predictions $\hat{Y}(t)$ for Y(t), t = 1, ..., N, we then build residuals
\[ e(t) = Y(t) - \hat{Y}(t), \quad t = 1, \ldots, N, \tag{8} \]
from these predictions; see Fig. 6 (top). Note that we already used residuals in Eq. (6); as stated above we also have $e(t) = X(t) - \hat{X}(t)$. We ask now for the structure of the residual time series e(t), t = 1, ..., N. All values of the auto-correlation function $r_e(h)$, h = 1, ..., 8, are close to zero, cf. Table 12. The bound for the maximum of $|r_e(h)|$, h = 1, ..., 8 (i.e., the simultaneous bound with respect to the hypothesis of a pure random series) is
Table 11 Depth k of the left-sided moving averages and resulting goodness-of-fit RootMSQ. The latter is given for the ARIMA-method, too

                      RootMSQ
Depth k               Hohenp.   Karlsr.   Potsdam
5                     0.819     0.747     0.836
8                     0.802     0.717     0.796
10                    0.790     0.707     0.780
12                    0.784     0.715     0.791
14                    0.768     0.708     0.788
16                    0.763     0.699     0.782
18                    0.765     0.698     0.782
20                    0.774     0.713     0.796
ARIMA (s. Tab. 10)    0.768     0.707     0.726

Table 12 Auto-correlation function r_e(h) up to time lag h = 8 of the ARIMA-residuals; annual temperature means. H = Hohenpeißenberg, K = Karlsruhe, P = Potsdam

     r_e(1)   r_e(2)   r_e(3)   r_e(4)   r_e(5)   r_e(6)   r_e(7)   r_e(8)   b1      b8
H    0.014    0.003    0.012    0.03     0.06     0.04     0.00     0.06     0.130   0.181
K    0.005    0.016    0.022    0.07     0.02     0.01     0.04     0.03     0.135   0.189
P    0.007    0.009    0.004    0.05     0.01     0.10     0.07     0.01     0.182   0.254
\[ b_8 = u_{1 - 0.025/8} / \sqrt{N} \quad \text{(significance level 0.05)}, \]
and is not exceeded; not even the bound $b_1 = u_{0.975} / \sqrt{N}$ for an individual $|r_e(h)|$ is exceeded. We can assume that the series e(t) consists of uncorrelated variables. Next we ask whether the (true) variances of the ARIMA-residuals e(t) are constant over time, or whether periods of (truly) stronger and periods of (truly) weaker oscillation alternate. To this end, we calculate, moving in 5-year time blocks [t − 4, t], the empirical variances $\hat{\sigma}^2(t)$. The roots $\hat{\sigma}(t)$, plotted in Fig. 7, form an oscillating line around the value 0.77 (Hohenpeißenberg), but a definite answer to the above question cannot be given.
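The two bounds can be evaluated with the normal quantile function of the Python standard library. The function name below is hypothetical; with the series lengths N = 228, 210, 116 the results should be close to the b1 and b8 columns of Table 12.

```python
from math import sqrt
from statistics import NormalDist


def acf_bounds(n, max_lag=8, level=0.05):
    """Bounds for residual auto-correlations under the pure-random hypothesis.

    b_single       = u_{1 - level/2} / sqrt(n)              (one individual lag)
    b_simultaneous = u_{1 - level/(2*max_lag)} / sqrt(n)    (all lags 1..max_lag)
    """
    quantile = NormalDist().inv_cdf
    b_single = quantile(1.0 - level / 2.0) / sqrt(n)
    b_simultaneous = quantile(1.0 - level / (2.0 * max_lag)) / sqrt(n)
    return b_single, b_simultaneous


for station, n in (("H", 228), ("K", 210), ("P", 116)):
    b1, b8 = acf_bounds(n)
    print(station, round(b1, 3), round(b8, 3))
```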
GARCH-Modeling
Denoting by $\sigma^2(t) = \mathrm{Var}(e(t))$ the true variance of the error variable e(t), we are going to investigate the structure of the time series $\sigma^2(t)$, t = 1, 2, ..., N. By means of GARCH-models we can analyze time series with (possibly) varying variances. For this reason, an ARMA(p, q)-type equation for $\sigma^2(t)$ is established, namely
\[ \sigma^2(t) = \alpha_p Z^2(t-p) + \cdots + \alpha_2 Z^2(t-2) + \alpha_1 Z^2(t-1) + \alpha_0 + \beta_q \sigma^2(t-q) + \cdots + \beta_1 \sigma^2(t-1), \quad t = 1, 2, \ldots, \tag{9} \]
($\alpha$'s, $\beta$'s nonnegative). A zero-mean process of uncorrelated variables Z(t) is called a GARCH(p, q)-process (p, q ≥ 0), if the (conditional) variance of Z(t), given the
Fig. 7 Hohenpeißenberg, annual temperature means, 1781–2008. Time series of ARIMA-residuals (zigzag line), standard deviation $\hat{\sigma}$ of left-sided moving (5-year) blocks (dashed line), GARCH-prediction for $\sigma$ (solid line)
information up to time t − 1, equals $\sigma^2(t)$, where $\sigma^2(t)$ fulfills Eq. (9) (Kreiß and Neuhaus 2006; Cryer and Chan 2008). Order numbers (p, q) are to be determined (here p = 3, q = 1) and p + q + 1 coefficients $\alpha$, $\beta$ must be estimated. Then we build predictions $\hat{\sigma}^2(t)$ for the series $\sigma^2(t)$ in this way: Let the time point t be fixed. Having observed the preceding Z(t − 1), Z(t − 2), ... (and having already computed $\hat{\sigma}^2(t-1), \hat{\sigma}^2(t-2), \ldots$), we compute $\hat{\sigma}^2(t)$ according to Eq. (9), but with $\sigma^2(t-s)$ replaced by $\hat{\sigma}^2(t-s)$. Here the first q $\hat{\sigma}^2$-values must be predefined, for instance by the empirical variance of the time series Z. We speak of the GARCH-prediction for the variance $\sigma^2(t)$. Now we apply this method to our data and put Z(t) = e(t), the ARIMA-residuals from Eq. (8). For the Hohenpeißenberg series these GARCH-predictions reproduce in essence the horizontal line 0.77, see Fig. 7. This means that we can consider e(t) as a series of uncorrelated variables with constant variance $\sigma^2(t) = \sigma^2$, i.e., as a white noise process. From this we can state that the differenced sequence X(t) can be fitted sufficiently well by an ARMA-model, since the latter demands a white noise error process.
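The GARCH-prediction recursion can be sketched as follows. This is an illustration under assumptions, not the author's implementation: the coefficients are taken as given (their estimation is not shown), the function name is hypothetical, and the first max(p, q) variance values are predefined by the empirical variance of Z so that all lagged terms of Eq. (9) are available.

```python
def garch_variance_predictions(z, alpha0, alpha, beta):
    """Predictions sigma2_hat(t) per Eq. (9), with sigma^2(t-s) replaced by sigma2_hat(t-s).

    alpha[i-1] multiplies Z(t-i)^2, beta[j-1] multiplies sigma2_hat(t-j).
    Assumption: the first max(p, q) values are set to the empirical variance of Z.
    """
    p, q = len(alpha), len(beta)
    n = len(z)
    start_var = sum(v * v for v in z) / n  # empirical variance of the zero-mean series
    sigma2 = []
    for t in range(n):
        if t < max(p, q):
            sigma2.append(start_var)  # predefined start values
            continue
        arch_part = sum(alpha[i - 1] * z[t - i] ** 2 for i in range(1, p + 1))
        garch_part = sum(beta[j - 1] * sigma2[t - j] for j in range(1, q + 1))
        sigma2.append(alpha0 + arch_part + garch_part)
    return sigma2


# Illustrative GARCH(1,1) coefficients and a short residual series (made up)
sigma2_hat = garch_variance_predictions([1.0, -1.0, 2.0, 0.0], 0.1, [0.2], [0.5])
print(sigma2_hat)
```

If the predicted variances stay close to a constant level, as reported for the Hohenpeißenberg residuals, the series behaves like white noise with constant variance.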
5.4 Yearly Precipitation Amounts
Y(t) denotes now the precipitation amount in the year t. From Y we pass to the series X by building differences, where X(t) = Y(t) − Y(t − 1), t = 2, ..., N, X(1) = 0.
Table 13 Differences X of precipitation amounts [dm] in consecutive years

Station   N     Mean    Stand.Dev.   r(1)     r(2)    r(3)
Hohenp.   130   0.003   2.085        -0.460   0.003   0.026
Karlsr.   133   0.014   1.900        -0.429   0.114   0.025
Potsd.    116   0.007   1.417        -0.461   0.064   0.040

Table 14 ARIMA-method for the annual precipitation amounts in [dm]. H Hohenpeißenberg, K Karlsruhe, P Potsdam

     Order   Coefficients                       Root    ARIMA-prediction
     p, q    α_i                       β_1      MSQ     2006–2008             2009
H    3,1     0.106, 0.085, 0.010       0.935    1.673   11.81, 11.84, 12.12   11.91
K    3,1     0.046, -0.163, -0.062     0.943    1.378   7.82, 8.00, 7.59      7.61
P    3,1     0.128, 0.102, -0.059      0.989    0.972   6.10, 6.10, 5.86      5.94
Table 13 shows that the yearly changes X equal 0 in the mean and have an average deviation (from the mean 0) of 1.5 ... 2.0 [dm]. The auto-correlations r(1) lie in the range −0.4 ... −0.5. An increase of precipitation is immediately followed by a decrease, as a tendency, and vice versa. We fit an ARMA(p, q)-model to the differenced series X. As order numbers we get p = 3, q = 1, and therefore the ARMA(3,1)-model
\[ X(t) = \alpha_3 X(t-3) + \alpha_2 X(t-2) + \alpha_1 X(t-1) + \beta_1 e(t-1) + e(t). \tag{10} \]
Table 14 presents the estimated coefficients $\alpha_i$ and $\beta_1$; the coefficient $\beta_1$ is significantly different from zero for all three stations. Further, prognoses for the three years 2006–2008 as well as for the year 2009 were made, each time on the basis of the preceding years. The predictions lie partly above, partly below the actually observed values, demonstrating their smoothing character, see Fig. 8 (lower plot). The residuals e(t) from the predictions are shown in the upper plot of Fig. 8. The auto-correlations $r_e(h)$, h = 1, ..., 8, of the residuals were calculated (not reproduced in a table). The bound $b_1$ for an individual $|r_e(h)|$ is not exceeded and thus, all the more, neither is the simultaneous bound $b_8$ (significance level 0.05). The residual series e(t) can be regarded as a pure random series, confirming the applied ARIMA-model. We abstain here from a GARCH application to the residual series.
6 Model and Prediction: Monthly Data
For the investigation of monthly climate data, we confine ourselves to the monthly temperature means. We first estimate a trend by the ARIMA-method of Sect. 5 (as well as by alternative methods) and model the detrended series as an ARMA-process.
Fig. 8 Hohenpeißenberg, annual precipitation amounts 1879–2008. Top: Differenced time series, together with the ARMA-prediction (dashed line) and with the residual values (as circles o). Bottom: Time series of annual precipitation amounts in (dm), together with the ARIMA-prediction (dashed line). The last 50 years are shown
6.1 Trend+ARMA Method
In order to model the monthly temperature means Y(t), we start with
\[ Y(t) = m(t) + X(t), \quad t = 1, 2, \ldots, \tag{11} \]
where t counts the successive months, m(t) denotes the long-term (yearly) trend, and where X(t) is the remainder series. We estimate the trend by the ARIMA-method of Sect. 5: The variable m(t) is the ARIMA-prediction of the yearly temperature mean (with p = q = 2 for Hohenpeißenberg and Karlsruhe, and with p = 3, q = 1 for Potsdam), called trend(ARIMA), and is the same for all 12 months t of the same year. The detrended series
\[ X(t) = Y(t) - m(t), \quad t = 1, 2, \ldots, \]
is shown in the upper plot of Fig. 9. We fit an ARMA(p, q)-model to the series X(t), with p = 3, q = 2 (which turned out to be sufficient). In Table 15 one can find the estimated coefficients $\alpha_i$ and $\beta_j$, (nearly) all of them being significantly different from zero. The ARMA-prediction $\hat{X}(t)$ for X(t) is plotted in the upper part of Fig. 9, too. By means of $\hat{X}(t)$ we gain back the original (trend-affected) series, more precisely: the trend(ARIMA)+ARMA-prediction $\hat{Y}(t)$ for Y(t). We put
\[ \hat{Y}(t) = m(t) + \hat{X}(t), \quad t = 1, 2, \ldots, \tag{12} \]
compare the lower plot of Fig. 9, where the $\hat{Y}(t)$ are portrayed, together with the actual observations Y(t). The goodness-of-fit RootMSQ according to Eq. (6) and the predictions for Oct. 08 to Jan. 09 are presented in Table 15, too. With only 4 + 5 parameters these trend(ARIMA)+ARMA-predictions $\hat{Y}(t)$ run close to the actual observed values Y(t). They cannot, however, follow extremely warm summers or cold winters. To give examples, we point to the "record summer" 2003 (in Fig. 9 around month no. 55) or to the relatively cold January 2009. For the latter, compare the predictions in the last column of Table 15 with the actual observed values −2.7, −1.3, −2.1 [°C] in Hohenpeißenberg, Karlsruhe, and Potsdam, respectively.
6.2 Comparisons with Moving Averages and with Lag-12 Differences
On the basis of approach (11) we can alternatively choose the method of left-sided moving averages, applied for estimating the (yearly) trend $m(t)$ as well as for predicting the detrended series $X(t)$. As trend estimate $m(t)$, we take the average of the preceding observations $Y(t-1)$, $Y(t-2)$, ..., $Y(t - k \cdot 12)$. The depth $k$ indicates the number of the employed years. As prediction $\hat X(t)$ for the variable $X(t)$ we take the average of the preceding detrended observations
\[ Y(t-12) - m(t-12),\; Y(t-24) - m(t-24),\; \ldots,\; Y(t - k \cdot 12) - m(t - k \cdot 12). \]
The integer $k$ indicates here the number of the employed months. Again according to Eqs. (12) and (6) we compute the goodness of this prediction method. Table 16 shows that for no depth smaller than $k = 21$ (Karlsruhe $k = 12$) the RootMSQ values of the trend(ARIMA)+ARMA-method are attained. Recall that in the latter method only 4 + 5 = 9 parameters are involved. Another alternative procedure resembles the ARIMA-method of Sect. 5. Instead of using differences $Y(t) - Y(t-1)$ of two consecutive variables (lag-1 differences), however, we form lag-12 differences, that is, differences
\[ X(t) = Y(t) - Y(t-12), \quad t = 13, 14, \ldots, \]
Fig. 9 Hohenpeißenberg, monthly temperature means 1781–2008. Top: Detrended time series, together with the ARMA-prediction (dashed line) and with the residual values (as circles o). Bottom: Monthly temperature means in [°C], together with the trend (inner solid line) and the trend+ARMA-prediction (dashed line). The last 10 years are shown
Table 15 Trend(ARIMA) + ARMA-method for the monthly temperature means. H = Hohenpeißenberg, K = Karlsruhe, P = Potsdam

| Station | Order p, q | Coefficients α_i | β_j | RootMSQ | Prediction Oct-Dec 2008 | Jan 2009 |
|---|---|---|---|---|---|---|
| H | 3, 2 | 1.758, -1.045, 0.026 | 1.725, 0.992 | 2.149 | 8.25, 4.01, 0.63 | 0.94 |
| K | 3, 2 | 1.691, -0.928, -0.041 | 1.691, 0.953 | 1.993 | 11.31, 6.67, 3.38 | 2.30 |
| P | 3, 2 | 1.810, -1.136, 0.078 | 1.682, 0.951 | 1.953 | 9.65, 5.08, 1.80 | 0.42 |
Table 16 Depth k of the left-sided moving average for monthly temperature means and resulting goodness-of-fit RootMSQ. The latter is listed for the trend(ARIMA)+ARMA-method and for the ARIMA(lag12)-method, too

| Depth k | RootMSQ Hohenp. | RootMSQ Karlsr. | RootMSQ Potsdam |
|---|---|---|---|
| 5 | 2.366 | 2.137 | 2.229 |
| 10 | 2.248 | 2.012 | 2.079 |
| 12 | 2.227 | 1.992 | 2.077 |
| 15 | 2.193 | 1.971 | 2.065 |
| 20 | 2.171 | 1.939 | 2.058 |
| Trend(ARIMA)+ARMA | 2.149 | 1.993 | 1.953 |
| ARIMA(lag12) | 2.544 | 2.301 | 2.318 |
Table 17 Auto-correlation function r_e(h) up to time lag h = 8 of the trend(ARIMA)+ARMA-residuals; monthly temperature means. H = Hohenpeißenberg, K = Karlsruhe, P = Potsdam

| Station | r_e(1) | r_e(2) | r_e(3) | r_e(4) | r_e(5) | r_e(6) | r_e(7) | r_e(8) | b_1 | b_8 |
|---|---|---|---|---|---|---|---|---|---|---|
| H | 0.102 | 0.029 | 0.013 | 0.02 | 0.01 | 0.04 | 0.03 | 0.03 | 0.037 | 0.052 |
| K | 0.204 | 0.051 | 0.003 | 0.02 | 0.05 | 0.07 | 0.05 | 0.02 | 0.039 | 0.054 |
| P | 0.152 | 0.063 | 0.012 | 0.01 | 0.00 | 0.00 | 0.04 | 0.04 | 0.052 | 0.073 |
of two observations being separated by 12 months. We fit an AR(12)-model to this differenced process $X(t)$ and determine the goodness-of-fit by Eq. (5) or, equivalently, Eq. (6). We will use the short-hand notation ARIMA(lag12). Table 16 shows that this procedure is inferior to the method trend(ARIMA)+ARMA and to the method of moving averages as well.
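The left-sided moving-average method of this section can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, the input is a synthetic, purely seasonal series rather than station data, and RootMSQ is assumed here to be the root of the mean squared prediction error, since Eq. (6) is not restated in this section.

```python
import numpy as np

def moving_average_prediction(y, k):
    """Left-sided moving-average trend m(t) and prediction Yhat(t) for depth k (years)."""
    y = np.asarray(y, dtype=float)
    n, window = len(y), 12 * k
    m = np.full(n, np.nan)
    for t in range(window, n):
        m[t] = y[t - window:t].mean()        # average of Y(t-1), ..., Y(t-12k)
    x = y - m                                # detrended series X(t) = Y(t) - m(t)
    y_hat = np.full(n, np.nan)
    for t in range(2 * window, n):
        lags = t - 12 * np.arange(1, k + 1)  # same calendar month in the k preceding years
        y_hat[t] = m[t] + x[lags].mean()     # Yhat(t) = m(t) + Xhat(t), cf. Eq. (12)
    return m, y_hat

def root_msq(y, y_hat):
    """Root of the mean squared prediction error over all predicted months (assumed form)."""
    y = np.asarray(y, dtype=float)
    valid = ~np.isnan(y_hat)
    return float(np.sqrt(np.mean((y[valid] - y_hat[valid]) ** 2)))

# Synthetic example: eight years of a purely seasonal series around a constant level.
season = 8.0 * np.sin(2 * np.pi * np.arange(12) / 12)
y = np.tile(season, 8) + 6.0
m, y_hat = moving_average_prediction(y, k=2)
print(root_msq(y, y_hat))
```

For a strictly periodic series the prediction reproduces the data up to rounding, so the printed RootMSQ is numerically zero; real climate series add the irregular component that the values in Table 16 measure.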
6.3 Residual Analysis
Denoting by $M = N \cdot 12$ the total number of months and by $\hat Y(t)$ the trend(ARIMA)+ARMA-prediction for $Y(t)$, $t = 1, \ldots, M$, we obtain by
\[ e(t) = Y(t) - \hat Y(t), \quad t = 1, \ldots, M, \]
the residuals from the prediction; compare the upper plot in Fig. 9. Which structure does this residual time series $e(t)$, $t = 1, \ldots, M$, have? Its auto-correlation function $r_e(h)$, $h = 2, \ldots, 8$, consists of values more or less near zero, cf. Table 17. It is particularly the auto-correlation $r_e(1)$ of first order (i.e., the correlation between $e(t)$ and $e(t+1)$ of two immediately succeeding months) which turns out to be relatively large. The simultaneous bound $b_8 = u_{1-0.025/8}/\sqrt{M}$ is exceeded by all three $r_e(1)$ values (at the significance level 0.05). Our chosen prediction method trend(ARIMA)+ARMA leaves behind residuals which are too strongly correlated (at least of order one) and thus do not fulfill the demand on residual variables $e(t)$. A similar statement is to be made with respect to the method of moving averages.

Table 18 Correlation analysis for annual climate data. Temperature (Temp.) and precipitation (Prec.) with lagged variables

| Station | r(Y, Y1), Y = Temp. | r(Y, Y1), Y = Prec. | r(Y, (Y1...Y6)), Y = Temp. | r(Y, (Y1...Y6)), Y = Prec. | r(Y, (Y1...Y6, Z1...Z6)), Y = Prec. |
|---|---|---|---|---|---|
| Hohenp. | 0.289 | 0.273 | 0.370 | 0.321 | 0.338 |
| Karlsr. | 0.332 | 0.009 | 0.455 | 0.136 | 0.230 |
| Potsd. | 0.358 | 0.079 | 0.402 | 0.277 | 0.289 |
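The residual check can be reproduced with a short Python sketch. The function names are hypothetical, and two points are assumptions rather than statements of the chapter: the sample auto-correlation is taken in its usual form (lagged products of deviations divided by the sum of squared deviations), and the individual bound is taken as $b_1 = u_{1-0.025}/\sqrt{M}$ in analogy to the formula for $b_8$ given above.

```python
import math
from statistics import NormalDist

def autocorrelations(e, max_lag=8):
    """Sample auto-correlations r_e(1), ..., r_e(max_lag) of a residual series e."""
    n = len(e)
    mean = sum(e) / n
    dev = [v - mean for v in e]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + h] for t in range(n - h)) / denom
            for h in range(1, max_lag + 1)]

def correlation_bounds(M, max_lag=8, level=0.05):
    """Individual bound b_1 and simultaneous bound b_max_lag for M residuals."""
    z = NormalDist()
    b_single = z.inv_cdf(1 - level / 2) / math.sqrt(M)
    b_simultaneous = z.inv_cdf(1 - level / (2 * max_lag)) / math.sqrt(M)
    return b_single, b_simultaneous

# Hohenpeissenberg, 1781-2008: N = 228 years, hence M = 228 * 12 = 2736 months.
b1, b8 = correlation_bounds(228 * 12)
print(round(b1, 3), round(b8, 3))

# An alternating series is strongly negatively correlated at lag 1.
e = [(-1.0) ** t for t in range(100)]
print(autocorrelations(e, max_lag=2))
```

With $M = 2736$ the two bounds round to the values listed for Hohenpeißenberg in Table 17, which supports the assumed form of $b_1$.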
7 Conclusions
First, we state that the separation of the trend/season component on one side and the auto-correlation structure on the other side is crucial in our analysis. To handle the latter, ARMA-type modeling in the detrended series or the differenced series (with subsequent integration: ARIMA) was performed and was worked out in Sects. 5 and 6. The correlation and prediction analysis reveals that precipitation is more irregular and closer to a random phenomenon than temperature is; see also von Storch and Navarra (1993). This statement is also confirmed by Table 18, where the coefficients of correlation $r(Y, Y1)$ and of multiple correlation $r(Y, (Y1 \ldots Y6))$ are presented, with $Y$ = temperature or $Y$ = precipitation. By $Y1$ to $Y6$, we denote lagged variables from lag = 1 year to lag = 6 years. In Karlsruhe and in Potsdam, the correlations between temperature variables are distinctly larger than those between precipitation variables. If precipitation ($Y$ = Prec.) is correlated with the set $(Y1 \ldots Y6, Z1 \ldots Z6)$, comprising the lagged precipitation variables $Y1 \ldots Y6$ and the lagged temperature variables $Z1 \ldots Z6$, the coefficient nevertheless remains far below that of temperature ($Y$ = Temp.). The exception is Hohenpeißenberg, as already mentioned in Sects. 2 and 4, where the level of correlation for precipitation is closer to that for temperature than it is in the other two stations. For predicting a climate variable $Y_t$ at time $t$, only observations up to time $t - 1$ are allowed: $Y_{t-1}$, $Y_{t-2}$, .... Therefore polynomials, drawn over the whole time interval $t = 1, \ldots, N$, are not qualified as a (yearly) trend component in Sects. 5 and 6. Our numerical procedures, unfortunately, violate this rule in one aspect: The coefficients of the ARMA-models, the $\alpha$'s and $\beta$'s, were estimated from the whole series. In future work, the time-consuming amendment should be introduced here and the coefficients should be calculated anew for each time point $t$ (see Pruscha 2013).
For predicting monthly climate variables one has to tune the estimation of the trend and of the seasonal component. Here further procedures should be tested, since the residual series in Sect. 6 are not sufficiently close to a pure random series. Winter data alone are (only) a weak indicator for the general climate development. This can also be documented by spectral analysis methods (Pruscha 1986).
Appendix: Excerpt from Hohenpeißenberg Data
The complete data sets can be found under www.math.lmu.de/~pruscha/ keyword: Climate series. Monthly [yearly] temperature means are given in 1/10 °C [1/100 °C]. A time series plot of the yearly and the winter means can be found in Fig. 1 and further analysis of these data in Pruscha (2005) (see Table 19).
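Table 19 Monthly and yearly temperature data

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Tyear |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1781 | 18 | 10 | 24 | 87 | 122 | 145 | 154 | 166 | 126 | 44 | 15 | 12 | 723 |
| 1782 | 10 | 54 | 0 | 38 | 94 | 156 | 176 | 144 | 108 | 36 | 28 | 23 | 531 |
| 1783 | 7 | 3 | 4 | 64 | 108 | 131 | 163 | 144 | 118 | 82 | 12 | 24 | 670 |
| 1784 | 53 | 46 | 0 | 21 | 128 | 132 | 152 | 136 | 143 | 23 | 12 | 47 | 501 |
| 1785 | 6 | 65 | 60 | 13 | 91 | 117 | 131 | 131 | 141 | 60 | 23 | 19 | 474 |
| 1786 | 1 | 30 | 5 | 71 | 91 | 139 | 118 | 123 | 94 | 35 | 5 | 10 | 517 |
| 1787 | 35 | 8 | 33 | 38 | 72 | 141 | 143 | 161 | 120 | 93 | 18 | 39 | 693 |
| 1788 | 19 | 21 | 23 | 56 | 116 | 148 | 176 | 140 | 135 | 56 | 6 | 105 | 618 |
| 1789 | 10 | 5 | 34 | 74 | 131 | 110 | 147 | 144 | 110 | 65 | 4 | 12 | 623 |
| 1790 | 6 | 9 | 16 | 38 | 120 | 144 | 135 | 157 | 109 | 85 | 28 | 11 | 687 |
| 1791 | 9 | 18 | 20 | 89 | 97 | 130 | 148 | 164 | 110 | 72 | 14 | 10 | 704 |
| ... | | | | | | | | | | | | | ... |
| 1996 | 16 | 32 | 12 | 61 | 98 | 141 | 140 | 139 | 80 | 73 | 27 | 21 | 565 |
| 1997 | 10 | 27 | 45 | 37 | 110 | 128 | 139 | 170 | 133 | 62 | 37 | 11 | 741 |
| 1998 | 5 | 34 | 18 | 64 | 115 | 149 | 152 | 161 | 109 | 73 | 10 | 2 | 727 |
| 1999 | 20 | 32 | 36 | 60 | 122 | 125 | 161 | 155 | 146 | 78 | 5 | 1 | 729 |
| 2000 | 20 | 19 | 25 | 83 | 126 | 157 | 129 | 172 | 126 | 86 | 46 | 33 | 818 |
| 2001 | 0 | 8 | 43 | 43 | 131 | 123 | 163 | 170 | 87 | 128 | 0 | 37 | 716 |
| 2002 | 7 | 32 | 48 | 56 | 113 | 167 | 155 | 154 | 99 | 76 | 54 | 12 | 811 |
| 2003 | 23 | 38 | 48 | 63 | 127 | 193 | 172 | 207 | 125 | 43 | 58 | 13 | 823 |
| 2004 | 21 | 2 | 19 | 71 | 90 | 134 | 153 | 164 | 126 | 102 | 16 | 4 | 717 |
| 2005 | 9 | 39 | 23 | 70 | 113 | 154 | 157 | 134 | 132 | 106 | 23 | 28 | 697 |
| 2006 | 23 | 25 | 3 | 61 | 109 | 150 | 198 | 121 | 155 | 117 | 64 | 27 | 793 |
| 2007 | 22 | 29 | 36 | 112 | 121 | 151 | 155 | 149 | 103 | 68 | 10 | 1 | 796 |
| 2008 | 25 | 27 | 16 | 53 | 127 | 150 | 155 | 157 | 102 | 85 | 36 | 4 | 774 |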
Table 20 Monthly and yearly precipitation data

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Pyear |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1879 | 254 | 619 | 272 | 1,071 | 1,039 | 1,009 | 1,473 | 1,457 | 1,645 | 685 | 861 | 393 | 10,778 |
| 1880 | 385 | 253 | 315 | 967 | 1,203 | 1,991 | 1,870 | 1,212 | 997 | 1,784 | 473 | 907 | 12,357 |
| 1881 | 188 | 232 | 448 | 809 | 1,332 | 1,404 | 885 | 1,490 | 1,173 | 828 | 263 | 202 | 9,254 |
| 1882 | 186 | 88 | 400 | 696 | 952 | 1,565 | 1,802 | 1,314 | 1,275 | 788 | 896 | 501 | 10,463 |
| 1883 | 310 | 127 | 421 | 504 | 1,179 | 2,096 | 2,020 | 835 | 1,152 | 526 | 614 | 684 | 10,468 |
| 1884 | 649 | 202 | 412 | 1,164 | 440 | 1,846 | 1,957 | 1,130 | 534 | 1,360 | 268 | 432 | 10,394 |
| 1885 | 100 | 277 | 643 | 285 | 1,185 | 1,437 | 1,644 | 925 | 1,366 | 761 | 462 | 968 | 10,053 |
| 1886 | 244 | 171 | 436 | 828 | 625 | 2,214 | 972 | 2,288 | 382 | 430 | 436 | 682 | 9,708 |
| 1887 | 145 | 125 | 888 | 291 | 1,712 | 435 | 1,550 | 718 | 699 | 647 | 637 | 952 | 8,799 |
| 1888 | 431 | 667 | 494 | 1,438 | 690 | 1,575 | 1,288 | 1,733 | 1,887 | 607 | 211 | 55 | 11,076 |
| 1889 | 174 | 1,084 | 502 | 701 | 1,151 | 1,738 | 1,408 | 1,019 | 1,610 | 678 | 696 | 260 | 11,021 |
| ... | | | | | | | | | | | | | ... |
| 1996 | 138 | 314 | 588 | 597 | 1,353 | 1,018 | 1,532 | 1,849 | 1,018 | 1,082 | 920 | 417 | 10,826 |
| 1997 | 18 | 585 | 673 | 934 | 425 | 1,724 | 2,404 | 457 | 319 | 947 | 189 | 956 | 9,631 |
| 1998 | 398 | 280 | 1,012 | 474 | 569 | 1,389 | 1,174 | 666 | 1,632 | 1,496 | 923 | 404 | 10,417 |
| 1999 | 598 | 1,187 | 474 | 927 | 3,507 | 1,625 | 1,560 | 1,194 | 1,357 | 423 | 1,341 | 1,097 | 15,290 |
| 2000 | 300 | 802 | 1,514 | 549 | 1,537 | 1,422 | 1,743 | 2,006 | 1,413 | 997 | 551 | 275 | 13,109 |
| 2001 | 681 | 664 | 1,162 | 1,107 | 628 | 2,183 | 967 | 1,626 | 1,621 | 345 | 1,018 | 773 | 12,775 |
| 2002 | 103 | 685 | 992 | 723 | 952 | 1,541 | 1,509 | 1,908 | 2,203 | 821 | 1,342 | 606 | 13,385 |
| 2003 | 670 | 513 | 327 | 265 | 817 | 800 | 1,504 | 815 | 450 | 1,352 | 485 | 371 | 8,369 |
| 2004 | 1,127 | 431 | 677 | 547 | 1,009 | 1,573 | 1,712 | 857 | 944 | 769 | 509 | 455 | 10,610 |
| 2005 | 551 | 740 | 431 | 1,191 | 1,310 | 693 | 1,840 | 2,522 | 593 | 454 | 432 | 547 | 11,304 |
| 2006 | 400 | 395 | 1,025 | 1,601 | 1,123 | 1,524 | 293 | 2,453 | 690 | 679 | 504 | 433 | 11,120 |
| 2007 | 608 | 520 | 488 | 173 | 2,377 | 1,079 | 2,117 | 2,065 | 1,627 | 421 | 726 | 769 | 12,970 |
| 2008 | 414 | 141 | 694 | 1,546 | 942 | 879 | 1,700 | 1,709 | 701 | 735 | 510 | 448 | 10,419 |
Monthly and yearly precipitation amounts are given in 1/10 mm height. A time series plot of yearly and winter amounts can be found in Fig. 2 (see Table 20).
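As a small plausibility check of the precipitation excerpt, the yearly amount should equal the sum of the twelve monthly amounts. The Python sketch below uses the rows for 1879 and 1996 from Table 20; the variable and function names are illustrative only.

```python
# Monthly precipitation (Jan..Dec) and yearly amount Pyear, all in 1/10 mm,
# copied from the rows for 1879 and 1996 of Table 20.
rows = {
    1879: ([254, 619, 272, 1071, 1039, 1009, 1473, 1457, 1645, 685, 861, 393], 10778),
    1996: ([138, 314, 588, 597, 1353, 1018, 1532, 1849, 1018, 1082, 920, 417], 10826),
}

def yearly_amount_mm(monthly_tenths_of_mm):
    """Yearly precipitation in mm from monthly values given in 1/10 mm."""
    return sum(monthly_tenths_of_mm) / 10.0

for year, (monthly, p_year) in rows.items():
    consistent = sum(monthly) == p_year
    print(year, yearly_amount_mm(monthly), "mm, consistent with Pyear:", consistent)
```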
References
Attmannspacher W (1981) 200 Jahre meteorologische Beobachtungen auf dem Hohenpeißenberg 1781–1980. Bericht Nr. 155 des DWD, Offenbach/m
Brockwell PJ, Davis RA (2006) Time series: theory and methods, 2nd edn. Springer, New York
Cryer JD, Chan KS (2008) Time series analysis, 2nd edn. Springer, New York
Grebe H (1957) Temperaturverhältnisse des Observatoriums Hohenpeißenberg. Bericht Nr. 36 des DWD, Offenbach/m
Kreiß JP, Neuhaus G (2006) Einführung in die Zeitreihenanalyse. Springer, Berlin
Malberg H (2003) Bauernregeln, 4th edn. Springer, Berlin
Pruscha H (1986) A note on time series analysis of yearly temperature data. J R Stat Soc A 149:174–185
Pruscha H (2005) Statistisches Methodenbuch. Springer, Berlin
Pruscha H (2013) Statistical analysis of climate series. Springer, Berlin
Schönwiese CD (1979) Klimaschwankungen. Springer, Berlin
Schönwiese CD et al (1993) Klimatrend-Atlas. Bericht Nr. 20 des Zentrums f. Umweltforschung, Frankfurt/m
von Storch H, Navarra A (eds) (1993) Analysis of climate variability. Springer, Berlin
von Storch H, Zwiers FW (1999) Statistical analysis in climate research. Cambridge University Press, Cambridge
Oblique Stochastic Boundary-Value Problem
Martin Grothaus and Thomas Raskop

Contents
1 Introduction
2 Scientifically Relevant Domains and Function Spaces
3 Poincaré Inequality as Key Issue for the Inner Problem
3.1 The Weak Formulation
3.2 Existence and Uniqueness Results for the Weak Solution
3.3 A Regularization Result
3.4 Ritz-Galerkin Approximation
3.5 Stochastic Extensions
4 Fundamental Results for the Outer Problem
4.1 Transformations to an Inner Setting
4.2 Solution Operator for the Outer Problem
4.3 Ritz-Galerkin Method
4.4 Stochastic Extensions and Examples
5 Future Directions
6 Conclusion
References
Abstract
The aim of this chapter is to report the current state of the analysis for weak solutions to oblique boundary problems for the Poisson equation. In this chapter, deterministic as well as stochastic inhomogeneities are treated and existence and uniqueness results for corresponding weak solutions are presented. We consider the problem for inner bounded and outer unbounded domains in $\mathbb{R}^n$. The main tools for the deterministic inner problem are a Poincaré inequality and some analysis for Sobolev spaces on submanifolds, in order to use the Lax-Milgram lemma. The Kelvin transformation enables us to translate the outer problem to a corresponding inner problem. Thus, we can define a solution operator by using the solution operator of the inner problem. The extension to stochastic inhomogeneities is done with the help of tensor product spaces of a probability space with the Sobolev spaces from the deterministic problems. We can prove a regularization result, which shows that the weak solution fulfills the classical formulation for smooth data. A Ritz-Galerkin approximation method for numerical computations is available. Finally, we show that the results are applicable to geomathematical problems.

M. Grothaus · T. Raskop, Functional Analysis Group, University of Kaiserslautern, Kaiserslautern, Germany, e-mail: [email protected]
© Springer-Verlag Berlin Heidelberg 2015. W. Freeden et al. (eds.), Handbook of Geomathematics, DOI 10.1007/978-3-642-54551-1_35
1 Introduction
The main subject of this chapter is existence results for solutions to oblique boundary problems for the Poisson equation. We start with the deterministic problems. The Poisson equation in the domain $\Sigma$ is given by
\[ \Delta u = f \]
and the oblique boundary condition is given by
\[ \langle a, \nabla u \rangle + bu = g. \]
This condition is called regular if the inequality
\[ |\langle a, \nu \rangle| > C > 0 \]
holds on $\partial\Sigma$ for a constant $0 < C < \infty$. The problem is called an outer problem if the Poisson equation has to hold on an outer domain $\Sigma \subset \mathbb{R}^n$. This is a domain $\Sigma$ having the representation $\Sigma = \mathbb{R}^n \setminus \overline{D}$, where $D \ni 0$ is a bounded domain. Consequently, $\partial\Sigma$ divides the Euclidean space $\mathbb{R}^n$ into a bounded domain $D$, called inner domain, and an unbounded domain $\Sigma$, called outer domain. A problem defined on a bounded domain is called an inner problem. A classical solution corresponding to continuous $a$, $b$, $g$, and $f$ of the oblique boundary problem for the Poisson equation is a function $u \in C^2(\Sigma) \cap C^1(\overline{\Sigma})$ which fulfills the first two equations. For the outer problem, $u$ must be regular at infinity, i.e., $u(x) \to 0$ for $|x| \to \infty$. Existence and uniqueness results for a classical solution to regular oblique boundary problems for the Poisson equation are already available; see, e.g., Miranda (1970), Gilbarg and Trudinger (1998), or Rozanov and Sanso (2002a). In order to allow very weak assumptions on boundary, coefficients, and inhomogeneities, we are interested in weak solutions from Sobolev spaces of once weakly differentiable functions. When facing the deterministic problems, we have to distinguish the inner and the outer setting. The reason is that a Poincaré inequality, namely,
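\[ \int_\Sigma \langle \nabla u, \nabla u \rangle \, d\lambda^n + \int_{\partial\Sigma} u^2 \, dH^{n-1} \ge C \left( \int_\Sigma u^2 \, d\lambda^n + \int_\Sigma \langle \nabla u, \nabla u \rangle \, d\lambda^n \right), \]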
for all $u \in H^{1,2}(\Sigma)$, is only available for bounded $\Sigma$. Thus, we can only use the Lax-Milgram lemma for the inner problem in order to gain a solution operator. For the outer problem, we use the Kelvin transformation to transform the unbounded domain $\Sigma$ to a bounded domain $\Sigma^K$ via
\[ \Sigma^K := \left\{ \frac{x}{|x|^2} \;\middle|\; x \in \Sigma \right\} \cup \{0\}. \]
Additionally, we transform coefficients as well as inhomogeneities and end up with an inner problem, which possesses a unique weak solution $v$. Finally, we transform this function to the outer space by
\[ u(x) := \frac{1}{|x|^{n-2}} \, v\!\left( \frac{x}{|x|^2} \right), \]
for all $x \in \Sigma$. This $u$ is then the weak solution to the outer problem, and it can be shown that, in the case of existence, $u$ is the classical solution. Additionally, the transformations are continuous, and consequently the solution depends continuously on the data. Before we go on with stochastic inhomogeneities and stochastic weak solutions, we want to mention that we have to assume a regular inner problem, while we have a transformed regularity condition for the outer problem resulting from the transformations. Going to a stochastic setting, we have to introduce the spaces of stochastic functions. These are constructed as the tensor product of $L^2(\Omega, dP)$, with a suitable probability space $(\Omega, \mathcal{F}, P)$, and the Sobolev spaces used in the deterministic theory. They are again Hilbert spaces, and we have isomorphisms to Hilbert space-valued random variables. For the stochastic inner problem, we again employ the Lax-Milgram lemma, while in the outer setting, we define the solution operator pointwise for almost all $\omega \in \Omega$. For all solutions, deterministic as well as stochastic, a Ritz-Galerkin approximation method is available. Finally, we give some examples from geomathematics, where stochastic inhomogeneities are implemented. Proofs for the results presented in this chapter are given in Grothaus and Raskop (2006, 2009). The examples are taken from Freeden and Maier (2002) and Bauer (2004). We want to mention that the articles Rozanov and Sanso (2001) as well as Rozanov and Sanso (2002b) also deal with solutions to oblique boundary-value problems.
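The back-transformation from the inner to the outer setting can be illustrated numerically. The following minimal Python sketch writes $v$ for the weak solution of the transformed inner problem and assumes it is available as a callable; the function names `kelvin_point` and `kelvin_transform` are illustrative and not taken from the chapter.

```python
import numpy as np

def kelvin_point(x):
    """Inversion at the unit sphere: x -> x / |x|^2 (x must be nonzero)."""
    x = np.asarray(x, dtype=float)
    return x / np.dot(x, x)

def kelvin_transform(v):
    """Return the function u with u(x) = |x|^(2-n) * v(x / |x|^2), n = dimension of x."""
    def u(x):
        x = np.asarray(x, dtype=float)
        n = x.size
        return np.linalg.norm(x) ** (2 - n) * v(kelvin_point(x))
    return u

# Example in R^3: the constant function v = 1 is mapped to u(x) = 1/|x|,
# which is harmonic away from the origin and regular at infinity.
u = kelvin_transform(lambda y: 1.0)
x = np.array([2.0, 0.0, 0.0])
print(u(x))                            # value of 1/|x| at |x| = 2
print(kelvin_point(kelvin_point(x)))   # applying the inversion twice returns x
```

The example also shows why the factor $|x|^{2-n}$ matters: it is what makes the transformed function decay at infinity, as required of a solution to the outer problem.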
2 Scientifically Relevant Domains and Function Spaces
In this section, we consider boundary-value problems for the Poisson equation. This means we are searching for a function which satisfies the Poisson equation in a subset $\Sigma$ of $\mathbb{R}^n$ and an additional condition on the boundary $\partial\Sigma$ of this set, i.e.,
\[ \Delta u = f \quad \text{in } \Sigma, \qquad \langle a, \nabla u \rangle + bu = g \quad \text{on } \partial\Sigma. \]
Here, $f$ and $g$ are called inhomogeneities, $a$ and $b$ are called coefficients, and such a function $u$ is then called the solution. Our analysis is motivated by problems from geomathematics. There, oblique boundary problems arise frequently, because in general the normal of the Earth's surface does not coincide with the direction of the gravity vector. Therefore, the oblique boundary condition is more suitable than a Neumann boundary condition. For details, see Bauer (2004) or Gutting (2008). We are dealing with two different types of sets $\Sigma$, namely, bounded and outer $C^{m,\alpha}$-domains, which are introduced by the following definition. In particular, the outer problem is of major interest for applications.

Definition 1. $\partial\Sigma \subset \mathbb{R}^n$ is called a $C^{m,\alpha}$-surface, $m \in \mathbb{N}$ and $0 \le \alpha \le 1$, and $\Sigma$ is called a bounded $C^{m,\alpha}$-domain if and only if
• $\Sigma$ is a bounded subset of $\mathbb{R}^n$ which is a domain, i.e., open and connected
• There exists an open cover $(U_i)_{i=1,\ldots,N}$ of $\partial\Sigma$ and corresponding $C^{m,\alpha}$-diffeomorphisms $\Psi_i : B_1^{\mathbb{R}^n}(0) \to U_i$, $i = 1, \ldots, N$, such that
\[ \Psi_i : B_1^0(0) \to U_i \cap \partial\Sigma, \qquad \Psi_i : B_1^+(0) \to U_i \cap \Sigma, \qquad \Psi_i : B_1^-(0) \to U_i \cap \mathbb{R}^n \setminus \overline{\Sigma}, \]
where $B_1^{\mathbb{R}^n}(0)$ denotes the open unit ball in $\mathbb{R}^n$, i.e., all $x \in \mathbb{R}^n$ with $|x| < 1$. $B_1^0(0)$ denotes the set of all $x \in B_1^{\mathbb{R}^n}(0)$ with $x_n = 0$, $B_1^+(0)$ denotes the set of all $x \in B_1^{\mathbb{R}^n}(0)$ with $x_n > 0$, and $B_1^-(0)$ denotes the set of all $x \in B_1^{\mathbb{R}^n}(0)$ with $x_n < 0$.
On the other hand, $\Sigma$ is called an outer $C^{m,\alpha}$-domain if and only if $\Sigma \subset \mathbb{R}^n$ is open, connected, and representable as $\Sigma := \mathbb{R}^n \setminus \overline{D}$, where $D$ is a bounded $C^{m,\alpha}$-domain such that $0 \in D$. $\Psi_i$ is called a $C^{m,\alpha}$-diffeomorphism if and only if it is bijective, $(\Psi_i)_j \in C^{m,\alpha}\big(\overline{B_1^{\mathbb{R}^n}(0)}\big)$, $(\Psi_i^{-1})_j \in C^{m,\alpha}(\overline{U}_i)$, $j = 1, \ldots, n$, and we have for the determinant of the Jacobian matrix of $\Psi_i$ that $\operatorname{Det}(D\Psi_i) \ne 0$ in $\overline{B_1^{\mathbb{R}^n}(0)}$.

In Fig. 1, such a $C^{m,\alpha}$-surface is illustrated. For this definition and further details, see, e.g., Dobrowolski (2006). The definition is independent of the mappings chosen. $\partial\Sigma$ is a compact and double-point-free $(n-1)$-dimensional $C^{m,\alpha}$-submanifold. The outer unit normal vector $\nu$ is a $C^{m-1}$-vector field. Furthermore, we find a $C^\infty$ partition of unity $(w_i)_{1 \le i \le N}$ on $\partial\Sigma$ corresponding to the open cover $(U_i)_{1 \le i \le N}$, provided by Alt (2002). $H^{n-1}$ denotes the $(n-1)$-dimensional Hausdorff measure on $\partial\Sigma$ and $\lambda^n$ the Lebesgue measure in $\mathbb{R}^n$. Throughout this chapter, we assume Lipschitz boundaries, i.e., $C^{0,1}$-boundaries $\partial\Sigma$. Then we have $\nu \in L^\infty(\partial\Sigma; \mathbb{R}^n)$. Note that
Fig. 1 $C^{m,\alpha}$-surface
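some geomathematically relevant examples are even $C^\infty$-surfaces, e.g., a sphere or an ellipsoid. We will see in Sects. 3 and 4 that the cases of bounded and outer domains have to be treated differently, because the unboundedness causes problems which do not occur in the bounded setting. Nonetheless, we are searching in both cases for solutions under assumptions as weak as possible. More precisely, we are searching for solutions in Sobolev spaces for inhomogeneities from Banach space duals of Sobolev spaces. These spaces are introduced in the following.

Definition 2. Let $\Sigma$ be a bounded $C^{0,1}$-domain and $r \in \mathbb{N}$. We define
\[ H^{r,2}(\Sigma) := \{ F : \Sigma \to \mathbb{R} \mid \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n} F \in L^2(\Sigma) \text{ for all } \alpha_1 + \cdots + \alpha_n \le r \}, \]
\[ \|F\|_{H^{r,2}(\Sigma)} := \Big( \sum_{|\alpha| = 0}^{r} \|\partial^\alpha F\|^2_{L^2(\Sigma)} \Big)^{1/2}. \]
Let $\Sigma$ be an outer $C^{0,1}$-domain and $\varrho_1$, $\varrho_2$, $\varrho_3$ be continuous, positive functions defined on $\overline{\Sigma}$. We define
\[ L^2_{\varrho_1}(\Sigma) := \Big\{ F : \Sigma \to \mathbb{R} \;\Big|\; F \text{ is measurable with } \int_\Sigma F^2(x) \varrho_1^2(x) \, d\lambda^n(x) < \infty \Big\}, \]
\[ H^{1,2}_{\varrho_1,\varrho_2}(\Sigma) := \{ F \in L^2_{\varrho_1}(\Sigma) \mid \partial_i F \in L^2_{\varrho_2}(\Sigma),\ 1 \le i \le n \}, \]
\[ H^{2,2}_{\varrho_1,\varrho_2,\varrho_3}(\Sigma) := \{ F \in L^2_{\varrho_1}(\Sigma) \mid \partial_i F \in L^2_{\varrho_2}(\Sigma) \text{ and } \partial_i \partial_j F \in L^2_{\varrho_3}(\Sigma),\ 1 \le j, i \le n \}, \]
\[ \|F\|_{L^2_{\varrho_1}(\Sigma)} := \Big( \int_\Sigma F^2(x) \varrho_1^2(x) \, d\lambda^n(x) \Big)^{1/2}, \]
\[ \|F\|_{H^{1,2}_{\varrho_1,\varrho_2}(\Sigma)} := \Big( \|F\|^2_{L^2_{\varrho_1}(\Sigma)} + \sum_{i=1}^{n} \|\partial_i F\|^2_{L^2_{\varrho_2}(\Sigma)} \Big)^{1/2}, \]
\[ \|F\|_{H^{2,2}_{\varrho_1,\varrho_2,\varrho_3}(\Sigma)} := \Big( \|F\|^2_{L^2_{\varrho_1}(\Sigma)} + \sum_{i=1}^{n} \|\partial_i F\|^2_{L^2_{\varrho_2}(\Sigma)} + \sum_{i,j=1}^{n} \|\partial_i \partial_j F\|^2_{L^2_{\varrho_3}(\Sigma)} \Big)^{1/2}. \]

[...]

\[ |\langle a(x), \nu(x) \rangle| > C_1 > 0, \tag{1} \]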
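for all $x \in \partial\Sigma$, where $0 < C_1 < \infty$. Finding a function $u \in C^2(\Sigma) \cap C^1(\overline{\Sigma})$ such that
\[ \Delta u = f \quad \text{in } \Sigma, \qquad \langle a, \nabla u \rangle + bu = g \quad \text{on } \partial\Sigma, \]
is called the inner regular oblique boundary problem for the Poisson equation, and $u$ is called the classical solution. Because of the condition in Eq. (1), the problem is called regular. It just means that the vector field $a$ is nontangential to $\partial\Sigma$ for all $x \in \partial\Sigma$. Now we derive the weak formulation. The fundamental theorem of the calculus of variations gives
\[ \Delta u = f \quad \text{in } \Sigma \]
if and only if
\[ \int_\Sigma \eta \, \Delta u \, d\lambda^n = \int_\Sigma \eta f \, d\lambda^n \quad \text{for all } \eta \in C_0^\infty(\Sigma) \]
if and only if
\[ \int_\Sigma \eta \, \Delta u \, d\lambda^n = \int_\Sigma \eta f \, d\lambda^n \quad \text{for all } \eta \in C^\infty(\overline{\Sigma}). \]
Additionally on $\Sigma$, the following Green formula is valid:
\[ \int_\Sigma \varphi \, \Delta\psi \, d\lambda^n + \int_\Sigma \langle \nabla\varphi, \nabla\psi \rangle \, d\lambda^n = \int_{\partial\Sigma} \varphi \, \frac{\partial\psi}{\partial\nu} \, dH^{n-1}, \]
for all $\psi \in C^2(\Sigma) \cap C^1(\overline{\Sigma})$ and $\varphi \in C^1(\overline{\Sigma})$. This yields for a classical solution
\[ \int_{\partial\Sigma} \eta \, \frac{\partial u}{\partial\nu} \, dH^{n-1} - \int_\Sigma \langle \nabla\eta, \nabla u \rangle \, d\lambda^n = \int_\Sigma \eta f \, d\lambda^n, \]
for all $\eta \in C^\infty(\overline{\Sigma})$. Now we transform the boundary condition
\[ \langle a, \nabla u \rangle + bu = g \quad \text{on } \partial\Sigma, \]
to the form
\[ \langle a, \nu \rangle \frac{\partial u}{\partial\nu} + \langle a - \langle a, \nu \rangle \nu, \nabla_{\partial\Sigma} u \rangle + bu = g \quad \text{on } \partial\Sigma. \]
Using Eq. (1) we may divide this by $\langle a, \nu \rangle \ne 0$ to get the equivalent boundary condition
\[ \frac{\partial u}{\partial\nu} + \Big\langle \frac{a}{\langle a, \nu \rangle} - \nu, \nabla_{\partial\Sigma} u \Big\rangle + \frac{b}{\langle a, \nu \rangle} u = \frac{g}{\langle a, \nu \rangle} \quad \text{on } \partial\Sigma. \]
Plugging this condition into the equation above, we get the following formulation of the regular oblique boundary problem for the Poisson equation, which is equivalent to the formulation given in Definition 3. We want to find a function $u \in C^2(\Sigma) \cap C^1(\overline{\Sigma})$ such that
\[ \int_{\partial\Sigma} \eta \Big( \frac{g}{\langle a, \nu \rangle} - \Big\langle \frac{a}{\langle a, \nu \rangle} - \nu, \nabla_{\partial\Sigma} u \Big\rangle - \frac{b}{\langle a, \nu \rangle} u \Big) dH^{n-1} - \int_\Sigma \langle \nabla\eta, \nabla u \rangle \, d\lambda^n - \int_\Sigma \eta f \, d\lambda^n = 0 \]
for all $\eta \in C^\infty(\overline{\Sigma})$.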
The transformation of the boundary term is shown in Fig. 2.

Fig. 2 Transformation of the oblique boundary condition
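The splitting of the oblique derivative into a normal and a tangential part can be checked numerically at a single boundary point. The Python sketch below uses arbitrary illustrative vectors for $a$ and $\nabla u$ at a point of the unit sphere (where the outer unit normal equals the position vector); the function name is hypothetical.

```python
import numpy as np

def split_oblique_derivative(a, nu, grad_u):
    """Return <a, nu>, the normal derivative du/dnu, and <a - <a, nu> nu, grad_S u>."""
    nu = nu / np.linalg.norm(nu)                     # make sure nu is a unit vector
    a_nu = a @ nu                                    # <a, nu>, nonzero for a regular problem
    normal_derivative = grad_u @ nu                  # du/dnu
    surface_grad = grad_u - normal_derivative * nu   # tangential (surface) gradient
    tangential_part = (a - a_nu * nu) @ surface_grad
    return a_nu, normal_derivative, tangential_part

nu = np.array([0.6, 0.0, 0.8])        # point on the unit sphere, also its outer normal
a = np.array([0.3, -0.2, 1.0])        # illustrative oblique direction
grad_u = np.array([1.0, 2.0, -0.5])   # illustrative gradient of u at that point

a_nu, du_dnu, tangential = split_oblique_derivative(a, nu, grad_u)
lhs = a @ grad_u                      # <a, grad u>
rhs = a_nu * du_dnu + tangential      # <a, nu> du/dnu + <a - <a, nu> nu, grad_S u>
print(lhs, rhs)
```

Both printed numbers agree up to rounding, which is the pointwise identity behind the transformed boundary condition; dividing by `a_nu` is admissible exactly when the regularity condition in Eq. (1) holds.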
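Finally, we are weakening the assumptions on data, coefficients, the test function, and the solution. We give the weak formulation of the inner regular oblique boundary problem for the Poisson equation, summarized in the following definition.

Definition 4. Let $\Sigma$ be a bounded $C^{1,1}$-domain, $a \in H^{1,\infty}(\partial\Sigma; \mathbb{R}^n)$ fulfilling the condition in Eq. (1), $b \in L^\infty(\partial\Sigma)$, $g \in H^{-\frac{1}{2},2}(\partial\Sigma)$, and $f \in (H^{1,2}(\Sigma))'$. We want to find a function $u \in H^{1,2}(\Sigma)$ such that
\[ {}_{H^{\frac{1}{2},2}(\partial\Sigma)}\Big\langle \eta, \frac{g}{\langle a, \nu \rangle} \Big\rangle_{H^{-\frac{1}{2},2}(\partial\Sigma)} - \int_\Sigma (\nabla\eta \cdot \nabla u) \, d\lambda^n - \sum_{i=1}^{n} {}_{H^{\frac{1}{2},2}(\partial\Sigma)}\Big\langle \eta \Big( \frac{a_i}{\langle a, \nu \rangle} - \nu_i \Big), (\nabla_{\partial\Sigma} u)_i \Big\rangle_{H^{-\frac{1}{2},2}(\partial\Sigma)} - \int_{\partial\Sigma} \eta \, \frac{b}{\langle a, \nu \rangle} \, u \, dH^{n-1} - {}_{H^{1,2}(\Sigma)}\langle \eta, f \rangle_{(H^{1,2}(\Sigma))'} = 0, \]
for all $\eta \in H^{1,2}(\Sigma)$. Then $u$ is called a weak solution of the inner regular oblique boundary problem for the Poisson equation.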
3.2 Existence and Uniqueness Results for the Weak Solution
It is possible to prove the following existence and uniqueness result for the weak solution to the deterministic inner oblique boundary-value problem for the Poisson equation.

Theorem 1. Let $\Sigma$ be a bounded $C^{1,1}$-domain, $a \in H^{1,\infty}(\partial\Sigma; \mathbb{R}^n)$, fulfilling the condition in Eq. (1), and $b \in L^\infty(\partial\Sigma)$ such that
\[ \operatorname*{ess\,inf}_{\partial\Sigma} \Big( \frac{b}{\langle a, \nu \rangle} - \frac{1}{2} \operatorname{div}_{\partial\Sigma} \Big( \frac{a}{\langle a, \nu \rangle} - \nu \Big) \Big) > 0. \tag{2} \]
Then for all $f \in (H^{1,2}(\Sigma))'$ and $g \in H^{-\frac{1}{2},2}(\partial\Sigma)$, there exists one and only one weak solution $u \in H^{1,2}(\Sigma)$ of the inner regular oblique boundary problem for the Poisson equation. Additionally, we have for a constant $0 < C_2 < \infty$
\[ \|u\|_{H^{1,2}(\Sigma)} \le C_2 \Big( \|f\|_{(H^{1,2}(\Sigma))'} + \|g\|_{H^{-\frac{1}{2},2}(\partial\Sigma)} \Big). \]

In the proof, we apply the Lax-Milgram lemma, which gives us a unique $u \in H^{1,2}(\Sigma)$ fulfilling the variational equation
\[ F(\eta) = a(\eta, u), \]
for all $\eta \in H^{1,2}(\Sigma)$, provided that $F$ and $a$ are continuous and that, additionally, $a$ is a coercive bilinear form. $F$ and $a$ can be obtained easily from the weak formulation as
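\[ F(\eta) = {}_{H^{\frac{1}{2},2}(\partial\Sigma)}\Big\langle \eta, \frac{g}{\langle a, \nu \rangle} \Big\rangle_{H^{-\frac{1}{2},2}(\partial\Sigma)} - {}_{H^{1,2}(\Sigma)}\langle \eta, f \rangle_{(H^{1,2}(\Sigma))'}, \]
\[ a(\eta, u) = \sum_{i=1}^{n} {}_{H^{\frac{1}{2},2}(\partial\Sigma)}\Big\langle \eta \Big( \frac{a_i}{\langle a, \nu \rangle} - \nu_i \Big), (\nabla_{\partial\Sigma} u)_i \Big\rangle_{H^{-\frac{1}{2},2}(\partial\Sigma)} + \int_\Sigma (\nabla\eta \cdot \nabla u) \, d\lambda^n + \int_{\partial\Sigma} \eta \, \frac{b}{\langle a, \nu \rangle} \, u \, dH^{n-1}. \]
The continuity can be shown by some results about the Sobolev spaces occurring in the weak formulation. In order to prove that $a$ is coercive, i.e., $|a(u, u)| \ge C_3 \|u\|^2_{H^{1,2}(\Sigma)}$, the Poincaré inequality
\[ \int_\Sigma \langle \nabla F, \nabla F \rangle \, d\lambda^n + \int_{\partial\Sigma} F^2 \, dH^{n-1} \ge C_4 \Big( \int_\Sigma F^2 \, d\lambda^n + \int_\Sigma \langle \nabla F, \nabla F \rangle \, d\lambda^n \Big), \]
which is valid for all $F \in H^{1,2}(\Sigma)$ and a constant $0 < C_4 < \infty$, is indispensable. Finally, the condition
\[ \operatorname*{ess\,inf}_{\partial\Sigma} \Big( \frac{b}{\langle a, \nu \rangle} - \frac{1}{2} \operatorname{div}_{\partial\Sigma} \Big( \frac{a}{\langle a, \nu \rangle} - \nu \Big) \Big) > 0 \]
is also essential to ensure the coercivity of $a$. The condition in Eq. (2) can be transformed into the equivalent form
\[ \langle a, \nu \rangle \, b > \frac{1}{2} (\langle a, \nu \rangle)^2 \operatorname{div}_{\partial\Sigma} \Big( \frac{a}{\langle a, \nu \rangle} - \nu \Big) \]
$H^{n-1}$-almost everywhere on $\partial\Sigma$. If $\operatorname{div}_{\partial\Sigma} \big( \frac{a}{\langle a, \nu \rangle} - \nu \big) = 0$ $H^{n-1}$-almost everywhere on $\partial\Sigma$, we have for $H^{n-1}$-almost all $x \in \partial\Sigma$ the condition from the existence and uniqueness result for the classical solution. Furthermore, for $a = \nu$, i.e., the Robin problem, the condition reduces to $b > 0$ $H^{n-1}$-almost everywhere on $\partial\Sigma$. Finally, we are able to define for each bounded $C^{1,1}$-domain $\Sigma$, $a \in H^{1,\infty}(\partial\Sigma; \mathbb{R}^n)$ and $b \in L^\infty(\partial\Sigma)$, fulfilling the conditions in Eqs. (1) and (2), a continuous invertible linear solution operator $S^{\mathrm{in}}_{a,b}$ by
\[ S^{\mathrm{in}}_{a,b} : (H^{1,2}(\Sigma))' \times H^{-\frac{1}{2},2}(\partial\Sigma) \to H^{1,2}(\Sigma), \quad (f, g) \mapsto u, \]
where $u$ is the weak solution provided by Theorem 1. In addition, this means that the inner weak problem is well posed.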
3.3 A Regularization Result
In this section, we will show that the weak solution from the previous section is even an element of $H^{2,2}(\Sigma)$ if we choose the inhomogeneities and coefficients smooth enough. The result for the oblique boundary problem is based on a regularization result for the weak solution to the Neumann problem for the Poisson equation.

Theorem 2. Let $\Sigma \subset \mathbb{R}^n$ be a bounded $C^{2,1}$-domain, $a \in H^{2,\infty}(\partial\Sigma; \mathbb{R}^n)$ fulfilling the condition in Eq. (1), and $b \in H^{1,\infty}(\partial\Sigma)$. Then for all $f \in L^2(\Sigma)$ and $g \in H^{\frac{1}{2},2}(\partial\Sigma)$, the weak solution $u \in H^{1,2}(\Sigma)$ to the inner regular oblique boundary problem for the Poisson equation, provided in Theorem 1, is even in $H^{2,2}(\Sigma)$. Furthermore, we have the a priori estimate
\[ \|u\|_{H^{2,2}(\Sigma)} \le C_5 \Big( \|f\|_{L^2(\Sigma)} + \|g\|_{H^{\frac{1}{2},2}(\partial\Sigma)} \Big), \]
for a constant $0 < C_5 < \infty$.

In order to prove the result, it suffices to show that the normal derivative of the weak solution $u$ of the oblique boundary problem is an element of $H^{\frac{1}{2},2}(\partial\Sigma)$. Therefore, we use some results for Sobolev spaces defined on submanifolds. The weak solution in $H^{2,2}(\Sigma)$ is related to the classical solution in the following way. Let $u \in H^{2,2}(\Sigma)$ be the weak solution to the inner regular oblique boundary problem for the Poisson equation, provided by Theorem 2. Then we have
\[ \Delta u = f \quad \lambda^n\text{-almost everywhere in } \Sigma, \qquad \langle a, \nabla u \rangle + bu = g \quad H^{n-1}\text{-almost everywhere on } \partial\Sigma. \]
We call such a solution a strong solution to the inner regular oblique boundary problem for the Poisson equation.
3.4 Ritz-Galerkin Approximation
In this section, we provide a Ritz-Galerkin method which allows us to approximate the weak solution with the help of a numerical computation. Let $a(\eta, u)$ and $F(\eta)$ be defined as above and let the conditions of Theorem 1 be satisfied. Furthermore, let $(U_n)_{n \in \mathbb{N}}$ be an increasing sequence of finite-dimensional subspaces of $H^{1,2}(\Sigma)$, i.e., $U_n \subset U_{n+1}$, such that $\overline{\bigcup_{n \in \mathbb{N}} U_n} = H^{1,2}(\Sigma)$. Because $U_n$, as a finite-dimensional subspace of the Hilbert space $H^{1,2}(\Sigma)$, is itself a Hilbert space, we find for each $n \in \mathbb{N}$ a unique $u_n \in U_n$ with
\[ a(\eta, u_n) = F(\eta) \quad \text{for all } \eta \in U_n. \]
Moreover, let $d := \dim(U_n)$ and $(\varphi_k)_{1 \le k \le d}$ be a basis of $U_n$. Then $u_n \in U_n$ has the following unique representation:
\[ u_n = \sum_{i=1}^{d} h_i \varphi_i, \]
where $(h_i)_{1 \le i \le d}$ is the solution of the linear system of equations given by
\[ \sum_{i=1}^{d} a(\varphi_j, \varphi_i) \, h_i = F(\varphi_j), \quad 1 \le j \le d. \]
The following result from Céa proves that the sequence $(u_n)_{n \in \mathbb{N}}$ really approximates the weak solution $u$.

Theorem 3. Let $u$ be the weak solution provided by Theorem 1 and $(u_n)_{n \in \mathbb{N}}$ taken from above. Then
\[ \|u - u_n\|_{H^{1,2}(\Sigma)} \le \frac{C_6}{C_7} \operatorname{dist}(u, U_n) \xrightarrow{\,n \to \infty\,} 0, \]
where $C_6$ and $C_7$ are the continuity and the coercivity constants of $a$.
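To make the linear system concrete, the following Python sketch applies the Ritz-Galerkin method to a one-dimensional analogue of the inner problem. Everything specific here is an assumption of the sketch and not part of the chapter: the domain is the interval $(0, 1)$, the problem is $u'' = f$ with the Robin condition $\partial u/\partial\nu + bu = g$ at both end points (the case $a = \nu$), the subspace $U_n$ is spanned by monomials, and integrals are evaluated by Gauss-Legendre quadrature. In this setting $a(\eta, u) = \int_0^1 \eta' u' \, dx + \sum_{x \in \{0, 1\}} b(x) \eta(x) u(x)$ and $F(\eta) = \sum_{x \in \{0, 1\}} g(x) \eta(x) - \int_0^1 \eta f \, dx$.

```python
import numpy as np

def ritz_galerkin_1d(f, b, g, dim, quad_order=20):
    """Galerkin coefficients h for u'' = f on (0, 1) with du/dnu + b*u = g at x = 0, 1.

    Basis (an assumption of this sketch): monomials phi_k(x) = x**k, k = 0..dim-1.
    b and g hold the boundary values at x = 0 and x = 1.
    """
    k = np.arange(dim)
    phi = lambda t: np.atleast_1d(t)[:, None] ** k
    dphi = lambda t: k * np.atleast_1d(t)[:, None] ** np.maximum(k - 1, 0)

    # Gauss-Legendre nodes and weights, mapped from [-1, 1] to [0, 1].
    x, w = np.polynomial.legendre.leggauss(quad_order)
    x, w = 0.5 * (x + 1.0), 0.5 * w

    ends = phi(np.array([0.0, 1.0]))      # basis values at the two boundary points
    b = np.asarray(b, dtype=float)
    g = np.asarray(g, dtype=float)

    # A[j, i] = a(phi_j, phi_i) = int phi_j' phi_i' dx + sum over boundary of b phi_j phi_i
    A = dphi(x).T @ (w[:, None] * dphi(x)) + ends.T @ (b[:, None] * ends)
    # rhs[j] = F(phi_j) = sum over boundary of g phi_j - int phi_j f dx
    rhs = ends.T @ g - phi(x).T @ (w * f(x))
    return np.linalg.solve(A, rhs)

# Test problem with known solution u(x) = x**2: f = 2, b = 1, g(0) = 0, g(1) = 3.
h = ritz_galerkin_1d(lambda x: 2.0 * np.ones_like(x), b=(1.0, 1.0), g=(0.0, 3.0), dim=3)
print(h)   # coefficients of 1, x, x**2
```

Since the exact solution $x^2$ lies in the chosen subspace, the computed coefficients reproduce it up to rounding, in line with Theorem 3 for $\operatorname{dist}(u, U_n) = 0$. The positivity of $b$ is what makes the matrix positive definite here, mirroring the Robin case $b > 0$ of condition (2).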
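3.5 Stochastic Extensions
First, we define the spaces of stochastic functions. We choose a probability space $(\Omega, \mathcal{F}, P)$, arbitrary but fixed, such that $L^2(\Omega, dP)$ is separable, and define, with the help of the tensor product,
\[ (H^{1,2}(\Sigma))'_\Omega := L^2(\Omega, P) \otimes (H^{1,2}(\Sigma))' \cong L^2(\Omega, P; (H^{1,2}(\Sigma))'), \]
\[ H^{\frac{1}{2},2}_\Omega(\partial\Sigma) := L^2(\Omega, P) \otimes H^{\frac{1}{2},2}(\partial\Sigma) \cong L^2(\Omega, P; H^{\frac{1}{2},2}(\partial\Sigma)), \]
\[ H^{-\frac{1}{2},2}_\Omega(\partial\Sigma) := L^2(\Omega, P) \otimes H^{-\frac{1}{2},2}(\partial\Sigma) \cong L^2(\Omega, P; H^{-\frac{1}{2},2}(\partial\Sigma)), \]
\[ L^2_\Omega(\Sigma) := L^2(\Omega, P) \otimes L^2(\Sigma) \cong L^2(\Omega, P; L^2(\Sigma)), \]
\[ H^{1,2}_\Omega(\Sigma) := L^2(\Omega, P) \otimes H^{1,2}(\Sigma) \cong L^2(\Omega, P; H^{1,2}(\Sigma)), \]
\[ H^{2,2}_\Omega(\Sigma) := L^2(\Omega, P) \otimes H^{2,2}(\Sigma) \cong L^2(\Omega, P; H^{2,2}(\Sigma)). \]
Now we can investigate the stochastic inner regular oblique boundary problem for the Poisson equation. We are searching for a solution $u \in H^{1,2}_\Omega(\Sigma)$ of
\[ \Delta u(x, \omega) = f(x, \omega) \quad \text{for all } x \in \Sigma,\ P\text{-a.a. } \omega \in \Omega, \]
\[ (a \cdot \nabla u(x, \omega)) + b \, u(x, \omega) = g(x, \omega) \quad \text{for all } x \in \partial\Sigma,\ P\text{-a.a. } \omega \in \Omega, \]
\[ |(a \cdot \nu)| > C_8 > 0 \quad \text{on } \partial\Sigma. \]
Using the argumentation from Sect. 3.1, we come immediately to the weak formulation of the stochastic boundary problem.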
M. Grothaus and T. Raskop
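Definition 5. Find $u \in H^{1,2}_\Omega(\Sigma)$ with
$$\begin{aligned}
&\int_\Omega\left({}_{H^{\frac12,2}(\partial\Sigma)}\Big\langle \eta, \frac{g}{\langle a,\nu\rangle}\Big\rangle_{H^{-\frac12,2}(\partial\Sigma)} - \int_{\partial\Sigma}\eta\,\frac{b}{\langle a,\nu\rangle}\,u\,dH^{n-1}\right)dP\\
&\quad-\int_\Omega\left(\sum_{i=1}^{n}{}_{H^{\frac12,2}(\partial\Sigma)}\Big\langle \eta\,\frac{a_i}{\langle a,\nu\rangle}, (\nabla_{\partial\Sigma}u)_i\Big\rangle_{H^{-\frac12,2}(\partial\Sigma)}\right)dP\\
&\quad-\int_\Omega\left(\int_\Sigma(\nabla\eta\cdot\nabla u)\,d\lambda^n + {}_{H^{1,2}(\Sigma)}\langle\eta, f\rangle_{(H^{1,2}(\Sigma))'}\right)dP = 0
\end{aligned}$$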
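for all $\eta \in H^{1,2}_\Omega(\Sigma)$. $u$ is called the stochastic weak solution of the stochastic inner regular oblique boundary problem for the Poisson equation. Obviously, $u \in H^{1,2}_\Omega(\Sigma)$ is a stochastic weak solution of the stochastic regular oblique boundary problem for the Poisson equation if and only if for $P$-a.a. $\omega \in \Omega$, $u_\omega := u(\cdot,\omega)$ is a weak solution of the deterministic problem
$$\Delta u_\omega = f(\cdot,\omega) \ \text{ on } \Sigma, \qquad \langle a, \nabla u_\omega\rangle + b\,u_\omega = g(\cdot,\omega) \ \text{ on } \partial\Sigma.$$
The solution operator of the deterministic problem extends to the stochastic setting in the following way.
Theorem 4. Let $\Sigma$ be a bounded $C^{1,1}$-domain, $a \in H^{1,\infty}(\partial\Sigma;\mathbb{R}^n)$, fulfilling the condition in Eq. (1), and $b \in L^\infty(\partial\Sigma)$ such that
$$\operatorname*{ess\,inf}_{\partial\Sigma}\left(\frac{b}{\langle a,\nu\rangle} - \frac12\,\mathrm{div}_{\partial\Sigma}\left(\frac{a}{\langle a,\nu\rangle} - \nu\right)\right) > 0.$$
Then for all $f \in (H^{1,2}_\Omega(\Sigma))'$ and $g \in H^{-\frac12,2}_\Omega(\partial\Sigma)$, there exists one and only one stochastic weak solution $u \in H^{1,2}_\Omega(\Sigma)$ of the stochastic inner regular oblique boundary problem for the Poisson equation. Additionally, we have for a constant $0 < C_9 < \infty$
$$\|u\|_{H^{1,2}_\Omega(\Sigma)} \le C_9\left(\|f\|_{(H^{1,2}_\Omega(\Sigma))'} + \|g\|_{H^{-\frac12,2}_\Omega(\partial\Sigma)}\right).$$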
In the proof, we use the results from the deterministic setting in order to verify that the requirements of the Lax-Milgram lemma are fulfilled. Using the isomorphisms of the tensor product spaces to spaces of Hilbert space-valued random variables, the regularization result also translates to the stochastic setting.
Theorem 5. Let $\Sigma \subset \mathbb{R}^n$ be a bounded $C^{2,1}$-domain, $a \in H^{2,\infty}(\partial\Sigma;\mathbb{R}^n)$ fulfilling the condition in Eq. (1) and $b \in H^{1,\infty}(\partial\Sigma)$. Then for all $f \in L^2_\Omega(\Sigma)$ and $g \in H^{\frac12,2}_\Omega(\partial\Sigma)$, the weak solution $u \in H^{1,2}_\Omega(\Sigma)$ to the inner regular oblique boundary problem for the Poisson equation, provided in Theorem 1, is even in $H^{2,2}_\Omega(\Sigma)$. Furthermore, we have the a priori estimate
$$\|u\|_{H^{2,2}_\Omega(\Sigma)} \le C_{10}\left(\|f\|_{L^2_\Omega(\Sigma)} + \|g\|_{H^{\frac12,2}_\Omega(\partial\Sigma)}\right),$$
for a constant $0 < C_{10} < \infty$. $u$ is called the stochastic strong solution and fulfills the classical problem almost everywhere. At the end of this section, we want to mention that a Ritz-Galerkin approximation is also available for the stochastic weak solution, repeating the procedure from the deterministic problem. For details and proofs of the presented results, we refer the reader to Grothaus and Raskop (2006).
4 Fundamental Results for the Outer Problem
In this section, we provide a solution operator for the outer oblique boundary problem for the Poisson equation. The results presented in this section are taken from Grothaus and Raskop (2009), and further details on the proofs can be found in this reference. The outer problem is defined in an unbounded domain $\Sigma \subset \mathbb{R}^n$ which is representable as $\mathbb{R}^n \setminus D$, where $D$ is a bounded domain. Additionally, we assume $0 \in D$, which is necessary for the Kelvin transformation. For unbounded $\Sigma$, a Poincaré inequality is not yet available. Consequently, we cannot use the technique used for the inner problem because we are unable to prove coercivity of the bilinear form of a weak formulation corresponding to the outer problem. Thus, we will not derive a weak formulation for the outer problem, and we do not have to consider a regular outer problem. Our approach is to transform the outer problem to a corresponding inner problem for which a solution operator is available from the results of the previous section. In this way, we will construct our weak solution, and for this solution a Ritz-Galerkin method is also available because of the continuity of the Kelvin transformation. Finally, we again extend our results to stochastic inhomogeneities as well as stochastic solutions and present some examples from geomathematics. The procedure is described in the following four subsections.
4.1 Transformations to an Inner Setting
In this section, we define the transformations which will be needed in order to transform the outer oblique boundary problem for the Poisson equation to a corresponding regular inner problem. Then we will apply the solution operator in order to get a weak solution in the inner domain. This solution will be transformed with the help of the Kelvin transformation to a function defined in the outer domain. In the next section, we will finally prove that this function solves the outer problem for sufficiently smooth data almost everywhere, which gives the connection to the original problem. The whole procedure is illustrated in Table 1.

Table 1 Transformation procedure
$$\begin{array}{llll}
\text{Outer problem:} & \Sigma & (f,g) \ \xrightarrow{\ S^{\mathrm{out}}_{a,b}\ } & u\\
 & \downarrow K_\Sigma & \downarrow T_1 \quad \downarrow T_2 & \uparrow K\\
\text{Inner problem:} & \Sigma^K & (T_1(f),T_2(g)) \ \xrightarrow{\ S^{\mathrm{in}}_{T_3(a),T_4(b)}\ } & \text{inner weak solution}
\end{array}$$

We proceed in the following way. First, we define the Kelvin transformation $K_\Sigma$ of the outer domain $\Sigma$ to a corresponding bounded domain $\Sigma^K$. Next, the Kelvin transformation $K$ of the solution for the inner problem will be presented. Finally, we define the transformations $T_1$ and $T_2$ for the inhomogeneities as well as $T_3$ and $T_4$ for the coefficients. We will also show that the operators $K$, $T_1$, and $T_2$ are continuous. The consequence is that our solution operator
$$S^{\mathrm{out}}_{a,b}(f,g) := K\left(S^{\mathrm{in}}_{T_3(a),T_4(b)}(T_1(f),T_2(g))\right)$$
forms a linear and continuous solution operator for the outer problem. Because all main results assume $\Sigma$ to be at least an outer $C^{1,1}$-domain, we fix $\Sigma$ for the rest of this section as such a domain, if not stated otherwise. At first, we transform the outer domain $\Sigma$ to a bounded domain $\Sigma^K$. The tool we use is the so-called Kelvin transformation $K_\Sigma$ for domains. We introduce the Kelvin transformation for outer $C^{1,1}$-domains in the following definition.
Definition 6. Let $\Sigma$ be an outer $C^{1,1}$-domain and $x \in \Sigma$ be given. Then we define the Kelvin transformation $K_\Sigma(x)$ of $x$ by
$$K_\Sigma(x) := \frac{x}{|x|^2}.$$
Furthermore, we define $\Sigma^K$ as the Kelvin transformation of $\Sigma$ via
$$\Sigma^K := K_\Sigma(\Sigma)\cup\{0\} = \{K_\Sigma(x)\mid x\in\Sigma\}\cup\{0\}.$$
From this point on, we fix the notation in such a way that $\Sigma^K$ always means the Kelvin transformation of $\Sigma$. Figure 3 illustrates the Kelvin transformation of $\Sigma$.
Fig. 3 Kelvin transformation of $\Sigma$ (sketch of $\Sigma$ with boundary $\partial\Sigma$, its image $\Sigma^K$ with boundary $\partial\Sigma^K$, and a point $x$ with its image $K_\Sigma(x)$)
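The Kelvin transformation of points is easy to explore numerically. The following minimal sketch (the function names `kelvin_point` and `norm` are hypothetical) evaluates $x/|x|^2$ and checks two elementary facts: applying the map twice returns the original point, and points at distance $R$ from the origin are mapped to distance $1/R$.

```python
import math


def kelvin_point(x):
    """Kelvin transformation of a point: x / |x|^2 for x != 0."""
    norm_sq = sum(xi * xi for xi in x)
    if norm_sq == 0.0:
        raise ValueError("the Kelvin transformation is not defined at the origin")
    return tuple(xi / norm_sq for xi in x)


def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))


# A point of an outer domain in R^3 and its image in the bounded domain.
x = (1.0, -2.0, 2.0)       # |x| = 3
y = kelvin_point(x)        # |y| = 1/3
x_back = kelvin_point(y)   # the map is its own inverse

# Points on the circle of radius R around the origin are mapped to radius 1/R.
R = 4.0
circle = [(R * math.cos(t), R * math.sin(t)) for t in (0.0, 0.7, 2.1, 4.0)]
image_radii = [norm(kelvin_point(p)) for p in circle]
```

Because the map inverts distances to the origin, the far field of the outer domain is compressed into a neighborhood of the origin, which is why the origin is added to the image domain.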
We have $K_\Sigma \in C^\infty(\mathbb{R}^n\setminus\{0\};\mathbb{R}^n\setminus\{0\})$ with $K_\Sigma^2 = \mathrm{Id}_{\mathbb{R}^n\setminus\{0\}}$. Furthermore, we obtain by standard calculus, using the Leibniz formula for the determinant,
$$|\mathrm{Det}(D(K_\Sigma))(x)| \le C_{11}\,|x|^{-2n}$$
for all $x \in \mathbb{R}^n\setminus\{0\}$. This is one of the reasons for the weighted measures of the Sobolev spaces described below. Moreover, the transformation leaves the regularity of the surface invariant. Let $\Sigma$ be an outer $C^{2,1}$-domain. Then $\Sigma^K$ is a bounded $C^{2,1}$-domain. Moreover, we have that $\partial\Sigma^K = K_\Sigma(\partial\Sigma)$. Furthermore, if $\Sigma$ is an outer $C^{1,1}$-domain, we have that $\Sigma^K$ is a bounded $C^{1,1}$-domain. There are geometric situations in which $\partial\Sigma^K$ can be computed easily. For example, if $\partial\Sigma$ is a sphere around the origin with radius $R$, then $\partial\Sigma^K$ is a sphere around the origin with radius $R^{-1}$. Furthermore, if $\partial\Sigma \subset \mathbb{R}^2$ is an ellipse with semiaxes $a$ and $b$ around the origin, then $\partial\Sigma^K$ is a closed curve around the origin which meets the coordinate axes at the distances $a^{-1}$ and $b^{-1}$; for $a \ne b$ it is not an ellipse, since the inversion $x \mapsto x/|x|^2$ maps such an ellipse to a quartic curve.
Next, we present the transformation for the weak solution of the inner problem back to the outer setting. To this end, we introduce the operator $K$. This is the so-called Kelvin transformation for functions. It transforms a given function $u$, defined on $\Sigma^K$, to a function $K(u)$, defined on $\Sigma$. In addition, it preserves some properties of the original function. We will state some of these properties. After the following considerations, it will be clear why we choose exactly this transformation and how we have to choose the transformations $T_1,\dots,T_4$ in the following. We start with a definition.
Definition 7. Let $\Sigma$ be an outer $C^{1,1}$-domain and $u$ be a function defined on $\Sigma^K$. Then we define the Kelvin transformation $K(u)$ of $u$, which is a function defined on $\Sigma$, via
$$K(u)(x) := \frac{1}{|x|^{n-2}}\,u\!\left(\frac{x}{|x|^2}\right),$$
for all $x \in \Sigma$.
It is important that this transformation acts as a multiplier when applying the Laplace operator. Note that $-(n-2)$ is the only exponent for $|x|$ which has this property. We have for $u \in C^2(\Sigma^K)$ that $K(u) \in C^2(\Sigma)$ with
$$\Delta(K(u))(x) = \frac{1}{|x|^{n+2}}\,(\Delta u)\!\left(\frac{x}{|x|^2}\right),$$
for all $x \in \Sigma$. As already mentioned above, we will apply $K$ to functions from $H^{1,2}(\Sigma^K)$. So we want to find a normed function space $(V, \|\cdot\|_V)$ such that $K : H^{1,2}(\Sigma^K) \to V$ defines a continuous operator. It turns out that the weighted Sobolev space $H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma)$ is a suitable choice. We have the following important result for $K$ acting on $H^{1,2}(\Sigma^K)$.
Theorem 6. Let $\Sigma$ be an outer $C^{1,1}$-domain. For $u \in H^{1,2}(\Sigma^K)$ let $K(u)$ be defined as above for all $x \in \Sigma$. Then we have that
$$K : H^{1,2}(\Sigma^K) \to H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma)$$
is a continuous linear operator. Moreover, $K$ is injective.
It is left to provide the remaining transformations $T_1,\dots,T_4$. In the first part, we treat $T_1$, which transforms the inhomogeneity $f$ of the outer problem in $\Sigma$ to an inhomogeneity of the corresponding inner problem in $\Sigma^K$. Assume $f$ to be a function defined on $\Sigma$. We want to define the function $T_1(f)$ on $\Sigma^K$ such that
$$\Delta u(x) = T_1(f)(x), \quad x \in \Sigma^K, \qquad (3)$$
implies that
$$\Delta(K(u))(y) = f(y), \quad y \in \Sigma. \qquad (4)$$
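The multiplier property of the Kelvin transformation for functions can be checked numerically. The sketch below is only an illustration under stated assumptions: the dimension is fixed to $n = 3$, the test function `u` is a hypothetical polynomial whose Laplacian is known in closed form, and the Laplacian of $K(u)$ is approximated by central finite differences.

```python
import math

N_DIM = 3  # space dimension n (assumption for this sketch)


def kelvin_point(x):
    r2 = sum(xi * xi for xi in x)
    return tuple(xi / r2 for xi in x)


def kelvin_function(u):
    """Return K(u) with K(u)(x) = |x|^{-(n-2)} u(x / |x|^2)."""
    def Ku(x):
        r = math.sqrt(sum(xi * xi for xi in x))
        return r ** (-(N_DIM - 2)) * u(kelvin_point(x))
    return Ku


def laplacian(func, x, step=1e-3):
    """Central finite-difference approximation of the Laplacian at x."""
    total = 0.0
    fx = func(x)
    for k in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[k] += step
        xm[k] -= step
        total += (func(tuple(xp)) - 2.0 * fx + func(tuple(xm))) / step ** 2
    return total


def u(x):
    # Hypothetical smooth test function on the inner domain.
    return x[0] ** 2 + x[1] * x[2] + x[0] ** 3


def laplace_u(x):
    # Exact Laplacian of the test function above.
    return 2.0 + 6.0 * x[0]


y = (1.2, -0.7, 0.9)  # a point of the outer domain
r = math.sqrt(sum(yi * yi for yi in y))
lhs = laplacian(kelvin_function(u), y)
rhs = r ** (-(N_DIM + 2)) * laplace_u(kelvin_point(y))
```

Up to the discretization error of the finite differences, `lhs` and `rhs` agree, which is exactly the relation that motivates the factor $|x|^{-(n+2)}$ in the transformation of the inhomogeneity.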
We are able to define $T_1$ for functions defined on $\Sigma$ as follows.
Definition 8. Let $\Sigma$ be an outer $C^{1,1}$-domain and $f$ be a function defined on $\Sigma$. Then we define a function $T_1(f)$ on $\Sigma^K$ by
$$T_1(f)(x) := \frac{1}{|x|^{n+2}}\,f\!\left(\frac{x}{|x|^2}\right),$$
for all $x \in \Sigma^K\setminus\{0\}$ and $T_1(f)(0) = 0$.
$T_1$ is well defined and fulfills the relation described by Eqs. (3) and (4). Furthermore, $T_1$ defines a linear continuous isomorphism
$$T_1 : L^2_{|x|^2}(\Sigma) \to L^2(\Sigma^K),$$
with $T_1^{-1} = T_1$. We want to generalize our inhomogeneities in a way similar to the inner problem. This means we have to identify a normed vector space $(W, \|\cdot\|_W)$ such that $T_1 : W \to (H^{1,2}(\Sigma^K))'$ defines a linear continuous operator. Additionally, we want to end up with a Gelfand triple
$$U \subset L^2_{|x|^2}(\Sigma) \subset W.$$
Consequently, $L^2_{|x|^2}(\Sigma)$ should be a dense subspace. It is possible to prove that the space $\left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)'$ is a suitable choice. Recall the Gelfand triple, given by
$$H^{1,2}_{|x|^2,|x|^3}(\Sigma) \subset L^2_{|x|^2}(\Sigma) \subset \left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)'.$$
Theorem 7. We define a continuous linear operator
$$T_1 : L^2_{|x|^2}(\Sigma) \to (H^{1,2}(\Sigma^K))',$$
by
$$(T_1(f))(h) := \int_{\Sigma^K} (T_1(f))(y)\,h(y)\,d\lambda^n(y), \quad h \in H^{1,2}(\Sigma^K),$$
for $f \in L^2_{|x|^2}(\Sigma)$, where $L^2_{|x|^2}(\Sigma)$ is equipped with the norm $\|\cdot\|_{\left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)'}$. This extends uniquely to a linear bounded operator
$$T_1 : \left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)' \to (H^{1,2}(\Sigma^K))'$$
by the BLT theorem, i.e., the extension theorem for bounded linear transformations; see, e.g., Reed and Simon (1972).
Next, we define the transformations for the boundary inhomogeneity $g$ and the coefficients $a$ and $b$. This means we want to find transformations $T_2$, $T_3$, and $T_4$ such that
$$\langle (T_3(a))(x), \nabla u(x)\rangle + (T_4(b))(x)\,u(x) = (T_2(g))(x), \qquad (5)$$
for all $x \in \partial\Sigma^K$, yields that
$$\langle a(y), \nabla((K(u))(y))\rangle + b(y)\,(K(u))(y) = g(y), \qquad (6)$$
for all $y \in \partial\Sigma$. We start with the transformation $T_2(g)$ of $g$.
Definition 9. Let $\Sigma$ be an outer $C^{1,1}$-domain and $g$ be a function defined on $\partial\Sigma$. Then we define a function $T_2(g)$ on $\partial\Sigma^K$ by
$$(T_2(g))(x) := g\!\left(\frac{x}{|x|^2}\right), \quad x \in \partial\Sigma^K.$$
Again we use a Gelfand triple, namely,
$$H^{\frac12,2}(\partial\Sigma) \subset L^2(\partial\Sigma) \subset H^{-\frac12,2}(\partial\Sigma).$$
We have that
$$T_2 : L^2(\partial\Sigma) \to L^2(\partial\Sigma^K), \qquad T_2 : H^{\frac12,2}(\partial\Sigma) \to H^{\frac12,2}(\partial\Sigma^K),$$
define linear, bounded isomorphisms with $(T_2)^{-1} = T_2$. Moreover, we define a continuous linear operator
$$T_2 : L^2(\partial\Sigma) \to H^{-\frac12,2}(\partial\Sigma^K),$$
by
$$(T_2(g))(h) := \int_{\partial\Sigma^K} T_2(g)(y)\,h(y)\,dH^{n-1}(y), \quad h \in H^{\frac12,2}(\partial\Sigma^K),$$
for $g \in L^2(\partial\Sigma)$, where $L^2(\partial\Sigma)$ is equipped with the norm $\|\cdot\|_{H^{-\frac12,2}(\partial\Sigma)}$. Hence, again the BLT theorem gives a unique continuous continuation
$$T_2 : H^{-\frac12,2}(\partial\Sigma) \to H^{-\frac12,2}(\partial\Sigma^K).$$
Closing this section, we give the definitions of the transformations $T_3$ and $T_4$.
Definition 10. Let $\Sigma$ be an outer $C^{1,1}$-domain and $a$ and $b$ be defined on $\partial\Sigma$. We define the operators $T_3$ and $T_4$ via
$$(T_3(a))(x) := |x|^n\left(a\!\left(\frac{x}{|x|^2}\right) - 2\left\langle a\!\left(\frac{x}{|x|^2}\right), e_x\right\rangle e_x\right),$$
$$(T_4(b))(x) := |x|^{n-2}\left(b\!\left(\frac{x}{|x|^2}\right) + (2-n)\left\langle a\!\left(\frac{x}{|x|^2}\right), x\right\rangle\right),$$
for all $x \in \partial\Sigma^K$, where $e_x$ denotes the unit vector in the direction of $x$.
All operators are well defined and give the relation formulated by Eqs. (5) and (6). These operators have the properties
$$T_3 : H^{1,\infty}(\partial\Sigma) \to H^{1,\infty}(\partial\Sigma^K), \qquad T_4 : L^\infty(\partial\Sigma) \to L^\infty(\partial\Sigma^K),$$
if $\Sigma$ is an outer $C^{1,1}$-domain and $a \in H^{1,\infty}(\partial\Sigma)$ for $T_4$, and
$$T_3 : H^{2,\infty}(\partial\Sigma) \to H^{2,\infty}(\partial\Sigma^K), \qquad T_4 : H^{1,\infty}(\partial\Sigma) \to H^{1,\infty}(\partial\Sigma^K),$$
if $\Sigma$ is an outer $C^{2,1}$-domain and $a \in H^{2,\infty}(\partial\Sigma)$ for $T_4$.
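The relation between the transformed and the original boundary condition can be checked pointwise. The sketch below is an illustration under several assumptions: $n = 3$; the fields `a`, `b` and the function `u` are hypothetical smooth test data; and Definition 10 is read so that $T_3(a)(x) = |x|^n (a(x^*) - 2\langle a(x^*), e_x\rangle e_x)$ and $T_4(b)(x) = |x|^{n-2}(b(x^*) + (2-n)\langle a(x^*), x\rangle)$ with $x^* = x/|x|^2$. It compares $\langle a, \nabla K(u)\rangle + b\,K(u)$ at a point $y$ with $\langle T_3(a), \nabla u\rangle + T_4(b)\,u$ at the image point $x = y/|y|^2$, using finite-difference gradients.

```python
import math

N_DIM = 3  # space dimension n (assumption for this sketch)


def dot(v, w):
    return sum(vi * wi for vi, wi in zip(v, w))


def norm(v):
    return math.sqrt(dot(v, v))


def kelvin_point(x):
    r2 = dot(x, x)
    return tuple(xi / r2 for xi in x)


def gradient(func, x, step=1e-5):
    """Central finite-difference approximation of the gradient at x."""
    grad = []
    for k in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[k] += step
        xm[k] -= step
        grad.append((func(tuple(xp)) - func(tuple(xm))) / (2.0 * step))
    return tuple(grad)


def T3(a):
    def T3a(x):
        r = norm(x)
        e_x = tuple(xi / r for xi in x)
        a_val = a(kelvin_point(x))
        proj = dot(a_val, e_x)
        return tuple(r ** N_DIM * (ai - 2.0 * proj * ei)
                     for ai, ei in zip(a_val, e_x))
    return T3a


def T4(a, b):
    def T4b(x):
        x_star = kelvin_point(x)
        return norm(x) ** (N_DIM - 2) * (
            b(x_star) + (2 - N_DIM) * dot(a(x_star), x))
    return T4b


# Hypothetical test data (not taken from the text).
def u(x):
    return x[0] ** 2 + x[1] * x[2] + x[0] ** 3


def a(y):
    return (1.0 + y[0], 0.5, -y[1])


def b(y):
    return 2.0 + y[2]


def K_u(y):
    return norm(y) ** (-(N_DIM - 2)) * u(kelvin_point(y))


y = (1.2, -0.7, 0.9)   # point on the outer side
x = kelvin_point(y)    # corresponding point on the inner side
outer_side = dot(a(y), gradient(K_u, y)) + b(y) * K_u(y)
inner_side = dot(T3(a)(x), gradient(u, x)) + T4(a, b)(x) * u(x)
```

Both sides agree up to the finite-difference error. Since $T_2(g)(x) = g(y)$, this is the pointwise content of the implication from Eq. (5) to Eq. (6).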
4.2 Solution Operator for the Outer Problem
In this section, we want to apply the solution operator of the inner regular problem in order to get a weak solution of the outer problem. To this end, we will use a combination of all the operators defined in Sect. 4.1. In order to avoid confusion, we denote the normal vector of $\partial\Sigma$ by $\nu$ and the normal vector of $\partial\Sigma^K$ by $\nu^K$.
We start with the classical formulation of the outer oblique boundary problem for the Poisson equation in Definition 11.
Definition 11. Let $\Sigma$ be an outer $C^{1,1}$-domain, $f \in C^0(\Sigma)$, $b, g \in C^0(\partial\Sigma)$, and $a \in C^0(\partial\Sigma;\mathbb{R}^n)$ be given. A function $u \in C^2(\Sigma)\cap C^1(\overline{\Sigma})$ such that
$$\begin{aligned}
\Delta u(x) &= f(x), &&\text{for all } x\in\Sigma,\\
\langle a(x)\cdot\nabla u(x)\rangle + b(x)\,u(x) &= g(x), &&\text{for all } x\in\partial\Sigma,\\
u(x) &\to 0, &&\text{for } |x|\to\infty,
\end{aligned}$$
is called the classical solution of the outer oblique boundary problem for the Poisson equation.
Now we state the main result of this section, which can be proved by the results on the transformations above.
Theorem 8. Let $\Sigma$ be an outer $C^{1,1}$-domain, $a \in H^{1,\infty}(\partial\Sigma;\mathbb{R}^n)$, $b \in L^\infty(\partial\Sigma)$, $g \in H^{-\frac12,2}(\partial\Sigma)$, and $f \in \left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)'$ such that
$$|\langle (T_3(a))(y), \nu^K(y)\rangle| > C > 0, \qquad (7)$$
for all $y \in \partial\Sigma^K$, where $0 < C < \infty$, and
$$\operatorname*{ess\,inf}_{\partial\Sigma^K}\left(\frac{T_4(b)}{\langle T_3(a),\nu^K\rangle} - \frac12\,\mathrm{div}_{\partial\Sigma^K}\left(\frac{T_3(a)}{\langle T_3(a),\nu^K\rangle} - \nu^K\right)\right) > 0. \qquad (8)$$
Then we define
$$u := S^{\mathrm{out}}_{a,b}(f,g) := K\left(S^{\mathrm{in}}_{T_3(a),T_4(b)}(T_1(f),T_2(g))\right)$$
as the weak solution to the outer oblique boundary problem for the Poisson equation from Definition 11. $S^{\mathrm{out}}_{a,b}$ is injective, and we have for a constant $0 < C_{12} < \infty$
$$\|u\|_{H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma)} \le C_{12}\left(\|f\|_{\left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)'} + \|g\|_{H^{-\frac12,2}(\partial\Sigma)}\right).$$
We are able to prove that the Kelvin transformation for functions is also a continuous operator from $H^{2,2}(\Sigma^K)$ to $H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma)$. So, we can prove the following regularization result, based on the regularization result for the inner problem; see Theorem 2. The following theorem shows that the weak solution, defined by Theorem 8, is really related to the outer problem, given in Definition 11, although it is not derived by its own weak formulation.
Theorem 9. Let $\Sigma$ be an outer $C^{2,1}$-domain, $a \in H^{2,\infty}(\partial\Sigma;\mathbb{R}^n)$, $b \in H^{1,\infty}(\partial\Sigma)$ such that the conditions in Eqs. (7) and (8) hold. If $f \in L^2_{|x|^2}(\Sigma)$ and $g \in H^{\frac12,2}(\partial\Sigma)$, then we have that $u$ provided by Theorem 8 is a strong solution, i.e., $u \in H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma)$, and
$$\Delta u = f, \qquad \langle a, \nabla u\rangle + bu = g,$$
almost everywhere on $\Sigma$ and $\partial\Sigma$, respectively. Furthermore, we have an a priori estimate
$$\|u\|_{H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma)} \le C_{13}\left(\|f\|_{L^2_{|x|^2}(\Sigma)} + \|g\|_{H^{\frac12,2}(\partial\Sigma)}\right),$$
with a constant $0 < C_{13} < \infty$.
As a consequence, we have that if the data in Theorem 4 fulfills the requirements of a classical solution, the weak solution $u$ provided by Theorem 8 coincides with this classical solution. At the end of this section, we investigate the conditions on the oblique vector field. Analogous to the regular inner problem, we have the condition in Eq. (8), which is a transformed version of the condition in Eq. (2) and gives a relation between $a$ and $b$, depending on the geometry of the surface $\partial\Sigma$. Moreover, the condition in Eq. (7) is a transformed version of the condition in Eq. (1) and gives the nonadmissible direction for the oblique vector field $a$. For the regular inner problem, the condition in Eq. (1) states the tangential directions as nonadmissible for the oblique vector field. For the outer problem, the direction depends both on the direction of the normal vector $\nu(y)$ at the point $y \in \partial\Sigma$ and on the direction of $y$ itself. In this section, we will investigate this dependency in detail. Using the definitions of $T_3$ and $T_4$, we can rewrite the condition in Eq. (7) into the equivalent form
$$\left|\cos\!\left(\angle_{a(x),\,\nu^K\left(\frac{x}{|x|^2}\right)}\right) - 2\cos\!\left(\angle_{a(x),\,e_x}\right)\cos\!\left(\angle_{e_x,\,\nu^K\left(\frac{x}{|x|^2}\right)}\right)\right| > C_{13} > 0, \qquad (9)$$
for all $x \in \partial\Sigma$ and $0 < C_{13} < \infty$ independent of $x$. We use the formula
$$\frac{\langle y, z\rangle}{|y|\cdot|z|} =: \cos(\angle_{y,z})$$
for vectors in $\mathbb{R}^n$, where $\angle_{y,z}$ denotes the angle $0 \le \angle_{y,z} \le \pi$ between $y$ and $z$. Going to $\mathbb{R}^2$ and setting
$$C_{14}(x) := \cos\!\left(\angle_{e_x,\,\nu^K\left(\frac{x}{|x|^2}\right)}\right), \qquad C_{15}(x) := \sin\!\left(\angle_{e_x,\,\nu^K\left(\frac{x}{|x|^2}\right)}\right),$$
we can explicitly characterize the nonadmissible direction as
$$\angle_{a(x),e_x} = \tan^{-1}\left|\frac{C_{14}(x)}{C_{15}(x)}\right|,$$
if $C_{15}(x) \ne 0$, and $\angle_{a(x),e_x} = \frac{\pi}{2}$, if $C_{15}(x) = 0$. Generally, transforming the problem to an inner setting transforms the conditions for the coefficients $a$ and $b$. There are circumstances in which we have the same nonadmissible direction as for the inner problem, i.e., the tangential directions are nonadmissible. For example, this is the case if $\partial\Sigma$ is a sphere around the origin. In Fig. 4, the situation for $\Sigma \subset \mathbb{R}^2$ is illustrated; the dashed line indicates the nonadmissible direction, which occurs because of the transformed regularity condition $|\langle T_3(a), \nu^K\rangle| > C > 0$, see Eq. (7).

Fig. 4 Nonadmissible direction for the outer problem (sketch of $\Sigma$, $\partial\Sigma$, $\Sigma^K$, and $\partial\Sigma^K$, with a point $x \in \partial\Sigma$, the vectors $e_x$, $-\nu(x)$, and $a(x)$, and the image point $K_\Sigma(x)$ with normal $\nu^K(K_\Sigma(x))$)
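The characterization of the nonadmissible direction in $\mathbb{R}^2$ can be illustrated with a short computation. The sketch below rests on assumptions: $e_x$ is placed on the first coordinate axis, the normal $\nu^K$ encloses the angle `theta` with $e_x$, the field $a$ encloses the angle `alpha` with $e_x$, and the expression $\cos(\angle_{a,\nu^K}) - 2\cos(\angle_{a,e_x})\cos(\angle_{e_x,\nu^K})$ is taken as the quantity that must stay away from zero. The function names are hypothetical.

```python
import math


def regularity_expression(alpha, theta):
    """cos(angle(a, nu_K)) - 2 cos(angle(a, e_x)) cos(angle(e_x, nu_K)) in R^2."""
    e_x = (1.0, 0.0)
    nu_K = (math.cos(theta), math.sin(theta))
    a = (math.cos(alpha), math.sin(alpha))
    cos_a_nu = a[0] * nu_K[0] + a[1] * nu_K[1]
    cos_a_e = a[0] * e_x[0] + a[1] * e_x[1]
    cos_e_nu = e_x[0] * nu_K[0] + e_x[1] * nu_K[1]
    return cos_a_nu - 2.0 * cos_a_e * cos_e_nu


def nonadmissible_angle(theta):
    """Angle between a and e_x at which the expression above vanishes."""
    c14 = math.cos(theta)
    c15 = math.sin(theta)
    if c15 == 0.0:
        return math.pi / 2.0
    return math.atan(abs(c14 / c15))


# Generic geometry: nu_K is tilted by 0.3 rad against e_x.
theta = 0.3
alpha_bad = nonadmissible_angle(theta)
value_bad = regularity_expression(alpha_bad, theta)

# Sphere around the origin: nu_K is parallel to e_x, so the tangential
# direction (angle pi/2) is the nonadmissible one.
alpha_sphere = nonadmissible_angle(0.0)
value_sphere = regularity_expression(alpha_sphere, 0.0)

# An admissible direction for comparison: a parallel to e_x on the sphere.
value_radial = regularity_expression(0.0, 0.0)
```

In this parametrization the expression equals $-\cos(\alpha + \theta)$, so it vanishes exactly when the two angles add up to $\pi/2$.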
4.3 Ritz-Galerkin Method
In this section, we provide a Ritz-Galerkin method for the weak solution to the outer problem. To this end, we use the approximation of the weak solution to the corresponding inner problem, provided in Sect. 3.4.
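Assume $\Sigma$ to be an outer $C^{1,1}$-domain. Furthermore, let $a \in H^{1,\infty}(\partial\Sigma;\mathbb{R}^n)$, $b \in L^\infty(\partial\Sigma)$, $g \in H^{-\frac12,2}(\partial\Sigma)$, and $f \in \left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)'$ such that the conditions in Eqs. (7) and (8) are fulfilled. We want to approximate the weak solution $u$ to the outer oblique boundary problem, provided by Theorem 8. Let $a$ and $F$ be defined by
$$\begin{aligned}
a(\eta,\psi) :={}& \sum_{i=1}^{n}{}_{H^{\frac12,2}(\partial\Sigma^K)}\Big\langle \eta\,\frac{T_3(a)_i}{\langle T_3(a),\nu^K\rangle}, (\nabla_{\partial\Sigma^K}\psi)_i\Big\rangle_{H^{-\frac12,2}(\partial\Sigma^K)}\\
&+ \int_{\Sigma^K}(\nabla\eta,\nabla\psi)\,d\lambda^n + \int_{\partial\Sigma^K}\eta\,\frac{T_4(b)}{\langle T_3(a),\nu^K\rangle}\,\psi\,dH^{n-1},\\
F(\eta) :={}& {}_{H^{\frac12,2}(\partial\Sigma^K)}\Big\langle \eta, \frac{T_2(g)}{\langle T_3(a),\nu^K\rangle}\Big\rangle_{H^{-\frac12,2}(\partial\Sigma^K)} - {}_{H^{1,2}(\Sigma^K)}\langle\eta, T_1(f)\rangle_{(H^{1,2}(\Sigma^K))'}
\end{aligned}$$
for $\eta, \psi \in H^{1,2}(\Sigma^K)$. Furthermore, let $(V_n)_{n\in\mathbb{N}}$ be an increasing sequence of finite-dimensional subspaces of $H^{1,2}(\Sigma^K)$, i.e., $V_n \subset V_{n+1}$, such that $\overline{\bigcup_{n\in\mathbb{N}} V_n} = H^{1,2}(\Sigma^K)$. Then there exists for each $n \in \mathbb{N}$ a unique $\psi_n \in V_n$ with
$$a(\eta,\psi_n) = F(\eta) \quad \text{for all } \eta \in V_n;$$
see Sect. 3.4. Moreover, $\psi_n$ can be computed explicitly by solving a linear system of equations. In Sect. 3.4, we have also seen that
$$\|\psi - \psi_n\|_{H^{1,2}(\Sigma^K)} \le C_{16}\,\mathrm{dist}(\psi, V_n) \xrightarrow{\,n\to\infty\,} 0.$$
So, using the continuity of the operator $K$ (see Theorem 6), we consequently get the following result.
Theorem 10. Let $u$ be the weak solution provided by Theorem 8 to the outer problem and $\psi$, $(\psi_n)_{n\in\mathbb{N}}$ taken from Theorems 1 and 3, both corresponding to $a$, $b$, $g$, $f$, and $\Sigma$, given at the beginning of this section. Then
$$\|u - K(\psi_n)\|_{H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma)} \le C_{17}\,\mathrm{dist}(\psi, V_n) \xrightarrow{\,n\to\infty\,} 0.$$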
4.4 Stochastic Extensions and Examples
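In this section, we implement stochastic inhomogeneities as well as stochastic weak solutions for the outer setting. Again we start by defining the spaces of stochastic functions. So, let $\Sigma$ be an outer $C^{1,1}$-domain and $(\Omega, \mathcal{F}, P)$ a probability space, arbitrary but fixed, such that $L^2(\Omega, P)$ is separable. We define
$$\begin{aligned}
\left(H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma)\right)_\Omega &:= L^2(\Omega,P)\otimes H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma) &&\cong L^2\!\left(\Omega,P;H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma)\right),\\
\left(H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma)\right)_\Omega &:= L^2(\Omega,P)\otimes H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma) &&\cong L^2\!\left(\Omega,P;H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma)\right),\\
\left(L^2_{|x|^2}(\Sigma)\right)_\Omega &:= L^2(\Omega,P)\otimes L^2_{|x|^2}(\Sigma) &&\cong L^2\!\left(\Omega,P;L^2_{|x|^2}(\Sigma)\right),\\
\left(\left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)'\right)_\Omega &:= L^2(\Omega,P)\otimes \left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)' &&\cong L^2\!\left(\Omega,P;\left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)'\right),\\
H^{\frac12,2}_\Omega(\partial\Sigma) &:= L^2(\Omega,P)\otimes H^{\frac12,2}(\partial\Sigma) &&\cong L^2\!\left(\Omega,P;H^{\frac12,2}(\partial\Sigma)\right),\\
L^2_\Omega(\partial\Sigma) &:= L^2(\Omega,P)\otimes L^2(\partial\Sigma) &&\cong L^2\!\left(\Omega,P;L^2(\partial\Sigma)\right),\\
H^{-\frac12,2}_\Omega(\partial\Sigma) &:= L^2(\Omega,P)\otimes H^{-\frac12,2}(\partial\Sigma) &&\cong L^2\!\left(\Omega,P;H^{-\frac12,2}(\partial\Sigma)\right).
\end{aligned}$$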
Because all spaces above are separable, we can again use the isomorphisms to Hilbert space-valued random variables. Thus, we can prove the following main result of this section by defining the stochastic solution operator pointwise.
Theorem 11. Let $\Sigma$ be an outer $C^{1,1}$-domain, $a \in H^{1,\infty}(\partial\Sigma;\mathbb{R}^n)$, $b \in L^\infty(\partial\Sigma)$, $g \in H^{-\frac12,2}_\Omega(\partial\Sigma)$, and $f \in \left(\left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)'\right)_\Omega$, such that the conditions in Eqs. (7) and (8) hold. Then we define
$$u(\cdot,\omega) := S^{\mathrm{out}}_{a,b}(f(\cdot,\omega), g(\cdot,\omega)),$$
for $P$-almost all $\omega \in \Omega$. $u$ is called the stochastic weak solution to the outer oblique boundary problem for the Poisson equation. Furthermore, we have for a constant $0 < C_{18} < \infty$
$$\|u\|_{\left(H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma)\right)_\Omega} \le C_{18}\left(\|f\|_{\left(\left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)'\right)_\Omega} + \|g\|_{H^{-\frac12,2}_\Omega(\partial\Sigma)}\right).$$
Moreover, we have the following result for a stochastic strong solution.
Theorem 12. Let $\Sigma$ be an outer $C^{2,1}$-domain, $a \in H^{2,\infty}(\partial\Sigma;\mathbb{R}^n)$, $b \in H^{1,\infty}(\partial\Sigma)$ such that the conditions in Eqs. (7) and (8) hold. If $f \in \left(L^2_{|x|^2}(\Sigma)\right)_\Omega$ and $g \in H^{\frac12,2}_\Omega(\partial\Sigma)$, then we have $u \in \left(H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma)\right)_\Omega$, for $u$ provided by Theorem 11, and
$$\Delta u(x,\omega) = f(x,\omega), \qquad \langle a(y), \nabla u(y,\omega)\rangle + b(y)\,u(y,\omega) = g(y,\omega),$$
for $\lambda^n$-almost all $x \in \Sigma$, for $H^{n-1}$-almost all $y \in \partial\Sigma$, and for $P$-almost all $\omega \in \Omega$. Furthermore, we have an a priori estimate
$$\|u\|_{\left(H^{2,2}_{\frac{1}{|x|^2},\frac{1}{|x|},1}(\Sigma)\right)_\Omega} \le C_{19}\left(\|f\|_{\left(L^2_{|x|^2}(\Sigma)\right)_\Omega} + \|g\|_{H^{\frac12,2}_\Omega(\partial\Sigma)}\right),$$
with a constant $0 < C_{19} < \infty$.
Again a Ritz-Galerkin method is also available for the stochastic weak solution. It is left to the reader to write down the details. As mentioned, we close the section with examples for stochastic data. These are used in geomathematical applications in order to model noise on measured values. In the following, we give examples for the outer problem. They are also suitable for the inner problem.
Gaussian Inhomogeneities
We choose the probability space $(\Omega, \mathcal{F}, P)$ such that $X_i$, $1 \le i \le n_1$, are $P\otimes\lambda^n$-measurable and $Y_j$, $1 \le j \le n_2$, are $P\otimes H^{n-1}$-measurable with $X_i(\cdot,x)$, $x \in \Sigma$, and $Y_j(\cdot,x)$, $x \in \partial\Sigma$, Gaussian random variables with expectation value 0 and variance $\sigma^2_{f_i}(x)$ or variance $\sigma^2_{g_j}(x)$, respectively. Here, $\sigma_{f_i} \in L^2_{|x|^2}(\Sigma)$ and $\sigma_{g_j} \in L^2(\partial\Sigma)$. We define
$$f(\omega,x) := \mu_f(x) + \sum_{i=1}^{n_1} X_i(\omega,x), \qquad g(\omega,x) := \mu_g(x) + \sum_{j=1}^{n_2} Y_j(\omega,x),$$
where $\mu_f \in L^2_{|x|^2}(\Sigma)$ and $\mu_g \in L^2(\partial\Sigma)$. To use this kind of inhomogeneities, we must show
$$f \in L^2(\Omega\times\Sigma, P\otimes|x|^4\lambda^n) \quad\text{and}\quad g \in L^2(\Omega\times\partial\Sigma, P\otimes H^{n-1}).$$
It is easy to see that the inhomogeneities defined in this way fulfill these requirements and the main results are applicable. Such a Gaussian inhomogeneity is shown in Fig. 5.
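A realization of such a Gaussian inhomogeneity can be sampled with the Python standard library. The sketch below is only an illustration under assumptions that go beyond the text: the noise terms are drawn independently at each sample point and for each index, points are represented by a single scalar coordinate for brevity, and the mean and standard deviation functions are hypothetical examples.

```python
import random


def sample_inhomogeneity(mean, sigmas, points, rng):
    """One realization of x -> mean(x) + sum_i X_i(omega, x) at the given points.

    Each X_i(., x) is drawn as an independent centered Gaussian with
    standard deviation sigmas[i](x) (independence is an assumption).
    """
    values = []
    for x in points:
        noise = sum(rng.gauss(0.0, sigma(x)) for sigma in sigmas)
        values.append(mean(x) + noise)
    return values


# Hypothetical data: a decaying mean and two noise sources.
def mu_f(x):
    return 1.0 / (1.0 + x * x)


sigma_f = [lambda x: 0.1, lambda x: 0.2 / (1.0 + abs(x))]
points = [1.0, 1.5, 2.0, 3.0]

rng = random.Random(2015)  # fixed seed for reproducibility
realization = sample_inhomogeneity(mu_f, sigma_f, points, rng)
```

Averaging many realizations recovers the mean function, which reflects that the noise terms have expectation value 0.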
Fig. 5 Data with Gaussian noise (plot labels: $E(h(\cdot,x))$ and $\partial\Gamma$)

Gauss-Markov Model
Here, we refer to Freeden and Maier (2002), in which an application of the example from the previous section can be found. The authors use a random field
$$h(\omega,x) := H(x) + Z(\omega,x)$$
to model an observation noise, where $x \in \partial B_1(0) \subset \mathbb{R}^3$ and $\omega \in \Omega$ with a probability space $(\Omega, \mathcal{F}, P)$. Here, we have that $Z(\cdot,x)$, $x \in \partial B_1(0)$, is a Gaussian random variable with expectation value 0 and variance $\sigma^2 > 0$. Additionally, $H \in L^2(\partial B_1(0))$ and the covariance is given by
$$\mathrm{cov}(Z(\cdot,x_1), Z(\cdot,x_2)) = K(x_1,x_2),$$
where $K : \partial B_1(0)\times\partial B_1(0) \to \mathbb{R}$ is a suitable kernel. Two geophysically relevant kernels are, e.g.,
$$K_1(x_1,x_2) := \frac{\sigma^2}{(M+1)^2}\sum_{n=1}^{M}\frac{2n+1}{4\pi}\,P_n(x_1\cdot x_2), \quad 0 \le M < \infty,$$
$$K_2(x_1,x_2) := \frac{\sigma^2}{\exp(-c)}\,\exp(-c\,(x_1\cdot x_2)).$$
$P_n$, $1 \le n \le M$, are the Legendre polynomials defined on $\mathbb{R}$. The noise model corresponding to the second kernel is called the first-degree Gauss-Markov model. If one chooses a $P\otimes H^{n-1}$-measurable random field $Z$, then $h$ fulfills the requirements. The existence of a corresponding probability measure $P$ is provided in infinite-dimensional Gaussian analysis; see, e.g., Berezanskij (1995).
Noise Model for Satellite Data
In this section, we give another precise application, which can be found in Bauer (2004). Here, the author uses stochastic inhomogeneities to implement a noise model for satellite data. To this end, random fields of the form
$$h(\omega,x) := \sum_{i=1}^{m} h_i(x)\,Z_i(\omega)$$
are used, where $x \in \partial\Sigma \subset \mathbb{R}^3$ and $\omega \in \Omega$ with a suitable probability space $(\Omega, \mathcal{F}, P)$. Here, $\partial\Sigma$ could be, e.g., the Earth's surface, and we are searching for harmonic functions in the space outside the Earth. $Z_i$ are Gaussian random variables with expectation value 0 and variance $\sigma_i^2 > 0$ and $h_i$ fulfilling the assumptions of Sect. 4.4. If one chooses $(\Omega, \mathcal{F}, P)$ as $\left(\mathbb{R}^m, \mathcal{B}(\mathbb{R}^m), \gamma^{0,i}_{\mathrm{cov}_{ij}}\right)$, where
$$\gamma^{0,i}_{\mathrm{cov}_{ij}} := \frac{1}{\sqrt{(2\pi)^m\det(A)}}\,e^{-\frac12 (y, A^{-1}y)}\,d\lambda^m, \qquad a_{ij} := \mathrm{cov}(Z_i, Z_j), \quad 1 \le i,j \le m,$$
one has a realization of $Z_i$ as the projection on the $i$th component in the separable space $L^2\!\left(\mathbb{R}^m, \gamma^{0,i}_{\mathrm{cov}_{ij}}\right)$.
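A realization of a random field of this form can be sketched as follows. The code is an illustration under stated assumptions: the covariance matrix `A`, the spatial functions `h_funcs`, and all function names are hypothetical, and correlated centered Gaussian variables are generated from independent standard normal draws with a Cholesky factor of `A`, which is one standard way to sample from a Gaussian measure with prescribed covariance.

```python
import math
import random


def cholesky(A):
    """Lower-triangular L with L * L^T = A for a symmetric positive definite A."""
    m = len(A)
    L = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def sample_Z(L, rng):
    """Draw (Z_1, ..., Z_m) with mean 0 and covariance L * L^T."""
    m = len(L)
    xi = [rng.gauss(0.0, 1.0) for _ in range(m)]
    return [sum(L[i][k] * xi[k] for k in range(i + 1)) for i in range(m)]


def noise_field(h_funcs, Z, x):
    """Evaluate h(omega, x) = sum_i h_i(x) Z_i(omega) for one realization Z."""
    return sum(h_i(x) * z_i for h_i, z_i in zip(h_funcs, Z))


# Hypothetical covariance matrix a_ij = cov(Z_i, Z_j) and spatial functions h_i.
A = [[1.0, 0.3], [0.3, 0.5]]
h_funcs = [lambda x: x[2], lambda x: x[0] * x[1]]

L = cholesky(A)
rng = random.Random(2004)  # fixed seed for reproducibility
Z = sample_Z(L, rng)
value = noise_field(h_funcs, Z, (0.6, 0.0, 0.8))  # a point on the unit sphere
```

For a fixed realization `Z`, the field is a deterministic function of `x`, so the pointwise definition of the stochastic solution operator applies realization by realization.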
5 Future Directions
In this section, we want to point out one direction of further investigations. We have seen how to provide the existence of a weak solution to the outer oblique boundary problem for the Poisson equation. To this end, we introduced several transformations. In Theorem 7, we proved for the transformation of the space inhomogeneity $f$
$$T_1 : \left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)' \to \left(H^{1,2}(\Sigma^K)\right)'.$$
This transformation is not bijective, i.e., $T_1\!\left(\left(H^{1,2}_{|x|^2,|x|^3}(\Sigma)\right)'\right) \ne (H^{1,2}(\Sigma^K))'$. Finding a Hilbert space $V$ such that $T_1 : V \to \left(H^{1,2}(\Sigma^K)\right)'$ is bijective would lead to the existence of a weak solution for an even larger class of inhomogeneities. Moreover, we have for the transformation $K$ of the weak solution to the inner problem
$$K : H^{1,2}(\Sigma^K) \to H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma),$$
where again $K(H^{1,2}(\Sigma^K)) \ne H^{1,2}_{\frac{1}{|x|^2},\frac{1}{|x|}}(\Sigma)$; see Theorem 6. Finding a Hilbert space $W$ such that $K : H^{1,2}(\Sigma^K) \to W$ is bijective would give us uniqueness of the solution and more detailed information about the behavior of $u$ and its weak derivatives when $x$ is tending to infinity. Additionally, we would be able to define a bijective solution operator for the outer problem. This could be used to find the right Hilbert spaces such that a Poincaré inequality is available. Consequently, the Lax-Milgram lemma would be applicable directly to a weak formulation for the outer setting, which can be derived similarly to the inner problem. Then we might have to consider a regular outer problem, because the tangential direction is forbidden for the oblique vector field if we want to derive a weak formulation. In turn, we get rid of the transformed regularity condition on $a$. The results presented in this chapter are then still an alternative in order to get weak solutions for tangential $a$. Moreover, the availability of a Poincaré inequality would lead to existence results for weak solutions to a broader class of second-order elliptic partial differential operators in outer domains. See, e.g., Alt (2002) for such second-order elliptic partial differential operators for inner domains. Instead of using the Ritz-Galerkin approximation, it is possible to approximate solutions to oblique boundary-value problems for harmonic functions with the help of geomathematical function systems, e.g., spherical harmonics. For such an approach, see, e.g., Freeden and Michel (2004).
6 Conclusion
The analysis of inner oblique boundary-value problems is rather well understood, and we reached the limit when searching for weak solutions under assumptions as weak as possible. The outer problem still poses difficulties because of the unboundedness of the domain. As mentioned in Sect. 5, finding the right distribution spaces such that a Poincaré inequality holds might lead to bijective solution operators for an even broader class of inhomogeneities. Nevertheless, we are already able to provide weak solutions to the outer problem for very general inhomogeneities, as presented in the previous sections. Stochastic weak solutions for stochastic inhomogeneities as used in geomathematical applications can also be provided, and approximation methods for the weak solutions are available.
References
Adams RA (1975) Sobolev spaces. Academic, New York
Alt HW (2002) Lineare Funktionalanalysis. Springer, Berlin
Bauer F (2004) An alternative approach to the oblique derivative problem in potential theory. Shaker, Aachen
Berezanskij YM (1995) Spectral methods in infinite dimensional analysis. Kluwer, Dordrecht
Dautray R, Lions JL (1988) Mathematical analysis and numerical methods for science and technology. Functional and variational methods, vol 2. Springer, Berlin
Dobrowolski M (2006) Angewandte Funktionalanalysis. Springer, Berlin
Freeden W, Maier T (2002) On multiscale denoising of spherical functions: basic theory and numerical aspects. Electron Trans Numer Anal 14:56–78
Freeden W, Michel V (2004) Multiscale potential theory (with applications to geoscience). Birkhäuser, Boston
Gilbarg D,